Parametric Sensitivity Analysis (sIPOPT-style)#
This notebook demonstrates discopt’s sIPOPT-inspired sensitivity features for parametric NLP. After a single solve, you can:
Approximate re-solve — predict the optimal solution at perturbed parameter values using first-order Taylor expansion, without re-solving the NLP.
Dual sensitivity — see how constraint multipliers change with parameters.
Reduced Hessian — project the Lagrangian Hessian onto the null space of active constraints for covariance estimation.
Fast sensitivity re-query — solve the cached KKT system with a new right-hand side for arbitrary parameter perturbations.
These features reuse the KKT factorization from differentiable_solve_l3, following the
approach of sIPOPT [Pirnay et al., 2012].
Key references:
Pirnay et al. [2012] — sIPOPT: optimal sensitivity based on IPOPT
Fiacco [1983] — classical parametric sensitivity theory
Amos and Kolter [2017] — differentiable optimization as a neural network layer
import os
os.environ["JAX_PLATFORMS"] = "cpu"
os.environ["JAX_ENABLE_X64"] = "1"
import discopt.modeling as dm
import numpy as np
from discopt._jax.differentiable import differentiable_solve_l3
Background: Implicit Differentiation of the KKT System#
Consider a parametric NLP:
At the optimal solution \((x^*, \lambda^*)\), the KKT conditions hold:
where \(L = f + \lambda^T g\) is the Lagrangian and \(g_a\) are the active constraints.
Differentiating with respect to \(p\) yields the sensitivity system [Fiacco, 1983]:
The left-hand matrix is the KKT matrix from the NLP solve. sIPOPT’s key insight [Pirnay et al., 2012] is that this factorization is already available from the interior-point solver (IPOPT), so sensitivity computation is nearly free — just a back-substitution with a new RHS.
discopt’s differentiable_solve_l3 performs the same computation using JAX automatic
differentiation to build the KKT system, then caches it for reuse.
Example 1: Approximate Re-solve#
Solve once, then predict the solution at nearby parameter values.
Optimal: \(x^* = p\), with \(dx^*/dp = 1\). The first-order approximation \(x^*(p + \Delta p) \approx x^*(p) + (dx^*/dp) \cdot \Delta p\) is exact here.
m = dm.Model("approx_resolve")
p = m.parameter("p", value=3.0)
x = m.continuous("x", lb=-10, ub=10)
m.minimize((x - p) ** 2)
result = differentiable_solve_l3(m)
print(f"Solve at p=3.0: x* = {result.x['x'].item():.6f}")
print(f"dx*/dp = {result.sensitivity_matrix()[0, 0]:.6f}")
# Approximate re-solve at several nearby parameter values
for p_new in [3.1, 3.5, 4.0, 5.0]:
x_approx = result.approximate_resolve([(p, p_new)])
print(f" p={p_new:.1f}: x_approx={x_approx['x'].item():.4f} (exact: {p_new:.4f})")
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
Solve at p=3.0: x* = 3.000000
dx*/dp = 1.000000
p=3.1: x_approx=3.1000 (exact: 3.1000)
p=3.5: x_approx=3.5000 (exact: 3.5000)
p=4.0: x_approx=4.0000 (exact: 4.0000)
p=5.0: x_approx=5.0000 (exact: 5.0000)
Accuracy: first-order approximation#
For problems where the sensitivity is nonlinear, the approximation error is \(O(\|\Delta p\|^2)\). Let’s verify with a constrained example:
Here \(x^* = \max(0, p)\) for \(p > 0\), \(dx^*/dp = 1\) when the constraint is active.
m = dm.Model("approx_constrained")
p = m.parameter("p", value=2.0)
x = m.continuous("x", lb=-10, ub=10)
m.minimize(x**2)
m.subject_to(x >= p)
result = differentiable_solve_l3(m)
print(f"{'dp':>6s} {'x_approx':>10s} {'x_exact':>10s} {'error':>10s}")
print("-" * 42)
for dp in [0.001, 0.01, 0.1, 0.5, 1.0]:
p_new = 2.0 + dp
x_approx = result.approximate_resolve([(p, p_new)])["x"].item()
x_exact = p_new # Active constraint: x* = p
err = abs(x_approx - x_exact)
print(f"{dp:6.3f} {x_approx:10.6f} {x_exact:10.6f} {err:10.2e}")
dp x_approx x_exact error
------------------------------------------
0.001 2.001000 2.001000 9.37e-09
0.010 2.010000 2.010000 9.37e-09
0.100 2.100000 2.100000 9.37e-09
0.500 2.500000 2.500000 9.37e-09
1.000 3.000000 3.000000 9.37e-09
Example 2: Dual Sensitivity#
The dual sensitivity \(d\lambda^*/dp\) tells us how constraint multipliers change with parameter perturbation. This is useful for:
Constraint importance: which constraints are most sensitive to data changes?
Degeneracy detection: rapidly changing multipliers indicate near-degeneracy.
The multiplier at optimality is \(\lambda^* = 2p\) (from stationarity: \(2x - \lambda = 0\)), so \(d\lambda^*/dp = 2\).
m = dm.Model("dual_sens")
p = m.parameter("p", value=2.0)
x = m.continuous("x", lb=-10, ub=10)
m.minimize(x**2)
m.subject_to(x >= p)
result = differentiable_solve_l3(m)
ds = result.dual_sensitivity(p)
print(f"x* = {result.x['x'].item():.6f}")
print("Dual sensitivity (dlambda/dp):")
if ds is not None:
for i in range(ds.shape[0]):
print(f" Constraint {i}: dlambda/dp = {ds[i, 0]:.6f}")
else:
print(" No active constraints (dual sensitivity is None)")
x* = 2.000000
Dual sensitivity (dlambda/dp):
Constraint 0: dlambda/dp = 2.000000
Example 3: Reduced Hessian#
The reduced Hessian is the Hessian of the Lagrangian projected onto the null space of the active constraint Jacobian [Nocedal and Wright, 2006]:
where \(Z\) spans the null space of \(J_a\). This is important for:
Second-order sufficiency: \(H_r \succ 0\) confirms the solution is a strict local minimum.
Covariance estimation: \(\text{Cov}(p) \approx \sigma^2 (Z^T H Z)^{-1}\) estimates parameter uncertainty from measurement noise.
Unconstrained case#
With no active constraints, the reduced Hessian equals the full Hessian.
# Unconstrained: reduced Hessian = full Hessian
m = dm.Model("rh_unconstrained")
p = m.parameter("p", value=3.0)
x = m.continuous("x", lb=-10, ub=10)
m.minimize((x - p) ** 2)
result = differentiable_solve_l3(m)
rh = result.reduced_hessian()
print(f"Reduced Hessian (unconstrained): {rh}")
print("Expected: [[2.0]] (d^2/dx^2 of (x-p)^2 = 2)")
Reduced Hessian (unconstrained): [[2.]]
Expected: [[2.0]] (d^2/dx^2 of (x-p)^2 = 2)
# Constrained: one equality removes one DOF
m = dm.Model("rh_constrained")
p = m.parameter("p", value=4.0)
x1 = m.continuous("x1", lb=-10, ub=10)
x2 = m.continuous("x2", lb=-10, ub=10)
m.minimize(x1**2 + x2**2)
m.subject_to(x1 + x2 == p)
result = differentiable_solve_l3(m)
rh = result.reduced_hessian()
print(f"Reduced Hessian (1 equality, 2 vars): shape = {rh.shape}")
print(f"Value: {rh}")
print(f"Positive definite: {np.all(np.linalg.eigvalsh(rh) > 0)}")
Reduced Hessian (1 equality, 2 vars): shape = (1, 1)
Value: [[2.]]
Positive definite: True
Example 4: Fast Sensitivity Re-query#
The sensitivity(dp) method solves the KKT system with a new right-hand side
for an arbitrary parameter perturbation vector \(\Delta p\). This is the core
sIPOPT operation [Pirnay et al., 2012]: one factorization, many back-substitutions.
It is equivalent to \(dx = (dx^*/dp) \cdot dp\) but uses the cached KKT matrix directly, which is more efficient when you want sensitivities for many different \(dp\) directions.
m = dm.Model("sens_requery")
a = m.parameter("a", value=2.0)
b = m.parameter("b", value=3.0)
x = m.continuous("x", lb=-10, ub=10)
y = m.continuous("y", lb=-10, ub=10)
m.minimize((x - a) ** 2 + (y - b) ** 2)
result = differentiable_solve_l3(m)
print(f"Optimal: x*={result.x['x'].item():.4f}, y*={result.x['y'].item():.4f}")
print(f"Sensitivity matrix:\n{result.sensitivity_matrix()}")
# Query sensitivity for different parameter perturbations
perturbations = [
np.array([1.0, 0.0]), # only a changes
np.array([0.0, 1.0]), # only b changes
np.array([0.5, -0.3]), # both change
]
for dp in perturbations:
dx = result.sensitivity(dp)
print(f" dp={dp} -> dx={dx}")
Optimal: x*=2.0000, y*=3.0000
Sensitivity matrix:
[[ 1. -0.]
[-0. 1.]]
dp=[1. 0.] -> dx=[1. 0.]
dp=[0. 1.] -> dx=[0. 1.]
dp=[ 0.5 -0.3] -> dx=[ 0.5 -0.3]
# Batch query: multiple dp at once (columns of a matrix)
dp_batch = np.array(
[
[1.0, 0.0, 0.5],
[0.0, 1.0, -0.3],
]
) # shape (2 params, 3 queries)
dx_batch = result.sensitivity(dp_batch)
print("Batch sensitivity (3 queries):")
print(f" dx shape: {dx_batch.shape}")
for i in range(3):
print(f" Query {i}: dp={dp_batch[:, i]} -> dx={dx_batch[:, i]}")
Batch sensitivity (3 queries):
dx shape: (2, 3)
Query 0: dp=[1. 0.] -> dx=[1. 0.]
Query 1: dp=[0. 1.] -> dx=[0. 1.]
Query 2: dp=[ 0.5 -0.3] -> dx=[ 0.5 -0.3]
Example 5: Putting It All Together — Process Optimization#
Consider a simplified reactor design problem where we optimize temperature and flow rate to minimize cost, subject to a conversion constraint:
We want to know:
What happens to the optimal design if \(R_{\min}\) changes?
How sensitive are the multipliers to the cost coefficients?
m = dm.Model("reactor")
c_T = m.parameter("c_T", value=2.0)
c_F = m.parameter("c_F", value=3.0)
R_min = m.parameter("R_min", value=10.0)
T = m.continuous("T", lb=1, ub=20)
F = m.continuous("F", lb=1, ub=20)
m.minimize(c_T * T + c_F * F)
m.subject_to(T * F >= R_min)
result = differentiable_solve_l3(m)
print(f"Optimal: T*={result.x['T'].item():.4f}, F*={result.x['F'].item():.4f}")
print(f"Cost: {result.objective:.4f}")
print()
# Sensitivity to R_min
print("Sensitivity analysis:")
sm = result.sensitivity_matrix()
if sm is not None:
# R_min is the 3rd parameter (index 2)
print(f" dT*/dR_min = {sm[0, 2]:.6f}")
print(f" dF*/dR_min = {sm[1, 2]:.6f}")
# Approximate re-solve if R_min increases by 10%
x_new = result.approximate_resolve([(R_min, 11.0)])
print("\nApproximate design at R_min=11:")
print(f" T* ~ {x_new['T'].item():.4f}")
print(f" F* ~ {x_new['F'].item():.4f}")
# Validate against actual re-solve
R_min.value = np.float64(11.0)
result_new = differentiable_solve_l3(m)
print("\nExact design at R_min=11:")
print(f" T* = {result_new.x['T'].item():.4f}")
print(f" F* = {result_new.x['F'].item():.4f}")
Optimal: T*=3.8730, F*=2.5820
Cost: 15.4919
Sensitivity analysis:
dT*/dR_min = 0.193649
dF*/dR_min = 0.129099
Approximate design at R_min=11:
T* ~ 4.0666
F* ~ 2.7111
Exact design at R_min=11:
T* = 4.0620
F* = 2.7080
API Summary#
Method |
Description |
|---|---|
|
Full \(dx^*/dp\) matrix (n_vars x n_params) |
|
\(d f^*/dp\) via envelope theorem (L1) |
|
\(d f^*/dp\) via KKT implicit differentiation (L3) |
|
First-order prediction of \(x^*\) at new parameter values |
|
\(d\lambda^*/dp\) for a given parameter |
|
Hessian projected onto null space of active constraints |
|
Solve KKT with new RHS for arbitrary \(\Delta p\) |
All methods reuse the cached KKT factorization from differentiable_solve_l3,
following the sIPOPT approach [Pirnay et al., 2012]. For problems where you need
sensitivities with respect to many different parameter perturbations, this is
significantly faster than re-solving.