Parametric Sensitivity Analysis (sIPOPT-style)#

This notebook demonstrates discopt’s sIPOPT-inspired sensitivity features for parametric NLP. After a single solve, you can:

  1. Approximate re-solve — predict the optimal solution at perturbed parameter values using first-order Taylor expansion, without re-solving the NLP.

  2. Dual sensitivity — see how constraint multipliers change with parameters.

  3. Reduced Hessian — project the Lagrangian Hessian onto the null space of active constraints for covariance estimation.

  4. Fast sensitivity re-query — solve the cached KKT system with a new right-hand side for arbitrary parameter perturbations.

These features reuse the KKT factorization from differentiable_solve_l3, following the approach of sIPOPT [Pirnay et al., 2012].

Key references:

  • Pirnay et al. [2012] — sIPOPT: optimal sensitivity based on IPOPT

  • Fiacco [1983] — classical parametric sensitivity theory

  • Amos and Kolter [2017] — differentiable optimization as a neural network layer

import os

os.environ["JAX_PLATFORMS"] = "cpu"
os.environ["JAX_ENABLE_X64"] = "1"

import discopt.modeling as dm
import numpy as np
from discopt._jax.differentiable import differentiable_solve_l3

Background: Implicit Differentiation of the KKT System#

Consider a parametric NLP:

\[\min_{x} f(x; p) \quad \text{s.t.} \quad g(x; p) \leq 0\]

At the optimal solution \((x^*, \lambda^*)\), the KKT conditions hold:

\[\nabla_x L(x^*, \lambda^*, p) = 0, \quad g_a(x^*, p) = 0\]

where \(L = f + \lambda^T g\) is the Lagrangian and \(g_a\) are the active constraints.

Differentiating with respect to \(p\) yields the sensitivity system [Fiacco, 1983]:

\[\begin{split}\begin{bmatrix} H_{xx} & J_a^T \\ J_a & 0 \end{bmatrix} \begin{bmatrix} dx^*/dp \\ d\lambda^*/dp \end{bmatrix} = -\begin{bmatrix} H_{xp} \\ \partial g_a/\partial p \end{bmatrix}\end{split}\]

The left-hand matrix is the KKT matrix from the NLP solve. sIPOPT’s key insight [Pirnay et al., 2012] is that this factorization is already available from the interior-point solver (IPOPT), so sensitivity computation is nearly free — just a back-substitution with a new RHS.

discopt’s differentiable_solve_l3 performs the same computation using JAX automatic differentiation to build the KKT system, then caches it for reuse.

Example 1: Approximate Re-solve#

Solve once, then predict the solution at nearby parameter values.

\[\min_x (x - p)^2 \quad x \in [-10, 10]\]

Optimal: \(x^* = p\), with \(dx^*/dp = 1\). The first-order approximation \(x^*(p + \Delta p) \approx x^*(p) + (dx^*/dp) \cdot \Delta p\) is exact here.

m = dm.Model("approx_resolve")
p = m.parameter("p", value=3.0)
x = m.continuous("x", lb=-10, ub=10)
m.minimize((x - p) ** 2)

result = differentiable_solve_l3(m)
print(f"Solve at p=3.0: x* = {result.x['x'].item():.6f}")
print(f"dx*/dp = {result.sensitivity_matrix()[0, 0]:.6f}")

# Approximate re-solve at several nearby parameter values
for p_new in [3.1, 3.5, 4.0, 5.0]:
    x_approx = result.approximate_resolve([(p, p_new)])
    print(f"  p={p_new:.1f}: x_approx={x_approx['x'].item():.4f}  (exact: {p_new:.4f})")
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
Solve at p=3.0: x* = 3.000000
dx*/dp = 1.000000
  p=3.1: x_approx=3.1000  (exact: 3.1000)
  p=3.5: x_approx=3.5000  (exact: 3.5000)
  p=4.0: x_approx=4.0000  (exact: 4.0000)
  p=5.0: x_approx=5.0000  (exact: 5.0000)

Accuracy: first-order approximation#

For problems where the sensitivity is nonlinear, the approximation error is \(O(\|\Delta p\|^2)\). Let’s verify with a constrained example:

\[\min_x x^2 \quad \text{s.t.} \quad x \geq p\]

Here \(x^* = \max(0, p)\) for \(p > 0\), \(dx^*/dp = 1\) when the constraint is active.

m = dm.Model("approx_constrained")
p = m.parameter("p", value=2.0)
x = m.continuous("x", lb=-10, ub=10)
m.minimize(x**2)
m.subject_to(x >= p)

result = differentiable_solve_l3(m)

print(f"{'dp':>6s}  {'x_approx':>10s}  {'x_exact':>10s}  {'error':>10s}")
print("-" * 42)
for dp in [0.001, 0.01, 0.1, 0.5, 1.0]:
    p_new = 2.0 + dp
    x_approx = result.approximate_resolve([(p, p_new)])["x"].item()
    x_exact = p_new  # Active constraint: x* = p
    err = abs(x_approx - x_exact)
    print(f"{dp:6.3f}  {x_approx:10.6f}  {x_exact:10.6f}  {err:10.2e}")
    dp    x_approx     x_exact       error
------------------------------------------
 0.001    2.001000    2.001000    9.37e-09
 0.010    2.010000    2.010000    9.37e-09
 0.100    2.100000    2.100000    9.37e-09
 0.500    2.500000    2.500000    9.37e-09
 1.000    3.000000    3.000000    9.37e-09

Example 2: Dual Sensitivity#

The dual sensitivity \(d\lambda^*/dp\) tells us how constraint multipliers change with parameter perturbation. This is useful for:

  • Constraint importance: which constraints are most sensitive to data changes?

  • Degeneracy detection: rapidly changing multipliers indicate near-degeneracy.

\[\min_x x^2 \quad \text{s.t.} \quad x \geq p\]

The multiplier at optimality is \(\lambda^* = 2p\) (from stationarity: \(2x - \lambda = 0\)), so \(d\lambda^*/dp = 2\).

m = dm.Model("dual_sens")
p = m.parameter("p", value=2.0)
x = m.continuous("x", lb=-10, ub=10)
m.minimize(x**2)
m.subject_to(x >= p)

result = differentiable_solve_l3(m)

ds = result.dual_sensitivity(p)
print(f"x* = {result.x['x'].item():.6f}")
print("Dual sensitivity (dlambda/dp):")
if ds is not None:
    for i in range(ds.shape[0]):
        print(f"  Constraint {i}: dlambda/dp = {ds[i, 0]:.6f}")
else:
    print("  No active constraints (dual sensitivity is None)")
x* = 2.000000
Dual sensitivity (dlambda/dp):
  Constraint 0: dlambda/dp = 2.000000

Example 3: Reduced Hessian#

The reduced Hessian is the Hessian of the Lagrangian projected onto the null space of the active constraint Jacobian [Nocedal and Wright, 2006]:

\[H_r = Z^T H_{xx} Z\]

where \(Z\) spans the null space of \(J_a\). This is important for:

  • Second-order sufficiency: \(H_r \succ 0\) confirms the solution is a strict local minimum.

  • Covariance estimation: \(\text{Cov}(p) \approx \sigma^2 (Z^T H Z)^{-1}\) estimates parameter uncertainty from measurement noise.

Unconstrained case#

With no active constraints, the reduced Hessian equals the full Hessian.

# Unconstrained: reduced Hessian = full Hessian
m = dm.Model("rh_unconstrained")
p = m.parameter("p", value=3.0)
x = m.continuous("x", lb=-10, ub=10)
m.minimize((x - p) ** 2)

result = differentiable_solve_l3(m)
rh = result.reduced_hessian()
print(f"Reduced Hessian (unconstrained): {rh}")
print("Expected: [[2.0]]  (d^2/dx^2 of (x-p)^2 = 2)")
Reduced Hessian (unconstrained): [[2.]]
Expected: [[2.0]]  (d^2/dx^2 of (x-p)^2 = 2)
# Constrained: one equality removes one DOF
m = dm.Model("rh_constrained")
p = m.parameter("p", value=4.0)
x1 = m.continuous("x1", lb=-10, ub=10)
x2 = m.continuous("x2", lb=-10, ub=10)
m.minimize(x1**2 + x2**2)
m.subject_to(x1 + x2 == p)

result = differentiable_solve_l3(m)
rh = result.reduced_hessian()
print(f"Reduced Hessian (1 equality, 2 vars): shape = {rh.shape}")
print(f"Value: {rh}")
print(f"Positive definite: {np.all(np.linalg.eigvalsh(rh) > 0)}")
Reduced Hessian (1 equality, 2 vars): shape = (1, 1)
Value: [[2.]]
Positive definite: True

Example 4: Fast Sensitivity Re-query#

The sensitivity(dp) method solves the KKT system with a new right-hand side for an arbitrary parameter perturbation vector \(\Delta p\). This is the core sIPOPT operation [Pirnay et al., 2012]: one factorization, many back-substitutions.

It is equivalent to \(dx = (dx^*/dp) \cdot dp\) but uses the cached KKT matrix directly, which is more efficient when you want sensitivities for many different \(dp\) directions.

m = dm.Model("sens_requery")
a = m.parameter("a", value=2.0)
b = m.parameter("b", value=3.0)
x = m.continuous("x", lb=-10, ub=10)
y = m.continuous("y", lb=-10, ub=10)
m.minimize((x - a) ** 2 + (y - b) ** 2)

result = differentiable_solve_l3(m)
print(f"Optimal: x*={result.x['x'].item():.4f}, y*={result.x['y'].item():.4f}")
print(f"Sensitivity matrix:\n{result.sensitivity_matrix()}")

# Query sensitivity for different parameter perturbations
perturbations = [
    np.array([1.0, 0.0]),  # only a changes
    np.array([0.0, 1.0]),  # only b changes
    np.array([0.5, -0.3]),  # both change
]

for dp in perturbations:
    dx = result.sensitivity(dp)
    print(f"  dp={dp} -> dx={dx}")
Optimal: x*=2.0000, y*=3.0000
Sensitivity matrix:
[[ 1. -0.]
 [-0.  1.]]
  dp=[1. 0.] -> dx=[1. 0.]
  dp=[0. 1.] -> dx=[0. 1.]
  dp=[ 0.5 -0.3] -> dx=[ 0.5 -0.3]
# Batch query: multiple dp at once (columns of a matrix)
dp_batch = np.array(
    [
        [1.0, 0.0, 0.5],
        [0.0, 1.0, -0.3],
    ]
)  # shape (2 params, 3 queries)

dx_batch = result.sensitivity(dp_batch)
print("Batch sensitivity (3 queries):")
print(f"  dx shape: {dx_batch.shape}")
for i in range(3):
    print(f"  Query {i}: dp={dp_batch[:, i]} -> dx={dx_batch[:, i]}")
Batch sensitivity (3 queries):
  dx shape: (2, 3)
  Query 0: dp=[1. 0.] -> dx=[1. 0.]
  Query 1: dp=[0. 1.] -> dx=[0. 1.]
  Query 2: dp=[ 0.5 -0.3] -> dx=[ 0.5 -0.3]

Example 5: Putting It All Together — Process Optimization#

Consider a simplified reactor design problem where we optimize temperature and flow rate to minimize cost, subject to a conversion constraint:

\[\min_{T, F} \; c_T \cdot T + c_F \cdot F \quad \text{s.t.} \quad T \cdot F \geq R_{\min}\]

We want to know:

  1. What happens to the optimal design if \(R_{\min}\) changes?

  2. How sensitive are the multipliers to the cost coefficients?

m = dm.Model("reactor")
c_T = m.parameter("c_T", value=2.0)
c_F = m.parameter("c_F", value=3.0)
R_min = m.parameter("R_min", value=10.0)

T = m.continuous("T", lb=1, ub=20)
F = m.continuous("F", lb=1, ub=20)

m.minimize(c_T * T + c_F * F)
m.subject_to(T * F >= R_min)

result = differentiable_solve_l3(m)
print(f"Optimal: T*={result.x['T'].item():.4f}, F*={result.x['F'].item():.4f}")
print(f"Cost:    {result.objective:.4f}")
print()

# Sensitivity to R_min
print("Sensitivity analysis:")
sm = result.sensitivity_matrix()
if sm is not None:
    # R_min is the 3rd parameter (index 2)
    print(f"  dT*/dR_min = {sm[0, 2]:.6f}")
    print(f"  dF*/dR_min = {sm[1, 2]:.6f}")

# Approximate re-solve if R_min increases by 10%
x_new = result.approximate_resolve([(R_min, 11.0)])
print("\nApproximate design at R_min=11:")
print(f"  T* ~ {x_new['T'].item():.4f}")
print(f"  F* ~ {x_new['F'].item():.4f}")

# Validate against actual re-solve
R_min.value = np.float64(11.0)
result_new = differentiable_solve_l3(m)
print("\nExact design at R_min=11:")
print(f"  T* = {result_new.x['T'].item():.4f}")
print(f"  F* = {result_new.x['F'].item():.4f}")
Optimal: T*=3.8730, F*=2.5820
Cost:    15.4919

Sensitivity analysis:
  dT*/dR_min = 0.193649
  dF*/dR_min = 0.129099

Approximate design at R_min=11:
  T* ~ 4.0666
  F* ~ 2.7111

Exact design at R_min=11:
  T* = 4.0620
  F* = 2.7080

API Summary#

Method

Description

result.sensitivity_matrix()

Full \(dx^*/dp\) matrix (n_vars x n_params)

result.gradient(param)

\(d f^*/dp\) via envelope theorem (L1)

result.implicit_gradient(param)

\(d f^*/dp\) via KKT implicit differentiation (L3)

result.approximate_resolve([(p, val)])

First-order prediction of \(x^*\) at new parameter values

result.dual_sensitivity(param)

\(d\lambda^*/dp\) for a given parameter

result.reduced_hessian()

Hessian projected onto null space of active constraints

result.sensitivity(dp)

Solve KKT with new RHS for arbitrary \(\Delta p\)

All methods reuse the cached KKT factorization from differentiable_solve_l3, following the sIPOPT approach [Pirnay et al., 2012]. For problems where you need sensitivities with respect to many different parameter perturbations, this is significantly faster than re-solving.