NLP and Linear-System Scaling

Optimization problems whose objective, constraints, or KKT system span many orders of magnitude often converge poorly, or not at all, without some form of rescaling. pounce inherits two independent scaling layers from Ipopt, NLP scaling and linear-system scaling, and adds a further option within the linear-system layer (see issue #61).

The two layers are conceptually separate:

| Layer | Option | What it touches |
|---|---|---|
| NLP scaling | nlp_scaling_method | The objective f and each constraint row c_i, before the IPM sees them. Changes algorithmic behavior (filter, tol, μ). |
| Linear-system scaling | linear_system_scaling | Symmetric scaling of the KKT augmented system D K D for the factorization. Purely numerical: the IPM sees the same iterates. |

You can configure them independently. Defaults match upstream Ipopt: nlp_scaling_method = gradient-based, linear_system_scaling = none.

NLP-level scaling

| Option | Default | Effect |
|---|---|---|
| nlp_scaling_method | gradient-based | none / gradient-based / user-scaling. |
| nlp_scaling_max_gradient | 100.0 | Cutoff above which gradient-based scaling applies. Per-row scale = min(1, max_gradient / ‖∇c_i‖_∞). |
| nlp_scaling_min_value | 1e-8 | Floor on computed scale factors; prevents inverting near-zero gradients. |
| nlp_scaling_obj_target_gradient | 0.0 | When > 0, pins the scaled objective gradient ∞-norm to this value. Overrides the max_gradient cutoff. |
| nlp_scaling_constr_target_gradient | 0.0 | Same as above, per constraint row. |
| obj_scaling_factor | 1.0 | Constant multiplier on the objective, applied after the automatic factor. |

gradient-based (default)

Evaluates ∇f and ∇c_i once at the starting point x_0 and chooses per-row scales that pull each gradient ∞-norm into a reasonable band. Single-shot is mandatory — recomputing per iteration would invalidate the filter’s history (Wächter, 2013).

The clamp at 1.0 means scaling never amplifies a small row; it only damps large ones.
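The cutoff rule can be written out in a few lines. The following is a minimal sketch of the formula from the option table, not pounce's actual implementation; the function names and the sample Jacobian rows are hypothetical, and the defaults are the documented option defaults.

```python
def inf_norm(row):
    """Infinity norm of one gradient row (0.0 for an empty row)."""
    return max((abs(v) for v in row), default=0.0)


def gradient_based_scale(grad_inf_norm, max_gradient=100.0, min_value=1e-8):
    """Scale factor for one row, from its gradient inf-norm at x_0."""
    if grad_inf_norm <= max_gradient:
        return 1.0  # at or below the cutoff: the clamp leaves the row alone
    scale = max_gradient / grad_inf_norm  # strictly below 1, damps a large row
    return max(scale, min_value)  # floor keeps the factor from vanishing


# Hypothetical constraint Jacobian rows evaluated once at the starting point.
jacobian_rows = [
    [2.0, -0.5, 0.0],    # small row: untouched
    [4.0e4, 1.0, -3.0],  # large row: damped to inf-norm 100
    [1.0e12, 0.0, 0.0],  # extreme row: factor hits the min_value floor
]
scales = [gradient_based_scale(inf_norm(r)) for r in jacobian_rows]
print(scales)
```

The first row keeps a factor of 1.0, the second is damped so that its scaled ∞-norm equals the cutoff, and the third is limited by the floor rather than by the cutoff.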

user-scaling

The TNLP is asked for obj_scaling, a per-variable x_scaling, and a per-constraint g_scaling via the get_scaling_parameters callback. Use this when you know the natural units of your problem (e.g. mass in kg vs. distance in mm) and can supply better scales than the gradient-based heuristic.

Note: pounce’s OrigIpoptNlp currently honors obj_scaling and per-constraint g_scaling. The x_scaling request channel is accepted but not yet acted on; this mirrors the design in issue #61.

If the TNLP’s get_scaling_parameters returns false (the default), pounce falls back to no automatic scaling.

Setting user scaling

  • From C: call SetIpoptProblemScaling(problem, obj, x_scaling, g_scaling), then AddIpoptStrOption("nlp_scaling_method", "user-scaling"). See crates/pounce-cinterface/include/pounce.h.
  • From Rust: implement TNLP::get_scaling_parameters on your problem type.
  • From Python: call pounce.Problem.set_problem_scaling(obj_scaling, x_scaling=None, g_scaling=None), followed by add_option("nlp_scaling_method", "user-scaling"). Walked through end-to-end in python/notebooks/07_scaling.ipynb.
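As an illustration of the Python route, here is a minimal sketch that derives per-constraint scales from typical magnitudes and passes them on using the two method names listed above. The helper apply_unit_scaling, its constraint_magnitudes argument, and the RecordingProblem stand-in are hypothetical; the stand-in exists only so the snippet runs without pounce installed. How a real pounce.Problem is constructed, and whether it expects lists or arrays, is not covered here.

```python
def apply_unit_scaling(problem, obj_scaling, constraint_magnitudes):
    """Pass user-supplied scales to a pounce-style problem object.

    constraint_magnitudes (hypothetical input) holds the typical size of each
    constraint in its natural units; each row is scaled by the reciprocal.
    """
    g_scaling = [1.0 / m for m in constraint_magnitudes]
    # x_scaling stays None: per the note above it is accepted but not acted on.
    problem.set_problem_scaling(obj_scaling, x_scaling=None, g_scaling=g_scaling)
    problem.add_option("nlp_scaling_method", "user-scaling")
    return g_scaling


class RecordingProblem:
    """Stand-in for a problem object; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def set_problem_scaling(self, obj_scaling, x_scaling=None, g_scaling=None):
        self.calls.append(("set_problem_scaling", obj_scaling, x_scaling, g_scaling))

    def add_option(self, name, value):
        self.calls.append(("add_option", name, value))


demo = RecordingProblem()
g = apply_unit_scaling(demo, obj_scaling=1e-3, constraint_magnitudes=[1000.0, 0.5])
print(g)
```

Setting the option after supplying the scales follows the order given in the list above.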

Target-gradient overrides

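nlp_scaling_obj_target_gradient and nlp_scaling_constr_target_gradient change the scaling rule itself rather than tuning it. When set to a positive value, they override both the max_gradient cutoff and the 1.0 clamp: the scale is computed unconditionally as target / max_gradient_norm, where max_gradient_norm is the gradient ∞-norm at x_0, so the scaled gradient ∞-norm becomes exactly the target. Because the clamp no longer applies, this path can amplify small rows as well as damp large ones. Useful when you have a specific numeric range you want the IPM to see.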

The default 0.0 means “use the cutoff path” — i.e. only scale rows that are above nlp_scaling_max_gradient.
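The difference between the two paths is easiest to see side by side. This is a minimal sketch under stated assumptions, not pounce's code: the function name row_scale is hypothetical, applying the min_value floor on the target path is an assumption, and so is returning 1.0 for an all-zero gradient.

```python
def row_scale(grad_inf_norm, max_gradient=100.0, target_gradient=0.0,
              min_value=1e-8):
    """Scale factor for one row under the cutoff path or the target path."""
    if target_gradient > 0.0:
        if grad_inf_norm == 0.0:
            return 1.0  # assumption: nothing to normalise for a zero gradient
        # Target path: no cutoff and no clamp at 1.0, so the factor may exceed 1.
        # Assumption: the min_value floor still applies here.
        return max(target_gradient / grad_inf_norm, min_value)
    # Cutoff path (target_gradient == 0.0): only rows above max_gradient change.
    if grad_inf_norm <= max_gradient:
        return 1.0
    return max(max_gradient / grad_inf_norm, min_value)


small, large = 0.01, 5000.0
for norm in (small, large):
    cutoff = row_scale(norm)
    target = row_scale(norm, target_gradient=1.0)
    print(norm, cutoff, target, target * norm)
```

With a target of 1.0 the small row is amplified while the cutoff path leaves it untouched; in both target cases the scaled norm equals the target up to rounding.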

Linear-system-level scaling

| Option | Default | Effect |
|---|---|---|
| linear_system_scaling | none | none / ruiz. mc19 and slack-based are accepted by the option registry but not yet implemented; both fall back to none. |
| linear_scaling_on_demand | yes | Defer scaling computation until a linear solve is poor; reduces overhead for well-conditioned KKT systems. |

The KKT augmented system is symmetric; all linear-system scalers in pounce use the symmetric form D K D (single diagonal) to preserve that structure for the downstream factorization (MA57, MUMPS, FERAL/SSIDS).

  • none: the default, and a first-class choice rather than a fallback. The inner linear solver (MA57, MUMPS, FERAL) often does its own scaling under some configurations; stacking pounce-level scaling on top can hurt. Use ma57_automatic_scaling=yes to get MA57’s internal scaling instead.
  • ruiz: iterative symmetric ∞-norm equilibration (Ruiz, CERFACS TR/PA/01/14). Pure Rust, no Fortran dependency. Converges geometrically; capped at 10 iterations. The only implemented scaler today; recommended starting point when MA57’s internal scaling is off.
  • mc19 (not yet implemented): intended HSL MC19 row/column scaling (Curtis-Reid 1972; minimizes Σ log²|a_ij|). Accepted by the registry but currently logs a warning and falls back to none.
  • slack-based (not yet implemented): intended slack-aware scaling. Accepted by the registry but falls back to none.
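To make the ruiz option concrete, here is a minimal dense sketch of symmetric ∞-norm equilibration in Python. It is not pounce's Rust implementation: the function name, the early-exit tolerance, and the small test matrix are assumptions, and only the 10-iteration cap comes from the description above. Each pass scales row and column i by 1/sqrt of the current row ∞-norm, which keeps the matrix symmetric.

```python
import math


def ruiz_symmetric(K, max_iter=10, tol=1e-8):
    """Return (d, A) with A = D K D and D = diag(d), for symmetric K."""
    n = len(K)
    d = [1.0] * n
    A = [row[:] for row in K]  # work on a copy, leave K untouched
    for _ in range(max_iter):
        norms = [max(abs(v) for v in row) for row in A]
        # Assumed stopping test: every nonzero row already has inf-norm ~ 1.
        if all(abs(1.0 - r) <= tol for r in norms if r > 0.0):
            break
        s = [1.0 / math.sqrt(r) if r > 0.0 else 1.0 for r in norms]
        for i in range(n):
            for j in range(n):
                A[i][j] *= s[i] * s[j]  # same factor on rows and columns
        d = [di * si for di, si in zip(d, s)]
    return d, A


# Hypothetical badly scaled symmetric indefinite matrix with a KKT-like zero block.
K = [
    [1.0e6, 0.0, 2.0e3],
    [0.0, 1.0e-4, 1.0],
    [2.0e3, 1.0, 0.0],
]
d, A = ruiz_symmetric(K)
row_norms = [max(abs(v) for v in row) for row in A]
print(d)
print(row_norms)
```

Using a single diagonal d on both sides is what preserves symmetry for the downstream factorization; the solver would factor A and map solutions back through d.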

Worked example — nql180

nql180 is one of the Mittelmann NLP benchmarks where both default pounce and default Ipopt fail to clear the strict tol gate (see issue #25). Forcing Ruiz symmetric equilibration on the augmented KKT system is enough to push pounce all the way to “Optimal Solution Found”:

pounce nql180.nl presolve=yes linear_system_scaling=ruiz \
       linear_scaling_on_demand=no

| Metric | default | + Ruiz (forced) |
|---|---|---|
| Exit status | Solved To Acceptable Level | Optimal Solution Found |
| Iterations | 41 | 50 |
| Primal infeasibility | 4.0e-11 | 1.2e-15 |
| Dual infeasibility | 1.0e-5 | 3.1e-4 |
| Complementarity | 1.2e-9 | 9.9e-10 |
| Overall NLP error | 2.4e-7 | 9.9e-10 |

The four-orders-of-magnitude primal-feasibility improvement and the two to three orders gained on the overall NLP error are the textbook Ruiz benefit: symmetric ∞-norm equilibration lowers the condition number of the KKT matrix enough that the back-solve residuals drop the extra fractional digits needed to clear tol. The extra nine iterations are well spent: the 50-iteration Ruiz run satisfies the strict tol gate, whereas the 41-iteration unscaled run only reaches the “acceptable” level.

linear_scaling_on_demand=no forces always-on Ruiz; the default (yes) defers scaling computation until the linear solver flags an iterate as poorly scaled, which is the right behavior for problems that don’t need it (most of the Mittelmann set, where the iter count is unchanged with or without Ruiz).

Reporting

All scaling effects are undone before the solve report (final objective, multipliers, dual residuals, KKT termination metric) is handed back to the user. You always see quantities in the natural units of your TNLP.

Internally, the IPM operates in scaled space: stopping criteria (tol, acceptable_tol) compare scaled values, the barrier parameter μ is in scaled units, and the filter’s history is built from scaled function values.
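The unscaling step can be illustrated on a one-variable problem. This is a minimal sketch under explicit assumptions, not a description of pounce's internals: the scaled problem is taken to be obj_scale·f(x) subject to g_scale_i·c_i(x) = 0, the Lagrangian convention is L = f + λᵀc, there is no variable scaling, and the function name unscale_report is hypothetical. Under those assumptions the objective is divided by obj_scale and each multiplier is multiplied by g_scale_i / obj_scale.

```python
def unscale_report(f_scaled, lam_scaled, obj_scale, g_scale):
    """Map a scaled objective value and multipliers back to natural units."""
    f = f_scaled / obj_scale
    lam = [l * s / obj_scale for l, s in zip(lam_scaled, g_scale)]
    return f, lam


# Toy problem: minimize f(x) = 1000 x subject to c(x) = 50 x - 100 = 0.
# Natural units: x* = 2, f* = 2000, and 1000 + lam * 50 = 0 gives lam = -20.
obj_scale, g_scale = 0.1, [0.5]

# Scaled space: 0.1 * 1000 + lam_s * (0.5 * 50) = 0 gives lam_s = -4,
# and the scaled objective value at x* is 0.1 * 2000 = 200.
f_scaled, lam_scaled = 200.0, [-4.0]

f, lam = unscale_report(f_scaled, lam_scaled, obj_scale, g_scale)
print(f, lam)

# Stationarity holds in natural units with the unscaled multiplier.
residual = 1000.0 + lam[0] * 50.0
print(residual)
```

The stationarity residual vanishing in natural units is the check that the multiplier was mapped back consistently with the objective.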

When to override the defaults

Reach for non-default scaling when:

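  • The constraint Jacobian has entries spanning many orders of magnitude (chemistry, power-flow, mixed-unit mechanics). Try ruiz at the linear-system level, after disabling MA57’s internal scaling. mc19 is accepted by the option registry but not yet implemented, so selecting it currently falls back to none.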
  • The IPM stalls with small step sizes but no clear infeasibility. Worth turning nlp_scaling_method=none to see whether the default gradient scaling is doing the wrong thing; then re-enable with problem-specific target gradients.
  • You know the natural units of your problem better than the solver can infer from gradients at x_0. Wire user-scaling.

Otherwise the upstream-Ipopt-style defaults (gradient-based at the NLP level, none at the linear-system level with MA57’s internal scaling on) are a reasonable starting point.

References