Introduction

POUNCE is a general interior-point method implemented in pure Rust: one numerical backbone that now spans nonlinear, conic/quadratic, and polynomial global optimization rather than a single problem class. Its nonlinear-programming core began as a faithful port of the Ipopt filter line-search method. The algorithm, console output, and option semantics follow upstream Ipopt closely enough that anyone used to reading ipopt logs can drop in pounce without relearning where the numbers live. It has since grown into a family of solvers sharing that backbone:

Nonlinear programming — the filter line-search interior-point method (the Ipopt port) plus an active-set SQP path, for general smooth problems
```
min  f(x)
s.t. g_L <= g(x) <= g_U
     x_L <=   x  <= x_U
```
where f and g are twice-continuously-differentiable.
Conic & quadratic — LP, convex QP, second-order (SOCP), positive-semidefinite (SDP), and the non-symmetric exponential and power cones, each solved to the global optimum.
Global optimization — certified global optima for nonconvex polynomial problems via SOS / Lasserre relaxations. (A general-purpose spatial branch-and-bound solver, pounce-global, is in development on the feature/global branch and not part of this release.)

See Choosing a Solver for which solver fits which problem.

Pure Rust by default

The default build is pure Rust — no Fortran, no commercial solver, no system BLAS required. The bundled FERAL backend provides a sparse symmetric LDLᵀ factorization. The HSL MA57 backend is available behind the optional ma57 feature for users who have a license for libcoinhsl and have it installed (see Installation).

Status

Production-ready for the core IPM workflow. The algorithm-side core, NLP interface, line search, filter, barrier update (monotone + Mehrotra adaptive), KKT solve, restoration phase, AMPL .nl reader, the C ABI (pounce-cinterface), the Python wrapper (pounce-solver), and the CLI all solve a wide range of NLPs from the standard test suites (Hock-Schittkowski, CUTEst, Mittelmann ampl-nlp, CHO parameter estimation, gas/water network design). Sensitivity analysis (sIPOPT port), reduced-Hessian computation, the auxiliary-equality + FBBT presolve, and the active-set SQP path are all wired in and available behind option keys. Existing PyIpopt / cyipopt / JuMP / AMPL clients link against libpounce_cinterface in place of libipopt unchanged.

The conic and global solvers are wired end-to-end alongside the NLP core: the convex interior-point solver (pounce-convex) handles LP / QP, SOCP, exponential / power cones, and small SDPs — with a Conic Benchmark Format (.cbf) reader cross-checked against the CBLIB tier — and adds SOS / Lasserre polynomial global optimization (sos_minimize). These are reachable from the CLI, the Python package, and the JSON solve report. A deterministic spatial branch-and-bound solver for general factorable nonconvex problems (pounce-global) is in development on the feature/global branch and not part of this release.

License

EPL-2.0, the same license as upstream Ipopt.

Where to go next

Installation — build and install POUNCE.
Quick Start — solve your first problem.
Running Solves — the command-line driver in depth.
Acknowledgments — the papers behind the algorithm.

Installation

Prerequisites

A stable Rust toolchain. Nothing else is needed for the default pure-Rust build. Install Rust via rustup:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source "$HOME/.cargo/env"

Verify the install:

rustc --version && cargo --version

Build

From the repository root:

make            # release build of the workspace
make test       # run all tests
make clippy     # lint
make doc        # rustdoc for the Rust API

Install

make install                          # installs to $HOME/.local
sudo make install PREFIX=/usr/local   # or system-wide

This drops the pounce binary into $PREFIX/bin and the libpounce_cinterface shared library into $PREFIX/lib. Make sure $PREFIX/bin ($HOME/.local/bin for the default install) is on your PATH, then verify:

pounce --version

HSL MA57 backend (optional)

The default FERAL backend needs no external libraries. To build with the HSL MA57 linear solver instead, you need a CoinHSL install whose lib/ directory holds libcoinhsl. Point the COINHSL_DIR environment variable at it and build with the ma57 feature:

export COINHSL_DIR=/path/to/CoinHSL
cargo build -p pounce-cli --release --features ma57

Build CoinHSL from https://www.hsl.rl.ac.uk/ipopt/. MA57 is primarily useful for benchmarking against upstream Ipopt; the FERAL backend is the supported default for everyday use, and a build without --features ma57 never touches COINHSL_DIR.

Using POUNCE as a Rust library

The workspace is a set of library crates (see Algorithm & Workspace for the layout). To browse the Rust API, build and open the rustdoc:

make doc        # generates target/doc

Quick Start

This page assumes POUNCE is built and on your PATH (see Installation).

Solve an AMPL `.nl` file

pounce problem.nl

This solves the problem and writes a sibling problem.sol next to the input, following the AMPL solver convention. The console output mirrors upstream ipopt’s banner, per-iteration table, and final summary.

Append KEY=VALUE pairs to override options — the syntax and semantics match the upstream Ipopt CLI:

pounce problem.nl print_level=8 max_iter=500 tol=1e-10

See Solver Options for details.

Try a built-in problem

POUNCE ships several self-contained test problems that exercise the full pipeline without parsing a .nl file (run pounce --list-problems for the full set):

pounce --list-problems
pounce --problem rosenbrock
pounce --problem quadratic

From Python

import numpy as np
from pounce import minimize

res = minimize(lambda x: ((x - 1) ** 2).sum(), x0=np.zeros(3))
print(res.fun, res.x)

See the Python API chapter for the full cyipopt-compatible interface.

From Pyomo

import pyomo_pounce  # registers 'pounce'
from pyomo.environ import SolverFactory

SolverFactory('pounce').solve(model)

See the Pyomo chapter for details.

Full help

pounce --help

Choosing a Solver

POUNCE is not a single solver but a small family of them sharing one numerical backbone. This page is the map: what each solver is, when to reach for it, and how they fit together.

Figure: POUNCE solver landscape.

The one-sentence version: convex and conic problems are solved to the global optimum; nonconvex problems are solved locally by default, or to a certified global optimum via the SOS path for polynomials. The spatial branch-and-bound path for general nonconvex problems is still in development and not part of this release. Every solver, whatever its flavor, ultimately factorizes a symmetric KKT system through the shared pounce-linsol layer, which in turn drives a pluggable backend (FERAL by default, HSL MA57 optionally).

The solvers at a glance

Solver	Problem class	Optimum	Crate	Entry points
NLP filter-IPM	general smooth NLP (nonconvex OK)	local (KKT)	`pounce-algorithm` + `pounce-nlp`	CLI default; Python `Problem`/`minimize`; `--solver nlp`
NLP active-set SQP	general smooth NLP	local	`pounce-algorithm` (subproblems via `pounce-qp`)	`algorithm=active-set-sqp`
Convex IPM (LP/QP)	LP, convex QP	global	`pounce-convex`	`solve_qp_ipm`; `pounce.qp.solve_qp`; `--solver lp-ipm`/`qp-ipm`
Convex IPM (conic)	SOCP, exp/power/PSD cones, convex QCQP	global	`pounce-convex`	`solve_socp_ipm`; `pounce.qp.solve_socp`; `minimize` (convex QCQP); `--solver socp`; `pounce <file>.cbf`
Active-set QP	QP, convex or indefinite	local	`pounce-qp`	`ParametricActiveSetSolver`; `--solver qp-active-set`
SOS / Lasserre	polynomial (nonconvex)	global	`pounce-convex`	`sos_minimize`; `pounce.sos_minimize`

A general-purpose spatial branch-and-bound solver for factorable nonconvex NLPs (pounce-global) is in development on the feature/global branch and is not part of this release — there is no --solver global CLI route or minimize_global Python entry point yet. Today the only certified-global path for nonconvex problems is SOS / Lasserre, for polynomials.

When to choose each

General nonlinear program (the common case) → NLP filter-IPM

If your model has nonlinear objective or constraints and you don’t know (or can’t assume) convexity, this is the default and the most mature path. It is POUNCE’s port of Ipopt’s filter line-search interior-point method: robust on nonconvex problems, with a feasibility restoration phase for hard starts and exact or limited-memory Hessians. It returns a local KKT point — for a nonconvex problem there is no global guarantee.

CLI: pounce model.nl (or a built-in problem).
Python: the cyipopt-style Problem class, or the scipy-style minimize facade.
Reach for limited-memory Hessians (hessian_approximation=limited-memory) when second derivatives are unavailable or expensive.

The NLP active-set SQP path is selected with algorithm=active-set-sqp. It solves the NLP as a sequence of quadratic subproblems (handed to pounce-qp), which warm-starts extremely well when the active set is stable across solves, for example in a parametric sweep or a control loop. For a single cold solve of a general NLP, prefer the filter-IPM.

Linear or convex quadratic program → Convex IPM (LP/QP)

If the quadratic cost matrix P is positive semidefinite (P ⪰ 0, with P = 0 for an LP), use the convex interior-point solver: it returns the global optimum, detects primal/dual infeasibility, and offers warm-starting, batched and multiple-RHS solving, a build-once / solve-many QpFactorization handle, and post-optimal sensitivity (QpSensitivity, the sIPOPT analog). The CLI’s auto routing classifies an .nl problem and sends LP/convex-QP problems here automatically.

Python: pounce.qp.solve_qp (and solve_qp_batch, solve_qp_multi_rhs).

Second-order, exponential, or power cones → Convex IPM (conic)

The same convex solver handles conic programs: second-order cones, the exponential and power cones that express geometric programming, entropy / log-sum-exp, logistic models, and p-norm constraints, and the positive-semidefinite cone for small dense SDPs. Also global. This is the path to use when you can cast a nominally-nonconvex problem into a convex cone — you trade modeling effort for a global guarantee. (The PSD cone is self-scaled and runs on the symmetric driver; the exp/power cones run on the non-symmetric HSDE driver, so the two families can’t yet be mixed in one problem.)

A common special case routes here automatically: a convex quadratically-constrained QP (QCQP). When auto routing finds a convex-quadratic inequality ½xᵀHx + aᵀx + b ≤ 0 (H ⪰ 0), it reformulates each such constraint as one second-order cone constraint (using a factorization H = FᵀF) and sends the whole problem to the conic solver, with no .cbf file and no manual cone bookkeeping needed. This works from a .nl/Pyomo model on the CLI and from minimize() in Python (which probes each constraint’s Hessian and only routes when it can prove the feasible set is convex). See LP / QP Solver Routing.
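
The equivalence behind this reformulation can be checked numerically. The sketch below uses one standard embedding: with s = −(aᵀx + b), the inequality ‖Fx‖² ≤ 2s holds exactly when ‖(Fx, s − ½)‖₂ ≤ s + ½. The source does not say which embedding POUNCE uses internally, so the function names and the small F, a, b data are illustrative assumptions, and nothing here calls POUNCE.

```python
import math

def _matvec(F, x):
    """Dense matrix-vector product F @ x for lists of rows."""
    return [sum(fij * xj for fij, xj in zip(row, x)) for row in F]

def quad_constraint_ok(F, a, b, x):
    """Original form: 0.5 * x'Hx + a'x + b <= 0, with H = F'F."""
    Fx = _matvec(F, x)
    quad = 0.5 * sum(v * v for v in Fx)
    lin = sum(ai * xi for ai, xi in zip(a, x))
    return quad + lin + b <= 0.0

def soc_constraint_ok(F, a, b, x):
    """Cone form: ||(Fx, s - 1/2)||_2 <= s + 1/2, with s = -(a'x + b)."""
    Fx = _matvec(F, x)
    s = -(sum(ai * xi for ai, xi in zip(a, x)) + b)
    lhs = math.sqrt(sum(v * v for v in Fx) + (s - 0.5) ** 2)
    return lhs <= s + 0.5

# Made-up data for illustration only.
F = [[1.0, 0.0], [1.0, 2.0]]
a = [0.0, 1.0]
b = -2.0
points = [[0.0, 0.0], [1.0, 1.0], [0.5, -0.5], [3.0, 0.0], [0.0, 3.0]]

# Both forms should classify every test point the same way.
agree = all(
    quad_constraint_ok(F, a, b, x) == soc_constraint_ok(F, a, b, x)
    for x in points
)
print("forms agree on all test points:", agree)
```

The identity (s + ½)² − (s − ½)² = 2s is what makes the two tests equivalent; the cone form also forces s + ½ ≥ 0, which the quadratic form already implies.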

Python: pounce.qp.solve_socp(..., cones=[("exp", 3), ("pow", 0.5), ...]) for an explicit cone program, or just minimize(...) for a convex QCQP.
CLI: a Conic Benchmark Format file, pounce model.cbf (see the CBLIB benchmark tier), or any convex-QCQP .nl under auto routing.

Nonconvex problem, global optimum required → SOS or spatial branch-and-bound

When the problem is genuinely nonconvex and a local optimum is not good enough, the shipped path to a certified global optimum is for polynomials:

Polynomial objective/constraints → SOS / Lasserre (sos_minimize, or pounce.sos_minimize). A single semidefinite program certifies the global minimum (the largest γ with p − γ in the Putinar cone), and the global minimizers are recovered from the moment matrix — even multiple ones, via a facial-reduction step. Best for modest degree and dimension; the SDP grows with the relaxation order.
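
In symbols, for a problem min p(x) subject to g_i(x) ≥ 0, the certificate described above is usually written as follows. The constraint names g_i, the multipliers σ_i, and the relaxation order d are notation assumed here for illustration, not names from the POUNCE API.

```latex
\gamma^\star_d \;=\; \max_{\gamma,\;\sigma_0,\dots,\sigma_m} \ \gamma
\quad \text{s.t.} \quad
p(x) - \gamma \;=\; \sigma_0(x) + \sum_{i=1}^{m} \sigma_i(x)\, g_i(x),
\qquad
\sigma_0,\dots,\sigma_m \ \text{sums of squares},\quad
\deg \sigma_0 \le 2d,\ \ \deg(\sigma_i g_i) \le 2d .
```

Each γ*_d is a lower bound on the global minimum, and the bounds are nondecreasing in d; the SOS multipliers become positive-semidefinite matrix variables whose size grows with d.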

A general-purpose spatial branch-and-bound solver for factorable nonconvex problems (including exp/ln/trig) — pounce-global — is in development on the feature/global branch and is not part of this release.

See Global Optimization for the SOS path in depth.

Indefinite QP, or a QP inner-solver → Active-set QP

pounce-qp is a sparse parametric active-set solver that accepts an indefinite Hessian (via inertia control), with two-sided bounds and factorization-reuse across a homotopy. It is the engine behind the active-set SQP path, and is the right choice for MPC-style problems or any setting where you re-solve a slowly-changing QP many times. Use the convex IPM instead when P ⪰ 0 and you want a single robust solve with infeasibility certificates.

How to override the automatic routing

The CLI classifies each .nl problem and picks a solver, but you can force the choice:

pounce model.nl --solver auto          # default: classify, then route
pounce model.nl --solver nlp           # filter-IPM (or active-set-sqp via algorithm=)
pounce model.nl --solver lp-ipm        # convex LP interior-point
pounce model.nl --solver qp-ipm        # convex QP interior-point
pounce model.nl --solver socp          # conic interior-point (convex QCQP)
pounce model.nl --solver qp-active-set # active-set QP

(The equivalent KEY=VALUE option is solver_selection=<value>, e.g. pounce model.nl solver_selection=qp-ipm.)

See LP / QP Solver Routing for how classification works and when it falls back to the more general solver.

The shared backbone

Every interior-point and active-set solver above assembles a symmetric KKT system and factorizes it through pounce-linsol. That trait layer is backend-agnostic:

FERAL (pounce-feral) — a pure-Rust sparse symmetric LDLᵀ factorization. The default; no external dependencies.
HSL MA57 (pounce-hsl) — the well-known Harwell solver via libcoinhsl, enabled with the ma57 build feature for large or ill-conditioned systems.

Because the backend is pluggable, the same solver code runs on either without change.

Cross-cutting layers

These are not solvers you select, but stages and tools the solvers share:

Presolve (pounce-presolve) — an optional front-end that tightens bounds (feasibility-based bound tightening), removes redundant rows, and repairs LICQ degeneracies before the solve.
Restoration (pounce-restoration) — the feasibility-recovery phase the filter-IPM enters when a step cannot reduce both infeasibility and the objective; pounce-l1penalty offers an ℓ₁-exact penalty reformulation for degenerate / LICQ-violating problems.
Sensitivity — pounce-sensitivity gives sIPOPT-style parametric steps and reduced Hessians for the NLP; QpSensitivity does the same for the convex QP. See Sensitivity Analysis.
Cone library (pounce-convex) — nonnegative, second-order, exponential, power, and (for small dense problems) positive-semidefinite cones, so small SDPs solve as a convex class. The PSD cone cannot yet be mixed with the exponential/power cones in one problem (they use different drivers).
Solve report — every path can emit the machine-readable pounce.solve-report/v1 JSON (status, iterations, residuals, timing). See JSON Solve Report.

Global vs. local — the honest summary

POUNCE settles a problem globally along two routes, and locally along one:

Global by convexity — LP, convex QP, SOCP, and the exponential / power / PSD cone classes. Local is global, so a convex or conic reformulation buys the guarantee outright.
Global by certificate (polynomials) — the SOS / Lasserre optimizer certifies the global minimum of a nonconvex polynomial from a single SDP; see Global Optimization.
Local for general NLP — the filter-IPM and SQP paths converge to a KKT point, which for a nonconvex problem carries no global guarantee.

A general-purpose spatial branch-and-bound route for factorable nonconvex problems (pounce-global) is in development on the feature/global branch but not in this release.

Two practical levers for a “global” answer: modeling (cast as much as you can into the convex cone library) and, when that is not possible, the SOS / Lasserre optimizer for polynomials.

Running Solves

The pounce command-line driver solves built-in TNLPs and AMPL .nl files. Its console output mirrors upstream ipopt’s banner, per-iteration table, and final summary, so anyone used to reading ipopt logs can read pounce logs unchanged.

Basic usage

pounce problem.nl
pounce problem.nl print_level=8 max_iter=500 tol=1e-10
pounce problem.nl linear_solver=ma57            # with --features ma57
pounce problem.nl --options-file ipopt.opt      # upstream-format options file

Trailing KEY=VALUE pairs follow the same syntax and semantics as the upstream Ipopt CLI; they override values loaded from --options-file. See Solver Options.

Built-in problems

pounce --list-problems
pounce --problem quadratic
pounce --problem rosenbrock

quadratic — min (x[0]-3)² + (x[1]-4)² (unconstrained, optimum (3, 4)).
rosenbrock — min 100·(x[1]-x[0]²)² + (1-x[0])² (unconstrained, optimum (1, 1)).
bounded-quadratic — quadratic with box bounds 0 ≤ x ≤ 2 (optimum at the upper corner (2, 2)).
eq-quadratic — min x[0]² + x[1]² s.t. x[0] + x[1] = 1 (a single equality).
circle — min x[0] s.t. x[0]² + x[1]² = 1 (a nonlinear equality).
infeasible-eq — two contradictory equalities (x[0]+x[1]=1 and =2); exercises the infeasibility-detection path.

Run pounce --list-problems for the authoritative list.
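
The stated optima are easy to sanity-check by hand. The sketch below re-implements two of the objectives in plain Python (it does not call POUNCE) and confirms that the gradient vanishes at the listed points. For bounded-quadratic it uses the fact that this particular objective is a squared distance, so its box-constrained minimizer is the projection of (3, 4) onto the box. All function names are illustrative.

```python
def quadratic(x):
    return (x[0] - 3.0) ** 2 + (x[1] - 4.0) ** 2

def rosenbrock(x):
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

def fd_gradient(f, x, h=1e-6):
    """Central finite-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad

def project_onto_box(x, lo, hi):
    """Componentwise clamp of x to [lo, hi]."""
    return [min(max(xi, lo), hi) for xi in x]

grad_quadratic = fd_gradient(quadratic, [3.0, 4.0])
grad_rosenbrock = fd_gradient(rosenbrock, [1.0, 1.0])
bounded_opt = project_onto_box([3.0, 4.0], 0.0, 2.0)

print("quadratic gradient at (3, 4): ", grad_quadratic)
print("rosenbrock gradient at (1, 1):", grad_rosenbrock)
print("bounded-quadratic optimum:    ", bounded_opt)
```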

Built-in problems have no .nl stub, so they only write a .sol file when --sol-output is given explicitly.

Degenerate / MPCC NLPs — the ℓ₁-exact penalty-barrier wrapper

For problems where the standard IPM thrashes in restoration because LICQ fails at the iterate (degenerate equalities, MPCC-like complementarity), enable the Thierry–Biegler ℓ₁-exact penalty-barrier wrapper:

pounce problem.nl l1_exact_penalty_barrier=yes

The wrapper turns every equality row c_i(x) = g_i into a slack-relaxed c_i(x) − p_i + n_i = g_i with (p_i, n_i) ≥ 0, augments the objective by ρ · Σ(p + n), and runs a Byrd–Nocedal–Waltz outer loop that escalates ρ until the slacks collapse (constraints satisfied) or saturate (locally infeasible problem detected). The user-visible (x*, λ*) are reported in the original variable space.
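
One way to see why the penalty is ℓ₁-exact: for a fixed x with residual r_i = c_i(x) − g_i, the cheapest nonnegative slacks that satisfy the relaxed row are p_i = max(r_i, 0) and n_i = max(−r_i, 0), so the added objective term equals ρ · ‖c(x) − g‖₁. The sketch below illustrates only that identity with made-up residual values; it does not reproduce POUNCE's outer loop, and the function names are assumptions.

```python
def split_slacks(residuals):
    """Cheapest (p, n) >= 0 with r_i - p_i + n_i = 0 for each residual r_i."""
    p = [max(r, 0.0) for r in residuals]
    n = [max(-r, 0.0) for r in residuals]
    return p, n

def l1_penalty(residuals, rho):
    """Penalty term rho * sum(p + n), which equals rho * ||r||_1."""
    p, n = split_slacks(residuals)
    return rho * sum(pi + ni for pi, ni in zip(p, n))

# Made-up residuals c_i(x) - g_i at some trial point x.
residuals = [0.5, -2.0, 0.0]
p, n = split_slacks(residuals)

# Each relaxed row is satisfied exactly by the chosen slacks.
relaxed = [r - pi + ni for r, pi, ni in zip(residuals, p, n)]

print("p =", p, "n =", n)
print("relaxed row residuals:", relaxed)
print("penalty with rho = 8:", l1_penalty(residuals, 8.0))
```

When every residual is zero the slacks collapse to zero and the penalty term vanishes, which is the "constraints satisfied" exit described above.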

For everyday use, the simpler form is an auto-fallback:

pounce problem.nl l1_fallback_on_restoration_failure=yes

POUNCE first runs the standard solve. If it terminates in Restoration_Failed, Infeasible_Problem_Detected, Solved_To_Acceptable_Level, Maximum_Iterations_Exceeded, or Not_Enough_Degrees_Of_Freedom, the wrapper is invoked transparently and the result is promoted to Solve_Succeeded only if the retry succeeds. Otherwise the original status is preserved.

The tuning knobs are listed under Solver Options.

AMPL / Pyomo solver mode

AMPL drivers — and Pyomo’s ASL interface — invoke a solver as solver problem.nl -AMPL. Pass -AMPL to run pounce that way:

pounce problem.nl -AMPL

It changes nothing about the solve itself; it switches the process to the AMPL exit-code contract (see below), so the driver reads the termination from the .sol file rather than the exit status. The pyomo-pounce package builds on top of this mode.

Exit codes

0 — Solve_Succeeded (or Solved_To_Acceptable_Level).
non-zero — any other ApplicationReturnStatus.

In AMPL solver mode (-AMPL) the exit code instead follows the AMPL contract: 0 for any solve that ran and produced a .sol file — limit-reached, infeasible, even a failed solve — since the termination is carried by the file’s solve_result_num. Genuine startup failures (unreadable .nl, bad option) still exit non-zero.
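
The two contracts can be summarized as a small decision function. This is a sketch of the rules stated above, not POUNCE's code: the function name and arguments are assumptions, and the value 1 is only a placeholder, since the text specifies "non-zero" without naming a code.

```python
SUCCESS_STATUSES = {"Solve_Succeeded", "Solved_To_Acceptable_Level"}
NONZERO = 1  # placeholder; the documentation only promises "non-zero"

def expected_exit_code(status, ampl_mode=False, wrote_sol=True):
    """Exit code implied by the documented contract.

    status    -- name of the ApplicationReturnStatus
    ampl_mode -- True when pounce was started with -AMPL
    wrote_sol -- True when the solve ran and produced a .sol file
    """
    if ampl_mode:
        # AMPL contract: the .sol file carries the termination, so any
        # solve that produced one exits 0; startup failures do not.
        return 0 if wrote_sol else NONZERO
    return 0 if status in SUCCESS_STATUSES else NONZERO

print(expected_exit_code("Solve_Succeeded"))
print(expected_exit_code("Infeasible_Problem_Detected"))
print(expected_exit_code("Infeasible_Problem_Detected", ampl_mode=True))
```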

Diagnostics & introspection

pounce --about                                   # version, build info, features, backends
pounce problem.nl --dump kkt:5-10 --dump iterate # dump per-iteration diagnostics
pounce problem.nl --dump kkt --dump-dir /tmp/d   # override the dump root

--about — print version, build info, enabled features, and linear-solver backends, then exit.
--dump <cat>[:<spec>] — write the diagnostic category to per-iteration files (JSONL). Wired categories are kkt and iterate; an optional :<spec> selects iterations (e.g. kkt:5, kkt:2-10, iterate:all).
--dump-dir <path> — override the dump root (default ./pounce-dump-<timestamp>).
--dump-format <fmt> — dump format (default jsonl).
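
The category-and-spec syntax of --dump can be parsed as in the sketch below, which may help when generating dump arguments from a script. It assumes that ranges are inclusive and that omitting the spec means every iteration; the text above states neither explicitly, and the function name is hypothetical.

```python
WIRED_CATEGORIES = ("kkt", "iterate")

def parse_dump_arg(arg):
    """Split e.g. 'kkt:2-10' into ('kkt', (2, 10)).

    Returns (category, None) when every iteration is selected.
    """
    category, _, spec = arg.partition(":")
    if category not in WIRED_CATEGORIES:
        raise ValueError(f"unknown dump category: {category!r}")
    if spec in ("", "all"):
        return category, None
    first, sep, last = spec.partition("-")
    lo = int(first)
    hi = int(last) if sep else lo  # a single number selects one iteration
    if hi < lo:
        raise ValueError(f"empty iteration range: {spec!r}")
    return category, (lo, hi)

for example in ("kkt:5", "kkt:2-10", "iterate:all", "iterate"):
    print(example, "->", parse_dump_arg(example))
```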

Help

pounce --help
pounce --version          # also -v, -V

Solver Options

POUNCE accepts options the same way upstream Ipopt does. Option names and semantics follow Ipopt’s, so an existing Ipopt options file or KEY=VALUE invocation works unchanged.

Setting options

On the command line — append KEY=VALUE pairs after the input:

pounce problem.nl tol=1e-10 max_iter=500 print_level=8

From an options file — upstream ipopt.opt format:

pounce problem.nl --options-file ipopt.opt

Command-line KEY=VALUE pairs override values loaded from the options file.
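
The precedence rule amounts to a dictionary merge in which the command-line pairs are applied last. The sketch below assumes the upstream ipopt.opt layout (one "name value" pair per line, with # starting a comment). The helper names and the file contents are made up for illustration, and this is not POUNCE's parser.

```python
def parse_options_file(text):
    """Parse 'name value' lines, ignoring blank lines and # comments."""
    options = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, value = line.split(None, 1)
        options[name] = value.strip()
    return options

def parse_cli_pairs(args):
    """Parse trailing KEY=VALUE arguments."""
    return dict(arg.split("=", 1) for arg in args)

def effective_options(file_text, cli_args):
    """Options file first, then command-line pairs on top."""
    merged = parse_options_file(file_text)
    merged.update(parse_cli_pairs(cli_args))
    return merged

# Hypothetical ipopt.opt contents.
file_text = """
# solver settings
tol 1e-8
max_iter 3000
mu_strategy adaptive   # trailing comment
"""

options = effective_options(file_text, ["tol=1e-10", "print_level=8"])
for name in sorted(options):
    print(name, "=", options[name])
```

Here tol comes from the command line, while max_iter and mu_strategy keep their file values.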

Commonly used options

Option	Meaning
`tol`	Overall convergence tolerance on the KKT error.
`max_iter`	Maximum number of outer iterations.
`print_level`	Console verbosity, 0 (silent) – 12 (maximum debug).
`linear_solver`	KKT linear-solver backend. `ma57` requires the `ma57` feature build.
`mu_strategy`	Barrier-parameter update strategy (`monotone` / `adaptive`).
`solver_selection`	Route LP/convex-QP to the specialized convex IPM. See LP/QP Routing.
`qp_presolve`	Presolve on the convex LP/QP path (`yes` / `no`, default `yes`). See LP/QP Routing.

For the full upstream option catalogue, see the Ipopt options reference; POUNCE reuses those names.

For scaling-specific options (nlp_scaling_method, target-gradient overrides, linear_system_scaling), see the Scaling reference page. For nonlinear bound tightening (presolve_fbbt, fbbt_tol, fbbt_max_iter, fbbt_max_constraints), see the FBBT reference page.

Barrier-parameter (μ) strategy

The barrier parameter μ controls the inner subproblem’s relaxation of complementarity. The two strategies are monotone (default — geometric schedule) and adaptive (quality-function oracle picks each μ from the current iterate’s complementarity). See μ-strategy for when to switch.

Option	Default	Meaning
`mu_strategy`	`monotone`	`monotone` (Fiacco–McCormick schedule) or `adaptive` (oracle-driven).
`mu_oracle`	`quality-function`	Adaptive oracle: `quality-function` / `loqo` / `probing`.
`mu_init`	`0.1`	Seed value for μ at the first iterate.
`mu_min`	`1e-11`	Floor on μ; the solver stops decreasing past this.
`mu_max`	`1e5`	Cap on μ (adaptive mode). When set explicitly it overrides the `mu_max_fact` initialization.
`mu_max_fact`	`1e3`	Initializes `mu_max` as `mu_max_fact · curr_avrg_compl` at the first iterate (adaptive mode).
`mu_target`	`0.0`	Stop target for μ in monotone mode.
`mu_linear_decrease_factor`	`0.2`	κ_μ in `μ ← min(κ_μ · μ, μ^θ_μ)`.
`mu_superlinear_decrease_power`	`1.5`	θ_μ in the same formula.
`barrier_tol_factor`	`10.0`	Inner-subproblem tolerance scales as `barrier_tol_factor · μ`.
`sigma_max`	`1e2`	Upper clamp on σ chosen by the quality-function oracle.
`sigma_min`	`1e-6`	Lower clamp on σ (raising this to `1e-2` can break a stair-stepping stall on some problems).
`adaptive_mu_globalization`	`obj-constr-filter`	Adaptive-mode globalization: `kkt-error`, `obj-constr-filter`, or `never-monotone-mode`.
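
To get a feel for the monotone schedule, the update rule from the table can be iterated with the default values. This sketch applies only μ ← min(κ_μ · μ, μ^θ_μ) together with the mu_min floor. It ignores when the solver decides to trigger an update (the barrier_tol_factor test) and any tolerance-dependent floor, so it shows the shape of the sequence rather than an actual iteration log.

```python
def monotone_mu_schedule(mu_init=0.1, kappa=0.2, theta=1.5,
                         mu_min=1e-11, max_updates=50):
    """Successive barrier values under mu <- min(kappa * mu, mu ** theta)."""
    schedule = [mu_init]
    mu = mu_init
    while mu > mu_min and len(schedule) <= max_updates:
        mu = max(mu_min, min(kappa * mu, mu ** theta))
        schedule.append(mu)
    return schedule

schedule = monotone_mu_schedule()
for k, mu in enumerate(schedule):
    print(f"update {k}: mu = {mu:.3e}")
```

With these defaults the linear factor decides the first update; once μ falls below κ_μ^(1/(θ_μ − 1)) = 0.2² = 0.04, the power term is the smaller one and the decrease becomes superlinear.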

Quality-function oracle (adaptive-μ details)

These are only consumed when mu_strategy=adaptive and mu_oracle=quality-function. Defaults mirror upstream IpQualityFunctionMuOracle::RegisterOptions.

Option	Default	Meaning
`quality_function_norm_type`	`2-norm-squared`	Norm used to aggregate KKT components inside `q(σ)`: `1-norm`, `2-norm`, `2-norm-squared`, `max-norm`.
`quality_function_centrality`	`none`	Centrality penalty term: `none`, `log`, `reciprocal`, `cubed-reciprocal`.
`quality_function_balancing_term`	`none`	Balancing penalty when complementarity ≪ infeasibilities: `none` or `cubic`.
`quality_function_max_section_steps`	`8`	Cap on golden-section iterations when picking σ.
`quality_function_section_sigma_tol`	`1e-2`	Width tolerance in σ-space terminating the golden-section search.
`quality_function_section_qf_tol`	`0.0`	Relative flatness tolerance on `q(σ)` terminating golden section.
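
For orientation, a generic golden-section search with a step cap and two stopping tolerances looks like the sketch below. It is not a port of the upstream oracle: the toy q(σ), the bracket, and the exact form of the tolerance tests are assumptions made for illustration, with the keyword defaults borrowed from the table above.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618

def golden_section(q, lo, hi, max_steps=8, sigma_tol=1e-2, qf_tol=0.0):
    """Approximate minimizer of a unimodal q on [lo, hi]."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    qc, qd = q(c), q(d)
    for _ in range(max_steps):
        if (b - a) <= sigma_tol * b:
            break  # bracket is narrow enough (relative width, assumed form)
        if abs(qc - qd) < qf_tol * max(abs(qc), abs(qd)):
            break  # q is flat across the bracket (assumed form)
        if qc < qd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, qd = d, c, qc
            c = b - INV_PHI * (b - a)
            qc = q(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, qc = c, d, qd
            d = a + INV_PHI * (b - a)
            qd = q(d)
    return c if qc < qd else d

# Toy quality function with its minimum at sigma = 0.3.
sigma = golden_section(lambda s: (s - 0.3) ** 2, 0.0, 1.0)
print("sigma after at most 8 steps:", sigma)
```

Each step shrinks the bracket by the factor 0.618 at the cost of one new evaluation of q, which is why a small step cap already gives a usable σ.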

Adaptive-μ globalization

These options tune the safeguards that fall back to monotone-μ mode when the adaptive oracle stops making progress. Defaults mirror upstream IpAdaptiveMuUpdate::RegisterOptions.

Option	Default	Meaning
`adaptive_mu_safeguard_factor`	`0.0`	LOQO safeguard floor on the oracle’s μ candidate.
`adaptive_mu_monotone_init_factor`	`0.8`	Multiplier on `avrg_compl` when seeding monotone mode after a bailout.
`adaptive_mu_restore_previous_iterate`	`no`	Restore the latest free-mode iterate when switching to fixed mode.
`adaptive_mu_kkterror_red_iters`	`4`	Window length for the `kkt-error` globalization history.
`adaptive_mu_kkterror_red_fact`	`0.9999`	Required relative KKT-error reduction over that window.
`adaptive_mu_kkt_norm_type`	`2-norm-squared`	Norm used to score the iterate in adaptive globalization decisions.

ℓ₁ penalty-barrier wrapper options

These tune the degenerate-NLP wrapper described in Running Solves. All are default-tuned and rarely need overriding:

Option	Default	Meaning
`l1_exact_penalty_barrier`	`no`	Run the ℓ₁-exact penalty-barrier wrapper unconditionally.
`l1_fallback_on_restoration_failure`	`no`	Retry with the wrapper only when the standard solve fails.
`l1_penalty_init`	`1.0`	Initial penalty weight ρ.
`l1_penalty_max`	`1e6`	Maximum penalty weight before declaring infeasibility.
`l1_penalty_increase_factor`	`8.0`	Multiplier applied to ρ each outer iteration.
`l1_penalty_max_outer_iter`	`8`	Maximum penalty outer iterations.
`l1_slack_tol`	`1e-6`	Slack tolerance for “constraints satisfied”.
`l1_steering_factor`	`10.0`	Steering-rule factor for ρ escalation.

NLP Presolve

POUNCE’s TNLP-wrapper presolve pipeline runs before the IPM starts. It tightens variable bounds, drops redundant rows, and (optionally) eliminates square auxiliary-equality sub-systems structurally. The whole layer is off by default, so set the master switch (presolve=yes) first; the individual phases then follow the defaults below:

Option	Default	Meaning
`presolve`	`no`	Master switch for the whole presolve layer. Off → wrapper is a no-op.
`presolve_bound_tightening`	`yes`	Phase 1 — Andersen-style bound propagation from linear rows.
`presolve_redundant_constraint_removal`	`yes`	Phase 2 — drop linear constraints already implied by current bounds.
`presolve_linear_eq_reduction`	`no`	Phase ≥2 — eliminate fixed singleton variables exposed by linear equalities.
`presolve_licq_check`	`yes`	Phase 3 — detect rank-deficient equality blocks before the IPM starts.
`presolve_licq_action`	`warn`	What to do on degeneracy: `warn` (just report) or `auto_l1` (turn on ℓ₁).
`presolve_warm_z_bounds`	`yes`	Phase 4 — warm-start bound multipliers when bounds get tightened by Phase 1.
`presolve_bound_mult_init_val`	`1.0`	Value used by Phase 4 for those warm-start hints.
`presolve_max_passes`	`3`	Fixed-point iteration cap across the bound-tightening passes.
`presolve_print_level`	`0`	Per-pass verbosity (0 silent, 5 per-pass, 8 per-transformation).
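
Phase 1's bound propagation can be illustrated on a single linear row l ≤ Σ a_j x_j ≤ u. The sketch below is a textbook activity-based tightening pass, not POUNCE's implementation. The function name and data layout are assumptions, and it requires finite variable bounds (the row bounds may be infinite).

```python
INF = float("inf")

def tighten_from_row(coeffs, row_lo, row_hi, lower, upper):
    """One tightening pass for row_lo <= sum(a_j * x_j) <= row_hi."""
    n = len(coeffs)
    new_lo, new_hi = list(lower), list(upper)
    for k in range(n):
        a_k = coeffs[k]
        if a_k == 0.0:
            continue
        # Smallest and largest possible activity of the other terms.
        rest_min = sum(min(coeffs[j] * lower[j], coeffs[j] * upper[j])
                       for j in range(n) if j != k)
        rest_max = sum(max(coeffs[j] * lower[j], coeffs[j] * upper[j])
                       for j in range(n) if j != k)
        # Implied range for the single term a_k * x_k.
        term_lo = row_lo - rest_max
        term_hi = row_hi - rest_min
        if a_k > 0:
            cand_lo, cand_hi = term_lo / a_k, term_hi / a_k
        else:
            cand_lo, cand_hi = term_hi / a_k, term_lo / a_k
        new_lo[k] = max(new_lo[k], cand_lo)
        new_hi[k] = min(new_hi[k], cand_hi)
    return new_lo, new_hi

# x + y <= 4 with x, y in [0, 10] tightens both upper bounds to 4.
print(tighten_from_row([1.0, 1.0], -INF, 4.0, [0.0, 0.0], [10.0, 10.0]))

# 2x - y >= 1 with x in [0, 5], y in [0, 3] raises the lower bound on x.
print(tighten_from_row([2.0, -1.0], 1.0, INF, [0.0, 0.0], [5.0, 3.0]))
```

Repeating such passes over all rows until no bound moves is the fixed-point iteration that presolve_max_passes caps.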

Feasibility-based bound tightening (Phase 1b)

Interval-arithmetic propagation through nonlinear constraint expression DAGs (see FBBT). Available today for .nl-loaded problems via NlTnlp; other TNLP sources opt out silently.

Option	Default	Meaning
`presolve_fbbt`	`no`	Master switch. Requires `presolve=yes` and an `ExpressionProvider`.
`fbbt_tol`	`1e-6`	Minimum per-variable bound improvement to keep iterating.
`fbbt_max_iter`	`10`	Outer-sweep cap.
`fbbt_max_constraints`	`0`	Per-sweep cap on constraints inspected (`0` = unlimited).

Auxiliary-equality preprocessing (Phase 0)

A separate set of options controls the structural elimination pass documented in Auxiliary-Equality Preprocessing:

Option	Default	Meaning
`presolve_auxiliary`	`no`	Master switch for the Phase-0 structural elimination pass.
`presolve_auxiliary_coupling`	`safe`	Which coupling classes are eligible: `none` / `safe` / `aggressive`.
`presolve_auxiliary_tol`	`1e-8`	Residual tolerance for accepting a candidate block solve.
`presolve_auxiliary_max_block_dim`	`8`	Largest block the lightweight Newton solver will attempt (larger blocks rejected in v1).
`presolve_auxiliary_wall_time_fraction`	`0.1`	Fraction of the solver’s wall-time budget the pass is allowed to spend.
`presolve_auxiliary_diagnostics`	`no`	Emit the diagnostics summary via the journalist after Phase 0 runs.

FERAL backend tuning

linear_solver=feral (the default — see Commonly used options) is configurable through six feral_* options. Defaults are tuned for the IPM workload and rarely need changing; reach for these when profiling a specific problem.

Option	Default	Meaning
`feral_ordering`	`auto`	Fill-reducing ordering method (see table below). `auto` lets feral’s adaptive dispatcher pick per-matrix; `auto_race` measures the actual symbolic outcome and keeps the best.
`feral_pivtol`	`1e-8`	Relative Bunch-Kaufman partial-pivoting threshold `u`. Analog of `ma27_pivtol` / `ma57_pivtol`. Smaller → sparser `L`, faster, less stable; larger → more 2×2 blocks, denser, more stable. LAPACK’s textbook maximum-stability value is `0.5`.
`feral_refine`	`yes`	Iterative refinement on every back-solve. Closes the residual floor from cascade-break’s `L`-factor perturbation; disable only when timing the bare factor + back-solve in isolation.
`feral_cascade_break`	(unset)	Tri-state. Unset → inherit feral’s Phase B default (CB on with bounded delayed-pivot catchment). `yes` records explicit intent (no behavioural change). `no` reproduces pre-Phase-B behaviour by surfacing `DelayBudgetExceeded` on non-root cascade victims.
`feral_fma`	`no`	Dispatch dense kernels through fused multiply-add intrinsics. Roughly 2× throughput on aarch64 / x86_v3, at the cost of per-pivot rounding drift that trips more `WrongInertia` checks. Turn on when kernel throughput dominates and the IPM tolerates a noisier inertia signal.
`feral_singular_pivot_floor`	`1e-20`	Pounce’s analog of MA57’s `CNTL(2)`. After a successful factor, the smallest accepted `D`-block pivot magnitude (scaled space) is compared against this absolute floor; if it falls below, the factor is reported `Singular` so the IPM bumps `δ_w`. `0` disables.

`feral_ordering` variants

All seven values, five concrete methods and two adaptive dispatchers, live under the same string option. feral_ordering also falls back to the POUNCE_FERAL_ORDERING environment variable when not set on the OptionsList.

Value	Strategy
`auto`	Default. Adaptive dispatcher: picks a concrete method per matrix from cheap pattern features. Branches: very-large-and-sparse (`n > 100 000`, avg degree < 5) → AMD; `n ≤ 10 000` → AMF; otherwise → MetisND. One symbolic pass; right when the heuristic shape rules apply (the common case).
`auto_race`	Race-based dispatcher: runs full symbolic factorization on AMD, MetisND, ScotchND, KahipND and keeps the smallest `factor_nnz`. ~4× a single symbolic pass, paid once per problem (symbolic factorization is cached across numeric refactorizations with the same pattern). Use when the cheap dispatcher’s guess is suspect — e.g. `pinene_3200_0009`, where `auto` picks MetisND (88 s numeric factor) but `amd` factors in 19.5 s on the same matrix.
`amd`	Approximate Minimum Degree (Amestoy/Davis/Duff). Pins AMD regardless of problem shape; robust default for IPM workloads. Best for very-large-and-sparse cases that the adaptive dispatcher already routes here.
`amf`	Approximate Minimum Fill (HAMF4 variant of Amestoy 1999). Strong on small-and-sparse populations (`n ≤ 10 000`); aggregate fill ≈ 0.87× AMD on feral’s IPM small-sparse inventory.
`metis`	feral-metis multilevel nested dissection. Tends to produce squarer fronts than AMD on banded / nearly-1D structure; preferred for large structured matrices.
`scotch`	feral-scotch nested dissection. Similar regime to METIS; alternative when METIS is unavailable or for cross-validation.
`kahip`	feral-kahip flow-based nested dissection with K1 preprocessing. Ties METIS on fill geomean at 4–6× per-call symbolic cost. Reach for it only when ND fill matters and per-call cost is amortized.

When in doubt, leave feral_ordering at the default. When a hard problem looks linear-solver-bound, try feral_ordering=auto_race before sweeping the variants by hand; it is the safe choice when the per-problem winner is uncertain.
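
The auto branches and the auto_race selection rule described in the table reduce to a few lines. The sketch mirrors only the thresholds quoted above; the function names, the return labels, and the factor_nnz figures are illustrative, and FERAL's real dispatcher may inspect more pattern features than n and the average degree.

```python
def auto_ordering(n, avg_degree):
    """Cheap shape-based pick, following the documented branches."""
    if n > 100_000 and avg_degree < 5:
        return "amd"      # very large and sparse
    if n <= 10_000:
        return "amf"      # small problems
    return "metis"        # everything else: nested dissection

def race_ordering(factor_nnz_by_method):
    """auto_race-style pick: keep the method with the smallest factor_nnz."""
    return min(factor_nnz_by_method, key=factor_nnz_by_method.get)

print(auto_ordering(n=250_000, avg_degree=3.2))
print(auto_ordering(n=4_000, avg_degree=12.0))
print(auto_ordering(n=60_000, avg_degree=3.0))

# Made-up symbolic results for one matrix.
symbolic = {"amd": 1_900_000, "metis": 2_400_000,
            "scotch": 2_350_000, "kahip": 2_380_000}
print(race_ordering(symbolic))
```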

Caller-supplied ordering (`External`)

Beyond the string variants above, a structure-aware caller can inject a precomputed permutation the generic AMD/METIS pass cannot see — a block-triangular / Schur ordering (Parker, Garcia & Bent, arXiv:2602.17968) or a tearing ordering from equation-oriented decomposition. Because a permutation is a vector it cannot travel through the string feral_ordering option; supply it programmatically instead:

Python: Problem.set_ordering(perm) (and get_ordering() / clear_ordering()) — see the Python guide.
Rust: IpoptApplication::set_external_ordering(perm).

perm is a 0-based, new-to-old permutation (perm[k] is the original index that becomes index k) whose length must equal the augmented KKT system dimension (variables + slacks + constraint duals), not the problem’s n. FERAL validates it as a bijection and fails the factorization with an error on a wrong length or a duplicate entry; a valid but poor ordering only costs fill/time, never correctness. This maps to FERAL’s OrderingMethod::External (feral#107) and is honored only by the default FERAL backend.
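Before handing a permutation to the solver it can be worth checking it locally. The sketch below reproduces the bijection check described above in plain Python; the helper names and the problem sizes are hypothetical, and FERAL's own validation may report errors differently.

```python
def validate_new_to_old(perm, dim):
    """Raise ValueError unless perm is a bijection on range(dim)."""
    if len(perm) != dim:
        raise ValueError(f"expected length {dim}, got {len(perm)}")
    seen = [False] * dim
    for new, old in enumerate(perm):
        if not 0 <= old < dim:
            raise ValueError(f"perm[{new}] = {old} is out of range")
        if seen[old]:
            raise ValueError(f"original index {old} appears twice")
        seen[old] = True


def old_to_new(perm):
    """Invert a new-to-old permutation into an old-to-new one."""
    inverse = [0] * len(perm)
    for new, old in enumerate(perm):
        inverse[old] = new
    return inverse


# Hypothetical sizes: the length is the augmented KKT dimension, not n.
n_vars, n_slacks, n_duals = 3, 1, 2
dim = n_vars + n_slacks + n_duals

perm = [5, 0, 3, 1, 4, 2]          # perm[k] = original index placed at k
validate_new_to_old(perm, dim)
print(old_to_new(perm))
```

Tools that produce old-to-new orderings need the inversion step before the vector matches the new-to-old convention.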

Logging and colored output

POUNCE emits structured logs and a colored iteration table through the tracing ecosystem. Behavior is governed by environment variables (not solver options), so they apply to the pounce CLI, the C/Python frontends, and anything embedding the library.

Variable	Values	Effect
`RUST_LOG`	e.g. `info`, `debug`, `pounce::restoration=debug`	Log verbosity / per-target filtering. Default `info`. Logs go to stderr.
`POUNCE_LOG_FORMAT`	`text` (default) · `json`	`json` emits line-delimited JSON on stderr (incl. the per-iteration `pounce::iteration` stream) for Studio / CI ingestion.
`NO_COLOR`	set to any value	Disables ANSI color in the iteration table and logs (see https://no-color.org).
`CLICOLOR_FORCE`	set to any value	Forces color even when stdout is not a terminal.

Filtering by subsystem. Solver internals log under namespaced targets — pounce::algorithm, pounce::linsol, pounce::mu, pounce::sqp, pounce::linesearch, pounce::restoration, pounce::presolve, pounce::py. For example, to trace only the restoration phase:

RUST_LOG=pounce::restoration=debug pounce problem.nl

Program output vs. logs. The iteration table, the final summary, and --dump diagnostics are program output on stdout; diagnostic and progress messages are logs on stderr. Redirecting one does not affect the other:

pounce problem.nl > result.txt 2> solve.log

Color. The iteration table is colored with a tiger/rust theme: restoration lines take a background that varies by restoration kind (soft-stay → tan, soft-exit → amber, hard → deep rust), and the row text shades from black toward red as the primal step length alpha shrinks (stalling). By default color is emitted only when stdout is a terminal, unless CLICOLOR_FORCE is set; redirected output and NO_COLOR get plain text with identical column alignment.

Machine-readable iterations. POUNCE_LOG_FORMAT=json turns the per-iteration records into JSON on stderr:

POUNCE_LOG_FORMAT=json pounce problem.nl 2> iters.jsonl

LP / QP Solver Routing

POUNCE can route linear programs (LP), convex quadratic programs (QP), and convex quadratically-constrained QPs (QCQP) to a specialized interior-point solver (pounce-convex) instead of the general nonlinear (NLP) filter-IPM. The specialized path uses Mehrotra predictor-corrector and reaches the solution in materially fewer iterations on these problem classes — typically 30–50% fewer than the general NLP path on bound- or inequality-constrained convex QPs.

Routing is automatic and transparent: you do not change how you call POUNCE. The same pounce problem.nl, the same SolverFactory('pounce') in Pyomo, and the same AMPL solve all work unchanged — POUNCE inspects the problem and picks the solver.

How routing works

When POUNCE loads a problem it classifies it into one of:

Class	Routed to
LP	convex IPM (`pounce-convex`)
convex QP	convex IPM (`pounce-convex`)
convex QCQP	conic IPM (`pounce-convex`, SOCP)
nonconvex QP	NLP filter-IPM (finds a local minimum)
NLP	NLP filter-IPM

The classifier is conservative: a problem is sent to the convex solver only when POUNCE can prove it is convex — an LP or convex QP (degree-≤2 objective with a positive-semidefinite Hessian, linear constraints), or a convex QCQP (additionally allowing convex-quadratic inequality constraints, each with a positive-semidefinite Hessian and a one-sided ≤ bound, which are reformulated to second-order cones). Anything it cannot prove convex — transcendental terms, an indefinite objective Hessian, a quadratic equality, or a quadratic inequality whose feasible set is nonconvex — falls back to the general NLP solver, which always produces a correct (locally optimal) answer. You never get a wrong “optimum” from a misclassification.

Note on QP detection. The AMPL .nl format has no dedicated quadratic section: a QP’s quadratic terms are written into the nonlinear expression tree. POUNCE walks that tree to recover the Hessian and test convexity, the same way QP-capable AMPL solvers do.
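The convexity test at the heart of the classifier is a positive-semidefiniteness check on the recovered Hessian. The following is a minimal dense sketch of such a check by symmetric elimination; the function names are hypothetical, and POUNCE's actual test (for example, via a sparse factorization) may differ.

```python
def is_psd(matrix, tol=1e-10):
    """Return True if a small symmetric matrix is positive semidefinite.

    Works by Schur-complement elimination: a negative pivot, or a zero
    pivot with a nonzero entry left in its row, proves indefiniteness.
    """
    a = [row[:] for row in matrix]
    n = len(a)
    for k in range(n):
        pivot = a[k][k]
        if pivot < -tol:
            return False
        if pivot <= tol:
            # Zero pivot: PSD requires the rest of this row to vanish.
            if any(abs(a[k][j]) > tol for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return True


def classify_objective(hessian):
    """Classify by objective Hessian only (constraints assumed linear)."""
    if all(abs(v) <= 1e-12 for row in hessian for v in row):
        return "LP"
    return "convex QP" if is_psd(hessian) else "nonconvex QP"


print(classify_objective([[0.0, 0.0], [0.0, 0.0]]))
print(classify_objective([[2.0, 0.0], [0.0, 2.0]]))
print(classify_objective([[1.0, 1.0], [1.0, 1.0]]))   # singular but PSD
print(classify_objective([[0.0, 1.0], [1.0, 0.0]]))   # x0*x1 is indefinite
```

A conservative classifier treats any failure of this test as "not provably convex" and leaves the problem on the NLP path.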

Choosing the solver explicitly

The solver_selection option overrides the automatic choice. It is a normal POUNCE option, so it works on the command line, in an options file, or through Pyomo’s solver.options.

Value	Behavior
`auto`	Default. Route by detected class (table above).
`nlp`	Always use the NLP filter-IPM, regardless of class.
`lp-ipm`	Force the convex IPM; errors if the problem is not an LP.
`qp-ipm`	Force the convex IPM; errors if the problem is not LP/convex-QP.
`socp`	Force the conic IPM; errors if the problem is not a convex QCQP.
`qp-active-set`	Reserved for the active-set QP track; currently falls back to NLP.

# Let POUNCE decide (default):
pounce model.nl

# Force the NLP path even on a convex QP (e.g. to compare):
pounce model.nl solver_selection=nlp

# Insist the problem is a convex QP — fail loudly if it is not:
pounce model.nl solver_selection=qp-ipm

A forced value that does not match the detected class is rejected with a clear message rather than silently ignored:

pounce: problem class NLP does not match forced solver qp-ipm
        (expected an LP or convex QP)

From Pyomo

from pyomo.environ import SolverFactory

solver = SolverFactory('pounce')
solver.options['solver_selection'] = 'qp-ipm'   # or 'auto', 'nlp', ...
solver.solve(model)                             # model: your Pyomo model

What you get back

Before solving, POUNCE prints a one-line routing banner naming the detected class, the solver it selected, and the effective solver_selection — so it is always clear which of POUNCE’s solvers ran and why:

Problem class: LP. Selected solver: convex QP interior-point (pounce-convex) [solver_selection=auto].

(The banner is suppressed alongside the startup banner — sb yes or JSON-debug protocol mode — to keep stdout clean for machine consumers.)

The convex IPM then reports the same way as the NLP path: an optimal-status line, the objective value (in your original sense — a maximize objective and any constant term are reported correctly), and a .sol file with the primal solution when one is requested.

POUNCE (LP IPM, pounce-convex): Optimal Solution Found.
        obj=2.00000000  iters=2

Driver. The convex path uses the homogeneous self-dual embedding (HSDE) interior-point driver — the same self-dual formulation Clarabel/ECOS use. It is self-starting, returns verified infeasibility/unboundedness certificates, and conditions the KKT system internally through its per-cone scaling, so it solves even badly-scaled LPs (e.g. NETLIB nl, ‖c‖ ~ 1e6) without external pre-scaling.

Presolve

Before the convex interior-point solve, POUNCE runs a presolve pass that shrinks the problem and can detect trivial infeasibility or unboundedness without solving. It removes empty, duplicate, and activity-redundant rows; fixes and substitutes structural columns (singleton-row fixings, free columns, free column singletons); and recovers both the primal and dual of the eliminated pieces so the reported solution is for your original problem. When it reduces the model, it logs a one-line summary:

Presolve: 40 → 32 vars, 12 → 8 rows (fixed 3, free-fixed 2, substituted 3)

Presolve is on by default. Turn it off with qp_presolve=no (e.g. to compare timings or isolate a solver issue):

pounce model.nl qp_presolve=no

Scope and limitations

Convex problems only. Nonconvex (indefinite-Hessian) QPs, quadratic equalities, and quadratic inequalities whose feasible set is nonconvex are solved by the NLP path to a local minimum; the LP/QP routing path does not attempt global optimization (for certified global optima of polynomial problems, see Global Optimization).
Convex QCQP (convex-quadratic constraints) routes to the conic IPM: each convex-quadratic inequality ½xᵀQx + aᵀx + b ≤ 0 (with Q ⪰ 0) is reformulated to one second-order cone (Q = FᵀF, so ‖Fx‖² = xᵀQx) and solved alongside the QP objective and linear constraints.
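One standard way to write that reformulation is the rotated-cone identity: with u = −(aᵀx + b), the inequality ½‖Fx‖² ≤ u is equivalent to ‖(√2·Fx, u − 1)‖ ≤ u + 1. The sketch below checks numerically that both forms accept and reject the same points. The data F, a, b are made up for illustration, and the exact cone layout POUNCE builds internally is not specified here.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(F, x):
    return [dot(row, x) for row in F]


def quadratic_feasible(F, a, b, x):
    """Original form: 0.5 * x'Qx + a'x + b <= 0 with Q = F'F."""
    Fx = matvec(F, x)
    return 0.5 * dot(Fx, Fx) + dot(a, x) + b <= 0.0


def soc_feasible(F, a, b, x):
    """Cone form: ||(sqrt(2) F x, u - 1)|| <= u + 1, u = -(a'x + b)."""
    Fx = matvec(F, x)
    u = -(dot(a, x) + b)
    norm = math.sqrt(2.0 * dot(Fx, Fx) + (u - 1.0) ** 2)
    return norm <= u + 1.0


# Illustrative data (not from POUNCE).
F = [[1.0, 0.0], [1.0, 1.0]]
a = [-1.0, 0.0]
b = -1.0

for point in ([0.0, 0.0], [1.0, -1.0], [2.0, 2.0], [-3.0, 0.0]):
    print(point, quadratic_feasible(F, a, b, point), soc_feasible(F, a, b, point))
```

Squaring both sides of the cone inequality gives 2‖Fx‖² ≤ 4u, which is the original constraint, so the cone has dimension rank(F) + 2.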

Both the primal solution and the constraint duals are written to the .sol file, in the same sign convention as POUNCE’s NLP path (so Pyomo and AMPL read them identically regardless of which solver ran).

Infeasible and unbounded problems

The convex solver detects infeasibility and unboundedness directly, reporting a clean status instead of exhausting the iteration budget:

Primal infeasible — no point satisfies the constraints. Reported with AMPL solve_result_num 200.
Unbounded (dual infeasible) — the objective decreases without bound along a feasible direction. Reported with solve_result_num 300.

Each verdict is backed by a verified certificate (a Farkas infeasibility proof or an unbounded recession direction that is checked, not merely inferred), so these statuses are never reported in error; a problem the solver cannot certify simply runs to the iteration limit.
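To make "checked, not merely inferred" concrete, here is a minimal sketch of verifying a Farkas certificate for the simplified system A x ≤ b. A vector y ≥ 0 with Aᵀy = 0 and bᵀy < 0 proves that no x can satisfy the system. The function name and tolerance are assumptions, and the certificate POUNCE checks also covers equalities, bounds, and cones.

```python
def is_farkas_certificate(A, b, y, tol=1e-9):
    """Check y >= 0, A'y = 0, b'y < 0 for the system A x <= b."""
    rows, cols = len(A), len(A[0])
    if any(yi < -tol for yi in y):
        return False
    aty = [sum(A[i][j] * y[i] for i in range(rows)) for j in range(cols)]
    if any(abs(v) > tol for v in aty):
        return False
    return sum(bi * yi for bi, yi in zip(b, y)) < -tol


# x <= 1 and x >= 2 (written as -x <= -2) cannot both hold.
A = [[1.0], [-1.0]]
b = [1.0, -2.0]

print(is_farkas_certificate(A, b, [1.0, 1.0]))   # adds the rows: 0 <= -1
print(is_farkas_certificate(A, b, [1.0, 0.0]))   # not a certificate
```

Adding the rows weighted by y yields 0 ≤ bᵀy < 0, a contradiction, which is why a verified certificate cannot be a false alarm.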

The design and roadmap live in dev-notes/lp-qp-routing.md.

Convex Solver: LP, QP, and SOCP

POUNCE ships a specialized convex conic interior-point solver (pounce-convex) alongside the general NLP filter-IPM. It solves the standard-form convex program

minimize    ½ xᵀP x + cᵀx
subject to  A x = b
            G x ⪯_K h
            lb ≤ x ≤ ub

where P ⪰ 0 and the inequality block lies in a product cone K of nonnegative orthants and second-order cones. P = 0 is an LP; an all-orthant K is an LP/QP; second-order blocks make it an SOCP.

The method is a Mehrotra predictor–corrector primal–dual interior-point algorithm with Nesterov–Todd scaling for the cones, sharing the pure-Rust feral sparse LDLᵀ backend with the NLP path. It reaches optimality in materially fewer iterations than routing the same problem through the general NLP solver (≈30–50% fewer on bound/inequality QPs).

Inspiration. The conic interior-point design follows Clarabel (Goulart & Chen) — handling a quadratic objective directly and a product of symmetric cones — and the presolve follows PaPILO (the presolving library of SCIP). POUNCE does not wrap either (the pure-Rust guarantee) but ports their ideas; see Acknowledgments.

This chapter covers the Python API (pounce.qp and the differentiable pounce.jax layers). For automatic CLI/Pyomo routing of .nl LPs/QPs, see LP / QP Solver Routing. Runnable, progressive notebooks live in python/notebooks/: 15_convex_qp.ipynb, 16_socp.ipynb, 17_differentiable_convex.ipynb.

Quadratic programs

import numpy as np
from pounce.qp import solve_qp

# min ½·2‖x‖² − 3x₀ − 4x₁  s.t.  x₀ + x₁ ≤ 1,  0 ≤ x ≤ 1
r = solve_qp(
    P=np.diag([2.0, 2.0]),
    c=[-3.0, -4.0],
    G=[[1.0, 1.0]], h=[1.0],
    lb=[0, 0], ub=[1, 1],
)
r.status   # 'optimal'
r.x        # primal solution
r.y, r.z   # equality / inequality multipliers
r.z_lb, r.z_ub  # bound multipliers (≥ 0)
r.obj, r.iters

P (lower triangle used, assumed symmetric), A, and G accept dense arrays or scipy-sparse matrices; any of them may be omitted. The result is a QpResult dataclass with a .success property. The solver reports verified infeasibility / unboundedness ('primal_infeasible' / 'dual_infeasible') backed by a Farkas / recession certificate rather than an iteration-limit guess.
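For the example above the optimum can be worked out by hand, which gives a useful sanity check on any result. The objective is x₀² + x₁² − 3x₀ − 4x₁; its unconstrained minimizer (1.5, 2) violates x₀ + x₁ ≤ 1, so the constraint is active, and minimizing along x₀ + x₁ = 1 gives x = (0.25, 0.75). The sketch below verifies the KKT conditions at that point in plain Python. These values are hand-derived rather than solver output, and the multiplier sign convention (z ≥ 0 for Gx ≤ h) is an assumption.

```python
P = [[2.0, 0.0], [0.0, 2.0]]
c = [-3.0, -4.0]
G = [[1.0, 1.0]]
h = [1.0]

x = [0.25, 0.75]   # hand-derived candidate optimum
z = [2.5]          # multiplier of x0 + x1 <= 1 (bounds are inactive)

# Stationarity: P x + c + G' z = 0
stationarity = [
    sum(P[i][j] * x[j] for j in range(2)) + c[i] + G[0][i] * z[0]
    for i in range(2)
]
# Primal feasibility and complementarity for the inequality row.
slack = h[0] - sum(G[0][j] * x[j] for j in range(2))
complementarity = z[0] * slack

objective = 0.5 * sum(
    x[i] * P[i][j] * x[j] for i in range(2) for j in range(2)
) + sum(c[i] * x[i] for i in range(2))

print(stationarity, slack, complementarity, objective)
```

Because the problem is convex, a point that satisfies these conditions is the global optimum, with objective −3.125.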

Second-order cone programs

A second-order (Lorentz) cone is { (t, x) : t ≥ ‖x‖₂ }. Partition the inequality rows of Gx ⪯_K h with cones — a list of (kind, dim) specs ("nonneg" or "soc"; a bare int means a second-order cone). Each slack block s = h − Gx must lie in its cone.

import numpy as np
from pounce.qp import solve_socp

# minimize ‖x − x*‖  ⇔  min t s.t. (t, x − x*) ∈ SOC
r = solve_socp(
    c=[1.0, 0.0, 0.0],                 # minimize t
    G=-np.eye(3), h=[0.0, -2.0, 1.0],  # s = (t, x₀−2, x₁+1) ∈ SOC(3)
    cones=[("soc", 3)],
)
r.x   # ≈ [0, 2, -1]:  t* = 0, x = x*

Mixed cones compose — e.g. cones=[("nonneg", 1), ("soc", 2)] puts the first slack in ℝ₊ and the next two in a 2-D second-order cone. Large cones use a sparse diagonal-plus-rank-1 KKT representation (one auxiliary variable per cone, the ECOS/Clarabel “sparse SOC” trick) so the factorization stays sparse.

Warm starting

Feed a previous (or nearby) solution back to seed the interior-point iteration — useful for parametric sweeps, receding-horizon MPC, and branch-and-bound subproblems:

base = solve_qp(P=P, c=c, G=G, h=h, lb=lb, ub=ub)
nxt  = solve_qp(P=P, c=c2, G=G, h=h, lb=lb, ub=ub, warm_start=base)

The warm start only affects the iteration count, never the solution (a mismatch is ignored). The recentering is adaptive for the orthant (sized to the warm point’s KKT residual, so it exploits a nearby problem’s duals yet self-corrects when the active set moves) and re-centers the cone duals for second-order blocks (a converged conic point sits on the cone boundary, where the scaling is singular).

Batching and factorization reuse

from pounce.qp import solve_qp_batch, QpFactorization

# Solve many independent QPs in parallel (rayon, across instances).
results = solve_qp_batch([dict(P=P, c=c_k, G=G, h=h) for c_k in cs])

# Build the KKT symbolic factor once, solve many same-structure problems.
fac = QpFactorization(P=P, c=c0, G=G, h=h, lb=lb, ub=ub)
for c_k in cs:
    rk = fac.solve(P=P, c=c_k, G=G, h=h, lb=lb, ub=ub)  # reuses the factor

solve_qp_batch parallelizes across instances (outer-parallel / inner-serial) and QpFactorization reuses the AMD ordering and symbolic factorization across solves that share a structure — the two compose with warm starting.

Presolve (PaPILO-inspired)

Before the interior-point solve, POUNCE can apply a transaction-stack presolve with full primal and dual postsolve, modeled on PaPILO. The catalog:

empty / duplicate / parallel (scalar-multiple) rows,
fixed-variable elimination (singleton equalities),
free columns and free-column singletons,
activity-based redundancy and infeasibility detection,
forcing constraints (a row at its activity extreme pins its variables),
dominated columns (sign-definite columns optimal at a bound),
bound tightening (domain propagation), with the active-bound multiplier re-attributed to its source row in postsolve,

iterated to a fixpoint so reductions cascade. Each reduction carries the data to reverse itself, and the postsolve reconstructs a valid KKT point of the original problem — the dual recovery is the contract, and is verified by KKT-residual tests. A cone-aware variant (presolve_conic) gates the ≤-row reductions off second-order-cone blocks (which are coupled) and recovers the reduced cone partition.
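The transaction-stack idea can be illustrated with a single reduction. The sketch below applies only singleton-equality fixing and empty-row removal to a set of equality rows, loops until nothing changes, and then undoes the stack to rebuild the full primal solution. It is a toy under stated assumptions: the data layout and function names are hypothetical, and it recovers primal values only, whereas the real presolve also recovers duals.

```python
def presolve(rows, rhs, tol=1e-12):
    """rows: list of {var: coef} equality rows; rhs: matching right-hand sides.

    Returns (reduced_rows_with_rhs, stack); raises ValueError if infeasible.
    """
    rows = [dict(r) for r in rows]
    rhs = list(rhs)
    alive = [True] * len(rows)
    stack = []                       # one record per reduction, in order
    changed = True
    while changed:                   # iterate to a fixpoint
        changed = False
        for i in range(len(rows)):
            if not alive[i]:
                continue
            row = rows[i]
            if not row:              # empty row: 0 = rhs
                if abs(rhs[i]) > tol:
                    raise ValueError("infeasible: empty row with nonzero rhs")
                alive[i] = False
                changed = True
            elif len(row) == 1:      # singleton equality fixes its variable
                (var, coef), = row.items()
                value = rhs[i] / coef
                stack.append((var, value))
                alive[i] = False
                for k in range(len(rows)):
                    if alive[k] and var in rows[k]:
                        rhs[k] -= rows[k].pop(var) * value
                changed = True
    reduced = [(rows[i], rhs[i]) for i in range(len(rows)) if alive[i]]
    return reduced, stack


def postsolve(reduced_solution, stack):
    """Undo the reductions in reverse order to rebuild the full primal point."""
    x = dict(reduced_solution)
    for var, value in reversed(stack):
        x[var] = value
    return x


rows = [{"x0": 2.0}, {"x0": 1.0, "x1": 1.0}, {"x1": 1.0, "x2": 1.0, "x3": 1.0}]
rhs = [4.0, 5.0, 10.0]
reduced, stack = presolve(rows, rhs)
print(reduced, stack)
print(postsolve({"x2": 3.5, "x3": 3.5}, stack))
```

Fixing x0 turns the second row into a singleton, which in turn fixes x1: this is the cascade that iterating to a fixpoint is meant to capture.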

Presolve is applied automatically on the CLI LP/QP route; it lives in pounce-convex::presolve for Rust callers. See LP / QP Solver Routing.

Differentiable convex layers (JAX)

pounce.jax exposes the solve as a differentiable JAX op via the implicit-function theorem on the KKT system at the optimum (Amos & Kolter, OptNet, 2017). The forward pass calls the solver; the backward pass is a single linear solve through the same KKT matrix.

import jax, jax.numpy as jnp
from pounce.jax import solve_qp, solve_socp, QpLayer

# x*(c) for a parametric QP, differentiable w.r.t. all of P, c, G, h, A, b.
def loss(c):
    x = solve_qp(P=P, c=c, G=G, h=h)
    return jnp.sum((x - target) ** 2)

grad_c = jax.grad(loss)(c0)        # exact gradient via implicit diff
J = jax.jacrev(lambda c: solve_qp(P=P, c=c, G=G, h=h))(c0)

Gradients are provided w.r.t. every parameter that enters through the optimum: c, b, h, and the matrices P, G, A (the full OptNet matrix derivatives; ∇P is the symmetric gradient).
solve_socp differentiates SOCPs too — the complementarity row uses the cones’ arrow operators in place of the orthant’s diagonal.
QpLayer captures a fixed P/G/A structure for use inside a larger JAX model, with jax.grad / jacrev / vmap and a parallel .batch.
A warm start may be passed through (non-differentiated — it cannot change the solution or its gradients, only the iteration count).

All gradients are validated against finite differences in the test suite.
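The "single linear solve" in the backward pass can be seen on an equality-constrained QP, where the KKT conditions are linear: K·(x, y) = (−c, b) with K = [[P, Aᵀ], [A, 0]]. For a loss L(x*), the gradient with respect to c is −d_x, where K·d = (∂L/∂x, 0). The sketch below is plain Python, not pounce.jax: the dense solver, data, and names are all illustrative, and inequality constraints (which add complementarity rows) are left out. It compares the implicit gradient to central finite differences.

```python
def solve(K, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(K)
    m = [K[i][:] + [rhs[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n + 1):
                m[r][j] -= factor * m[col][j]
    out = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = m[r][n] - sum(m[r][j] * out[j] for j in range(r + 1, n))
        out[r] = acc / m[r][r]
    return out


# KKT matrix for P = diag(2, 4), A = [1, 1]; the constraint is x0 + x1 = 1.
K = [[2.0, 0.0, 1.0], [0.0, 4.0, 1.0], [1.0, 1.0, 0.0]]
b = [1.0]
target = [0.0, 0.0]


def x_star(c):
    return solve(K, [-c[0], -c[1], b[0]])[:2]


def loss(c):
    return sum((xi - ti) ** 2 for xi, ti in zip(x_star(c), target))


def grad_loss(c):
    """Backward pass: one solve with the (symmetric) KKT matrix."""
    g = [2.0 * (xi - ti) for xi, ti in zip(x_star(c), target)]
    d = solve(K, g + [0.0])
    return [-d[0], -d[1]]


def finite_difference(c, eps=1e-5):
    grads = []
    for i in range(len(c)):
        up = c[:]; up[i] += eps
        dn = c[:]; dn[i] -= eps
        grads.append((loss(up) - loss(dn)) / (2.0 * eps))
    return grads


c0 = [1.0, -2.0]
print(grad_loss(c0))
print(finite_difference(c0))
```

The backward solve reuses the forward matrix, so a factorization computed for the solve can in principle serve both passes.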

Global Optimization

Most of POUNCE settles a problem at a local optimum (the NLP filter-IPM and SQP) or exploits convexity so that local is global (the convex/conic IPM). For a genuinely nonconvex problem, the path to a certified global optimum that ships in this release is for polynomials:

The SOS / Lasserre hierarchy (pounce-convex) — for polynomial problems, via a single semidefinite program. Callable from Rust (sos_minimize) and Python (pounce.sos_minimize).

It returns a result that is certified: a lower bound together with a moment certificate that, when exact, pins the global minimum and recovers its minimizer(s).

A second path — general-purpose spatial branch-and-bound (pounce-global) for factorable nonconvex NLPs with exp/ln/trig — is in development on the feature/global branch and is not part of this release. It is described at the end of this chapter for context, but there is no pounce-global crate in the shipped workspace, no pounce.minimize_global Python entry point, and no --solver global CLI route here.

The SOS / Lasserre path (polynomials)

When the objective and constraints are polynomials, the sum-of-squares / moment approach in pounce-convex certifies the global minimum from a single semidefinite program — no branching — by searching for the largest γ such that p(x) − γ lies in the Putinar cone (a sum of squares plus constraint multipliers). The SDP is solved by POUNCE’s own convex conic interior-point method; flat truncation of the resulting moment matrix certifies when the bound is exact, and a facial-reduction step recovers every global minimizer — even when the optimum is attained at several points.

From Python, a polynomial is a dict mapping an exponent tuple to its coefficient (the all-zeros key is the constant term):

from pounce.sos import sos_minimize

# x**4 - 2 x**2 + 3  ->  global minimum 2, attained at BOTH x = +1 and x = -1
r = sos_minimize({(4,): 1.0, (2,): -2.0, (0,): 3.0})
r.lower_bound       # ≈ 2.0
r.is_exact          # True — flat-truncation certificate: the bound is the minimum
r.minimizers        # both x = +1 and x = -1

Constraints are polynomials too, passed as inequalities (g_i(x) ≥ 0) and equalities (h_j(x) = 0); raise the relaxation order to tighten the bound (the Lasserre hierarchy) at the cost of a larger SDP. A runnable walkthrough — double well, a constrained problem, and a 2-D example — is in 18_sos_global_optimization.ipynb.
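For the double-well example the certificate can be written down by hand: x⁴ − 2x² + 3 − 2 = (x² − 1)², a single square, so γ = 2 is a valid lower bound, and it is attained where the square vanishes, at x = ±1. The sketch below evaluates a polynomial in the dict format described above and checks that identity pointwise. It does not call POUNCE; the evaluate helper is hypothetical.

```python
def evaluate(poly, point):
    """Evaluate a polynomial given as {exponent_tuple: coefficient}."""
    total = 0.0
    for exponents, coef in poly.items():
        term = coef
        for value, power in zip(point, exponents):
            term *= value ** power
        total += term
    return total


p = {(4,): 1.0, (2,): -2.0, (0,): 3.0}
gamma = 2.0

# p(x) - gamma should equal the square (x**2 - 1)**2 everywhere.
samples = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
residuals = [
    evaluate(p, (x,)) - gamma - (x * x - 1.0) ** 2 for x in samples
]
print(max(abs(r) for r in residuals))
print(evaluate(p, (1.0,)), evaluate(p, (-1.0,)))
```

The SDP automates the search for such a decomposition when it is not obvious by inspection.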

The same solver from Rust:

use pounce_convex::{sos_minimize, PolyProblem, Polynomial};
use pounce_feral::FeralSolverInterface;
use pounce_linsol::SparseSymLinearSolverInterface;

// Factory for the pure-Rust FERAL linear-solver backend.
fn backend() -> Box<dyn SparseSymLinearSolverInterface> {
    Box::new(FeralSolverInterface::new())
}

fn main() {
    // x⁴ − 2x² + 3 → global minimum 2 at x = ±1.
    let p = Polynomial::new(1, vec![(vec![4], 1.0), (vec![2], -2.0), (vec![0], 3.0)]);
    let sol = sos_minimize(&PolyProblem::new(p), None, backend);
    // sol.lower_bound ≈ 2; when the moment matrix is flat, sol.minimizers holds
    // the global minimizer(s): here both x = +1 and x = −1.
}

The full treatment lives in the pounce_convex::sos module documentation.

When SOS fits: polynomials of modest degree and dimension — one SDP, recovers all global minimizers, but the SDP grows with the relaxation order. For general factorable problems (exp/ln/trig), or polynomials where the SDP would be too large, the tool is spatial branch-and-bound — which is still in development (below).

Spatial branch-and-bound (in development)

Not in this release. Everything in this section describes the pounce-global crate as it exists on the feature/global branch. It is not in the shipped workspace, and the Rust snippets below will not compile against the published crates. There is no Python or CLI binding for it in this release. The section is kept for design context and to set expectations for what the general nonconvex path will look like.

The problem

minimize    f(x)
subject to  cl_j ≤ g_j(x) ≤ cu_j        (j = 0 … m−1)
            x_lo ≤ x ≤ x_hi

f and the g_j are factorable — built from + − × ÷, integer powers, √, exp, ln, |·|, sin, and cos. A bounded box is required (the relaxation needs finite bounds).

The idea

Branch-and-bound brackets the global optimum between a lower bound (valid over a region) and an upper bound (the value of some feasible point), then subdivides the search region until the two meet. The whole game is making the lower bound tight enough, fast enough.

For each node — a box [lo, hi] — the solver:

Tightens the box. Feasibility-based bound tightening (FBBT) propagates interval bounds through each constraint; optimization-based bound tightening (OBBT) then minimizes and maximizes each variable over the relaxation (with an incumbent cutoff). Either may prove the box empty, in which case it is pruned.
Computes a lower bound. A convex relaxation of the problem over the box — built so that it underestimates f and contains every feasible point — is solved as a linear program through pounce-convex. Its optimum is a valid lower bound. Crucially the relaxation is exact in the limit of a zero-width box, so as branching shrinks boxes the bound converges to the truth.
Improves the incumbent. Feasible points are probed (the relaxation solution, the box center) and polished with a local NLP solve (pounce-algorithm), giving a sharp upper bound.
Branches. The variable with the largest relaxation violation (the one whose nonconvexity is driving the gap) is split at the relaxation point — falling back to the widest box side when nothing is violated — and the two child boxes join a best-first frontier ordered by node lower bound.

The search stops when the frontier’s lowest bound meets the incumbent within tolerance — at which point the incumbent is the certified global optimum.
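The loop above can be sketched in one dimension. The toy below is not pounce-global: it swaps the convex LP relaxation for a plain interval bound (the weak fallback mentioned later in this chapter), skips bound tightening and the local NLP polish, always bisects at the midpoint, and uses ordinary floating point without outward rounding. It keeps the parts that define the method: a best-first frontier ordered by lower bound, an incumbent, and a stop test on the gap.

```python
import heapq


def f(x):
    return x ** 4 - 2.0 * x ** 2 + 3.0


def lower_bound(lo, hi):
    """Interval lower bound of f on [lo, hi]; exact as hi - lo goes to 0."""
    sq_min = 0.0 if lo <= 0.0 <= hi else min(lo * lo, hi * hi)
    sq_max = max(lo * lo, hi * hi)
    # x**4 = (x**2)**2 is smallest at sq_min; -2*x**2 is smallest at sq_max.
    return sq_min ** 2 - 2.0 * sq_max + 3.0


def branch_and_bound(lo, hi, tol=1e-4, max_nodes=100_000):
    best_x = 0.5 * (lo + hi)
    ub = f(best_x)                                   # incumbent value
    frontier = [(lower_bound(lo, hi), lo, hi)]       # best-first heap
    nodes = 0
    while frontier and frontier[0][0] < ub - tol and nodes < max_nodes:
        _, a, b = heapq.heappop(frontier)
        nodes += 1
        mid = 0.5 * (a + b)
        if f(mid) < ub:                              # improve the incumbent
            best_x, ub = mid, f(mid)
        for c, d in ((a, mid), (mid, b)):            # branch
            child_lb = lower_bound(c, d)
            if child_lb < ub - tol:                  # otherwise prune
                heapq.heappush(frontier, (child_lb, c, d))
    certified = not frontier or frontier[0][0] >= ub - tol
    return best_x, ub, nodes, certified


best_x, ub, nodes, certified = branch_and_bound(-2.0, 2.0)
print(best_x, ub, nodes, certified)
```

When certified is true, every discarded box had a lower bound within tol of the incumbent, so the global minimum lies in [ub − tol, ub]. The weak bound is why this toy needs far more nodes than a relaxation-based solver would.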

// On the `feature/global` branch — not in this release.
use pounce_global::{expr::var, solve_global, GlobalProblem, GlobalOptions, GlobalStatus};
use pounce_feral::FeralSolverInterface;

// Six-hump camel — six local minima, two global (value ≈ −1.0316).
let x = var(0);
let y = var(1);
let f = 4.0 * x.clone().powi(2) - 2.1 * x.clone().powi(4) + (1.0 / 3.0) * x.clone().powi(6)
    + x.clone() * y.clone() - 4.0 * y.clone().powi(2) + 4.0 * y.powi(4);

let prob = GlobalProblem::new(vec![-2.0, -1.5], vec![2.0, 1.5], &f);
let sol = solve_global(&prob, &GlobalOptions::default(),
                       || Box::new(FeralSolverInterface::new()));

assert_eq!(sol.status, GlobalStatus::Optimal);
// sol.objective ≈ −1.0316  (a certified global minimum, not just a local one)
// sol.lower_bound brackets it; sol.gap() is the optimality gap; sol.nodes the
// branch-and-bound node count.

Constraints use the same expression DSL (.ge, .le, .equality, and .subject_to(g, lo, hi)), and an infeasible problem returns GlobalStatus::Infeasible with a proof. A constrained example:

let obj = var(0) + var(1);
let g = var(0) * var(1);
// min x + y  s.t.  x·y ≥ 4 on [1,5]²  → 4 at (2,2)
let prob = GlobalProblem::new(vec![1.0, 1.0], vec![5.0, 5.0], &obj).ge(&g, 4.0);

The relaxation suite

The lower bound is everything, and POUNCE’s is built term by term over the factorable expression tape (the same FbbtTape representation FBBT uses), with the techniques a state-of-the-art global solver uses:

Component	Role
Tight univariate envelopes	The exact convex/concave hull of each atom (`xⁿ`, `√`, `exp`, `ln`, `sin`, `cos`, `|·|`).
McCormick	The exact convex hull of each bilinear product.
Sandwich cuts	After the LP solve, tangent cuts are added at the solution for loose atoms and the LP re-solved — tightening the bound without branching.
OBBT	Optimization-based bound tightening: the single biggest box reducer.
αBB	A convex underestimator of the whole objective, from a rigorous interval-Hessian spectral shift (`α ≥ max(0, −½λ_min)`), complementing the term-wise relaxation.
RLT	Level-1 reformulation-linearization: each affine constraint times each variable bound factor, linearized with shared product columns.
Multilinear	A 3-way product `x·y·z` is relaxed by intersecting all three bilinear groupings, not just the one nested grouping.

Each is a verified global under/over-estimator — so any of them can be turned on or off without affecting correctness, only the bound’s tightness (and the node count). On the six-hump camel, the envelope engine alone certifies in 287 nodes; adding sandwich cuts brings it to ~220, and OBBT to ~60.
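As one concrete instance from the table, the McCormick envelope of a bilinear term w = x·y over a box is the pair of linear underestimators and the pair of linear overestimators below. The sketch evaluates them in plain Python and shows the gap shrinking with the box; the function name is hypothetical, and the code is independent of pounce-global.

```python
def mccormick_bounds(x, y, xl, xu, yl, yu):
    """Return (under, over) with under <= x*y <= over on the box."""
    under = max(xl * y + x * yl - xl * yl,
                xu * y + x * yu - xu * yu)
    over = min(xu * y + x * yl - xu * yl,
               xl * y + x * yu - xl * yu)
    return under, over


# Box [1, 5] x [1, 5], as in the constrained example above.
under, over = mccormick_bounds(3.0, 3.0, 1.0, 5.0, 1.0, 5.0)
print(under, 3.0 * 3.0, over)

# A smaller box around the same point gives a much tighter bracket.
under_small, over_small = mccormick_bounds(3.0, 3.0, 2.5, 3.5, 2.5, 3.5)
print(under_small, 3.0 * 3.0, over_small)
```

At the box center the underestimator gap equals a quarter of the box area, so halving both sides cuts it by four. This is the "exact in the limit of a zero-width box" property that makes branching converge.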

Tuning

GlobalOptions exposes the gap tolerances and every relaxation knob:

Field	Default	Meaning
`abs_gap`, `rel_gap`	`1e-6`	stop when `ub − lb` clears either tolerance
`feas_tol`	`1e-6`	constraint tolerance for accepting an incumbent
`box_tol`	`1e-7`	stop branching a box this narrow
`max_nodes`	`5000`	node budget (else `NodeLimit`, with bound + incumbent)
`local_solve_iters`	`50`	IPM iteration cap for the NLP upper-bound polish (`0` off)
`sandwich_rounds`	`4`	cutting-plane rounds per node (`0` off)
`obbt_passes`	`2`	OBBT sweeps per node (`0` off — costly: `2n` LP solves/pass)
`alphabb_cuts`	`1`	αBB tangent planes added to the objective (`0` off)
`rlt`	`true`	level-1 RLT cuts
`multilinear`	`true`	multi-grouping trilinear relaxation
`branching`	`MostViolation`	branching rule: `Widest`, `MostViolation`, or `Reliability`
`parallel`	`false`	run OBBT’s `2n` solves on a thread pool (deterministic)
`threads`	`1`	`> 1` runs the parallel node pool (non-deterministic order)
`fbbt`	—	FBBT configuration

The branching rule (BranchRule) chooses the variable to split: Widest (box geometry), MostViolation (the variable whose nonconvexity drives the relaxation gap — the default), or Reliability (pseudocosts learned from child solves, with strong branching until a variable’s pseudocost is reliable — the MILP/MINLP SOTA rule). Because OBBT tightens every node here, the relaxation is usually tight enough that the rule is second-order; reliability is most useful on larger problems where variable choice dominates the node count.

The defaults aim for robustness on small problems. OBBT dominates the per-node cost; turn obbt_passes down (or off) on larger problems where the LP solves outweigh the node savings.

There are two opt-in forms of parallelism:

parallel = true parallelizes OBBT’s 2n independent solves per pass on a thread pool, deterministically (the same nodes and optimum as serial, only faster). On a 7-variable problem it cut wall-clock time by ≈2.3× on 14 cores; the speedup is sub-linear because the relaxation build, sandwich cuts, αBB, RLT, the local NLP solve, and branching remain serial within a node.
threads > 1 runs the node pool: workers pull whole frontier nodes and process them concurrently (OBBT stays serial inside each worker). This is coarser-grained and the larger speedup, but non-deterministic — the certified optimum and gap are unchanged, yet the node count varies run to run (parallel best-first explores some nodes a serial run would have pruned). On a small 5-variable problem it was ≈2.6× on 14 cores (≈40 nodes — too few to saturate the cores); it scales further as the tree widens.

Honest limits

On the feature/global branch, pounce-global is a complete, correct continuous global solver. It is not yet at commercial-solver scale (and, as noted, not yet wired into a shipped release):

Continuous only — no integer branching (MINLP).
Branching offers widest, most-violation (default), and reliability (pseudocost + strong branching) rules; with OBBT every node the rule is usually second-order here, so it is a tunable knob rather than a fixed win.
Atoms outside the supported set, sin/cos over a box spanning more than a few full periods, and division by an interval straddling zero fall back to the (valid but weak) interval box bound, which branching sharpens. (sin/cos over a box wider than π but within a few periods now gets a valid sloped relaxation rather than the bare box.)

For the classes it does cover, the answer is global and certified.

Solution Output

The `.sol` file

Following the AMPL solver convention, solving a positional .nl file writes a sibling <stub>.sol next to it — pounce problem.nl produces problem.sol. The file carries the primal x and dual lambda blocks plus an objno line with the AMPL solve_result_num, so AMPL (or any .sol reader) can pull the solution back:

pounce problem.nl                       # writes problem.sol
pounce problem.nl --sol-output out.sol  # write to an explicit path
pounce problem.nl --no-sol              # skip the .sol write

A .sol is written even when the solve fails, so the solve_result_num is always recoverable. Built-in problems (--problem …) have no .nl stub, so they only produce a .sol when --sol-output is given explicitly.

Choosing an output format

You want…	Use
AMPL / Pyomo to read the result back	the `.sol` file (default)
A structured, schema-versioned report for tooling	`--json-output` (see JSON Solve Report)
Just the console summary	`--no-sol`

The .sol and JSON outputs are not exclusive — you can request both in the same run.

JSON Solve Report

Pass --json-output PATH to write a structured solve report alongside the regular console output:

pounce problem.nl --json-output result.json
pounce problem.nl --json-output result.json --json-detail full

The report carries everything an AMPL .sol file holds — status, primal x, dual lambda, suffix blocks — plus FAIR-aligned provenance metadata (Wilkinson et al. 2016, DOI 10.1038/sdata.2016.18) and, optionally, the per-iteration trajectory.

Detail levels

Level	Emits
`summary` (default)	FAIR metadata, problem dimensions, final solution, aggregate statistics.
`full`	The above plus the per-iteration trajectory (`iter`, `objective`, `inf_pr`, `inf_du`, `mu`, step norms, alphas, line-search trials) and sensitivity / suffix blocks.

Choose summary for production logs and batch runs, and full for debugging; full is the JSON equivalent of upstream Ipopt’s print_level=8.

Schema stability

The schema is versioned (pounce.solve-report/v1) so downstream tooling can pin against a major version:

Adding fields is non-breaking — consumers must tolerate unknown fields.
Removing or renaming a field bumps the major version (v1 → v2).

The Schema v1 Reference documents every field, the FAIR mapping, and the versioning policy in full.

POUNCE solve-report schema, v1

Schema tag: pounce.solve-report/v1

This document is the canonical reference for the JSON solve report emitted by pounce --json-output PATH and pounce_sens --json-output PATH. The report carries everything an AMPL .sol file holds — status, primal x, dual lambda, suffix blocks — plus FAIR-aligned provenance metadata and (optionally) the per-iteration trajectory.

Implementation: the serde structs live in crates/pounce-solve-report/src/lib.rs (per-iteration IterRecord in crates/pounce-nlp/src/solve_statistics.rs); crates/pounce-cli/src/solve_report.rs wires them to the CLI.

Why a structured solve report?

Production NLP workflows often need to (a) capture which solve produced which numbers for audit / reproducibility, (b) feed solver output into downstream tooling (notebooks, dashboards, ML pipelines) that should not have to parse a free-form .sol file, and (c) compare runs across versions of pounce. Upstream Ipopt’s stdout summary was designed for human consumption and AMPL’s .sol for AMPL’s reader; neither carries provenance metadata, neither is schema-versioned, and neither is trivially machine-parseable across ecosystems.

A versioned JSON schema with FAIR-aligned provenance solves all three.

FAIR alignment

The fair_metadata block maps onto the four FAIR principles (Wilkinson et al. 2016, “The FAIR Guiding Principles for scientific data management and stewardship”, Scientific Data 3, 160018, DOI 10.1038/sdata.2016.18; citation verified via Crossref on 2026-05-14):

Principle	Mapping in this schema
Findable	`result_id` (`<unix_nanos>-<pid>`, globally unique and time-ordered), `created_at_iso`, `created_at_unix_nanos`.
Accessible	Plain-text JSON on disk; no protocol gating; UTF-8. Same trust model as the `.sol` file.
Interoperable	Schema-versioned (`pounce.solve-report/v1`); JSON primitives only (no binary blobs); units documented per-field below; `solution.status` is the enum-variant string for cross-language consumption.
Reusable	`solver` (name + version + git commit + target triple), `license`, `input` (kind + path + size) capture enough provenance to reproduce a solve.

Versioning policy

schema is the version tag. Compatibility rules:

Adding fields is non-breaking. Consumers MUST tolerate unknown fields. New optional fields land between versions; the major version doesn’t bump.
Removing or renaming fields bumps the major version (v1 → v2). Consumers should pin against a major version (schema starts_with "pounce.solve-report/v1").
Changing field semantics without a rename is forbidden. If semantics need to change, add a new field and deprecate the old.

The pre-1.0 phase of POUNCE itself does NOT relax this rule for the schema. Once a solve-report version ships, its field set is frozen even while the rest of the solver is under churn.

Top-level shape

{
  "schema": "pounce.solve-report/v1",
  "fair_metadata": { ... },
  "problem":       { ... },
  "solution":      { ... },
  "statistics":    { ... },
  "iterations":    [ ... ],  // optional, omitted when empty
  "linear_solver": { ... }   // optional, omitted when backend did not report
}

Fields

`schema` (string, required)

Identifier for this schema version. Always "pounce.solve-report/v1" for v1. Major-version bumps change the prefix; minor / patch (additive) changes do not.

`fair_metadata` (object, required)

Field	Type	Notes
`result_id`	string	Format: `<unix_nanos>-<process_id>`. Monotonically ordered within a process, globally unique across processes. No external UUID library needed.
`created_at_iso`	string	Solve start time as ISO-8601 UTC: `YYYY-MM-DDTHH:MM:SS.sssZ`.
`created_at_unix_nanos`	integer	Same instant as Unix nanoseconds since 1970-01-01 UTC. Provided alongside the ISO string for consumers that prefer integer arithmetic.
`elapsed_seconds`	float	Wallclock seconds the solve took (matches `statistics.total_wallclock_time_secs` modulo float precision).
`solver`	object	See below.
`license`	string	SPDX identifier. Always `"EPL-2.0"` for this version.
`input`	object	See `Input descriptor` below.
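As a sanity check on the identifiers above, a consumer can split `result_id` and compare it with the two timestamp fields. The sketch below is illustrative only: the helper names `parse_result_id` and `nanos_to_iso` are hypothetical, and it assumes that the nanosecond part of `result_id` is the same instant as `created_at_unix_nanos` and that the ISO string truncates to milliseconds, both of which hold for the worked example later in this document but are not stated as guarantees.

```python
from datetime import datetime, timezone


def parse_result_id(result_id: str) -> tuple[int, int]:
    """Split '<unix_nanos>-<process_id>' into (unix_nanos, process_id)."""
    nanos_text, pid_text = result_id.split("-", 1)
    return int(nanos_text), int(pid_text)


def nanos_to_iso(unix_nanos: int) -> str:
    """Format Unix nanoseconds as 'YYYY-MM-DDTHH:MM:SS.sssZ' (assumed: truncated to ms)."""
    seconds, remainder = divmod(unix_nanos, 1_000_000_000)
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{stamp:%Y-%m-%dT%H:%M:%S}.{remainder // 1_000_000:03d}Z"


# Values copied from the worked example in this document.
meta = {
    "result_id": "1778777029606881000-76543",
    "created_at_iso": "2026-05-14T16:43:49.606Z",
    "created_at_unix_nanos": 1778777029606881000,
}

nanos, pid = parse_result_id(meta["result_id"])
consistent = (
    nanos == meta["created_at_unix_nanos"]
    and nanos_to_iso(nanos) == meta["created_at_iso"]
)
```

Sorting reports by the integer part of `result_id` then gives a time ordering without parsing the ISO string.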

`solver` sub-object

Field	Type	Notes
`name`	string	Always `"pounce"`.
`version`	string	Crate version (e.g. `"0.1.0"`). Read from `CARGO_PKG_VERSION` at build time.
`git_commit`	string | omitted	Build-time git revision. Omitted when the build environment did not set `POUNCE_GIT_COMMIT` (e.g. development builds). Set via `POUNCE_GIT_COMMIT=$(git rev-parse HEAD) cargo build` to populate.
`target_triple`	string	Build target triple (e.g. `"x86_64-apple-darwin"`); falls back to `"unknown"` when Cargo did not expose `TARGET` at build time.

`Input descriptor` (`input`)

Tagged enum keyed on kind. Possible shapes:

{ "kind": "nl-file", "path": "/path/to/foo.nl", "size_bytes": 366 }
{ "kind": "builtin", "name": "rosenbrock" }
{ "kind": "tnlp-direct" }

nl-file: the input came from the .nl file at path. size_bytes is present when the file’s metadata is readable; consumers that want bit-exact provenance can hash the file themselves.
builtin: the input was a built-in problem named by name (e.g. pounce --problem rosenbrock).
tnlp-direct: used by library callers building a TNLP in-process without a .nl round-trip.
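A consumer can dispatch on `kind` to handle the three shapes. The following is a minimal sketch; `describe_input` is a hypothetical helper, and the fallback branch for an unrecognized `kind` is an assumption made in the spirit of the "tolerate unknown fields" rule rather than something the schema prescribes.

```python
def describe_input(inp: dict) -> str:
    """Return a one-line description of the tagged `input` object."""
    kind = inp.get("kind")
    if kind == "nl-file":
        # size_bytes is only present when the file metadata was readable.
        size = inp.get("size_bytes")
        suffix = f" ({size} bytes)" if size is not None else ""
        return f"nl-file {inp['path']}{suffix}"
    if kind == "builtin":
        return f"builtin problem {inp['name']}"
    if kind == "tnlp-direct":
        return "in-process TNLP"
    # Assumption: unknown kinds are reported rather than treated as fatal.
    return f"unrecognized input kind {kind!r}"


examples = [
    {"kind": "nl-file", "path": "/path/to/foo.nl", "size_bytes": 366},
    {"kind": "builtin", "name": "rosenbrock"},
    {"kind": "tnlp-direct"},
]
descriptions = [describe_input(e) for e in examples]
```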

`problem` (object, required)

Problem dimensions reported by the TNLP at get_nlp_info().

Field	Type	Notes
`n_variables`	integer	Number of primal variables.
`n_constraints`	integer	Number of constraints (equalities + inequalities).
`n_objectives`	integer	Number of objectives. The IPM uses objective 0; extras are read but ignored.
`minimize`	boolean	`true` for minimization (the AMPL default).
`nnz_jac_g`	integer | omitted	Number of declared non-zeros in the constraint Jacobian.
`nnz_h_lag`	integer | omitted	Number of declared non-zeros in the lower triangle of the Lagrangian Hessian.

`solution` (object, required)

Field	Type	Notes
`status`	string	`ApplicationReturnStatus` enum variant name verbatim (e.g. `"SolveSucceeded"`, `"MaximumIterationsExceeded"`).
`solve_result_num`	integer	AMPL-style solve-result code (Gay 2005, “Hooking Your Solver to AMPL” §5, p. 23 table): 0 = solved, 100-range = warning, 200-range = infeasible, 400-range = limit reached, 500-range = failure.
`objective`	float	Final unscaled objective value. `0.0` (not NaN) when the solve never completed; check `statistics.iteration_count > 0` to distinguish.
`x`	array of float | empty	Primal vector, length `problem.n_variables`. Empty when the binary doesn’t capture the final iterate (currently: `pounce` on the `newton_driver` fast-path). Omitted from JSON when empty.
`lambda`	array of float | empty	Constraint multipliers, length `problem.n_constraints`. Same omission convention as `x`.
`suffixes`	array of object | empty	sIPOPT-style suffix blocks; emitted only at `--json-detail full`. See below.
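The `solve_result_num` ranges in the table can be turned into a coarse category with a few comparisons. This sketch follows the table literally; `classify_solve_result` is a hypothetical helper, and any code outside the listed ranges is returned as "unclassified" rather than guessed at.

```python
def classify_solve_result(num: int) -> str:
    """Map an AMPL-style solve_result_num onto the categories listed in the table."""
    if num == 0:
        return "solved"
    ranges = (
        (100, "warning"),
        (200, "infeasible"),
        (400, "limit reached"),
        (500, "failure"),
    )
    for start, label in ranges:
        if start <= num < start + 100:
            return label
    # Codes the table does not list are left for the caller to interpret.
    return "unclassified"


categories = {n: classify_solve_result(n) for n in (0, 150, 200, 499, 500, 300)}
```

Gating on this category alongside `solution.status` keeps the check independent of how the status string is spelled.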

Suffix entries

{
  "name": "sens_sol_state_1",
  "target": "var",
  "kind": "real",
  "values": [0.576..., 0.378..., -0.046..., 4.5, 1.0]
}

Field	Type	Notes
`name`	string	AMPL suffix name.
`target`	string	One of `"var"`, `"con"`, `"obj"`, `"problem"`. Matches AMPL’s `Sufkind_*` enum.
`kind`	string	`"real"` or `"int"`. Selects which payload array is populated.
`values`	array of float	Dense values (length = target dimension). Present when `kind = "real"`.
`int_values`	array of integer	Present when `kind = "int"`.
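Because `kind` selects which payload array is populated, a reader should branch on it instead of probing both keys. A minimal sketch follows; `suffix_payload` is a hypothetical helper, and the numbers are copied from the worked example in this document. Subtracting `solution.x` from the `sens_sol_state_1` values gives the first-order change in the primal.

```python
def suffix_payload(entry: dict) -> list:
    """Return the populated payload array of one suffix entry."""
    if entry["kind"] == "real":
        return entry.get("values", [])
    if entry["kind"] == "int":
        return entry.get("int_values", [])
    raise ValueError(f"unknown suffix kind: {entry['kind']!r}")


# Numbers copied from the worked example.
x = [0.6326530575201161, 0.3877551079678144, 0.020408165487930466, 5.0, 1.0]
entry = {
    "name": "sens_sol_state_1",
    "target": "var",
    "kind": "real",
    "values": [0.5765305974643309, 0.3775510440570709,
               -0.04591835847859835, 4.5, 1.0],
}

perturbed = suffix_payload(entry)
# Element-wise difference between the perturbed and the nominal primal.
delta_x = [p - xi for p, xi in zip(perturbed, x)]
```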

`statistics` (object, required)

Projection of pounce_nlp::solve_statistics::SolveStatistics minus the per-iteration history (which lives at the top level when present).

Field	Type	Notes
`iteration_count`	integer	Number of accepted outer iterations.
`final_objective`	float	Unscaled. Matches `solution.objective`.
`final_scaled_objective`	float	Scaled by the IPM’s internal NLP scaling. Equal to `final_objective` when no scaling is in effect.
`final_dual_inf`	float	Final dual infeasibility.
`final_constr_viol`	float	Final constraint violation.
`final_compl`	float	Max complementarity over the four bound blocks.
`final_kkt_error`	float	Overall KKT error reported by the convergence check.
`num_obj_evals`	integer	`eval_f` call count.
`num_constr_evals`	integer	`eval_g` call count.
`num_obj_grad_evals`	integer	`eval_grad_f` count.
`num_constr_jac_evals`	integer	`eval_jac_g` count.
`num_hess_evals`	integer	`eval_h` count.
`total_wallclock_time_secs`	float	Wall time spent inside `optimize_*`.
`restoration_calls`	integer	Number of restoration-phase entries (pounce#12).
`restoration_inner_iters`	integer	Cumulative inner-IPM iterations across all restoration calls.
`restoration_outer_iters`	integer	Outer iterations that ran in restoration mode (`R`-line equivalents).
`restoration_wall_secs`	float	Wall time spent inside `perform_restoration`.

Eval counters (num_*_evals) populate only on the .nl-file path because the pounce binary’s CountingTnlp wrapper tracks them. Library callers using IpoptApplication::optimize_tnlp directly see zeros there; the underlying counts are still available through upstream’s IpoptCalculatedQuantities if needed.

`iterations` (array of object, optional)

Per-iteration trajectory. Emitted only at --json-detail full (when IpoptApplication::enable_iter_history() was called). Omitted from JSON entirely when empty.

Each row maps to one line of the upstream-formatted console iter table. Fields:

Field	Type	Notes
`iter`	integer	0-based iteration index.
`objective`	float	`f(x_k)` at the start of iter `k` (unscaled).
`inf_pr`	float	Primal infeasibility.
`inf_du`	float	Dual infeasibility.
`mu`	float	Barrier parameter μ_k (not log10; consumers can take `log10` if they want the console format).
`d_norm`	float	Norm of the search direction (the `||d||` column of upstream’s iter table).
`regularization`	float	Hessian regularization `δ_w` applied this iter; `0.0` when none was needed.
`alpha_dual`	float	Dual step length.
`alpha_primal`	float	Primal step length.
`alpha_primal_char`	string (1 char)	Single-character tag (`f`, `h`, `r`, etc.) matching the alpha-primal column of upstream’s iter table.
`ls_trials`	integer	Number of backtracking line-search trials this iter.
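Since `mu` is stored as the raw barrier parameter, a consumer that wants a console-like view has to take the logarithm itself. The sketch below renders one row in a layout loosely modelled on the upstream table; `format_iter_row` is a hypothetical helper, the column widths are arbitrary, and the output is not claimed to be byte-identical to the console log.

```python
import math


def format_iter_row(row: dict) -> str:
    """Render one `iterations` entry as a console-style line (approximate layout)."""
    lg_mu = math.log10(row["mu"])  # the report stores mu itself, not log10(mu)
    return (
        f"{row['iter']:4d} {row['objective']:14.7e} {row['inf_pr']:8.2e} "
        f"{row['inf_du']:8.2e} {lg_mu:5.1f} {row['d_norm']:8.2e} "
        f"{row['alpha_dual']:8.2e} {row['alpha_primal']:8.2e}"
        f"{row['alpha_primal_char']} {row['ls_trials']:2d}"
    )


# First row of the worked example in this document.
row0 = {
    "iter": 0, "objective": 0.0451, "inf_pr": 5.0, "inf_du": 0.407, "mu": 0.1,
    "d_norm": 0.0, "regularization": 0.0, "alpha_dual": 0.0, "alpha_primal": 0.0,
    "alpha_primal_char": " ", "ls_trials": 0,
}
line = format_iter_row(row0)
```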

`linear_solver` (object, optional)

Aggregate post-mortem from the symmetric-indefinite linear backend that solved the KKT systems. Populated only when the backend self-instruments (the default FERAL backend does; HSL MA57 and custom backends plugged through set_linear_backend_factory do not). Omitted from JSON when no backend reported.

Field	Type	Notes
`solver_name`	string	Backend identifier (e.g. `"feral"`).
`n_factors`	integer	Total numeric factorizations performed.
`n_pattern_reuse`	integer	Factor calls that reused the existing symbolic pattern.
`n_pattern_changes`	integer	Factor calls that triggered a re-analysis.
`max_fill_ratio`	float | omitted	Peak `nnz(L) / nnz(A)` observed across all factorizations.
`min_abs_pivot`	float | omitted	Smallest absolute pivot magnitude seen across all factorizations (diagnostic for near-singularity).
`max_abs_pivot`	float | omitted	Largest absolute pivot magnitude.
`last_inertia`	`[int, int, int]` | omitted	`(positive, negative, zero)` inertia of the final factor. Should match `(n, m, 0)` at a regular KKT optimum.
`last_nnz_a`	integer | omitted	Non-zero count of the assembled KKT matrix at the final factor.
`last_nnz_l`	integer | omitted	Non-zero count of the L-factor at the final factor.
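Two quick diagnostics follow directly from this table: comparing `last_inertia` with the expected `(n, m, 0)`, and computing the fill ratio of the final factor. The sketch below is an illustration with hypothetical helper names and made-up sample values. It takes `n` and `m` as explicit arguments because which dimensions the backend's KKT system uses (for example, whether slack variables are counted) is not specified here and is left to the caller.

```python
from typing import Optional, Sequence


def inertia_matches(last_inertia: Optional[Sequence[int]], n: int, m: int) -> Optional[bool]:
    """True if inertia equals (n, m, 0); None when the backend omitted the field."""
    if last_inertia is None:
        return None
    return tuple(last_inertia) == (n, m, 0)


def final_fill_ratio(linear_solver: dict) -> Optional[float]:
    """nnz(L) / nnz(A) at the final factor, or None when either count is missing."""
    nnz_a = linear_solver.get("last_nnz_a")
    nnz_l = linear_solver.get("last_nnz_l")
    if not nnz_a or nnz_l is None:
        return None
    return nnz_l / nnz_a


# Hypothetical sample block; the numbers are illustrative, not real output.
ls = {"solver_name": "feral", "last_inertia": [5, 4, 0],
      "last_nnz_a": 20, "last_nnz_l": 30}

regular = inertia_matches(ls.get("last_inertia"), n=5, m=4)
fill = final_fill_ratio(ls)
```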

Detail levels

The --json-detail LEVEL flag selects how much detail is emitted. Levels map to verbosity in the same spirit as upstream’s print_level (0 silent → 12 maximum debug):

Level	What’s emitted	What’s omitted
`summary` (default)	FAIR metadata, problem, solution scalars + arrays, aggregate statistics	`iterations`, `solution.suffixes`
`full`	All of the above plus per-iteration trajectory and suffix blocks	nothing — full detail

summary is the right choice for production logs and batch runs. full is the debugging equivalent of upstream’s print_level=8.

Worked example

pounce_sens crates/pounce-cli/tests/fixtures/parametric.nl out.sol --json-output result.json --json-detail full produces (truncated for brevity):

{
  "schema": "pounce.solve-report/v1",
  "fair_metadata": {
    "result_id": "1778777029606881000-76543",
    "created_at_iso": "2026-05-14T16:43:49.606Z",
    "created_at_unix_nanos": 1778777029606881000,
    "elapsed_seconds": 0.011,
    "solver": {
      "name": "pounce",
      "version": "0.1.0",
      "target_triple": "x86_64-apple-darwin"
    },
    "license": "EPL-2.0",
    "input": {
      "kind": "nl-file",
      "path": "crates/pounce-cli/tests/fixtures/parametric.nl",
      "size_bytes": 366
    }
  },
  "problem": { "n_variables": 5, "n_constraints": 4, "n_objectives": 1, "minimize": true },
  "solution": {
    "status": "SolveSucceeded",
    "solve_result_num": 0,
    "objective": 0.5510204081632656,
    "x":      [0.6326530575201161, 0.3877551079678144, 0.020408165487930466, 5.0, 1.0],
    "lambda": [-0.16326530000405073, -0.28571431357898697, -0.16326530000405073, 0.18075803406303625],
    "suffixes": [{
      "name": "sens_sol_state_1",
      "target": "var",
      "kind": "real",
      "values": [0.5765305974643309, 0.3775510440570709, -0.04591835847859835, 4.5, 1.0]
    }]
  },
  "statistics": { "iteration_count": 9, "final_dual_inf": 2.89e-14, "...": "..." },
  "iterations": [
    { "iter": 0, "objective": 0.0451, "inf_pr": 5.0, "inf_du": 0.407, "mu": 0.1,
      "d_norm": 0.0, "regularization": 0.0, "alpha_dual": 0.0, "alpha_primal": 0.0,
      "alpha_primal_char": " ", "ls_trials": 0 },
    { "iter": 1, "objective": 0.957, "inf_pr": 0.212, "...": "..." }
  ]
}

Consumer guidance

Pin the major version. Check schema.startswith("pounce.solve-report/v1") before consuming.
Tolerate unknown fields. New optional fields will land between minor versions of pounce. Use serde(default) / equivalent.
Distinguish “no solve” from “solve produced zero”. Pre-solve, scalar fields are 0.0 (not NaN, because JSON has no NaN literal). statistics.iteration_count == 0 is the signal that no solve occurred.
solution.x / solution.lambda may be empty. When the binary couldn’t capture the final iterate (currently: the pounce binary on its newton_driver fast-path for m=0, n≤1000 problems), the arrays are empty and the keys are omitted from JSON entirely. pounce_sens always populates them.
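The four rules above can be folded into a small loader. This is a sketch under stated assumptions, not a reference client: `load_report` and `summarize` are hypothetical names, and the sample document is hand-written. Note that a plain prefix test would also match a hypothetical `v10` tag, so a stricter consumer may prefer to compare the whole version segment.

```python
import json

SCHEMA_V1 = "pounce.solve-report/v1"


def load_report(text: str) -> dict:
    """Parse a solve report and refuse anything that is not schema v1."""
    report = json.loads(text)
    if not str(report.get("schema", "")).startswith(SCHEMA_V1):
        raise ValueError(f"unsupported schema: {report.get('schema')!r}")
    return report


def summarize(report: dict) -> dict:
    """Extract the commonly used fields, tolerating omitted and unknown keys."""
    stats = report["statistics"]
    solution = report["solution"]
    solved = stats["iteration_count"] > 0  # 0 iterations means no solve occurred
    return {
        "status": solution["status"],
        "objective": solution["objective"] if solved else None,
        "x": solution.get("x", []),            # key is omitted when empty
        "lambda": solution.get("lambda", []),  # same convention as x
        "iterations": report.get("iterations", []),
    }


# Hand-written sample: an unknown field is present, x and lambda are omitted.
sample = json.dumps({
    "schema": "pounce.solve-report/v1",
    "some_future_field": {"ignored": True},
    "solution": {"status": "MaximumIterationsExceeded",
                 "solve_result_num": 400, "objective": 0.0},
    "statistics": {"iteration_count": 0},
})
summary = summarize(load_report(sample))
```

Only the keys the consumer needs are read, so fields added in later minor releases pass through unnoticed.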

References

Wilkinson et al. (2016). “The FAIR Guiding Principles for scientific data management and stewardship.” Scientific Data 3, 160018. DOI 10.1038/sdata.2016.18. (Verified via Crossref 2026-05-14.)
Gay (2005). “Hooking Your Solver to AMPL.” https://ampl.com/REFS/hooking2.pdf. §5 (Returning Results to AMPL) for the .sol baseline this schema is structured around.
SPDX license identifiers: https://spdx.org/licenses/.

Verifying Solutions

pounce verify <problem.nl> <claim.sol> [OPTIONS]

pounce verify independently checks that the solution in a .sol file actually satisfies the constraints and bounds of a .nl problem. It re-derives feasibility from the model itself — it does not trust the .sol’s status line, and it does not rerun the solver. This makes it the trust anchor when pounce is a tool an agent calls: the agent proposes a solution, and a small, deterministic checker disposes.

Optimization is unusually well-suited to this because a solution is far cheaper to verify than to produce: a claimed x* is just numbers, and feasibility is a single constraint evaluation (g_l ≤ g(x*) ≤ g_u, x_l ≤ x* ≤ x_u), which is O(nnz) work with no re-solve and no dense linear algebra.
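To make the cost argument concrete, here is a minimal sketch of such a feasibility check on a toy two-variable model. It is not the `pounce verify` implementation: the function names and the toy constraint function `g` are hypothetical, and it assumes violations are measured as absolute, unscaled distances outside the bounds.

```python
import math


def max_violation(values, lower, upper):
    """Return (largest violation, index) of lower[i] <= values[i] <= upper[i]."""
    worst, where = 0.0, -1
    for i, (v, lo, hi) in enumerate(zip(values, lower, upper)):
        violation = max(lo - v, v - hi, 0.0)
        if violation > worst:
            worst, where = violation, i
    return worst, where


def is_feasible(x, x_l, x_u, g, g_l, g_u, tol=1e-6):
    """One evaluation of g plus two passes over the bounds; no solve involved."""
    con_violation, _ = max_violation(g(x), g_l, g_u)
    bound_violation, _ = max_violation(x, x_l, x_u)
    return con_violation <= tol and bound_violation <= tol


# Toy model (hypothetical): x0 + x1 = 3 and 0 <= x0 * x1 <= 4.
def g(x):
    return [x[0] + x[1], x[0] * x[1]]


g_l, g_u = [3.0, 0.0], [3.0, 4.0]
x_l, x_u = [0.0, 0.0], [1.5, math.inf]

good = is_feasible([1.0, 2.0], x_l, x_u, g, g_l, g_u)
bad = is_feasible([2.0, 1.0], x_l, x_u, g, g_l, g_u)  # x0 exceeds its upper bound
```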

Status. The verify check itself (recompute feasibility against the canonical model, with a content-addressed receipt) is solid and ready to use; it needs no secrets and is the recommended default. The signing and remote-service trust layer built on top of it (HMAC receipts, the signer_service.py reference, running the MCP server as a remote authority) is a proof of concept: it demonstrates the architecture but is not hardened for production. If you want to rely on the signed/remote path for real, see Status and hardening at the end for the checklist of what that would take.

What it defends against

In an agent workflow, three things can go wrong with “here is a solution”:

Failure mode	How `verify` catches it
Fabrication — a `.sol` that looks like a pounce result but wasn’t solved	invented numbers fail the residual check against the real model
Ignoring the solver — claiming success without actually solving	a consumer gates on the receipt’s `verified: true` + the problem hash, not on prose
Solving the wrong problem — dropping or relaxing a constraint to dodge infeasibility	the check runs against the canonical constraints/bounds, so a point that is only feasible for a relaxed model is rejected here

The key design rule: always verify against the canonical problem, never against whatever the agent claims it solved. If the agent loosened a bound to manufacture feasibility, the returned x* still violates the canonical bound, and verify reports it.

Output and exit codes

$ pounce verify gaslib40_steady.nl good.sol
pounce verify — independent solution check
  problem : gaslib40_steady.nl  (1694 vars, 1682 cons)
            sha256:4bb435a3…
  solution: good.sol
            sha256:b77d9e7b…
  claimed solve_result_num: 0

  feasibility (tol 1.0e-6):
    max constraint violation: 1.407e-12  at c[114] (value 1.4e-12, bounds [0, 0])
    max bound violation     : 9.775e-9   at x[24]  (value 1.05, bounds [1.05, 2.0])
  objective at x*: 1.2899875310e0

  optimality (tol 1.0e-6, duals supplied):
    KKT stationarity residual: 2.675e-3  (dual sign +1)
    complementarity residual : 0.000e0

  VERDICT: VERIFIED — solution is feasible for the canonical problem

Exit code	Meaning
`0`	`VERIFIED` — every violation within tolerance
`20`	`REJECTED` — a constraint or bound violation exceeds tolerance
`2`	usage / I/O error (missing file, malformed `.sol`, dimension mismatch)

A consumer (CI step, agent harness, Makefile) gates on the exit code.
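A harness written in Python might gate on the exit code as sketched below. The command line uses only the invocation and the `--feas-tol` flag documented on this page; the function names and the `"ERROR"` / `"UNKNOWN"` labels are hypothetical, and treating every code other than 0 as a rejection is a deliberately conservative choice of this sketch.

```python
import subprocess

# Exit codes as documented in the table above.
VERDICTS = {0: "VERIFIED", 20: "REJECTED", 2: "ERROR"}


def interpret_exit_code(code: int) -> str:
    """Translate a `pounce verify` exit code; undocumented codes become UNKNOWN."""
    return VERDICTS.get(code, "UNKNOWN")


def verify(nl_path: str, sol_path: str, feas_tol: float = 1e-6) -> str:
    """Run `pounce verify` on the canonical .nl and the claimed .sol."""
    proc = subprocess.run(
        ["pounce", "verify", nl_path, sol_path, "--feas-tol", str(feas_tol)],
        capture_output=True,
        text=True,
    )
    return interpret_exit_code(proc.returncode)


def accept(verdict: str) -> bool:
    """Accept only an explicit VERIFIED; errors and unknown codes are rejections."""
    return verdict == "VERIFIED"
```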

Options

Flag	Default	Meaning
`--feas-tol <t>`	`1e-6`	feasibility tolerance for constraints and bounds
`--opt-tol <t>`	`1e-6`	stationarity tolerance for the optimality check
`--require-optimal`	off	also fail (exit 20) if the KKT stationarity residual exceeds `--opt-tol`
`--json-output <path>`	—	write a JSON verification receipt

Feasibility is the gate; optimality is reported

By default only feasibility gates the exit code. Feasibility is rigorous and sign-convention-independent — it is the guarantee that matters when the claim is “this solution meets the constraints.”

When the .sol carries constraint duals, verify also reports a KKT stationarity residual (the bound-projected “dual infeasibility”: the part of ∇f + Jᵀλ that a valid sign-constrained bound multiplier cannot absorb) and a complementarity residual. These are informational unless you pass --require-optimal. The AMPL dual-sign convention can differ from pounce’s, so verify computes the residual for both signs and reports the better one plus the sign it used. Bound multipliers z_L, z_U are not present in a .sol, so they are inferred from which bounds are active.
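The idea of a bound-projected residual evaluated for both dual signs can be sketched in a few lines. This is an illustration, not the `verify` source: it assumes the stationarity condition ∇f + s·Jᵀλ − z_L + z_U = 0 with z_L, z_U ≥ 0 and s = ±1, treats a bound as active when the variable is within a hypothetical `active_tol` of it, and uses dense Python lists for clarity.

```python
import math


def stationarity_residual(grad_f, jac, lam, x, x_l, x_u, sign, active_tol=1e-8):
    """Max-norm of the part of grad_f + sign * J^T lam that bound multipliers cannot absorb."""
    worst = 0.0
    for j in range(len(x)):
        r = grad_f[j] + sign * sum(jac[i][j] * lam[i] for i in range(len(lam)))
        leftover = abs(r)
        if x[j] - x_l[j] <= active_tol and r > 0.0:
            leftover = 0.0  # z_L[j] = r >= 0 absorbs a positive component
        if x_u[j] - x[j] <= active_tol and r < 0.0:
            leftover = 0.0  # z_U[j] = -r >= 0 absorbs a negative component
        worst = max(worst, leftover)
    return worst


def best_sign_residual(grad_f, jac, lam, x, x_l, x_u):
    """Evaluate both dual-sign conventions and return (residual, sign) of the better one."""
    candidates = [
        (stationarity_residual(grad_f, jac, lam, x, x_l, x_u, s), s) for s in (+1, -1)
    ]
    return min(candidates, key=lambda c: c[0])


# Toy problem: min x0^2 + x1^2  s.t.  x0 + x1 = 2, so x* = (1, 1).
x_star = [1.0, 1.0]
grad_f = [2.0, 2.0]
jac = [[1.0, 1.0]]
free_l, free_u = [-math.inf, -math.inf], [math.inf, math.inf]

# A dual reported as +2 only balances the gradient under the -1 convention.
residual, sign = best_sign_residual(grad_f, jac, [2.0], x_star, free_l, free_u)
```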

The JSON receipt

--json-output writes a machine-readable receipt that content-addresses both inputs by SHA-256 — so a downstream consumer can confirm exactly which problem and which solution were checked:

{
  "pounce_verify_version": 1,
  "solver": "pounce 0.4.0",
  "problem":  { "path": "…", "sha256": "4bb435a3…", "n_vars": 1694, "n_cons": 1682 },
  "solution": { "path": "…", "sha256": "b77d9e7b…", "duals_present": true },
  "tolerances": { "feasibility": 1e-6, "optimality": 1e-6 },
  "feasibility": {
    "max_constraint_violation": 1.4e-12,
    "worst_constraint": { "index": 114, "name": "c[114]", "value": 1.4e-12,
                          "lower": 0.0, "upper": 0.0, "violation": 1.4e-12 },
    "max_bound_violation": 9.77e-9,
    "worst_bound": { "index": 24, "name": "x[24]", … },
    "feasible": true
  },
  "optimality": { "available": true, "stationarity_residual": 2.6e-3, … },
  "verdict": "VERIFIED",
  "verified": true
}

A consumer should accept a solution iff:

1. verified == true, and
2. problem.sha256 equals the SHA-256 of its own canonical .nl, and
3. (when signing is used) the signature validates; see below.

Checking the hash in step 2 is what closes the “solved the wrong problem” gap at the receipt layer: the receipt is only meaningful for the exact problem bytes it names.
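The first two acceptance conditions need nothing beyond a hash of the consumer's own file. The sketch below assumes that `problem.sha256` in a real receipt is the full lowercase hex digest of the raw .nl bytes (the example above shows it truncated); `accept_receipt` is a hypothetical name and the receipt here is built by hand for illustration.

```python
import hashlib


def accept_receipt(receipt: dict, canonical_nl: bytes) -> bool:
    """Accept only if the receipt is verified AND names our own canonical problem."""
    expected = hashlib.sha256(canonical_nl).hexdigest()
    names_our_problem = receipt.get("problem", {}).get("sha256") == expected
    return receipt.get("verified") is True and names_our_problem


# Hand-built stand-ins for the canonical .nl bytes and a matching receipt.
canonical = b"placeholder bytes standing in for the canonical .nl file"
receipt = {
    "verified": True,
    "verdict": "VERIFIED",
    "problem": {"sha256": hashlib.sha256(canonical).hexdigest()},
}

accepted = accept_receipt(receipt, canonical)
wrong_problem = accept_receipt(receipt, b"a relaxed model with a loosened bound")
```

A receipt for a relaxed model fails the hash comparison even though its `verified` flag is true.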

The default: recompute, don’t trust a receipt

The strongest and simplest design uses no key and no signature at all: the consumer runs pounce verify itself, against its own copy of the canonical .nl.

# the consumer does this — not the agent
pounce verify ./canonical/problem.nl ./from-agent/claim.sol || reject

Because verification is keyless, deterministic, and cheap (O(nnz), no re-solve), the consumer can afford to just do it rather than trust someone else’s word. In this design the agent is never in the trust path: it hands over x*, and the consumer believes its own arithmetic. There is no key to steal, so the question “what if the agent gets the key?” does not arise. Forgery is impossible because nothing is being trusted on faith: feasibility is decided by evaluating g(x*), not by matching fields in a document.

This is the recommended default. Prefer it whenever the consumer can run pounce verify (or call a verifier it controls). Reach for signatures only when it genuinely cannot — see below.

Signed receipts — trust transport, conditional on key isolation

Signing addresses a narrower situation: the consumer won’t or can’t recompute — a remote or expensive verifier, or an audit log you want to trust later without re-solving — and instead wants to trust a receipt produced elsewhere. A signature lets that receipt be checked without redoing the work.

When the POUNCE_VERIFY_KEY environment variable is set (non-empty), the receipt gains:

"signature_alg": "HMAC-SHA256",
"signed_fields": ["verify_version","nl_sha256","sol_sha256",
                  "n_vars","n_cons","feasible","verified","verdict"],
"signature": "5bdcc146bf60754e…"

The signature is HMAC-SHA256(key, preimage), where preimage is a deliberately float-free byte string — only hex hashes, integer counts, and the verdict — so any language reproduces it byte-for-byte without float-formatting parity problems. The exact preimage is:

pounce-verify-receipt/v1
verify_version=1
nl_sha256=<hex>
sol_sha256=<hex>
n_vars=<int>
n_cons=<int>
feasible=<true|false>
verified=<true|false>
verdict=<VERIFIED|REJECTED>

(The header line followed by the eight signed fields, one per line, \n-joined, with a trailing newline; booleans lowercase.) A holder of the key recomputes the HMAC over this preimage and compares it to signature.
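A key holder's check can be written with the standard library alone. The sketch below takes the listing above literally (the `pounce-verify-receipt/v1` header followed by the eight fields) and rests on assumptions that should be confirmed against a real receipt: that the key is used as raw bytes, that `signature` is lowercase hex, and that the function names, which are hypothetical, are called with values already taken from the receipt.

```python
import hashlib
import hmac


def receipt_preimage(nl_sha256, sol_sha256, n_vars, n_cons,
                     feasible, verified, verdict, verify_version=1) -> bytes:
    """Build the float-free preimage: header plus eight key=value lines."""
    lines = [
        "pounce-verify-receipt/v1",
        f"verify_version={verify_version}",
        f"nl_sha256={nl_sha256}",
        f"sol_sha256={sol_sha256}",
        f"n_vars={n_vars}",
        f"n_cons={n_cons}",
        f"feasible={str(bool(feasible)).lower()}",   # booleans are lowercase
        f"verified={str(bool(verified)).lower()}",
        f"verdict={verdict}",
    ]
    return ("\n".join(lines) + "\n").encode("utf-8")  # trailing newline


def sign(key: bytes, preimage: bytes) -> str:
    """HMAC-SHA256 over the preimage, hex-encoded (encoding is an assumption)."""
    return hmac.new(key, preimage, hashlib.sha256).hexdigest()


def signature_valid(key: bytes, preimage: bytes, signature_hex: str) -> bool:
    """Constant-time comparison of the recomputed MAC with the claimed one."""
    return hmac.compare_digest(sign(key, preimage), signature_hex.lower())


key = b"demo-key-not-for-production"
preimage = receipt_preimage("aa" * 32, "bb" * 32, 1694, 1682, True, True, "VERIFIED")
signature = sign(key, preimage)
```

Flipping any signed field, for example changing `verdict`, produces a different preimage and the comparison fails.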

What the signature does and does not guarantee

HMAC gives existential unforgeability under chosen-message attack — but only while the key stays secret. That single condition carries the entire weight of the scheme:

A party without the key cannot mint a receipt that validates. It cannot produce a correct MAC for any message it has not already seen one for. This is the real guarantee.
A party with the key can sign anything — including a hand-written receipt that falsely says verified: true. If the agent obtains the key, signing buys you nothing, and you must fall back to recomputation.

So the security reduces to a question that is not cryptographic: can the agent reach the key? If the keyholder and the agent share a process, a user account, an environment, or a filesystem, the agent simply reads it — echo "$POUNCE_VERIFY_KEY", cat keyfile, or introspecting shared memory — and the signature becomes theater. An agent with a shell on the same user/host as the signer is not kept out by HMAC.

Setting POUNCE_VERIFY_KEY for a pounce verify that runs in the agent’s own environment is the weakest form and provides no real protection against that agent. It is only meaningful when the key lives behind a privilege boundary the agent cannot cross. See Out-of-process signing below.

Two further limits HMAC does not address, even with a perfectly isolated key:

Replay. A receipt that was legitimately signed earlier can be re-presented. Here that is benign: a receipt attests “this specific x* (by sol_sha256) is feasible for this specific problem (by nl_sha256),” and that statement stays true. Replay cannot manufacture a valid-looking attestation for a new (problem, solution) pair the signer never blessed. If you ever need freshness (“the work was done this time”), add a nonce/timestamp to the signed preimage; for “does this solution meet the constraints,” freshness is irrelevant.
Wrong model. The signature says nothing about whether the .nl encodes the problem you meant. That spec-ownership question must live outside the agent regardless of any crypto. verify certifies feasibility against a given model; it does not certify that the model is the right one. Likewise, feasibility is fully checkable but global optimality is not — the stationarity residual certifies a first-order/KKT point, not a global minimum.

Out-of-process signing

For a signature to mean anything against a capable agent, the key must live in a separate trust boundary: a different process and user/container/host that the agent cannot exec into or read. The signer exposes “verify-and-sign” as an RPC; the key never enters the agent’s address space.

 ┌─ agent boundary (no key) ────────────┐      ┌─ signer boundary (holds key) ─────┐
 │ agent proposes x*                     │      │ POUNCE_VERIFY_KEY in THIS env only │
 │   │                                   │ RPC  │   1. run `pounce verify` on the    │
 │   └── POST /verify {nl, sol} ─────────┼─────▶│      CANONICAL .nl (+ the key)     │
 │                                       │      │   2. binary signs the receipt      │
 │   signed receipt ◀────────────────────┼──────┤   3. return receipt JSON           │
 └───────────────────────────────────────┘      └────────────────────────────────────┘
        │
        └── relays receipt to the consumer
consumer: accept iff  verified==true  ∧  problem.sha256==canonical  ∧  signature valid

What each party can do under this split:

Party	Has key?	Can forge a verdict?
Agent (proposer)	no	no — it can only ask the signer to verify a real `x*`
Signer service	yes	yes, but it is the trusted authority — that’s the point
Consumer	shares key or recomputes	detects any tampering / can verify independently

The boundary is only real if the agent cannot run code as the signer’s user or on its host. Running the signer as a separate user, container, or host (or behind a KMS/HSM that signs without exposing the key) is what turns “signed” from theater into a guarantee. An MCP server is already a separate process from the model, which helps, but it only achieves isolation if the agent also lacks a shell on the same user/host.

A minimal reference signer is in studio/mcp/examples/signer_service.py: a stdlib HTTP service that holds the key in its own environment, shells out to pounce verify, and returns the signed receipt. The agent calls it; the agent’s environment never contains the key.

Use in an agent workflow

Putting it together — recompute by default, sign only to transport trust:

agent ── proposes x* ──▶  consumer / verifier-it-controls
                            1. pin + hash the canonical .nl
                            2. pounce verify .nl .sol   (against the CANONICAL model)
                          ◀─ accept iff verified==true ∧ problem.sha256==canonical

When the verifier must be remote and the consumer won’t recompute, insert an out-of-process signer (above) and add ∧ signature valid to the consumer’s acceptance test — remembering that the last clause is only as strong as the signer’s key isolation.

The pounce-studio MCP server exposes verify_solution so an agent can request a check but cannot fake its result. Deploy that server as a distinct boundary from the agent (separate user/container) for the signature to carry weight; otherwise rely on the consumer recomputing.

Status and hardening

What is ready to use as-is:

The feasibility check (pounce verify, and the consumer-recomputes pattern). It is deterministic, keyless, content-addressed, and rigorous — this is the part to build on.

What is a proof of concept — demonstrates the shape, not hardened:

HMAC signing via POUNCE_VERIFY_KEY, the signer_service.py reference, and treating a remotely-deployed MCP server as a signing authority.

If you ever want to depend on the signed/remote path in production, these are the gaps to close. None are implemented here.

Key management

Don’t keep the key in a plain environment variable or file. Use a KMS/HSM (or sealed secret) that signs without exposing the key to the process — then even a compromised signer host can’t exfiltrate it.
Add key rotation and a key id in the receipt (kid) so a consumer knows which key to check against and old receipts stay verifiable across rotations.
Consider an asymmetric scheme (e.g. Ed25519) instead of HMAC when more than one party must verify without also being able to sign — HMAC’s symmetric key means every verifier is also a forger. Public-key signatures give public verifiability with a single private signer.

Transport / service (the moment it leaves stdio)

TLS on the endpoint; never plaintext for a service that holds a key.
Authn/authz — bearer token or OAuth on every request (MCP’s HTTP transport supports this). An unauthenticated endpoint that runs solves and shells out is effectively remote code execution.
Resource limits — request-size caps, solve timeouts (there is a timeout_seconds, but also wall/CPU/memory limits at the OS level), concurrency caps, and rate limiting.
Sandbox the solve — treat every .nl as untrusted input. Parsing and evaluating an arbitrary model is attacker-controlled computation; run it in a locked-down container/user with no network and a constrained filesystem.

Input handling

Over a network the path-based tools (nl_file/sol_file) assume a shared filesystem. Prefer content upload (the server receives and hashes the exact .nl/.sol bytes) so there’s no path-traversal surface and the receipt binds what was actually sent. The reference signer’s POUNCE_SIGNER_ROOT allowlist is a stopgap, not a substitute.

Freshness / replay

The current preimage has no nonce or timestamp, so a signed receipt is replayable. That is benign for “is this x* feasible” (a timeless fact), but if a consumer needs “this was checked recently” or “in response to my request,” add a nonce/timestamp (and a receipt expiry) to the signed preimage and bump the pounce-verify-receipt version.

Auditability

Log every verification (problem hash, solution hash, verdict, key id, caller identity) to an append-only store, so a disputed result can be reconstructed. Keep the key out of the logs.

Standing non-goals (true regardless of hardening)

verify certifies feasibility against a given model — it does not certify the model is the right one. Model/spec correctness must be owned outside the agent.
Feasibility is fully checkable; global optimality is not. The stationarity residual certifies a first-order/KKT point, not a global minimum.

Sensitivity Analysis

POUNCE includes a parametric sensitivity capability compatible with upstream Ipopt’s contrib/sIPOPT/ (Pirnay, López-Negrete & Biegler 2012, DOI 10.1007/s12532-012-0043-2). It computes the first-order change in the optimal primal solution with respect to a problem parameter, reusing the KKT factorization from the converged solve. Four entry points cover the common workflows.

AMPL CLI

The main pounce driver auto-detects the sIPOPT suffixes (sens_state_1, sens_state_value_1, sens_init_constr) in an input .nl, runs a post-optimal sensitivity step after the solve, and writes the perturbed primal back as a sens_sol_state_1 suffix — no separate binary or flag needed:

pounce problem.nl                   # writes problem.sol
pounce problem.nl out.sol --json-output result.json --json-detail full

pounce_sens is retained as a thin backward-compatibility alias: pounce_sens in.nl out.sol is identical to pounce in.nl out.sol, so existing AMPL / solver scripts keep working unchanged.

Related flags:

--sens-boundcheck / --sens-bound-eps EPS — clamp the perturbed primal x* + Δx onto the declared [x_l, x_u] box.
--compute-red-hessian / --rh-eigendecomp — compute the reduced Hessian (and its eigendecomposition) over the variables tagged by the red_hessian integer var-suffix.

Rust library

SensSolve is a builder that wraps the on_converged callback plumbing into a single call:

use pounce_sensitivity::SensSolve;

// `app` and `tnlp` are the caller's solver application and problem, set up beforehand.
let result = SensSolve::new(vec![2, 3])
    .with_deltas(vec![0.05, 0.0])
    .with_reduced_hessian()
    .run(&mut app, tnlp);
// result.dx, result.reduced_hessian, result.status

with_reduced_hessian_eigen() adds the eigendecomposition; with_boundcheck(eps) enables the bound projection.

Python

solve_with_sens exposes the same capability from the cyipopt-compatible Python wrapper:

# pin_constraint_indices is required; pass deltas=..., compute_reduced_hessian=True,
# or both. Returns (x, info) — sensitivity outputs live in the info dict.
x, info = prob.solve_with_sens(x0, pin_constraint_indices=[2, 3],
                               deltas=[0.05, 0.0], sens_boundcheck=True)
# info["dx"], info["reduced_hessian"], info["reduced_hessian_eigenvalues"], ...

compute_reduced_hessian=True returns the reduced Hessian in info["reduced_hessian"]; rh_eigendecomp=True adds its eigendecomposition; sens_bound_eps=… tunes the bound projection. See python/notebooks/04_sensitivity.ipynb for a walkthrough.

Pyomo

pyomo_pounce wraps the same machinery in a declare-then-query interface: flag the parameters that matter while building the model (no perturbed values required), solve normally, then ask for derivatives. Parameters are declared with declare_sens_param (mutable Param or fixed Var, scalar or indexed); when declarations are present, SolverFactory("pounce").solve(m) runs in-process and keeps the converged KKT factorization, so every query afterwards is a single backsolve.

import pyomo.environ as pyo
import pyomo_pounce
from pyomo_pounce import declare_sens_param, gradient, estimate

m = pyo.ConcreteModel()
# ... build the rest of the model (m.x, m.con, m.z, m.r, objective) ...
m.p = pyo.Param(initialize=2.0, mutable=True)
declare_sens_param(m.p)                 # a flag, not a perturbation

pyo.SolverFactory("pounce").solve(m)    # ordinary solve

gradient(m.x, wrt=m.p)                  # dx*/dp (float)
gradient(m.con, wrt=m.p)                # d(multiplier of con)/dp
G = gradient(m.z, wrt=m.r)              # containers -> Gradient object
G[m.z[1], m.r[2]]; G.to_dataframe()     # element access / full Jacobian
estimate(m, [(m.p, 2.5)])               # first-order solution estimate at
                                        # new values, clamped to bounds

gradient returns exact first-order derivatives (unit-perturbation backsolves, no finite differencing); estimate combines the stored derivative columns for arbitrary perturbed values after the fact, and warns when the linear step leaves the variable bounds (a single-pass projection analogous to the CLI’s --sens-boundcheck). Multiplier sensitivities are available for equality constraints. Models without declarations solve through the ordinary AMPL/CLI path, unchanged. See python/notebooks/25_pyomo_sensitivity.ipynb for a worked optimal-control example (initial conditions as parameters; the first-move gradient IS the NMPC feedback gain).
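
The first-order estimate with a single-pass clamp can be sketched in a few lines of plain Python. This is an illustration of the arithmetic only, not the pyomo_pounce implementation; the function name first_order_estimate and all numbers are hypothetical.

```python
def first_order_estimate(x_star, sens_columns, p_old, p_new, x_lower, x_upper):
    """Return (estimate, clamped_indices) for x*(p_new) from stored dx*/dp columns."""
    x_lin = list(x_star)
    for column, old, new in zip(sens_columns, p_old, p_new):
        for i, slope in enumerate(column):
            x_lin[i] += slope * (new - old)      # linear step, one parameter at a time
    # Single-pass projection onto the box [x_lower, x_upper].
    x_est = [min(max(v, lo), hi) for v, lo, hi in zip(x_lin, x_lower, x_upper)]
    clamped = [i for i, (a, b) in enumerate(zip(x_lin, x_est)) if a != b]
    return x_est, clamped


x_est, clamped = first_order_estimate(
    x_star=[1.0, 4.0],
    sens_columns=[[0.5, -2.0]],   # one dx*/dp column for a single parameter
    p_old=[2.0], p_new=[2.5],
    x_lower=[0.0, 3.5], x_upper=[10.0, 10.0],
)
if clamped:
    print(f"warning: linear step left the bounds at indices {clamped}")
print(x_est)
```

The clamped index list is what a caller would use to decide whether the linear estimate is still trustworthy or a re-solve is warranted.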

Units and NLP scaling

All sensitivity outputs are in natural (unscaled) units. The IPM holds its converged KKT factor in an internally scaled space whenever NLP scaling is active (the default nlp_scaling_method = "gradient-based" fires when an objective gradient or constraint row exceeds nlp_scaling_max_gradient = 100 at the starting point); pounce undoes that scaling in every held-factor back-solve, so dx, kkt_solve, and the reduced Hessian are independent of how the problem was scaled internally (#128).

In particular, for a parameter-estimation NLP with the parameters pinned by equality constraints, -inv(info["reduced_hessian"]) is directly the parameter covariance — no per-problem scale factor, no need to set nlp_scaling_method = "none". (Sign convention: over pin constraint rows, B K⁻¹ Bᵀ equals the multiplier sensitivity ∂λ/∂p = −∂²f*/∂p², hence the minus in the covariance recipe.)
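
As a minimal sketch of that recipe, the snippet below applies -inv(...) to a made-up 2×2 matrix standing in for info["reduced_hessian"]; the numbers are illustrative and do not come from any POUNCE run. It assumes NumPy is available.

```python
import numpy as np

# Illustrative stand-in for info["reduced_hessian"] over two pinned parameters.
# With the sign convention above it is negative definite at a well-posed estimate.
reduced_hessian = -np.array([[4.0, 1.0],
                             [1.0, 3.0]])

covariance = -np.linalg.inv(reduced_hessian)
std_errors = np.sqrt(np.diag(covariance))
correlation = covariance / np.outer(std_errors, std_errors)

print(covariance)
print(std_errors)
print(correlation)
```

The standard errors and correlation matrix follow from the covariance in the usual way; no scale factor enters because the reduced Hessian is already in natural units.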

For callers that calibrated against the pre-#128 behavior, the solver-space value and the factors that relate the two are exposed:

Python: info["reduced_hessian_scaled"], info["obj_scaling_factor"], info["pin_g_scaling"]; Solver.reduced_hessian(pins, scaled=True), Solver.kkt_solve(rhs, scaled=True), and the Solver.nlp_scaling dict ({"obj": df, "c_scale": …, "d_scale": …}).
Rust: SensResult::{reduced_hessian_scaled, obj_scaling_factor, pin_g_scaling}, Solver::{compute_reduced_hessian_scaled, kkt_solve_scaled, nlp_scaling, pin_g_scaling}, and PdSensBacksolver::solve_scaled_space.

The relation is H_scaled[i,j] = df / (dc_i·dc_j) · H[i,j], where df is the objective scaling factor and dc_i the pin rows’ constraint scaling factors.
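
The relation is easy to apply in either direction. The sketch below uses plain Python lists; to_scaled and to_natural are hypothetical helper names, and df and dc are made-up values standing in for info["obj_scaling_factor"] and info["pin_g_scaling"].

```python
def to_scaled(H, df, dc):
    """H_scaled[i][j] = df / (dc[i] * dc[j]) * H[i][j]."""
    k = len(dc)
    return [[df / (dc[i] * dc[j]) * H[i][j] for j in range(k)] for i in range(k)]


def to_natural(H_scaled, df, dc):
    """Inverse of to_scaled: H[i][j] = dc[i] * dc[j] / df * H_scaled[i][j]."""
    k = len(dc)
    return [[dc[i] * dc[j] / df * H_scaled[i][j] for j in range(k)] for i in range(k)]


H = [[-4.0, -1.0], [-1.0, -3.0]]   # natural-unit reduced Hessian (illustrative)
df = 0.5                           # objective scaling factor (illustrative)
dc = [2.0, 0.25]                   # pin-row constraint scaling factors (illustrative)

H_scaled = to_scaled(H, df, dc)
print(H_scaled)
print(to_natural(H_scaled, df, dc))
```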

One caveat: the IPM’s inertia-correction perturbations (δ_x, δ_s, δ_c, δ_d) are added to the factor in scaled space, so on a problem whose final factorization needed regularization (e.g. linearly dependent pin rows) the unscaling maps a slightly different perturbed system per scaling method. The perturbations are reported — info["kkt_perturbations"] / Solver.kkt_perturbations (Python), SensResult::kkt_perturbations / Solver::kkt_perturbations (Rust) — so a covariance workflow can assert they are all zero before trusting -inv(reduced_hessian); on well-posed estimation problems the final factor is unregularized and the invariance is exact.

Verification

All three entry points are verified against upstream sIPOPT 3.14.19’s parametric_cpp golden output to within roughly 6e-9 per component. The bound projection is a single-pass clamp; upstream’s iterative Schur refinement (re-factorize on each violation) is intentionally not ported.

Sessions: Factor-Once / Solve-Many

POUNCE’s IPM converges to a KKT linear system that, once factored, answers a number of useful follow-up questions cheaply: parametric steps, reduced Hessians, custom back-solves. The session APIs let you hold that factor alive between operations, rather than rebuilding it on every call. The same machinery serves two workloads:

Sensitivity / many-RHS. After one solve, issue many cheap operations against the converged factor — parametric steps for several parameter perturbations, reduced Hessians over several pinned-row sets, raw KKT back-solves.
Factor-only. For non-IPM uses (shift-invert eigensolves, custom Newton iterations) the underlying [Factorization] handle in pounce-linsol exposes factor / refactor / back-solve directly, without the IPM in the loop.
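
The factor-once / solve-many pattern itself can be illustrated without POUNCE. The sketch below is a small dense LDLᵀ without pivoting in pure Python, applied to a tiny KKT-shaped matrix; it is a teaching aid, not the FERAL or MA57 algorithm, and the names ldlt_factor and ldlt_solve are hypothetical.

```python
def ldlt_factor(A):
    """Dense LDL^T without pivoting; returns unit-lower L and the diagonal d."""
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = A[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (A[i][j] - s) / d[j]
    return L, d


def ldlt_solve(L, d, b):
    """Back-solve A x = b using an existing factor (no refactorization)."""
    n = len(b)
    y = list(b)
    for i in range(n):                       # forward substitution: L y = b
        y[i] -= sum(L[i][k] * y[k] for k in range(i))
    x = [y[i] / d[i] for i in range(n)]      # diagonal solve: D z = y
    for i in reversed(range(n)):             # backward substitution: L^T x = z
        x[i] -= sum(L[k][i] * x[k] for k in range(i + 1, n))
    return x


K = [[2.0, 0.0, 1.0],
     [0.0, 2.0, 1.0],
     [1.0, 1.0, 0.0]]
L, d = ldlt_factor(K)                        # the expensive step, done once
rhs_list = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
solutions = [ldlt_solve(L, d, b) for b in rhs_list]   # many cheap back-solves
print(d)
print(solutions)
```

Each additional right-hand side costs only the two triangular sweeps, which is the saving the session APIs provide at sparse scale.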

Which layer do I want?

You want…	Use
One solve plus a few sensitivity queries, from Python	`pounce.Solver` (Python)
The same, from C	`IpoptSolver` (C ABI)
The same, from Rust	`pounce_sensitivity::Solver`
Just a sparse symmetric factor — no IPM involved	`pounce_linsol::Factorization`
A one-shot sensitivity computation with a fluent builder	`pounce_sensitivity::SensSolve` (Rust) or `Problem.solve_with_sens` (Python)

The session API does not shortcut the IPM itself: each solve() call runs the full barrier method from scratch. What it reuses is the factor that exists at convergence, so KKT back-solves and sensitivity operations skip the symbolic factor, AMD ordering, and numeric factorization.

Python

import numpy as np
import pounce

problem = pounce.Problem(...)
solver = pounce.Solver(problem)

x, info = solver.solve(x0=x0)
assert solver.converged

# Parametric step ∂x*/∂p · Δp, with p pinned by g(x) row indices.
dx = solver.parametric_step([2, 3], [-0.5, 0.0])

# Reduced Hessian B K⁻¹ Bᵀ over the same pinned-row set.
hr = solver.reduced_hessian([2, 3])

# Raw KKT back-solve, useful for custom workflows.
dim = solver.kkt_dim
rhs = np.zeros(dim)
lhs = solver.kkt_solve(rhs)

The KKT compound vector is laid out as x || s || y_c || y_d || z_l || z_u || v_l || v_u. Pin indices in parametric_step / reduced_hessian are 0-based row indices into g(x); they are mapped internally to the matching y_c rows (through the equality/inequality split, so inequalities may precede the pins).
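
Given that layout, a kkt_solve result can be sliced into named blocks once the block lengths are known. The helper below is a hypothetical sketch: split_compound is not part of the pounce API, and the sizes dict is assumed to be supplied by the caller from the problem dimensions.

```python
BLOCK_ORDER = ["x", "s", "y_c", "y_d", "z_l", "z_u", "v_l", "v_u"]


def split_compound(vec, sizes):
    """Slice a compound KKT vector into named blocks; sizes maps block name to length."""
    if len(vec) != sum(sizes[name] for name in BLOCK_ORDER):
        raise ValueError("compound vector length does not match block sizes")
    blocks, start = {}, 0
    for name in BLOCK_ORDER:
        blocks[name] = vec[start:start + sizes[name]]
        start += sizes[name]
    return blocks


# Illustrative dimensions: 2 variables, 1 equality, 1 inequality (hence 1 slack).
sizes = {"x": 2, "s": 1, "y_c": 1, "y_d": 1, "z_l": 2, "z_u": 0, "v_l": 1, "v_u": 0}
blocks = split_compound(list(range(8)), sizes)
print(blocks)
```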

All back-solves are in natural (unscaled) units — any NLP scaling the IPM applied internally is undone, so results are independent of nlp_scaling_method (#128). The solver-space values remain available via reduced_hessian(pins, scaled=True) / kkt_solve(rhs, scaled=True), and the factors via the Solver.nlp_scaling dict — see Sensitivity Analysis.

pounce.Problem.solve() and Problem.solve_with_sens() still work unchanged — each internally builds a fresh session — but new code that issues more than one sensitivity query per solve should prefer pounce.Solver to skip rebuilding the application.

C

IpoptProblem prob = CreateIpoptProblem(...);
AddIpoptStrOption(prob, "linear_solver", "feral");

/* Consumes prob — the IpoptSolver is now the sole owner.
   prob is NULLed; calling FreeIpoptProblem(prob) on the now-null
   pointer is harmless. */
IpoptSolver sol = IpoptCreateSolver(&prob);

double x[n], obj;
IpoptSolverSolve(sol, x, NULL, &obj, NULL, NULL, NULL, user_data);

Index dim = IpoptSolverGetKktDim(sol);     /* compound KKT dim     */
double rhs[dim], lhs[dim];                  /* memset rhs as needed */
IpoptSolverKktSolve(sol, rhs, lhs);

Index pins[2] = {2, 3};
double deltas[2] = {-0.5, 0.0};
double dx[n];
IpoptSolverParametricStep(sol, 2, pins, deltas, dx);

double hr[2 * 2];                           /* column-major dense   */
IpoptSolverReducedHessian(sol, 2, pins, 1.0, hr);

IpoptFreeSolver(sol);

The classic IpoptSolve API is unchanged and unaffected; the session handle lives alongside it.

Rust

use pounce_sensitivity::Solver;

let mut solver = Solver::new(app, tnlp);
solver.solve();
assert!(solver.converged().is_some());

let dx = solver.parametric_step(&[2, 3], &[-0.5, 0.0])?;
let hr = solver.compute_reduced_hessian(&[2, 3], 1.0)?;

let dim = solver.kkt_dim().unwrap();
let rhs = vec![0.0; dim];          // fill in the right-hand side as needed
let mut lhs = vec![0.0; dim];
solver.kkt_solve(&rhs, &mut lhs)?;

For purely linear-algebra uses with no IPM in the loop:

use pounce_linsol::Factorization;

let mut fact = Factorization::new(dim, ia, ja, &values, backend)?;
fact.solve(&mut rhs, 1)?;          // back-substitute in place
fact.refactor(&new_values)?;       // pattern preserved; numeric reuse
fact.solve_one(&mut another_rhs)?;

What’s preserved across operations

Symbolic factor / AMD ordering. Owned by the linear-solver backend; reused on every back-solve and on refactor().
Numeric factor. Reused on every back-solve until you refactor.
The converged primal-dual state (x*, multipliers, g(x*), iteration stats).

What’s not preserved across `solve()` calls

The session is currently a factor-and-query value: one solve, many follow-up operations. A separate resolve() that re-runs the IPM while reusing the symbolic factor + AMD ordering across top-level solves (for MPC / B&B / warm-start workloads) is planned but not yet implemented. Each solve() call today runs a fresh IPM.

Verification

All session entry points are tested for numerical equivalence with the corresponding one-shot APIs:

pounce.Solver.solve ≡ Problem.solve (1e-12).
pounce.Solver.parametric_step ≡ Problem.solve_with_sens(deltas=…)['dx'] (1e-10).
pounce.Solver.reduced_hessian ≡ Problem.solve_with_sens(compute_reduced_hessian=True)['reduced_hessian'] (1e-10).
pounce_sensitivity::Solver::parametric_step ≡ SensSolve::with_deltas (1e-10).

See python/tests/test_solver_session.py and crates/pounce-sensitivity/tests/solver_session.rs for the full test matrix.

Differentiable Solves & the `DiffHandoff` Contract

POUNCE solves are differentiable: a solve can sit inside a JAX or PyTorch model and pass gradients with respect to the problem parameters. This page documents the handoff contract — the well-defined bundle of post-convergence data every solve produces — so that any consumer (the built-in JAX/Torch layers, a downstream tool such as discopt, or your own autodiff code) can differentiate a POUNCE solve from one stable surface rather than from solver internals.

Design notes: dev-notes/diff-handoff-contract.md.

What a differentiable backward needs

The gradient of an optimal solution x*(p) with respect to a parameter p comes from the implicit-function theorem applied to the KKT conditions at the solution. To assemble it, a backward pass needs:

the primal solution x* and the constraint / bound multipliers;
the active set — which variable bounds bind and which constraint rows are active — so inactive directions drop out correctly;
(for performance) the converged KKT factorization, reused as a back-solve rather than rebuilt.

POUNCE produces all three. The first two ride out in the solve info dict; the third is reused automatically by JaxProblem (see Sessions).
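
To make the implicit-function step concrete, here is a small NumPy sketch on an equality-constrained QP, where the KKT conditions are linear and the sensitivity is one back-solve against the KKT matrix. It is independent of POUNCE: the problem data are made up, and with inequalities only the active rows would enter the KKT block.

```python
import numpy as np

# min 0.5 x'Qx - p'x  s.t.  A x = b, with the linear term p as the parameter.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
n, m = Q.shape[0], A.shape[0]
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])   # KKT matrix at the solution


def solve_qp(p):
    """Solve the KKT system K [x; lam] = [p; b] and return x*(p)."""
    return np.linalg.solve(K, np.concatenate([p, b]))[:n]


# Implicit-function theorem: differentiating the KKT system in p gives
# K [dx/dp; dlam/dp] = [I; 0], so dx*/dp is the top block of one back-solve.
rhs = np.vstack([np.eye(n), np.zeros((m, n))])
dx_dp = np.linalg.solve(K, rhs)[:n, :]

# Central finite differences as an independent check.
p0, eps = np.array([1.0, -1.0]), 1e-6
fd = np.column_stack([
    (solve_qp(p0 + eps * e) - solve_qp(p0 - eps * e)) / (2 * eps)
    for e in np.eye(n)
])
print(dx_dp)
print(np.max(np.abs(dx_dp - fd)))
```

The rows of A annihilate dx_dp, which reflects that any parameter change must move the solution along the constraint.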

The active-set masks (the `DiffHandoff` core)

Every NLP solve’s info dict carries a precomputed active set, derived once on the Rust side (pounce_sensitivity::DiffHandoff) so no consumer re-derives it under its own tolerance:

`info` key	Type	Meaning
`pinned_vars`	`bool[n]`	Variable `i` has an active bound — its sensitivity is zero (`dx_i/dp = 0`). True when `mult_x_L[i] > active_tol` or `mult_x_U[i] > active_tol`.
`active_constraints`	`bool[m]`	Constraint row `i` is active: an equality (`g_l[i] == g_u[i]`) or a binding inequality (`abs(mult_g[i]) > active_tol`).
`active_tol`	`float`	The activity threshold used to derive the two masks above (default `1e-6`).

pinned_vars is the seam used for mixed-integer problems: a branch-and-bound leaf fixes integer variables at their optimal values, and those variables differentiate exactly like an active bound (dx/dp = 0). A producer of a fixed-integer leaf adds them to the mask (DiffHandoff::pin on the Rust side).
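
For reference, the rule in the table can be written out as follows. This is only an illustration of the stated definitions with made-up multipliers; consumers should read the precomputed masks from the info dict rather than recompute them, and active_set_masks is a hypothetical name.

```python
def active_set_masks(mult_x_L, mult_x_U, mult_g, g_l, g_u, active_tol=1e-6):
    """Apply the documented activity rule to multipliers and constraint bounds."""
    pinned_vars = [
        zl > active_tol or zu > active_tol
        for zl, zu in zip(mult_x_L, mult_x_U)
    ]
    active_constraints = [
        lo == hi or abs(lam) > active_tol       # equality, or binding inequality
        for lam, lo, hi in zip(mult_g, g_l, g_u)
    ]
    return pinned_vars, active_constraints


pinned, active = active_set_masks(
    mult_x_L=[1.2, 0.0, 3e-9],
    mult_x_U=[0.0, 0.0, 0.0],
    mult_g=[-0.4, 2e-8, 0.0],
    g_l=[25.0, 0.0, 40.0],
    g_u=[1e19, 10.0, 40.0],
)
print(pinned, active)
```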

Multiplier conventions (canonical mapping)

The same dual quantity is named differently across POUNCE’s solver surfaces — deliberately, because each surface preserves an external contract. The canonical field is DiffHandoff.lambda (general constraint multipliers); this table maps every surface onto it so a consumer knows the correspondence:

Surface	Problem form	General-constraint dual	Bound duals	Why this naming
NLP (`Problem`, C ABI)	`min f(x) s.t. g_l ≤ g(x) ≤ g_u, x_l ≤ x ≤ x_u`	`mult_g`	`mult_x_L`, `mult_x_U`	cyipopt-compatible — drop-in for cyipopt / JuMP / AMPL clients.
Convex QP/SOCP (`solve_qp`)	`min ½xᵀPx + cᵀx s.t. Gx ≤ h, Ax = b`	`z` (inequality `G`), `y` (equality `A`)	`z_lb`, `z_ub`	OptNet / convex-solver convention (Amos & Kolter 2017).
`DiffHandoff` (canonical)	general	`lambda`	`mult_x_lower`, `mult_x_upper`	one name for the contract.

Caution. The internal symbol lam is not a single quantity: in the NLP backward (jax/_diff.py) it is all constraint multipliers (= mult_g); in the QP backward (jax/_qp.py) it is the inequality-only duals (= z). Always map through the table above rather than assuming a shared name means a shared quantity.

These names are stable: the NLP keys are an external cyipopt contract and will not be renamed.

Consuming the contract

JAX / PyTorch (built in)

pounce.jax and pounce.torch already differentiate solves; you do not touch the masks directly. Use pounce.jax.solve / JaxProblem (or the torch equivalents) and call jax.grad / .backward() as usual. For batched and repeated solves, JaxProblem reuses the converged KKT factor in the backward (factor_reuse=True, default) — see Sessions.

Across a language / tool boundary (e.g. discopt)

A downstream tool that drives POUNCE as its NLP backend and composes its own autodiff reads the contract straight from the info dict returned by Problem.solve:

x, info = problem.solve(x0=...)
# primal + duals
lam   = info["mult_g"]      # general-constraint multipliers (the canonical λ)
z_L   = info["mult_x_L"]
z_U   = info["mult_x_U"]
# precomputed active set — do NOT re-derive |mult| > tol yourself
pinned = info["pinned_vars"]          # bool[n]: dx/dp = 0 on these
active = info["active_constraints"]   # bool[m]: rows in the KKT block
tol    = info["active_tol"]

Because the active set is computed once in the producer, every consumer sees the same masks under the same tolerance — which is what makes a gradient assembled on one side of the boundary agree with one assembled on the other.

Verification

The contract is exercised by the test suite:

python/tests/test_problem.py::test_diff_handoff_masks_in_info asserts the masks against a problem with a known active set (HS071: one variable on its lower bound, a binding inequality, and an equality).
python/tests/test_jax.py (85 finite-difference gradient checks) and python/tests/test_parity_jax_torch.py (JAX↔Torch gradient agreement) confirm the backward passes that rest on this data are correct and frontend-independent.

Interactive Solver Debugger

POUNCE ships an interactive debugger for the interior-point loop — a pdb for the IPM. You can pause the solve at well-defined points, inspect and mutate the live mathematical state (the iterate, multipliers, the barrier parameter μ), set breakpoints (by iteration, on a numeric condition, or on a solver event), step through an iteration’s internal phases, rewind to an earlier iterate, re-solve from a saved point with new options, and drop in automatically when a solve fails.

It has two front ends sharing one command engine:

a human REPL (--debug) with history, Ctrl-R search, and Tab completion, and
a newline-delimited JSON protocol (--debug-json) that an LLM agent, a script, or a visual debugger (e.g. a VS Code Debug Adapter) can drive programmatically.

To our knowledge, no other production NLP solver ships a comparable tool; with ipopt, the available instruments are print_level and a log. This is a live debugger.

The same debugger spans every POUNCE solver — the NLP filter-IPM and the convex / conic interior-point solver share one command engine and one REPL. See Beyond the interior-point loop for the small set of commands whose availability is backend-conditional.

The debugger has zero effect on the solve when it is not attached. The checkpoint fire-sites short-circuit when no debugger is installed, so the standard regression suite is bit-for-bit identical with and without the feature compiled in.

Quick start

pounce problem.nl --debug          # human REPL, pauses at iteration 0
pounce problem.nl --debug-json     # JSON protocol on stdin/stdout
pounce problem.nl --debug-on-error # run freely; drop in only if it fails
pounce problem.nl --debug-on-interrupt   # run; Ctrl-C drops you in

A 30-second session (human REPL):

$ pounce --problem rosenbrock --debug

── pounce-dbg ── iter 0 @iter_start  mu=1.000e-1  obj=2.420000e1  inf_pr=0.00e0  inf_du=1.00e2
pounce-dbg> info
iter      = 0
mu        = 1.000000e-1
objective = 2.42000000e1
...
pounce-dbg> print x
x = [-1.200000e0, 1.000000e0]
pounce-dbg> break if inf_du<1e-6
conditional breakpoint: inf_du<1e-6
pounce-dbg> continue
... solver runs ...
── pounce-dbg ── iter 21 @iter_start  mu=...  inf_du=8.7e-7
   ↳ inf_du<1e-6
pounce-dbg> quit

The prompt is on stderr; the solver’s own iteration table stays on stdout, so a redirected log is unaffected.

The two front ends

	`--debug` (REPL)	`--debug-json`
Audience	human at a terminal	agent / script / GUI
Channel	prompt + output on stderr	pure JSON on stdout
Line editing	rustyline: history (`~/.pounce_dbg_history`), Ctrl-R, Tab completion	n/a (caller supplies UI)
Solver table	shown on stdout	suppressed (`print_level 0`)
Commands	bare strings	bare strings or `{"cmd":…,"args":[…],"id":…}`

On a non-TTY stdin (a pipe), the REPL falls back to a plain line reader (no history/Tab) but otherwise behaves identically — handy for scripted tests.

The JSON protocol is documented in full below.
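
As a rough sketch of the object form in the table, a client could build newline-delimited command lines like this. Only the field names cmd, args, and id come from the text above; how arguments are tokenized and what type id takes are assumptions here, and encode_command is a hypothetical helper.

```python
import json


def encode_command(cmd, args=(), request_id=None):
    """Serialize one debugger command as a single newline-terminated JSON line."""
    message = {"cmd": cmd, "args": list(args)}
    if request_id is not None:
        message["id"] = request_id          # assumed: an integer correlation id
    return json.dumps(message) + "\n"


lines = [
    encode_command("print", ["x"], request_id=1),   # assumed split of "print x"
    encode_command("step", request_id=2),
]
for line in lines:
    print(line, end="")
```

A driver would write these lines to the stdin of a `pounce ... --debug-json` process and read one JSON value per line from its stdout.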

Pausing and flow control

Checkpoints

The loop fires the debugger at these points (a pause reports which one via its checkpoint field):

Checkpoint	Fires	What’s fresh
`iter_start`	top of each outer iteration	the accepted iterate from the previous step
`after_mu`	μ updated for this iteration	the new barrier parameter
`after_search_dir`	Newton step `δ` solved	the step (`dx` …), regularization, KKT inertia
`after_step`	trial accepted	the step lengths α, the new iterate
`step_rejected`	line search gave up (tiny step / all backtracks failed), before restoration	the search direction `δ` and the un-accepted iterate
`pre_restoration_entry`	just before restoration	the iterate that tripped restoration
`post_restoration_exit`	restoration returned	what restoration produced
`terminated`	once, before the solve returns	the final / failing iterate + status

By default the debugger only stops at iter_start (and terminated). The sub-iteration checkpoints fire every iteration but resume immediately unless you ask to stop at them.

Stepping into restoration. The same debugger drives the restoration inner IPM: when the solve enters restoration, the inner solve’s checkpoints fire too. A step/stepi that lands on an inner iteration pauses there with in_restoration: true (REPL banner shows [restoration]), and print x shows the restoration sub-NLP iterate. stop-at resto (pre_restoration_entry) is the easy way to catch the hand-off and then step inward.

Stepping

Command	Effect
`step` / `s` / `n`	run to the next `iter_start`
`step sub` / `stepi` / `si`	run to the next checkpoint of any kind (walk an iteration’s phases)
`continue` / `c`	run to the next breakpoint (or to completion)
`run N` / `r N`	run until iteration `N`
`stop-at <cp>`	always pause at checkpoint `<cp>`
`detach`	stop pausing; run to completion
`quit` / `q`	stop the solve now

stop-at takes a checkpoint name or a friendly alias:

stop-at after_search_dir     # or:  stop-at kkt
stop-at pre_restoration_entry   # or:  stop-at resto
stop-at                      # list active stop-at checkpoints
stop-at clear

Aliases: mu → after_mu, kkt/search_dir → after_search_dir, step → after_step, resto → pre_restoration_entry, resto_exit → post_restoration_exit.

Breakpoints

Three kinds (by iteration, conditional, and on a solver event), all listed by the break command and surfaced as the pause reason.

By iteration

break 12            # pause at iteration 12   (alias: b 12)
tbreak 12           # one-shot: pause at 12, then delete itself (alias: tb)
break               # list all breakpoints
break del 12        # remove
break clear         # remove everything (iters + conditions + events)

Watchpoints (data breakpoints)

watchpoint x[3]        # pause when x[3] changes (alias: wp)
watchpoint x 1e-3      # pause when any x component moves by > 1e-3
watchpoint             # list; watchpoint del x[3]; watchpoint clear

Distinct from watch (which only displays): a watchpoint pauses the solve when the watched value changes by more than its threshold (default 0 = any change) between iterations. Useful for a component expected to stay put (e.g. a variable pinned at a bound).
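
The firing rule amounts to a thresholded comparison of the watched value between consecutive iterations. A minimal sketch, with the hypothetical name watchpoint_fires and made-up iterates:

```python
def watchpoint_fires(previous, current, threshold=0.0):
    """True when any watched component moved by more than threshold."""
    return any(abs(c - p) > threshold for p, c in zip(previous, current))


x_prev = [1.0, 2.0, 0.0]
x_next = [1.0, 2.0005, 0.0]

fires_any_change = watchpoint_fires(x_prev, x_next)            # default threshold 0
fires_above_1e3 = watchpoint_fires(x_prev, x_next, 1e-3)       # like: watchpoint x 1e-3
fires_single = watchpoint_fires([x_prev[2]], [x_next[2]])      # like: watchpoint x[2]
print(fires_any_change, fires_above_1e3, fires_single)
```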

Breakpoint command lists

Attach commands to a breakpoint that run automatically when it hits — semicolon-separated, ending with a flow command to auto-resume:

break 5
commands 5 print kkt ; set mu 0.1 ; continue   # at iter 5: inspect, tweak μ, go
commands 5 clear                               # remove
commands                                       # list all

When iteration 5 is reached, the debugger emits the pause, runs the attached commands (each result is reported), and if one of them resumes/stops, honors it without dropping to the prompt — otherwise it falls through to the interactive prompt as usual.

Conditional (with compound predicates)

break if inf_pr<1e-6
break if mu<1e-4 && inf_pr>1e-3
break if iter>10 && (inf_du>1e-2 || obj<0)
break clear cond

Metrics: mu, inf_pr, inf_du, obj, err (overall NLP error), iter.
Operators: <, <=, >, >=, == (== is float-tolerant).
Compound: && and ||, evaluated strictly left-to-right with no precedence; parentheses are accepted but stripped (they don’t group). For real grouping, register several conditions — any one that holds fires.

Conditions are evaluated at iter_start.
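
The left-to-right rule is easy to misread, so here is a small Python sketch of an evaluator with the same semantics: parentheses are stripped and && / || are folded in order with no precedence. It is not the debugger's parser; the tolerance used for == is an assumption.

```python
import operator
import re

SPLIT = re.compile(r"\s*(&&|\|\|)\s*")
ATOM = re.compile(r"^\s*(\w+)\s*(<=|>=|==|<|>)\s*(\S+)\s*$")
EQ_TOL = 1e-12   # assumed relative tolerance for the float-tolerant ==
OPS = {
    "<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge,
    "==": lambda a, b: abs(a - b) <= EQ_TOL * max(1.0, abs(a), abs(b)),
}


def eval_atom(text, metrics):
    name, op, value = ATOM.match(text).groups()
    return OPS[op](metrics[name], float(value))


def eval_condition(expr, metrics):
    """Fold && and || strictly left to right; parentheses are dropped, not honored."""
    parts = SPLIT.split(expr.replace("(", " ").replace(")", " "))
    result = eval_atom(parts[0], metrics)
    for op, atom in zip(parts[1::2], parts[2::2]):
        value = eval_atom(atom, metrics)
        result = (result and value) if op == "&&" else (result or value)
    return result


metrics = {"iter": 5, "inf_du": 1.0, "obj": -1.0}
fires = eval_condition("iter>10 && (inf_du>1e-2 || obj<0)", metrics)
print(fires)
```

With these metrics the condition fires: it is read as (iter>10 && inf_du>1e-2) || obj<0, whereas honoring the parentheses would give false because iter>10 fails.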

On a solver event

break on regularized
break on resto_entered
break clear events

Event	Fires when
`resto_entered`	the algorithm enters restoration
`resto_exited`	restoration returns
`regularized`	the KKT system needed regularization (δ_w > 0 — inertia correction)
`tiny_step`	the primal step is numerically negligible (‖dx‖∞ < 1e-10)
`ls_rejected`	the line search tried more than one trial point
`mu_stalled`	μ held (to tolerance) for 3 consecutive iterations
`nan`	the NLP error or objective became non-finite

Events fire at whatever checkpoint makes them observable (e.g. regularized at after_search_dir), and pause with reason: "event: <name>".
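
Three of these events have simple numeric definitions, sketched below in plain Python. The thresholds 1e-10 and 3 come from the table; the relative tolerance for "held" and the reading of "3 consecutive iterations" as three equal consecutive μ values are assumptions, and detect_events is a hypothetical name.

```python
import math


def detect_events(dx, mu_history, nlp_error, objective,
                  tiny_tol=1e-10, stall_count=3, mu_rtol=1e-12):
    """Return the names of the events that the given iteration data would trigger."""
    events = []
    # tiny_step: infinity norm of the primal step below the threshold.
    if max((abs(v) for v in dx), default=0.0) < tiny_tol:
        events.append("tiny_step")
    # mu_stalled: the last stall_count values of mu agree to tolerance.
    recent = mu_history[-stall_count:]
    if len(recent) == stall_count and all(
        math.isclose(mu, recent[0], rel_tol=mu_rtol) for mu in recent
    ):
        events.append("mu_stalled")
    # nan: NLP error or objective is non-finite.
    if not (math.isfinite(nlp_error) and math.isfinite(objective)):
        events.append("nan")
    return events


events = detect_events(
    dx=[1e-12, -5e-11],
    mu_history=[0.1, 0.02, 0.02, 0.02],
    nlp_error=1.0,
    objective=2.0,
)
print(events)
```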

Inspecting state

info                 # summary: iter, mu, obj, inf_pr, inf_du, nlp_error, dims
print x              # a primal/dual block (alias: p x)
print dx             # a search-direction block (d + block name)
print mu             # a scalar: mu|obj|inf_pr|inf_du|err|compl|iter
print kkt            # KKT inertia + regularization (see below)
print rank           # SVD numerical rank of the equality Jacobian J_c (see below)
print active         # which bound categories are near-active (small slack)
watch mu             # auto-print a target at every pause (alias: display)
watch                # list watches; watch del mu; watch clear

watch <target> registers any print target (block, dx, scalar, kkt) to be shown automatically at every subsequent pause — the debugger’s equivalent of gdb’s display. In JSON mode the values arrive in the pause event’s watches array.

Blocks (the eight components of the primal-dual iterate):

Name	Meaning
`x`	primal variables
`s`	inequality slacks
`y_c`	equality-constraint multipliers
`y_d`	inequality-constraint multipliers
`z_l`, `z_u`	bound multipliers on `x`
`v_l`, `v_u`	bound multipliers on `s`

Prefix any block with d (dx, dz_l, …) to print the corresponding block of the most recent Newton step.

Model names (`.col` / `.row`)

A solver-internal diagnostic that says “variable 132 in equation 3 looks singular” is far less actionable than one that says “T_reactor in energy_balance”. Lee et al. (2024) identify this gap — between detecting an issue numerically and tracing it back to a named equation in the modeling environment — as a central roadblock for debugging equation-oriented models.¹

AMPL .nl files carry no names, but AMPL emits two optional sibling files when the modeler sets option auxfiles rc;:

File	Contents
`stub.col`	one variable name per line, in column order
`stub.row`	one constraint name per line, in row order

When these sit next to the .nl, pounce captures them (NlProblem::var_names / con_names) and exposes them through the ExpressionProvider::variable_name / constraint_name seam. Missing or malformed name files are non-fatal — names are a diagnostic aid, never load-blocking, so the debugger simply falls back to index labels.

print residuals uses these names directly. Residual values live in the solver’s split space (equalities and inequalities separated, fixed variables removed), so a name only labels the right row if it is carried through the same permutations. The TNLP publishes its .col/.row names under the conventional idx_names metadata key, and OrigIpoptNlp projects them into split space (x_not_fixed_map for variables, c_map for equalities, d_map for inequalities) — the debugger reads the result via DebugCtx::split_names. So a near-singular equality residual prints as

c[energy_balance] = +3.142e-04   |3.142e-04|

instead of c[3]. The same idx_names pool labels grad_x_L[...] (variable names) and grad_s_L[...] / d-s[...] (inequality names). The JSON payload keeps the numeric index and adds a name field.
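
The projection idea can be sketched as a lookup through an index map. The direction of the maps is an assumption here (entry k of the map holds the original index of split-space entry k), the row names and maps are invented, and project_names / label are hypothetical helpers rather than the OrigIpoptNlp code.

```python
def project_names(names, index_map):
    """Names for split-space entries; index_map[k] is the original index of entry k."""
    return [names[orig] if orig < len(names) else None for orig in index_map]


def label(prefix, k, split_names):
    """Prefer the model name, fall back to the split-space index."""
    name = split_names[k] if k < len(split_names) else None
    return f"{prefix}[{name}]" if name else f"{prefix}[{k}]"


row_names = ["mass_balance", "flow_limit", "energy_balance", "pressure_drop"]
c_map = [0, 2, 3]        # original rows that are equalities (illustrative)
d_map = [1]              # original rows that are inequalities (illustrative)

c_names = project_names(row_names, c_map)
d_names = project_names(row_names, d_map)
print(label("c", 1, c_names))    # named equality residual
print(label("c", 1, []))         # no names available: index fallback
```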

Status. Capture, exposure, and print residuals labeling are live on the AMPL .nl path with names projected through the bound / c-d-split permutations. Presolve renumbers rows, so PresolveTnlp declines idx_names rather than risk mislabeling a permuted row — under presolve the debugger safely falls back to index labels. Carrying names through the presolve map and decorating print active are the next steps built on this foundation.

`print equation` — the algebra of a named constraint

Naming a culprit row is only half the story; the next question is always what does that equation actually say? Lee et al. (2024) make this the core of actionable equation-oriented diagnostics — a debugger should surface the named equation, not just a row index.¹ print equation closes that loop: once print residuals points at, say, c[energy_balance], you read the constraint’s source algebra directly.

pounce-dbg> print equation energy_balance
energy_balance:  T_reactor*flow - 300*flow - Q = 0

pounce-dbg> print equation 14          # by original .nl row index
c[14]:  x[3]^2 + x[7]^2 <= 1

A constraint is addressable by its model name (preferred, and robust to row reordering) or its original .nl row index. With no argument, print equation reports how many equations are available. The renderer works from the faithful Expr DAG the .nl parser built, not the lossy evaluation tape, so common subexpressions, imported functions, and piecewise/conditional forms render as written. The affine part is printed with tidy signs (a - 2*b, not a + -2*b), zero-coefficient Jacobian placeholders are suppressed, and bounds render in their natural relation (= rhs, lo <= body <= hi, >= lo, <= hi). The JSON payload carries {index, name, equation}.

Equations are static model data in original .nl row order, so unlike residuals they need no split-space projection — print equation works regardless of presolve. It is available whenever a model was loaded from an .nl file; the JSON name field is present only when a .row auxfile supplied one.

`print kkt` — inertia and regularization

Available at/after after_search_dir (use stop-at kkt). This is the view a solver expert reaches for when a step looks wrong:

pounce-dbg> stop-at kkt
pounce-dbg> continue
── pounce-dbg ── iter 3 @after_search_dir ...
pounce-dbg> print kkt
dim       = 3
inertia   = n+=2 n-=1 (expected n-=1) → correct
delta_w   = 0.000000e0   (primal regularization)
delta_c   = 0.000000e0   (dual regularization)
status    = Success

The augmented (KKT) system has expected inertia (n₊ = n, n₋ = m, n₀ = 0), where n is the number of primal variables plus slacks and m is the number of equality + inequality multipliers. A mismatch, or a nonzero delta_w/delta_c, is the classic signal that the step is being stabilized (the solver added regularization to fix the inertia).
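
The inertia count can be reproduced on a toy system with NumPy by counting eigenvalue signs. This sketch uses a dense eigendecomposition for clarity, whereas a solver reads the inertia off its LDLᵀ factor; the matrices and the delta_c value are illustrative.

```python
import numpy as np


def inertia(K, tol=1e-10):
    """Return (n_plus, n_minus, n_zero) from the eigenvalues of a symmetric matrix."""
    eig = np.linalg.eigvalsh(K)
    return (int((eig > tol).sum()),
            int((eig < -tol).sum()),
            int((np.abs(eig) <= tol).sum()))


H = 2.0 * np.eye(2)                          # Hessian block, n = 2

# One independent equality row: expected inertia (2, 1, 0).
J = np.array([[1.0, 1.0]])
K_ok = np.block([[H, J.T], [J, np.zeros((1, 1))]])

# A duplicated row makes the system singular: a zero eigenvalue appears.
J_dup = np.array([[1.0, 1.0], [1.0, 1.0]])
K_singular = np.block([[H, J_dup.T], [J_dup, np.zeros((2, 2))]])

# Dual regularization: -delta_c on the constraint block restores (n, m, 0).
delta_c = 1e-8
K_regularized = np.block([[H, J_dup.T], [J_dup, -delta_c * np.eye(2)]])

print(inertia(K_ok), inertia(K_singular), inertia(K_regularized))
```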

For the matrix and factor themselves:

viz kkt     # the assembled augmented-system matrix (triplets) + inertia
viz L       # the LDLᵀ factor (strict-lower triplets + values)

viz kkt writes the KKT matrix as 1-based lower-triangle triplets (dim, irn, jcn, vals) alongside the inertia summary — point $POUNCE_DBG_VIEWER at a heatmap script. viz L writes the LDLᵀ factor (n, fill-reducing perm, strict-lower l_irn/l_jcn/l_vals in permuted coordinates), read out of the factor the solver actually computed.
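
A viewer script typically starts by expanding the triplets into a full symmetric array. The sketch below assumes the four fields have already been parsed into Python values (the on-disk format is not specified here) and that duplicate entries are summed; triplets_to_dense is a hypothetical helper.

```python
def triplets_to_dense(dim, irn, jcn, vals):
    """Expand 1-based lower-triangle triplets into a full symmetric dense matrix."""
    K = [[0.0] * dim for _ in range(dim)]
    for i, j, v in zip(irn, jcn, vals):
        K[i - 1][j - 1] += v                 # convert to 0-based indices
        if i != j:
            K[j - 1][i - 1] += v             # mirror the strict lower triangle
    return K


K = triplets_to_dense(
    dim=3,
    irn=[1, 2, 3, 3],
    jcn=[1, 2, 1, 2],
    vals=[2.0, 2.0, 1.0, 1.0],
)
for row in K:
    print(row)
```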

Both are read-only and always show the most recent factorization: the current iteration’s system at an after_search_dir stop, or the previous iteration’s at the default iter_start pause (the step that produced where you’re standing). The matrix and factor are captured every iteration while the debugger is stepping; once you detach (run free) the capture is dropped — so on a large problem a free run doesn’t pay the O(nnz) assembly. If you viz kkt/viz L right after a free run, step once to re-capture.

`print rank` — numerical rank of the equality Jacobian

print kkt tells you that the dual system needed regularization (delta_c > 0) or that the inertia was wrong; the structural_singularity finding names equations that are dependent by sparsity pattern. print rank closes the last gap: a rank-revealing SVD of the equality Jacobian J_c at the current iterate. It factors the matrix the solver actually sees (constraint scaling already applied), so it localizes the dependency to specific equations — including dependencies that are numerical only (values that cancel over a full sparsity pattern), which the structural Dulmage–Mendelsohn pass cannot detect.

It doesn’t just name the culprit equations — it prints them. When a .nl model is loaded, each implicated row’s source algebra is rendered directly beneath it (the same DAG-faithful text print equation shows), so you read the dependency without a second command:

pounce-dbg> print rank
equality Jacobian J_c: 3 row(s) × 4 column(s)
numerical rank = 2 / 3  (deficiency 1)
σ_max = 3.162e0   σ_min = 0.000e0   cond = inf (σ_min = 0)   (rank tol τ = 1.40e-15)
singular values: [3.162e0, 1.414e0, 0.000e0]
rank-deficient: 1 equation(s) lie in the near-null space (linearly dependent / redundant) — the source of δ_c regularization:
  c[mass_balance]       (participation 0.50)
      x[0] + x[1] - 10 = 0
  c[mass_balance_dup]   (participation 0.50)
      x[0] + x[1] - 10 = 0

The two equations print identically — that is the redundancy, now visible on its face.

For the SVD J_c = U Σ Vᵀ, the left singular vectors u_k whose singular value σ_k ≈ 0 span the left null space — the row combinations u_kᵀ J_c ≈ 0 that vanish. Each row’s participation w_i = Σ_{k : σ_k ≤ τ} u[i,k]² ∈ [0, 1] localizes the dependency: a redundancy shared between two equations splits ≈ 0.5/0.5, while w_i = 1 means row i lies entirely in the null space. The numerical-rank threshold is the standard LAPACK/NumPy τ = σ_max · max(m, n) · ε; the implicated rows are resolved to model names through the same .row plumbing as print residuals / print equation.
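
The participation weights can be reproduced with NumPy on a toy Jacobian containing a duplicated row. This is a sketch of the formula above rather than the debugger's code; rank_participation is a hypothetical name, and ε is taken as NumPy's double-precision machine epsilon.

```python
import numpy as np


def rank_participation(J):
    """Return (rank, tol, weights) with weights[i] = sum of u[i,k]^2 over sigma_k <= tol."""
    m, n = J.shape
    U, sigma, _ = np.linalg.svd(J, full_matrices=True)    # U is m x m
    tol = sigma.max() * max(m, n) * np.finfo(float).eps
    sigma_full = np.zeros(m)                 # pad with zeros when m > n
    sigma_full[:sigma.size] = sigma
    null_cols = sigma_full <= tol            # left singular vectors spanning the null space
    weights = (U[:, null_cols] ** 2).sum(axis=1)
    return int((sigma > tol).sum()), tol, weights


J_c = np.array([[1.0, 1.0, 0.0, 0.0],        # some equation
                [1.0, 1.0, 0.0, 0.0],        # the same equation again
                [0.0, 0.0, 1.0, 2.0]])       # an independent equation
rank, tol, weights = rank_participation(J_c)
print(rank, tol)
print(np.round(weights, 3))
```

The two duplicated rows share the weight equally and the independent row receives none, matching the 0.5/0.5 split described above.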

The inline algebra is resolved by model name, so it appears only for named rows. The rank report indexes rows by their position in the split equality block, whereas the equation source is keyed on the original .nl row, so an unnamed row can’t be mapped; in that case print rank falls back to a print equation <name> hint instead of guessing. When J_c has full row rank, that is reported as a positive signal (J_c has full row rank at this iterate.), with σ_min and cond showing how far the matrix is from degenerate, because silence would be ambiguous. The command is available whenever the iterate has an equality block; a problem with no equality constraints returns a short explanatory error.

The JSON payload is {iter, n_rows, n_cols, rank, deficiency, rank_deficient, sigma_max, sigma_min, cond, tol, singular_values, culprits: [{row, kind, index, name, label, weight, equation}]} (equation is the rendered source, or null when unresolved; cond is null when σ_min = 0, since JSON has no infinity).

`diagnose` — a live, named health report

info, print residuals, and print kkt each expose one facet of the current iterate. diagnose (alias diag) runs a panel of heuristics over all of them at once and returns a ranked list of findings — and, crucially, names the culprit equation or variable behind each numerical symptom. That last step is the actionable-diagnostics path of Lee et al. (2024):¹ a report that says “mass_balance is the worst constraint residual” is worth far more than “row 13 is infeasible.”

pounce-dbg> diagnose
[  error] primal_infeasible: Primal infeasibility 1.70e+02; worst constraint
         residual is c[mass_balance] = +1.701e+02. Inspect this equation's
         feasibility and scaling (`print equation mass_balance`).
[warning] dual_infeasible: Dual infeasibility 9.84e-01; largest stationarity
         residual is grad_x_L[T_reactor] = -9.838e-01.
[warning] inertia_wrong: KKT inertia is wrong (n-=2 vs expected 1): the system
         was indefinite/singular and the step had to be stabilized.
[   info] bounds_pinned: 3 variable bound(s) are active (slack < 1e-6).

This is the live counterpart to the pounce-studio diagnose tool, which runs temporal heuristics over a finished solve report. The two share a {severity, code, message} shape so a client can treat them uniformly, but the live command sees what a saved report cannot: the current KKT inertia and regularization, and the named primal/dual residuals at this exact point. Findings are sorted error → warning → info; a clean iterate yields a single healthy finding. The checks:

code	severity	fires when
`primal_infeasible`	error/warning	`inf_pr` above tol → names the worst constraint residual
`dual_infeasible`	warning	`inf_du` above tol → names the worst stationarity residual
`inertia_wrong`	warning	KKT inertia ≠ expected (rank-deficient Jacobian / indefinite Hessian)
`heavy_regularization`	info	primal δ_w applied (Hessian indefinite)
`dual_regularization`	warning	dual δ_c applied (linearly dependent / redundant equalities)
`structural_singularity`	warning	a subset of equalities is over-determined → names the dependent equations
`rank_deficient_jacobian`	warning	SVD of `J_c` is numerically rank-deficient → names the equations in the near-null space (catches value-only dependencies too)
`large_multipliers`	warning	a multiplier exceeds 1e8 (constraint-qualification / scaling)
`bounds_pinned`	info	variables pressed against their bounds
`tiny_step`	warning	accepted α_pr collapsed
`heavy_line_search`	warning	≥10 backtracking trials for the accepted step
`in_restoration`	warning	currently inside feasibility restoration
`mu_stalled`	warning	μ flat for ≥3 consecutive iterations

KKT-derived findings (inertia_wrong, *_regularization) need a computed search direction, so they appear at or after after_search_dir. Names follow the same rule as print residuals: model names are shown on the .nl path when .col/.row files are present, and index labels (c[13]) are used under presolve. The JSON payload is {iter, findings: [{severity, code, message}], n_findings}.
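
A client might triage that payload as in the sketch below. The payload is a hypothetical example shaped like the documented {iter, findings, n_findings} object, and the helper names triage and needs_attention are illustrative, not part of any pounce API. The re-sort is purely defensive, since the debugger already orders findings.

```python
import json

# Hypothetical payload with the documented shape.
payload = json.loads("""
{"iter": 7, "n_findings": 3, "findings": [
  {"severity": "info",    "code": "bounds_pinned",     "message": "3 variable bound(s) are active."},
  {"severity": "error",   "code": "primal_infeasible", "message": "worst residual is c[mass_balance]."},
  {"severity": "warning", "code": "inertia_wrong",     "message": "KKT inertia is wrong."}
]}
""")

SEVERITY_ORDER = {"error": 0, "warning": 1, "info": 2}

def triage(payload):
    """Sort findings error -> warning -> info; unknown severities go last."""
    return sorted(
        payload["findings"],
        key=lambda f: SEVERITY_ORDER.get(f["severity"], len(SEVERITY_ORDER)),
    )

def needs_attention(payload):
    """True if any finding is an error or a warning."""
    return any(f["severity"] in ("error", "warning") for f in payload["findings"])

ordered = triage(payload)
for f in ordered:
    print(f"[{f['severity']:>7}] {f['code']}: {f['message']}")
```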

Structural rank: naming the dependent equations

inertia_wrong and dual_regularization detect a rank-deficient Jacobian, but only as a scalar — they tell you a redundancy exists, not which equations are redundant. structural_singularity closes that gap with a Dulmage–Mendelsohn decomposition of the equality Jacobian’s sparsity pattern (the same structural check at the heart of IDAES’s DiagnosticsToolbox). A maximum bipartite matching between equality rows and variables partitions the system; any over-determined block — more equations than the variables they jointly touch — forces at least one of those equations to be redundant or mutually inconsistent (LICQ fails). The finding lists those equations by model name, e.g.:

pounce-dbg> diagnose
[warning] structural_singularity: Constraint Jacobian is structurally singular
         (Dulmage–Mendelsohn): 2 equation(s) over-determine the 1 variable(s)
         they jointly touch (flow_rate), so ≥1 of them must be redundant or
         mutually inconsistent (LICQ fails on this block). Candidate dependent
         equations: mass_balance, mass_balance_dup. Inspect them with
         `print equation <name>`; this names the rows behind any δ_c
         dual-regularization / wrong-inertia signal.

This is the named-culprit payoff of Lee et al. (2024):¹ reporting “mass_balance and mass_balance_dup are linearly dependent” rather than “the Jacobian is singular.” The check is iterate-independent (it reads only the sparsity pattern), so unlike the KKT-derived findings it fires from iteration 0 — it can flag a structurally broken model before the solver ever stalls on it. It is suppressed for well-posed problems: an NLP with more variables than equality constraints is the normal case (the spare degrees of freedom are pinned by the objective, bounds, and inequalities), so only the over-determined side is reported, never the under-determined one. Available on the .nl path; names fall back to c[i]/x[i] when no .col/.row auxiliary files were emitted.
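
The idea behind the over-determined block can be sketched in a few lines of Python: build a maximum matching between equations and variables, then collect everything reachable from an unmatched equation along alternating paths. This is a minimal illustration under stated assumptions, not pounce’s implementation. The function name, the dict-based sparsity pattern, and the energy_balance equation are hypothetical.

```python
def overdetermined_block(pattern):
    """pattern maps equation name -> list of variable names it touches.

    Returns (equations, variables) of the over-determined part: every
    equation reachable from an unmatched equation by alternating paths.
    """
    matched_eq = {}  # variable -> equation currently matched to it

    def try_match(eq, seen):
        # One augmenting-path attempt (Kuhn's algorithm).
        for var in pattern[eq]:
            if var in seen:
                continue
            seen.add(var)
            if var not in matched_eq or try_match(matched_eq[var], seen):
                matched_eq[var] = eq
                return True
        return False

    unmatched = [eq for eq in pattern if not try_match(eq, set())]

    # Alternate: equation -> any variable it touches -> that variable's match.
    eqs, variables, stack = set(unmatched), set(), list(unmatched)
    while stack:
        eq = stack.pop()
        for var in pattern[eq]:
            variables.add(var)
            partner = matched_eq.get(var)
            if partner is not None and partner not in eqs:
                eqs.add(partner)
                stack.append(partner)
    return sorted(eqs), sorted(variables)

pattern = {
    "mass_balance":     ["flow_rate"],
    "mass_balance_dup": ["flow_rate"],
    "energy_balance":   ["flow_rate", "T_reactor"],  # hypothetical extra row
}
eqs, variables = overdetermined_block(pattern)
print(f"{len(eqs)} equation(s) over-determine {len(variables)} variable(s): {eqs} -> {variables}")
```

Only the over-determined side is returned; a pattern with spare variables and no unmatched equation yields empty lists, mirroring the suppression described above.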

Numerical rank: the value-dependency the structure can’t see

structural_singularity reads only the sparsity pattern, so it is blind to a redundancy that lives in the values — three equations whose every entry is nonzero (a structurally full-rank pattern) but whose rows satisfy row₂ = row₀ + row₁ numerically. rank_deficient_jacobian is the numerical complement: it runs the same SVD as print rank over J_c at the current iterate and, when the numerical rank falls short, names the equations in the near-null space:

pounce-dbg> diagnose
[warning] rank_deficient_jacobian: Equality Jacobian J_c is numerically
         rank-deficient at this iterate: rank 2/3 (deficiency 1),
         σ_min=0.00e0, cond=inf (σ_min = 0). Linearly dependent or redundant
         equality constraints — the root cause behind δ_c regularization /
         wrong inertia. Implicated equations: c[mass_balance],
         c[mass_balance_dup].

Unlike the structural check, this one is iterate-dependent — it factors J_c at the current x, so it reflects the matrix the solver is actually regularizing and catches dependencies that only appear at certain points. The two checks are deliberately layered: structural_singularity fires from iteration 0 on the pattern alone; rank_deficient_jacobian confirms it numerically and, more importantly, surfaces the value-only dependencies the structural pass provably cannot. See print rank for the SVD math and the per-equation participation weights.

Mutating state

Mutations feed straight back into the solve.

set mu 0.5           # overwrite the barrier parameter
set x[2] 1.5         # overwrite one component of a block
set x 1.0,2.0,3.0    # overwrite a whole block (comma-separated)

Setting any block works (set z_l[0] 1e-3, …). Iterate edits rebuild the iterate with a fresh change-tag, so the cached derived quantities (curr_f, slacks, σ, …) invalidate correctly and the next step is computed from the new point — exactly as if the line search had produced it.

Staging a solver option (validated against the registry):

set opt mu_strategy adaptive
set opt linear_solver ma57

Staged options are not applied to the strategies already built for the running solve (they don’t re-read options mid-iteration). They take effect on a resolve or the next solve.

The read-side counterpart is get opt <name>, which reports an option’s current (or staged) value and its registry metadata — so you can confirm what a set opt actually staged before you resolve:

get opt mu_strategy        # → mu_strategy = adaptive  (staged)

Discovering options

opt                  # list every registered option
opt mu               # filter by name/category substring
complete pri         # completion candidates for a prefix

opt <exact-name> also prints the long description. In the REPL, Tab completes command verbs, block names, metric names (after break if), checkpoint names (after stop-at), event names (after break on), option names (after set opt / opt), and filesystem paths (after load / sweep / save / source — directories get a trailing /). The same contexts are available programmatically via the complete <prefix…> command (JSON complete), so an agent or GUI can offer the same completions.

Time travel

Rewind (`goto` / `restart`)

The debugger snapshots the primal-dual state (x, s, multipliers, μ, τ) every iteration. goto rewinds to a captured iteration and stays paused so you can re-tune before resuming:

goto 3               # rewind to the start of iteration 3
restart              # rewind to the earliest snapshot

Caveat — this is a soft rewind. Only the primal-dual state is restored; strategy history (the filter, the adaptive-μ oracle, the quasi-Newton memory) is not rolled back. So continuing from a rewound point is “resume from here,” not a bit-exact replay of the original run.

Re-solve from a saved point

resolve re-runs the solve from the current x with any set opt edits applied — a primal warm start with new options. Use it for “what if I change mu_strategy from here?”:

pounce-dbg> set opt mu_strategy adaptive
pounce-dbg> resolve
re-solving from current x with 1 staged option override(s)…
── pounce-dbg ── iter 0 @iter_start ...   # fresh solve, seeded from the captured x

Because each solve rebuilds its strategies from the options, the changes do take effect on the re-solve. The seed is dropped (falling back to the problem’s own start) if presolve / fixed-variable elimination changed the coordinate count.

Saving and visualizing artifacts

save                 # write the current iterate + residuals to a temp JSON
save /tmp/iter3.json # explicit path
viz x                # write a block and open it in an external viewer
viz dx               # a search-direction block
viz kkt              # the KKT inertia/regularization report

save writes every non-empty block, the search-direction blocks, and the residual scalars (iter, mu, objective, inf_pr, inf_du, nlp_error) — a self-contained artifact for external analysis.

`load` — the inverse of `save`

Typing a start point by hand is fine for a 2-variable toy and miserable for anything real. load reads a block straight into the live iterate, so you generate the point once (a prior solve, a surrogate, a sampler) and pull it in:

load /tmp/it0.json       # a `save` artifact: every block it contains is loaded
load start.csv           # a plain numeric file → x (comma/space/newline sep)
load start.csv s         # … into a named block instead of x

Two input shapes are accepted:

A save artifact (JSON). Blocks are read from the top level or from an iterate object; every block present (x, s, multipliers, …) is written, each validated against the current dimension. So save→load round-trips a full point, and you can lift just the part that fits if dimensions changed.
A plain numeric file — values separated by commas, whitespace, or newlines — written into the named block (default x). This is the many-variable escape hatch: numpy.savetxt("start.csv", x0) then load start.csv.

A loaded x becomes the seed for the next step (or for resolve — a warm start from an externally-computed point with no typing).
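
If NumPy is not at hand, a plain numeric start file can be written with the standard library alone. The sketch below also includes a tokenizer that mirrors the documented separators (commas, whitespace, newlines). It is an illustration of the file format, not the debugger’s parser, and the helper names are hypothetical.

```python
import os
import re
import tempfile

def write_start(path, values):
    """Write one start point as a single comma-separated line."""
    with open(path, "w") as fh:
        fh.write(",".join(repr(float(v)) for v in values) + "\n")

def read_numeric(text):
    """Split on commas and/or whitespace (newlines included)."""
    return [float(tok) for tok in re.split(r"[,\s]+", text.strip()) if tok]

x0 = [1.0, 2.5, -3.0e-2]
path = os.path.join(tempfile.gettempdir(), "start.csv")
write_start(path, x0)

with open(path) as fh:
    roundtrip = read_numeric(fh.read())
print(roundtrip == x0)
```

The resulting file is what load start.csv expects for the default x block.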

Interactive figures (`pounce-dbg-viz`)

viz writes a JSON artifact and hands it to a viewer. The Python package ships an interactive Plotly viewer that renders these properly — a spy/heatmap for viz kkt (the augmented matrix, colored by value, with the inertia/regularization in the title) and viz L (the LDLᵀ factor), and a bar chart for vector blocks (viz x, viz dx):

pip install 'pounce-solver[viz]'    # installs the `pounce-dbg-viz` script

When pounce-dbg-viz is on PATH, viz uses it automatically (opening an interactive figure in your browser). The launch order is:

$POUNCE_DBG_VIEWER — a command template ({} ← the artifact path), if set;
pounce-dbg-viz — the bundled Plotly viewer, if installed;
the OS opener (xdg-open / open) on the raw JSON.

So export POUNCE_DBG_VIEWER='python my_plot.py {}' overrides with your own plotter, and with nothing set + the viz extra installed it just works. The same pounce-dbg-viz <file.json> also renders a save artifact (the full iterate).
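
The launch order can be pictured as the small resolver below. It is a sketch of the documented fallback chain only: the function name is hypothetical, and the choice of open on macOS versus xdg-open elsewhere is an assumption about how the OS opener is picked.

```python
import shlex
import shutil
import sys

def viewer_command(artifact, env, which=shutil.which, platform=sys.platform):
    """Return the argv that would open `artifact`, in the documented order."""
    template = env.get("POUNCE_DBG_VIEWER")
    if template:
        # 1. User template: {} is replaced by the artifact path.
        return [part.replace("{}", artifact) for part in shlex.split(template)]
    if which("pounce-dbg-viz"):
        # 2. Bundled Plotly viewer, if it is on PATH.
        return ["pounce-dbg-viz", artifact]
    # 3. OS opener on the raw JSON (ASSUMPTION: chosen by platform).
    opener = "open" if platform == "darwin" else "xdg-open"
    return [opener, artifact]

print(viewer_command("/tmp/kkt.json", {"POUNCE_DBG_VIEWER": "python my_plot.py {}"}))
```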

Multi-start and initialization sensitivity

Interior-point methods find a local solution, and which one depends on where you start. Two commands turn the debugger into an initialization-sensitivity probe: they run many full solves — each from a different start — and tabulate where each one ends up. Both build on the same re-solve machinery as resolve (so they need the restart cell the CLI wires by default; they error in contexts without it), and both leave you at a normal prompt on the final solve afterward.

`sweep <file>` — explicit starts

Run one solve per start point listed in a file (one start per line, comma/whitespace-separated; # and // comments and blank lines skipped):

pounce-dbg> sweep starts.txt
   sweep 1/4: Success                iters=21   obj=3.743990e-21 inf_pr=0.00e0
   sweep 2/4: Success                iters=15   obj=1.233088e-28 inf_pr=0.00e0
   sweep 3/4: Success                iters=14   obj=1.328861e-28 inf_pr=0.00e0
   sweep 4/4: Success                iters=29   obj=2.982346e-18 inf_pr=0.00e0

── sweep complete ── 4 solves, 4 succeeded, 1 distinct minima
     #  status                 iters       objective     inf_pr
     0  Success                   21    3.743990e-21     0.00e0
     1  Success                   15    1.233088e-28     0.00e0
     2  Success                   14    1.328861e-28     0.00e0
     3  Success                   29    2.982346e-18     0.00e0
   best: solve #1  obj=1.233088e-28

Each start must have the same length as x (mismatches are reported with the line number). The summary clusters successful objectives to a relative 1e-6 to count distinct minima and flags the best (lowest-objective) solve. This is the “is this solve fragile to its start, and to which basins does it fall?” diagnostic — and unlike a black-box global search it leaves every solve’s trajectory observable: set a break on resto_entered or a stop-at kkt first and the sweep will pause inside whichever solve trips it.
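
One way to picture the distinct-minima count is a greedy clustering of the objective values. The sketch below is an illustration, not the debugger’s code. In particular, the max(1, ...) floor in the comparison is an assumption, added so that objectives near zero fall into a single cluster; the source states only the relative 1e-6.

```python
def count_distinct_minima(objectives, rel=1e-6):
    """Greedy clustering of sorted objective values.

    ASSUMPTION: two values match when they differ by at most
    rel * max(1, |a|, |b|). The floor of 1 is a guess, not documented.
    """
    representatives = []
    for f in sorted(objectives):
        for rep in representatives:
            if abs(f - rep) <= rel * max(1.0, abs(f), abs(rep)):
                break  # f joins an existing cluster
        else:
            representatives.append(f)  # f starts a new cluster
    return len(representatives)

# Hypothetical objectives: two near zero, two near 4, one at -1.5.
objs = [1.0e-20, 4.0000001, 4.0, 2.0e-18, -1.5]
best_index = min(range(len(objs)), key=objs.__getitem__)
print(count_distinct_minima(objs), "distinct minima; best is solve", best_index)
```

Under the same assumption, four objectives that are all within round-off of zero count as one minimum.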

`multistart <N> [rel]` — sampled restarts

When you don’t have a file of starts, multistart generates N of them:

pounce-dbg> multistart 8          # 8 starts
pounce-dbg> multistart 8 0.3      # wider jitter on any unbounded vars

Each variable that has a finite box [x_Lᵢ, x_Uᵢ] is sampled uniformly inside it — a genuine box multistart. Variables that are unbounded on either side fall back to a relative jitter ±rel·(|xᵢ|+1) around the current point (rel default 0.1, with a floor so components at zero still move). The command reports the split, e.g. multistart 8 (box 5/7 vars; 2 unbounded → jitter rel=0.1).

Start 0 is always the unperturbed current x (so the run includes where you already are), and the sampler is a fixed-seed PRNG, so a multistart run reproduces exactly.

The bounds are the ones the algorithm sees (full-length, post-scaling, after any bound_relax_factor), so every sampled start is a valid seed. For a problem with no finite variable bounds (an unconstrained NLP, for example), multistart degrades to jitter around x; sweep an external sample if you want a specific spread there.
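
The sampling rule can be sketched as follows. This is an illustration of the documented behavior (start 0 unperturbed, uniform in finite boxes, relative jitter elsewhere, fixed seed), not the actual sampler: the function name and seed value are assumptions, the floor for components at zero is not modeled separately, and the jitter is not clipped against a one-sided bound.

```python
import math
import random

def multistart_points(x, lower, upper, n, rel=0.1, seed=0):
    """Generate n starts: start 0 is x itself, the rest are sampled."""
    rng = random.Random(seed)  # fixed seed, so a run reproduces exactly
    starts = [list(x)]
    for _ in range(n - 1):
        point = []
        for xi, lo, hi in zip(x, lower, upper):
            if math.isfinite(lo) and math.isfinite(hi):
                point.append(rng.uniform(lo, hi))        # box sample
            else:
                radius = rel * (abs(xi) + 1.0)           # relative jitter
                point.append(xi + rng.uniform(-radius, radius))
        starts.append(point)
    return starts

inf = math.inf
x = [0.5, 0.0, 2.0]
lower = [0.0, -inf, 1.0]
upper = [1.0, inf, inf]
starts = multistart_points(x, lower, upper, n=8)
n_box = sum(math.isfinite(lo) and math.isfinite(hi) for lo, hi in zip(lower, upper))
print(f"multistart {len(starts)} (box {n_box}/{len(x)} vars)")
```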

Driving a sweep from a file with `load`

The pieces compose. To seed a sweep from points computed elsewhere, write them with numpy.savetxt and sweep the file directly — or, for a single externally-computed warm start, load it and resolve:

import numpy as np
# `sampler` stands in for your own generator; it must return an array of shape (32, len(x)).
np.savetxt("starts.txt", sampler(n=32), delimiter=",")   # 32 starts, one per row

pounce-dbg> sweep starts.txt

sweep vs. `find_minima`

sweep/multistart are diagnostics: they show you how a handful of starts behave, with full visibility into each solve’s path. For an automated global search — Sobol sampling, deduplication, minimum certification (PSD Hessian), redundant-descent avoidance — reach for the Python pounce.find_minima, whose multistart and mlsl methods are the production tools. Rule of thumb: debugger sweep when you’re asking why a solve is start-sensitive; find_minima when you want the minima themselves.

Ask an LLM about the state

ask [question] packages the current paused state — checkpoint, residuals, step lengths, dimensions, and the KKT inertia/regularization — into a prompt and runs it through an LLM CLI (by default Claude Code, claude -p headless print mode), printing the reply inline. It’s AI-assisted debugging without leaving the loop:

pounce-dbg> stop-at kkt
pounce-dbg> continue
pounce-dbg> ask why is the dual infeasibility stalling?
# → the model's analysis of the state + suggested options to try

With no question it defaults to “explain the current state and suggest what to try next.”

Choosing the LLM

Set $POUNCE_DBG_LLM to pick the backend. It accepts either a bare provider keyword — which expands to that CLI’s correct non-interactive invocation — or a full command template:

export POUNCE_DBG_LLM=claude     # Claude Code      → claude -p   (default)
export POUNCE_DBG_LLM=codex      # OpenAI Codex CLI → codex exec <prompt>
export POUNCE_DBG_LLM=gemini     # Google Gemini    → gemini -p <prompt>
export POUNCE_DBG_LLM=llm        # simonw's llm     → llm <prompt>
# …or a full template:
export POUNCE_DBG_LLM='llm -m claude-opus'  # any prompt-on-stdin CLI
export POUNCE_DBG_LLM='mytool --ask {}'     # prompt substituted into {}

For a template, the prompt is fed on the tool’s stdin unless it contains a {} placeholder, in which case it is substituted as an argument. A bare word that isn’t a known provider is treated as a program name with the prompt on stdin.
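
The resolution rules can be summarized by the sketch below, which turns a $POUNCE_DBG_LLM value and a prompt into an argument vector plus optional stdin text. It is an illustration under assumptions: the function and table names are hypothetical, and routing the prompt to claude -p over stdin is a guess, since the table above shows no prompt argument for that keyword.

```python
import shlex

# Keyword -> (argv prefix, is the prompt passed as a trailing argument?)
PROVIDERS = {
    "claude": (["claude", "-p"], False),  # ASSUMPTION: prompt goes on stdin
    "codex":  (["codex", "exec"], True),
    "gemini": (["gemini", "-p"], True),
    "llm":    (["llm"], True),
}

def build_invocation(setting, prompt):
    """Return (argv, stdin_text) for a $POUNCE_DBG_LLM value."""
    setting = (setting or "claude").strip()
    if setting in PROVIDERS:
        argv, as_arg = PROVIDERS[setting]
        return (argv + [prompt], None) if as_arg else (list(argv), prompt)
    parts = shlex.split(setting)
    if any("{}" in part for part in parts):
        # Template with a placeholder: substitute the prompt as an argument.
        return [part.replace("{}", prompt) for part in parts], None
    # Any other template, or an unknown bare word: prompt on stdin.
    return parts, prompt

print(build_invocation("mytool --ask {}", "why is inf_du stalling?"))
```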

Graceful when the CLI is absent. If the selected tool isn’t installed or on PATH, ask returns an error naming the tool and listing the provider keywords — the rest of the debugger (and the solve) is unaffected. ask is the only command that shells out; nothing else depends on an LLM being present.

In JSON mode the reply comes back in the result event’s data.reply.

Attaching to a run

You don’t have to single-step from iteration 0.

Drop in on failure — --debug-on-error runs the solve freely and pauses at the terminated checkpoint only if the solve did not succeed, leaving you at the failing iterate for a post-mortem. (Plain --debug also pauses at terminated for a final-point inspect.)
Attach with Ctrl-C — --debug-on-interrupt runs normally but installs a SIGINT handler; a first Ctrl-C drops you in at the next iteration (reason: "interrupt (Ctrl-C)"), a second Ctrl-C aborts. Ctrl-C also breaks into any other debug mode mid-continue.

Ctrl-C at the prompt. At a rustyline prompt Ctrl-C arrives as input, not a signal, so it has its own analogous double-tap: the first Ctrl-C cancels the current input line (readline convention), a second in a row stops the solve (a clean UserRequestedStop, same as quit). So whether you are running or sitting at the prompt, two Ctrl-Cs always get you out; quit/q and Ctrl-D (EOF, which detaches and finishes) remain the explicit exits.

Scripting

Run a sequence of debugger commands from a file — one per line, # and // comments and blank lines skipped:

# warmup.pdbg
break if inf_pr<1e-6
watch mu
stop-at after_search_dir
continue

pounce problem.nl --debug-script warmup.pdbg        # run at the first pause

pounce-dbg> source warmup.pdbg                       # or interactively

A script runs top-to-bottom and stops early if a command resumes or stops the solve (so ending with continue hands control back at the first breakpoint). --debug-script implies --debug when no --debug* mode is given, and runs once at the first pause (not on a resolve).

Example: a scripted initialization-sensitivity run

Because load, sweep, and set opt are ordinary commands, a whole diagnostic fits in a script file. This one watches each solve’s path and sweeps a set of externally-generated starts:

# sensitivity.pdbg — generate starts.txt first (e.g. numpy.savetxt)
break on resto_entered      # surface any start that falls into restoration
sweep starts.txt            # one solve per row; tabulated at the end

pounce model.nl --debug-script sensitivity.pdbg

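Or compare a baseline against a what-if on the same starts by staging an option before the multistart. The sampler is fixed-seed, so running the script once without the set opt line (the baseline) and once with it draws identical starts: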

# adaptive-vs-monotone.pdbg
set opt mu_strategy adaptive
multistart 16 0.2           # 16 sampled restarts, all under adaptive μ

Example: drive a multistart from a program (JSON protocol)

For many variables and many starts, hold the x0s as arrays in a driver program and let it assemble the commands — no point is ever typed. The --debug-json protocol emits a sweep_result per solve and a final sweep_summary:

import subprocess, json, numpy as np

p = subprocess.Popen(["pounce", "big.nl", "--debug-json"],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
def send(cmd, **kw):
    p.stdin.write(json.dumps({"cmd": cmd, **kw}) + "\n")
    p.stdin.flush()

def recv():
    return json.loads(p.stdout.readline())
recv()                                   # hello
recv()                                   # initial pause

# Option A — let the debugger sample: N restarts (uniform in finite boxes).
send("multistart", args=["32", "0.25"])

# Option B — supply your own starts via a file and sweep it:
# np.savetxt("starts.txt", my_sampler(n=32), delimiter=",")
# send("sweep", args=["starts.txt"])

results = []
for line in p.stdout:
    ev = json.loads(line)
    if ev.get("event") == "sweep_result":
        results.append((ev["status"], ev["objective"]))
    elif ev.get("event") == "sweep_summary":
        print(f"{ev['succeeded']}/{ev['solves']} ok, "
              f"{ev['distinct_minima']} distinct minima, "
              f"best obj {ev['best_objective']:.6e}")
        break

Each sweep_result carries index, status, iters, objective, inf_pr, and the seed it started from; the sweep_summary adds distinct_minima, best_index, and best_objective. A client can feature-detect support via hello.capabilities.sweep.

Exit model

Path	Result
`quit`	stops now → `UserRequestedStop`
Ctrl-C ×2 at the prompt	cancel line, then stop → `UserRequestedStop`
Ctrl-C ×2 mid-`continue`	break in, then abort (exit 130)
`continue` / `detach`	run to natural completion
stdin EOF, REPL (Ctrl-D)	detach and finish (pdb convention)
stdin EOF, JSON (pipe closed)	abort — the controlling client is gone
external SIGKILL	process dies (no `terminated` event)

Every non-kill path ends with a terminated event in JSON mode.

Command reference

Command (aliases)	Summary
`help` (`h`, `?`)	list commands
`info` (`i`)	current-iterate summary
`print <what>` (`p`)	block, `d`-block, scalar, `kkt`, `rank`, or `residuals`
`print equation <name|row>`	source algebra of a constraint, by model name or `.nl` row
`step` (`s`, `n`)	run to next `iter_start`
`step sub` / `stepi` (`si`)	run to next checkpoint of any kind
`continue` (`c`)	run to next breakpoint
`run N` (`r`)	run until iteration N
`break …` (`b`)	iteration / `if` / `on` breakpoints; list; `clear`; `del N`
`stop-at <cp>`	always pause at a checkpoint
`set mu/x/<block>/opt …`	mutate μ, the iterate, or stage an option
`get opt <name>` (`get <name>`)	report an option’s current/staged value, source, and default
`opt [filter]`	list/search registered options
`complete <prefix>`	completion candidates
`viz <target>`	open an artifact in a viewer
`save [path]`	dump the iterate to JSON
`load <file> [block]`	read a block (default `x`) from a save artifact / numeric file
`sweep <file>`	one solve per start in `<file>`; tabulate outcomes
`multistart <N> [rel]`	`N` restarts (uniform in each finite box; jitter elsewhere); tabulate
`watch <target>` (`display`)	auto-print a target at every pause
`tbreak N` (`tb`)	one-shot iteration breakpoint
`commands N <c>;<c>…`	auto-run commands when iteration N’s breakpoint hits (`commands N clear` removes)
`watchpoint <blk>[<i>] [τ]` (`wp`)	pause when a value changes by > τ
`diff`	what changed in the iterate since the last iteration
`diagnose` (`diag`)	live health report: named culprit residuals, KKT inertia, stalls
`source <file>`	run debugger commands from a file
`goto N` / `restart`	soft-rewind to a captured iteration
`resolve`	re-solve from current x with staged options
`ask [question]`	ask an LLM about the state (default Claude Code; `$POUNCE_DBG_LLM`=`claude`/`codex`/`gemini`/`llm` or a template)
`progress [on/off]`	toggle JSON progress events
`detach`	stop pausing; run to completion
`quit` (`q`, `exit`)	stop the solve

The JSON protocol

--debug-json makes stdout a pure stream of newline-delimited JSON objects (the banner, problem stats, and final summary are routed to stderr, and print_level is forced to 0). A program reads one JSON object per line.

For an LLM agent: the whole contract

You do not need this page to drive the debugger — the protocol is self-describing. The contract is five lines:

Launch pounce <model> --debug-json (or --problem <name>), with the child’s stdin and stdout piped.
Read the first line — hello. It enumerates everything you can do: commands (the verbs), events (breakpoint triggers), checkpoints (where you can pause), metrics (the scalar field names), blocks (the inspectable vectors), and a capabilities map. Feature-detect off these lists, never off the version string.
Send commands, one JSON object (or bare string) per line, e.g. {"cmd":"break if inf_pr<1e-6","id":1} then {"cmd":"continue","id":2}. Set id to correlate the matching result.
Read events until you see the one you want. Every pause / progress / terminated event carries the same scalar metric fields, under the exact names listed in hello.metrics (objective, mu, inf_pr, inf_du, nlp_error, complementarity, iter) — so you can index them directly.
Finish with {"cmd":"continue"} to run to completion (then read terminated), or {"cmd":"quit"} to stop early.

A complete minimal transcript (→ sent, ← received), eliding long lines:

←  {"event":"hello","protocol":"pounce-dbg/1","commands":[…],"metrics":[…],…}
←  {"event":"pause","checkpoint":"iter_start","iter":0,"objective":24.2,…}
→  {"cmd":"break if inf_du<1e-6","id":1}
←  {"event":"result","request_id":1,"command":"break","ok":true,…}
→  {"cmd":"continue","id":2}
←  {"event":"progress","iter":1,"objective":4.7,"inf_du":2.1e1,…}
   … more progress events …
←  {"event":"pause","checkpoint":"iter_start","iter":21,"inf_du":8.7e-7,"reason":"inf_du<1e-6"}
→  {"cmd":"continue","id":3}
←  {"event":"terminated","status":"SolveSucceeded","iterations":21,…}

If you are wired in through the pounce-studio MCP server, you don’t even spawn the CLI yourself: call debug_start to open a live session and debug_command to step it (debug_state / debug_sessions / debug_close round it out) — the server owns the child process and the framing, and debug_start hands you the same hello handshake. Call debug_session_guide for the contract and a launch snippet if you’d rather drive --debug-json directly. The MCP analysis tools (diagnose, find_stalls, …) are post-mortem over a finished report; the debug_* tools and --debug-json are the live loop.

Session lifecycle

hello — emitted once, up front. The handshake.
pause — at each stop.
result — one per command, echoing the client’s request_id.
progress — one per iteration while running between pauses.
sweep_result / sweep_summary — during a sweep/multistart: one sweep_result per completed solve, then a sweep_summary at the end.
terminated — once, after the solve.

Commands

Write one per line to stdin, either a bare string or an object:

{"cmd": "print", "args": ["x"], "id": 7}
{"cmd": "break if inf_pr<1e-6", "id": 8}
"continue"

id (any JSON value) is echoed back as request_id on the matching result, for async correlation.
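
A client can frame commands and pair results with their ids without touching a live process. The sketch below runs against a canned list of event lines standing in for the debugger’s stdout. The helper names are hypothetical, and it assumes ids are scalars (numbers or strings) so they can be kept in a set.

```python
import json

def encode(cmd, args=None, request_id=None):
    """Frame one command as a single newline-terminated JSON line."""
    msg = {"cmd": cmd}
    if args is not None:
        msg["args"] = list(args)
    if request_id is not None:
        msg["id"] = request_id
    return json.dumps(msg) + "\n"

def match_results(lines, pending):
    """Pair result events with the ids in `pending`; return (matched, others)."""
    matched, others = {}, []
    for line in lines:
        ev = json.loads(line)
        rid = ev.get("request_id")
        if ev.get("event") == "result" and rid in pending:
            matched[rid] = ev
        else:
            others.append(ev)
    return matched, others

# Canned stream standing in for the debugger's stdout.
stream = [
    '{"event":"result","request_id":7,"command":"print x","ok":true}',
    '{"event":"progress","iter":1,"objective":4.7}',
    '{"event":"result","request_id":8,"command":"break","ok":true}',
]
matched, others = match_results(stream, pending={7, 8})
print(sorted(matched), [ev["event"] for ev in others])
```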

`hello`

{"event":"hello","protocol":"pounce-dbg/1","pounce_version":"0.4.0",
 "capabilities":{"inspect":true,"mutate_iterate":true,"mutate_mu":true,
   "conditional_breakpoints":"compound","request_ids":true,
   "viz":["block","delta","kkt","L"],"save":true,"load":true,"sweep":true,
   "kkt_inspect":true,"diagnose":true,"llm_assist":true,
   "pause_command":true,"equations":false,"structural_diagnose":false,
   "rewind":"primal_dual","resolve":true,"terminal_checkpoint":true,
   "interruptible":true,"progress_events":true,"async_pause":"checkpoint"},
 "checkpoints":["iter_start","after_mu","after_search_dir","after_step",
                "step_rejected","pre_restoration_entry",
                "post_restoration_exit","terminated"],
 "events":["resto_entered","resto_exited","regularized","tiny_step",
           "ls_rejected","mu_stalled","nan"],
 "commands":[…],"blocks":[…],"metrics":[…]}

A client should feature-detect off capabilities / checkpoints / events rather than the protocol string — those lists are additive as the debugger grows. A few capabilities are model-conditional: equations and structural_diagnose are true only when the solve came from an .nl file (which carries the source algebra and structural metadata) and false for a built-in problem, as shown above.
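
Feature detection then reduces to a few dictionary lookups. The sketch below uses a trimmed copy of the handshake shown above; the helper names are hypothetical. It treats any truthy capability value (true, a mode string such as "primal_dual", or a non-empty list) as present.

```python
def supports(hello, capability):
    """Truthy capability values count as supported; absent or false do not."""
    return bool(hello.get("capabilities", {}).get(capability))

def can_pause_at(hello, checkpoint):
    return checkpoint in hello.get("checkpoints", [])

def can_break_on(hello, event):
    return event in hello.get("events", [])

# Trimmed copy of the documented handshake (built-in problem, so no equations).
hello = {
    "event": "hello",
    "protocol": "pounce-dbg/1",
    "capabilities": {"sweep": True, "rewind": "primal_dual",
                     "equations": False, "viz": ["block", "delta", "kkt", "L"]},
    "checkpoints": ["iter_start", "after_search_dir", "terminated"],
    "events": ["resto_entered", "regularized"],
}

if supports(hello, "sweep") and can_break_on(hello, "resto_entered"):
    print("sweep with a resto_entered breakpoint is available")
if not supports(hello, "equations"):
    print("no source algebra: skip print equation")
```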

`pause`

{"event":"pause","checkpoint":"iter_start","status":null,
 "iter":3,"mu":2.0e-2,"objective":5.05,"inf_pr":0.0,"inf_du":2.7e-14,
 "nlp_error":0.0237,"complementarity":1.9e-2,"dims":{"x":2,"s":0,"y_c":0,
 "y_d":0,"z_l":2,"z_u":2,"v_l":0,"v_u":0},"breakpoints":[],"conditions":[],
 "reason":"mu<0.05"}

status is non-null only at the terminated checkpoint. reason carries the firing breakpoint / condition / event / interrupt.

`result`

{"event":"result","request_id":7,"command":"print x","ok":true,
 "output":["x = [-1.18e0, 1.38e0]"],"data":{"name":"x","values":[-1.18,1.38]}}

output is human-readable lines; data is the structured payload (present for inspection commands).

`progress`

{"event":"progress","iter":42,"mu":1.0e-5,"inf_pr":3.2e-7,"inf_du":1.1e-6,
 "objective":12.34,"nlp_error":1.1e-6,"complementarity":9.0e-6}

Emitted once per outer iteration during a continue, so a UI can show live progress instead of a hang. Carries the same scalar fields, under the same names, as pause — so hello.metrics names index directly off either event. Default on; toggle with the progress command.
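
Because pause and progress share field names, one extractor can serve both. In the sketch below the metric list is written out by hand from the names given earlier; a real client would take it from hello.metrics. The helper name is hypothetical.

```python
import json

# In a live session this list comes from hello["metrics"].
METRICS = ["objective", "mu", "inf_pr", "inf_du", "nlp_error", "complementarity", "iter"]

def metrics_of(event, names=METRICS):
    """Pick the scalar metric fields out of a pause or progress event."""
    return {name: event[name] for name in names if name in event}

progress = json.loads(
    '{"event":"progress","iter":42,"mu":1.0e-5,"inf_pr":3.2e-7,"inf_du":1.1e-6,'
    '"objective":12.34,"nlp_error":1.1e-6,"complementarity":9.0e-6}'
)
row = metrics_of(progress)
print(f"iter {row['iter']}: obj={row['objective']} inf_pr={row['inf_pr']:.1e}")
```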

`terminated`

{"event":"terminated","status":"SolveSucceeded",
 "status_message":"Optimal Solution Found.","iterations":6,
 "objective":4.9999999,"evals":{"obj":7,"obj_grad":7,"constr":1,
 "constr_jac":12,"hess":6}}

Async pause

A running continue can be interrupted two ways, both pausing at the next checkpoint with a reason:

SIGINT — process.kill(pid, "SIGINT") (or Ctrl-C). This is what a Debug Adapter’s pause button maps to. Reason: "interrupt (Ctrl-C)".
In-band command — send {"cmd":"pause"} on stdin while the solve is running (JSON mode). No signals, so it works on Windows. Reason: "pause (requested)".

hello.capabilities.async_pause is "checkpoint", and pause_command is true.

Tutorials

1. Why did this problem go to restoration?

$ pounce hard.nl --debug-json
{"cmd":"break on resto_entered"}
{"cmd":"continue"}
# → pause at checkpoint "pre_restoration_entry", reason "event: resto_entered"
{"cmd":"info"}                # how infeasible is the iterate?
{"cmd":"print kkt"}           # was the KKT singular / heavily regularized?
{"cmd":"print x"}

2. Catch a step that gets regularized

break on regularized
continue
# → pause at after_search_dir when delta_w > 0
print kkt        # inertia n- vs expected; delta_w / delta_c
print dx         # the (stabilized) Newton step

3. What-if: try a different μ strategy from here

break 5
continue                 # stop at iteration 5
set opt mu_strategy adaptive
resolve                  # re-solve from the iter-5 point with adaptive μ

4. Post-mortem on a failure

pounce maybe-infeasible.nl --debug-on-error

Runs unattended; if the solve returns anything but success you land at the final iterate:

── pounce-dbg ── TERMINATED (LocalInfeasibility)  iter 11  obj=1.13e0  inf_pr=5.0e-1  inf_du=1.2e-8
pounce-dbg> print x
pounce-dbg> print kkt

5. Drive it from a program / agent

import subprocess, json
p = subprocess.Popen(["pounce", "hs071.nl", "--debug-json"],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

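def send(cmd, **kw):
    # one JSON object per line; flush so the debugger sees it immediately
    p.stdin.write(json.dumps({"cmd": cmd, **kw}) + "\n")
    p.stdin.flush()

def recv():
    # block until the next event line arrives
    return json.loads(p.stdout.readline())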

hello = recv()                       # capabilities / vocabulary
print(recv())                        # initial pause
send("break if inf_du<1e-6", id=1)
print(recv())                        # result, request_id=1
send("continue")
for line in p.stdout:                # progress … pause … terminated
    ev = json.loads(line)
    if ev["event"] == "terminated": break

6. Is this solve sensitive to its start?

break on resto_entered       # flag any start that falls into restoration
multistart 16                # 16 restarts (uniform in each finite box)
# → per-solve lines, then a table: succeeded / distinct minima / best

Swap multistart 16 for sweep starts.txt to run your own start points (numpy.savetxt("starts.txt", X0, delimiter=",")). See Multi-start and initialization sensitivity.
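If NumPy is not at hand, a start file can be produced with the standard library. The sketch below assumes, as the savetxt call above suggests, that sweep reads one comma-separated start point per row; the helper names and the example bounds are made up for illustration.

```python
import csv
import os
import random
import tempfile

def write_uniform_starts(path, bounds, count, seed=0):
    """Write `count` start points, one comma-separated row each.

    `bounds` is a list of (lo, hi) pairs; every pair must be finite.
    """
    rng = random.Random(seed)
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh, lineterminator="\n")
        for _ in range(count):
            writer.writerow([repr(rng.uniform(lo, hi)) for lo, hi in bounds])

def read_starts(path):
    """Read the file back as a list of float rows."""
    with open(path, newline="") as fh:
        return [[float(v) for v in row] for row in csv.reader(fh) if row]

bounds = [(1.0, 5.0), (1.0, 5.0), (-2.0, 2.0)]   # hypothetical variable boxes
path = os.path.join(tempfile.mkdtemp(), "starts.txt")
write_uniform_starts(path, bounds, count=16)
starts = read_starts(path)
```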

Beyond the interior-point loop

Everything above is the NLP filter-IPM. The same debugger — same command engine, same REPL — drives the other solvers too.

Convex and conic solves

The convex LP/QP interior-point solver and the HSDE conic drivers (SOCP, the exponential / power cones, and small PSD cones) expose the same checkpoints and commands as the NLP loop. The iterate blocks follow the QP standard form — x (variables), s (cone slacks), y (equality multipliers), z (inequality / cone multipliers) — and the HSDE drivers additionally expose the homogenizing scalars tau / kappa as 1-element blocks (print tau). set <block> and goto work as on the NLP path; set mu is rejected, because the convex μ is derived from ⟨s, z⟩ (edit s/z to move it).

pounce model.nl --debug                 # LP / convex-QP (auto-routed) — IPM REPL
pounce_cblib model.cbf --debug          # SOCP / exp / power / PSD (conic) — IPM REPL
pounce_cblib model.cbf --debug-script s.pdbg

Capability matrix

The flow-control core (checkpoints, stepping, breakpoints, watchpoints, block/scalar inspection, diff, goto/restart, save, ask, and the JSON protocol) works identically on every backend. The table below lists only the commands whose availability is backend- or model-conditional; anything not listed is universal. A command that isn’t available on the current backend returns an explicit “not available for this solver” error (it never silently no-ops), and a JSON client should feature-detect off hello.capabilities rather than rely on this table.

Command	NLP filter-IPM	Convex / conic IPM	Notes
`print kkt`	✅	➖	convex IPM exposes no augmented-system inertia
`print rank`	✅	➖	SVD rank of the equality Jacobian — NLP only
`print residuals`	✅	➖	per-component primal/dual residuals — NLP only
`print active` / `inactive`	✅	➖	needs a bound-slack notion
`print equation <name|row>`	⚠️	⚠️	needs a source `.nl` model (`capabilities.equations`)
`viz kkt` / `viz L`	✅	➖	depends on a captured KKT matrix / factor
`diagnose`	✅	➖	live health report — NLP only
`resolve`	✅	➖	warm re-solve from the current iterate — NLP only
`sweep` / `multistart` / `load`	✅	➖	initialization-sensitivity tools — NLP only
`set opt <name> <val>`	✅	➖	staged option edits — NLP only
`set mu`	✅	❌	rejected on convex: μ is derived from `⟨s, z⟩` (edit `s`/`z`)
`set <block>` / `goto` / `restart`	✅	✅	snapshots are supported on both

✅ available · ⚠️ model-conditional · ➖ reports “not available for this solver” · ❌ explicitly rejected with an explanation

The streamed scalar metric vocabulary (iter, mu, objective, inf_pr, inf_du, nlp_error, complementarity) is the same on every backend — see hello.metrics. Each backend maps its native quantities onto these NLP-centric names; the convex IPM, for instance, reports nlp_error = max(pinf, dinf, μ). A backend that has no value for a metric reports it as JSON null (never a dropped field), and a test pins the emitted set to that single advertised vocabulary so it can’t drift.

A third backend — an interactive branch-and-bound tree debugger for a spatial global solver — is not part of this release.

Limitations

Soft rewind only. goto/restart restore the primal-dual state, not strategy history (see the caveat above).
set opt is staged, not hot-applied to a running solve; it takes effect on resolve / the next solve.

A. Lee, R. B. Parker, S. Poon, D. Gunter, A. W. Dowling, and B. Nicholson, “Model Diagnostics for Equation-Oriented Models: Roadblocks and the Path Forward,” Systems and Control Transactions 3:966–974 (2024). https://doi.org/10.69997/sct.147875

Pyomo

Because POUNCE speaks the AMPL NL/SOL protocol, it drops into Pyomo through the AMPL Solver Library interface — exactly how Pyomo drives Ipopt.

The pyomo-pounce package registers pounce as a Pyomo SolverFactory solver:

import pyomo_pounce  # registers 'pounce'
from pyomo.environ import ConcreteModel, Var, Objective, SolverFactory

model = ConcreteModel()
model.x = Var(bounds=(-10, 10))
model.obj = Objective(expr=(model.x - 3) ** 2)

solver = SolverFactory('pounce')
solver.solve(model)

Options pass through the usual Pyomo mechanism:

solver.solve(model, options={'tol': 1e-10, 'max_iter': 500})

Under the hood, Pyomo writes the model to an AMPL .nl file, invokes pounce problem.nl -AMPL, and reads the result back from the .sol file. See Running Solves for the -AMPL solver mode.

Preflight and initialization

A Var whose .value was never set is written as 0 into the .nl file, so an uninitialized model actually starts at the origin (see Initialization and Warm Starts). The package ships a preflight check plus an initialization pipeline for exactly this:

import pyomo_pounce

report = pyomo_pounce.preflight(model)   # what will POUNCE see at x0?
print(report)                            # unset vars, bound/constraint
if report.fatal:                         # violations, NaN/inf evaluations
    ...

# fill -> repair -> block-solve, with the decisions held constant:
rep = pyomo_pounce.initialize(model, decisions=[model.feed, model.reflux])
if not rep.block.square:
    print(rep)          # names of what you forgot to specify

preflight evaluates every active constraint and the objective at the current values with unset values treated as 0 (exactly what the NL writer sends), restores the model untouched, and reports what iteration 0 will see; report.fatal means the solve would abort with Invalid_Number_Detected.
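The idea can be illustrated without Pyomo. The sketch below is a conceptual stand-in, not the pyomo_pounce implementation: variables are a plain dict, constraints are callables, and `preflight_sketch` is a hypothetical name. It reads unset values as 0, evaluates everything, and leaves the input untouched.

```python
import math

def preflight_sketch(values, constraints):
    """Evaluate each constraint at `values`, with unset (None) entries read as 0.

    Returns (unset_names, bad_names): variables that had no value, and
    constraints whose evaluation raised or produced NaN/inf.
    """
    unset = sorted(name for name, v in values.items() if v is None)
    # Work on a copy so the caller's values are left exactly as they were.
    point = {name: (0.0 if v is None else v) for name, v in values.items()}
    bad = []
    for name, body in constraints.items():
        try:
            result = body(point)
        except (ArithmeticError, ValueError):
            bad.append(name)
            continue
        if not math.isfinite(result):
            bad.append(name)
    return unset, bad

values = {"x": 2.0, "y": None}               # y was never initialized
constraints = {
    "balance": lambda p: p["x"] + p["y"] - 3.0,
    "log_term": lambda p: math.log(p["y"]),  # log(0) is not evaluable
    "ratio": lambda p: p["x"] / p["y"],      # division by zero at y = 0
}
unset, bad = preflight_sketch(values, constraints)
```

Here both `log_term` and `ratio` are flagged, which corresponds to the situation the text describes as fatal: the solver would hit an invalid number at iteration 0.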

initialize follows the workflow you would run by hand on, say, a distillation column: set the decisions (feed, reflux, boilup), solve for a physical profile with them held constant, then let the optimizer move them. Its three stages are also available individually:

pyomo_pounce.initialize_missing_values(model)   # bounds-aware fill
                                                # (midpoint / one unit
                                                # inside / zero)

pyomo_pounce.project_to_feasible(model)         # min-norm repair: move the
                                                # current point onto the
                                                # model's own constraints
                                                # (one POUNCE solve)

rep = pyomo_pounce.block_initialize(                  # solve the equality
    model, decisions=[model.feed, model.reflux])      # system's square blocks
                                                      # in calculation order

initialize_missing_values fills each variable independently, so the fill can be internally inconsistent (mole fractions that do not sum to one); project_to_feasible repairs that by minimizing sum((v - v0)**2) subject to the model’s active constraints and bounds — the full nonlinear projection, solved with POUNCE, with the original objective restored afterwards.
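A worked miniature of the fill-then-repair sequence, under two stated assumptions: the fill rule below is one reading of "midpoint / one unit inside / zero", and the repair handles only a single linear constraint sum(v) = 1 with bounds ignored, where the minimum-norm projection has a closed form. The real project_to_feasible solves the full nonlinear projection with POUNCE; both function names here are hypothetical.

```python
import math

def fill_value(lo, hi):
    """Bounds-aware fill for one unset variable (one reading of the rule)."""
    lo_finite, hi_finite = math.isfinite(lo), math.isfinite(hi)
    if lo_finite and hi_finite:
        return 0.5 * (lo + hi)      # midpoint of a finite box
    if lo_finite:
        return lo + 1.0             # one unit inside the lower bound
    if hi_finite:
        return hi - 1.0             # one unit inside the upper bound
    return 0.0                      # free variable

def project_onto_sum(v0, total=1.0):
    """Minimize sum((v - v0)**2) subject to sum(v) == total (closed form)."""
    shift = (total - sum(v0)) / len(v0)
    return [v + shift for v in v0]

# Three mole fractions, each bounded to [0, 1] and never initialized.
filled = [fill_value(0.0, 1.0) for _ in range(3)]   # independent fill: sums to 1.5
repaired = project_onto_sum(filled)                 # shifted so the sum is 1
```

The equal shift is what the Euclidean projection onto a hyperplane with normal (1, 1, 1) produces; with nonlinear constraints or active bounds there is no such shortcut, which is why the package uses a solve.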

block_initialize is IDAES-flavored initialization without hand-written routines. decisions= holds the listed variables at their current values for the solve and releases them afterwards (each must have a value). The active equality constraints are decomposed (Dulmage-Mendelsohn, via pyomo.contrib.incidence_analysis); the square part is solved block by block in topological order by Pyomo’s solve_strongly_connected_components (1x1 blocks by Newton, larger blocks by POUNCE), filling Var.value along the way. When the system is not square, report.square is False and the offending variables and constraints are reported by name — underconstrained_variables is the list of things you forgot to specify or flag as decisions, overconstrained_constraints the redundant or conflicting specifications. Permanently-known inputs can simply be fix()ed instead of listed as decisions.

GAMS

POUNCE plugs into GAMS as an NLP solver, so a model can hand its problem to POUNCE with:

option nlp = pounce;
solve mymodel using nlp minimizing obj;

There are two ways to make POUNCE available to GAMS. Pick one:

Route	Install	What it is
pip (recommended)	`pip install 'pounce-solver[gams]'` then `pounce-gams register`	A pure-Python solver link built on GAMS’s own `gamsapi` package. No compiler, no `sudo`, survives GAMS upgrades.
native C link	build + `sudo make -C gams install`	A C shared library installed into the GAMS system directory. Adds active-set-SQP working-set / state-file warm starts. See `gams/README.md`.

Both register POUNCE under the same name (pounce) for NLP, DNLP, and RMINLP models — POUNCE is a continuous local NLP solver, so mixed-integer and conic model types are not offered here.

The pip route

1. Install

pip install 'pounce-solver[gams]'    # quoted so zsh does not glob the brackets

The [gams] extra pulls in gamsapi[core] — GAMS’s own expert-level GMO/GEV Python bindings — and PyYAML. The bindings dlopen the GAMS C libraries from your local install, so gamsapi must match your GAMS version. POUNCE itself redistributes nothing GAMS-owned. If your GAMS and gamsapi versions disagree, install the matching one from your GAMS system (GAMS ships a gamsapi wheel under apifiles/Python/), or:

pip install 'gamsapi[core]==<your GAMS X.Y.Z>'

2. Check the install

pounce-gams status

reports whether gamsapi imports, the config directory POUNCE will register into, and whether POUNCE is already registered:

gamsapi:       available
               gamsapi 53.2.0 importable
config dir:    /Users/you/Library/Preferences/GAMS
gamsconfig:    /Users/you/Library/Preferences/GAMS/gamsconfig.yaml (missing)
POUNCE solver: not registered

3. Register

pounce-gams register

This writes a tiny launcher script and a solverConfig entry into your GAMS per-user gamsconfig.yaml. It merges — any other solvers already in that file (CONOPT overrides, discopt, …) are preserved — and is idempotent (re-running just updates POUNCE in place). The per-user config directory GAMS searches is OS-specific:

OS	Directory
macOS	`~/Library/Preferences/GAMS`
Linux	`$XDG_CONFIG_HOME/GAMS` (else `~/.config/GAMS`)
Windows	`%LOCALAPPDATA%\GAMS` (else `…\Documents\GAMS`)

Override the target with --config-dir <path> (e.g. to register into the GAMS system directory instead). To undo, pounce-gams unregister.

No sudo is needed and nothing is written into the GAMS system directory, so a GAMS upgrade does not wipe the registration.

4. Solve

option nlp = pounce;
solve mymodel using nlp minimizing obj;

GAMS invokes the launcher with a control file; the launcher runs the Python link, which reads the model through GMO/GEV, solves it with POUNCE, and writes the primal/dual solution and GAMS model/solve status back.

Option files

If a model sets mymodel.optfile = 1, POUNCE reads pounce.opt (.op2, .op3, … for higher optfile values). Each line is a keyword value pair using POUNCE’s option names; lines starting with * or # are comments. The GAMS iterlim and reslim are honored as max_iter and max_wall_time.

* pounce.opt
tol        1e-10
max_iter   500
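For tooling that needs to read or lint such a file, the format described above is simple enough to parse in a few lines. This is a sketch of the stated rules only (keyword, whitespace, value; comment lines start with * or #), not POUNCE's own parser, and `parse_opt_file` is a hypothetical name.

```python
def parse_opt_file(text):
    """Parse `keyword value` lines; lines starting with * or # are comments."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "*#":
            continue                          # blank line or comment
        parts = line.split(None, 1)           # split on the first whitespace run
        if len(parts) != 2:
            raise ValueError(f"expected 'keyword value', got: {raw!r}")
        key, value = parts
        options[key] = value.strip()
    return options

sample = """\
* pounce.opt
tol        1e-10
max_iter   500
"""
opts = parse_opt_file(sample)
```

Values are kept as strings on purpose: the sketch does not know each option's type, so conversion is left to whoever consumes the dict.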

Machine-readable solve report

Set json_output in pounce.opt to also emit a structured pounce.solve-report/v1 JSON report (identical to the CLI’s --json-output, consumable by pounce-studio):

* json_detail is "summary" or "full"; default is "full"
json_output  my_solve.json
json_detail  full

See the JSON Solve Report schema for the format.

Notes & limitations

Version match. The single most common failure is a gamsapi ↔ GAMS version mismatch; pounce-gams status diagnoses it.
Warm starts. The active-set-SQP working-set / state-file warm-start features (algorithm active-set-sqp, sqp_state_file) are currently only in the native C link, where each solve reuses an in-process state. The pip link runs each solve as a fresh process; full warm-start parity is a planned follow-up.

Python API

POUNCE ships a Python wrapper that is intentionally cyipopt-compatible: code written for cyipopt typically runs against POUNCE by changing only the import.

Install

cd python
pip install maturin
maturin develop --release    # builds the native extension into your venv

Optional extras:

pip install -e '.[jax]'      # JAX integration
pip install -e '.[torch]'    # PyTorch integration
pip install -e '.[dev]'      # tests + jax + torch + scipy

cyipopt-style interface

import numpy as np
import pounce

class HS071:
    def objective(self, x):
        return x[0]*x[3]*(x[0]+x[1]+x[2]) + x[2]
    def gradient(self, x):
        return np.array([
            x[0]*x[3] + x[3]*(x[0]+x[1]+x[2]),
            x[0]*x[3],
            x[0]*x[3] + 1.0,
            x[0]*(x[0]+x[1]+x[2]),
        ])
    def constraints(self, x):
        return np.array([np.prod(x), np.dot(x, x)])
    def jacobianstructure(self):
        return (np.repeat([0, 1], 4), np.tile([0, 1, 2, 3], 2))
    def jacobian(self, x):
        return np.array([
            x[1]*x[2]*x[3], x[0]*x[2]*x[3], x[0]*x[1]*x[3], x[0]*x[1]*x[2],
            2*x[0], 2*x[1], 2*x[2], 2*x[3],
        ])

prob = pounce.Problem(
    n=4, m=2,
    problem_obj=HS071(),
    lb=[1]*4, ub=[5]*4,
    cl=[25, 40], cu=[2e19, 40],
)
prob.add_option('tol', 1e-8)
x, info = prob.solve(x0=np.array([1.0, 5.0, 5.0, 1.0]))
print(info['status_msg'], info['obj_val'], x)

Verifying convergence / trustworthy duals

info carries the final KKT residuals so a consumer can independently check how converged a returned point is — useful when the duals (info["mult_g"], info["mult_x_L"], info["mult_x_U"]) feed a downstream certificate (e.g. dual bound tightening). Two flavors:

final_kkt_error / final_dual_inf / final_constr_viol / final_compl — the residuals the convergence test saw, in the internally-scaled NLP space (the nlp_scaling_method factors).
final_unscaled_kkt_error / final_unscaled_dual_inf / final_unscaled_constr_viol / final_unscaled_compl — the same residuals with the scaling divided back out, i.e. in your original problem units. Equal to the scaled values when no scaling activates.

On ill-conditioned problems, scaling (nlp_scaling_method) can deflate the scaled residual enough that the default test reports Solve_Succeeded while the unscaled duals have drifted. The robust guard is to read the residual yourself rather than trust the status enum alone: info["status"] is a coarse signal, and some callers treat Solve_Succeeded (0) and Solved_To_Acceptable_Level (1) identically:

x, info = prob.solve(x0=...)
converged = info['final_unscaled_kkt_error'] <= 1e-6   # your own threshold

If you’re feeding the duals into a downstream certificate (e.g. a dual bound), prefer building a safe bound from the multipliers — valid for any dual-feasible point, so it doesn’t hinge on the solver’s exact convergence.

Two convenience knobs back this up:

Tighten the (unscaled) component tolerances — dual_inf_tol, constr_viol_tol, compl_inf_tol gate on the unscaled residuals, so the solver keeps iterating until it actually meets them (or exits non-success).
kkt_fidelity_tol (default 0 = off) — a defensive post-solve relabel: a Solve_Succeeded whose final_unscaled_kkt_error exceeds it is demoted to Solved_To_Acceptable_Level. Note this only helps a caller that distinguishes those two statuses; if yours doesn’t, gate on the residual directly as above.
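The two guards can be pictured on a plain dict. The sketch below uses the info keys and status codes named above; the `trusted` and `relabel` helpers and the numbers are hypothetical, and `relabel` only imitates the described behavior of kkt_fidelity_tol rather than reproducing the solver's code.

```python
SOLVE_SUCCEEDED = 0
SOLVED_TO_ACCEPTABLE_LEVEL = 1

def trusted(info, threshold=1e-6):
    """True only if the status is a success AND the unscaled KKT error is small."""
    return (info["status"] == SOLVE_SUCCEEDED
            and info["final_unscaled_kkt_error"] <= threshold)

def relabel(info, kkt_fidelity_tol=0.0):
    """Imitate the described relabel: demote a success whose residual is too large."""
    status = info["status"]
    if (kkt_fidelity_tol > 0.0 and status == SOLVE_SUCCEEDED
            and info["final_unscaled_kkt_error"] > kkt_fidelity_tol):
        status = SOLVED_TO_ACCEPTABLE_LEVEL
    return status

# Made-up numbers: the scaled residual passes, the unscaled one has drifted.
info = {"status": 0, "final_kkt_error": 3e-9, "final_unscaled_kkt_error": 4e-4}
print(trusted(info), relabel(info), relabel(info, kkt_fidelity_tol=1e-6))
```

Gating on the residual works for every caller; the relabel helps only callers that already branch on the difference between the two statuses.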

Where the time went (`info["timing"]`)

Every Problem.solve attaches a per-subsystem wall-clock breakdown so you can attribute a solve’s runtime without patching or rebuilding the solver. info["wall_time"] is the overall-algorithm total (seconds); info["timing"] is a dict of the same total plus its components:

x, info = prob.solve(x0=...)
t = info["timing"]
print(t["overall_alg"])                    # total solve wall time
print(t["linear_system_factorization"],    # KKT factorization vs …
      t["linear_system_back_solve"],        # … back-solve, and their
      t["linear_system_total"])             # sum (total linear algebra)
print(t["eval_objective"], t["eval_gradient"],
      t["eval_constraints"], t["eval_constraint_jacobian"],
      t["eval_lagrangian_hessian"])         # per-callback eval time

The scipy-style pounce.minimize facade mirrors these onto the result as res.wall_time and res.timing (also in res.info). The callback split is what lets you see, for example, that a reduced-space / variable-aggregation solve converges in few iterations but spends most of its time in a densified Lagrangian-Hessian evaluation — the func/Jacobian/Hessian story becomes a direct measurement rather than an inference. All values are wall-clock seconds; unused subsystems read 0.0.
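A small helper makes the breakdown easier to read at a glance. This sketch uses the timing keys listed above and assumes that linear_system_total and the eval_* entries do not overlap; the numbers are invented and `timing_shares` is a hypothetical name.

```python
def timing_shares(timing):
    """Return each group's share of the total solve wall time."""
    total = timing["overall_alg"]
    if total <= 0.0:
        return {}
    evals = sum(v for k, v in timing.items() if k.startswith("eval_"))
    linear = timing["linear_system_total"]
    return {
        "linear_algebra": linear / total,
        "callbacks": evals / total,
        "other": max(0.0, 1.0 - (linear + evals) / total),
    }

# Invented breakdown: a solve dominated by Hessian evaluation.
timing = {
    "overall_alg": 2.0,
    "linear_system_factorization": 0.3,
    "linear_system_back_solve": 0.1,
    "linear_system_total": 0.4,
    "eval_objective": 0.05, "eval_gradient": 0.05,
    "eval_constraints": 0.1, "eval_constraint_jacobian": 0.2,
    "eval_lagrangian_hessian": 1.0,
}
shares = timing_shares(timing)
```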

Caller-supplied KKT ordering (`set_ordering`)

A structure-aware presolve can hand pounce a fill-reducing permutation for the KKT linear solver that the built-in AMD/METIS pass cannot derive — a block-triangular / Schur ordering (Parker, Garcia & Bent, arXiv:2602.17968) or a tearing ordering from equation-oriented decomposition. Install it on the low-level Problem before solving:

prob = pounce.Problem(n, m, problem_obj=...)
prob.set_ordering(perm)        # 0-based new-to-old permutation (list / int array)
x, info = prob.solve(x0=...)
# prob.get_ordering()  -> the installed permutation, or None
# prob.clear_ordering() -> restore the feral_ordering default

perm[k] is the original index that becomes index k. Its length must equal the augmented KKT system dimension (variables + slacks + constraint duals), not the problem’s n; for an unconstrained problem that is n, but with constraints it is larger. FERAL validates the ordering as a bijection: a wrong length or a duplicate entry fails the factorization, and the solve returns a non-success status (e.g. Error_In_Step_Computation) rather than crashing or returning a wrong answer. A valid permutation affects only fill and pivot order, never the computed solution. set_ordering is persistent configuration (it applies to every subsequent solve() until clear_ordering()) and is honored only by the default FERAL backend. It maps to FERAL’s OrderingMethod::External (feral#107).
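Checking a permutation before installing it gives a clearer error than a failed solve. The sketch below is a caller-side check under the new-to-old convention described above; it is not FERAL's validation, and the helper names and the toy matrix are illustrative.

```python
def check_ordering(perm, kkt_dim):
    """Raise ValueError unless `perm` is a bijection on range(kkt_dim)."""
    if len(perm) != kkt_dim:
        raise ValueError(f"length {len(perm)} != KKT dimension {kkt_dim}")
    if sorted(perm) != list(range(kkt_dim)):
        raise ValueError("not a permutation of 0..kkt_dim-1")

def permute_symmetric(matrix, perm):
    """Apply a new-to-old ordering: result[i][j] = matrix[perm[i]][perm[j]]."""
    return [[matrix[pi][pj] for pj in perm] for pi in perm]

# Toy 3x3 symmetric "KKT" matrix (2 variables + 1 constraint dual).
K = [[4.0, 0.0, 1.0],
     [0.0, 3.0, 2.0],
     [1.0, 2.0, 0.0]]
perm = [2, 0, 1]                 # original index 2 becomes index 0
check_ordering(perm, kkt_dim=3)
P = permute_symmetric(K, perm)
```

The permuted matrix has the same entries in different positions, which is why an ordering can change fill and pivot order without changing the solution of the system.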

Block-triangular / Schur KKT solve (`set_kkt_schur_block`)

If a presolve can identify a reducible block of the KKT system — e.g. the nonsingular block-triangular submatrix a reduced-space / variable-aggregation analysis exposes (Parker, Garcia & Bent, arXiv:2602.17968) — it can hand that block to pounce, which Schur-complements it out and factorizes only the two diagonal blocks, recovering the full-system inertia a priori via Sylvester’s law:

prob = pounce.Problem(n, m, problem_obj=...)   # needs an exact Hessian
prob.set_kkt_schur_block(indices)              # KKT-space indices of the Schur block
x, info = prob.solve(x0=...)
# prob.get_kkt_schur_block()  -> installed indices, or None
# prob.clear_kkt_schur_block()

indices are KKT-space indices into 0..dim where dim = n + n_slack + n_eq + n_ineq, in the solver’s internal x, slack, eq-dual, ineq-dual block order (e.g. for an all-equality problem the constraint-dual block is range(n, n + n_eq), and the primal block is the positive-definite eliminated block — the classic range/null-space split). The method wins only when the Schur block is much smaller than the eliminated block (the dense Schur complement is O(n_schur²) to store and O(n_schur³) to factor). When the partition is unsuitable — too large a fraction of the system, malformed, or a diagonal block turns out singular — the solver falls back to the standard full-space path transparently, so the hook can never break a solve; it only changes how the identical system is factored, never the solution. Honored on the default feral + exact-Hessian path.
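The inertia statement can be written out explicitly. What follows is the standard block-elimination identity; the symbols E (eliminated block), B (coupling), C (the Schur block's own diagonal part), and S (Schur complement) are chosen here for illustration, with the KKT matrix symmetrically permuted so the Schur block comes last. How POUNCE stores these blocks internally is not covered by this document.

```latex
K =
\begin{pmatrix} E & B^{\top} \\ B & C \end{pmatrix}
=
\begin{pmatrix} I & 0 \\ B E^{-1} & I \end{pmatrix}
\begin{pmatrix} E & 0 \\ 0 & S \end{pmatrix}
\begin{pmatrix} I & E^{-1} B^{\top} \\ 0 & I \end{pmatrix},
\qquad
S = C - B E^{-1} B^{\top},
\qquad
\operatorname{In}(K) = \operatorname{In}(E) + \operatorname{In}(S).
```

Because E is symmetric, the two outer factors are transposes of each other, so K is congruent to diag(E, S) and Sylvester's law of inertia gives the last equality. In the all-equality example, E is positive definite of size n, so the expected KKT inertia (n positive, n_eq negative, no zero eigenvalues) holds exactly when S is negative definite.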

Batched NLP solving (`solve_nlp_batch`)

pounce.solve_nlp_batch solves N independent NLPs and returns one (x, info) pair per input, in input order — for parametric sweeps, multi-start, MPC chains, or branch-and-bound node relaxations where each sibling differs only in tightened bounds.

import numpy as np
import pounce

base = pounce.read_nl("model.nl")          # native-Rust evaluators

# One parsed structure, many variations (cheap clones of the AD tapes):
rng = np.random.default_rng(0)
batch = [base.variant(x0=np.asarray(base.x0) + rng.normal(0, 0.01, base.n))
         for _ in range(24)]

results = pounce.solve_nlp_batch(batch, options={"tol": 1e-8})
for x, info in results:
    print(info["status_msg"], info["obj_val"])

NlProblem.variant(x0=, x_l=, x_u=, g_l=, g_u=) builds a sibling instance with per-instance starting point / bounds; everything structural (expression DAG, AD tapes, sparsity, coloring) is shared work that is not redone.

Native vs. callback inputs — the GIL caveat. Both kinds solve in parallel, with different ceilings:

NlProblem inputs (from read_nl / variant) are native-Rust reverse-mode-AD evaluators. The batch runs on a Rayon thread pool with the GIL fully released; each worker solves its instance end-to-end with an inner-serial factorization (outer-parallel / inner-serial, the same model as solve_qp_batch).
Callback-based Problem inputs (pass x0s=, one starting point per instance) also run one instance per worker, but every objective / gradient / constraints / jacobian / hessian call re-acquires the GIL. The Python share of the work is therefore serialized: the speedup scales with the Rust/Python work ratio — medium and large problems whose factorizations dominate parallelize well (~4x on 4 cores for an n=800 banded NLP with vectorized NumPy callbacks); tiny problems whose callbacks dominate won’t. Each Problem’s own add_option settings are honored per instance, with options= as a batch-level overlay.
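A back-of-envelope model helps decide whether a callback-based batch is worth parallelizing. The sketch below applies Amdahl's law, treating the GIL-bound Python share of the work as serial and the Rust share as perfectly parallel. That is a simplifying assumption, not a measured model of POUNCE, and the fractions are invented.

```python
def batch_speedup(python_fraction, workers):
    """Amdahl-style bound: the GIL-bound share is serial, the rest parallel."""
    if not 0.0 <= python_fraction <= 1.0:
        raise ValueError("python_fraction must be in [0, 1]")
    return 1.0 / (python_fraction + (1.0 - python_fraction) / workers)

# Factorization-dominated instance: only 2% of the time is in Python callbacks.
heavy_rust = batch_speedup(python_fraction=0.02, workers=4)
# Callback-dominated instance: 80% of the time is in Python callbacks.
heavy_python = batch_speedup(python_fraction=0.80, workers=4)
print(round(heavy_rust, 2), round(heavy_python, 2))
```

The callback share can be estimated from a single solve's info["timing"], by dividing the sum of the eval_* entries by overall_alg.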

With parallel=False either path solves one instance at a time, letting each factorization parallelize internally — better for a few large instances. For the batch, print_level defaults to 0 (N workers interleaving iteration tables is noise); pass an explicit print_level to override.

Warm-start chaining (MPC / B&B). Feed one batch’s results into the next solve of a nearby batch:

results = pounce.solve_nlp_batch(batch_t)              # cold
results = pounce.solve_nlp_batch(batch_t1, warms=results)  # warm

Each instance is seeded with the previous x and duals, the converged barrier parameter (info["mu"]) is threaded into mu_init, and warm_start_init_point=yes is forced. A warm start changes iteration counts, never solutions (re-solving the 24-instance gaslib sweep warm drops 482 total iterations to 120). A dimension-mismatched warm entry falls back to that instance’s cold start.

Identical-sparsity batches (share_structure=True). When every instance shares its KKT sparsity (parametric sweeps, multi-start, B&B siblings), this opt-in keeps each worker’s factorization backend alive across instances so the symbolic analysis (fill-reducing ordering, supernode structure) runs once per worker rather than once per instance. Always correct — a pattern change just triggers a fresh analysis — but pooled solver state means results are within solver tolerance of, not bit-identical to, the default fresh-backend solves. The win scales with how expensive ordering is for your model (small models: negligible; large sparse models: worth measuring).

scipy.optimize-style

import numpy as np
from pounce import minimize

res = minimize(lambda x: (x - 1) @ (x - 1) + 1, x0=np.zeros(5))
print(res.fun, res.x)

minimize is a thin facade over pounce.Problem shaped after scipy.optimize.minimize, so SciPy code ports with few changes — including as a method= callable handed to scipy.optimize.minimize itself. It returns a genuine scipy.optimize.OptimizeResult (res.x, res.fun, res.success, res.status, res.message, res.nit, and the res.nfev / res.njev / res.nhev evaluation counters), with pounce-specific extras under res.info and a back-compat shim so a key absent at the top level falls back to res.info.

Compatibility with `scipy.optimize.minimize`

minimize(fun, x0, args=(), jac=None, hess=None, bounds=None,
         constraints=None, callback=None, **options)

Argument	Status	Notes
`fun`, `x0`	✅	objective callable and start point
`args`	✅	tuple of extra positional arguments forwarded to `fun` / `jac`
`jac`	✅	callable, or `jac=True` (then `fun` returns `(value, gradient)`, cached so the gradient is not recomputed); omitted → central finite differences (`eps^(1/3)` step) and a one-time `UserWarning`. Provide one (or use `pounce.jax` / `pounce.torch`) for production.
`hess`	⚠️	used when there are no constraints or all constraints are linear (the constraint curvature is then zero, so the objective Hessian is the Lagrangian Hessian); with nonlinear constraints the solver falls back to L-BFGS (`hessian_approximation=limited-memory`)
`bounds`	✅	a sequence of `(lo, hi)` pairs or a scipy `Bounds` object; a `None` element or endpoint means ±∞
`constraints`	✅	scipy dict(s) `{"type": "eq"|"ineq", "fun": …, "jac": …}` or scipy `LinearConstraint` object(s) (dense or sparse `A`); multiple are concatenated; dict `"jac"` optional (finite-diff fallback)
`callback`	✅	called each iteration; both scipy signatures supported — `callback(xk)` and `callback(intermediate_result)`
`tol`	✅	accepted directly (scipy `gtol` / `ftol` / `xtol` are synonyms)
`options` / `**options`	✅	pass options as keyword args (legacy `options={…}` dict still works); keys are pounce/Ipopt names, with scipy synonyms mapped: `maxiter`→`max_iter`, `gtol`/`ftol`/`xtol`→`tol`, `disp`→`print_level`, `maxcor`→`limited_memory_max_history`
`method`	✅	`scipy.optimize.minimize(fun, x0, method=pounce.minimize, …)` works — pounce satisfies scipy’s custom-method contract
`hessp`	❌	no Hessian-vector-product mode
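For reference, the finite-difference fallback mentioned in the `jac` row can be sketched as follows. Only the eps^(1/3) step comes from the table; scaling the step by max(1, |x_i|) is an assumption of this sketch, and `central_diff_gradient` is a hypothetical name rather than a pounce function.

```python
import sys

def central_diff_gradient(fun, x):
    """Central-difference gradient with a step proportional to eps**(1/3)."""
    base = sys.float_info.epsilon ** (1.0 / 3.0)
    grad = []
    for i, xi in enumerate(x):
        h = base * max(1.0, abs(xi))     # assumed scaling of the step
        up = list(x)
        up[i] = xi + h
        dn = list(x)
        dn[i] = xi - h
        grad.append((fun(up) - fun(dn)) / (2.0 * h))
    return grad

def rosenbrock(x):
    return (1.0 - x[0]) ** 2 + 100.0 * (x[1] - x[0] ** 2) ** 2

# Analytic gradient at (-1.2, 1.0) is (-215.6, -88.0).
g = central_diff_gradient(rosenbrock, [-1.2, 1.0])
```

Each gradient costs 2n function evaluations, which is the practical reason to supply an analytic or autodiff `jac` for anything beyond small problems.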

Conventions that match SciPy (so constraints port directly):

Inequalities use the SciPy sign convention g(x) ≥ 0; equalities are g(x) = 0. A LinearConstraint(A, lb, ub) becomes lb ≤ A x ≤ ub.
The result object is a genuine scipy.optimize.OptimizeResult (subset of fields + an info map).
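The sign convention is the usual source of porting mistakes, so here is a small illustration in plain Python. The helper names are hypothetical; the dict shape is the SciPy form described above, with the optional "jac" left out.

```python
def as_scipy_ineq(h, upper):
    """Turn h(x) <= upper into a SciPy-style dict with g(x) = upper - h(x) >= 0."""
    return {"type": "ineq", "fun": lambda x: upper - h(x)}

def linear_constraint_rows(A, lb, ub, x):
    """Check lb <= A x <= ub row by row for a dense list-of-lists A."""
    Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return [lo <= v <= hi for lo, v, hi in zip(lb, Ax, ub)]

# Unit ball x.x <= 1, rewritten as 1 - x.x >= 0.
ball = as_scipy_ineq(lambda x: sum(xi * xi for xi in x), upper=1.0)
inside = ball["fun"]([0.6, 0.0])      # positive: feasible
outside = ball["fun"]([1.0, 1.0])     # negative: violated
rows = linear_constraint_rows([[1.0, 1.0]], [0.0], [1.0], [0.25, 0.5])
```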

Gaps worth knowing:

NonlinearConstraint objects are not accepted — pass nonlinear constraints as the dict form {"type": …, "fun": …, "jac": …}. (Bounds and LinearConstraint objects are accepted.)
A constraint dict’s Jacobian is dense; for large sparse Jacobians use the Problem class directly (a LinearConstraint may carry a sparse A, which is honored).
options={"maxiter": 100} now works (scipy synonyms are mapped), but the underlying pounce option is still max_iter; an unrecognized key is forwarded verbatim to the backend.
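The key renaming listed in the options row amounts to a lookup table. The sketch below reproduces only the documented key mapping; whether and how values are converted (for instance what `disp=True` becomes as a `print_level`) and what happens when two synonyms for `tol` are given together are not specified here, so the sketch leaves values untouched.

```python
SCIPY_SYNONYMS = {
    "maxiter": "max_iter",
    "gtol": "tol",
    "ftol": "tol",
    "xtol": "tol",
    "disp": "print_level",
    "maxcor": "limited_memory_max_history",
}

def translate_options(options):
    """Rename SciPy-style keys; unrecognized keys pass through unchanged."""
    return {SCIPY_SYNONYMS.get(key, key): value for key, value in options.items()}

translated = translate_options(
    {"maxiter": 100, "gtol": 1e-8, "mu_strategy": "adaptive"})
```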

Solver routing in `minimize`

By default minimize uses the general NLP filter line-search interior-point method and does no structure probing — an expensive fun pays nothing. Opt in with solver_selection="auto" (the same key the CLI uses) and minimize probes the callables: a problem that is provably a linear program or a convex quadratic program is dispatched to the specialized convex interior-point solver (pounce.solve_qp, the HSDE driver), and a provably convex QCQP (convex-quadratic objective and/or constraints) is reformulated to a second-order cone program and dispatched to the conic solver (pounce.solve_socp). Both reach a global optimum in materially fewer iterations; everything else falls through to the NLP solver.

The catch is that minimize only sees opaque callables — it cannot read a .nl expression tree the way the CLI can. So instead of reading the structure it probes it: it evaluates fun/jac/hess at several points, fits a linear/quadratic model, and then validates that model against the true callables at held-out points before trusting it. The two misclassification directions are not symmetric, and the validation gates the dangerous one:

A convex LP/QP/QCQP mistakenly sent to the NLP solver is merely slower — the filter-IPM still solves it correctly.
A genuinely nonlinear or nonconvex problem sent to the convex solver would return a silently wrong answer.

So each of the following falls back to the NLP solver: a probe that raises, a model mismatch beyond route_tol, a non-constant Hessian/Jacobian, an indefinite objective Hessian (a nonconvex QP), a quadratic equality, or a quadratic inequality whose feasible set is nonconvex (a non-PSD constraint Hessian). You never get a wrong “optimum” from a misclassification.
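The fit-then-validate pattern can be shown on the simplest case, detecting a linear objective. This is a conceptual sketch under its own assumptions (probe points, a relative-error test, four held-out samples), not pounce's probe; `looks_linear` is a hypothetical name and only `route_tol` is borrowed from the text.

```python
import random

def looks_linear(fun, n, route_tol=1e-5, probes=4, seed=0):
    """Fit f(x) ~ c + g.x from n+1 evaluations, then validate at held-out points."""
    rng = random.Random(seed)
    try:
        c = fun([0.0] * n)
        # Unit-vector evaluations recover the gradient exactly if f is linear.
        g = [fun([1.0 if j == i else 0.0 for j in range(n)]) - c
             for i in range(n)]
        for _ in range(probes):
            x = [rng.uniform(-2.0, 2.0) for _ in range(n)]
            model = c + sum(gi * xi for gi, xi in zip(g, x))
            truth = fun(x)
            if abs(model - truth) > route_tol * max(1.0, abs(truth)):
                return False          # mismatch: fall back to the NLP solver
    except Exception:
        return False                  # a probe that raises also falls back
    return True

def linear(x):
    return 3.0 * x[0] - 2.0 * x[1] + 1.0

def curved(x):
    return 3.0 * x[0] - 2.0 * x[1] + 0.01 * x[0] ** 2

def raises(x):
    raise RuntimeError("not evaluable here")

print(looks_linear(linear, 2), looks_linear(curved, 2), looks_linear(raises, 2))
```

Every failure path returns False, so the only way to reach the convex solver is to pass every check; that asymmetry is the point of the design described above.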

Forcing the solver

The solver_selection option (passed as a keyword argument, or in the legacy options= dict) overrides the automatic choice, mirroring the CLI option of the same name:

`solver_selection=…`	Behavior
`"nlp"`	Default. Skip routing entirely; always use the NLP solver — no probe overhead.
`"auto"`	Probe-and-validate; route provable LP/convex-QP to `solve_qp`, a convex QCQP to `solve_socp`, else NLP.
`"lp-ipm"`	Force the convex solver; raise `ValueError` if the problem is not detected as an LP.
`"qp-ipm"`	Force the convex solver; raise `ValueError` if it is not detected as a convex LP/QP.
`"socp"`	Force the conic solver; raise `ValueError` if it is not detected as a convex QCQP.

# Default: the general NLP solver, no probing.
res = minimize(fun, x0, bounds=bounds)

# Opt into routing: a convex QP goes to the fast convex IPM automatically.
res = minimize(fun, x0, bounds=bounds, solver_selection="auto")
print(res.info.get("solver"))      # 'qp-ipm' / 'socp' when routed; None on the NLP path

# Insist the problem is a convex QP; fail loudly if the probe disagrees:
res = minimize(fun, x0, solver_selection="qp-ipm")

# A convex QCQP (e.g. a quadratic ball constraint) routes to the conic solver
# under `solver_selection="auto"`. Give the objective and constraint analytic
# `jac`s: derivative-free detection recovers the constraint Hessian from a
# finite-difference-of-finite-difference Jacobian, which is too noisy to confirm
# the quadratic, so without `jac` the probe conservatively defers to NLP (still
# the correct answer, just slower).
ball = {"type": "ineq",
        "fun": lambda x: 1.0 - x @ x,        # x·x ≤ 1
        "jac": lambda x: -2.0 * np.asarray(x)}
res = minimize(lambda x: -x[0] - x[1], [0.1, 0.1],
               jac=lambda x: np.array([-1.0, -1.0]),
               constraints=[ball], solver_selection="auto")
print(res.info.get("solver"))      # 'socp' (None on the NLP fall-back path)

route_tol (default 1e-5) sets the relative tolerance for the held-out validation; raise it if a genuinely-linear problem with noisy finite-difference Jacobians is being conservatively rejected, lower it to be stricter. The routing keys are consumed by minimize and never forwarded to the backend, so the rest of options still reaches the NLP solver unchanged.

When you still need a typed entry point

Auto-routing handles LP, convex QP, and convex QCQP from the minimize(fun, x0, …) shape. The remaining specialized solvers need structure that a callable cannot carry — an explicit cone list (exp/power/PSD cones), a symbolic objective to relax and bound — so each keeps its own pounce-native entry point:

Want	Entry point	You provide	Optimum
General nonlinear, fast local solve	`minimize(fun, x0, …)`	callables (`fun`/`jac`/`hess`)	local
LP / convex QP	`minimize` (auto) or `solve_qp(P, c, A, b, G, h, lb, ub, …)`	callables / matrices	global
Convex QCQP	`minimize` (auto / `socp`) or `solve_socp(…, cones=…)`	callables / matrices + cone list	global
SOCP / exp / power / PSD cones	`solve_socp(P, c, A, b, G, h, *, cones, …)`	matrices + cone list	global
Polynomial, certified global	`sos_minimize(objective, *, inequalities, equalities, …)`	a polynomial	global

The solve_qp / solve_socp / sos_minimize functions are pounce-native (not SciPy-shaped) by necessity — e.g. sos_minimize takes a polynomial as a coefficient dict and returns a certificate, not callables and SciPy dicts. See Choosing a Solver for the full map.

A minimize_global entry point for factorable nonconvex problems (spatial branch-and-bound) is in development on the feature/global branch and is not exposed in this release; today the certified-global Python path is sos_minimize, for polynomials.

Curve fitting

pounce.curve_fit is the data-fitting companion to minimize — a scipy.optimize.curve_fit-style front end that adds parameter constraints, robust losses, confidence intervals, and ∂params/∂data sensitivity, with the covariance read from the solver’s reduced Hessian. See Curve Fitting.

from pounce import curve_fit

res = curve_fit(model, xdata, ydata, p0=[1, 1, 0])   # model written with jax.numpy
print(res.summary())

Finding multiple minima

pounce.find_minima is the global-search companion to minimize: it drives the same solver in a loop to discover many distinct minima (flooding, deflation, tunneling, multistart, MLSL, basin-hopping). See Finding Multiple Minima for the methods and references, Choosing a Method for selection guidance (including high-dimensional behavior), and notebooks 19, 20, 21 for the three families.

from pounce import find_minima

r = find_minima(fun, x0, method="deflation", jac=jac, hess=hess,
                bounds=bounds, n_minima=6)
print(r.status, len(r), "minima; best f =", r.fun)

JAX integration

The pounce.jax subpackage provides five entry points:

Surface	Use it for
`from_jax(f, g, …)`	Build a one-shot `pounce.Problem` from JAX-traced `f(x)` and `g(x)`.
`solve(p, …)`	`custom_vjp`-wrapped differentiable solve over a parameter `p`.
`solve_with_warm(p, …, warm_start=)`	`solve` + dual-triple (`x, λ, z`) warm-start hand-off across calls.
`vmap_solve(p_batch, …)` / `vmap_solve_parallel(…)`	Batched solve over a leading axis of `p`; the `_parallel` variant uses a `ThreadPoolExecutor` and releases the GIL inside each solve.
`JaxProblem(f, g, n, m, p_example=, …)`	Build-once / solve-many handle that caches JIT artefacts, the sparsity probe, and the underlying `pounce.Problem` across calls.

One-shot build with `from_jax`

import jax.numpy as jnp
from pounce.jax import from_jax

def f(x): return jnp.sum((x - 1) ** 2)
def g(x): return jnp.stack([jnp.sum(x) - 5.0])

prob = from_jax(f, g, n=4, m=1, lb=jnp.zeros(4), ub=jnp.full(4, 10.0),
                cl=jnp.zeros(1), cu=jnp.zeros(1))
x, info = prob.solve(x0=jnp.ones(4))

Sparse Jacobian/Hessian compression (`sparse=`)

By default the constraint Jacobian and the Lagrangian Hessian are computed densely — jax.jacrev/jacfwd/hessian build the full matrix, which is then sliced to the detected sparsity pattern. The reported structure is sparse, but the AD work and memory are O(m·n) (Jacobian) and O(n²) (Hessian) regardless of how sparse the true matrices are. On a 10,000-variable banded system that means computing ~10⁸ entries per iteration to keep ~50,000.

Passing sparse=True switches both derivatives to CPR-style colored AD (pounce#83): structurally-orthogonal columns are colored, one JVP (Jacobian) / HVP (Hessian) is taken per color — k ≪ n colors — and the compressed result is scattered back to the known nonzeros. The per-iteration cost drops from O(n) to O(k) AD passes. This is the same compression strategy the Rust .nl tape path already uses for its Hessian.

prob = from_jax(f, g, n=4, m=1, lb=jnp.zeros(4), ub=jnp.full(4, 10.0),
                cl=jnp.zeros(1), cu=jnp.zeros(1),
                sparse=True)              # colored JVP/HVP instead of dense slice

The flag is also accepted by JaxProblem, where it applies to both the single-solve and the batched block-diagonal paths. The reported structure, the values, and the solution are identical to the dense path either way — only the cost of producing the derivative values changes. The differentiable backward (factor_reuse / implicit diff) is unaffected.
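The coloring idea can be illustrated without JAX. The following is a minimal NumPy sketch, assuming a toy tridiagonal matrix and a simple greedy coloring; `greedy_column_coloring` and the matrix values are made up for illustration and are not pounce's implementation. Plain matrix-vector products stand in for the JVPs.

```python
import numpy as np

def greedy_column_coloring(pattern):
    """Give each column a color so that columns sharing a nonzero row differ."""
    n = pattern.shape[1]
    colors = np.full(n, -1, dtype=int)
    for j in range(n):
        rows = pattern[:, j]                          # rows touched by column j
        clash = pattern[rows][:, :j].any(axis=0)      # earlier columns hitting those rows
        used = set(colors[:j][clash].tolist())
        c = 0
        while c in used:
            c += 1
        colors[j] = c
    return colors

# Toy tridiagonal "Jacobian" (n = 8); the values are arbitrary nonzeros.
n = 8
J = np.zeros((n, n))
for i in range(n):
    for j in range(max(0, i - 1), min(n, i + 2)):
        J[i, j] = 1.0 + i + 0.1 * j

pattern = J != 0
colors = greedy_column_coloring(pattern)
k = int(colors.max()) + 1                             # number of colors

# One seed vector per color: the sum of the unit vectors in that color group.
S = np.zeros((n, k))
S[np.arange(n), colors] = 1.0

# k matrix-vector products stand in for k JVPs.
B = J @ S

# Scatter the compressed columns back to the known nonzeros.
rows, cols = np.nonzero(pattern)
J_rec = np.zeros_like(J)
J_rec[rows, cols] = B[rows, colors[cols]]

print(k, np.allclose(J_rec, J))
```

Because no two columns of the same color share a row, each compressed entry contains exactly one nonzero of the original matrix, which is why the scatter step recovers it exactly.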

When to use it. sparse=True wins on problems whose Jacobian/Hessian are genuinely sparse with bounded per-row fill (banded, block, finite differences/elements, PDE-constrained, separable). On a dense problem the coloring finds no orthogonality (k = n) and the flag is a small, bounded overhead, so it is opt-in rather than the default. Measured on a banded family (python/benchmarks/bench_sparse_ad_83.py):

n	colors (Jac / Hess)	per-eval Jacobian	per-eval Hessian	full solve
800	2 / 3	6.2× faster	2.0× faster	1.3× faster
2000	2 / 3	18.4× faster	5.4× faster	7.6× faster
5000	2 / 3	560× faster	200× faster	—

The color count stays constant in n while the dense path grows linearly, so the gap widens without bound as the problem scales.

Pattern detection. Sparsity is found by probing the dense derivative at random points and recording where entries are nonzero. Under sparse=True a mis-probe is costlier — it corrupts the compression seed, not just a reported nonzero — so detection unions 3 probes by default (vs 1 for the dense path). Override with n_probes=. Truly value-dependent structure (branchy where/abs) should still be hand-rolled via the Problem API.
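Why several probes are unioned can be seen on a two-variable toy. The sketch below assumes a hypothetical Jacobian `jac` whose entries happen to vanish at one particular point; it is an illustration of the probing idea, not pounce's detection code.

```python
import numpy as np

def jac(x):
    # Hypothetical 2x2 Jacobian whose entries (0,0) and (1,1) vanish at x = (0, 1).
    return np.array([[2.0 * x[0], 1.0],
                     [0.0,        x[1] - 1.0]])

def probe_pattern(points):
    """Union of the nonzero masks of jac over the probe points."""
    mask = np.zeros((2, 2), dtype=bool)
    for x in points:
        mask |= jac(x) != 0.0
    return mask

unlucky = np.array([0.0, 1.0])                 # a point where two entries are exactly zero
rng = np.random.default_rng(0)
random_points = rng.standard_normal((3, 2))    # three random probes

single = probe_pattern([unlucky])
union = probe_pattern([unlucky, *random_points])
print(single.sum(), union.sum())
```

A single probe at the unlucky point under-reports the structure; the union over additional points recovers the entries that are only coincidentally zero, while the structurally zero entry stays out of the pattern.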

Differentiable solve

pounce.jax.solve(p, f=, g=, …) is a custom_vjp-wrapped solve that differentiates x*(p) through the implicit function theorem on the converged KKT system. Inequality rows that are not active at x* are dropped from the KKT block before the implicit-diff back-solve, so the gradient matches the analytic active-set sensitivity even on slack-inequality problems (pounce#73).

import jax, jax.numpy as jnp
from pounce.jax import solve as psolve

def f(x, p): return jnp.sum((x - p) ** 2)
def g(x, p): return jnp.stack([x[0] + x[1] - 1.0])   # equality

def x_star(p):
    return psolve(
        p, f=f, g=g, x0=jnp.zeros(2), n=2, m=1,
        lb=jnp.full(2, -10.0), ub=jnp.full(2, 10.0),
        cl=jnp.zeros(1),       cu=jnp.zeros(1),
        options={"tol": 1e-10, "print_level": 0},
    )

# Gradient of the squared norm ‖x*(p)‖² of the solution with respect to p:
loss = lambda p: jnp.sum(x_star(p) ** 2)
print(jax.grad(loss)(jnp.array([0.3, 0.7])))
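For this particular toy problem the implicit-function-theorem gradient can be checked by hand. The NumPy sketch below is an independent illustration (it does not call pounce): because the problem is an equality-constrained QP, the KKT conditions form one linear system, and differentiating that system with respect to p gives the sensitivity. The helper name `x_star_np` is hypothetical.

```python
import numpy as np

a = np.array([1.0, 1.0])                      # constraint a @ x = 1

# KKT matrix of  min ||x - p||^2  s.t.  a @ x = 1  (constant: the problem is a QP)
K = np.block([[2.0 * np.eye(2), a[:, None]],
              [a[None, :],      np.zeros((1, 1))]])

def x_star_np(p):
    # Stationarity 2(x - p) + lam * a = 0 and feasibility a @ x = 1 as one linear system.
    sol = np.linalg.solve(K, np.concatenate([2.0 * p, [1.0]]))
    return sol[:2]

p = np.array([0.3, 0.7])
x = x_star_np(p)

# Differentiate the KKT system w.r.t. p:  K @ [dx/dp; dlam/dp] = [2 I; 0].
rhs = np.vstack([2.0 * np.eye(2), np.zeros((1, 2))])
dx_dp = np.linalg.solve(K, rhs)[:2]

# Chain rule for loss(p) = sum(x*(p)^2).
grad = dx_dp.T @ (2.0 * x)
print(dx_dp, grad)
```

Here dx/dp is the projector onto the constraint's null space, so the gradient of the loss is the projected gradient 2·x*.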

Warm-start across a parameter trajectory

solve_with_warm returns the full primal-dual triple alongside x*, and consumes one on the next call. The warm-state is opaque from the JAX side (pytree of jnp arrays) but maps directly onto the x0 / λ0 / z0 ports of the underlying solver — for a sequence of nearby p values this often cuts solver iterations by an order of magnitude (pounce#74).

from pounce.jax import solve_with_warm

trajectory = [jnp.array([0.3 + 0.01 * k, 0.7 - 0.01 * k]) for k in range(50)]

x, warm = solve_with_warm(
    trajectory[0], f=f, g=g, x0=jnp.zeros(2), n=2, m=1,
    lb=jnp.full(2, -10.0), ub=jnp.full(2, 10.0),
    cl=jnp.zeros(1), cu=jnp.zeros(1),
    warm_start=None,                           # first call → cold start
    options={"tol": 1e-10, "print_level": 0},
)
xs = [x]
for p_k in trajectory[1:]:
    x, warm = solve_with_warm(
        p_k, f=f, g=g, x0=x, n=2, m=1,
        lb=jnp.full(2, -10.0), ub=jnp.full(2, 10.0),
        cl=jnp.zeros(1), cu=jnp.zeros(1),
        warm_start=warm,                       # reuse λ, z
        options={"tol": 1e-10, "print_level": 0},
    )
    xs.append(x)

Batched solve (`vmap_solve` / `vmap_solve_parallel`)

vmap_solve runs one solve per row of p_batch sequentially. vmap_solve_parallel is the same surface but dispatches each row to a ThreadPoolExecutor; the underlying Rust solve releases the GIL via py.allow_threads, so workers actually run in parallel on multi-core CPUs (pounce#74).

import numpy as np
from pounce.jax import vmap_solve_parallel

rng   = np.random.default_rng(0)
batch = jnp.asarray(rng.standard_normal((32, 2)))

X = vmap_solve_parallel(
    batch, f=f, g=g, x0=jnp.zeros(2), n=2, m=1,
    lb=jnp.full(2, -10.0), ub=jnp.full(2, 10.0),
    cl=jnp.zeros(1), cu=jnp.zeros(1),
    workers=8,                                 # ThreadPoolExecutor size
    options={"tol": 1e-9, "print_level": 0},
)
assert X.shape == (32, 2)

Both batched surfaces are custom_vjp-wrapped, so a downstream jax.grad/jax.jacobian over a batched loss works end-to-end.

Build once, solve many: `JaxProblem`

In iterative use (a parameter trajectory in a continuation loop, a training step that calls the solver inside a batch, a notebook cell that sweeps a knob), from_jax/solve rebuild the JIT artefacts, the sparsity probe, and the underlying pounce.Problem on every call. JaxProblem does that work once at construction and exposes the same four method shapes against the cached state. On the pounce#75 microbenchmark (n=5, m=6, 20 sequential solves at different p) this is roughly a 14× speedup, taking per-solve time from ~96 ms down to ~7 ms.

from pounce.jax import JaxProblem

jp = JaxProblem(
    f=f, g=g, n=2, m=1, p_example=jnp.zeros(2),     # p_example fixes shape/dtype only
    lb=jnp.full(2, -10.0), ub=jnp.full(2, 10.0),
    cl=jnp.zeros(1),       cu=jnp.zeros(1),
    options={"tol": 1e-9, "print_level": 0},
    # sparse=True,                                  # colored AD on sparse problems (see above)
)

# Sequential, differentiable:
x = jp.solve(jnp.array([0.3, 0.7]), x0=jnp.zeros(2))

# Dual-warm-start trajectory (composes warm-state hand-off with reuse):
x, warm = jp.solve_with_warm(trajectory[0], x0=jnp.zeros(2), warm_start=None)
for p_k in trajectory[1:]:
    x, warm = jp.solve_with_warm(p_k, x0=x, warm_start=warm)

# Batched parallel solve over a row-axis of p_batch:
X = jp.vmap_solve_parallel(batch, x0=jnp.zeros(2), workers=8)

Each worker thread in vmap_solve_parallel keeps its own cached pounce.Problem via threading.local, so the per-thread build cost is paid at most once per worker rather than once per batch row.

Factor-reuse backward (`factor_reuse=`)

JaxProblem.solve and solve_with_warm default to a k_aug-style backward that reuses the IPM’s converged compound KKT factor (pounce.Solver.kkt_solve) instead of assembling a dense (n+m) × (n+m) block and running jnp.linalg.solve on it (pounce#76). The held LDLᵀ factor turns the bwd back-solve from O((n+m)³) into O(nnz(L)) and drops the explicit active-set masking that the dense path does — the barrier rows on the bound multipliers (z_l, z_u) already encode “active bounds force Δx_i = 0” exactly, and the (v_l, v_u) rows do the same for slack inequalities. The accuracy of the resulting gradient is O(μ) at the IPM barrier parameter, which sits well below tol after convergence.

jp = JaxProblem(..., factor_reuse=True)   # default; reuse the IPM factor
jp = JaxProblem(..., factor_reuse=False)  # dense JAX backward

Pick factor_reuse=False when you want higher-order differentiation (jax.grad(jax.grad(...)) through the solver) — the dense backward stays JAX-traced and is itself differentiable, the factor-reuse one crosses to the Rust host via pure_callback and is opaque to a second-order trace.

When to pick which on `batched_solve` workloads (pounce#77)

factor_reuse=False is itself a form of factor reuse — it builds the per-block (n+m) × (n+m) KKT at pounce’s converged (x*, λ*, μ_l*, μ_u*) (saved in the custom_vjp residual) and solves it under jax.vmap with a JIT-fused per-block jnp.linalg.solve. So both modes reuse pounce’s converged solution; they differ only in what they back-solve:

factor_reuse=True — back-solves pounce’s held LDLᵀ factor of the full stacked KKT (Rust-side, via FFI through a single-thread executor pin).
factor_reuse=False — back-solves a freshly assembled per-block dense KKT in JAX, fused under vmap.

For batched_solve + jax.jacrev / jax.vmap minibatch projections, factor_reuse=False is faster at every scale we measured from n = 8 upward; only at the smallest size (n = 3) does the factor-reuse backward edge ahead (n = 3 through 48 per block, B = 64 stacked):

n=3   reuse bwd =  16.6 ms   dense bwd =  20.6 ms   reuse/dense = 0.80×
n=8   reuse bwd =  52.5 ms   dense bwd =  38.5 ms   reuse/dense = 1.36×
n=16  reuse bwd = 157.6 ms   dense bwd =  57.2 ms   reuse/dense = 2.76×
n=32  reuse bwd = 558.6 ms   dense bwd = 103.6 ms   reuse/dense = 5.39×
n=48  reuse bwd =1262.9 ms   dense bwd = 137.4 ms   reuse/dense = 9.19×

The dense path scales as B · (n+m)³; the factor-reuse path scales as N · kkt_dim ≈ B² · n · (n+m) because jax.jacrev fans out N = B·n cotangents and each triggers a back-solve of the full stacked LDLᵀ even though only one block has nonzero signal.

Guidance:

Single solve + many sensitivities — jax.jacrev(jp.solve, argnums=0)(p, x0) and friends — keep factor_reuse=True. One LDLᵀ back-solve per cotangent against the held factor beats JAX dense-solving a fresh (n+m) × (n+m) block.
Batched solve + jacrev / vmap — jax.jacrev(lambda P: jp.batched_solve(P, x0))(pb) — set factor_reuse=False. Treat the dense path as the default for minibatch projections.

Each fwd registers its converged factor in a bounded LRU on the JaxProblem (default capacity 128). For very long-running training loops with many distinct forward solves you can drop the cache explicitly:

jp.clear_solver_cache()
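For readers unfamiliar with the pattern, a bounded LRU registry behaves as in the following standard-library sketch. The class `BoundedLRU` and the keys are hypothetical; the source does not describe how pounce implements its cache beyond "bounded LRU".

```python
from collections import OrderedDict

class BoundedLRU:
    """Minimal least-recently-used cache with a fixed capacity."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)           # newest entry goes last
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)    # evict the oldest entry

    def get(self, key):
        value = self._items[key]               # KeyError if evicted
        self._items.move_to_end(key)           # a hit refreshes recency
        return value

    def clear(self):
        self._items.clear()

    def __len__(self):
        return len(self._items)

cache = BoundedLRU(capacity=2)
cache.put("solve-1", "factor-1")
cache.put("solve-2", "factor-2")
cache.get("solve-1")                           # touch: solve-2 is now the oldest
cache.put("solve-3", "factor-3")               # evicts solve-2
```

Memory is bounded by the capacity no matter how many forward solves run; an explicit clear simply releases the retained entries early.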

Off-thread dispatch (training loops, `jit(value_and_grad(...))`)

pounce.Solver is a !Send PyO3 type (it holds an Rc<RefCell<dyn TNLP>> interior), so any attempt to touch the held factor from a thread other than the one that built it raises a PyO3 panic. JAX hits this whenever the bwd pure_callback lands on an XLA worker thread — typical for jax.jit(jax.value_and_grad(...)) inside a training step.

JaxProblem(factor_reuse=True) defends against this by routing every pounce.Solver interaction (fwd register, warm-start solve, batched solve, bwd kkt_solve) through a dedicated single-thread ThreadPoolExecutor owned by the JaxProblem (pounce#77). All solver touches are pinned to that one worker thread regardless of which thread JAX dispatches from. vmap_solve_parallel bypasses the pin (it doesn’t register with the factor cache), so its B-way thread concurrency is preserved.
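The pinning technique itself is independent of pounce and can be sketched with the standard library. In the sketch below, `ThreadBound` is a toy stand-in for a thread-affine object and `Pinned` is a hypothetical wrapper; neither is pounce's actual class.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class ThreadBound:
    """Toy stand-in for a thread-affine object: usable only on its creating thread."""
    def __init__(self):
        self._owner = threading.get_ident()

    def touch(self, x):
        if threading.get_ident() != self._owner:
            raise RuntimeError("touched from a foreign thread")
        return 2 * x

class Pinned:
    """Routes construction and every call through one dedicated worker thread."""
    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._obj = self._pool.submit(ThreadBound).result()   # built on the worker

    def touch(self, x):
        return self._pool.submit(self._obj.touch, x).result()

    def close(self):
        self._pool.shutdown()

pinned = Pinned()
with ThreadPoolExecutor(max_workers=4) as callers:      # callers on arbitrary threads
    results = list(callers.map(pinned.touch, range(8)))

try:
    pinned._obj.touch(1)                                # direct call from the main thread
    direct_ok = True
except RuntimeError:
    direct_ok = False
pinned.close()
print(results, direct_ok)
```

Calls arriving from any thread succeed because the object is only ever touched on the single worker; the price is that those touches are serialized.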

Pickle / distributed training

JaxProblem round-trips through pickle.dumps / pickle.loads, so it works with the realistic distributed-training paths:

multiprocessing(start_method='spawn') — the default on macOS and what torch.utils.data.DataLoader(num_workers>0) uses;
Ray and Dask actors via cloudpickle;
Naive checkpointing for resume.

The per-process runtime state (JIT’d closures, threading.Lock, threading.local, the factor-reuse executor, the held LDLᵀ factor registry) is dropped from the pickle and rebuilt on the receiving side. The sparsity-pattern arrays survive the round trip, so the worker doesn’t redo the one-shot JAX probe. Held factors do not survive — a fresh process has no history of fwd solves, so the receiver’s registry starts empty and the bwd factor-reuse path picks up from the next solve.

User-side requirement: f and g must themselves be picklable. Module-level functions work with stdlib pickle; lambdas / inner functions need cloudpickle (which Ray and Dask already use by default; torch.multiprocessing relies on the standard pickle machinery, so prefer module-level functions there).

multiprocessing(start_method='fork') is not supported — JAX itself warns that os.fork() is incompatible with its threading; use spawn instead.
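Dropping runtime state on pickling and rebuilding it on load is the standard `__getstate__` / `__setstate__` pattern. The sketch below shows it on a toy class; `Handle` and its fields are hypothetical and only mirror the kinds of state listed above.

```python
import pickle
import threading

class Handle:
    """Toy handle: persistent data plus per-process runtime state."""
    def __init__(self, pattern):
        self.pattern = pattern                 # persistent: survives pickling
        self._lock = threading.Lock()          # runtime: locks cannot be pickled
        self._factors = {}                     # runtime: per-process cache

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["_lock"], state["_factors"]  # drop runtime-only fields
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._lock = threading.Lock()          # rebuilt on the receiving side
        self._factors = {}                     # registry starts empty

h = Handle(pattern=[(0, 0), (1, 1)])
h._factors["solve-1"] = "held factor"
h2 = pickle.loads(pickle.dumps(h))
print(h2.pattern, h2._factors)
```

The persistent field round-trips unchanged, while the cache on the receiving side starts empty, matching the behaviour described for held factors.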

Stacked block-diagonal batched solve (`batched_solve`)

JaxProblem.batched_solve(p_batch, x0) runs one IPM solve over a single NLP whose variables are [x^(1); ...; x^(B)], constraints are concat(g(x^(k), p^(k))), and objective is Σ_k f(x^(k), p^(k)). The Jacobian and Lagrangian Hessian are block-diagonal — each block-k constraint touches only the block-k slice of X, and the objective is a pure sum, so there’s no cross-block coupling. The IPM sees one big sparse problem but does only B × (per-block factor cost) work on the linear system.

p_batch = jnp.array([[0.3, 0.7], [0.5, 0.5], [-0.1, 0.4]])
x_batch = jp.batched_solve(p_batch, x0=jnp.zeros(2))    # (B, n)

custom_vjp-wrapped, so jax.grad/jax.jacobian through the batched solve work end-to-end:

def loss(P):
    return jnp.sum(jp.batched_solve(P, x0=jnp.zeros(2)) ** 2)

dloss_dP = jax.grad(loss)(p_batch)                       # (B, p_shape)

The backward path follows factor_reuse=:

factor_reuse=True (default) — one Solver.kkt_solve against the stacked held LDLᵀ factor; the per-block ∂²L/∂x∂p / ∂g/∂p are jax.vmap’d autodiff over the user’s f / g, then contracted with the per-block u_x / u_g slices of the single back-solve. Composes (A) and (B) — one factor for both forward and per-batch sensitivities (pounce#76).
factor_reuse=False — jax.vmap of the per-element dense (n+m) × (n+m) JAX KKT solve. Exact for the same reason: block-diagonal coupling means ∂x^(k)*/∂p^(j) = 0 for k ≠ j.

When to pick batched_solve vs the existing batched surfaces:

Surface	Wins when
`vmap_solve`	Long batches, want one solve per iterate sequentially.
`vmap_solve_parallel`	Batch elements have very different convergence behaviour — slow blocks don’t drag fast ones (B independent IPMs in worker threads, GIL released per solve).
`batched_solve`	Blocks have similar convergence behaviour (shared barrier homotopy and symbolic factorisation amortise) and B is large enough that the per-call Python overhead of B fwd dispatches becomes visible (one Rust crossing instead of B).

Per-block lb/ub/cl/cu are tiled across the batch; the parameter p is what varies, not the feasible region. Stacked Problems are cached per (thread, B) in a tiny LRU (cap 4), so calls in a loop with one or two batch sizes pay the build cost at most once per worker.
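The block-diagonal structure that this section relies on can be checked on a small linear system. The NumPy sketch below uses random symmetric positive-definite blocks as stand-ins for the per-block Hessians; it illustrates the structure only and does not involve the IPM.

```python
import numpy as np

rng = np.random.default_rng(0)
B, n = 3, 2

# One symmetric positive-definite "Hessian" block and right-hand side per batch element.
blocks = []
for _ in range(B):
    M = rng.standard_normal((n, n))
    blocks.append(M @ M.T + n * np.eye(n))
rhs = rng.standard_normal((B, n))

# Stacked system: block k sits on the diagonal, everything off-block is zero.
H = np.zeros((B * n, B * n))
for k, Hk in enumerate(blocks):
    H[k * n:(k + 1) * n, k * n:(k + 1) * n] = Hk

x_stacked = np.linalg.solve(H, rhs.reshape(-1)).reshape(B, n)
x_blocks = np.stack([np.linalg.solve(Hk, rk) for Hk, rk in zip(blocks, rhs)])

print(np.allclose(x_stacked, x_blocks))
```

Solving the stacked system gives the same answer as solving each block separately, and the inverse of the stacked matrix has zero off-diagonal blocks, which is the linear-algebra reason there are no cross-block sensitivities.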

Post-solve Jacobian and sensitivities (`batched_solve_with_jacobian`)

When you need the explicit per-block Jacobian J[k] = ∂x^(k)*/∂p^(k) as a first-class result — for validation, linear-update layers, or diagnostics — batched_solve_with_jacobian returns it directly from the held KKT factor instead of wrapping batched_solve in jax.jacrev:

x_star, (lam, zL, zU), J = jp.batched_solve_with_jacobian(p_batch, x0)
# x_star : (B, n)   J : (B, n, p_dim)   duals match batched_solve_with_warm

J’s row i is the reverse-mode VJP at cotangent e_i (the KKT system is symmetric), so the whole Jacobian is one multi-RHS back-solve against the held LDLᵀ factor — no NLP re-solve, no repeated public jax.vjp calls. Pass wrt_cols (1-D p only) to keep just the parameter columns you care about, e.g. wrt_cols=slice(0, ny) to drop context columns; J then has trailing dim len(wrt_cols).

For the linear-update pattern — anchor once, then apply several nearby sensitivity products — pin the factor with an AnchorState and reuse it:

with jp.anchor(p_batch, x0, wrt_cols=slice(0, ny)) as state:
    dx     = jp.batched_jvp_from_state(state, dp)      # J @ dp   (forward)
    dp_bar = jp.batched_vjp_from_state(state, x_bar)   # J^T @ x_bar (reverse)

batched_jvp_from_state is the cheap path for linear updates that only need the directional sensitivity delta_x = J @ delta_p and never the full J: it assembles the parameter-side RHS [∂²L/∂x∂p · dp; ∂g/∂p · dp] and back-solves once against the held factor. When the state was anchored with wrt_cols, pass the reduced dp (one entry per selected column); otherwise pass a full (B,) + p_shape perturbation (zero out the columns you don’t want to move).
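The relationship between the forward and reverse products can be sketched on a small equality-constrained QP. Everything below is an assumption for illustration: the toy objective f(x, p) = ½xᵀHx − pᵀx, the helper names `jvp` and `vjp`, and the use of a precomputed inverse as a stand-in for a held LDLᵀ factor.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2
M = rng.standard_normal((n, n))
H = M @ M.T + n * np.eye(n)          # SPD Hessian of f(x, p) = 0.5 x^T H x - p^T x
A = rng.standard_normal((m, n))      # equality constraints A x = b

# KKT matrix; in a real solver its LDL^T factor would be held, here its inverse is
# precomputed once as a stand-in for "factor once, back-solve many times".
K = np.block([[H, A.T], [A, np.zeros((m, m))]])
K_inv = np.linalg.inv(K)

def jvp(dp):
    # dx = J @ dp: the parameter-side right-hand side is [dp; 0] for this toy problem.
    return (K_inv @ np.concatenate([dp, np.zeros(m)]))[:n]

def vjp(x_bar):
    # dp_bar = J^T @ x_bar: K is symmetric, so the same factor serves the adjoint.
    return (K_inv @ np.concatenate([x_bar, np.zeros(m)]))[:n]

dp = rng.standard_normal(n)
x_bar = rng.standard_normal(n)
lhs = x_bar @ jvp(dp)                # <x_bar, J dp>
rhs = vjp(x_bar) @ dp                # <J^T x_bar, dp>
print(np.isclose(lhs, rhs))
```

The two inner products agree, which is the usual adjoint consistency check for a JVP/VJP pair, and each product costs one back-solve without ever forming J.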

anchor(...) (and batched_solve_with_jacobian(..., return_state=True)) return an AnchorState that holds the factor across calls. Prefer the context-manager form; for handles that must outlive a single block (e.g. stored on a projection layer), use explicit ownership:

state = jp.anchor(p_batch, x0)
...                          # later calls reuse `state`
state.reanchor(p_new, x0)    # swap the solve in place (closes prior pin)
state.close()                # release the held factor

Pinned factors are exempt from the backward LRU but capped (_pinned_capacity, default 16) so a missed close() fails loudly rather than leaking; a weakref finalizer reclaims the factor if a handle is garbage-collected without close(). A worked example — projection layer, full Jacobian, JVP/VJP-from-state, and the lifetime patterns — is in notebooks/13_post_solve_jacobian.ipynb.

PyTorch integration

The pounce.torch subpackage is a PyTorch frontend mirroring pounce.jax, one-for-one. It is a thin adapter, not a second solver: the numerical core (the Rust IPM) and the implicit-function-theorem backward are framework-agnostic — only the array namespace differs. A solve is a torch.autograd.Function you can drop inside a torch.nn model and backprop through, with the same constraint-satisfaction guarantee the JAX path gives. Install with pip install pounce[torch] (torch.func requires torch ≥ 2.2).

Because PyTorch is eager, the adapter is smaller than the JAX one: there is no pure_callback / ShapeDtypeStruct machinery (the forward calls problem.solve(...) directly), no host-callback registry or single-thread executor (the converged Solver is stashed on the autograd ctx / AnchorState and read back in the backward on the same thread), and no global jax_enable_x64 flag — float64 tensors are requested explicitly (torch.set_default_dtype(torch.float64) or .double() your inputs; the implicit-diff and KKT solves need double precision and the layers validate it).

JAX surface	PyTorch equivalent
`from_jax(f, g, …)`	`from_torch(f, g, …)`
`solve(p, …)`	`solve(p, …)` (`torch.autograd.Function` + KKT backward)
`solve_with_warm(p, …, warm_start=)`	`solve_with_warm(…)` (dual triple + barrier-μ, pounce#86)
`vmap_solve` / `vmap_solve_parallel`	`vmap_solve` / `vmap_solve_parallel`
`JaxProblem(…)`	`TorchProblem(…)` (build-once, factor-reuse backward)
`solve_qp` / `solve_qp_batch` / `solve_socp` / `QpLayer`	same names
`PathFollower` / `inverse_map_rhs`	same names

import torch
torch.set_default_dtype(torch.float64)
from pounce.torch import solve as psolve

def f(x, p): return torch.sum((x - p) ** 2)
def g(x, p): return torch.stack([x[0] + x[1] - 1.0])   # equality

p = torch.tensor([0.3, 0.7], requires_grad=True)
x_star = psolve(
    p, f=f, g=g, x0=torch.zeros(2), n=2, m=1,
    lb=torch.full((2,), -10.0), ub=torch.full((2,), 10.0),
    cl=torch.zeros(1), cu=torch.zeros(1),
    options={"tol": 1e-10, "print_level": 0},
)
(x_star ** 2).sum().backward()   # dL/dp via the implicit function theorem
print(p.grad)

The differentiable conic layers are feasible-by-construction (the same “one roof” as cvxpylayers/theseus, off one core):

from pounce.torch import solve_qp
P = torch.eye(2); c = torch.tensor([-4.0, -4.0], requires_grad=True)
G = torch.tensor([[1.0, 1.0]]); h = torch.tensor([0.5])
x = solve_qp(P=P, c=c, G=G, h=h)   # min ½xᵀPx+cᵀx s.t. Gx ≤ h
x.sum().backward()                  # OptNet implicit-diff gradients

Validation. Every layer is checked with torch.autograd.gradcheck against finite differences, and a JAX↔Torch parity suite asserts both frontends agree on x* and dL/dp to tolerance on shared fixtures (python/tests/test_torch.py, test_qp_torch.py, test_socp_torch.py, test_parity_jax_torch.py).

Thread-safety note. torch.func transforms share a process-global layer stack and are not thread-safe; vmap_solve_parallel therefore serializes the (already GIL-bound) Python derivative callbacks with a lock while the Rust IPM linear algebra still runs concurrently (GIL released). Double-backward is supported on the conic layers but not guaranteed on the NLP implicit-diff path (the parameter sensitivities are taken with torch.func, outside the autograd graph) — set factor_reuse=False on TorchProblem for the in-framework dense backward if you need higher-order behaviour.

Notebooks

The notebooks under python/notebooks/ work through getting started, JAX autodiff, implicit differentiation, sensitivity analysis, the Pyomo integration, NLP scaling (set_problem_scaling + nlp_scaling_method=user-scaling), and FBBT (nonlinear bound tightening via presolve_fbbt=yes on Pyomo models).

Curve Fitting

pounce.curve_fit fits a model f(x, *params) to data — the same call shape as scipy.optimize.curve_fit — but returns a much richer result and adds capabilities scipy’s fitter does not have. It runs on pounce’s interior-point solver, so it inherits parameter constraints, and because the solver keeps its converged factorization it can hand back the parameter covariance (from the reduced Hessian) and the data sensitivity ∂params/∂data essentially for free.

import numpy as np
import jax.numpy as jnp
import pounce

def model(x, a, b, c):
    return a * jnp.exp(-b * x) + c       # write the model with jax.numpy

x = np.linspace(0.2, 5, 40)
y = 3.0 * np.exp(-0.9 * x) + 0.5 + 0.05 * np.random.default_rng(0).normal(size=x.size)

res = pounce.curve_fit(model, x, y, p0=[1, 1, 0])
print(res.summary())
res.popt          # fitted parameters
res.pcov          # covariance matrix
res.perr          # standard errors  = sqrt(diag(pcov))
res.ci            # (n, 2) confidence intervals at `alpha`

How it differs from `scipy.optimize.curve_fit`

	scipy.curve_fit	pounce.curve_fit
Least-squares fit + `pcov`	✅	✅
Weighted (`sigma`, `absolute_sigma`)	✅	✅
Box bounds on parameters	✅	✅
Relations between parameters (e.g. `a + b ≤ 1`)	❌	✅
Robust losses with covariance	partial	✅ (sandwich)
Confidence intervals / goodness-of-fit in the result	❌	✅
Data sensitivity `∂params/∂data`	❌	✅
Exact derivatives via JAX	❌	✅

The statistics follow the same conventions as scipy and pycse.nlinfit: the covariance is s² · (JᵀJ)⁻¹ with s² = SSE/(m − n) (the reduced χ²) unless absolute_sigma=True, and confidence intervals use the Student-t quantile popt ± t_{dof,1−α/2} · perr.
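As a concrete instance of these conventions, the NumPy sketch below computes s², the covariance, and the standard errors for a straight-line fit. It is an illustration on synthetic data and does not call pounce; the model and the noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(x.size)

# Linear model y = a*x + b, so the Jacobian w.r.t. (a, b) is constant.
J = np.column_stack([x, np.ones_like(x)])
popt, *_ = np.linalg.lstsq(J, y, rcond=None)

m, n = J.shape                         # m data points, n parameters
residuals = y - J @ popt
s2 = residuals @ residuals / (m - n)   # s^2 = SSE / (m - n)
pcov = s2 * np.linalg.inv(J.T @ J)     # s^2 * (J^T J)^{-1}
perr = np.sqrt(np.diag(pcov))          # standard errors

print(popt, perr)
```

A confidence interval then multiplies `perr` by the Student-t quantile with m − n degrees of freedom, for example `scipy.stats.t.ppf(1 - alpha / 2, m - n)` if SciPy is available.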

Derivatives: prefer JAX

Accurate derivatives are what make the covariance and sensitivity sharp — and they let the solver converge in a couple of iterations so the pounce-native factor route is available. The Jacobian ∂f/∂p is resolved in this order:

an analytic jac=<callable> returning (len(x), n_params),
JAX autodiff (the default when the model is written with jax.numpy),
a finite-difference fallback (used only if neither of the above applies; it emits a warning and the covariance falls back to the Jacobian form).

res = pounce.curve_fit(model, x, y, p0=[1, 1, 0])           # JAX (model uses jnp)
res = pounce.curve_fit(model, x, y, p0=[1, 1, 0], jac=myjac) # analytic
res = pounce.curve_fit(model_np, x, y, p0=[1, 1, 0])         # numpy model -> FD (warns)

Loss functions

Only smooth (C²) losses are supported, because the underlying solver is an interior-point method. Non-smooth L1/MAE is intentionally out of scope; use a robust loss instead.

`loss`	use
`"sse"` (default), `"chi2"`	ordinary / weighted least squares
`"soft_l1"` = `"huber"`	smooth pseudo-Huber, downweights outliers
`"cauchy"`	strong outlier rejection

"huber" and "soft_l1" are the same smooth (C²) pseudo-Huber loss: a true piecewise Huber is only C¹ (its curvature jumps at the knee), which the interior-point solver can’t use, so both names map to the C² form.

res = pounce.curve_fit(model, x, y, p0=[1, 1, 0], loss="huber", f_scale=0.1)
res.cov_source        # "sandwich"  (robust covariance estimator)
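The smoothness difference between the two losses is easy to see numerically. The sketch below uses the textbook pseudo-Huber form δ²(√(1 + (r/δ)²) − 1); this form and its scaling are an assumption for illustration, since the exact expression pounce uses for `"soft_l1"` is not given here.

```python
import numpy as np

def huber_curvature(r, delta):
    # Second derivative of the piecewise Huber loss: 1 inside the knee, 0 outside.
    return np.where(np.abs(r) < delta, 1.0, 0.0)

def pseudo_huber(r, delta):
    # delta^2 * (sqrt(1 + (r/delta)^2) - 1): ~ r^2/2 near 0, ~ delta*|r| far away.
    return delta**2 * (np.sqrt(1.0 + (r / delta) ** 2) - 1.0)

def pseudo_huber_curvature(r, delta):
    # Analytic second derivative: (1 + (r/delta)^2)^(-3/2), continuous everywhere.
    return (1.0 + (r / delta) ** 2) ** -1.5

delta = 0.1
eps = 1e-6
jump_huber = huber_curvature(delta - eps, delta) - huber_curvature(delta + eps, delta)
jump_pseudo = (pseudo_huber_curvature(delta - eps, delta)
               - pseudo_huber_curvature(delta + eps, delta))
print(float(jump_huber), float(jump_pseudo))
```

Across the knee the Huber curvature drops by a full unit, while the pseudo-Huber curvature changes only by an amount proportional to the step, which is what a Hessian-based solver needs.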

Parameter constraints

Box bounds express positivity / negativity / ranges; constraints= expresses relations between parameters using the scipy-style dict format.

# positivity, ranges
pounce.curve_fit(model, x, y, p0=[1, 1, 0.2],
                 bounds=[(0, np.inf), (None, None), (0, 1)])

# a relation: require a + b <= 1   (ineq g(p) >= 0)
cons = [{"type": "ineq", "fun": lambda p: 1.0 - (p[0] + p[1])}]
pounce.curve_fit(model, x, y, p0=[0.4, 0.4, 0], constraints=cons)

When a bound or constraint is active at the optimum, the covariance is projected onto the active-constraint nullspace (pounce’s reduced Hessian does exactly this), and the affected parameter is flagged in res.active_mask with an effectively degenerate confidence interval. res.cov_source reports "reduced_hessian(projected)" in that case.

Data sensitivity: `∂params/∂data`

Pass sensitivity=True to get res.dpopt_ddata, an (n_params, n_data) matrix whose entry [j, i] is how fitted parameter j moves when data point y_i is perturbed. This is the implicit-function-theorem influence ∂p*/∂y_i = 2 wᵢ² · H_S⁻¹ gᵢ, computed as a single batched back-solve against the converged factor (Solver.kkt_solve_many).

res = pounce.curve_fit(model, x, y, p0=[1, 1, 0], sensitivity=True)
db = res.dpopt_ddata[1]              # sensitivity of parameter b
i = int(np.abs(db).argmax())         # most influential point for b
print("most influential x:", x[i])
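For an unweighted, unconstrained linear model the influence formula can be verified directly. The NumPy sketch below is an illustration under those assumptions (unit weights, H taken as the Hessian 2JᵀJ of the SSE objective) and does not use pounce; `fit` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 12)
y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(x.size)

J = np.column_stack([x, np.ones_like(x)])   # rows are g_i = d f(x_i) / d p
H = 2.0 * J.T @ J                           # Hessian of the SSE objective

def fit(y_data):
    return np.linalg.lstsq(J, y_data, rcond=None)[0]

# Influence matrix, shape (n_params, n_data): column i is 2 * H^{-1} g_i (unit weights).
dp_dy = 2.0 * np.linalg.solve(H, J.T)

# Finite-difference check on one data point.
i, h = 5, 1e-6
y_pert = y.copy()
y_pert[i] += h
fd = (fit(y_pert) - fit(y)) / h
print(np.allclose(dp_dy[:, i], fd, atol=1e-6))
```

All columns come out of one multi-right-hand-side solve against the same matrix, which mirrors the single batched back-solve described above.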

The result object

CurveFitResult carries everything in one place and supports dict-style access (res["popt"]).

field	meaning
`popt`, `pcov`, `perr`, `ci`	parameters, covariance, std errors, confidence intervals
`correlation`	normalized covariance
`residuals`, `sse`, `rmse`, `mae`	fit residuals and error norms
`r_squared`, `adj_r_squared`	coefficient(s) of determination
`chi_square`, `reduced_chi_square`, `dof`	χ² statistics and degrees of freedom
`param_names`	parameter names inferred from the model signature
`active_mask`	which parameters sit on a bound
`cov_source`	how the covariance was computed
`dpopt_ddata`	data sensitivity (if requested)
`optimize_result`	the raw solver info dict

Methods: res.predict(xnew), res.confidence_band(...) (see below), and res.summary() (a formatted report).

Confidence vs prediction bands

res.confidence_band(x, kind=..., sigma=...) returns (yhat, lower, upper), but there are two different bands and they answer different questions.

Confidence band (kind="confidence", the default) — uncertainty in the fitted curve itself, i.e. where the true mean E[y | x] lies. Its variance is gᵀ Σ g (delta method, g = ∂f/∂p, Σ = pcov). It is narrow, it shrinks toward zero as you collect more data, and most data points fall outside it — that is correct, not a miscalibration.
Prediction band (kind="prediction") — uncertainty in a new observation y = f(x) + ε. It adds the observation-noise variance: gᵀ Σ g + σ²(x). This is the band that contains about 1 − alpha of the data; it does not shrink to zero, it floors at the noise level.

Both use the Student-t quantile t_{dof, 1−α/2} (not the normal z), so the degrees of freedom are accounted for.

yhat, lo, hi = res.confidence_band(xx)                       # band on the curve
yhat, lo, hi = res.confidence_band(xx, kind="prediction")    # band on new data

For the prediction band the noise level σ(x) is taken from the fit: the sigma weights you supplied, scaled by the fitted variance s² (so a heteroscedastic fit gives a heteroscedastic band — wider where the noise is larger), or the homoscedastic level √s² if the fit was unweighted. Pass an explicit sigma= (scalar or array over x) to override it, e.g. for new x where you know the measurement noise.

Rule of thumb: use the confidence band to show how well the model is pinned down; use the prediction band to show where the next measurement will land. If “~95% of my points should be inside,” you want the prediction band.
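The two variance formulas differ by a single term, as the following sketch shows at one query point. All numbers are illustrative assumptions, `band` is a hypothetical helper, and the value 2.0 stands in for the Student-t quantile.

```python
import numpy as np

def band(g, pcov, tq, sigma2=0.0):
    """Half-width of a band at one x: tq * sqrt(g^T pcov g + sigma2)."""
    return tq * np.sqrt(g @ pcov @ g + sigma2)

# Illustrative numbers only (not from a real fit).
pcov = np.array([[0.04, -0.01],
                 [-0.01, 0.02]])       # parameter covariance
g = np.array([1.0, 0.5])               # model gradient df/dp at the query x
s2 = 0.09                              # noise variance of one observation
tq = 2.0                               # stand-in for the Student-t quantile

conf_half = band(g, pcov, tq)               # confidence band: curve uncertainty only
pred_half = band(g, pcov, tq, sigma2=s2)    # prediction band: adds observation noise
print(conf_half, pred_half)
```

As the parameter covariance shrinks with more data, the confidence half-width goes to zero while the prediction half-width approaches tq·√s2, the noise floor.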

Out-of-core data: `curve_fit_streaming`

When the dataset is too large to hold in memory, pounce.curve_fit_streaming fits exactly the same model and objective as curve_fit, but reads the data in mini-batches instead of as in-memory arrays. The solver’s objective, gradient, and Gauss-Newton Hessian are all additive sums over data points, so streaming and accumulating them produces the identical fit — only one batch (plus an n_params × n_params matrix) is ever resident.

Instead of xdata, ydata you pass a data_source: a zero-argument callable (a factory) that returns a fresh iterator of (x_batch, y_batch) — or (x_batch, y_batch, sigma_batch) — tuples. It is called once per solver pass, so it must yield the full dataset every time (re-open the file, re-slice the mmap, …); a one-shot iterator is rejected.

import numpy as np
import pounce

# 50M points living on disk — re-read in 100k-row batches each pass
x_mm = np.load("x.npy", mmap_mode="r")
y_mm = np.load("y.npy", mmap_mode="r")
BATCH = 100_000

def data_source():                       # fresh iterator every call
    for i in range(0, x_mm.shape[0], BATCH):
        yield x_mm[i : i + BATCH], y_mm[i : i + BATCH]

res = pounce.curve_fit_streaming(model, data_source, p0=[1, 1, 0])
print(res.summary())
res.popt, res.pcov, res.perr             # identical to the in-memory fit

Notes and trade-offs:

Re-readable, not one-shot. Each solver iteration (~10–50) makes one pass over data_source, so it must replay the whole dataset on every call. Uniform batch sizes avoid an extra JAX retrace on a smaller final batch.
Provide p0. The data-driven seed that curve_fit uses needs a full in-memory pass, so give a starting vector. If you omit p0, the seed falls back to ones clipped into the bounds; in that case pass n_params= when the model signature doesn’t name the parameters.
What you get back is the same — all scalar diagnostics (SSE, χ², R², dof) and the full covariance / standard errors / confidence intervals are computed and are bit-for-bit the in-memory result. Everything else carries over too: weighted fits (sigma batches), robust loss (the sandwich covariance is accumulated over batches), bounds, and constraints (active sets project the covariance exactly as in the in-memory fit).
What is omitted — the two O(n_data) outputs are not returned: res.residuals and the data sensitivity res.dpopt_ddata are both None (they are the size of the data and would defeat the purpose). confidence_band still works for new x, but uses a homoscedastic noise level since the per-point sigma is not retained.
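
The additivity that makes streaming possible can be checked directly. This is a small NumPy sketch, assuming a hypothetical exponential model and hand-written Jacobian (neither is part of pounce), that accumulates the sum of squares, Jᵀr, and the Gauss-Newton matrix JᵀJ batch by batch and compares them with a single in-memory pass.

```python
import numpy as np

def model(x, p):
    return p[0] * np.exp(-p[1] * x) + p[2]

def jacobian(x, p):
    e = np.exp(-p[1] * x)
    return np.column_stack([e, -p[0] * x * e, np.ones_like(x)])

def accumulate(batches, p):
    """One pass over (x, y) batches: sum SSE, J^T r and J^T J."""
    n = len(p)
    sse, grad, gn = 0.0, np.zeros(n), np.zeros((n, n))
    for xb, yb in batches:
        r = model(xb, p) - yb          # residuals of this batch only
        J = jacobian(xb, p)
        sse += r @ r
        grad += J.T @ r
        gn += J.T @ J                  # n_params x n_params, stays small
    return sse, grad, gn

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 1000)
y = model(x, np.array([2.0, 1.3, 0.5])) + 0.01 * rng.standard_normal(x.size)
p_trial = np.array([1.0, 1.0, 0.0])

def data_source(batch=128):            # fresh iterator on every call
    for i in range(0, x.size, batch):
        yield x[i:i + batch], y[i:i + batch]

full = accumulate([(x, y)], p_trial)          # whole dataset at once
streamed = accumulate(data_source(), p_trial) # same sums, batch by batch
```

The two tuples agree to floating-point accuracy, and only one batch plus the 3 × 3 matrix is held at a time.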

Multiple parameter sets: `curve_fit_minima`

Nonlinear least squares is generally non-convex, so the objective curve_fit minimizes can have several local minima — distinct parameter sets that each explain the data (peak-assignment ambiguity, frequency aliasing in sinusoids, amplitude/decay trade-offs in sums of exponentials, sign/label symmetry, …). pounce.curve_fit_minima drives find_minima over exactly the same objective — same sigma weighting, robust loss, f_scale, constraints, and resolved Jacobian — to enumerate those minima, then refines each into a full CurveFitResult:

fits = pounce.curve_fit_minima(
    model, x, y,
    bounds=[(0, 3), (-10, 10), (0.1, 2.5)],  # finite bounds = the search box
    method="multistart",   # or "deflation" | "flooding" | "mlsl" | ...
    n_minima=5,
    seed=0,
)

for r in fits:               # ranked best (lowest SSE) first
    print(r.popt, r.sse, r.r_squared)
fits[0].summary()            # each is a full CurveFitResult

It reuses everything curve_fit does: the data-driven seed becomes the search’s starting point, the model Jacobian is reused as the search gradient and the Gauss-Newton matrix as the search Hessian — which sharpens the basin escapes and lets find_minima certify each point as a true minimum (rejecting saddles) before recording it. The returned list is ranked by SSE and may contain fewer than n_minima entries when the landscape has fewer minima.

Finite bounds are strongly recommended — they define the box the search samples / repels within. With the default unbounded box the search degrades to jittered restarts around the seed. The method, n_minima, max_solves, patience, dedup, and seed arguments pass straight through to find_minima; see Finding Multiple Minima and Choosing a Method.

See python/examples/curve_fit_demo.py and the 22_curve_fit.ipynb and 23_curve_fit_minima.ipynb notebooks for complete, runnable walkthroughs.

Boundary Value Problems

pounce.bvp.solve_bvp solves two-point boundary value problems

dy/dx = f(x, y, p),    a ≤ x ≤ b
bc(y(a), y(b), p) = 0

as a drop-in for scipy.integrate.solve_bvp. It discretises the problem with the 4th-order Lobatto IIIA (Hermite–Simpson) collocation scheme, the same one SciPy uses, and solves the resulting square root-find either with a sparse Newton iteration (the default) or as a pounce feasibility NLP (min 0 subject to the collocation residual R(z) = 0).

The motivation is differentiability: because the discretised problem is an NLP, the converged solution z*(θ) is differentiable with respect to any parameter θ baked into f or bc, via the implicit-function theorem on the collocation KKT system. The differentiable entry points live in the autodiff frontends, pounce.jax.solve_bvp and pounce.torch.solve_bvp.

A runnable tour of every feature is in python/notebooks/24_boundary_value_problems.ipynb, and a SciPy speed/accuracy comparison in python/examples/bvp_scipy_compare.py (plus the GLC tritium-column case in python/examples/glc_feral_vs_scipy.py). The GLC problem was suggested by Milan Rother and is adapted from pathsim-chem (MIT License).

Drop-in NumPy solve

import numpy as np
import pounce

# y'' = -|y|, y(0) = 0, y(4) = -2
def fun(x, y):
    return np.vstack((y[1], -np.abs(y[0])))

def bc(ya, yb):
    return np.array([ya[0], yb[0] + 2.0])

x = np.linspace(0, 4, 41)
y0 = np.zeros((2, x.size)); y0[0] = 1.0

res = pounce.solve_bvp(fun, bc, x, y0)
print(res.success, res.rms_residuals.max())
res.sol(np.linspace(0, 4, 9))   # cubic-Hermite interpolant, shape (n, 9)

The call signature and the returned bunch (sol, x, y, yp, p, rms_residuals, niter, status, message, success) match SciPy, so existing code consumes the result unchanged. Unknown parameters work the same way — pass p=[...] and a fun(x, y, p) / bc(ya, yb, p):

# Eigenvalue: y'' + k² y = 0, y(0)=y(1)=0, y'(0)=k
def fun(x, y, p): return np.vstack((y[1], -p[0]**2 * y[0]))
def bc(ya, yb, p): return np.array([ya[0], yb[0], ya[1] - p[0]])

res = pounce.solve_bvp(fun, bc, x, y0, p=[3.0])
res.p           # ≈ [π]

Differences from SciPy

Mesh. adaptive=True (default, like SciPy) refines the mesh to meet tol. adaptive=False solves the mesh you pass as-is — fast and predictable, and the mode the differentiable frontends use internally (a fixed mesh keeps θ ↦ y smooth).
verbose mirrors SciPy: 1 prints a one-line termination report, 2 also prints per-iteration mesh-refinement progress.
Solver (method). method="newton" (default) runs a modified (frozen-Jacobian) Newton iteration on the square collocation system, factorising the N×N Jacobian with FERAL’s unsymmetric sparse LU (pounce._pounce.SparseLU) and reusing that factor across steps, refactoring only when progress stalls (the same trick SciPy’s solve_newton uses). Both solvers scale linearly in the mesh size; at equal mesh pounce is typically as fast as or faster than SciPy (≈0.6–1.0× SciPy’s wall-clock time), including on large nonlinear problems, because the factorisation dominates and pounce does far fewer of them. The Jacobian is the exact sparse collocation Jacobian: analytic per-node ∂f/∂y blocks from fun_jac/bc_jac if supplied, else a vectorised finite difference that perturbs each state across the whole mesh (O(n) fun calls, not O(n·m)). method="ipm" instead poses the system as a pounce feasibility NLP and solves it with the interior-point method, factoring the 2N saddle KKT system each iteration; this is slower, but it is the basis for the constrained solver below. Accuracy is identical to SciPy either way.
Singular term S is not yet supported.
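
The "O(n) fun calls" remark relies on fun being vectorised over the mesh: the right-hand side at node i depends only on the state at node i, so perturbing one state component at every node simultaneously yields one column of every per-node block. A minimal NumPy sketch of that idea follows; the helper fd_state_jacobian and its step-size rule are illustrative assumptions, not pounce's code.

```python
import numpy as np

def fd_state_jacobian(fun, x, y, eps=1e-7):
    """Per-node df/dy blocks, shape (n, n, m), from n + 1 vectorised fun calls."""
    n, m = y.shape
    f0 = fun(x, y)
    dfdy = np.empty((n, n, m))
    for j in range(n):                       # one call per state, not per node
        y_pert = y.copy()
        h = eps * (1.0 + np.abs(y[j]))       # per-node step for state j
        y_pert[j] += h
        dfdy[:, j, :] = (fun(x, y_pert) - f0) / h
    return dfdy

# Bratu-type right-hand side: y0' = y1, y1' = -exp(y0)
def fun(x, y):
    return np.vstack((y[1], -np.exp(y[0])))

x = np.linspace(0.0, 1.0, 11)
y = np.vstack((np.sin(x), np.cos(x)))        # arbitrary trial state
blocks = fd_state_jacobian(fun, x, y)

# Analytic blocks for comparison
exact = np.zeros((2, 2, x.size))
exact[0, 1] = 1.0
exact[1, 0] = -np.exp(y[0])
```

Here n = 2 states and m = 11 nodes, so the loop makes two perturbed calls in addition to the baseline rather than 2 × 11.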

Differentiable solves (JAX / PyTorch)

The differentiable frontends take fun(x, y, p, theta) / bc(ya, yb, p, theta) (drop p when there are no unknown parameters), where theta is the autodiff knob, and return a solution whose y / p participate in the autodiff graph. Everything fun / bc close over is differentiable: a physical coefficient, a boundary value, or the sensitivity of a solved-for unknown parameter.

import jax, jax.numpy as jnp
import pounce.jax as pj

# Bratu: y'' + λ e^y = 0, y(0)=y(1)=0
def fun(x, y, lam): return jnp.vstack((y[1], -lam * jnp.exp(y[0])))
def bc(ya, yb, lam): return jnp.array([ya[0], yb[0]])

x = jnp.linspace(0, 1, 51)
y0 = jnp.zeros((2, x.size))

def y_mid(lam):
    sol = pj.solve_bvp(fun, bc, x, y0, theta=lam)
    return sol.y[0, sol.y.shape[1] // 2]

grad = jax.grad(y_mid)(1.0)          # d y(0.5) / d λ
J = jax.jacobian(lambda l: pj.solve_bvp(fun, bc, x, y0, theta=l).y[0])(1.0)

The PyTorch frontend mirrors this exactly:

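import torch
import pounce.torch as pt
torch.set_default_dtype(torch.float64)

# Same Bratu problem, with fun/bc written with torch ops
def fun(x, y, lam): return torch.vstack((y[1], -lam * torch.exp(y[0])))
def bc(ya, yb, lam): return torch.stack((ya[0], yb[0]))

x = torch.linspace(0, 1, 51)
y0 = torch.zeros((2, x.numel()))

lam = torch.tensor(1.0, dtype=torch.float64, requires_grad=True)
sol = pt.solve_bvp(fun, bc, x, y0, theta=lam)
sol.y[0, 25].backward()        # y at the middle node
lam.grad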

What’s differentiable

Target	How	Demo
ODE/BC coefficient `θ`	`jax.grad` / `.backward()` through `sol.y`	`examples/bvp_scipy_compare.py` (a)
Boundary value	put it in `bc` and differentiate `θ`	(b)
Solved-for unknown `p*`	differentiate `sol.p`	(c)
Full solution `dy/dθ`	`jax.jacobian` over `sol.y`	(d)
Vector `θ`	one reverse pass	(e)
Second derivative / Hessian	`second_order=True`	(f)

All of these are validated against finite differences to ~1e-11 in python/examples/bvp_scipy_compare.py, which also benchmarks accuracy and speed against SciPy.

Differentiable solver backends (`method`)

The differentiable solve_bvp (both pounce.jax and pounce.torch) takes the same method switch:

method="newton" (default) — the fast path. Forward is the FERAL sparse-LU Newton solve; the backward is the implicit-function-theorem VJP dz/dθ = −R_z⁻¹ R_θ, solving R_zᵀu = v with the same sparse LU (SparseLU.solve_transpose). Both directions stay on the N system — no 2N saddle — so it is fast and differentiable. First-order only (the forward is an opaque callback).
method="ipm" — routes the forward through pounce.jax.solve / pounce.torch.solve (the interior-point feasibility NLP). Needed for second-order derivatives (below).
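
The implicit-function-theorem backward can be illustrated on a tiny square system. The sketch below uses a made-up two-equation residual R(z, θ) and dense NumPy solves in place of the sparse LU; it shows the forward sensitivity, the adjoint (VJP) form with the transposed solve, and a finite-difference check.

```python
import numpy as np

def R(z, theta):
    """Made-up square residual R(z, theta) = 0 defining z*(theta)."""
    return np.array([z[0] ** 3 + z[1] - theta,
                     z[0] - 2.0 * z[1] + theta ** 2])

def R_z(z, theta):
    return np.array([[3.0 * z[0] ** 2, 1.0],
                     [1.0, -2.0]])

def R_theta(z, theta):
    return np.array([-1.0, 2.0 * theta])

def solve(theta, z=None, iters=50):
    z = np.zeros(2) if z is None else z.copy()
    for _ in range(iters):                       # plain Newton on R(z) = 0
        z = z - np.linalg.solve(R_z(z, theta), R(z, theta))
    return z

theta = 1.0
z_star = solve(theta)

# Forward sensitivity: dz/dtheta = -R_z^{-1} R_theta
dz = -np.linalg.solve(R_z(z_star, theta), R_theta(z_star, theta))

# Adjoint form for a scalar loss L = v . z: solve R_z^T u = v, then dL = -u . R_theta
v = np.array([1.0, 0.0])
u = np.linalg.solve(R_z(z_star, theta).T, v)
dL = -u @ R_theta(z_star, theta)

# Central finite difference, warm-started from z_star
h = 1e-6
fd = (solve(theta + h, z_star) - solve(theta - h, z_star)) / (2 * h)
```

The adjoint form needs one transposed solve per output cotangent regardless of how many components θ has, which is why a reverse pass handles vector θ cheaply.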

Second-order derivatives

With method="ipm", pass second_order=True to wrap the solve in a custom_jvp whose tangent rule re-applies the implicit-function theorem to the square collocation root-find,

dz/dθ = -(∂R/∂z)⁻¹ (∂R/∂θ),

and recovers z* through the same custom-ruled primitive, so JAX recurses to arbitrary order:

def y_mid(lam):
    sol = pj.solve_bvp(fun, bc, x, y0, theta=lam,
                       method="ipm", second_order=True)
    return sol.y[0, sol.y.shape[1] // 2]

jax.grad(jax.grad(y_mid))(1.0)     # d²y(0.5)/dλ²  — works

The cost is one extra forward solve per differentiation level (the rule re-solves to recover z*); the opaque forward is still only evaluated for primal values. Leave it off for plain gradient-based training; turn it on for Hessians / Newton-type outer loops.

Adaptive mesh refinement

Adaptive refinement is on by default (like SciPy), driven by tol / max_nodes. Pass adaptive=False to solve the given mesh as-is:

res = pounce.solve_bvp(fun, bc, x, y0, tol=1e-6, max_nodes=2000)  # adaptive
res = pounce.solve_bvp(fun, bc, x, y0, adaptive=False)            # fixed mesh

Each round: solve on the current mesh (to round-off), estimate the relative RMS residual of the continuous solution per interval with a 5-point Lobatto quadrature (which only needs the residual at the two interior Lobatto nodes x_mid ± ½h√(3/7), because the collocation residual vanishes at the mesh nodes and the midpoint by construction), insert nodes where it exceeds tol (one node, or two if it is more than 100× over), and re-solve warm-started from the previous solution. This is a faithful port of SciPy’s estimator and refinement rule, so it reproduces SciPy’s mesh sequence essentially node-for-node:

problem	SciPy nodes	pounce nodes	solution agreement
`y''+y=0`	6 → 31	6 → 31	1e-16
Bratu	5 → 29	5 → 29	6e-17
`y''=-|y|` (kink)	11 → 58

Adaptive is numpy-only — the differentiable pounce.jax / pounce.torch paths are always fixed-mesh, because a parameter-dependent mesh would make y(θ) nonsmooth and break the gradients. Pick a fixed mesh fine enough for your θ range, or run an adaptive solve once to size it.
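
The node-insertion rule described above can be sketched in a few lines of NumPy. The function refine_mesh and the residual values are illustrative; placing the two extra nodes at the thirds of the interval follows SciPy's rule and is assumed here to carry over to pounce.

```python
import numpy as np

def refine_mesh(x, rms_res, tol):
    """Insert 1 node where tol < r < 100*tol, 2 nodes where r >= 100*tol."""
    one = np.nonzero((rms_res > tol) & (rms_res < 100 * tol))[0]
    two = np.nonzero(rms_res >= 100 * tol)[0]
    new = np.concatenate([
        0.5 * (x[one] + x[one + 1]),          # midpoint of a mildly bad interval
        (2 * x[two] + x[two + 1]) / 3,        # thirds of a badly resolved interval
        (x[two] + 2 * x[two + 1]) / 3,
    ])
    return np.sort(np.concatenate([x, new]))

x = np.linspace(0.0, 1.0, 5)                  # 4 intervals
rms_res = np.array([1e-8, 5e-6, 3e-3, 1e-7])  # made-up residual estimates
x_new = refine_mesh(x, rms_res, tol=1e-6)     # 5 + 1 + 2 = 8 nodes
```

Intervals that already meet tol are left alone, so refinement concentrates nodes where the solution is hard to resolve (for example around the kink in the third test problem).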

Constrained / optimal-control BVPs (pounce-unique)

pounce.solve_bvp_constrained solves a collocation BVP subject to bounds on the states/parameters and inequality path constraints, optionally minimising an objective:

dy/dx = f(x, y, p),  bc(y(a), y(b), p) = 0
ylo <= y(x) <= yhi              (state bounds, every node)
clo <= c(x, y, p) <= chi        (path constraints, every node)
minimise  J(Y, p)               (optional)

This is a genuine NLP, so it goes through pounce’s interior-point method (not the Newton path), and SciPy’s solve_bvp cannot express any of it. A fully determined BVP (n + k boundary residuals) leaves no degrees of freedom, so constraints only bite when there is freedom: return fewer boundary residuals and let the objective resolve the remainder (an optimal-control collocation):

import numpy as np, pounce

# minimise ∫(y-1)² s.t. y''=0, y(0)=0  (slope free) — optimal control.
def fun(x, y): return np.vstack((y[1], np.zeros_like(y[0])))
def bc(ya, yb): return np.array([ya[0]])          # one boundary residual → 1 DOF
x = np.linspace(0, 1, 41); y0 = np.zeros((2, x.size)); y0[0] = x

obj = lambda Y, p: np.trapezoid((Y[0] - 1.0) ** 2, x)

r  = pounce.solve_bvp_constrained(fun, bc, x, y0, objective=obj)        # y(1) ≈ 1.5
rc = pounce.solve_bvp_constrained(fun, bc, x, y0, objective=obj,
                                  y_bounds=([-np.inf, -np.inf], [1.2, np.inf]))
rc.y[0].max()    # ≤ 1.2 — the bound is active and respected

path=path(x, Y, p) -> (q, m) with path_bounds=(clo, chi) adds inequality path constraints at every node (assembled with a sparse block-diagonal Jacobian). The objective’s gradient is finite-differenced; the Lagrangian Hessian uses pounce’s limited-memory quasi-Newton (the path constraints make it nonzero in general).

How it works

For a mesh x₀ < … < x_{m-1}, each interval contributes the Hermite–Simpson collocation residual

y_mid = (y_i + y_{i+1})/2 - h/8 (f_{i+1} - f_i)
r_i   = y_{i+1} - y_i - h/6 (f_i + 4 f(x_mid, y_mid) + f_{i+1})  = 0

Stacking the n·(m-1) collocation residuals with the n + k boundary residuals gives a square system R(z) = 0 in the unknowns z = [vec(Y); p] of size N = n·m + k. pounce solves it as min 0 s.t. R(z) = 0. At the solution the interior-point method holds the KKT factor of [[H, Jᵀ], [J, 0]] with J = ∂R/∂z; for this all-equality, no-bounds, zero-objective problem the generic implicit-diff backward collapses to the Newton sensitivity

dz*/dθ = -(∂R/∂z)⁻¹ (∂R/∂θ),

which is exactly what jax.grad / autograd return — no BVP-specific backward code. The collocation residual itself is shared verbatim across the NumPy, JAX, and PyTorch paths (pounce/bvp/_core.py).
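
As a concrete check of the residual formula, here is a NumPy sketch that evaluates r_i on the exact solution of y'' = -y and confirms the residual shrinks at the expected rate when the mesh is halved. It is written from the two formulas above and is not the code in pounce/bvp/_core.py; the function names are illustrative.

```python
import numpy as np

def collocation_residual(fun, x, Y):
    """Hermite-Simpson residuals r_i for every interval, shape (n, m-1)."""
    h = np.diff(x)
    f = fun(x, Y)
    x_mid = x[:-1] + 0.5 * h
    y_mid = 0.5 * (Y[:, :-1] + Y[:, 1:]) - h / 8 * (f[:, 1:] - f[:, :-1])
    f_mid = fun(x_mid, y_mid)
    return Y[:, 1:] - Y[:, :-1] - h / 6 * (f[:, :-1] + 4 * f_mid + f[:, 1:])

# y'' = -y as a first-order system; exact solution y = sin(x), y' = cos(x).
def fun(x, y):
    return np.vstack((y[1], -y[0]))

def max_residual(m):
    x = np.linspace(0.0, 1.0, m)
    Y = np.vstack((np.sin(x), np.cos(x)))     # exact solution sampled on the mesh
    return np.abs(collocation_residual(fun, x, Y)).max()

r_coarse, r_fine = max_residual(11), max_residual(21)
```

The exact solution does not satisfy the discrete equations exactly; the leftover is the local truncation error, which scales like h⁵ per interval, so halving h should reduce it by roughly a factor of 32.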

ODE / DAE Initial Value Problems

pounce.ode.solve_ivp integrates stiff initial value problems

M y' = f(t, y),    y(t0) = y0

as a drop-in for scipy.integrate.solve_ivp with the implicit Radau method. It implements the 3-stage Radau IIA collocation scheme (order 5, L-stable) — the same method SciPy’s Radau uses, and the classic RADAU5 of Hairer & Wanner. Each step’s coupled stage system is solved by a simplified Newton iteration whose Jacobian is factored with FERAL’s sparse LU.
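
For reference, the properties quoted for the scheme (order 5, L-stable) can be verified numerically from the standard 3-stage Radau IIA Butcher tableau. The coefficients below are the textbook ones from Hairer & Wanner; this sketch only checks them and makes no claim about how pounce stores or uses them.

```python
import numpy as np

s6 = np.sqrt(6.0)
c = np.array([(4 - s6) / 10, (4 + s6) / 10, 1.0])
A = np.array([
    [(88 - 7 * s6) / 360, (296 - 169 * s6) / 1800, (-2 + 3 * s6) / 225],
    [(296 + 169 * s6) / 1800, (88 + 7 * s6) / 360, (-2 - 3 * s6) / 225],
    [(16 - s6) / 36, (16 + s6) / 36, 1 / 9],
])
b = A[-1]                      # stiffly accurate: b equals the last row of A

def stability(z):
    """R(z) = 1 + z b^T (I - zA)^{-1} 1 for the test equation y' = lambda*y."""
    return 1 + z * b @ np.linalg.solve(np.eye(3) - z * A, np.ones(3))

row_sum_ok = np.allclose(A.sum(axis=1), c)                        # consistency
quad_ok = all(np.isclose(b @ c**k, 1 / (k + 1)) for k in range(5))  # exact to degree 4
R_far = abs(stability(-1e8))                                      # tends to 0: L-stability
```

The quadrature conditions holding up to degree 4 reflect the order-5 accuracy, and |R(z)| tending to zero for large negative z is the damping property that makes the method suitable for very stiff problems.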

Two things set it apart from SciPy:

Mass matrix / DAEs. Pass mass=M to integrate M y' = f. When M is singular this is an index-1 differential-algebraic equation — something scipy.integrate.solve_ivp cannot do at all.
Differentiability. pounce.jax.odeint and pounce.torch.odeint integrate on a fixed mesh and return the trajectory differentiably with respect to the ODE parameters and the initial condition, via the implicit-function theorem on the collocation system (no per-step adjoint, no unrolled tape).

solve_ivp only implements method="Radau" — the implicit, stiff/DAE capable method that is pounce’s niche. For non-stiff explicit integration, SciPy or diffrax are the right tools, and solve_ivp raises for those methods rather than silently substituting.

A SciPy speed/accuracy comparison, a DAE example, and a differentiability demo are in python/examples/ode_scipy_compare.py.

Drop-in stiff solve

import numpy as np
import pounce.ode as po

# Van der Pol, mu = 1000 (very stiff)
mu = 1000.0
def f(t, y):
    return [y[1], mu * (1 - y[0]**2) * y[1] - y[0]]

res = po.solve_ivp(f, (0.0, 3000.0), [2.0, 0.0],
                   method="Radau", rtol=1e-6, atol=1e-8, dense_output=True)

print(res.t.shape, res.y.shape)   # (nsteps,) (2, nsteps)
ys = res.sol(np.linspace(0, 3000, 1000))   # continuous extension

The call signature and the returned object match SciPy: res.t, res.y (n, n_points), res.sol (when dense_output=True), res.nfev / res.njev / res.nlu, res.status / res.message / res.success. The result is also dict-subscriptable like SciPy’s Bunch, so res["y"] and "success" in res work too.

Provide an analytic Jacobian with jac=... (else it is estimated by finite differences), and the usual t_eval, args, first_step, max_step, rtol, atol controls.

Index-1 DAE via a mass matrix

A singular mass matrix turns the same solver into a DAE integrator. Robertson kinetics, written with the conservation law as an algebraic constraint:

import numpy as np
import pounce.ode as po

k1, k2, k3 = 0.04, 3e7, 1e4
def f(t, y):
    return [-k1*y[0] + k3*y[1]*y[2],
             k1*y[0] - k3*y[1]*y[2] - k2*y[1]**2,
             y[0] + y[1] + y[2] - 1.0]      # 0 = ...  (algebraic)

M = np.diag([1.0, 1.0, 0.0])                # third equation is algebraic
res = po.solve_ivp(f, (0, 1e4), [1.0, 0.0, 0.0], mass=M,
                   rtol=1e-6, atol=1e-8)

The algebraic constraint is satisfied to round-off at every accepted step.

Differentiable integration (JAX / PyTorch)

For gradient-based work — fitting ODE parameters, neural ODEs, optimal control — use the autodiff frontends. They integrate on a fixed mesh t (make it fine enough to resolve the dynamics) and return the trajectory differentiably w.r.t. the parameters theta and the initial condition y0:

import jax, jax.numpy as jnp
import pounce.jax as pj

def f(t, y, theta):           # dy/dt, JAX-traceable
    k = theta[0]
    return jnp.array([-k * y[0]])

t = jnp.linspace(0.0, 2.0, 81)

def y_final(k):
    sol = pj.odeint(f, jnp.array([1.0]), t, jnp.array([k]))
    return sol.y[0, -1]

val  = y_final(0.7)            # ≈ exp(-0.7 * 2), up to the discretisation error
grad = jax.grad(y_final)(0.7)  # exact d/dk via the implicit-function theorem

The PyTorch mirror is pounce.torch.odeint, with theta/y0 as tensors and .backward() filling theta.grad / y0.grad. Both return a solution whose y is (n, m) in SciPy layout and carries the autodiff graph, and whose sol is a (detached) cubic-Hermite interpolant for plotting.

Under the hood an IVP on a fixed mesh is just a boundary value problem with bc(ya, yb) = ya - y0, so the differentiable path reuses pounce’s Hermite–Simpson collocation and the same FERAL sparse-LU implicit-diff back-solve as pounce.jax.solve_bvp. The result is the collocation solution on the mesh you pass, and its gradients are exact for that discretisation.

Performance

pounce.ode runs the same algorithm as scipy.integrate.solve_ivp(method="Radau") (a faithful RADAU5), so it takes essentially the same number of steps and reaches the same accuracy. The wall-clock difference is implementation overhead: pounce’s stepper is pure Python, whereas SciPy’s inner loop is compiled.

Practical guidance:

Small / few-state stiff systems (state dimension up to roughly 10–20): pounce is at or below SciPy’s wall-clock. There is effectively no speed penalty for a single solve — and you get DAE support and differentiability on top.
Large stiff systems (hundreds of states, e.g. a method-of-lines PDE): pounce is currently ~3–4× slower than SciPy in absolute terms, but still sub-second. That gap matters only when solving such a system many thousands of times in a loop — and if you need the differentiable path, SciPy is not an option at all.

Illustrative single-solve timings (best of 7; relative ratios are stable, the absolute milliseconds are machine-dependent):

problem	states	`pounce.ode`	SciPy `Radau`
Van der Pol, μ=1000, t∈[0, 3000]	2	~100 ms	~105 ms
Brusselator (method-of-lines)	100	~80 ms	~24 ms
Brusselator (method-of-lines)	300	~410 ms	~94 ms

These reflect three optimisations in the stepper, none of which change accuracy or the public API: a RADAU5 stage predictor (warm-start each step’s Newton from the previous step’s collocation polynomial), a wider step-size hold band (reuse the cached factor across more steps), and reusing the LU pattern across refactors (build FERAL’s symbolic analysis once per solve, refactor in place). The last is what makes the large-n cost scale sensibly.

What it is and isn’t

It is a faithful, L-stable Radau IIA(5) implementation that tracks SciPy’s Radau step-for-step on stiff problems and adds DAE and differentiability support SciPy lacks.
It is not a general non-stiff integrator: only method="Radau" is implemented.
Event detection (events=) is supported, matching SciPy: each event is a callable g(t, y) with optional terminal (bool / count) and direction attributes; crossings are root-found on the dense output and returned in t_events / y_events (a terminal event stops with status=1).
The differentiable layer is fixed-mesh (a fixed mesh keeps theta → y smooth); the adaptive solver is the non-differentiable solve_ivp.

Fully-implicit DAEs

pounce.ode.solve_dae integrates a fully-implicit, index-1 differential-algebraic equation

F(t, y, y') = 0

with the same Radau IIA(5) collocation as solve_ivp, written in residual form. This is a pounce extension: scipy.integrate.solve_ivp has no fully-implicit DAE solver. The closest relative, the mass-matrix form M y' = f, is available in pounce via solve_ivp(..., mass=M).

import numpy as np
from pounce.ode import solve_dae

# Robertson kinetics as an index-1 DAE: two rate equations + a conservation law.
k1, k2, k3 = 0.04, 3.0e7, 1.0e4
def F(t, y, yp):
    return np.array([
        yp[0] - (-k1*y[0] + k3*y[1]*y[2]),
        yp[1] - ( k1*y[0] - k3*y[1]*y[2] - k2*y[1]**2),
        y[0] + y[1] + y[2] - 1.0,            # algebraic constraint (no y')
    ])

res = solve_dae(F, (0.0, 1e4), y0=[1.0, 0.0, 0.0], rtol=1e-8, atol=1e-10)
print(res.y[:, -1], res.y[:, -1].sum())     # constraint held to round-off

Consistent initial conditions

A DAE solve needs (y0, y'0) with F(t0, y0, y'0) = 0. By default (consistent="project") solve_dae computes them for you: it detects which variables are algebraic (those whose derivative y' does not appear in F, i.e. a structurally-zero column of ∂F/∂y') and Newton-projects onto the constraint manifold, holding the differential y and algebraic y' fixed and solving for the differential y' and algebraic y (the IDA IDA_YA_YDP_INIT computation). So a rough y0 (even one off the constraint) and yp0=None are fine:

# y0 violates the constraint (sum = 1.5) and no derivative guess is given —
# both are projected to a consistent state before integrating.
solve_dae(F, (0.0, 1e4), y0=[1.0, 0.0, 0.5], yp0=None)

Pass consistent="assume" with an explicit yp0 to skip the projection (you guarantee F(t0, y0, yp0) == 0).
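
The projection can be sketched with dense NumPy on the Robertson residual from above. This is an illustration of the idea only: the helpers fd_jac and project are hypothetical, and detecting algebraic variables from exactly-zero finite-difference columns is an assumption made for the sketch, not necessarily how solve_dae does it.

```python
import numpy as np

k1, k2, k3 = 0.04, 3.0e7, 1.0e4
def F(t, y, yp):
    return np.array([
        yp[0] - (-k1*y[0] + k3*y[1]*y[2]),
        yp[1] - ( k1*y[0] - k3*y[1]*y[2] - k2*y[1]**2),
        y[0] + y[1] + y[2] - 1.0,
    ])

def fd_jac(g, u, eps=1e-7):
    """Forward-difference Jacobian of g at u."""
    g0 = g(u)
    J = np.empty((g0.size, u.size))
    for j in range(u.size):
        du = np.zeros_like(u)
        du[j] = eps * (1.0 + abs(u[j]))
        J[:, j] = (g(u + du) - g0) / du[j]
    return J

def project(F, t0, y0, yp0, iters=20):
    y, yp = np.array(y0, float), np.array(yp0, float)
    # Algebraic variables: columns of dF/dy' that are identically zero.
    alg = np.all(fd_jac(lambda v: F(t0, y, v), yp) == 0.0, axis=0)

    def unpack(u):                 # unknowns: algebraic y and differential y'
        yy, yyp = y.copy(), yp.copy()
        yy[alg], yyp[~alg] = u[alg], u[~alg]
        return yy, yyp

    res = lambda v: F(t0, *unpack(v))
    u = np.where(alg, y, yp)
    for _ in range(iters):         # Newton on the square projection system
        u = u - np.linalg.solve(fd_jac(res, u), res(u))
    return unpack(u)

# Start off the constraint (sum = 1.5) with a zero derivative guess.
y_c, yp_c = project(F, 0.0, [1.0, 0.0, 0.5], [0.0, 0.0, 0.0])
```

The differential components y[0], y[1] are kept as given, the algebraic component y[2] is moved onto the conservation law, and the two rate derivatives are filled in from the first two equations.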

Jacobians

jac(t, y, yp) -> (∂F/∂y, ∂F/∂y') is optional; both blocks are finite-differenced (2n evaluations) when omitted. Supplying them avoids the FD cost and improves robustness on stiff problems.

Scope

Index-1 only. The stage matrix I₃⊗∂F/∂y' + h(A⊗∂F/∂y) stays nonsingular for index-1 problems; higher index needs index reduction (not done here).
Same adaptive Radau engine as solve_ivp — stiff-capable, sparse-LU stage solve, dense output (dense_output=True / t_eval=), args=.
Events are not supported.

Differentiable integration (JAX / PyTorch)

pounce.jax.daeint / pounce.torch.daeint integrate F(t, y, y', theta) = 0 on a fixed mesh and return the node trajectory differentiable w.r.t. the parameters theta and the initial condition y0, via the implicit-function theorem on the collocation system. As with pounce.jax.odeint, the mesh is fixed (keeping the solution map smooth); accuracy is controlled by the mesh. The default scheme is BDF2 (order=2, L-stable, second-order); pass order=1 for backward Euler. F must be framework-traceable.

import jax, jax.numpy as jnp
from pounce.jax import daeint

def F(t, y, yp, theta):                 # y[0]' + theta*y[0] - y[1] = 0 ; y[0] + y[1] = 1
    return jnp.array([yp[0] + theta*y[0] - y[1], y[0] + y[1] - 1.0])

t  = jnp.linspace(0.0, 2.0, 81)
y0 = jnp.array([0.5, 0.5])
loss = lambda th: daeint(F, y0, t, th)[0, -1] ** 2
g = jax.grad(loss)(1.3)                  # exact for the discretisation

The forward solve and the R_yᵀ back-solve run on the host (FERAL sparse LU); the parameter VJP is taken by framework autodiff of the collocation residual at the converged nodes. Gradients are validated against finite differences in the test suite (python/tests/test_dae.py).

Finding Multiple Minima

pounce.minimize finds a single local minimum from a starting point. pounce.find_minima is its global-search companion: it drives the same local solver in a loop to discover many distinct minima, or the global one among them.

import pounce

result = pounce.find_minima(
    fun, x0,
    method="deflation",     # see the method families below
    jac=jac, hess=hess,     # same as minimize; analytic derivatives recommended
    bounds=bounds,
    n_minima=6,             # target number of distinct minima
    max_solves=None,        # budget; default 8 * n_minima
    patience=8,             # give up after this many solves with nothing new
    dedup=1e-3,             # minima closer than this are "the same"
    seed=0,
)

result.minima   # list of minima, sorted by objective (lowest first)
result.values   # their objective values
result.x        # the best (lowest) minimum
result.status   # "target_reached" | "converged" | "budget_exhausted"
result.n_solves # solver calls used
result.trace    # per-solve diagnostics

Every method reuses minimize, so bounds and constraints carry through unchanged, and the acceptance test is shared: each candidate is polished on the clean objective, checked against the bounds, and — when a Hessian is supplied — certified as a true minimum (positive-semidefinite Hessian, so saddles and maxima are rejected) before being de-duplicated and recorded.

The six methods fall into three families by how they escape a minimum they have already found.

Repulsion — transform the problem and re-solve

These modify the problem so the solver can no longer settle where it just did, then re-solve. They share the lineage of the filled-function method and metadynamics: make the found minimum unattractive.

`flooding`

Add a repulsive Gaussian bump to the objective at each found minimum x*_k:

F(x) = f(x) + Σ_k A_k · exp(−‖x − x*_k‖² / 2σ_k²)

The bump does not move the stationary point (a Gaussian is flat on top); it flips its curvature. The minimum turns into a saddle once the bump is taller than the basin’s curvature — precisely when A/σ² > λ_min(∇²f(x*)) — and the solver rolls off it into a new basin. The bump is smooth with an analytic gradient and Hessian, so the flooded problem is as solvable as the original.

Knobs (strategy_kw): sigma (width) and amplitude (height). Both are "auto" by default. sigma is per-dimension — a fraction (sigma_frac, default 0.1) of each variable’s bounds range — so variables on very different scales are handled automatically. amplitude is set per minimum from the local curvature (amp_margin × μ_min, the well-tempered escape height, where μ_min is the smallest generalized eigenvalue of the Hessian against the bump metric) and raised adaptively if the solver returns to a flooded basin — so no manual energy scale is needed (a steep PES whose wells are ~150 deep needs no amplitude=150). Override either with a scalar or a length-n vector (sigma).
Best for broad enumeration of all minima of a smooth objective.
References. Ge, R. “A filled function method for finding a global minimizer of a function of several variables.” Mathematical Programming 46, 191–204 (1990). doi:10.1007/BF01585737. Laio, A. & Parrinello, M. “Escaping free-energy minima.” PNAS 99(20), 12562–12566 (2002). doi:10.1073/pnas.202427399. Grubmüller, H. “Predicting slow structural transitions in macromolecular systems: Conformational flooding.” Phys. Rev. E 52(3), 2893–2906 (1995). doi:10.1103/PhysRevE.52.2893. Adaptive bump heights: Barducci, A., Bussi, G. & Parrinello, M. “Well-tempered metadynamics.” Phys. Rev. Lett. 100, 020603 (2008). doi:10.1103/PhysRevLett.100.020603.
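
The escape condition A/σ² > λ_min can be checked on a quadratic bowl. In this NumPy sketch the objective, the bump parameters, and the function names are all made up for illustration; it evaluates the flooded Hessian at the minimum just below and just above the threshold.

```python
import numpy as np

lam = np.array([0.5, 4.0])     # Hessian eigenvalues of f(x) = 0.5 * sum(lam * x**2)
x_star = np.zeros(2)           # its minimum

def flooded_hessian(A, sigma):
    """Hessian of f + A*exp(-|x - x*|^2 / (2 sigma^2)) evaluated at x = x*."""
    return np.diag(lam) - (A / sigma**2) * np.eye(2)

def flooded_gradient(x, A, sigma):
    bump = A * np.exp(-np.sum((x - x_star) ** 2) / (2 * sigma**2))
    return lam * x - bump * (x - x_star) / sigma**2

sigma = 0.5
below = np.linalg.eigvalsh(flooded_hessian(A=0.1, sigma=sigma))  # A/sigma^2 = 0.4 < 0.5
above = np.linalg.eigvalsh(flooded_hessian(A=0.2, sigma=sigma))  # A/sigma^2 = 0.8 > 0.5
g_at_min = flooded_gradient(x_star, A=0.2, sigma=sigma)          # still stationary
```

Below the threshold both eigenvalues stay positive and the point remains a minimum; above it the softest direction turns negative while the gradient is still zero, so a solver started nearby slides away along that direction.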

`deflation`

Instead of a finite local bump, add a singular pole penalty:

F(x) = f(x) + Σ_k η / (‖x − x*_k‖² + s)^(p/2)

Each found minimum becomes prohibitively costly (infinitely so in the limit s → 0). The pole reaches further than a Gaussian (it decays as 1/r^p rather than vanishing exponentially), so it can clear a basin a narrow Gaussian would miss. This is the additive, minimization-friendly realization of the deflation idea, whose original form multiplies the residual of a nonlinear system by a deflation operator to exclude known roots from a Newton iteration.

Knobs: eta (penalty strength), power p, soft s (softening that keeps the pole finite), and length — the per-dimension pole scale, also "auto" from the bounds range by default (scalar or vector to override).
Best for enumeration on problems where the longer-reach repulsion helps; the most Newton/IPM-native of the repulsion methods.
References. Brown, K.M. & Gearhart, W.B. “Deflation techniques for the calculation of further solutions of a nonlinear system.” Numerische Mathematik 16, 334–342 (1971). doi:10.1007/BF02165004. Farrell, P.E., Birkisson, Á. & Funke, S.W. “Deflation techniques for finding distinct solutions of nonlinear partial differential equations.” SIAM J. Sci. Comput. 37(4), A2026–A2045 (2015). doi:10.1137/140984798.
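
A short NumPy sketch of the additive pole term and its gradient, written directly from the formula above. The function name pole_penalty and the default values of eta, p, and s are illustrative choices, not pounce's defaults.

```python
import numpy as np

def pole_penalty(x, minima, eta=1.0, p=2.0, s=1e-3):
    """Sum of softened poles eta / (|x - x_k|^2 + s)^(p/2) and its gradient."""
    val, grad = 0.0, np.zeros_like(x)
    for xk in minima:
        d = x - xk
        q = d @ d + s
        val += eta / q ** (p / 2)
        grad += -eta * p * d / q ** (p / 2 + 1)   # chain rule through q
    return val, grad

minima = [np.array([0.0, 0.0])]
peak, _ = pole_penalty(np.zeros(2), minima)              # eta / s**(p/2), finite since s > 0
far, g_far = pole_penalty(np.array([3.0, 0.0]), minima)  # value three units away
gauss_far = np.exp(-3.0**2 / 2)                          # unit-width Gaussian at the same distance
```

Three units from the minimum the pole is still about ten times larger than a unit-width Gaussian bump, which is the longer reach described above, and the gradient points back toward the known minimum so that descent pushes the iterate away from it.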

`tunneling`

Rather than climb out of a basin, tunneling crosses sideways at constant height to a point past the barrier, then descends. Between local solves it seeks a point at the height of the most-recently found minimum while being repelled from all known minima, and then re-minimizes there. The result is a monotonically non-increasing sequence of minima.

Knobs: eta, power, soft (the repelling poles).
Best for finding the global minimum and a descending trail to it, not exhaustive enumeration.
Reference. Levy, A.V. & Montalvo, A. “The tunneling algorithm for the global minimization of functions.” SIAM J. Sci. Stat. Comput. 6(1), 15–29 (1985). doi:10.1137/0906002.

A worked example of all three is in python/notebooks/19_find_minima_repulsion.ipynb.

Restart — choose the next start cleverly

These leave the objective untouched and only change where each local solve begins.

`multistart`

Random (or Sobol low-discrepancy) sampling of the bounds box, one local solve per start. Simple, a strong baseline, and embarrassingly parallel.

Knobs: sobol (low-discrepancy sampling, on by default), restart_jitter (used when no bounds box is given).
Best for a robust default, especially when local solves are cheap and can be parallelized.

`mlsl`

Multi-Level Single Linkage grows a pool of sample points and starts a local solve from a sample only when (a) no better sample lies within a shrinking “reduced distance,” and (b) it is not near an already-found minimum. The effect is that each basin is descended approximately once, instead of many times as in plain multistart, which keeps rediscovering known minima.

Knobs: samples_per_round, gamma (reduced-distance scale).
Best for expensive local solves on funneling landscapes, where avoiding redundant descents matters.
Reference. Rinnooy Kan, A.H.G. & Timmer, G.T. “Stochastic global optimization methods part II: Multi level methods.” Mathematical Programming 39, 57–78 (1987). doi:10.1007/BF02592071.

See python/notebooks/20_find_minima_restart.ipynb, which shows multistart spending ~15 solves (9 redundant) to find all six camel minima where MLSL needs ~6 (0 redundant).

Hopping — a Markov chain over minima

`basinhopping`

From the current minimum, apply a random perturbation, locally minimize to a neighboring minimum, and accept or reject by a Metropolis rule on the objective. The chain is biased downhill, so it tends to reach the global minimum while collecting the distinct minima it visits.

Knobs: step (perturbation size), temperature (acceptance).
Best for the global minimum on rugged, high-dimensional landscapes — the workhorse of cluster and protein optimization.
References. Li, Z. & Scheraga, H.A. “Monte Carlo-minimization approach to the multiple-minima problem in protein folding.” PNAS 84(19), 6611–6615 (1987). doi:10.1073/pnas.84.19.6611. Wales, D.J. & Doye, J.P.K. “Global optimization by basin-hopping…” J. Phys. Chem. A 101(28), 5111–5116 (1997). doi:10.1021/jp970984n. Cousin with history feedback: Goedecker, S. “Minima hopping…” J. Chem. Phys. 120(21), 9911–9917 (2004). doi:10.1063/1.1724816.
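
The acceptance step can be written in a few lines. This sketch uses the standard Metropolis criterion with the Python standard library; the function metropolis_accept is hypothetical, and the exact rule and temperature scaling pounce applies may differ.

```python
import math
import random

def metropolis_accept(f_new, f_old, temperature, rng=random):
    """Always accept a lower minimum; accept a higher one with prob exp(-df/T)."""
    if f_new <= f_old:
        return True
    if temperature <= 0.0:
        return False                      # zero temperature: strictly downhill
    return rng.random() < math.exp(-(f_new - f_old) / temperature)

# Empirical acceptance rate for an uphill move of size 1 at temperature 1.
rng = random.Random(0)
trials = 20000
uphill = sum(metropolis_accept(1.0, 0.0, 1.0, rng) for _ in range(trials))
frac = uphill / trials                    # should be close to exp(-1)
```

A higher temperature accepts more uphill hops and explores more widely, while a low temperature makes the chain behave like a greedy descent over minima.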

See python/notebooks/21_find_minima_hopping.ipynb.
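
For readers unfamiliar with the Metropolis rule, here is a minimal sketch of the accept/reject decision between two minima. The function name and the handling of a zero temperature are assumptions for illustration; pounce's internal rule may differ in detail.

```python
import math
import random


def metropolis_accept(f_old, f_new, temperature, rng):
    """Accept downhill moves always, uphill moves with probability exp(-df/T)."""
    if f_new <= f_old:
        return True
    if temperature <= 0.0:
        return False  # zero temperature: pure descent over minima
    return rng.random() < math.exp(-(f_new - f_old) / temperature)


rng = random.Random(0)
accepted_downhill = metropolis_accept(1.0, 0.5, temperature=1.0, rng=rng)
```

Raising the temperature makes uphill moves more likely to be accepted, which is how the chain crosses higher barriers.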

Beyond minima: saddle points and critical points

The same ideas extend to every stationary point of f — saddles (transition states) and maxima included. A critical point has ∇f(x) = 0; its Morse index (the number of negative Hessian eigenvalues) classifies it: 0 = minimum, 1 = transition state, …, n = maximum. Two entry points are provided.

`find_critical_points` — enumerate and classify

Stationary points are the roots of ∇f(x) = 0, which are exactly the points where the gradient-norm merit ½‖∇f(x)‖² attains its minimum value of zero. So find_critical_points runs find_minima on that merit, using any enumeration method ("deflation", "multistart", …), then keeps only the points where ‖∇f‖ vanishes to within tolerance (a local minimum of the merit with a nonzero gradient is not a stationary point of f) and labels each by its Morse index. This treats pounce as a root-finder and reuses the whole find_minima machinery.

r = pounce.find_critical_points(
    fun, x0, grad=grad, hess=hess, bounds=bounds,
    method="deflation", n_points=12, dedup=1e-2,
)
r.minima      # index 0
r.saddles     # 0 < index < n  (transition states)
r.maxima      # index n
for p in r.points:
    print(p.kind, p.x, p.f, p.index)
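
The Morse-index classification itself is easy to reproduce by hand. The sketch below does it for a two-variable function using the closed-form eigenvalues of a symmetric 2×2 matrix; the helper names and the tolerance are illustrative assumptions, and a general implementation would use a symmetric eigenvalue routine instead.

```python
import math


def morse_index_2x2(hxx, hxy, hyy, tol=1e-8):
    """Count the negative eigenvalues of the symmetric matrix [[hxx, hxy], [hxy, hyy]]."""
    mean = 0.5 * (hxx + hyy)
    half_gap = math.hypot(0.5 * (hxx - hyy), hxy)
    eigenvalues = (mean - half_gap, mean + half_gap)
    return sum(1 for lam in eigenvalues if lam < -tol)


def classify(index, n):
    """Map a Morse index to the kind of critical point in n dimensions."""
    if index == 0:
        return "minimum"
    if index == n:
        return "maximum"
    return "saddle"


def hessian(x, y):
    """Hessian entries (hxx, hxy, hyy) of f(x, y) = (x^2 - 1)^2 + y^2."""
    return 12.0 * x * x - 4.0, 0.0, 2.0


kinds = [classify(morse_index_2x2(*hessian(x, 0.0)), 2) for x in (-1.0, 0.0, 1.0)]
```

For this double well the three critical points on the x axis come out as two minima separated by one index-1 saddle.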

`find_saddles` — eigenvector following

A saddle is a minimum in most directions and a maximum along a few. By walking uphill along the `index` softest Hessian eigenvectors and taking Newton steps downhill along the rest, eigenvector following lands directly on a saddle of Morse index `index`; multistart enumerates several.

s = pounce.find_saddles(fun, x0, grad=grad, hess=hess, bounds=bounds,
                        index=1, n_saddles=4)

Together with the minima, the index-1 saddles between them form the transition-state network / disconnectivity graph — flooding fills the basins, and the saddles are the barriers crossed between filled basins.
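
To show the mechanics of eigenvector following, here is a simplified two-variable sketch: a Newton step whose sign is flipped along the softest modes, with a step-length cap. It is an illustration under stated assumptions, not the algorithm inside find_saddles; all names are hypothetical, and the published methods (see the references below) choose the step more carefully.

```python
import math


def eig_sym_2x2(a, b, c):
    """Eigenpairs of [[a, b], [b, c]], sorted by ascending eigenvalue."""
    if b == 0.0:
        pairs = [(a, (1.0, 0.0)), (c, (0.0, 1.0))]
        return sorted(pairs, key=lambda p: p[0])
    mean, half_gap = 0.5 * (a + c), math.hypot(0.5 * (a - c), b)
    pairs = []
    for lam in (mean - half_gap, mean + half_gap):
        vx, vy = b, lam - a  # solves (a - lam) * vx + b * vy = 0
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs


def follow_to_saddle(grad, hess, x, index=1, max_step=0.2, tol=1e-10, max_iter=100):
    """Iterate sign-corrected Newton steps toward a saddle of the given index."""
    for _ in range(max_iter):
        gx, gy = grad(x)
        if math.hypot(gx, gy) < tol:
            break
        step = [0.0, 0.0]
        for k, (lam, (vx, vy)) in enumerate(eig_sym_2x2(*hess(x))):
            g_k = gx * vx + gy * vy
            curvature = max(abs(lam), 1e-8)
            # uphill along the `index` softest modes, downhill along the rest
            coef = g_k / curvature if k < index else -g_k / curvature
            step[0] += coef * vx
            step[1] += coef * vy
        length = math.hypot(*step)
        if length > max_step:
            step = [s * max_step / length for s in step]
        x = (x[0] + step[0], x[1] + step[1])
    return x


def grad(p):
    """Gradient of f(x, y) = (x^2 - 1)^2 + y^2."""
    return 4.0 * p[0] * (p[0] ** 2 - 1.0), 2.0 * p[1]


def hess(p):
    """Hessian entries (hxx, hxy, hyy) of the same function."""
    return 12.0 * p[0] ** 2 - 4.0, 0.0, 2.0


saddle = follow_to_saddle(grad, hess, (0.6, 0.3))
```

Started inside the right-hand well, the iteration climbs along the soft x direction and relaxes along y, ending at the saddle at the origin.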

`reaction_network` — states, barriers, and connectivity in one call

reaction_network packages the whole workflow: it finds the minima (stable states), finds the index-1 transition states, and connects each transition state to the two minima it joins — by descending its unstable mode into each adjacent basin — returning the barrier table and the minimum-energy paths.

net = pounce.reaction_network(
    fun, x0, grad=grad, hess=hess, bounds=bounds,
    n_states=3, n_transition_states=2,
    minima_kw={"sigma": 0.4, "amplitude": 150.0},   # find_minima tuning
    saddle_kw={"max_step": 0.05},                    # find_saddles tuning
)

print(net.summary())
net.minima                 # stable states, sorted by energy (CriticalPoint)
net.transition_states      # index-1 saddles
net.connections            # each: .ts, .minima=(i,j), .barrier=(fwd,rev), .path (MEP)
net.barrier(i, j)          # lowest single-step barrier from state i to state j
net.neighbors(i)           # states reachable from i over one transition state
net.path_between(i, j)     # the connecting minimum-energy path, oriented i -> j

This is the natural high-level entry point for reaction-barrier and energy-landscape work: the connectivity it returns is the reaction network (equivalently, a disconnectivity graph), and the barrier of an elementary step i → j is E(transition state) − E(state i).
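
The barrier definition is simple enough to check by hand. The helper below is a hypothetical illustration (not a pounce function) of how the forward and reverse barriers of one elementary step follow from three energies; the numbers are arbitrary.

```python
def step_barriers(e_state_i, e_state_j, e_ts):
    """Forward (i -> j) and reverse (j -> i) barriers over one transition state."""
    return e_ts - e_state_i, e_ts - e_state_j


forward, reverse = step_barriers(e_state_i=-2.0, e_state_j=-1.0, e_ts=0.5)
```

The two barriers differ by exactly the energy difference between the two states.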

References. Cerjan, C.J. & Miller, W.H. “On finding transition states.” J. Chem. Phys. 75, 2800 (1981). Henkelman, G. & Jónsson, H. “A dimer method for finding saddle points…” J. Chem. Phys. 111, 7010 (1999). doi:10.1063/1.480097. Henkelman, G., Uberuaga, B.P. & Jónsson, H. “A climbing image nudged elastic band method…” J. Chem. Phys. 113, 9901 (2000). doi:10.1063/1.1329672. E, W. & Zhou, X. “The gentlest ascent dynamics.” Nonlinearity 24, 1831 (2011). doi:10.1088/0951-7715/24/6/008.

Runnable demos: a landscape with 4 minima, 4 saddles, and 1 maximum in python/examples/critical_points.py, and a molecular reaction barrier on the Müller-Brown potential — one reaction_network call locating the stable states and the transition states between them, then reading off barrier heights and the minimum-energy path — in python/examples/reaction_barrier.py.

Termination

The search stops on whichever fires first, reported in result.status:

condition	meaning	`status`
`n_minima` distinct minima found	got what you asked for	`target_reached`
`patience` solves in a row find nothing new	landscape appears exhausted	`converged`
`max_solves` reached	spent the budget	`budget_exhausted`

patience is what makes the “fewer minima exist than requested” case efficient: ask for 6, find 2, try a few more times, and stop with converged rather than burning the whole budget. find_minima always returns however many minima it actually found — falling short of n_minima is not an error.

A single solve spends many function evaluations, and max_solves counts solver calls, not evaluations. To cap the work inside each solve, set an iteration limit there via options={"max_iter": ...}.
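
The three stop conditions can be sketched as a small loop. This is an illustration only: the function name is hypothetical, and the order in which the conditions are tested when two fire on the same solve is an assumption.

```python
def run_search(solve_outcomes, n_minima, patience, max_solves):
    """Consume solve outcomes (True = new distinct minimum) until a stop fires."""
    found = solves = stale = 0
    for is_new in solve_outcomes:
        solves += 1
        if is_new:
            found, stale = found + 1, 0
        else:
            stale += 1
        if found >= n_minima:
            return "target_reached", found, solves
        if stale >= patience:
            return "converged", found, solves
        if solves >= max_solves:
            return "budget_exhausted", found, solves
    return "budget_exhausted", found, solves


# Ask for 6 minima on a landscape that only has 2.
status, found, solves = run_search([True, True] + [False] * 50,
                                   n_minima=6, patience=3, max_solves=100)
```

Here the search stops after five solves with two minima, rather than spending all 100.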

Choosing a method

See Choosing a Multiple-Minima Method, including how the families behave as the dimension grows.

Scope

find_minima covers methods that drive pounce’s local solver as their inner loop. Rigorous deterministic global optimization (branch-and-bound, DIRECT), population/stochastic globals (differential evolution, CMA-ES — already in SciPy), and homotopy continuation (all stationary points of polynomial systems) are different machinery and out of scope.

Choosing a Multiple-Minima Method

All six find_minima methods drive the same local solver; they differ in how they leave a minimum once found. Use this page to pick one.

By goal

Your goal	Prefer	Why
Enumerate all minima of a smooth, low-dimensional objective	`flooding`, `deflation`	repulsion clears each basin so the next solve finds a new one; analytic derivatives keep the inner solve fast
Just the global minimum	`basinhopping`, `tunneling`	both are biased downhill and do not try to cover the whole space
A robust, parallel baseline	`multistart`	independent starts, trivially parallel, no tuning
Expensive solves on a funneling landscape	`mlsl`	clustering avoids re-descending basins it has already mapped
Rugged, high-dimensional landscape (clusters, conformers)	`basinhopping`	a local random walk over minima; the standard tool at scale

By problem structure

Have an analytic Hessian? Repulsion methods (flooding, deflation) exploit it directly and certify each result as a true minimum. Without a Hessian, saddle rejection is skipped and the restart/hopping methods are a safer default.
Constrained problem? All methods pass bounds/constraints through. Repulsion only touches the objective, so it is the cleanest with general constraints; restart and hopping sample/perturb inside the bounds box.
No bounds? multistart/mlsl fall back to jittering around x0 (give a bounds box for genuine global coverage). flooding/deflation and basinhopping work without bounds.
Variables on very different scales? Handled automatically. The repulsion bump widths (sigma / length) are per-dimension and "auto" by default — sized to each variable’s bounds range — and the default dedup metric measures distance in that same scaled space, so a single dedup tolerance is scale-free. Give bounds so the scales can be inferred; pass an explicit scalar or length-n vector to override.
Symmetric or periodic coordinates (e.g. a periodic box): pass a custom distance= metric so that images of the same minimum de-duplicate correctly.
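
A custom metric for periodic coordinates might look like the sketch below, which uses the minimum-image convention. How pounce calls the distance= argument is an assumption here (two points in, one float out), and periodic_distance is a hypothetical helper.

```python
import math


def periodic_distance(a, b, periods):
    """Minimum-image Euclidean distance; a period of None marks a non-periodic axis."""
    total = 0.0
    for ai, bi, period in zip(a, b, periods):
        d = ai - bi
        if period is not None:
            d -= period * round(d / period)  # wrap into [-period/2, period/2]
        total += d * d
    return math.sqrt(total)


# Two images of the same point on either side of a unit-period boundary.
d = periodic_distance([0.05], [0.95], periods=[1.0])
```

With a plain Euclidean metric these two points would sit 0.9 apart and be archived as separate minima.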

Tuning cheat-sheet

method	key knobs	rule of thumb
`flooding`	`sigma`, `amplitude`	both `"auto"` by default (`sigma` per-dimension from the bounds; `amplitude` per-minimum from local curvature) — leave them; override only to force a specific width/height
`deflation`	`eta`, `power`, `soft`, `length`	`length` is per-dimension `"auto"` by default; raise `eta` if the solver returns to a known minimum
`tunneling`	`eta`, `power`	increase `patience`; it descends in a chain
`multistart`	`sobol`	leave Sobol on for coverage
`mlsl`	`samples_per_round`, `gamma`	more samples/round on rugged landscapes
`basinhopping`	`step`, `temperature`	`step` ≈ basin spacing; raise `temperature` to cross higher barriers

If a run stops at converged with fewer minima than you wanted, raise patience (search longer before giving up) and/or max_solves. If it stops at budget_exhausted, raise max_solves.

Scaling to high dimensions

The honest headline: enumerating all minima is intractable in high dimensions for every method here, and that is a property of the problem, not of the solver. The number of local minima typically grows exponentially with dimension (Rastrigin has on the order of k^n, with k minima along each coordinate; molecular energy landscapes grow exponentially with the number of atoms). No method can list exponentially many minima. What changes with dimension is which goal remains reachable and which methods stay efficient.

Two costs scale independently:

Cost per local solve. This is just pounce’s interior-point solve and scales well with sparse, large n — provided the objective stays sparse. Here is the catch for repulsion methods: each Gaussian or pole term adds a dense n×n Hessian contribution, so with K found minima the augmented Hessian is the sparse base plus K dense updates. On large sparse problems this destroys sparsity and the inner solve slows sharply. Restart and hopping never modify the objective, so they keep the original sparsity and the per-solve cost scales as minimize itself does.
Number of solves needed. For coverage this grows exponentially for all methods. For the global minimum it grows much more slowly for the downhill-biased methods, which is why they remain usable at scale.
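
The densification can be seen from the derivatives of a single bump. Assuming an isotropic Gaussian with a scalar width σ (a simplification of the per-dimension widths used in practice), with amplitude A centered at a found minimum c:

```latex
\begin{aligned}
b(x) &= A \exp\!\left(-\frac{\lVert x - c\rVert^{2}}{2\sigma^{2}}\right), \\
\nabla b(x) &= -\frac{b(x)}{\sigma^{2}}\,(x - c), \\
\nabla^{2} b(x) &= \frac{b(x)}{\sigma^{2}}
  \left[\frac{(x - c)(x - c)^{\mathsf T}}{\sigma^{2}} - I\right].
\end{aligned}
```

The outer product (x − c)(x − c)ᵀ is a full n×n matrix whenever x − c has no zero entries, so each bump contributes a dense rank-one term regardless of how sparse the base Hessian is.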

How each family behaves as n grows:

Repulsion (flooding, deflation). Two problems compound. A Gaussian of width σ covers a vanishing volume fraction ~ σⁿ, so filling space needs exponentially many bumps; and the bumps densify the Hessian (above). The standard high-dimensional fix — used by metadynamics in practice — is to flood in a low-dimensional collective-variable subspace rather than all n coordinates. In full coordinates these are best kept to roughly n ≲ 10–20.
Restart (multistart, mlsl). Each start is cheap and parallel, but the number of starts to cover (or to hit the global basin) grows exponentially. MLSL’s clustering relies on a reduced radius ∝ (ln N / N)^(1/n); as n grows that exponent → 0, distances concentrate, and MLSL degenerates toward plain multistart. So MLSL’s advantage is a low-to-moderate-dimension phenomenon; in high dimension prefer plain parallel multistart (for the global basin) and spend the budget on more starts.
Hopping (basinhopping). This is the family that scales best in practice, and it is exactly what the chemistry/physics community uses for hundreds to thousands of degrees of freedom. It performs a local random walk in minimum-space — it never tries to cover the domain — keeps the objective (and its sparsity) untouched, and the Metropolis bias funnels toward low minima. Pair it with multiple independent chains for parallelism.
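
A few numbers make the two scaling claims above tangible. The sketch evaluates the MLSL radius factor (ln N / N)^(1/n) without its proportionality constant, and the volume fraction σⁿ with σ measured relative to a unit box; both function names are illustrative.

```python
import math


def mlsl_radius_factor(n_samples, dim):
    """The (ln N / N)^(1/n) factor that sets the MLSL reduced distance."""
    return (math.log(n_samples) / n_samples) ** (1.0 / dim)


def bump_volume_fraction(sigma, dim):
    """Order-of-magnitude share of a unit box covered by one bump of width sigma."""
    return sigma ** dim


radii = {n: mlsl_radius_factor(1000, n) for n in (2, 10, 50)}
fractions = {n: bump_volume_fraction(0.1, n) for n in (2, 10, 50)}
```

With 1000 samples the radius factor climbs toward 1 as n grows, so nearly every sample counts as a neighbor of every other and the clustering stops discriminating, while the covered fraction per bump collapses.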

Practical guidance:

n ≲ 10–20, want all minima: flooding, deflation, or mlsl.
High n, want the global (or a few good) minima: basinhopping first; multistart with many parallel starts as a baseline; tunneling for a descending trail.
High n and you still want flooding-style biasing: restrict the bumps to a handful of collective variables, as in metadynamics, rather than the full coordinate vector.
Always: each individual solve inherits pounce’s scalability; the bottleneck is the number of solves and, for repulsion, the loss of sparsity — not the local solver.

Finding Multiple Minima from the CLI

The pounce command line solves one problem from one starting point. The --minima family turns that single solve into a global search: it drives the same interior-point solver in a loop, escaping each minimum it finds, and collects the distinct local minima into a deduplicated archive. It is the pure-Rust counterpart of the Python find_minima API and needs no Python — it works on built-in problems and on AMPL .nl files alike.

$ pounce model.nl --minima flooding --n-minima 10

Methods

--minima <method> selects one of six strategies. They share the same local solver and acceptance test and differ only in how they leave a minimum once found:

method	how it escapes a found minimum	reference
`multistart`	independent random / Sobol’ starts across the box	—
`mlsl`	Multi-Level Single-Linkage clustering of sampled starts	Rinnooy Kan & Timmer (1987)
`basinhopping`	Metropolis random walk over minima	Wales & Doye (1997)
`flooding`	repulsive Gaussian bumps added at found minima (filled-function)	Ge (1990)
`deflation`	softened `1/‖x−x*‖^p` poles added at found minima	Farrell, Birkisson & Funke (2015)
`tunneling`	equal-height tunnel term between descents	Levy & Montalvo (1985)

--multistart is shorthand for --minima multistart. For help choosing, see Choosing a Multiple-Minima Method; the guidance there applies unchanged to the CLI.

Shared options

flag	default	meaning
`--n-minima <N>`	10	target number of distinct minima (a stop condition)
`--max-solves <N>`	`8 × n-minima`	hard cap on solver calls
`--patience <N>`	8	stop after `N` solves in a row that find nothing new
`--dedup <d>`	1e-4	minima within this per-dimension-scaled distance are the same
`--psd-tol <t>`	1e-6	smallest Hessian eigenvalue tolerated by the saddle-rejection check
`--seed <S>`	0	seed for sampling / Sobol’ scramble (runs are reproducible)
`--sobol` / `--no-sobol`	on	use a scrambled Sobol’ sequence for box sampling

A candidate is accepted when its solve converged, the point is finite and inside the bounds, its objective Hessian is positive semidefinite within --psd-tol (saddle rejection; skipped when no Hessian is available or the problem is large), and it is not already within --dedup of an archived minimum. The dedup distance is measured in a per-dimension-scaled space (‖(a−b)/L‖, with L the box width per variable), so a single tolerance is scale-free.

The search stops at the first of: target_reached (--n-minima found), converged (--patience consecutive empty solves), or budget_exhausted (--max-solves reached).
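
The scaled dedup test is short enough to write out. This sketch follows the formula above, ‖(a−b)/L‖ with L the box width per variable; the helper names and the strict inequality are assumptions.

```python
import math


def scaled_distance(a, b, lower, upper):
    """Euclidean norm of (a - b) / L, with L the box width per variable."""
    return math.sqrt(sum(((ai - bi) / (hi - lo)) ** 2
                         for ai, bi, lo, hi in zip(a, b, lower, upper)))


def is_duplicate(candidate, archive, lower, upper, dedup=1e-4):
    """True if the candidate lies within the dedup tolerance of an archived minimum."""
    return any(scaled_distance(candidate, m, lower, upper) < dedup for m in archive)


# Second variable lives on a scale a million times larger than the first.
lower, upper = [0.0, 0.0], [1.0, 1.0e6]
archive = [[0.5, 5.0e5]]
same = is_duplicate([0.5, 5.0e5 + 10.0], archive, lower, upper)
different = is_duplicate([0.5, 5.0e5 + 1000.0], archive, lower, upper)
```

An absolute offset of 10 in the large-scale variable is only 1e-5 of its range, so it counts as the same minimum under the default tolerance.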

Strategy knobs

Each is optional and used only by the relevant method; omit them to take the defaults (which mirror find_minima). "auto" widths are sized per dimension from the bounds.

flags	method
`--sigma`, `--sigma-frac`, `--amplitude`, `--amp-margin`	`flooding`
`--eta`, `--power`, `--soft`, `--length`, `--length-frac`	`deflation`, `tunneling`
`--gamma`, `--samples-per-round`	`mlsl`
`--step`, `--temperature`	`basinhopping`
`--restart-jitter`	all (perturbation scale for restart fallbacks)

The repulsion methods (flooding, deflation, tunneling) run each escape solve under hessian_approximation = limited-memory — the analytic penalty term is added to the objective and its gradient, and the quasi-Newton update supplies curvature, so the dense augmented Hessian is never assembled. Each accepted point is then polished by re-solving the clean objective with the exact Hessian, so the reported minima sit on the true problem.
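
Adding an analytic penalty to the objective and its gradient can be sketched as follows for the flooding case. An isotropic Gaussian with scalar sigma is assumed for brevity, and gaussian_bump and flooded are hypothetical names, not the CLI's internals.

```python
import math


def gaussian_bump(x, center, sigma, amplitude):
    """Value and gradient of amplitude * exp(-||x - center||^2 / (2 sigma^2))."""
    diff = [xi - ci for xi, ci in zip(x, center)]
    value = amplitude * math.exp(-sum(d * d for d in diff) / (2.0 * sigma ** 2))
    grad = [-value * d / sigma ** 2 for d in diff]
    return value, grad


def flooded(fun, grad, x, bumps):
    """Objective and gradient with one bump per archived minimum added."""
    f, g = fun(x), list(grad(x))
    for center, sigma, amplitude in bumps:
        b, db = gaussian_bump(x, center, sigma, amplitude)
        f += b
        g = [gi + dbi for gi, dbi in zip(g, db)]
    return f, g


def bowl(x):
    return sum(xi * xi for xi in x)


def bowl_grad(x):
    return [2.0 * xi for xi in x]


bumps = [((0.0, 0.0), 0.5, 3.0)]  # (center, sigma, amplitude) at the found minimum
f_at_min, g_at_min = flooded(bowl, bowl_grad, (0.0, 0.0), bumps)
```

Only values and gradients are formed here, which is all a quasi-Newton inner solve needs.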

Output

The console prints a ranked table of the distinct minima (rank, objective, and scaled distance to the best), followed by the stop status and the number of solves:

find-minima: 6 distinct minima in 17 solves (target_reached)
  rank        objective     dist-to-best
     0      -1.03162845e0       0.000000e0
     1      -1.03162845e0      4.772232e-1
     ...

Solution files. With .sol output enabled, the global best minimum is written to the usual <stub>.sol (preserving the AMPL contract), and the remaining minima, ranked by objective, to siblings <stub>.min001.sol, <stub>.min002.sol, … (so min001 is the second-best point).

JSON report. --json-output writes the standard single-solve report for the best minimum, plus a backward-compatible minima section:

"minima": {
  "method": "multistart",
  "status": "target_reached",
  "n_solves": 17,
  "n_minima": 6,
  "minima": [{ "x": [...], "objective": -1.0316 }, ...],
  "values": [-1.0316, -1.0316, -0.2155, ...]
}

Omitting --minima leaves the default single-solve output completely unchanged.
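
A downstream tool could read the section along these lines. The payload below is a hand-written sample shaped like the fragment above (trimmed to two entries), and placing the minima key at the top level of the report is an assumption; check a real report before relying on it.

```python
import json

report_text = """
{
  "minima": {
    "method": "multistart",
    "status": "target_reached",
    "n_solves": 17,
    "n_minima": 2,
    "minima": [
      {"x": [0.0898, -0.7126], "objective": -1.0316},
      {"x": [-0.0898, 0.7126], "objective": -1.0316}
    ],
    "values": [-1.0316, -1.0316]
  }
}
"""


def summarize_minima(text):
    """Return (status, n_minima, best x), or None for a single-solve report."""
    section = json.loads(text).get("minima")
    if section is None:
        return None
    best = min(section["minima"], key=lambda m: m["objective"])
    return section["status"], section["n_minima"], best["x"]


summary = summarize_minima(report_text)
```

Returning None when the section is absent keeps the same reader usable on ordinary single-solve reports.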

Example

The six-hump camel function has six local minima (two global at f ≈ −1.0316). Searching for all of them from an .nl model:

$ pounce sixhump.nl --minima multistart --n-minima 6 \
        --max-solves 120 --patience 40 --dedup 1e-3 --seed 0
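
For reference, the standard two-variable six-hump camel function can be written directly in Python to check reported objective values. The coordinates used below are the commonly quoted minimizers, rounded to four decimals.

```python
def six_hump_camel(x, y):
    """Standard six-hump camel function of two variables."""
    return ((4.0 - 2.1 * x ** 2 + x ** 4 / 3.0) * x ** 2
            + x * y
            + (-4.0 + 4.0 * y ** 2) * y ** 2)


f_global = six_hump_camel(0.0898, -0.7126)
f_mirror = six_hump_camel(-0.0898, 0.7126)
f_local = six_hump_camel(-1.7036, 0.7961)
```

The function is symmetric under (x, y) → (−x, −y), which is why the six minima come in three mirrored pairs.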

References

Ge, R. “A filled function method for finding a global minimizer of a function of several variables.” Mathematical Programming 46, 191–204 (1990).
Rinnooy Kan, A.H.G. & Timmer, G.T. “Stochastic global optimization methods part II: Multi level methods.” Mathematical Programming 39, 57–78 (1987). doi:10.1007/BF02592071.
Levy, A.V. & Montalvo, A. “The tunneling algorithm for the global minimization of functions.” SIAM J. Sci. Stat. Comput. 6(1), 15–29 (1985). doi:10.1137/0906002.
Wales, D.J. & Doye, J.P.K. “Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms.” J. Phys. Chem. A 101(28), 5111–5116 (1997). doi:10.1021/jp970984n.
Farrell, P.E., Birkisson, Á. & Funke, S.W. “Deflation techniques for finding distinct solutions of nonlinear partial differential equations.” SIAM J. Sci. Comput. 37(4), A2026–A2045 (2015). doi:10.1137/140984798.

Algorithm & Workspace

Algorithm

POUNCE implements the interior-point filter line-search algorithm of Wächter & Biegler (2006) — the same algorithm upstream Ipopt uses. A solve proceeds as a sequence of barrier subproblems: for a decreasing sequence of barrier parameters μ, it takes primal-dual Newton steps on the perturbed KKT system, accepting each step through a filter line-search that balances objective descent against constraint infeasibility. When a regular step cannot be found, a restoration phase minimizes constraint violation to return the iterate to a filter-acceptable region.
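
The idea of a sequence of barrier subproblems can be seen on a one-variable toy: minimize (x + 1)² subject to x ≥ 0, whose solution is x = 0. The sketch below is a primal-only log-barrier loop with plain Newton steps, deliberately far simpler than the primal-dual filter line-search method POUNCE implements; the function names and the values of tau and shrink are illustrative assumptions.

```python
import math


def barrier_minimizer(mu, x, tau=0.995, tol=1e-12, max_iter=100):
    """Newton's method on phi(x) = (x + 1)^2 - mu * ln(x), keeping x > 0."""
    for _ in range(max_iter):
        d1 = 2.0 * (x + 1.0) - mu / x  # phi'(x)
        if abs(d1) < tol:
            break
        d2 = 2.0 + mu / (x * x)  # phi''(x) > 0
        dx = -d1 / d2
        # fraction-to-the-boundary rule: never step past tau of the way to x = 0
        alpha = 1.0 if dx >= 0.0 else min(1.0, tau * x / -dx)
        x += alpha * dx
    return x


def solve_by_barrier(x=1.0, mu=0.1, shrink=0.2, n_outer=8):
    """Solve a decreasing sequence of barrier subproblems, warm-starting each."""
    path = []
    for _ in range(n_outer):
        x = barrier_minimizer(mu, x)
        path.append((mu, x))
        mu *= shrink
    return path


path = solve_by_barrier()
```

Each subproblem has the closed-form minimizer (−1 + √(1 + 2μ))/2, so the iterates trace a path that approaches the constrained solution x = 0 as μ shrinks.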

See Acknowledgments for the papers behind each component.

Workspace layout

POUNCE is a Cargo workspace. Each crate maps onto a part of the upstream Ipopt source tree:

Crate	Purpose
`pounce-common`	Types, exceptions, journalist, options, tagged objects, cached results (Ipopt `src/Common`).
`pounce-linalg`	BLAS-1, dense/compound vectors and matrices, triplet storage, CSC conversion (Ipopt `src/LinAlg`).
`pounce-linsol`	Symmetric linear-solver trait layer — no FFI; backends plug in below.
`pounce-feral`	Pure-Rust sparse symmetric LDLᵀ backend. The default.
`pounce-hsl`	MA57 backend via `libcoinhsl` (optional, behind the `ma57` feature).
`pounce-nlp`	TNLP trait, TNLPAdapter, `IpoptApplication` entry point (Ipopt `src/Interfaces`).
`pounce-algorithm`	IteratesVector, IpoptData, calculated quantities, KKT, line search, μ update, convergence check, main loop (Ipopt `src/Algorithm`).
`pounce-restoration`	Restoration phase (Ipopt `Algorithm/Resto*`).
`pounce-presolve`	Presolve / problem-reduction pass run before the IPM.
`pounce-l1penalty`	ℓ₁-exact penalty-barrier wrapper for degenerate / MPCC NLPs.
`pounce-sensitivity`	Parametric sensitivity (port of Ipopt `contrib/sIPOPT`).
`pounce-cinterface`	C ABI shim — `CreateIpoptProblem` / `IpoptSolve` / `FreeIpoptProblem`.
`pounce-py`	Python bindings (the `pounce` Python package).
`pounce-cli`	The `pounce` command-line driver.

The C ABI shim lets existing PyIpopt / cyipopt / JuMP / AMPL clients link against POUNCE in place of Ipopt.

Initialization and Warm Starts

POUNCE is a local NLP solver: every solve starts from a point, and that point often decides whether the solve takes 15 iterations or 150, or whether it converges at all. This page collects the initialization story in one place: where the starting point comes from on each frontend, what the solver does with it (the part that surprises people), how to warm-start each algorithm path, and how to diagnose a bad start. The per-algorithm details live in their own pages; this is the map.

Where the starting point comes from

Frontend	Primal starting point
Python `Problem.solve(x0=...)`	the `x0` argument
Python `minimize(fun, x0, ...)`	the `x0` argument
CLI / AMPL	the `.nl` file’s initial-guess segment; zeros for variables without one
Pyomo	each `Var`’s `.value`, serialized into the `.nl` by Pyomo’s writer
GAMS	variable levels (`x.L`) via GMO
Rust	`Nlp::new(problem).x0(&[...])`, or `TNLP::get_starting_point`

Two silent-zero traps hide in that table:

Pyomo: a Var whose .value was never set is written as 0 in the .nl file. A model initialized “nowhere” is actually initialized at the origin, which for many process models is outside every variable’s meaningful range (and a domain error for log, /, and friends).
GAMS: levels default to 0 unless assigned. Set x.L before the solve statement.

Dual estimates can be seeded too: Problem.solve accepts lagrange=, zl=, zu= keyword arguments, and the .nl format carries constraint-dual guesses when the modeling layer writes them. Dual seeds are ignored unless you opt into a warm start (below). The scipy-style minimize facade does not expose dual seeding; use pounce.Problem directly when you need it.

What the solver does with your point (cold start)

The default interior-point path ports Ipopt’s iterate initializer (crates/pounce-algorithm/src/init/default.rs). The sequence:

The primal point is pushed into the interior of the bounds. Per component, with bounds lo <= x <= hi: p_l = min(bound_push * max(|lo|, 1), bound_frac * (hi - lo)), likewise p_u, and x is clamped into [lo + p_l, hi - p_u]. One-sided bounds use the bound_push term alone; free variables are untouched. With the defaults (bound_push = bound_frac = 1e-2), a variable sitting exactly on its lower bound 1.0 starts at 1.01 instead. Your point is honored approximately, and the deliberately-at-a-bound part of it is not honored at all. This is the single most common reason a “perfect” starting point does not behave like one.
Slacks are set to s = d(x) and pushed into the slack bounds the same way.
Duals get fixed defaults: constraint multipliers y = 0 (or a least-square estimate, see constr_mult_init_max and bound_mult_init_method below) and bound multipliers z = v = bound_mult_init_val = 1.0.
The barrier parameter starts at mu_init = 0.1 (monotone mu_strategy, the default) regardless of how good your point is.
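
The interior push in the first step can be reproduced per component with a few lines. This sketch follows the formula as stated above; the function name is hypothetical, and representing a missing bound by an infinite float is an assumption.

```python
import math


def push_into_interior(x, lo, hi, bound_push=1e-2, bound_frac=1e-2):
    """Clamp one component of the starting point into the interior of [lo, hi]."""
    has_lo, has_hi = math.isfinite(lo), math.isfinite(hi)
    if has_lo and has_hi:
        p_l = min(bound_push * max(abs(lo), 1.0), bound_frac * (hi - lo))
        p_u = min(bound_push * max(abs(hi), 1.0), bound_frac * (hi - lo))
        return min(max(x, lo + p_l), hi - p_u)
    if has_lo:  # one-sided: the bound_push term alone
        return max(x, lo + bound_push * max(abs(lo), 1.0))
    if has_hi:
        return min(x, hi - bound_push * max(abs(hi), 1.0))
    return x  # free variable: untouched


moved = push_into_interior(1.0, lo=1.0, hi=10.0)
```

A variable supplied exactly on its lower bound of 1.0 comes back as 1.01, while a point already well inside its bounds is returned unchanged.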

The knobs, all Ipopt-compatible:

Option	Default	Meaning
`bound_push`	`1e-2`	Absolute push off each bound (relative to `max(|bound|, 1)`).
`bound_frac`	`1e-2`	Cap on the push as a fraction of the bound interval.
`slack_bound_push` / `slack_bound_frac`	`1e-2`	Same, for inequality slacks.
`bound_mult_init_val`	`1.0`	Initial bound-multiplier value.
`bound_mult_init_method`	`constant`	`constant` / `mu-based` / `least-square`.
`constr_mult_init_max`	`1e3`	Cap on the least-square constraint-multiplier estimate; `0` keeps `y = 0`.
`least_square_init_primal`	`no`	Replace the starting `x` with the min-norm solution of the linearized constraints before the interior push.
`mu_init`	`0.1`	Initial barrier parameter (monotone strategy).
`start_with_resto`	`no`	Jump straight into feasibility restoration at iteration 1 (aborts if the start is already feasible).

An infeasible starting point is fine: the IPM does not require feasibility, and least_square_init_primal=yes can cheaply reduce iteration-0 infeasibility on mostly-linear models (the mehrotra_algorithm LP/QP cascade turns it on for you, along with more aggressive bound_push / bound_frac / bound_mult_init_val). A point where a function fails to evaluate is not fine; see Diagnosing a bad start.

Warm-starting the interior-point path

From Python, the packaged form is one object:

x, info = prob.solve(x0=x0)                  # cold solve
ws = pounce.WarmStart.from_info(x, info)     # captures x, duals, mu
x2, info2 = prob.solve(warm_start=ws)        # warm re-solve
ws.save("state.npz")                         # reuse across processes

warm_start= is accepted by Problem.solve and pounce.minimize, seeds the primal and dual iterates, applies the enabling options below, and forwards the SQP working set when the state was captured from that path. The rest of this section is what it does under the hood (and the only route from the CLI or an options file).

Passing a previous solution as x0 is not a warm start by itself. The IPM warm start is a package of three things, and skipping any one of them silently degrades to (roughly) a cold solve:

Opt in and seed the duals. Set warm_start_init_point=yes and pass the previous multipliers.
Lower mu_init. The default 0.1 makes the solver walk the barrier schedule down from scratch even when started at the optimum. Seed it near the converged complementarity (e.g. 1e-7 after a tol=1e-8 solve).
Tighten the warm-start pushes. The warm initializer applies its own interior clamp with warm_start_bound_push / _frac (default 1e-3), which shoves an at-the-bound solution back off its bounds. Tighten them to keep the point.

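import numpy as np

# make_problem() and x0_cold stand for your own problem factory and cold start.
x, info = make_problem().solve(x0=x0_cold)      # cold solve

warm = make_problem()
warm.add_option("warm_start_init_point", "yes")  # 1. opt in to the seeds
warm.add_option("mu_init", 1e-7)                 # 2. start near the converged mu
# 3. tighten the warm-start pushes so an at-the-bound solution stays put
for k in ("warm_start_bound_push", "warm_start_bound_frac",
          "warm_start_slack_bound_push", "warm_start_slack_bound_frac",
          "warm_start_mult_bound_push"):
    warm.add_option(k, 1e-9)

# Seed the primal point and all three multiplier blocks from the cold solve.
x2, info2 = warm.solve(
    x0=x,
    lagrange=np.asarray(info["mult_g"]),
    zl=np.asarray(info["mult_x_L"]),
    zu=np.asarray(info["mult_x_U"]),
)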

On HS071 this takes the re-solve from 11 iterations to 5, while warm_start_init_point=yes alone saves nothing; the full runnable comparison is python/examples/hs071_warm_start.py. On the CLI the same options apply as KEY=VALUE pairs, with dual seeds coming from the .nl file’s dual segment when present.

Option	Default	Meaning
`warm_start_init_point`	`no`	Master switch: honor supplied primal and dual seeds.
`warm_start_bound_push` / `warm_start_bound_frac`	`1e-3`	Interior clamp used instead of `bound_push` / `bound_frac`.
`warm_start_slack_bound_push` / `warm_start_slack_bound_frac`	`1e-3`	Same, for slacks.
`warm_start_mult_bound_push`	`1e-3`	Floor on seeded bound multipliers (a carried-in `z = 0` must not start on the barrier’s boundary).
`warm_start_mult_init_max`	`1e6`	Cap on seeded equality multipliers.

Even a well-executed IPM warm start has a structural limit: the barrier pushes iterates off the bounds, so the active-set information in a converged solution cannot be fully exploited. When you are solving a sequence of related NLPs (MPC steps, branch-and-bound nodes, homotopy paths), that limit is the reason the active-set SQP path exists.

Warm-starting the active-set SQP path

With algorithm=active-set-sqp, the warm-start payload is different: alongside the primal/dual seeds it carries the working set (which bounds and constraints are active), and an unchanged working set means the next solve converges in a handful of QP iterations.

prob.add_option("algorithm", "active-set-sqp")

ws = None
for k in range(horizon_steps):
    x, info = prob.solve(x0=x_prev, working_set=ws)
    ws = info["working_set"]
    x_prev = x

The two paths’ warm-start inputs are deliberately path-local: the IPM-side options above (warm_start_init_point, mu_init, bound_push, …) are silently ignored on the SQP path, and working_set= is ignored on the IPM path. Details, the classify_working_set helper for reconstructing a working set from multipliers, and the GAMS sqp_state_file / marginal-based routes are in Active-Set SQP & Warm Starts. Note the GAMS warm-start features currently live in the native C link only, not the pip link (see GAMS).

Sequences of solves: batch chaining and sessions

For MPC chains, parametric sweeps, and B&B node relaxations from Python, solve_nlp_batch packages the whole IPM warm-start recipe for you:

results = pounce.solve_nlp_batch(batch_t)                   # cold
results = pounce.solve_nlp_batch(batch_t1, warms=results)   # warm

Each instance is seeded with the previous primal and duals, the converged mu is threaded into mu_init, and warm_start_init_point=yes is forced; see Batched NLP solving. For post-solve sensitivity queries against the converged KKT factor (a different kind of reuse, no re-solve at all), see Sessions. JAX users get warm-start hand-off along a parameter trajectory via JaxProblem; see the Python guide.

Diagnosing a bad start

The first stop is the preflight check, which evaluates the model once at its starting point (no solve) and reports everything this page has warned about: NaN/inf evaluations, bound violations, how far the interior clamp will move the point, initial constraint violation, and derivative scale spread.

pounce check-x0 model.nl              # text report; --json for tools
pounce check-x0 model.nl --x0-file candidate.txt

report = pounce.preflight(problem_obj, x0, lb=lb, ub=ub, cl=cl, cu=cu)
print(report)          # report.fatal, report.warnings, report.to_dict()

Exit code 0 means the model evaluates cleanly at x0 (warnings allowed); 21 means a solve from this point would abort. The other diagnostics:

Invalid_Number_Detected means an evaluator returned NaN/inf, and the very first evaluation at the starting point is the usual culprit (log(0) or a division at an all-zeros default start). The interior clamp only repairs bound violations; it cannot fix domain errors on free variables. Move the start into the domain, or add bounds that keep the clamp inside it.
derivative_test=first-order runs the derivative checker at the starting point; wrong derivatives look exactly like a bad start (immediate restoration, tiny steps).
The interactive debugger (--debug) breaks at iteration 0, so you can inspect the initial objective, inf_pr, and inf_du before a single step is taken, and re-solve from an edited iterate.
Presolve (presolve=yes) reports structural trouble that no starting point can fix, like rank-deficient equality blocks (LICQ check), and its bound tightening shrinks the box the interior clamp places you in. See Troubleshooting Recipes and FBBT.
pounce-studio analyze-nl gives a structural pre-flight of a model file without solving.
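
Before reaching for any of these tools, a one-off evaluation in plain Python often exposes the problem. The sketch below is a hypothetical stand-in for that first check (it is not pounce.preflight): it evaluates an objective once at the start and reports exceptions or non-finite values.

```python
import math


def check_start(fun, x0):
    """Evaluate fun once at x0 and report whether the result is usable."""
    try:
        value = fun(x0)
    except (ValueError, ZeroDivisionError, OverflowError) as exc:
        return False, f"evaluation raised {type(exc).__name__}: {exc}"
    if not math.isfinite(value):
        return False, f"evaluation returned {value}"
    return True, "ok"


def objective(x):
    """Toy objective with a log and a division, undefined at the origin."""
    return x[0] * math.log(x[0]) + 1.0 / x[1]


at_origin = check_start(objective, [0.0, 0.0])
inside = check_start(objective, [1.0, 2.0])
```

The all-zeros default start fails immediately on the logarithm, while a start inside the domain passes.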

No good starting point at all?

Three composable primitives cover the “generate or repair a point” workflows from Python:

# N diverse starts (the sampler behind find_minima): sobol / uniform /
# jitter / bounds midpoint. Feed them to solve_nlp_batch or race them.
starts = pounce.generate_starts(16, bounds=bounds, seed=0)

# Min-norm repair of a candidate onto the linearized constraints +
# bounds (the standalone form of least_square_init_primal).
x0 = pounce.project_to_feasible(problem_obj, x0, lb=lb, ub=ub, cl=cl, cu=cu)

# Cheap tournament: a few iterations from each start, ranked; continue
# the winner at full effort with a WarmStart.
best = pounce.race_starts(fun, starts, bounds=bounds, iters=10)[0]
res = pounce.minimize(fun, best.x,
                      warm_start=pounce.WarmStart.from_info(best.x, best.info))

When the model has many local minima and you want all of them (or a managed search rather than a tournament), the global search drivers (multistart, mlsl, deflation, flooding, tunneling, basinhopping) manage populations of starting points and warm-start bookkeeping for you, from Python (pounce.find_minima) or the CLI (--minima).

Tutorial: active-set SQP and working-set warm starting

This is the user-facing walkthrough for pounce’s Phase 5b/5c active-set SQP driver. It assumes you can already drive pounce’s default IPM via the standard interface (Problem.solve in Python, IpoptSolve in C, option nlp = pounce in GAMS).

The design rationale and algorithmic choices live in the design note — read that if you want to know why the solver works the way it does. This tutorial covers how to use the solver: switching to the SQP path, carrying a working set across solves, and stitching the parametric predictor + SQP corrector pattern together.

1. When to use the active-set SQP

Use it when the same NLP shape is solved many times under small perturbations — MPC closed-loop, parametric continuation, homotopy sweeps, sensitivity-driven design exploration. The IPM re-solves each instance from scratch (the central-path push at the beginning of a fresh solve typically costs 4–8 iterations even when the previous optimum is essentially correct); the SQP warm-started from the previous working set typically picks up where it left off in 0–3 outer iterations when the active set is stable, or grows by a few QP add/drop steps when one or two constraints flip.

Stick with the IPM (the default) for cold solves of a single problem or large-scale problems with thousands of active inequalities. The IPM scales linearly in the active set; the active-set SQP’s per-QP cost grows with the number of active constraints.

2. Switching to the SQP path

The switch is a single option flip — algorithm from its default interior-point to active-set-sqp. Everything else (callbacks, bounds, starting point, finalize_solution) is unchanged.

Python

import pounce
import numpy as np

prob = pounce.Problem(
    n=2, m=1, problem_obj=MyNlp(),
    lb=[0.0, 0.0], ub=[10.0, 10.0],
    cl=[1.0], cu=[1.0],
)
prob.add_option("algorithm", "active-set-sqp")
prob.add_option("print_level", 0)
x, info = prob.solve(x0=np.array([0.5, 0.5]))

C

#include "pounce.h"

IpoptProblem prob = CreateIpoptProblem(/* ... */);
AddIpoptStrOption(prob, "algorithm", "active-set-sqp");
double x[2] = {0.5, 0.5};
double obj;
int status = IpoptSolve(prob, x, NULL, &obj, NULL, NULL, NULL, NULL);

GAMS

* pounce.opt
algorithm  active-set-sqp

Model mymodel / all /;
option nlp = pounce;
mymodel.optfile = 1;
Solve mymodel using nlp minimizing obj;

SQP-specific options

All SQP knobs live under the sqp_* namespace. The defaults mirror SqpOptions::default().

Option	Default	Meaning
`sqp_globalization`	`filter`	`filter` or `l1-elastic` (Fletcher-Leyffer / Han-Powell)
`sqp_hessian`	`exact`	`exact`, `damped-bfgs`, or `lbfgs`
`sqp_max_iter`	`200`	outer iteration cap
`sqp_tol`	`1e-8`	stationarity tolerance (max-norm)
`sqp_constr_viol_tol`	`1e-6`	constraint-violation tolerance
`sqp_dual_inf_tol`	`1e-4`	dual-infeasibility tolerance
`sqp_l1_penalty`	`1.0`	initial ν (Han-Powell only)
`sqp_l1_penalty_safety`	`0.1`	additive ν margin
`sqp_l1_penalty_max`	`1e10`	ν upper clamp
`sqp_bt_reduction`	`0.5`	backtracking factor
`sqp_bt_min_alpha`	`1e-12`	minimum step before line-search failure
`sqp_print_level`	`0`	0=silent, 1=per-iter summary, 2+=trace
`sqp_lbfgs_max_history`	`6`	L-BFGS history size

Algorithm-path isolation guarantees

The two solver paths share the TNLP layer, the OrigIpoptNlp adapter, the linear-solver backend, the options registry, and finalize_solution. Beyond that they are deliberately isolated, so toggling algorithm is always safe — no Phase 5 addition can change IPM behaviour, and no IPM warm-start setting can change SQP behaviour. Concretely:

The default (algorithm = interior-point) is unchanged. No user who hasn’t typed active-set-sqp ever runs Phase 5 code.
sqp_* options are silently ignored on the IPM path. Setting sqp_globalization, sqp_hessian, sqp_max_iter, … while algorithm is interior-point is a no-op. The option-list parser still validates them (out-of-range numeric values fail validation regardless of algorithm), but the IPM driver never reads the resolved values.
IPM warm-start options are silently ignored on the SQP path. warm_start_init_point, bound_push, bound_frac, slack_bound_push, mult_init_max, mu_init, mu_target and the rest of the IPM-side initializer knobs sit on the AlgorithmBuilder but are not consulted when the SQP outer loop runs.
Warm-start payloads are path-local. IpoptApplication::set_sqp_warm_start(SqpIterates) / Problem.solve(working_set=…) / IpoptSetWarmStartWorkingSet feed the SQP loop only — the IPM never reads sqp_warm_start. Symmetrically, lagrange= / zl= / zu= on Problem.solve (paired with warm_start_init_point=yes) feed the IPM only — the SQP loop never consults them.
You can flip between paths across solves on the same Problem handle. The application’s per-solve setup (restoration factory, options snapshot, statistics reset) is rebuilt for every solve(), so a cold IPM solve followed by an SQP solve with algorithm re-set in between is a supported pattern. This is exactly how the parametric corrector in §4 hands off from a cold IPM warm-up to the SQP corrector.
The C ABI is strictly additive. Existing cyipopt / JuMP / AMPL clients link against the new libpounce_cinterface unchanged; the four new entry points (IpoptGetWorkingSet, IpoptSetWarmStartWorkingSet, IpoptClearWarmStartWorkingSet, IpoptSolveWarmStart) are pure additions.
info["working_set"] is always present, sometimes None. Python callers that don’t touch the SQP path never have to read that key, but reading it is safe — it returns None on the IPM path so a downstream loop won’t crash on a missing key.

This isolation is verified by the existing test suite: 868 workspace tests cover both paths, plus crosscutting tests like application_sqp_warm_start_auto_clears_after_use (asserts the SQP-side warm-start state doesn’t leak between solves) and application_default_does_not_select_sqp (asserts the default solver path is IPM).

3. The working-set warm-start contract

The warm-start contract (§6 of the design note) is the tuple (x, λ_g, λ_x, 𝒲): primal, constraint multipliers, packed bound multipliers, and the discrete working set 𝒲 (which bounds and constraints are active at the optimum). The first three are floating-point; only the last is the parametric-warm-start payoff over the IPM, because IPM-side multipliers are continuous interior-point estimates whereas the active set is what tells the next QP which rows to keep in the KKT block from iteration zero.

Python: carry across solves

prob.add_option("algorithm", "active-set-sqp")

ws = None
for k in range(horizon_steps):
    # ... user code updates the parameter inside MyNlp ...
    x, info = prob.solve(x0=x_prev, working_set=ws)
    ws = info["working_set"]   # (bounds_int8_array, constraints_int8_array)
    x_prev = x

The status codes in the working_set tuple use these values (int8 arrays):

0 = Inactive
1 = AtLower   (active at lower bound)
2 = AtUpper   (active at upper bound)
3 = Fixed (variable) or Equality (constraint)
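When logging a working set it can help to print names instead of raw codes. The following is a minimal sketch that relies only on the integer codes listed above; `describe_working_set` and the two lookup tables are hypothetical names for illustration, not part of the pounce API.

```python
# Hypothetical helper (not a pounce API): map the int8 status codes
# documented above to readable names.
BOUND_NAMES = {0: "Inactive", 1: "AtLower", 2: "AtUpper", 3: "Fixed"}
CONS_NAMES = {0: "Inactive", 1: "AtLower", 2: "AtUpper", 3: "Equality"}


def describe_working_set(bounds, constraints):
    """Return (bound_names, constraint_names) for two code sequences."""
    return (
        [BOUND_NAMES[int(code)] for code in bounds],
        [CONS_NAMES[int(code)] for code in constraints],
    )


# Works on plain lists as well as on int8 arrays, since each code is cast to int.
print(describe_working_set([0, 0, 1], [3]))
```

An unknown code raises `KeyError`, which is usually what you want when a working set has been corrupted or comes from a different version.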

C: carry across solves

IpoptBoundStatus *bounds = malloc(n * sizeof *bounds);
IpoptConsStatus  *cons   = malloc(m * sizeof *cons);

for (int k = 0; k < horizon_steps; k++) {
    /* ... user code updates the parameter ... */
    if (k == 0) {
        IpoptSolve(prob, x, NULL, &obj, NULL, NULL, NULL, NULL);
    } else {
        IpoptSolveWarmStart(prob, x, NULL, &obj, NULL, NULL, NULL,
                            bounds, cons,        /* in */
                            bounds, cons,        /* out, may alias */
                            NULL);
    }
    /* read the WS out for next iteration */
    IpoptGetWorkingSet(prob, bounds, cons);
}

GAMS: working set persists automatically

The GAMS solver link reads variable and equation marginals (x.m, con.m) at the top of each pouCallSolver invocation and reconstructs the working set from them. No solve-statement gymnastics required — every subsequent solve automatically warm-starts from the previous solution’s marginals.

Use the state-file option (§7.4(b) of the design note) for the precision-critical case where the marginal signs are ambiguous (degenerate active set):

* pounce.opt
algorithm        active-set-sqp
sqp_state_file   .mymodel.pou-ws

The link writes a small binary blob after each solve and reads it at the start of the next, keyed by a checksum over (n, m, x_l, x_u, g_l, g_u) so structural changes invalidate the file cleanly.
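The idea of keying the state file on the problem structure can be sketched as follows. This is an illustration only: the source does not say which checksum or byte layout the GAMS link uses, so SHA-256, the little-endian packing, and the name `structure_key` are all assumptions.

```python
import hashlib
import struct


def structure_key(n, m, x_l, x_u, g_l, g_u):
    """Hash (n, m, x_l, x_u, g_l, g_u) into a hex digest.

    Assumed layout (not pounce's actual format): two u64 sizes, then each
    vector as a u64 length followed by its little-endian f64 entries.
    """
    h = hashlib.sha256()
    h.update(struct.pack("<QQ", n, m))
    for vec in (x_l, x_u, g_l, g_u):
        h.update(struct.pack("<Q", len(vec)))
        h.update(struct.pack(f"<{len(vec)}d", *vec))
    return h.hexdigest()


key_a = structure_key(2, 1, [0.0, 0.0], [10.0, 10.0], [1.0], [1.0])
key_b = structure_key(2, 1, [0.0, 0.0], [10.0, 10.0], [1.0], [2.0])
print(key_a == key_b)  # a changed constraint bound yields a different key
```

A stored working set would be reused only when the key computed for the new model matches the key saved alongside it.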

4. Worked example: parametric continuation

The headline use case. You have an NLP

min f(x; p) s.t. g(x; p) = 0, x ≥ 0

and you want to trace x*(p) as p sweeps a path. The pounce playbook is:

1. Solve at p₀ with the IPM (better cold-start convergence than the SQP elastic phase).
2. Predictor: ask pounce_sensitivity::SensSolve for Δx ≈ ∂x*/∂p · Δp at p₀.
3. Classify the active set at the converged IPM iterate via pounce.classify_working_set(...).
4. Update p in your TNLP. Apply x* + Δx as the predictor.
5. Corrector: switch algorithm to active-set-sqp, install the working set + predictor as warm start, solve.
6. The corrector lands on x*(p₀ + Δp) in 0–3 outer iterations for small Δp.

Python (full code)

import numpy as np
import pounce


class ParamNlp:
    """min ½‖x − p‖²  s.t.  sum(x) = 1, x ≥ 0  with parameter p."""

    def __init__(self):
        self.p = np.zeros(3)

    def set_p(self, p):
        self.p = np.asarray(p, dtype=float)

    def objective(self, x):
        d = x - self.p
        return 0.5 * float(d @ d)

    def gradient(self, x):
        return x - self.p

    def constraints(self, x):
        return np.array([float(x.sum())])

    def jacobianstructure(self):
        return (np.zeros(3, dtype=np.int64), np.arange(3, dtype=np.int64))

    def jacobian(self, x):
        return np.ones(3)

    def hessianstructure(self):
        idx = np.arange(3, dtype=np.int64)
        return (idx, idx)

    def hessian(self, x, lagrange, obj_factor):
        return np.full(3, obj_factor)


nlp = ParamNlp()


def build_problem(algorithm):
    p = pounce.Problem(
        n=3, m=1, problem_obj=nlp,
        lb=[0.0] * 3, ub=[1e20] * 3,
        cl=[1.0], cu=[1.0],
    )
    p.add_option("algorithm", algorithm)
    p.add_option("print_level", 0)
    return p


# --- Step 1: cold IPM solve at p₀ ---
nlp.set_p([0.5, 0.4, -0.1])
x_ipm, info_ipm = build_problem("interior-point").solve(x0=np.full(3, 1.0 / 3))
print(f"IPM   converged: x = {x_ipm}, f = {info_ipm['obj_val']:.4f}")

# --- Step 3: classify the active set at x_ipm ---
ws = pounce.classify_working_set(
    x=x_ipm,
    x_l=np.array([0.0, 0.0, 0.0]),
    x_u=np.array([1e20, 1e20, 1e20]),
    g=info_ipm["g"],
    g_l=np.array([1.0]),
    g_u=np.array([1.0]),
    lambda_g=info_ipm["mult_g"],
    z_l=info_ipm["mult_x_L"],
    z_u=info_ipm["mult_x_U"],
    m_eq=1,
)
bounds, cons = ws
print(f"      working set: bounds = {bounds.tolist()}, cons = {cons.tolist()}")

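# --- Steps 4-5: perturb p and run the SQP corrector ---
# (Step 2, the sensitivity predictor, is skipped in this demo: x_ipm seeds
# the corrector directly.)
nlp.set_p([0.52, 0.39, -0.05])     # Δp = (0.02, -0.01, 0.05)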
x_sqp, info_sqp = build_problem("active-set-sqp").solve(
    x0=x_ipm, working_set=ws,
)
print(f"SQP   corrector: x = {x_sqp}, f = {info_sqp['obj_val']:.4f}")
print(f"      info['working_set'] for the next step: {info_sqp['working_set']}")

Expected output (deterministic, run live before this tutorial was checked in):

IPM   converged: x = [5.5e-01 4.5e-01 1.2e-07], f = 0.0075
      working set: bounds = [0, 0, 1], cons = [3]
SQP   corrector: x = [0.565 0.435 0.   ], f = 0.0033
      info['working_set'] for the next step: (array([0, 0, 1], dtype=int8), array([3], dtype=int8))

The IPM lands x₃ essentially-zero (1.2e-7) — it’s an IPM artifact of the central-path push; for classify_working_set’s default primal_tol = 1e-6 that’s already inside the “at the bound” band, so bounds[2] = 1 (AtLower). The SQP corrector hits x₃ = 0 exactly because the working set tells it x₃ is an active lower bound from iteration zero — no central-path detour.
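To make the role of the primal tolerance concrete, here is a deliberately simplified sketch of a primal-distance classifier for variable bounds. It is not pounce's `classify_working_set`: the function name `classify_bounds`, its signature, and the tie-breaking rule are assumptions, and only the `primal_tol = 1e-6` default and the status codes come from the text above.

```python
INACTIVE, AT_LOWER, AT_UPPER, FIXED = 0, 1, 2, 3


def classify_bounds(x, x_l, x_u, z_l, z_u, primal_tol=1e-6):
    """Tag each variable bound by its distance to the bounds.

    Simplification: multipliers are only used to break ties when both
    bounds lie within primal_tol of x.
    """
    status = []
    for xi, lo, hi, zl, zu in zip(x, x_l, x_u, z_l, z_u):
        near_lo = xi - lo <= primal_tol
        near_hi = hi - xi <= primal_tol
        if lo == hi:
            status.append(FIXED)
        elif near_lo and near_hi:
            status.append(AT_LOWER if zl >= zu else AT_UPPER)
        elif near_lo:
            status.append(AT_LOWER)
        elif near_hi:
            status.append(AT_UPPER)
        else:
            status.append(INACTIVE)
    return status


# Values shaped like the IPM iterate above; the multipliers are illustrative.
x = [0.55, 0.45, 1.2e-7]
print(classify_bounds(x, [0.0] * 3, [1e20] * 3, [0.0, 0.0, 0.05], [0.0] * 3))
```

Tightening `primal_tol` below the IPM residual (for example to `1e-8`) would leave the third bound tagged inactive, which is why the tolerance needs to be looser than the accuracy of the interior-point iterate.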

bounds = [0, 0, 1] means x[0], x[1] inactive (interior), x[2] at its lower bound. cons = [3] means the sum constraint is binding (equality).
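The numbers in the expected output can be cross-checked without any solver, because this particular NLP is the Euclidean projection of p onto the probability simplex. The sketch below uses the standard sort-and-threshold projection; `project_to_simplex` and `objective` are illustrative names, not pounce functions.

```python
def project_to_simplex(p):
    """Euclidean projection of p onto {x : sum(x) = 1, x >= 0}."""
    u = sorted(p, reverse=True)
    running, tau = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        running += u_k
        candidate = (running - 1.0) / k
        if u_k - candidate > 0.0:
            tau = candidate  # the k largest entries stay positive
    return [max(p_i - tau, 0.0) for p_i in p]


def objective(x, p):
    """0.5 * ||x - p||^2, the objective of ParamNlp."""
    return 0.5 * sum((x_i - p_i) ** 2 for x_i, p_i in zip(x, p))


for p in ([0.5, 0.4, -0.1], [0.52, 0.39, -0.05]):
    x = project_to_simplex(p)
    print([round(v, 3) for v in x], round(objective(x, p), 4))
```

Both parameter values give a solution with the third component exactly zero, which is why the working set is unchanged between the IPM solve and the SQP corrector.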

Running it

Save as parametric_demo.py and run:

python parametric_demo.py

For an executable variant see python/examples/sqp_warm_start_mpc.py (a 20-step parametric sweep) and the Jupyter notebook in python/notebooks/06_sqp_parametric_continuation.ipynb.

5. Choosing a globalization

sqp_globalization = filter (the default) follows Fletcher-Leyffer 2002 — a Pareto-frontier filter on (constraint violation, objective). Robust, no penalty parameter to tune, recommended for general nonlinear NLPs.
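The Pareto-style acceptance test can be illustrated with a few lines of Python. This is a bare-bones sketch under stated assumptions: it omits the sufficient-decrease margins and switching conditions that practical filter methods use, and `is_acceptable` / `add_to_filter` are hypothetical names rather than pounce's FilterLsAcceptor interface.

```python
def is_acceptable(trial, filter_entries):
    """A (violation, objective) pair is acceptable when no stored pair
    is at least as good in both measures."""
    theta, f = trial
    return all(theta < th_k or f < f_k for th_k, f_k in filter_entries)


def add_to_filter(trial, filter_entries):
    """Insert the trial pair and drop every entry it dominates."""
    theta, f = trial
    kept = [(th_k, f_k) for th_k, f_k in filter_entries
            if th_k < theta or f_k < f]
    kept.append(trial)
    return kept


flt = [(1.0, 5.0), (0.1, 8.0)]
print(is_acceptable((0.5, 6.0), flt))   # improves on each entry in one measure
print(is_acceptable((1.5, 6.0), flt))   # dominated by (1.0, 5.0)
print(add_to_filter((0.05, 4.0), flt))  # dominates both stored entries
```

No penalty parameter appears anywhere: a step is judged on violation and objective separately, which is the property the paragraph above refers to.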

sqp_globalization = l1-elastic is the SNOPT-style Han-Powell merit φ(x; ν) = f(x) + ν · violation(x) with adaptive ν. The new sqp_l1_penalty_safety (default 0.1) and sqp_l1_penalty_max (default 1e10) options control the ν update:

ν ← clamp(max(ν, ‖λ_qp‖_∞ + sqp_l1_penalty_safety), 0, sqp_l1_penalty_max)

Use l1-elastic when you want behaviour close to SNOPT for comparison studies, or when the filter is rejecting too many trial steps on a problem where the merit decreases steadily.
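The ν update rule above is short enough to write out directly. The sketch below mirrors the formula with the documented defaults; the function name `update_l1_penalty` is illustrative and not a pounce entry point.

```python
def update_l1_penalty(nu, qp_multipliers, safety=0.1, nu_max=1e10):
    """nu <- clamp(max(nu, ||lambda_qp||_inf + safety), 0, nu_max)."""
    lam_inf = max((abs(lam) for lam in qp_multipliers), default=0.0)
    return min(max(max(nu, lam_inf + safety), 0.0), nu_max)


print(update_l1_penalty(1.0, [0.3, -0.5]))   # multipliers small: nu is kept
print(update_l1_penalty(1.0, [4.0, -7.5]))   # nu grows past the largest multiplier
print(update_l1_penalty(1.0, [1e12]))        # clamped at sqp_l1_penalty_max
```

The update never decreases ν, so a single multiplier spike raises the penalty for the rest of the solve; that is the situation the `sqp_l1_penalty_max` cap guards against.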

6. Choosing a Hessian source

Source	When to use
`exact`	NLP provides `eval_h`; the QP’s inertia-control handles indefinite ∇²L
`damped-bfgs`	Dense `n×n` Powell-damped BFGS; guaranteed PSD; n ≤ a few hundred
`lbfgs`	Limited-memory BFGS with `sqp_lbfgs_max_history` pairs; large `n`

The default exact is fastest when reliable. Switch to damped-bfgs for ill-scaled nonconvex NLPs where the QP solver’s inertia retries dominate the iteration cost. Use lbfgs only when the dense n² BFGS storage is the bottleneck (n ≥ ~1000).

7. Pitfalls

Calling Problem.solve(working_set=…) with a stale working set whose dimensions changed. Validated and rejected with ValueError. Pass the WS only when it came from a solve of the same problem shape.
Mixing IPM and SQP across solves without resetting state. The IPM path ignores set_sqp_warm_start, and the SQP path ignores the IPM warm-start options (warm_start_init_point etc.). Each path’s warm-start input is path-local.
Degenerate active set after IPM convergence. The multiplier-sign + primal-distance heuristic in classify_working_set is lossy at degenerate optima — same trade-off CONOPT/IPOPT/KNITRO have under GAMS. The first QP step in the SQP corrector re-classifies any wrongly-tagged rows, so correctness is preserved; only the iteration count may be slightly higher than ideal.
L1Elastic with a hard cap. If your problem’s QP multipliers spike (poorly scaled constraints), bump sqp_l1_penalty_max or rescale.

8. Where the code lives

Concern	File
SQP outer loop	`crates/pounce-algorithm/src/sqp/sqp_alg.rs`
QP subproblem solver	`crates/pounce-qp/src/solver.rs`
Working set type	`crates/pounce-qp/src/working_set.rs`
Classifier	`crates/pounce-algorithm/src/sqp/warm_start.rs`
IpoptApplication hooks	`crates/pounce-algorithm/src/application.rs`
C ABI	`crates/pounce-cinterface/src/lib.rs` + `include/pounce.h`
Python binding	`crates/pounce-py/src/{problem,warm_start}.rs`
GAMS link	`gams/gams_pounce.c`
Design rationale	Design Note

9. Reading list

Hock, Schittkowski (1981) Test Examples for Nonlinear Programming Codes — the in-repo HS subset reference.
Nocedal, Wright (2006) Numerical Optimization, ch. 18 — SQP fundamentals.
Fletcher, Leyffer (2002) — filter line search.
Han (1977) / Powell (1978) — l1-merit and damped-BFGS update.
Wächter, Biegler (2006) — pounce’s IPM heritage.
Gill, Murray, Saunders (2002) — SNOPT / l1-elastic phase 1.
Forsgren, Gill, Wright (2002) — IPM vs SQP comparison.
Kirches (2011) — parametric active-set SQP.

Design note — Active-set SQP for warm-started NLP sequences

Status: implemented. This note was originally the research → plan half of the research → plan → implement workflow that operationalized the C1 active-set SQP entry of the future-work roadmap (dev-notes/research/future-work-roadmap.md, §3.2, §5 Phase 5). The driver (Phase 5b/5c) has since landed and is wired through the Rust API, C ABI, Python bindings, and the GAMS link; see the user tutorial at Active-Set SQP & Warm Starts. The note is retained as design rationale and pins each algorithmic choice to the literature.

The target is a state-of-the-art sparse active-set SQP solver that (a) reuses pounce’s NLP / derivative / sparse-linalg foundation, (b) warm-starts on the working set across solves (not just primal-dual seeds), and (c) integrates symmetrically across the Rust API, C ABI, Python bindings, and GAMS link.

1. What this is

A sequential quadratic programming algorithm with a sparse parametric active-set QP subproblem — a second solver inside pounce sharing the model / derivative / linalg foundation but with its own iteration skeleton — designed for warm-started sequences of related NLPs:

Model predictive control (MPC): re-solve a similar NLP every control step. The horizon shifts by one stage; the active set rarely changes.
MINLP branch-and-bound: thousands of node relaxations differing by a few bound changes. Bounds-only active-set updates dominate.
Parametric homotopy / continuation: trace the solution along a parameter path. Predictor (sensitivity) + corrector (SQP step from the predicted point) reuses the working set across path steps.

The motivation is the warm-start gap in interior-point methods: the barrier pushes iterates into the interior, so a near-optimal point from a previous solve, which sits close to its active bounds, cannot be exploited directly. Active-set methods, by contrast, carry the working set across solves; if the optimal active set is unchanged, the next solve converges in O(1) QP iterations. This is the documented reason qpOASES, SNOPT, and filterSQP dominate in MPC.

2. The architectural mismatch (read this first)

IpoptData / IpoptCalculatedQuantities are shaped around primal-dual interior-point variables: slacks s, barrier μ, bound multipliers z_l/z_u, complementarity quantities. Active-set SQP has none of these: it carries (x, λ, 𝒲), where 𝒲 is the working set (the indices of currently active inequalities and bounds), and globalizes on a merit function or filter without a barrier at all.

This is therefore a new AlgorithmStrategy end to end — a Tier 3 addition in the roadmap’s tier ladder — and not an edit to the existing loop. The existing IPM (IpoptAlgorithm::optimize in crates/pounce-algorithm/src/ipopt_alg.rs) is left untouched and remains the default solver. Active-set SQP is opt-in via a new top-level algorithm option (§7.1), parallel to the existing linear_solver (Ma57/Feral) and mu_strategy (Monotone/Adaptive) choices in alg_builder.rs:54-63.

The dual-skeleton commitment is the cost; the warm-start strength is the payoff.

3. What pounce already has that SQP can reuse

Need	Existing component	Location
NLP model trait (`f`, `g`, `∇f`, `J`, `∇²ℒ`)	`IpoptNlp` / `TNLP`	`crates/pounce-algorithm/src/ipopt_nlp.rs`, `crates/pounce-nlp/`
`.nl` and CUTEst frontends	`pounce-cli`, `benchmarks/cutest`	unchanged
Sparse storage (triplet + CSC)	`SymTMatrix`, triplet→CSR converter	`crates/pounce-linalg/src/triplet.rs:374-405`, `triplet_convert.rs:40`
Sparse symmetric LDLᵀ with inertia	`SparseSymLinearSolverInterface` (FERAL, MA57)	`crates/pounce-linsol/src/sparse_sym_iface.rs:42-84`
Multi-RHS solve sharing one factor	`t_sym_solver.rs::multi_solve`	`crates/pounce-linsol/src/t_sym_solver.rs:174`
Inertia reporting (eigenvalue counts)	`SparseSymLinearSolverInterface::provides_inertia`	`crates/pounce-linsol/src/sparse_sym_iface.rs:84`
Limited-memory BFGS / SR1	`hess/quasi_newton.rs`	reused for SQP Hessian approximation
Filter acceptor	`line_search/filter_ls_acceptor.rs`	dominance test reusable for SQP filter
Convergence-check trait	`conv_check::trait::ConvCheck`	reused; KKT-error formula is identical
Option / journalist / iteration-output	`pounce-common` + `output/`	reused; new fields for working-set events
Warm-start primal/dual seeds from `TNLP`	`init/warm_start.rs:60-100`	extended (§6) with working-set state
Parametric sensitivity (sIPOPT port)	`pounce-sensitivity`	provides predictor for parametric-homotopy use case

The interfaces below pounce-nlp are stable enough that SQP inherits the full derivative and linalg layer unchanged. Everything new lives at the algorithm / solver level.

4. The algorithm — fully pinned

This section pins each algorithmic choice to literature. There is no remaining “decide during implementation” discretion at the level of algorithm class; only tuning constants are open.

4.1 Outer SQP loop — filter line search with Maratos correction

The outer loop is the filter SQP of Fletcher-Leyffer-Toint, with the Wächter-Biegler second-order correction (Maratos effect) and watchdog mechanism already implemented in line_search/. Filter because:

It avoids the penalty-parameter tuning of l1-merit (Han-Powell).
It reuses pounce’s existing FilterLsAcceptor (line_search/filter_ls_acceptor.rs) without modification — the dominance test on (‖c‖, f) is identical.
It is the globalization in filterSQP (Fletcher-Leyffer) and WORHP, the two open-source SQP solvers that compete with SNOPT on CUTEst, and the documented choice in Nocedal-Wright §18.10.

Alternative offered as opt-in: l1-elastic merit (the SNOPT choice), via a sqp_globalization option. l1 is simpler to reason about under MPCC-like degeneracies; filter is faster on smooth nonconvex NLPs in published benchmarks (Fletcher-Leyffer-Toint 2002 §6; Wächter-Biegler 2006 Tab. 3-5).

References:

Fletcher, Leyffer, “Nonlinear programming without a penalty function”, Math. Prog. 91 (2002), 239–269.
Fletcher, Leyffer, Toint, “On the global convergence of a filter-SQP algorithm”, SIAM J. Optim. 13 (2002), 44–59.
Wächter, Biegler, “Line search filter methods for nonlinear programming: Motivation and global convergence”, SIAM J. Optim. 16 (2005), 1–31.
Wächter, Biegler, “On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming”, Math. Prog. 106 (2006) — pounce’s existing filter implementation.

4.2 QP subproblem — sparse Schur-complement parametric active-set

The QP subproblem solver is a sparse parametric active-set method with Schur-complement basis updates, the lineage of qpOASES extended to sparse Hessian and Jacobian. This is the SOTA choice for SQP subproblems in industrial MPC and for parametric / homotopy use: it is the only active-set QP family in the literature with proven cross-solve warm-start performance in the sparse regime.

Why this family (vs alternatives):

Family	Sparse?	Indefinite H?	Parametric WS warm-start?	Reference
Goldfarb-Idnani (1983)	no (dense)	no (convex only)	partial	Goldfarb-Idnani 1983
Range-space (SQOPT)	partial	yes	partial	Gill-Murray-Saunders 2008
Null-space (Gould-Hribar-Nocedal)	partial	yes	partial	Gould-Hribar-Nocedal 2001
qpOASES (online active set)	no (dense)	yes	yes (homotopy)	Ferreau et al. 2014
Sparse Schur-complement parametric	yes	yes	yes	Kirches 2011, Janka et al. 2016
OSQP (ADMM)	yes	no (convex only)	seed only	Stellato et al. 2020
PIQP / HPIPM (interior-point)	yes	yes	seed only	Schwan 2023, Frison-Diehl 2020

Only the sparse Schur-complement parametric method covers all three columns. It is what is needed.

Algorithm sketch. At any iterate the QP solver maintains a factorization of the base KKT matrix for some “base” working set 𝒲_base:

        ┌ H   Aᵀ_𝒲 ┐
K_𝒲 =  │            │,    LDLᵀ via pounce-linsol (FERAL/MA57)
        └ A_𝒲  0   ┘

When the working set changes (a constraint is added or dropped during the homotopy), the new system is not refactorized. Instead, the change is absorbed by a Schur-complement update: the modified system has the form K_𝒲 + UVᵀ (low-rank correction), and solves against the modified factor are obtained by the Schur-complement formula

(K + UVᵀ)⁻¹ b = K⁻¹b − K⁻¹U (I + Vᵀ K⁻¹ U)⁻¹ Vᵀ K⁻¹ b

so each active-set change costs one rank-1 update of the dense Schur complement S = I + Vᵀ K⁻¹ U plus one back-solve against the cached sparse factor. This is the Bartels-Golub-Reid principle from sparse simplex adapted to symmetric QP. When S grows too large (default: 50 updates) or its condition number degrades, a fresh sparse refactorization of K_𝒲 resets the cycle.
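For a single rank-1 change the Schur complement S is a scalar, and the formula above reduces to two solves against the unmodified K. The sketch below checks this on a small dense example; the dense Gaussian elimination is only a stand-in for the cached sparse LDLᵀ factor, and `solve` / `rank_one_solve` are illustrative names, not pounce-qp functions.

```python
def solve(K, b):
    """Dense Gaussian elimination with partial pivoting (stand-in for
    back-solves against a cached factorization of K)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(K, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def rank_one_solve(K, u, v, b):
    """Solve (K + u v^T) x = b using only solves against K."""
    kb = solve(K, b)                                   # K^{-1} b
    ku = solve(K, u)                                   # K^{-1} u
    s = 1.0 + sum(vi * ki for vi, ki in zip(v, ku))    # S = 1 + v^T K^{-1} u
    coeff = sum(vi * ki for vi, ki in zip(v, kb)) / s  # S^{-1} v^T K^{-1} b
    return [kbi - coeff * kui for kbi, kui in zip(kb, ku)]


K = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
u, v, b = [1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 2.0, 3.0]
K_mod = [[K[i][j] + u[i] * v[j] for j in range(3)] for i in range(3)]
print(rank_one_solve(K, u, v, b))
print(solve(K_mod, b))  # direct solve of the modified system, for comparison
```

With several accumulated changes, U and V gain one column each and S becomes a small dense matrix, but the pattern of reusing the factor of K is the same.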

The homotopy itself follows qpOASES: between two QPs (H₀, g₀, A₀, b₀) and (H₁, g₁, A₁, b₁), the solver traces the parametric path (H_t, g_t, A_t, b_t) = (1-t)·QP₀ + t·QP₁ for t ∈ [0, 1], jumping the working set at each t where a multiplier hits zero or a constraint hits its bound. If the active set is identical at the two endpoints (the warm-start sweet spot), the homotopy completes with zero working-set changes.
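The working-set jumps along the path are found by a ratio test. The following sketch is a simplified illustration, assuming lower bounds only and assuming the primal and dual directions `dx` and `dlam` are already known for the current segment (in a real parametric active-set method they come from KKT solves); `next_breakpoint` is a hypothetical name.

```python
def next_breakpoint(x, dx, lower, lam, dlam, active, tol=1e-12):
    """Return (t, index, event) for the first working-set change along
    x + t*dx, lam + t*dlam with t in [0, 1].

    event is "add" (an inactive bound is hit), "drop" (an active
    multiplier reaches zero), or None if the full step t = 1 is possible.
    """
    best = (1.0, None, None)
    for i, is_active in enumerate(active):
        if is_active and dlam[i] < -tol:
            t, event = -lam[i] / dlam[i], "drop"
        elif not is_active and dx[i] < -tol:
            t, event = (lower[i] - x[i]) / dx[i], "add"
        else:
            continue
        if t < best[0]:
            best = (t, i, event)
    return best


# Variable 1 moves toward its lower bound and reaches it halfway along the path.
print(next_breakpoint([0.5, 0.2], [0.1, -0.4], [0.0, 0.0],
                      [0.0, 0.0], [0.0, 0.0], [False, False]))
```

When the function returns `(1.0, None, None)` the segment completes without touching the working set, which corresponds to the warm-start sweet spot described above.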

Why Schur-complement, not direct LDLᵀ update? Direct sparse LDLᵀ factor updates (the symbolic+numeric reanalysis required when a constraint row is added or dropped) are known to be unstable under many updates because fill-in is not bounded (Davis 2006 §11). The Schur-complement / Bartels-Golub-Reid approach bounds the asymptotic update cost and is the technique that production sparse simplex (CPLEX, Gurobi, HiGHS) and SOTA sparse parametric QP (Kirches’s qpDUNES, the Janka parOSQP lineage) use.

References:

Ferreau, Kirches, Potschka, Bock, Diehl, “qpOASES: a parametric active-set algorithm for quadratic programming”, Math. Prog. Comp. 6 (2014), 327–363 — the dense reference algorithm.
Kirches, Fast Numerical Methods for Mixed-Integer Nonlinear Model-Predictive Control, Vieweg+Teubner (2011), Ch. 5–7 — the sparse Schur-complement extension; the canonical reference.
Janka, Kirches, Sager, Schlöder, “An SR1/BFGS SQP algorithm for nonconvex nonlinear programs with block-diagonal Hessian matrix”, Math. Prog. Comp. 8 (2016), 435–459 — block-sparse extension.
Kirches, Potschka, Bock, Sager, “A parametric active set method for quadratic programs with vanishing constraints”, Pacific J. Optim. 9 (2013) — MPCC structure handling, relevant to C4 reuse.
Bartels, “A stabilization of the simplex method”, Numer. Math. 16 (1971); Reid, “A sparsity-exploiting variant of the Bartels-Golub decomposition”, Math. Prog. 24 (1982) — the Schur-complement basis-update lineage.
Eldersveld, Saunders, “A block-LU update for large-scale linear programming”, SIAM J. Matrix Anal. Appl. 13 (1992).
Gill, Murray, Saunders, “SNOPT: An SQP algorithm for large-scale constrained optimization”, SIAM Rev. 47 (2005) — the range-space active-set used inside SNOPT; competing family.
Davis, Direct Methods for Sparse Linear Systems, SIAM (2006) — fill-in and refactor cost analysis.

4.3 Phase-1 / initial feasibility — l1 elastic mode

Active-set QP requires a feasible starting working set. The l1-elastic mode (Gill-Murray-Saunders, SQOPT) reformulates the infeasibility problem inside the same QP: each constraint gets a nonnegative elastic slack with a large linear cost γ, the working set starts empty, and elastic slacks are driven to zero as the homotopy proceeds. If the original QP is feasible the elastic slacks vanish at the solution; if infeasible the residual elastic slacks certify the minimal infeasibility.

This is preferred over the Big-M approach used in dense qpOASES because it preserves sparsity (the cost vector grows by m entries, the Jacobian by m columns; no large constants in H).

References:

Gill, Murray, Saunders, User’s Guide for SQOPT 7.7, Stanford SOL Report (2008) — elastic-mode reference implementation.
Friedlander, Saunders, “A globally convergent linearly constrained Lagrangian method for nonlinear optimization”, SIAM J. Optim. 15 (2005) — elastic mode as feasibility restoration.

4.4 Anti-cycling — EXPAND

Degeneracy in the working set (multiple constraints active with linearly dependent rows, or zero step lengths) can cause cycling in naive active-set methods. The SOTA anti-cycling rule is EXPAND (Gill-Murray-Saunders-Wright 1989): a small primal perturbation is introduced and grown over iterations so that the step length is always strictly positive, with periodic resets.

Bland’s rule (1977) and Wolfe’s rule (1963) are alternatives, but EXPAND is faster in practice and is the rule used by SNOPT, MINOS, LANCELOT, and qpOASES.

References:

Gill, Murray, Saunders, Wright, “A practical anti-cycling procedure for linearly constrained optimization”, Math. Prog. 45 (1989), 437–474.

4.5 Indefinite reduced Hessian — inertia control + projected modified Cholesky

For nonconvex NLP subproblems the Hessian of the Lagrangian may be indefinite, yet the QP must still produce a meaningful descent direction. The scheme has two layers, both standard:

Detect via inertia of the LDLᵀ factor of K_𝒲. pounce-linsol already exposes inertia via provides_inertia() / number_of_neg_evals (sparse_sym_iface.rs:84). The correct inertia for the KKT matrix of an SQP subproblem with n variables and m working constraints is (n, m, 0), that is, n positive, m negative, and no zero eigenvalues; any deviation flags reduced-Hessian indefiniteness.
Correct via projected modified Cholesky on the reduced Hessian: when wrong inertia is detected, shift H ← H + δI with δ chosen by the same inertia-correction logic pounce already uses in kkt/perturbation_handler.rs:141-356. This restores correct inertia at minimal modification.
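The detect-then-shift loop can be sketched on a tiny dense example. Everything here is illustrative: the unpivoted LDLᵀ is only suitable for small well-behaved matrices, the shift schedule (`delta0`, `growth`) is invented for the demo and is not the one in perturbation_handler.rs, and the function names are hypothetical. The target is the standard KKT inertia of n positive and m negative eigenvalues.

```python
def ldl_inertia(K, tol=1e-12):
    """Inertia (pos, neg, 0) from the pivots of an unpivoted LDL^T,
    or None if a pivot is numerically zero."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = K[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        if abs(d[j]) <= tol:
            return None
        for i in range(j + 1, n):
            acc = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (K[i][j] - acc) / d[j]
    pos = sum(1 for dj in d if dj > 0)
    return (pos, n - pos, 0)


def kkt(H, A, delta):
    """Assemble [[H + delta*I, A^T], [A, 0]] as a dense list of lists."""
    n, m = len(H), len(A)
    K = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            K[i][j] = H[i][j] + (delta if i == j else 0.0)
    for r in range(m):
        for j in range(n):
            K[n + r][j] = K[j][n + r] = A[r][j]
    return K


def correct_inertia(H, A, delta0=0.5, growth=4.0, max_tries=20):
    """Grow delta until the KKT inertia is (n, m, 0); return that delta."""
    n, m = len(H), len(A)
    delta = 0.0
    for _ in range(max_tries):
        if ldl_inertia(kkt(H, A, delta)) == (n, m, 0):
            return delta
        delta = delta0 if delta == 0.0 else delta * growth
    raise RuntimeError("inertia correction failed")


H = [[1.0, 0.0], [0.0, -1.0]]   # negative curvature in the null space of A
A = [[1.0, 0.0]]
print(ldl_inertia(kkt(H, A, 0.0)), correct_inertia(H, A))
```

In the example the unshifted matrix has one negative eigenvalue too many, and the loop stops at the first shift in the schedule that restores the expected counts.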

References:

Gould, “On modified factorizations for large-scale linearly constrained optimization”, SIAM J. Optim. 9 (1999), 1041–1063.
Gould, Hribar, Nocedal, “On the solution of equality constrained quadratic programming problems arising in optimization”, SIAM J. Sci. Comput. 23 (2001), 1376–1395 — the inertia-correction prescription for SQP subproblems.
Forsgren, “Inertia-controlling factorizations for optimization algorithms”, Appl. Num. Math. 43 (2002), 91–107.

4.6 Hessian approximation — exact, damped BFGS, L-BFGS

The SQP outer loop accepts three Hessian sources via the existing HessianUpdater trait (hess/r#trait.rs):

Exact ∇²ℒ from the NLP (default when available). Indefinite on nonconvex problems; handled by §4.5.
Damped BFGS (Powell 1978): full dense BFGS with Powell’s damping rule, guaranteed PSD. Default fallback when exact Hessian is unavailable, for problems where n is small.
Limited-memory BFGS / SR1 (Liu-Nocedal 1989, Byrd-Nocedal-Schnabel 1994): the existing pounce L-BFGS implementation. Default for large n. SR1 is the indefinite-Hessian variant preferred in Janka 2016 for nonconvex SQP block-sparse problems.

The QP subproblem absorbs whichever Hessian is supplied; only the indefinite-handling path (§4.5) differs.
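Powell's damping rule is compact enough to show in full. The sketch below follows the textbook form of the damped update (as in Nocedal-Wright, ch. 18) on dense lists; it is not the code behind the HessianUpdater trait, and `damped_bfgs_update` is an illustrative name. It assumes B is positive definite and s is nonzero.

```python
def damped_bfgs_update(B, s, y):
    """One Powell-damped BFGS update of the dense matrix B."""
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    sBs = sum(si * bi for si, bi in zip(s, Bs))
    sy = sum(si * yi for si, yi in zip(s, y))
    # Damping: blend y with B s whenever the curvature s^T y is too small.
    theta = 1.0 if sy >= 0.2 * sBs else 0.8 * sBs / (sBs - sy)
    r = [theta * yi + (1.0 - theta) * bi for yi, bi in zip(y, Bs)]
    sr = sum(si * ri for si, ri in zip(s, r))   # >= 0.2 * sBs by construction
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + r[i] * r[j] / sr
             for j in range(n)] for i in range(n)]


B = [[1.0, 0.0], [0.0, 1.0]]
s, y = [1.0, 0.0], [-1.0, 0.0]   # negative curvature along s
print(damped_bfgs_update(B, s, y))
```

An undamped BFGS update would break down on this pair because sᵀy is negative; the damped version replaces y with a blend r for which sᵀr stays positive, so the updated matrix remains positive definite.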

References:

Powell, “A fast algorithm for nonlinearly constrained optimization calculations”, in Numerical Analysis Dundee 1977 (1978) — damped BFGS for SQP.
Liu, Nocedal, “On the limited memory BFGS method for large scale optimization”, Math. Prog. 45 (1989), 503–528.
Byrd, Nocedal, Schnabel, “Representations of quasi-Newton matrices and their use in limited memory methods”, Math. Prog. 63 (1994), 129–156.

Iterative refinement. Each QP solve applies a single iteration of fixed-precision iterative refinement, using the cached factorization. This is standard practice; the pounce-feral and MA57 backends already implement it (t_sym_solver.rs::multi_solve applies refinement when configured).

References:

Wilkinson, The Algebraic Eigenvalue Problem, OUP (1965) — original.
Higham, Accuracy and Stability of Numerical Algorithms (2nd ed., SIAM 2002), §12.

5. New crate `pounce-qp` — concrete types

Standalone crate. Depends on pounce-linalg and pounce-linsol; depended on by pounce-algorithm (for SQP), pounce-sensitivity (for the parametric corrector in Phase 5c+), optionally pounce-presolve (for tighter feasibility checks in future work).

5.1 Types

All types are sparse from the start, using the existing pounce-linalg storage conventions (SymTMatrix triplet → CSC for the symmetric Hessian; GenTMatrix for the Jacobian).

// crates/pounce-qp/src/problem.rs

use pounce_linalg::triplet::{SymTMatrix, GenTMatrix};

/// A convex-or-nonconvex sparse QP:
///     min  ½ xᵀ H x + gᵀ x
///     s.t. bl ≤ A x ≤ bu
///          xl ≤   x ≤ xu
/// Two-sided general bounds; H is symmetric (upper triangle stored)
/// and may be indefinite (caller sets `hessian_inertia`).
pub struct QpProblem<'a> {
    pub n: usize,
    pub m: usize,
    pub h: &'a SymTMatrix,          // symmetric, upper triangle, may be indefinite
    pub g: &'a [f64],
    pub a: &'a GenTMatrix,           // m × n, sparse
    pub bl: &'a [f64], pub bu: &'a [f64],
    pub xl: &'a [f64], pub xu: &'a [f64],
    pub hessian_inertia: HessianInertia,  // PSD | Indefinite | Unknown
}

/// Discrete state per primal-and-constraint index. Carried across
/// solves to implement working-set warm start.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum BoundStatus { Inactive, AtLower, AtUpper, Fixed }

#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum ConsStatus { Inactive, AtLower, AtUpper, Equality }

pub struct WorkingSet {
    pub bounds:      Vec<BoundStatus>,   // length n
    pub constraints: Vec<ConsStatus>,    // length m
}

pub struct QpWarmStart {
    pub x:       Vec<f64>,
    pub lambda_g: Vec<f64>,              // length m
    pub lambda_x: Vec<f64>,              // length n (z_l − z_u, signed)
    pub working:  WorkingSet,
}

pub struct QpSolution {
    pub x:        Vec<f64>,
    pub lambda_g: Vec<f64>,
    pub lambda_x: Vec<f64>,
    pub working:  WorkingSet,
    pub obj:      f64,
    pub status:   QpStatus,              // Optimal | Infeasible | Unbounded | MaxIter | …
    pub stats:    QpStats,               // n_active_set_changes, n_refactor, time …
}

5.2 Trait surface

// crates/pounce-qp/src/solver.rs

use pounce_linsol::sparse_sym_iface::SparseSymLinearSolverInterface;

pub trait QpSolver {
    /// Solve a single QP. `ws` is `None` for cold start.
    fn solve(
        &mut self,
        qp: &QpProblem,
        ws: Option<&QpWarmStart>,
        opts: &QpOptions,
    ) -> Result<QpSolution, QpError>;

    /// Parametric solve: trace the homotopy from a previous QP+solution
    /// to a new QP. Falls back to `solve` if the previous solution is
    /// `None`. This is the entry point SQP uses across outer iterations
    /// to reuse the cached factorization across consecutive QPs.
    fn solve_parametric(
        &mut self,
        qp_prev: &QpProblem,
        sol_prev: &QpSolution,
        qp_new:  &QpProblem,
        opts: &QpOptions,
    ) -> Result<QpSolution, QpError>;
}

pub struct QpOptions {
    pub algorithm: QpAlgorithm,          // ParametricActiveSet | …
    pub linear_solver_factory: …,        // injected from pounce-algorithm
    pub max_iter: usize,
    pub feas_tol: f64,
    pub opt_tol:  f64,
    pub max_schur_updates_before_refactor: usize,  // default 50, ref §4.2
    pub anti_cycling: AntiCyclingChoice, // Expand (default), Bland, None
    pub elastic_gamma: f64,              // §4.3 penalty for elastic mode
    pub print_level: i32,
}

The linear_solver_factory injection mirrors alg_builder.rs::LinearBackendFactory (line 50) so pounce-qp remains backend-agnostic: FERAL by default, MA57 when built with the ma57 feature.

5.3 Internal structure

crates/pounce-qp/
├── Cargo.toml
└── src/
    ├── lib.rs
    ├── problem.rs           — types from §5.1
    ├── working_set.rs       — WorkingSet ops: add, drop, validate
    ├── kkt.rs               — KKT assembly from QP + 𝒲
    ├── factor.rs            — sparse LDLᵀ wrapper + Schur-complement state
    ├── schur.rs             — block-LU update (Eldersveld-Saunders 1992)
    ├── homotopy.rs          — parametric step engine (§4.2 t ∈ [0,1])
    ├── elastic.rs           — phase-1 elastic mode (§4.3)
    ├── expand.rs            — EXPAND anti-cycling (§4.4)
    ├── inertia.rs           — indefinite handling (§4.5)
    ├── refine.rs            — iterative refinement (§4.7)
    ├── solver.rs            — QpSolver impl
    └── options.rs           — QpOptions, defaults

6. SQP iterate state and working-set warm-start contract

// crates/pounce-algorithm/src/sqp/iterates.rs

pub struct SqpIterates {
    pub x:        Rc<DenseVector>,
    pub lambda_g: Rc<DenseVector>,
    pub lambda_x: Rc<DenseVector>,
    pub working:  WorkingSet,            // §5.1
    pub h_approx: HessianStore,          // exact | DampedBfgs | LBfgs (existing)
    pub merit:    Option<f64>,           // l1-elastic mode or filter pair cache
}

The warm-start contract carried across calls to SqpAlgorithm::optimize is the tuple (x, λ_g, λ_x, 𝒲, H):

(x, λ_g, λ_x) — already supported by the existing init/warm_start.rs machinery; reuse the seed-from-NLP path (warm_start.rs:60-100).
𝒲 (working set) — new. Encoded as (Vec<BoundStatus>, Vec<ConsStatus>). Transmitted via:
- Rust: a new SqpWarmStartIterateInitializer parallel to the IPM one, populated by an extended TNLP::get_warm_start_working_set hook (Rust trait default: returns None ⇒ cold-start the working set via §4.3 elastic mode).
- C/Python/GAMS: §7.
H (Hessian) — already supported via the existing L-BFGS carry-forward path; reuse unchanged.

Cold-start bootstrap (no prior 𝒲): solve the §4.3 elastic-mode QP with an empty initial working set. The first QP infers 𝒲₀ from which elastic slacks vanish at its solution.

Validation: before consuming a user-supplied 𝒲_prev, run a linear feasibility check against the new bounds. If a previously active bound is now infeasible, drop it (degrades to a cheaper warm start, never to incorrectness). This is the same defensive check qpOASES does on set_warm_start_x.
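The validation step can be sketched per coordinate as below. This is a minimal illustration, not the pounce-qp implementation: `sanitize_bounds` is a hypothetical helper, the `BoundStatus` enum is a local copy of the §5.1 type, and the check shown (finite bound, consistent box) is a simplification of the linear feasibility check described above.

```rust
/// Local stand-in for the `BoundStatus` enum from §5.1.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum BoundStatus { Inactive, AtLower, AtUpper, Fixed }

/// Hypothetical helper: sanitize a previous bound working set against the
/// new box `xl <= x <= xu`. An entry survives only if the bound it pins to
/// is finite and the box is consistent; otherwise it degrades to `Inactive`.
/// Returns the number of entries dropped.
pub fn sanitize_bounds(ws: &mut [BoundStatus], xl: &[f64], xu: &[f64]) -> usize {
    assert_eq!(ws.len(), xl.len());
    assert_eq!(ws.len(), xu.len());
    let mut dropped = 0;
    for i in 0..ws.len() {
        let box_ok = xl[i] <= xu[i];
        let keep = match ws[i] {
            BoundStatus::Inactive => true,
            BoundStatus::AtLower => box_ok && xl[i].is_finite(),
            BoundStatus::AtUpper => box_ok && xu[i].is_finite(),
            // `Fixed` only makes sense while the two bounds still coincide.
            BoundStatus::Fixed => xl[i] == xu[i] && xl[i].is_finite(),
        };
        if !keep {
            ws[i] = BoundStatus::Inactive;
            dropped += 1;
        }
    }
    dropped
}
```

Dropping an entry only loses warm-start information; the QP solver can re-add the bound during its own iterations, which is why the degradation is safe.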

7. Integration with pounce — symmetric across interfaces

Each interface today is documented in the survey above. The integration plan below adds the same five-point contract (algorithm choice + suboptions + warm-start input + warm-start output + working-set typed-or-string surface) to each, without disturbing existing IPM users.

7.1 Rust / `alg_builder.rs` — the source of truth

New enum following the established LinearSolverChoice / MuStrategyChoice pattern at alg_builder.rs:54-63:

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AlgorithmChoice {
    InteriorPoint,   // default; existing IpoptAlgorithm
    ActiveSetSqp,    // new SqpAlgorithm
}

AlgorithmBuilder gains an algorithm: AlgorithmChoice field with default InteriorPoint. build_inner branches on it, returning either the existing AlgorithmBundle (IPM) or a new SqpAlgorithmBundle. The two bundles share init, conv_check, hess, and iter_output; they differ only in the main-loop driver.

New options registered in upstream_options.rs (the registry pattern at lines 510-703 for the existing warm-start knobs):

Option	Type	Default	Meaning
`algorithm`	enum	`interior-point`	`interior-point` ‖ `active-set-sqp`
`sqp_qp_solver`	enum	`parametric-active-set`	placeholder for future QP backends
`sqp_globalization`	enum	`filter`	`filter` ‖ `l1-elastic`
`sqp_hessian`	enum	`exact`	`exact` ‖ `damped-bfgs` ‖ `lbfgs`
`sqp_warm_start_working_set`	bool	`no`	accept caller-supplied 𝒲
`sqp_max_qp_iter`	int	200	per-QP iteration cap
`sqp_qp_feas_tol`	num	1e-9	QP feasibility tolerance
`sqp_elastic_gamma`	num	1e6	elastic-mode penalty (§4.3)
`sqp_max_schur_updates`	int	50	refactor frequency (§4.2)

7.2 C API (`crates/pounce-cinterface/`)

Three additions, all backward-compatible (existing IPM users see no change).

(a) Option exposure. No new C entry point — AddIpoptStrOption already accepts arbitrary option names. Setting algorithm via AddIpoptStrOption(problem, "algorithm", "active-set-sqp") selects the SQP path. This is identical to how linear_solver is selected today.

(b) Working-set transfer. Three new C entry points in include/pounce.h, ABI-stable (no change to existing structs):

/* Status codes for the length-n bound vector and the length-m constraint vector.
 * 0=Inactive, 1=AtLower, 2=AtUpper, 3=Fixed (bounds) / Equality (constraints). */
typedef int IpoptBoundStatus;
typedef int IpoptConsStatus;

/* Retrieve the working set from the last solve. Returns 0 on success.
 * Buffers must be sized n and m respectively. NULL buffer ⇒ skip that side. */
int IpoptGetWorkingSet(
    IpoptProblem problem,
    IpoptBoundStatus *bound_status_out,   /* length n, or NULL */
    IpoptConsStatus  *cons_status_out     /* length m, or NULL */
);

/* Supply a warm-start working set for the next solve. Buffers may
 * be NULL ⇒ that side is cold-started. Caller-owned; copied. */
int IpoptSetWarmStartWorkingSet(
    IpoptProblem problem,
    const IpoptBoundStatus *bound_status_in,   /* length n, or NULL */
    const IpoptConsStatus  *cons_status_in     /* length m, or NULL */
);

/* One-shot solve with warm-start state. Equivalent to IpoptSolve
 * preceded by IpoptSetWarmStartWorkingSet. Returns working set in
 * the supplied output buffers if non-NULL. */
int IpoptSolveWarmStart(
    IpoptProblem problem,
    Number *x, Number *g, Number *obj_val,
    Number *mult_g, Number *mult_x_L, Number *mult_x_U,
    const IpoptBoundStatus *bound_status_in,   /* in, or NULL */
    const IpoptConsStatus  *cons_status_in,    /* in, or NULL */
    IpoptBoundStatus *bound_status_out,        /* out, or NULL */
    IpoptConsStatus  *cons_status_out,         /* out, or NULL */
    UserDataPtr user_data
);

IpoptProblem (lib.rs:67) gains an internal Option<WorkingSet> slot; the existing IpoptSolve signature is unchanged. The C ABI adds three symbols; existing cyipopt / JuMP / AMPL clients are unaffected.

(c) Suboption strings. Already covered by §7.1’s option registry via the existing AddIpoptStrOption / AddIpoptIntOption / AddIpoptNumOption setters; no new C signatures required.

7.3 Python (`crates/pounce-py/`)

PyO3 bindings extend symmetrically. pounce.Problem.add_option already accepts algorithm and the suboption strings from §7.1; no binding change.

New methods on PyProblem (crates/pounce-py/src/problem.rs):

from __future__ import annotations   # lets Problem.solve name WorkingSet before it is defined

from dataclasses import dataclass
from typing import Optional

import numpy as np

class Problem:
    # existing ────────────────────────────────────────
    def add_option(self, name: str, value): ...
    def solve(self, x0,
              lagrange=None, zl=None, zu=None,
              # NEW kwargs, default None ⇒ cold:
              working_set: Optional[WorkingSet] = None
              ) -> SolveResult: ...

    # NEW ─────────────────────────────────────────────
    def get_working_set(self) -> WorkingSet: ...

@dataclass
class WorkingSet:
    bounds:      np.ndarray  # dtype=int8, length n
    constraints: np.ndarray  # dtype=int8, length m

@dataclass
class SolveResult:
    x: np.ndarray
    obj_val: float
    mult_g: np.ndarray
    mult_x_L: np.ndarray
    mult_x_U: np.ndarray
    working_set: Optional[WorkingSet]  # populated when algorithm == "active-set-sqp"
    info: dict

The MPC / parametric-continuation Python idiom becomes:

prob = pounce.Problem(...)
prob.add_option("algorithm", "active-set-sqp")
prob.add_option("sqp_warm_start_working_set", True)

ws = None
for step in range(horizon):
    res = prob.solve(x0=x_prev, working_set=ws)
    ws  = res.working_set        # carry across solves
    x_prev = shift(res.x)

This is the same ergonomics as qpOASES’s Python binding, deliberately.

7.4 GAMS (`gams/gams_pounce.c`)

GAMS is the hardest case because the link is single-shot per solve statement and there is no in-process persistence between solves. Two mechanisms cover the use cases:

(a) Algorithm and suboption selection via the existing pounce.opt option file (gams_pounce.c:220-273). No code change — the option file already forwards unknown keys to the C API via AddIpoptStrOption etc. Adding the keys from §7.1 to the documented GAMS options list is the only deliverable here:

* pounce.opt
algorithm active-set-sqp
sqp_globalization filter
sqp_hessian exact
sqp_warm_start_working_set yes

(b) Working-set transfer across solves. GAMS has no native discrete-multiplier carry. Two mechanisms, both standard in GAMS solver links:

Marginal-based reconstruction (the GAMS-native idiom). After a solve, GAMS variable .m (marginal) holds the bound multiplier and equation .m holds the constraint multiplier. The next solve’s link reads these and reconstructs an approximate working set by sign + tolerance test: bound_status[i] = AtLower if x.m[i] > tol else (AtUpper if x.m[i] < -tol else Inactive). This is lossy (degenerate cases ambiguous) but matches what CONOPT, IPOPT, and KNITRO already do under GAMS. Implemented in gams_pounce.c::pouCallSolver (:437) prior to building the problem.
Persistent state file (the precise idiom). The solver writes a per-model state file (e.g. .<modelname>.pou-ws) at the end of each solve and reads it at the start of the next. The state file holds (bound_status, cons_status) as a small binary blob, keyed by the model’s GMO checksum so a structural change invalidates it cleanly. The GAMS option sqp_state_file controls the path; absence means cold-start.

Both mechanisms ship in Phase 5c. Marginal-based reconstruction (mechanism 1) is the default and needs no configuration; the persistent state file (mechanism 2) is opt-in for users who care about precision in degenerate cases. Documented limitation: full fidelity requires a GUSS-style scenario sweep within a single GAMS session.
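The sign-plus-tolerance test used by marginal-based reconstruction can be sketched as follows. This is an illustrative sketch only: `classify_from_marginals` is a hypothetical name, the enum is a reduced local copy of the §5.1 `BoundStatus`, and the sign convention (positive marginal means active lower bound) is the one stated in the text above.

```rust
/// Reduced local stand-in for the `BoundStatus` enum from §5.1.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum BoundStatus { Inactive, AtLower, AtUpper }

/// Hypothetical helper: classify each variable from its marginal.
/// `m > tol` => AtLower, `m < -tol` => AtUpper, otherwise Inactive.
/// Marginals inside `[-tol, tol]` are ambiguous (degenerate) and are
/// conservatively treated as inactive, which is where the loss occurs.
pub fn classify_from_marginals(marginals: &[f64], tol: f64) -> Vec<BoundStatus> {
    marginals
        .iter()
        .map(|&m| {
            if m > tol {
                BoundStatus::AtLower
            } else if m < -tol {
                BoundStatus::AtUpper
            } else {
                BoundStatus::Inactive
            }
        })
        .collect()
}
```

A bound that is active with a zero multiplier is indistinguishable from an inactive one under this test, which is the degenerate case the state file exists to cover.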

7.5 Interface summary

Layer	Algorithm switch	Working-set in	Working-set out	Bridge
Rust	`AlgorithmChoice::ActiveSetSqp`	`SqpWarmStartIterateInitializer`	`SqpSolution.working`	direct
C ABI	`AddIpoptStrOption("algorithm", …)`	`IpoptSetWarmStartWorkingSet`	`IpoptGetWorkingSet`	thin shim
Python	`add_option("algorithm", …)`	`solve(…, working_set=ws)`	`res.working_set`	PyO3 over C ABI
GAMS	`pounce.opt`	marginals ‖ state file	marginals ‖ state file	C code in `pouCallSolver`

8. Test harness

The harness is layered: cheap analytical smoke tests on every commit, fixed reference problems on every PR, scaling sweeps weekly, full regression suite on phase-gate. Each layer below names specific problems, specific size parameters where applicable, and specific reference numbers from published literature so a regression is detectable rather than handwaved.

8.0 Analytical correctness ladder — CI smoke tests

Closed-form problems with hand-computable answers. Run on every cargo test. Each catches a distinct class of bug in hour 1, not week 3. These are the unit-level equivalent of pounce-feral’s “factor a 3×3” smoke tests.

#	Problem	Closed form	What it catches
1	Unconstrained QP, `H=I`, arbitrary `g`	`x* = −g`, one Newton step	KKT sign convention, gradient assembly
2	Equality-only QP: `min ½xᵀHx + gᵀx` s.t. `Ax = b`, with `H`, `A` full rank	`[x; λ] = [H Aᵀ; A 0]⁻¹ [−g; b]` (one linear solve)	KKT factor block layout, multiplier sign
3	Separable box-constrained QP: `H = diag(h)`, `xl ≤ x ≤ xu`, no general constraints	`x*_i = clip(−gᵢ/hᵢ, xlᵢ, xuᵢ)` per coordinate	Bound-multiplier sign, working-set add/drop
4	Strictly convex QP with one redundant constraint	Same as without redundant; redundant row stays inactive	Degeneracy detection, EXPAND triggering
5	Infeasible QP (`xl > xu` on one coord)	Elastic mode returns minimal-infeas point	§4.3 phase-1 elastic detection
6	Indefinite Hessian, single equality, reduced Hessian PD	Solvable; reduced-Hessian inertia OK	§4.5 inertia-control trigger

Implemented as #[test] functions in pounce-qp/src/tests/analytical.rs. Total runtime budget: < 50 ms across all six. Exit: all six pass to 1e-12 relative.
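As an illustration of how small these checks are, ladder problem 3 reduces to a per-coordinate clip. The sketch below is not taken from analytical.rs; the function names are hypothetical and it assumes every `h[i] > 0` and `xl[i] <= xu[i]`.

```rust
/// Closed-form solution of the separable box-constrained QP
///     min  sum_i (0.5 * h[i] * x[i]^2 + g[i] * x[i])   s.t.  xl <= x <= xu,
/// assuming h[i] > 0 and xl[i] <= xu[i] for all i.
pub fn separable_box_qp(h: &[f64], g: &[f64], xl: &[f64], xu: &[f64]) -> Vec<f64> {
    assert!(h.iter().all(|&hi| hi > 0.0), "strict convexity required");
    (0..h.len())
        .map(|i| (-g[i] / h[i]).clamp(xl[i], xu[i]))
        .collect()
}

/// Objective gradient h[i] * x[i] + g[i] at a point. At the optimum it is
/// zero on free coordinates, non-negative at an active lower bound and
/// non-positive at an active upper bound, which is what the
/// bound-multiplier sign test compares against.
pub fn gradient(h: &[f64], g: &[f64], x: &[f64]) -> Vec<f64> {
    (0..h.len()).map(|i| h[i] * x[i] + g[i]).collect()
}
```

A solver result can then be compared coordinate by coordinate against `separable_box_qp`, and the sign of each reported bound multiplier against `gradient`.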

8.1 QP correctness — fixed reference set (Phase 5a, PR-level)

Maros-Mészáros QP test set (Maros-Mészáros 1999): 138 problems, sizes n ∈ [2, 12955]. Format: .qps (QP-extended MPS); new reader in pounce-qp/src/maros.rs, sharing infrastructure with pounce-cli’s MPS handling.
Reference oracle:
- qpOASES (Ferreau 2014) for dense problems — exposes a C API we FFI.
- OSQP (Stellato 2020) for sparse convex problems — widely available, sparse, Python bindings.
- CPLEX/Gurobi via Python as tiebreaker on indefinite cases where qpOASES and OSQP disagree.
Tolerance: 1e-6 relative objective, 1e-7 KKT residual.
Exit: ≥ 95 % of Maros-Mészáros pass within tolerance. The remaining ≤ 5 % are documented as “known indefinite hard” with the per-problem reason. qpOASES itself reports ~97 % pass on its reference table (Ferreau 2014 Tab. 2), so 95 % is the floor.

8.2 QP scaling sweep — system size dependence (Phase 5a, weekly)

The deliverable here is a plot — iteration count and wall time vs n — and a published reference curve to compare against. Three families, each with a single size axis:

(a) LASSO QP (the canonical OSQP benchmark; Stellato 2020 Tab. 4). Formulation min ½‖Ax − b‖² + λ‖x‖₁, reformulated as a sparse QP of dimension 2n with 2n inequality constraints. Sweep n ∈ {10², 10³, 10⁴, 10⁵} with fixed sparsity 5 %.

Reference: OSQP paper Tab. 4 reports per-solve time ~0.01s / 0.1s / 1s / 12s respectively on a 2.6 GHz Xeon.
Exit: within 3× of OSQP at every size; ideally within 2× by Phase 5a end. (Active-set is slower than ADMM on cold convex LASSO; the warm-start sweep in §8.5 reverses that.)

(b) MPC quadrotor scaling (Frison-Diehl HPIPM 2020 §5; the acados reference). Linear-quadratic MPC with state dim 12, horizon h ∈ {10, 20, 40, 80, 160}. Sweep yields n = 12·h, m = 12·h sparse QPs with block-banded structure.

Reference: HPIPM paper reports ~0.1 ms / 0.4 ms / 1.6 ms / 6.4 ms / 25.6 ms cold solve (linear in h, as the block factor is O(h)).
Exit: linear scaling in h (i.e., not super-linear) within 10× HPIPM at every horizon.

(c) Maros-Mészáros size buckets. Same problems as §8.1, sliced by size. Bucket boundaries n ∈ [1, 10²) ∪ [10², 10³) ∪ [10³, 10⁴) ∪ [10⁴, ∞). Solve time per bucket reported as median + p95.

Reference: qpOASES reports total time for the full set; per-bucket numbers are computed once during Phase 5a and committed as pounce-qp/benches/maros_baseline.json. Regression alert if median doubles or p95 quadruples between commits.
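The regression rule above can be sketched as a pure function over two timing samples. This is a hedged illustration: `percentile` and `is_regression` are hypothetical names, the nearest-rank percentile definition is an assumption (the text does not fix one), and reading the baseline from maros_baseline.json is left out.

```rust
/// Nearest-rank percentile of a non-empty sample, with `q` in (0, 100].
/// The nearest-rank definition is an assumption of this sketch.
pub fn percentile(samples: &[f64], q: f64) -> f64 {
    assert!(!samples.is_empty(), "need at least one timing");
    assert!(q > 0.0 && q <= 100.0);
    let mut v = samples.to_vec();
    v.sort_by(|a, b| a.partial_cmp(b).expect("timings must not be NaN"));
    let rank = ((q / 100.0) * v.len() as f64).ceil() as usize;
    v[rank.max(1) - 1]
}

/// Alert when the median at least doubles or the p95 at least quadruples
/// relative to the committed baseline for the same size bucket.
pub fn is_regression(baseline: &[f64], current: &[f64]) -> bool {
    let median_bad = percentile(current, 50.0) >= 2.0 * percentile(baseline, 50.0);
    let p95_bad = percentile(current, 95.0) >= 4.0 * percentile(baseline, 95.0);
    median_bad || p95_bad
}
```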

8.3 NLP correctness — fixed reference set (Phase 5b)

Hock-Schittkowski test set (HS001-HS119; Hock-Schittkowski 1981). The SQP-community gold standard. Tiny problems (n ≤ 30 mostly) with documented solutions. Every SQP paper reports HS results; filterSQP (Fletcher-Leyffer) and SNOPT (Gill-Murray-Saunders) both publish per-problem iteration tables.
- Source: the CUTEst harness already contains all HS problems (benchmarks/cutest/problem_list.txt includes HS001–HS119).
- Exit: ≥ 117 of 119 converge; the two allowed failures are HS013 and HS099 which most SQP solvers also fail (Wächter 2002 Tab. 6.1).
CUTEst small NLP subset (n < 1000, the default problem_list.txt minus large-scale entries — roughly 500 problems).
- Reference numbers: Wächter-Biegler 2006 Tab. 5 and Fletcher-Leyffer 1999 Tab. 6 publish per-problem iteration counts for IPM and filter-SQP on CUTEst.
- Exit: total iteration count within 30 % of the median of {filterSQP, SNOPT, IPOPT} published numbers; success rate ≥ 90 %.

8.4 NLP scaling sweep — system size dependence (Phase 5b, monthly)

Two families giving a single size axis to test scaling claims:

(a) AC OPF (pglib-OPF; Babaeinejadsarookolaee 2019). Standard power-grid benchmark, scales 14 → 30000 buses. Pounce’s CUTEst list already has ACOPP14, ACOPP30, ACOPR14, ACOPR30; extending to the full pglib-OPF set (14, 30, 57, 118, 200, 300, 1354, 2853, 9241, 13659, 30000 buses) gives a clean three-order-of-magnitude sweep with real-world structure (sparse Jacobian, near-degenerate binding limits), which is exactly what active-set should be measured on.

Reference: the MATPOWER project publishes IPOPT solve times for pglib-OPF on each instance; PowerModels.jl benchmarks filterSQP and KNITRO on the same set.
Exit: ≤ 2× IPOPT time at every bus count for cold solve. The warm-start advantage shows up in §8.5(a).

(b) Poisson boundary optimal control (Biegler 2010, Nonlinear Programming, §11.3). PDE-constrained NLP: minimize a tracking cost on u subject to −Δu = f + Bv on [0,1]² with boundary control v. Standard reference NLP scaling family. Mesh sweep grid = 16 × 16, 64 × 64, 256 × 256, 1024 × 1024 gives n ∈ {256, 4096, 65536, 1048576} with a smooth, well-characterized continuous solution.

Reference: Biegler 2010 Ch. 11 publishes IPOPT iter counts for exactly this family at each mesh.
Exit: mesh-independent iteration count (≤ 30 outer iters at every mesh, as the continuous problem is well-posed).

8.5 Warm-start sweep — the actual deliverable (Phase 5c)

The headline result. A perturbation-magnitude axis × cold/warm comparison for each warm-start workload. Plotted as iteration count and active-set-change count vs perturbation size.

(a) MPC closed-loop with horizon shift. Quadrotor or autonomous-vehicle model (acados examples; Verschueren 2022). 200-step closed-loop simulation. At each step, the NLP is the horizon-shifted neighbor of the previous; the warm-start carries the working set shifted by one stage.

Metrics:
- Mean SQP iterations per step (cold vs warm).
- Mean QP-subproblem active-set changes per SQP iteration (cold vs warm).
- 99th-percentile per-step wall time (worst case for real-time deployment).
Reference: qpOASES paper (Ferreau 2014 Tab. 3-4) reports 5–50× iteration speedup on closed-loop MPC; acados paper (Verschueren 2022 Tab. 2) reports per-step times for HPIPM and qpOASES. Beat HPIPM-warm-start on worst-case latency — that’s the whole point.
Exit: ≥ 5× iteration speedup, ≤ 3 active-set changes per step in the steady-state regime.

(b) Parametric continuation. Trace the solution of a parametric NLP min f(x;t) s.t. g(x;t) ≤ 0 as t sweeps [0, 1] in 100 steps. Use the Beltistos parametric NLP benchmark (Pirnay-López-Negrete-Wächter 2012) or a Wächter-Biegler 2006 §5 instance.

Metrics: total iterations across the full path; size of largest discontinuous active-set jump.
Reference: the pounce-sensitivity (sIPOPT port) already has a baseline number for IPM-warm-start on the same path. Beat it.
Exit: ≥ 3× total-iteration speedup over IPM-warm-start.

(c) MINLP B&B trace. Record bound changes from a small MINLP B&B run (one of the minlplib instances with documented bound-tightening trace; Bussieck 2003). Replay the bound sequence, warm-starting each child from its parent.

Metrics: total iterations across the B&B trace.
Reference: the minlplib instances have published Bonmin baselines; Bonmin uses IPOPT-warm-start internally.
Exit: ≥ 2× speedup vs Bonmin.

(d) Perturbation-size sweep. A synthetic perturbation axis on a fixed problem: start from a solved QP, perturb (i) one bound by ε ∈ {1e-6, 1e-3, 1e-1, 1}, (ii) ε of all bounds, (iii) drop one constraint, (iv) add one constraint. Plot iter count vs perturbation magnitude on log-x; the curve characterizes the “warm-start cliff” where active-set adaptation cost crosses cold-start cost.

Reference: no published baseline; this curve becomes pounce-qp’s own published characterization. It’s what tells prospective users when warm-start helps.
Exit: monotone in perturbation magnitude; sub-linear up to 10 % bound change.

8.6 Cross-phase comparison — the headline plot

One plot per benchmark family: iter count and wall time vs n, with four curves on each panel:

SQP cold
SQP warm (with prior solve at the same n)
IPM cold (pounce-default)
IPM warm (pounce-default + warm_start_init_point=yes)

This is the deliverable that justifies the whole Phase-5 effort. The expected story: cold curves IPM ≤ SQP; warm curves SQP ≪ IPM at all sizes. Two-line summary in the eventual paper / README.

Committed in benchmarks/sqp_scaling/ alongside Phase 5c.

8.7 Unit tests — per-module

For each module in pounce-qp/src/:

factor.rs (sparse LDLᵀ wrapper): roundtrip factor-then-solve on the 6 analytical-ladder problems.
schur.rs (Schur-complement updates): each rank-1 update validated against a full refactor of the equivalent KKT matrix, Frobenius-norm diff < 1e-10.
expand.rs: anti-cycling verified on Beale’s cycling LP example (Beale 1955), Hoffman’s cycling LP (Hoffman 1953), and Maros 1996 §4.2 degenerate QP.
elastic.rs: feasibility detection on the infeasible subset of Maros-Mészáros (problems QPCBOEI2, QSCAGR25, QSCFXM1 — documented infeasible).
working_set.rs: random add/drop sequence (50 ops on a random working set), validated against ground-truth full KKT solves.
homotopy.rs: parametric trace from QP₀ to QP₁ with identical optimal active set; verify zero working-set changes (the warm-start sweet spot).
inertia.rs: indefinite-Hessian QP with reduced Hessian PD; verify §4.5 path produces a stationary point.

Total unit-test runtime budget: < 5 s; runs on every cargo test.
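The "update versus full refactor" pattern behind the schur.rs test can be illustrated on a dense 2×2 matrix. Note the substitution: the sketch uses the plain Sherman-Morrison rank-1 inverse update as a stand-in, not the block-LU Schur-complement update of pounce-qp, and all names are hypothetical.

```rust
type M2 = [[f64; 2]; 2];

/// Direct inverse of a 2×2 matrix (plays the role of the "full refactor").
pub fn inv2(a: M2) -> M2 {
    let det = a[0][0] * a[1][1] - a[0][1] * a[1][0];
    assert!(det.abs() > 1e-14, "matrix is numerically singular");
    [
        [a[1][1] / det, -a[0][1] / det],
        [-a[1][0] / det, a[0][0] / det],
    ]
}

/// Sherman-Morrison: inverse of (A + u vᵀ) computed from A⁻¹ alone.
pub fn sherman_morrison(a_inv: M2, u: [f64; 2], v: [f64; 2]) -> M2 {
    // w = A⁻¹ u and z = vᵀ A⁻¹
    let w = [
        a_inv[0][0] * u[0] + a_inv[0][1] * u[1],
        a_inv[1][0] * u[0] + a_inv[1][1] * u[1],
    ];
    let z = [
        v[0] * a_inv[0][0] + v[1] * a_inv[1][0],
        v[0] * a_inv[0][1] + v[1] * a_inv[1][1],
    ];
    let denom = 1.0 + v[0] * w[0] + v[1] * w[1];
    assert!(denom.abs() > 1e-14, "update makes the matrix singular");
    let mut out = a_inv;
    for i in 0..2 {
        for j in 0..2 {
            out[i][j] -= w[i] * z[j] / denom;
        }
    }
    out
}

/// Frobenius norm of the difference of two 2×2 matrices.
pub fn frobenius_diff(a: M2, b: M2) -> f64 {
    let mut acc = 0.0_f64;
    for i in 0..2 {
        for j in 0..2 {
            let d = a[i][j] - b[i][j];
            acc += d * d;
        }
    }
    acc.sqrt()
}
```

The test then asserts that `frobenius_diff(sherman_morrison(inv2(a), u, v), inv2(a_updated))` stays below the 1e-10 threshold named above.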

8.8 Phase-gate matrix

Phase	Required passing
5a (QP standalone)	§8.0 + §8.1 + §8.2 + §8.7
5b (cold SQP NLP)	All 5a + §8.3 + §8.4
5c (warm SQP)	All 5b + §8.5 + §8.6
5d (l1-elastic opt)	All 5c + side-by-side §8.5 comparison filter vs l1-elastic

A phase is not declared shipped until every cell in its row passes the named exit criterion.

9. Per-workload notes

9.1 MPC

Block-shift working-set carry: 𝒲_{k+1}[i] = 𝒲_k[i+1] with new terminal stage seeded cold. Modeling-layer convention; the solver only needs the warm-start API to be cheap.
The qpOASES paper (Ferreau 2014 Tab. 3) reports the homotopy completing in 1–3 working-set changes per shift in the well-warm-started regime. This is the headline benchmark for Phase 5c.
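The block-shift carry is simple enough to live in the modeling layer. A minimal sketch, assuming a stage-major layout with a fixed number of entries per stage and taking "seeded cold" to mean `Inactive`; `shift_working_set` is a hypothetical name and the enum is a local copy of the §5.1 type.

```rust
/// Local stand-in for the `BoundStatus` enum from §5.1.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum BoundStatus { Inactive, AtLower, AtUpper, Fixed }

/// Shift a stage-major working set forward by one stage:
/// stage i of the result is stage i+1 of the input, and the new
/// terminal stage is seeded cold (all `Inactive`, an assumption here).
pub fn shift_working_set(ws: &[BoundStatus], stage_len: usize) -> Vec<BoundStatus> {
    assert!(stage_len > 0, "stage length must be positive");
    assert!(ws.len() % stage_len == 0, "layout must be whole stages");
    let mut out = Vec::with_capacity(ws.len());
    // Drop the first stage, keep the rest in order.
    out.extend_from_slice(&ws[stage_len.min(ws.len())..]);
    // Pad the new terminal stage with cold entries.
    out.resize(ws.len(), BoundStatus::Inactive);
    out
}
```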

9.2 MINLP branch-and-bound

Sibling/child relaxations differ in one bound. The previous solve’s 𝒲 is feasible for the child unless the bound change invalidates it; then one active-set update fixes it. Documented in Pirnay-López-Negrete-Wächter (2012) §4 for IPM warm start; the active-set numbers are categorically better.

9.3 Parametric homotopy / continuation

Step in parameter t: min f(x; t) s.t. g(x; t) ≤ 0.
Predictor: pounce-sensitivity computes dx/dt, dλ/dt from the reduced Hessian at the previous solution. Reuse unchanged.
Corrector: one SQP solve from (x + Δt·dx/dt, λ + Δt·dλ/dt, 𝒲_prev). If 𝒲_prev is still optimal, one QP iteration.
This is the workload where SQP outperforms a well-warm-started IPM most clearly. Cleanest demo target.

10. Implementation status

The driver shipped in four milestones, each with standalone value: 5a builds the standalone sparse QP solver, 5b the cold SQP NLP driver, 5c the working-set warm start and full-stack integration, and 5d the l1-elastic alternative. Everything below is implemented and self-tested; the only outstanding work is the external-oracle regression comparisons called out at the end.

Phase 5a — `pounce-qp` standalone sparse QP solver

§4.2 active-set inner loop with cached-factor resolve and an opt-in sparse Schur-complement update layer (QpOptions::use_schur_updates). The SchurState owns U, V, K₀⁻¹U, S and applies Sherman-Morrison-Woodbury rank-2 updates per working-set change, cross-checked against a fresh factorization to 1e-9.
§4.3 l1-elastic mode (Gill-Murray-Saunders, SQOPT): an augmented QP with two non-negative slacks per row and penalty γ, solved through the standard active-set path; infeasibility is certified when residual slacks exceed feas_tol.
§4.4 anti-cycling: Bland’s rule plus the full GMSW EXPAND τ-growth with snap-reset, built on a Harris-style two-pass ratio test.
§4.5 inertia control: factorize_with_inertia_control wraps every factor call site with a diagonal-shift retry on WrongInertia / Singular, matching the pounce-algorithm perturbation-handler defaults.
§4.7 iterative refinement inherited from pounce-feral (on by default).
Test harness: the §8.0 analytical correctness ladder, a pure-Rust Maros-Mészáros .qps reader (including RANGES), and per-module unit tests for kkt, elastic, refinement, and qps.

Phase 5b — cold SQP NLP driver

Outer loop (SqpAlgorithm::optimize) runs end-to-end on nonlinear NLPs, assembling each QP subproblem from the linearization (SqpQpData::build) and consuming any NLP the IPM consumes via IpoptNlpAdapter (.nl, CUTEst, Python bindings).
Globalization: both an l1-merit line search (Han-Powell with ν adaptation + Armijo backtracking) and §4.1 filter globalization (Fletcher-Leyffer 2002), selectable via sqp_globalization.
Hessian sources: exact, §4.6 damped BFGS (Powell 1978, guaranteeing PD iterates so the QP solver needs no inertia control), and L-BFGS (a circular curvature-pair history seeded by the Nocedal-Wright γI scaling), selectable via sqp_hessian.
Dispatch: add_option("algorithm", "active-set-sqp") routes through optimize_sqp_tnlp, which builds the NLP chain (TNLPAdapter → OrigIpoptNlp → IpoptNlpAdapter) and maps SqpStatus back to ApplicationReturnStatus; the IPM path is unchanged when the default interior-point is selected.
Options: eleven registered sqp_* suboptions (globalization, hessian, max_iter, tol, constr_viol_tol, dual_inf_tol, l1_penalty, bt_reduction, bt_min_alpha, print_level, lbfgs_max_history), all defaulting to SqpOptions::default() and applied through apply_sqp_options.
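For readers unfamiliar with the damped BFGS option listed above, the Powell damping rule can be sketched on a small dense matrix. This is the textbook form (Nocedal-Wright, Procedure 18.2, with the 0.2 / 0.8 constants), not a copy of the pounce code; function names and the dense `Vec<Vec<f64>>` storage are assumptions made for brevity.

```rust
fn matvec(b: &[Vec<f64>], s: &[f64]) -> Vec<f64> {
    b.iter()
        .map(|row| row.iter().zip(s).map(|(bij, sj)| bij * sj).sum::<f64>())
        .collect()
}

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum::<f64>()
}

/// Powell-damped BFGS update of a positive-definite `b` in place.
/// `s` is the step, `y` the change in the Lagrangian gradient.
/// Returns the damping factor theta (1.0 means plain BFGS).
pub fn damped_bfgs_update(b: &mut [Vec<f64>], s: &[f64], y: &[f64]) -> f64 {
    let bs = matvec(b, s);
    let s_bs = dot(s, &bs);
    assert!(s_bs > 0.0, "B must be positive definite and s nonzero");
    let s_y = dot(s, y);
    // Blend y with B s whenever the curvature sᵀy is too small or negative.
    let theta = if s_y >= 0.2 * s_bs {
        1.0
    } else {
        0.8 * s_bs / (s_bs - s_y)
    };
    let r: Vec<f64> = y
        .iter()
        .zip(&bs)
        .map(|(yi, bsi)| theta * yi + (1.0 - theta) * bsi)
        .collect();
    // By construction sᵀr >= 0.2 sᵀBs > 0, which keeps the update PD.
    let s_r = dot(s, &r);
    for i in 0..b.len() {
        for j in 0..b.len() {
            b[i][j] += r[i] * r[j] / s_r - bs[i] * bs[j] / s_bs;
        }
    }
    theta
}
```

With negative curvature along `s` the rule shrinks the curvature of `B` in that direction instead of letting it go indefinite, which is why the QP solver needs no inertia control on this path.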

Phase 5c — working-set warm start and integration

Rust: SqpAlgorithm::optimize_with_warm_start consumes the §6 tuple (x, λ_g, λ_x, 𝒲) and feeds the working set into pounce-qp’s solve_with_working_set; IpoptApplication exposes set_sqp_warm_start, clear_sqp_warm_start, and last_sqp_working_set (input iterate consumed once and auto-cleared, output working set valid until the next solve overwrites it).
C ABI (§7.2): IpoptGetWorkingSet, IpoptSetWarmStartWorkingSet, IpoptClearWarmStartWorkingSet, and IpoptSolveWarmStart, with POUNCE_WS_* status codes. Existing cyipopt / JuMP / AMPL clients are unaffected — no existing signature changes.
Python (§7.3): the working_set=(bounds, cons) kwarg on Problem.solve, the set / clear / get_working_set methods, the info["working_set"] return key, and a module-level pounce.classify_working_set(...) so parametric-continuation users can wire IPM-converged multipliers in without dropping into Rust.
GAMS (§7.4): the marginal-based reconstruction path (§7.4(b), mechanism 1) classifies the working set from gmoGetVarM / gmoGetEquM, with the opt-in persistent state file (§7.4(b), mechanism 2; sqp_state_file) as the lossless alternative: a binary format with an FNV-1a checksum keyed by (n, m, x_l, x_u, g_l, g_u), so structural changes invalidate it cleanly and fall back to marginal-based reconstruction.
Sensitivity corrector: classify_working_set builds a WorkingSet from any (primal, multipliers, bounds) snapshot, completing the parametric “predictor (sensitivity) + corrector (SQP)” pattern. The worked end-to-end pipeline ships as python/examples/sqp_warm_start_mpc.py, gams/examples/parametric_sqp_warm_start.gms, and the tests/parametric_sqp_corrector.rs integration test, which validates a cold IPM solve → active-set classification → predictor step → SQP corrector to the exact perturbed optimum at 1e-8.
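The state-file key mentioned in the GAMS item can be sketched with the standard 64-bit FNV-1a hash. The hash constants are the published FNV-1a ones; the serialization order, the little-endian byte layout, and the `problem_key` name are assumptions of this sketch and need not match the on-disk format.

```rust
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Standard 64-bit FNV-1a over a byte slice.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(FNV_OFFSET, |h, &b| (h ^ (b as u64)).wrapping_mul(FNV_PRIME))
}

/// Hypothetical key over (n, m, bound vectors...). Any change in the
/// dimensions or in a bound value changes the serialized bytes, so a stale
/// state file fails the checksum (barring a hash collision) and the link
/// can fall back to a cold or marginal-based start.
pub fn problem_key(n: usize, m: usize, vectors: &[&[f64]]) -> u64 {
    let mut buf = Vec::new();
    buf.extend_from_slice(&(n as u64).to_le_bytes());
    buf.extend_from_slice(&(m as u64).to_le_bytes());
    for v in vectors {
        for x in v.iter() {
            buf.extend_from_slice(&x.to_bits().to_le_bytes());
        }
    }
    fnv1a64(&buf)
}
```

Hashing the bit pattern via `to_bits` rather than a formatted value keeps the key exact and cheap to compute.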

Phase 5d — l1-elastic alternative

Shipped and self-tested: sqp_l1_penalty_safety and sqp_l1_penalty_max clamp the Han-Powell ν update, and comparison tests certify that the Filter and L1Elastic globalizations converge to the same optimum on the shared Hock-Schittkowski fixtures (HS28, HS35).

Deferred — external-oracle regressions

These are benchmarking comparisons that each need a third-party solver or problem distribution wired in; none gate algorithmic completeness:

Maros-Mészáros 138-problem regression vs qpOASES / OSQP. The in-repo framework parses each .qps and asserts against a supplied optimum; the distribution and reference-optima table are what remain.
Hock-Schittkowski 119-problem regression vs CUTEst.
AC OPF (pglib-OPF) and Poisson boundary-control scaling sweeps vs MATPOWER / PowerModels.jl.
A measured ≥5× iteration-count drop on the MPC and parametric suites vs HPIPM / qpOASES / acados, and a small-NLP iteration-count comparison vs filterSQP / SNOPT.

Phases 5a and 5b each have standalone value (the sparse QP solver and the cold SQP NLP driver); 5c is where the warm-start payoff lands; 5d is the comparison work.

11. Risk

Maintenance. Two solver paths is a permanent maintenance liability. Mitigation: SQP shares the IpoptNlp, derivative, scaling, options, journalist, conv-check, and Hessian layers unchanged; only the iteration skeleton + QP subproblem are net new.
Indefinite-Hessian failure modes. Reduced-Hessian indefiniteness with bad scaling can defeat §4.5 inertia control. Mitigation: damped-BFGS fallback (§4.6) and the same kappa_d damping pounce already applies in mu/adaptive.rs.
Schur-complement growth. If the working set changes O(n) times before a refactor, the dense Schur block becomes a cost concern. Mitigation: refactor cap sqp_max_schur_updates (default 50, §7.1); Davis 2006 §11 and Eldersveld-Saunders 1992 give empirical guidance.
GAMS state-transfer ambiguity. Marginal-based reconstruction (§7.4(b), mechanism 1) is lossy on degenerate active sets. Mitigation: the §7.4(b) state file (mechanism 2) as opt-in; documentation calling out the limitation.
Benchmark target completeness. No MPC, MINLP, or parametric workload sits in benchmarks/ today. Phase 5c ships with at least one of each (§8.5) committed alongside.

12. Design decisions

The scope-and-policy questions raised during design were resolved as follows:

Hessian default for cold SQP. Exact Hessian, with a damped-BFGS auto-fallback when the QP repeatedly fails — fastest when reliable, robust on hard nonconvex problems.
GAMS state-file format. Binary, with a checksum keyed by the problem structure so a changed shape invalidates the file cleanly.
C API entry-point granularity. Both the three-call primitive (IpoptSet… / IpoptSolve / IpoptGet…) and the one-shot IpoptSolveWarmStart convenience wrapper ship; the sequence is the primitive, the one-shot is convenience.
pounce-sensitivity integration. Landed in Phase 5c, so the parametric workload is a real end-to-end test rather than a unit-test stub.
Crate placement. crates/pounce-qp/, matching the existing workspace convention.

13. References

Algorithm — outer SQP

Fletcher, Leyffer (2002), Math. Prog. 91, 239–269 — filter SQP.
Fletcher, Leyffer, Toint (2002), SIAM J. Optim. 13, 44–59 — convergence of filter SQP.
Wächter, Biegler (2005), SIAM J. Optim. 16, 1–31 — filter line search.
Wächter, Biegler (2006), Math. Prog. 106 — IPOPT reference.
Nocedal, Wright, Numerical Optimization (2nd ed., Springer 2006), Ch. 16 (QP), Ch. 18 (SQP).

Algorithm — QP subproblem

Ferreau, Kirches, Potschka, Bock, Diehl (2014), Math. Prog. Comp. 6, 327–363 — qpOASES, dense parametric active set.
Kirches (2011), Fast Numerical Methods for Mixed-Integer Nonlinear Model-Predictive Control, Vieweg+Teubner — sparse Schur-complement extension; the canonical reference for §4.2.
Janka, Kirches, Sager, Schlöder (2016), Math. Prog. Comp. 8, 435–459 — block-sparse SR1/BFGS SQP.
Goldfarb, Idnani (1983), Math. Prog. 27 — dual active-set for convex QP (competing family).
Gill, Murray, Saunders (2005), SIAM Rev. 47, 99–131 — SNOPT.
Gould, Hribar, Nocedal (2001), SIAM J. Sci. Comput. 23, 1376–1395 — null-space, indefinite Hessian (§4.5).
Stellato, Banjac, Goulart, Bemporad, Boyd (2020), Math. Prog. Comp. 12 — OSQP (operator-splitting alternative).

Algorithm — sparse linear algebra and updates

Bartels (1971), Numer. Math. 16 — basis-update lineage.
Reid (1982), Math. Prog. 24 — Bartels-Golub-Reid sparse variant.
Eldersveld, Saunders (1992), SIAM J. Matrix Anal. Appl. 13 — block-LU update used for the Schur complement.
Davis, Direct Methods for Sparse Linear Systems (SIAM 2006) — fill-in analysis.

Algorithm — anti-cycling, elastic mode, inertia

Gill, Murray, Saunders, Wright (1989), Math. Prog. 45, 437–474 — EXPAND.
Gill, Murray, Saunders (2008), User’s Guide for SQOPT 7.7 — l1-elastic mode.
Friedlander, Saunders (2005), SIAM J. Optim. 15 — elastic globalization.
Gould (1999), SIAM J. Optim. 9, 1041–1063 — modified factorizations.
Forsgren (2002), Appl. Num. Math. 43 — inertia control.

Algorithm — Hessian approximation

Powell (1978), in Numerical Analysis Dundee 1977 — damped BFGS for SQP.
Liu, Nocedal (1989), Math. Prog. 45, 503–528 — L-BFGS.
Byrd, Nocedal, Schnabel (1994), Math. Prog. 63 — compact representations.

Test harness and benchmarks

Hock, Schittkowski, Test Examples for Nonlinear Programming Codes, Lecture Notes in Economics and Mathematical Systems 187 (Springer 1981) — the HS001–HS119 reference set used in §8.3.
Maros, Mészáros (1999), Optim. Methods Softw. 11/12 — Maros-Mészáros QP test set.
Maros (2003), Computational Techniques of the Simplex Method, Springer — degenerate-QP cycling examples used in §8.7.
Beale (1955), “Cycling in the dual simplex algorithm”, Naval Res. Logistics Quart. 2 — cycling LP smoke-test instance.
Hoffman (1953), “Cycling in the simplex algorithm”, National Bureau of Standards Report 2974 — second cycling smoke-test.
Stellato, Banjac, Goulart, Bemporad, Boyd (2020), Math. Prog. Comp. 12 — LASSO scaling reference numbers in §8.2(a).
Frison, Diehl (2020), “HPIPM: a high-performance quadratic programming framework for model predictive control”, IFAC-PapersOnLine 53 — MPC scaling reference numbers in §8.2(b).
Babaeinejadsarookolaee et al. (2019), “The power grid library for benchmarking AC optimal power flow algorithms”, arXiv:1908.02788 — pglib-OPF used in §8.4(a).
Biegler, Nonlinear Programming: Concepts, Algorithms, and Applications to Chemical Processes, SIAM (2010), §11.3 — Poisson optimal-control scaling family used in §8.4(b).
Verschueren et al. (2022), Math. Prog. Comp. 14 — acados MPC benchmark suite used in §8.5(a).
Pirnay, López-Negrete, Biegler (2012), Math. Prog. Comp. 4 — Beltistos parametric NLP and IPM warm-start comparison baseline used in §8.5(b).
Bussieck, Drud, Meeraus (2003), INFORMS J. Comp. 15 — MINLPLib instances used in §8.5(c).
Wächter (2002), An Interior Point Algorithm for Large-Scale Nonlinear Optimization with Applications in Process Engineering, PhD thesis, CMU — HS failure documentation referenced in §8.3.

Roadmap context

The future-work roadmap’s C1 entry — the active-set SQP track this note operationalizes.
Sister design notes cover C3 (the composite-step Byrd-Omojokun trust-region globalization) and C5 (the matrix-free interior-CG / Krylov-KKT track).

NLP and Linear-System Scaling

Optimization problems whose objective, constraints, or KKT system span many orders of magnitude often converge poorly — or not at all — without some form of rescaling. pounce inherits two independent scaling layers from Ipopt and adds a third option at the linear-system level (see issue #61).

The two layers are conceptually separate:

Layer	Option	What it touches
NLP scaling	`nlp_scaling_method`	The objective `f` and each constraint row `c_i`, before the IPM sees them. Changes algorithmic behavior (filter, `tol`, μ).
Linear-system scaling	`linear_system_scaling`	Symmetric scaling of the KKT augmented system `D K D` for the factorization. Purely numerical — the IPM sees the same iterates.

You can configure them independently. Defaults match upstream Ipopt: nlp_scaling_method = gradient-based, linear_system_scaling = none.

NLP-level scaling

Option	Default	Effect
`nlp_scaling_method`	`gradient-based`	`none` / `gradient-based` / `user-scaling`.
`nlp_scaling_max_gradient`	`100.0`	Cutoff above which gradient-based scaling applies. Per-row scale = `min(1, max_gradient / ‖∇c_i‖_∞)`.
`nlp_scaling_min_value`	`1e-8`	Floor on computed scale factors — prevents inverting near-zero gradients.
`nlp_scaling_obj_target_gradient`	`0.0`	When `> 0`, pins the scaled objective gradient ∞-norm to this value. Overrides the `max_gradient` cutoff.
`nlp_scaling_constr_target_gradient`	`0.0`	Same as above, per constraint row.
`obj_scaling_factor`	`1.0`	Constant multiplier on the objective, applied after the automatic factor.

`gradient-based` (default)

Evaluates ∇f and ∇c_i once at the starting point x_0 and chooses per-row scales that pull each gradient ∞-norm into a reasonable band. Single-shot is mandatory: recomputing per iteration would invalidate the filter’s history (Wang, 2013).

The clamp at 1.0 means scaling never amplifies a small row; it only damps large ones.
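The cutoff rule from the options table can be sketched as a small function. This is an illustration only: the function name is hypothetical, and applying the `nlp_scaling_min_value` floor as a final lower bound is an assumption about ordering that this page does not spell out.

```rust
/// Scale factor for one row under the cutoff path.
/// `grad_inf_norm` is the infinity norm of that row's gradient at x_0.
fn gradient_based_scale(grad_inf_norm: f64, max_gradient: f64, min_value: f64) -> f64 {
    // Rows at or below the cutoff keep scale 1.0; larger rows are damped
    // so that their scaled gradient norm equals `max_gradient`.
    let scale = if grad_inf_norm > max_gradient {
        max_gradient / grad_inf_norm
    } else {
        1.0
    };
    // Assumed ordering: the floor is applied last, as a lower bound.
    scale.max(min_value)
}
```

With the defaults (`max_gradient = 100.0`, `min_value = 1e-8`), a row whose gradient norm is 1e4 gets a factor of 0.01, while a row with norm 5 is left at 1.0.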

`user-scaling`

The TNLP is asked for obj_scaling, a per-variable x_scaling, and a per-constraint g_scaling via the get_scaling_parameters callback. Use this when you know the natural units of your problem (e.g. mass in kg vs. distance in mm) and can supply better scales than the gradient-based heuristic.

Note: pounce’s OrigIpoptNlp currently honors obj_scaling and per-constraint g_scaling. The x_scaling request channel is accepted but not yet acted on. This mirrors the design in issue #61.

If the TNLP’s get_scaling_parameters returns false (the default), pounce falls back to no automatic scaling.

Setting user scaling

From C — call SetIpoptProblemScaling(problem, obj, x_scaling, g_scaling) then AddIpoptStrOption("nlp_scaling_method", "user-scaling"). See crates/pounce-cinterface/include/pounce.h.
From Rust — implement TNLP::get_scaling_parameters on your problem type.
From Python — pounce.Problem.set_problem_scaling(obj_scaling, x_scaling=None, g_scaling=None), followed by add_option("nlp_scaling_method", "user-scaling"). Walked through end-to-end in python/notebooks/07_scaling.ipynb.

Target-gradient overrides

nlp_scaling_obj_target_gradient and nlp_scaling_constr_target_gradient are subtle. When set to a positive value, they override the max_gradient cutoff and the 1.0 clamp: the scaling is computed unconditionally as target / max_gradient_norm, so the scaled gradient ∞-norm becomes exactly the target. Useful when you have a specific numeric range you want the IPM to see.

The default 0.0 means “use the cutoff path” — i.e. only scale rows that are above nlp_scaling_max_gradient.
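A minimal sketch of how the two paths differ, assuming the behavior described above. The function name is hypothetical, the `nlp_scaling_min_value` floor is left out for brevity, and the guard against a zero gradient norm is an added assumption.

```rust
/// Scale factor for one row, with an optional target-gradient override.
/// `target == 0.0` selects the cutoff path.
fn target_gradient_scale(grad_inf_norm: f64, target: f64, max_gradient: f64) -> f64 {
    if target > 0.0 && grad_inf_norm > 0.0 {
        // Override path: unconditional, may amplify as well as damp.
        target / grad_inf_norm
    } else {
        // Cutoff path: only damp rows above `max_gradient`, clamp at 1.0.
        (max_gradient / grad_inf_norm).min(1.0)
    }
}
```

Note that a row with gradient norm 0.5 and a target of 10 is scaled up by 20, which the cutoff path would never do.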

Linear-system-level scaling

Option	Default	Effect
`linear_system_scaling`	`none`	`none` / `ruiz`. `mc19` and `slack-based` are accepted by the option registry but not yet implemented — both fall back to `none`.
`linear_scaling_on_demand`	`yes`	Defer scaling computation until a linear solve is poor; reduces overhead for well-conditioned KKT systems.

The KKT augmented system is symmetric; all linear-system scalers in pounce use the symmetric form D K D (single diagonal) to preserve that structure for the downstream factorization (MA57, MUMPS, FERAL/SSIDS).

none — first-class choice. The inner linear solver (MA57, MUMPS, FERAL) often does its own scaling under some configurations; stacking pounce-level scaling on top can hurt. Default. Use ma57_automatic_scaling=yes to get MA57’s internal scaling instead.
ruiz — iterative symmetric ∞-norm equilibration (Ruiz, CERFACS TR/PA/01/14). Pure Rust, no Fortran dependency. Converges geometrically; capped at 10 iterations. The only implemented scaler today; recommended starting point when MA57’s internal scaling is off.
mc19 (not yet implemented) — intended HSL MC19 row/column scaling (Curtis-Reid 1972; minimizes Σ log²|a_ij|). Accepted by the registry but currently logs a warning and falls back to none.
slack-based (not yet implemented) — intended slack-aware scaling. Accepted by the registry but falls back to none.
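To make the `ruiz` entry concrete, here is a dense sketch of symmetric ∞-norm equilibration. It is not pounce's implementation: the function names are hypothetical, the matrix is stored densely for readability (the real KKT system is sparse), and the fixed iteration count stands in for whatever stopping test the solver uses.

```rust
/// Row infinity norms of the scaled matrix diag(d) * K * diag(d).
fn scaled_row_inf_norms(k: &[Vec<f64>], d: &[f64]) -> Vec<f64> {
    (0..k.len())
        .map(|i| {
            (0..k.len())
                .map(|j| (d[i] * k[i][j] * d[j]).abs())
                .fold(0.0_f64, f64::max)
        })
        .collect()
}

/// Returns the diagonal `d` after `max_iter` Ruiz sweeps on symmetric `k`.
fn ruiz_symmetric(k: &[Vec<f64>], max_iter: usize) -> Vec<f64> {
    let mut d = vec![1.0_f64; k.len()];
    for _ in 0..max_iter {
        let norms = scaled_row_inf_norms(k, &d);
        for (di, r) in d.iter_mut().zip(norms) {
            // Divide by the square root of the row norm; because the same
            // factor scales the matching column, symmetry is preserved.
            if r > 0.0 {
                *di /= r.sqrt();
            }
        }
    }
    d
}
```

On the 2×2 matrix `[[1e4, 1], [1, 1e-4]]`, ten sweeps bring both row norms of `D K D` to within a few percent of 1.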

Worked example — `nql180`

nql180 is one of the Mittelmann NLP benchmarks where both default pounce and default Ipopt fail to clear the strict tol gate (see issue #25). Forcing Ruiz symmetric equilibration on the augmented KKT system is enough to push pounce all the way to “Optimal Solution Found”:

pounce nql180.nl presolve=yes linear_system_scaling=ruiz \
       linear_scaling_on_demand=no

	default	+ Ruiz (forced)
Exit status	Solved To Acceptable Level	Optimal Solution Found
Iterations	41	50
Primal infeasibility	4.0e-11	1.2e-15
Dual infeasibility	1.0e-5	3.1e-4
Complementarity	1.2e-9	9.9e-10
Overall NLP error	2.4e-7	9.9e-10

The four-orders-of-magnitude primal-feasibility improvement and the more-than-two-orders drop in overall NLP error (2.4e-7 to 9.9e-10) are the textbook Ruiz benefit: symmetric ∞-norm equilibration lowers the condition number of the KKT matrix enough that the back-solve residuals drop the extra fractional digits needed to clear tol. The extra nine iterations are well spent: the 50-iteration Ruiz solution has lower primal infeasibility, complementarity, and overall NLP error than the 41-iteration unscaled “acceptable” solution, even though its reported dual infeasibility is higher.

linear_scaling_on_demand=no forces always-on Ruiz; the default (yes) defers scaling computation until the linear solver flags an iterate as poorly scaled, which is the right behavior for problems that don’t need it (most of the Mittelmann set, where the iter count is unchanged with or without Ruiz).

Reporting

All scaling effects are undone before the solve report (final objective, multipliers, dual residuals, KKT termination metric) is handed back to the user. You always see quantities in the natural units of your TNLP.

Internally, the IPM operates in scaled space: stopping criteria (tol, acceptable_tol) compare scaled values, the barrier parameter μ is in scaled units, and the filter’s history is built from scaled function values.

When to override the defaults

Reach for non-default scaling when:

The constraint Jacobian has entries spanning many orders of magnitude (chemistry, power-flow, mixed-unit mechanics). Try ruiz at the linear-system level, after disabling MA57’s internal scaling; mc19 is accepted but not yet implemented and falls back to none.
The IPM stalls with small step sizes but no clear infeasibility. Worth turning nlp_scaling_method=none to see whether the default gradient scaling is doing the wrong thing; then re-enable with problem-specific target gradients.
You know the natural units of your problem better than the solver can infer from gradients at x_0. Wire user-scaling.

Otherwise the upstream-Ipopt-style defaults (gradient-based at the NLP level, none at the linear-system level with MA57’s internal scaling on) are a reasonable starting point.

References

Wang, J. On the effects of scaling on the performance of Ipopt. arXiv:1301.7283 (2013). https://arxiv.org/abs/1301.7283
Ruiz, D. A scaling algorithm to equilibrate both rows and columns norms in matrices. CERFACS TR/PA/01/14. https://cerfacs.fr/wp-content/uploads/2017/06/14_DanielRuiz.pdf
Curtis, A. R. and Reid, J. K. On the Automatic Scaling of Matrices for Gaussian Elimination. (1972). HSL MC19 reference.
pounce issue #61.

Feasibility-Based Bound Tightening (FBBT)

pounce supports feasibility-based bound tightening on nonlinear constraints: interval-arithmetic propagation through the constraint expression DAG to discover variable bounds the user did not write down (e.g. x² + y² ≤ 1 ⇒ x ∈ [-1, 1], exp(x) ≤ 10 ⇒ x ≤ ln 10). It pairs with the linear bound-tightening already in the presolve pipeline (which only handles linear constraints).
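The x² + y² ≤ 1 example can be worked through by hand with a few lines of interval code. This is a sketch of the idea only: the `Interval` type and both functions are hypothetical, not pounce's API, and plain floating-point arithmetic is used without the outward rounding the real pass applies.

```rust
/// Closed interval [lo, hi].
#[derive(Clone, Copy, Debug, PartialEq)]
struct Interval {
    lo: f64,
    hi: f64,
}

/// Forward rule: enclosure of { v * v : v in iv }.
fn square(iv: Interval) -> Interval {
    let (a, b) = (iv.lo * iv.lo, iv.hi * iv.hi);
    // If the interval straddles zero, the minimum of v^2 is 0.
    let lo = if iv.lo <= 0.0 && iv.hi >= 0.0 { 0.0 } else { a.min(b) };
    Interval { lo, hi: a.max(b) }
}

/// Reverse rule for `x^2 + y^2 <= c`: tighten the box on x given the box on y.
/// Returns None when no point of the box can satisfy the constraint.
fn tighten_x(x: Interval, y: Interval, c: f64) -> Option<Interval> {
    // x^2 <= c - y^2 <= c - min(y^2)
    let budget = c - square(y).lo;
    if budget < 0.0 {
        return None;
    }
    let r = budget.sqrt();
    let (lo, hi) = (x.lo.max(-r), x.hi.min(r));
    if lo > hi { None } else { Some(Interval { lo, hi }) }
}
```

Starting from x ∈ [-10, 10] and y ∈ [-5, 5] with c = 1, `tighten_x` returns [-1, 1]; with y ∈ [2, 3] it returns `None`, the infeasible case.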

Tracks issue #62. References: Belotti, Cafieri, Lee, Liberti (2010).

When it helps

The Jacobian / objective row magnitudes are wildly different from what the user-declared bounds suggest.
A nonlinear equality or one-sided inequality is much tighter than the user’s [lo, hi] box.
Loose bounds were inherited from a modeling tool that doesn’t propagate constraints back to variable boxes (most modeling tools don’t).

FBBT cannot help when:

The TNLP has no structural-expression representation. Today only .nl-loaded problems (NlTnlp) expose one. Python (PyTnlp), C-callback (CCallbackTnlp), and Rust closure-based problems silently opt out.
The expression uses operators FBBT doesn’t reason about (Funcall to AMPL imported functions, variable-exponent powers, sin / cos reverse pass). Those subtrees become opaque and block tightening through them, but the rest of the constraint still propagates normally.

Options

Option	Default	Effect
`presolve_fbbt`	`no`	Master switch. Requires `presolve=yes` and an `ExpressionProvider`.
`fbbt_tol`	`1e-6`	Minimum per-variable bound improvement to keep iterating.
`fbbt_max_iter`	`10`	Outer-sweep cap.
`fbbt_max_constraints`	`0`	Per-sweep cap on constraints inspected (`0` = unlimited).

FBBT runs after the linear bound-tightening (Phase 1) and before the redundant-constraint pass (Phase 2), so any FBBT-derived tightening feeds forward into row drops, the LICQ check, and the bound-multiplier warm starts.

With presolve_fbbt=yes, the per-solve presolve banner prints two lines instead of one:

Presolve: tightened 170 bounds (82 newly-finite), dropped 46 redundant rows, LICQ=Full
Presolve FBBT: 10 sweeps, 1362 variable tightenings (Σ|Δ|=7.5e20)

Fields:

sweeps — number of outer iterations actually executed (≤ fbbt_max_iter). Hitting the cap is informational, not an error.
variable tightenings — total count of per-variable (x_lo[j], x_hi[j]) updates that strictly improved the box.
Σ|Δ| — sum of absolute bound improvements across all updates. Provided as a coarse “how much did we move” signal — not part of the FBBT algorithm.
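How `fbbt_tol` and `fbbt_max_iter` interact with the reported sweep count can be sketched as a driver loop. The function below is hypothetical; it assumes one sweep reports the largest per-variable improvement it achieved and that the final, below-tolerance sweep is still counted.

```rust
/// Runs `sweep` until the largest bound improvement it reports drops below
/// `tol`, or until `max_iter` sweeps have run. Returns the sweeps executed.
fn run_sweeps<F: FnMut() -> f64>(mut sweep: F, tol: f64, max_iter: usize) -> usize {
    let mut executed = 0;
    while executed < max_iter {
        let best_improvement = sweep();
        executed += 1;
        if best_improvement < tol {
            break; // converged: nothing moved by at least `tol`
        }
    }
    executed
}
```

A sweep whose gains halve each time (4, 2, 1, 0.5, 0.25, ...) stops after 5 sweeps with `tol = 0.3`, and hits the cap of 10 with the default `tol = 1e-6`.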

If FBBT detects infeasibility (the constraint bound is disjoint from the interval enclosure at the current variable box), it stops and emits pounce: FBBT detected infeasibility (witness constraint N). The solve continues with the partially-updated bounds — the IPM will then report infeasibility through its own channels.

Should I turn it on?

The issue’s design says: default off until benchmark evidence justifies a flip. Today’s evidence:

On small problems (e.g. tutorial_flow_density.nl): FBBT moves iteration count slightly, sometimes up, sometimes down.
On larger problems (e.g. gaslib11_steady.nl): FBBT enables additional redundant-row drops and can promote the LICQ verdict from StructuralRank to Full, but the iteration count change is mixed.

So: try it on your problem. If you see fewer iterations or a cleaner LICQ verdict, keep it on; if it costs iterations, turn it off again. The cost of FBBT itself is small (one pass over the expression DAGs per sweep, capped at fbbt_max_iter).

Soundness guarantees

FBBT uses outward-rounded interval arithmetic. Every operation widens its result by one ULP outward so accumulated floating-point error always increases the interval, never shrinks it. The consequence: FBBT may produce a looser tightening than ideal, but it cannot drop a feasible point. The pointwise soundness fuzz tests in crates/pounce-presolve/src/fbbt/{forward,reverse,orchestrator}.rs verify this property on random sample grids.
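A sketch of one-ULP outward widening for interval addition, under stated assumptions: the helper names and the `Interval` type are hypothetical, the ULP stepping is written by hand on the bit pattern so the example does not depend on a particular standard-library version, and NaN inputs are passed through unchanged.

```rust
/// Next representable f64 above `x` (NaN and +inf are returned unchanged).
fn step_up(x: f64) -> f64 {
    if x.is_nan() || x == f64::INFINITY {
        return x;
    }
    if x == 0.0 {
        return f64::from_bits(1); // smallest positive subnormal
    }
    let bits = x.to_bits();
    // For positive values the bit pattern grows with the value;
    // for negative values it grows with the magnitude.
    if x > 0.0 { f64::from_bits(bits + 1) } else { f64::from_bits(bits - 1) }
}

/// Next representable f64 below `x`.
fn step_down(x: f64) -> f64 {
    -step_up(-x)
}

#[derive(Clone, Copy, Debug)]
struct Interval {
    lo: f64,
    hi: f64,
}

/// Interval addition, widened by one ULP on each side.
fn add(a: Interval, b: Interval) -> Interval {
    Interval {
        lo: step_down(a.lo + b.lo),
        hi: step_up(a.hi + b.hi),
    }
}
```

Adding the point intervals [0.1, 0.1] and [0.2, 0.2] yields an interval that still contains 0.3, even though the rounded sum `0.1 + 0.2` lies slightly above it.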

Operator support

Forward + reverse rules cover the operators that account for nearly all nonlinear constraints in practice:

Operator	Forward	Reverse
`+ - * / neg`	✓	✓
`pow` (integer constant)	✓	✓ (branch-selecting for even powers)
`pow` (variable / non-integer)	opaque	opaque
`sqrt exp ln abs`	✓	✓ (with domain clipping)
`sin cos`	✓ (loose)	declines to tighten
`log10`	rewritten as `ln / ln(10)`	follows the rewrite
AMPL imported `Funcall`	opaque	opaque
n-ary `Sum`	folded into binary `Add`	follows the fold

Opaque slots evaluate to [-∞, +∞] on the forward pass and block reverse propagation through them — they don’t pollute the rest of the constraint.

Extending support to new TNLP sources

FBBT consumes the pounce_nlp::expression_provider::ExpressionProvider trait. Any TNLP can opt in by implementing:

use pounce_nlp::expression_provider::ExpressionProvider;

impl ExpressionProvider for MyTnlp {
    fn constraint_expression(&self, i: usize) -> Option<pounce_nlp::FbbtTape> {
        // Build a tape for constraint `i` from your problem's symbolic structure.
        // Returning None declines (FBBT becomes a no-op on that constraint);
        // this placeholder declines for every constraint.
        let _ = i;
        None
    }
}

FbbtTape is a flat tape of FbbtOp nodes; the existing NlTnlp implementation in crates/pounce-cli/src/nl_fbbt_translate.rs is the canonical template (it walks an AMPL Expr tree, preserving CSE sharing via Rc::as_ptr keying). Building a similar tape from a Pyomo, JAX, or sympy expression is a finite-effort project.

References

Belotti, Cafieri, Lee, Liberti. On feasibility based bounds tightening. (2010). https://enac.hal.science/hal-00935464v1/document
Liberti et al. Feasibility-based bounds tightening via fixed points. COCOA 2010. https://www.lix.polytechnique.fr/~liberti/fbbt-cocoa10.pdf
Puranik, Sahinidis. Domain reduction techniques for global NLP and MINLP optimization. Constraints 22 (2017). https://arxiv.org/pdf/1706.08601
pounce issue #62.

Auxiliary-Equality Preprocessing

POUNCE’s auxiliary-equality preprocessing pass identifies small, self-contained equality sub-systems in an NLP and solves them before the IPM starts. Variables determined by those sub-systems are pinned to their values; the equality rows are dropped from the problem the IPM sees. The IPM then handles the reduced problem, which is smaller, often better-conditioned, and sometimes solvable in zero iterations.

The pass is a port of ripopt PR #32 by David Bernal Neira to pounce’s TNLP wrapper. It lives entirely inside pounce-presolve and is enabled by setting two options:

pounce problem.nl presolve=yes presolve_auxiliary=yes

What it does, in words

For each call to the inner TNLP, the wrapper:

Builds a bipartite graph between equality constraint rows and variables, using the Jacobian sparsity.
Finds a maximum matching (Hopcroft-Karp).
Runs a Dulmage-Mendelsohn partition, slicing the graph into three pieces: overdetermined, underdetermined, and square (the piece where rows and variables pair up one-to-one).
Decomposes the square piece into independent connected components, and each component into an ordered sequence of blocks via Tarjan SCC.
Classifies each block by how it’s coupled to the rest of the problem: pure equality, objective-coupled, inequality-coupled, or both.
Solves each pure-equality (or, with aggressive coupling, objective-coupled) block via a small dense-LU Newton step and verifies the full-space residual is within tolerance.
Applies accepted blocks by clamping the fixed variables’ bounds (x_l = x_u = value) and dropping the dropped rows.
After the IPM finishes, recovers the Lagrange multipliers for the dropped rows via a small dense-LU stationarity solve, and hands the user back a complete full-space KKT solution.
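The first two steps (bipartite graph, maximum matching) can be illustrated on a toy sparsity pattern. The pass itself uses Hopcroft-Karp; the sketch below deliberately swaps in the simpler augmenting-path method (Kuhn's algorithm), which finds a matching of the same size but is slower on large graphs. All names are hypothetical.

```rust
/// `rows_to_vars[r]` lists the variables with a structural nonzero in
/// equality row `r`. Returns, per variable, the row matched to it (if any).
fn max_matching(rows_to_vars: &[Vec<usize>], n_vars: usize) -> Vec<Option<usize>> {
    let mut var_match: Vec<Option<usize>> = vec![None; n_vars];
    for r in 0..rows_to_vars.len() {
        // Fresh visited set for each augmenting-path search.
        let mut visited = vec![false; n_vars];
        try_augment(r, rows_to_vars, &mut var_match, &mut visited);
    }
    var_match
}

/// Depth-first search for an augmenting path starting at row `r`.
fn try_augment(
    r: usize,
    adj: &[Vec<usize>],
    var_match: &mut [Option<usize>],
    visited: &mut [bool],
) -> bool {
    for &v in &adj[r] {
        if visited[v] {
            continue;
        }
        visited[v] = true;
        let current = var_match[v];
        // `v` is usable if it is free, or if its current row can move elsewhere.
        let available = match current {
            None => true,
            Some(other) => try_augment(other, adj, var_match, visited),
        };
        if available {
            var_match[v] = Some(r);
            return true;
        }
    }
    false
}
```

For rows touching variables {0, 1}, {0}, and {1, 2}, every row is matched, so the three rows and three variables form a square piece in the sense of the Dulmage-Mendelsohn step.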

If the model has no eliminable structure, the pass is a tested no-op and the IPM runs as usual.

When it helps

The pass is most valuable when an NLP contains:

Algebraic auxiliary variables that appear in one or two linear constraints with no other coupling (common in process-engineering and energy-system models).
Internal chains where one variable is defined as a function of another (e.g. T_out = T_in + delta_T with T_in already known).
Mass-balance equalities that form a small square block on a subset of stream variables.

ripopt reports gaslib11_steady shrinking from 204 variables / 200 constraints to 140 / 136 under this pass, and tutorial_flow_density going from 6–7 IPM iterations to 0.

Coupling classes

Every candidate block is classified by what it touches:

Class	Touches inequality?	Touches objective grad?	Eliminated under `safe`?	Eliminated under `aggressive`?
`PureEquality`	no	no	yes	yes
`ObjectiveCoupled`	no	yes	no	yes (postsolve candidate)
`InequalityCoupled`	yes	no	no	no
`ObjectiveAndInequalityCoupled`	yes	yes	no	no

safe is the default. Inequality-coupled blocks are never eliminated in v1 — fixing such a variable could violate the inequality.
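The table reduces to two small decision functions. The Rust types below are hypothetical stand-ins that borrow the class names from the table, not pounce's actual definitions; the `none` policy is omitted because its effect is not described here.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum CouplingClass {
    PureEquality,
    ObjectiveCoupled,
    InequalityCoupled,
    ObjectiveAndInequalityCoupled,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum CouplingPolicy {
    Safe,
    Aggressive,
}

/// Classify a block by what its variables touch outside the block.
fn classify(touches_inequality: bool, touches_objective: bool) -> CouplingClass {
    match (touches_inequality, touches_objective) {
        (false, false) => CouplingClass::PureEquality,
        (false, true) => CouplingClass::ObjectiveCoupled,
        (true, false) => CouplingClass::InequalityCoupled,
        (true, true) => CouplingClass::ObjectiveAndInequalityCoupled,
    }
}

/// Whether a block of the given class is eliminated under the given policy.
fn is_eliminated(class: CouplingClass, policy: CouplingPolicy) -> bool {
    match (class, policy) {
        (CouplingClass::PureEquality, _) => true,
        (CouplingClass::ObjectiveCoupled, CouplingPolicy::Aggressive) => true,
        // Anything touching an inequality is never eliminated in v1.
        _ => false,
    }
}
```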

Options

See Solver Options → NLP Presolve for the full list. The two switches you most often touch are:

Option	Default	Effect
`presolve_auxiliary`	`no`	Master switch. Off → pass is a no-op.
`presolve_auxiliary_coupling`	`safe`	`none` / `safe` / `aggressive` policy.

Diagnostics

The pass populates an AuxiliaryPreprocessingDiagnostics struct on every call. From Rust:

use pounce_presolve::{wrap_with_presolve, PresolveOptions};

let opts = PresolveOptions { enabled: true, auxiliary: true, ..PresolveOptions::defaults() };
let wrapped = wrap_with_presolve(inner, opts)?;
// ... run a solve ...
// Access via the typed handle returned by PresolveTnlp::new:
//   let diag = typed.auxiliary_diagnostics();
//   println!("{diag}");

The Display impl produces output like:

auxiliary-preprocessing: 1 of 1 candidate block(s) eliminated, fixing 2 variable(s) and dropping 2 row(s) in 0 ms
  max block dim: 2, max residual: 0.000e0
  coupling: pure=1, obj=0, ineq=0, both=0

Per-stage timings (stage_time_ms.incidence_ms / matching_ms / dm_ms / components_ms / btf_ms / block_solve_ms / residual_check_ms) and per-class accept counts are also available.

From the command line, set presolve_auxiliary_diagnostics=yes to have the same summary emitted to stderr automatically after every Phase-0 pass:

pounce problem.nl presolve=yes presolve_auxiliary=yes \
  presolve_auxiliary_diagnostics=yes

Limitations (v1)

Both linear and nonlinear blocks are eliminated. The linear path reuses the pre-fetched Jacobian; the nonlinear path drives Newton through TNLP callbacks.

Fixed variables are assumed to be interior to their original bounds at the optimum; postsolve sets their bound multipliers to zero implicitly. Lifting this assumption — handling the case where a fixed variable is at an original bound — is a known follow-up.

The pass currently runs once, at the start of the solve. Iterative re-elimination (running the pass again on the reduced problem) is not supported in v1.

Interaction with the rest of presolve

The auxiliary pass runs before the existing bound-tightening phase (presolve_bound_tightening=yes). The two phases interact at the bounds: aux clamps x_l[i] = x_u[i] = value for variables it fixes; bound tightening then propagates the remaining constraints. The orchestrator filters out aux-dropped rows before tightening runs, so they can’t propagate contradictions back over the clamps. If tighten_bounds still flags infeasibility — for example because an aux-fixed value disagrees with a kept-row’s bound — the orchestrator rolls back the aux pass for that solve and re-runs tightening on the unfiltered rows. A one-line warning lands on stderr when this happens.

Interaction with sensitivity / reduced-Hessian post-processing

When the input .nl file carries sensitivity suffixes (sens_init_constr / sens_state_*) or the CLI is invoked with --compute-reduced-hessian, the entire presolve layer, including auxiliary preprocessing, is disabled automatically. The user sees a single warning on stderr (pounce: disabling presolve — ...) and the solve proceeds without any presolve transformation. This is because the existing sensitivity / reduced-Hessian code paths assume the IPM’s variable and row indices match the user’s original .nl. Lifting this restriction is tracked separately (pounce#19).

Caveat: nonconvex problems can land at a different local optimum

When the auxiliary pass eliminates a block, it pins the block’s variables to a specific feasible point of the equality system: the one Newton converges to from the probe point. On convex problems this is the unique local optimum and the IPM would reach the same values anyway. On nonconvex problems with multiple feasible solutions to the equality system, the auxiliary pass may fix variables to a feasible point in a different basin of attraction than where the un-presolved IPM would eventually settle. The full-space objective then differs between presolve_auxiliary=yes and presolve_auxiliary=no, although both solutions remain feasible and locally optimal.

The vendored gaslib11_steady.nl benchmark in crates/pounce-cli/tests/fixtures/aux_presolve/ exhibits exactly this — presolve_auxiliary=yes converges to objective ≈ 1.825e-02 while the un-presolved path settles at ≈ 3.286e-02. Both points satisfy the model’s KKT conditions; aux just lands in a different basin. The regression test for gaslib11_steady deliberately does NOT assert objective parity for this reason; the test name and comments document the constraint.

If matching the un-presolved path’s local optimum is important for your workflow, leave presolve_auxiliary=no until iterative re-elimination or a multiple-basin-aware policy lands (tracked on pounce#53).

Worked example

Run any of these to see the pipeline in action:

cargo run -p pounce-presolve --example pipeline_demo
cargo run -p pounce-presolve --example phase0_via_tnlp

The first runs the algorithmic pipeline directly on a hand-crafted problem and prints each stage’s output. The second wraps a real TNLP with presolve_auxiliary=yes and exercises the end-to-end elimination + multiplier recovery.

References

Issue tracking the port: pounce#53.
Upstream: ripopt PR #32 by David Bernal Neira (@bernalde). The tutorial_flow_density{,_perturbed}.nl and gaslib11_steady.nl fixtures vendored into crates/pounce-cli/tests/fixtures/aux_presolve/ originate from that ripopt PR.
Design notes: dev-notes/auxiliary-equality-preprocessing.md in the pounce repo.

Troubleshooting Recipes

When a pounce solve fails, stalls, or settles for “acceptable” instead of “optimal”, the default options aren’t always the best fit. This page collects concrete, reproducible recipes that turn failures into successes (or improve already-successful solves) on real problems.

Each entry follows the same shape:

When to try it — symptoms in the iter table or the final report that point to this knob.
The knob — exact option(s) and CLI invocation.
Worked example — before/after table on a named problem so you can verify the recipe reproduces on your machine.

A recipe earns a place on this page when there’s a named problem where it demonstrably helps. “Should help in theory” entries belong in the reference pages (Scaling, FBBT, Options), not here. If you find a new win, the contribution guide (CONTRIBUTING.md) walks through adding it.

Quick lookup by symptom

Symptom	Recipe
Exit “Solved To Acceptable Level” but you need strict optimality	Ruiz linear-system scaling
Hundreds of small steps, slow convergence on a problem with loose bounds	FBBT on nonlinear constraints
`Search Direction is becoming Too Small` early in the iter table	Ruiz linear-system scaling, then μ-strategy switch
Restoration phase fires repeatedly	ℓ₁ exact-penalty wrapper
Iterates wander on an LP-like / linearly constrained problem	`mehrotra_algorithm=yes`
Hundreds of iterations, monotone μ stair-steps slowly toward optimal	`mu_strategy=adaptive`
Iter count looks fine but seconds-per-iter is dominated by the linear solve on a hard QCQP / banded problem	`feral_ordering=auto_race`

Presolve: bound-tightening and row drops

`presolve=yes` (start here)

The pounce presolve pipeline drops fixed variables, propagates bounds from linear rows, detects empty / redundant constraints, and warm-starts bound multipliers. It is off by default to match upstream Ipopt’s no-surprises behavior; turn it on for any non-trivial NLP.

pounce problem.nl presolve=yes

Cheap, almost always helpful, and a prerequisite for FBBT.

FBBT (feasibility-based bound tightening)

Interval propagation through the nonlinear constraint DAG to discover variable bounds the user did not write down (x² + y² ≤ 1 ⇒ x ∈ [-1, 1], exp(x) ≤ 10 ⇒ x ≤ ln 10, etc.). Full reference in Feasibility-Based Bound Tightening.

When to try it. Hundreds of small steps in the iter table, the primal infeasibility stuck against a bound, or a problem that’s clearly under-constrained from the modeler’s side. Requires a structural-expression representation, which today means an .nl input.

The knob.

pounce problem.nl presolve=yes presolve_fbbt=yes

Worked example — clnlbeam (Mittelmann):

	`presolve=yes`	`+ presolve_fbbt=yes`
Exit status	Optimal Solution Found	Optimal Solution Found
Iterations	552	65
Wall time	41.4 s	8.2 s

FBBT discovers tight nonlinear bounds the linear sweep missed; the IPM then has a much smaller feasibility gap to close and converges in roughly one-eighth the iterations.

Not every problem benefits. On corkscrw and arki0003 FBBT produces no measurable change or a slight regression — the infrastructure is cheap (one pass per constraint per outer sweep, capped at fbbt_max_iter=10), so the worst case is a few percent of extra presolve time.

Scaling

Full reference in Scaling. The two layers are independent.

Ruiz scaling on the augmented KKT system

When to try it. Exit status is “Solved To Acceptable Level” with small step sizes near the end, or dual_inf plateaus several orders above tol while primal feasibility is already at machine epsilon. That pattern signals a poorly-conditioned KKT augmented matrix — the back-solve loses the last few fractional digits the convergence check needs.

The knob.

pounce problem.nl presolve=yes linear_system_scaling=ruiz \
       linear_scaling_on_demand=no

linear_scaling_on_demand=no forces always-on Ruiz; the default (yes) defers scaling until the linear solver flags an iterate as poorly scaled. For diagnostic runs, force it on.

Worked example — nql180 (Mittelmann):

	default	`+ linear_system_scaling=ruiz`
Exit status	Solved To Acceptable Level	Optimal Solution Found
Iterations	41	50
Primal infeasibility	4.0e-11	1.2e-15
Dual infeasibility	1.0e-5	3.1e-4
Complementarity	1.2e-9	9.9e-10
Overall NLP error	2.4e-7	9.9e-10

Symmetric ∞-norm equilibration improves primal feasibility by four orders of magnitude and overall NLP error by more than two orders (2.4e-7 to 9.9e-10), letting the solver clear the strict tol gate. The extra nine iterations are well spent. Resolves issue #25.

Worked example — WM_CFy (Mittelmann ampl-nlp, n=8709, m=12850):

	default	`+ linear_system_scaling=ruiz`
Exit status	Optimal Solution Found	Optimal Solution Found
Iterations	605	241
Wall time	~2300 s	~543 s
Overall NLP error	3.4e-9	2.6e-9

A 4× wall-time speedup on a problem that previously sat in the “hard W-B” bucket: every Ipopt + linear-solver combination tried in issue #29 had failed to converge within a 600 s budget. Ruiz wasn’t just an iteration-count win — at 605 iters / 2300 s default-pounce was the only configuration that even finished; Ruiz cuts that to under ten minutes. Same underlying mechanism as nql180: the augmented KKT system is ill-conditioned enough that the back-solve burns iterations chasing residuals symmetric ∞-norm equilibration fixes in one preconditioning pass.

Pairing mu_strategy=adaptive with Ruiz on this problem solves to a ~50× tighter NLP error (5e-11) but takes twice as long (491 iters, 1100 s). For a tighter solution at any cost, use both; for a fast solve, Ruiz alone wins.

NLP-level scaling: when the default hurts

The gradient-based default at the NLP level is computed once at x_0 and is sometimes the wrong fingerprint of the problem — for instance when the starting point lives near a flat region of the objective. If the IPM stalls with no clear infeasibility and the unscaled gradients in the report look reasonable, try turning NLP scaling off:

pounce problem.nl nlp_scaling_method=none

Or, if you know the natural units of your problem better than the solver does, supply user-scaling (see Scaling for the end-to-end recipe).

μ-strategy

Monotone vs. adaptive

Monotone (the default) decreases the barrier parameter μ in geometric steps; adaptive uses a quality-function oracle to pick each new μ based on the current iterate’s complementarity. Adaptive is more aggressive in well-conditioned regions and more conservative near degeneracy.
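
As a rough illustration of the monotone schedule, the sketch below applies the update rule from Wächter & Biegler (2006), μ⁺ = max(tol/10, min(κ·μ, μ^θ)). The constants (μ₀ = 0.1, κ = 0.2, θ = 1.5, tol = 1e-8) are Ipopt's documented defaults; that POUNCE keeps the same values is an assumption here, and the function name is hypothetical. A real solve only takes the next step in this sequence once the current barrier subproblem is solved to its own tolerance.

```python
import math

def monotone_mu_schedule(mu0=0.1, kappa=0.2, theta=1.5, tol=1e-8):
    """Successive barrier parameters under the monotone update rule."""
    floor = tol / 10.0            # mu is never pushed below tol / 10
    mus = [mu0]
    while mus[-1] > floor:
        mu = mus[-1]
        # Linear decrease early on, superlinear (mu ** theta) once mu is small.
        mus.append(max(floor, min(kappa * mu, mu ** theta)))
    return mus

for mu in monotone_mu_schedule():
    print(f"mu = {mu:.3e}   lg(mu) = {math.log10(mu):5.1f}")
```

With these constants the lg(mu) values step through -1.0, -1.7, -2.5, -3.8, -5.7, -8.6, which is the staircase the adaptive strategy is free to skip.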

When to try it. Convex or nearly-convex problems where the monotone schedule wastes iterations stair-stepping toward a μ that the iterate clearly accepts; alternatively, ill-conditioned problems where monotone overshoots and triggers restoration.

The knob.

pounce problem.nl mu_strategy=adaptive

Pair with mu_oracle=quality-function (the default) or mu_oracle=probing for the Mehrotra-style affine probe.

Worked example — arki0009 (Mittelmann):

	`mu_strategy=monotone` (default)	`mu_strategy=adaptive`
Exit status	Optimal Solution Found	Optimal Solution Found
Iterations	358	108

A 70 % iteration-count reduction with no quality regression. The quality-function oracle picks larger μ-decrements when the complementarity gap is well-balanced, skipping the slow stair-step that monotone is forced into on this instance.

nql180 is also rescued by mu_strategy=adaptive alone (Acceptable → Optimal in 61 iters) — so for that problem you have a choice between the Ruiz recipe (above) and the adaptive-μ recipe. Ruiz gives a numerically cleaner solution (primal infeasibility 1.2e-15 vs ~5e-12); adaptive μ is one knob instead of two and has no linear-system overhead.

Mehrotra predictor-corrector

For problems that are LP-like (linear or mildly nonlinear constraints, quadratic objective), the Mehrotra predictor-corrector mode short-circuits the filter line search and accepts every trial step:

pounce problem.nl mehrotra_algorithm=yes

This sets a Mehrotra-canonical configuration (adaptive_mu_globalization=never-monotone-mode, accept_every_trial_step=yes, alpha_for_y=bound_mult, larger bound_push and bound_mult_init_val). On well-conditioned LP-like problems it routinely cuts iteration counts in half. On nonconvex NLPs it can destabilize the solve; see issue #58 for the trade-off discussion.

Restoration & ℓ₁ exact-penalty wrapper

When restoration fires repeatedly, the standard IPM is stuck on an infeasible subproblem the filter cannot accept. The ℓ₁ exact-penalty wrapper rephrases the constraints as an additive penalty term and solves a sequence of bound-constrained subproblems instead:

pounce problem.nl l1_exact_penalty_barrier=yes

Or, only invoke the wrapper as a fallback when standard restoration fails:

pounce problem.nl l1_fallback_on_restoration_failure=yes

This is the recipe for problems with rank-deficient constraints, ill-defined bounds at the starting point, or pathological LICQ violations — anywhere the filter’s history rules out feasibility restoration paths the wrapper can still find.

Worked example: certifying genuine infeasibility

The built-in infeasible-eq problem is the smallest fixture that exercises the fallback end-to-end:

min  x0^2 + x1^2
s.t. x0 + x1 = 1     (g0)
     x0 + x1 = 2     (g1)

The two equalities are mutually contradictory, so no x exists with ||g(x)||_∞ = 0. The standard solve diagnoses this without the wrapper:

$ pounce --problem infeasible-eq
...
EXIT: Converged to a point of local infeasibility. Problem may be infeasible.

That message is the filter giving up: it found an iterate where the constraint gradients are linearly dependent and no admissible step reduces infeasibility further. The output does not tell you whether the problem is genuinely infeasible or whether the filter rejected a feasible neighborhood that another method could reach. Re-run with the wrapper to find out:

$ pounce --problem infeasible-eq l1_fallback_on_restoration_failure=yes
iter      objective   inf_pr   inf_du lg(mu)    ||d|| lg(rg) ...
   0  0.0000000e+00 2.00e+00 0.00e+00   -1.0 0.00e+00     -  ...
   1  1.1250000e+00 5.00e-01 4.22e-09   -1.0 7.50e-01     -  ...
   2r 1.1250000e+00 5.00e-01 9.99e+02   -0.3 0.00e+00     -  ...   ← restoration
...
iter      objective   inf_pr   inf_du lg(mu)    ||d|| lg(rg) ...   ← second inner solve
   0  3.0202000e+00 9.90e-03 0.00e+00   -1.0 0.00e+00     -  ...
...
   6  1.5000000e+00 2.22e-16 2.53e-14   -8.6 1.88e-06     -  ...   ← wrapper converges
                                                                     in the slacked
                                                                     problem
EXIT: Converged to a point of local infeasibility. Problem may be infeasible.

Read this trace carefully. The wrapper’s inner solve converges to KKT tolerance on the slacked problem: inf_pr falls to 2.22e-16 in six iterations because the added slack variables s+, s- absorb the inconsistency between g0 and g1. But pounce reports the overall verdict on the original constraints, so the final Constraint violation = 0.5 is unchanged: that is the irreducible gap, half the difference between the two right-hand sides, (2 − 1)/2. Two independent solvers (filter IPM and ℓ₁-penalty barrier) landing on the same least-infeasible iterate, from different starting strategies, is what makes this an infeasibility certificate rather than a diagnosis of solver fragility.
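
The 0.5 figure can be checked independently of pounce. The sketch below scans the sum s = x0 + x1 for the smallest ∞-norm violation of the two equalities, then evaluates the objective at the symmetric point with that sum. The grid and the helper name are illustrative choices.

```python
def violation(x0, x1):
    """Infinity-norm violation of g0: x0 + x1 = 1 and g1: x0 + x1 = 2."""
    s = x0 + x1
    return max(abs(s - 1.0), abs(s - 2.0))

# Only the sum s = x0 + x1 enters the constraints, so scan s on a grid.
grid = [i / 1000.0 for i in range(3001)]            # s in [0, 3]
best_s = min(grid, key=lambda s: violation(s, 0.0))

# Among points with that sum, x0 = x1 minimizes x0^2 + x1^2.
x0 = x1 = best_s / 2.0
print(best_s, violation(x0, x1), x0**2 + x1**2)     # 1.5 0.5 1.125
```

These values agree with inf_pr = 5.00e-01 and objective 1.1250000e+00 at iteration 1 of the trace.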

The recipe in plain English:

Standard solve says “local infeasibility” → may or may not be a real obstruction; could be filter history, LICQ degeneracy, or a bad starting point.
Wrapper agrees on the same least-infeasible iterate → trust the certificate; reformulate the model.
Wrapper promotes to Solve_Succeeded → the standard filter was rejecting a feasible neighborhood it could not reach; the model itself is fine.
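
If you script this comparison, the three cases reduce to a small decision function. The helper below is hypothetical (pounce does not ship it): it assumes you have captured the exit text of both runs and the final iterates yourself, and the agreement tolerance is an arbitrary choice.

```python
import math

LOCAL_INFEASIBILITY = "Converged to a point of local infeasibility"

def interpret(standard_exit, wrapper_exit, x_standard, x_wrapper, tol=1e-6):
    """Combine the standard and wrapper verdicts following the recipe."""
    if not standard_exit.startswith(LOCAL_INFEASIBILITY):
        return "standard solve did not report local infeasibility"
    if wrapper_exit == "Solve_Succeeded":
        return "model is fine; the filter rejected a feasible neighborhood"
    # Both runs must stop at (numerically) the same least-infeasible point.
    same_point = math.dist(x_standard, x_wrapper) <= tol
    if wrapper_exit.startswith(LOCAL_INFEASIBILITY) and same_point:
        return "trust the certificate; reformulate the model"
    return "inconclusive; inspect both runs"
```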

Implementation note: running this case used to panic with “restoration factory invoked more than once” because the CLI wired a one-shot restoration factory into the application. The fix (pounce#24) routes through a multi-pass provider so the wrapper can mint a fresh restoration phase per inner solve. The regression test that guards it (crates/pounce-cli/tests/l1_fallback_no_panic.rs) uses this same infeasible-eq builtin.

Linear solver choice

linear_solver=ma57 (when built with HSL):

pounce problem.nl linear_solver=ma57

For problems that run many hundreds of iterations, the round-off chain of the inner sparse factorization matters: MUMPS, FERAL/SSIDS, and MA57 do not produce bitwise-identical iterates, and on the worst-case instances that discrepancy can separate convergence from a μ-reset spiral (issue #58, issue #64).

Pair with ma57_automatic_scaling=yes (default in HSL builds) and leave linear_system_scaling=none — MA57’s internal scaling and a pounce-level Ruiz pass should not be stacked.

FERAL ordering: when the adaptive dispatcher guesses wrong

When linear_solver=feral (the default) and per-iter wall time is dominated by the linear solve (typical on dense / quadratically-coupled KKT systems where iteration counts look reasonable but seconds-per-iter are high), the fill-reducing ordering choice often matters more than any other knob. By default, feral_ordering=auto picks AMD / AMF / METIS from cheap pattern features. This is right in the common case but can miss badly on a single hard problem.

The safe recipe is to measure the right ordering rather than guess:

pounce problem.nl feral_ordering=auto_race

This runs symbolic factorization on AMD, METIS, SCOTCH and KaHIP and keeps the one with the smallest factor_nnz. The race costs ~4× a single symbolic pass, but that cost is paid once per problem: symbolic factorization is cached across numeric refactorizations with the same pattern, so the overhead is invisible to the per-iter cost on anything but a one-iter problem.
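
The selection rule itself is easy to sketch. The example below is not FERAL's code: it counts factor nonzeros with the textbook elimination game on a small arrow-shaped pattern and keeps the candidate ordering with the smallest count. The candidate names natural and hub_last are hypothetical stand-ins for real orderings such as AMD or METIS, which are not implemented here.

```python
def factor_nnz(adj, order):
    """Nonzeros in the factor L (diagonal included) for an elimination
    order, counted symbolically with the elimination game."""
    g = {v: set(nb) for v, nb in adj.items()}
    nnz = 0
    for v in order:
        nbrs = g.pop(v)
        nnz += 1 + len(nbrs)       # diagonal + one entry per remaining neighbor
        for u in nbrs:
            g[u].discard(v)
            g[u] |= nbrs - {u}     # eliminating v turns its neighbors into a clique
    return nnz

def race(adj, candidates):
    """Return (name, nnz) of the candidate ordering with the least fill."""
    scores = {name: factor_nnz(adj, order) for name, order in candidates.items()}
    winner = min(scores, key=scores.get)
    return winner, scores[winner]

# Arrow pattern: vertex 0 is coupled to every other vertex.
arrow = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
candidates = {"natural": [0, 1, 2, 3, 4], "hub_last": [1, 2, 3, 4, 0]}
print(race(arrow, candidates))  # ('hub_last', 9)
```

Eliminating the hub first fills the factor completely (15 nonzeros); eliminating it last creates no fill at all (9 nonzeros). The symbolic count needs only the pattern, which is why the result can be reused across numeric refactorizations.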

feral_ordering=amd (concrete pin) is the right escalation when the race shows AMD winning consistently: pinning skips the race entirely on subsequent runs. See the full feral_ordering table for the other variants.

Diagnosing before you reach for a knob

Before trying recipes, dump the per-iter diagnostic categories that pounce supports:

pounce problem.nl --dump kkt --dump iterate \
       --dump-dir /tmp/dump-problem

The dumps land as JSONL under /tmp/dump-problem/. Two categories have wired dump sites today:

--dump kkt — KKT residuals and condition-number proxy; large values motivate Ruiz scaling.
--dump iterate — primal/dual values; needed to spot whether a small step is bound-snapping or infeasibility-driven.

The --dump mu and --dump resto categories are accepted by the CLI but not yet wired to a dump site, so they currently emit no data. For the μ trajectory and restoration entries/exits, use the Studio queries below (which read the iteration stream from the solve report).
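
For a first look at what a dump directory contains, a schema-agnostic reader is enough. The record layout and file names inside the dump directory are not specified here, so this sketch assumes nothing about either: it reads every file in the directory as line-delimited JSON and reports the record count and the top-level keys it saw. The function name is hypothetical.

```python
import json
from collections import Counter
from pathlib import Path

def summarize_dumps(dump_dir):
    """Per file: (number of records, {top-level key: occurrences})."""
    summary = {}
    for path in sorted(Path(dump_dir).iterdir()):
        if not path.is_file():
            continue
        keys, records = Counter(), 0
        with path.open() as fh:
            for line in fh:
                line = line.strip()
                if not line:
                    continue                    # tolerate blank lines
                record = json.loads(line)       # raises on a non-JSON line
                records += 1
                if isinstance(record, dict):
                    keys.update(record.keys())
        summary[path.name] = (records, dict(keys))
    return summary
```

After the command above, summarize_dumps("/tmp/dump-problem") gives a quick inventory of which fields are present before you write a more specific query.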

The Studio MCP (pounce-studio) wraps these dumps in higher-level diagnostic queries (diagnose, find_stalls, restoration_windows), which is the recommended workflow when iterating on options.

Logs, colors, and machine-readable output

POUNCE routes diagnostics through tracing. The knobs are environment variables (see Options › Logging and colored output), not solver options.

When to try it

You want more detail than the iteration table shows (which phase fired, why restoration triggered, linear-solver fallbacks).
A downstream tool (Studio, CI) needs to parse per-iteration data.
Color is garbling a log file, or you want color forced through a pipe.

The knobs

Goal	Invocation
Verbose, everything	`RUST_LOG=debug pounce problem.nl`
Just the restoration phase	`RUST_LOG=pounce::restoration=debug pounce problem.nl`
Separate logs from results	`pounce problem.nl > result.txt 2> solve.log`
Plain text (no color)	`NO_COLOR=1 pounce problem.nl`
Force color through a pipe	`CLICOLOR_FORCE=1 pounce problem.nl`
Line-delimited JSON iterations	`POUNCE_LOG_FORMAT=json pounce problem.nl 2> iters.jsonl`

Logs go to stderr; the iteration table, final summary, and --dump output are program output on stdout. The colored table uses a tiger/rust theme — restoration lines get a kind-dependent background and the row text reddens as the step length alpha shrinks, so a stalling or restoration-heavy solve is visible at a glance. When stdout is not a terminal (or NO_COLOR is set) the table is emitted as plain text with the same column layout.

Contributing a new recipe

A recipe earns a place here when:

There is a named, reproducible problem where the recipe demonstrably helps. Mittelmann benchmark (benchmarks/mittelmann/nl/) is preferred but any committed .nl works.
The before/after numbers are captured at print_level=3 or higher and pasted into the worked-example table.
The recipe is not a special case of an existing one. (If your problem needs three knobs together, write one entry; if your problem benefits from a knob already documented here, file a PR to add a second worked example under that entry.)

Open a PR adding to this file with the table populated. The maintainer-side review checks that the numbers reproduce against the current main and that the recipe really is a recipe — not a problem-specific accident.

Benchmarks

The benchmarks/ directory contains comparison harnesses that run POUNCE against upstream Ipopt across several test suites: the Vanderbei CUTE-in-AMPL collection, Mittelmann ampl-nlp, CHO parameter estimation, GasLib pipelines, water-network design, electrolyte thermodynamics, AC optimal power flow, and large-scale synthetic NLPs. Every suite is .nl-driven — a directory of AMPL .nl files solved by both pounce and ipopt.

Common targets:

make benchmark              # full sweep: every suite + composite report
make benchmark-report       # regenerate benchmarks/BENCHMARK_REPORT.md
make benchmark-cho          # one suite at a time
make benchmark-gas
make benchmark-water
make benchmark-mittelmann
make benchmark-vanderbei    # Vanderbei CUTE-in-AMPL collection (733 problems)

The benchmark inputs themselves — the .nl problem files — and the per-run logs and JSON results are regenerated locally and not tracked in the repository. See benchmarks/README.md for the full list and per-suite details.

Color Theme

POUNCE’s terminal output uses one tiger / rust / warm palette across every colored surface — the iteration table, the branded wordmark, and the interactive debugger. This page is the single reference for what the colors mean; the palette itself lives in pounce-common::style (a pure, unit-tested module — no I/O, no globals).

For the environment variables that turn color on/off (NO_COLOR, CLICOLOR_FORCE, RUST_LOG, POUNCE_LOG_FORMAT) see Solver Options → Logging and colored output.

The palette

Name	Hex	Role
`ALPHA_COOL`	`#000000`	iteration-row text at α = 1 (full Newton step)
`ALPHA_HOT`	`#cc2200`	iteration-row text at α → 0 (stalling); molten-claw base
`TAN`	`#8a6d3b`	restoration soft-stay row background (`s`)
`AMBER`	`#b56a12`	restoration soft-exit row background (`S`)
`RUST_DEEP`	`#6e260e`	restoration hard row background (`R` / resto-phase rows)
`CREAM`	`#f5e6c8`	restoration-row text at α = 1
`BRIGHT_YEL`	`#ffe03a`	restoration-row text at α → 0; molten-claw top
`TIGER_ORANGE`	`#e87a1e`	`WARN` logs, banner accents, molten-claw mid

Two further surfaces reuse these or a small extension:

Name	Hex	Role
steel-hi → steel-lo	`#d2d6dc` → `#5c6068`	wordmark letter sheen, top row → bottom row
gold	`#ffb000`	debugger banner highlight (`interior-point debugger`, `help`)
dim	`#7a7e88`	debugger banner gloss text

Where the colors appear

The iteration table

Two orthogonal channels encode solver state on each row:

Background = restoration kind, keyed off the row’s alpha_primal_char tag:
- s soft-stay → tan, S soft-exit → amber, R hard (and the dedicated restoration phase’s r-suffixed rows) → deep rust.
- Normal (non-restoration) rows have no background.
- Tiny-step tags (t/T) deliberately get no background — that stall is shown by the foreground instead.
Foreground = a smooth gradient on the primal step length α ∈ [0, 1] (a visual stalling cue):
- Normal rows: black (α = 1, full step) → hot red (α → 0).
- Restoration rows: cream (α = 1) → bright yellow (α → 0), so the text stays legible on the dark background.
- α is clamped to [0, 1]; a non-finite α is treated as a full step (no false stalling alarm).

So at a glance: a dark row means restoration (its shade tells you which kind), and redder / yellower text means a shorter step (the solver is struggling to move).
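
A minimal sketch of the foreground channel, using the hex values from the palette table. The real logic lives in pounce-common::style (Rust); this Python version is only an illustration, and two things in it are assumptions: the function name alpha_gradient and the use of a plain linear blend for the "smooth gradient".

```python
import math

ALPHA_COOL = (0x00, 0x00, 0x00)   # normal row, alpha = 1
ALPHA_HOT = (0xCC, 0x22, 0x00)    # normal row, alpha -> 0
CREAM = (0xF5, 0xE6, 0xC8)        # restoration row, alpha = 1
BRIGHT_YEL = (0xFF, 0xE0, 0x3A)   # restoration row, alpha -> 0

def alpha_gradient(alpha, restoration=False):
    """Foreground RGB for a primal step length alpha (linear blend assumed)."""
    if not math.isfinite(alpha):
        alpha = 1.0                       # non-finite counts as a full step
    alpha = min(1.0, max(0.0, alpha))     # clamp to [0, 1]
    full, stalled = (CREAM, BRIGHT_YEL) if restoration else (ALPHA_COOL, ALPHA_HOT)
    return tuple(round(s + (f - s) * alpha) for f, s in zip(full, stalled))

print(alpha_gradient(1.0))                      # (0, 0, 0)
print(alpha_gradient(0.0))                      # (204, 34, 0)
print(alpha_gradient(0.0, restoration=True))    # (255, 224, 58)
```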

The branded wordmark (`pounce` logo)

Printed atop a normal solve and at the top of the debugger REPL. The POUNCE block letters carry a top-lit steel sheen (light silver #d2d6dc at the top row fading to dark steel #5c6068 at the bottom), and three diagonal molten claw slashes rake across them, glowing bright yellow → tiger-orange → deep red top-to-bottom — the project logo’s forged-metal-with-lava look.

The interactive debugger

The REPL opening banner (--debug) reuses the same wordmark, followed by a command cheat-sheet: the shortcut keys are tiger-orange, the interior-point debugger line and the help hint are gold, and the descriptive gloss is dim grey. Pause banners and command output are otherwise uncolored. (--debug-json emits no color; its stdout is a pure JSON channel.)

viz kkt / viz L open in the external Plotly viewer (pounce-dbg-viz), which is a separate visual language: the sparse-matrix heatmaps use a diverging red–blue scale keyed on entry value (sign + magnitude), not the terminal palette.

Logs

WARN-level log lines (on stderr) take the tiger-orange accent; other levels use the subscriber’s defaults.

Terminal support & downgrade

Truecolor (24-bit) is used when the terminal advertises it via COLORTERM — every color above is emitted as exact RGB.
256-color terminals get a graceful fallback: each RGB color snaps to the nearest xterm 6×6×6 cube color (downgrade / nearest_ansi256). The theme still reads correctly, just quantized.
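
The snap to the xterm cube can be sketched in a few lines. The cube's channel levels (0, 95, 135, 175, 215, 255) and the index formula 16 + 36r + 6g + b are the standard xterm 256-color layout; the function name is hypothetical, and the sketch ignores the grayscale ramp (indices 232 to 255) because only the cube is described above.

```python
CUBE_LEVELS = (0, 95, 135, 175, 215, 255)  # xterm 6x6x6 cube channel values

def nearest_cube_index(r, g, b):
    """Index (16..231) of the xterm cube color nearest to an RGB triple."""
    def snap(c):
        # Channels are independent, so the nearest cube color is found
        # by snapping each channel to its nearest level.
        return min(range(6), key=lambda i: abs(CUBE_LEVELS[i] - c))
    return 16 + 36 * snap(r) + 6 * snap(g) + snap(b)

print(nearest_cube_index(0xE8, 0x7A, 0x1E))  # TIGER_ORANGE -> 172
print(nearest_cube_index(0xCC, 0x22, 0x00))  # ALPHA_HOT    -> 160
```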

When color is emitted

Color is opt-out and TTY-aware:

The iteration table is colored only when stdout is a terminal (via anstream::AutoStream, which strips escapes from redirected output while keeping identical column alignment).
The debugger banner is colored only when stderr is a terminal.
NO_COLOR (any value) disables color everywhere; CLICOLOR_FORCE forces it even into a non-terminal sink. See Solver Options.

Because the policy is consistent, redirected logs/output are plain text unless CLICOLOR_FORCE is set, so they are safe to diff, grep, and ingest.
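
The policy amounts to a three-step decision per output stream. The sketch below restates it in Python; it is not the anstream code path. Three details are assumptions rather than documented behavior: NO_COLOR wins when both variables are set, an empty NO_COLOR counts as unset, and CLICOLOR_FORCE=0 counts as unset.

```python
def use_color(env, is_terminal):
    """Decide whether to emit ANSI color for one output stream.

    env is a mapping of environment variables (for example os.environ);
    is_terminal says whether that stream is attached to a TTY.
    """
    if env.get("NO_COLOR"):                        # opt-out always wins (assumed)
        return False
    if env.get("CLICOLOR_FORCE", "") not in ("", "0"):
        return True                                # forced, even into a pipe or file
    return is_terminal                             # default: color only on a TTY
```

For the iteration table pass the stdout TTY state, and for the debugger banner pass the stderr state, matching the two rules listed above.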

For contributors

Add or change colors in pounce-common::style, never with inline ANSI: the constants, the α-gradient (alpha_gradient_rgb), the restoration mapping (resto_background_rgb), the composed iteration_row_style, and the truecolor downgrade all live there and are unit-tested without a TTY. Print sites style through anstyle + anstream (or, for the debugger banner on stderr, gate on stderr().is_terminal() and NO_COLOR). Keep the two iteration-table channels — background = restoration kind, foreground = step length — orthogonal.

Acknowledgments

POUNCE’s nonlinear-programming core is a Rust port of Ipopt, the interior-point nonlinear programming solver by Andreas Wächter, Lorenz T. Biegler, and the COIN-OR community. Its algorithm, console output, and option semantics are modeled directly on that codebase, which is released under the EPL-2.0.

It is a sibling of ripopt, an earlier memory-safe interior-point NLP optimizer in Rust by the same author (DOI 10.5281/zenodo.19542664).

Convex solver inspiration

The specialized convex conic solver (pounce-convex; see Convex Solver) is a pure-Rust port of ideas — not a wrapper — from two reference projects, gratefully acknowledged:

Clarabel by Paul Goulart and Yuwen Chen (University of Oxford). POUNCE’s homogeneous-free conic interior-point design — a quadratic objective handled directly over a product of symmetric cones, with Nesterov–Todd scaling for the second-order cone and a diagonal-plus-rank-1 sparse KKT representation — follows Clarabel’s approach. Clarabel is itself a pure-Rust solver; POUNCE shares the spirit but is an independent implementation.
PaPILO, the presolving library of SCIP (the Zuse Institute Berlin optimization suite). POUNCE’s transaction-stack presolve with full primal and dual postsolve — forcing constraints, dominated columns, bound tightening with global dual recovery, parallel/duplicate rows, iterated to a fixpoint — is modeled on PaPILO’s catalog and postsolve discipline.

Contributors

David Bernal Neira (@bernalde) designed and prototyped the auxiliary-equality preprocessing pass in ripopt PR #32. POUNCE’s pounce-presolve::auxiliary Phase-0 orchestrator (issue #53) is a port of that work — Hopcroft-Karp matching, Dulmage-Mendelsohn partition, Tarjan SCC, block-triangular reduction, damped-Newton block solver, reduction frame with multiplier recovery — and ships with the tutorial_flow_density{,_perturbed}.nl and gaslib11_steady.nl test fixtures David vendored.
Milan Rother (@milanofthe) suggested the boundary value problem solver and the tritium gas-liquid-contactor (GLC) test problem behind docs/src/bvp.md and python/examples/glc_feral_vs_scipy.py. The GLC model is adapted from pathsim-chem (src/pathsim_chem/tritium/glc.py, MIT License, Copyright (c) 2025 PathSim).

Key references

Wächter, A., Biegler, L.T. “On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming.” Mathematical Programming 106(1), 25–57 (2006). DOI 10.1007/s10107-004-0559-y — the algorithm POUNCE implements.
Wächter, A., Biegler, L.T. “Line search filter methods for nonlinear programming: Motivation and global convergence.” SIAM Journal on Optimization 16(1), 1–31 (2005). DOI 10.1137/S1052623403426556
Wächter, A., Biegler, L.T. “Line search filter methods for nonlinear programming: Local convergence.” SIAM Journal on Optimization 16(1), 32–48 (2005). DOI 10.1137/S1052623403426544
Fletcher, R., Leyffer, S. “Nonlinear programming without a penalty function.” Mathematical Programming 91(2), 239–269 (2002). DOI 10.1007/s101070100244 — the filter concept underlying the line search.
Pirnay, H., López-Negrete, R., Biegler, L.T. “Optimal sensitivity based on IPOPT.” Mathematical Programming Computation 4(4), 307–331 (2012). DOI 10.1007/s12532-012-0043-2 — the sIPOPT method behind pounce-sensitivity.
Duff, I.S. “MA57—a code for the solution of sparse symmetric definite and indefinite systems.” ACM Transactions on Mathematical Software 30(2), 118–144 (2004). DOI 10.1145/992200.992202 — the optional ma57 linear-solver backend.
Goulart, P.J., Chen, Y. “Clarabel: An interior-point solver for conic programs with quadratic objectives.” (2024). arXiv:2405.12762 / Clarabel.rs — the conic interior-point design behind pounce-convex.
Gleixner, A., Gottwald, L., Hoen, A. “PaPILO: A Parallel Presolving Library for Integer and Linear Optimization with Multiprecision Support.” INFORMS Journal on Computing 35(6), 1329–1341 (2023). DOI 10.1287/ijoc.2022.0171 — the presolve catalog and dual-postsolve model behind pounce-convex::presolve.
Domahidi, A., Chu, E., Boyd, S. “ECOS: An SOCP solver for embedded systems.” European Control Conference (2013), 3071–3076. DOI 10.23919/ECC.2013.6669541 — the sparse second-order-cone KKT representation.
Amos, B., Kolter, J.Z. “OptNet: Differentiable Optimization as a Layer in Neural Networks.” ICML (2017), 136–145. arXiv:1703.00443 — the implicit differentiation behind the pounce.jax convex layers.
Wilkinson, M.D. et al. “The FAIR Guiding Principles for scientific data management and stewardship.” Scientific Data 3, 160018 (2016). DOI 10.1038/sdata.2016.18 — the provenance model behind the JSON solve report.
