Introduction
POUNCE is a pure-Rust port of the Ipopt interior-point nonlinear programming solver. It solves problems of the form
min f(x)
s.t. g_L <= g(x) <= g_U
x_L <= x <= x_U
where f and g are twice-continuously-differentiable.
The algorithm, console output, and option semantics follow upstream
Ipopt closely enough that anyone used to reading ipopt logs can drop
in pounce without relearning where the numbers live.
Pure Rust by default
The default build is pure Rust — no Fortran, no commercial solver, no system BLAS
required. The bundled FERAL backend provides a sparse symmetric LDLᵀ
factorization. The HSL MA57 backend is available behind the optional
ma57 feature for users who have a license for libcoinhsl and have it installed (see
Installation).
Status
Production-ready for the core IPM workflow. The algorithm-side core,
NLP interface, line search, filter, barrier update (monotone +
Mehrotra adaptive), KKT solve, restoration phase, AMPL .nl reader,
the C ABI (pounce-cinterface), the Python wrapper (pounce-solver),
and the CLI all solve a wide range of NLPs from the standard test
suites (Hock-Schittkowski, CUTEst, Mittelmann ampl-nlp, CHO parameter
estimation, gas/water network design). Sensitivity analysis (sIPOPT
port), reduced-Hessian computation, the auxiliary-equality + FBBT
presolve, and the active-set SQP path are all wired in and available
behind option keys. Existing PyIpopt / cyipopt / JuMP / AMPL clients
link against libpounce_cinterface in place of libipopt
unchanged.
License
EPL-2.0, the same license as upstream Ipopt.
Where to go next
- Installation — build and install POUNCE.
- Quick Start — solve your first problem.
- Running Solves — the command-line driver in depth.
- Acknowledgments — the papers behind the algorithm.
Installation
Prerequisites
A stable Rust toolchain. Nothing else is needed for the default pure-Rust build. Install Rust via rustup:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source "$HOME/.cargo/env"
Verify the install:
rustc --version && cargo --version
Build
From the repository root:
make # release build of the workspace
make test # run all tests
make clippy # lint
make doc # rustdoc for the Rust API
Install
make install # installs to $HOME/.local
sudo make install PREFIX=/usr/local # or system-wide
This drops the pounce binary into $PREFIX/bin and the
libpounce_cinterface shared library into $PREFIX/lib. Make sure
$HOME/.local/bin is on your PATH, then verify:
pounce --version
HSL MA57 backend (optional)
The default FERAL backend needs no external libraries. To build with
the HSL MA57 linear solver instead, you need a CoinHSL install whose
lib/ directory holds libcoinhsl. Point the COINHSL_DIR
environment variable at it and build with the ma57 feature:
export COINHSL_DIR=/path/to/CoinHSL
cargo build -p pounce-cli --release --features ma57
Build CoinHSL from https://www.hsl.rl.ac.uk/ipopt/. MA57 is
primarily useful for benchmarking against upstream Ipopt; the FERAL
backend is the supported default for everyday use, and a build without
--features ma57 never touches COINHSL_DIR.
Using POUNCE as a Rust library
The workspace is a set of library crates (see Algorithm & Workspace for the layout). To browse the Rust API, build and open the rustdoc:
make doc # generates target/doc
Quick Start
This page assumes POUNCE is built and on your PATH
(see Installation).
Solve an AMPL .nl file
pounce problem.nl
This solves the problem and writes a sibling problem.sol next to the
input, following the AMPL solver convention. The console output
mirrors upstream ipopt’s banner, per-iteration table, and final
summary.
Append KEY=VALUE pairs to override options — the syntax and
semantics match the upstream Ipopt CLI:
pounce problem.nl print_level=8 max_iter=500 tol=1e-10
See Solver Options for details.
Try a built-in problem
POUNCE ships several self-contained test problems that exercise the
full pipeline without parsing a .nl file (run pounce --list-problems
for the full set):
pounce --list-problems
pounce --problem rosenbrock
pounce --problem quadratic
From Python
import numpy as np
from pounce import minimize
res = minimize(lambda x: ((x - 1) ** 2).sum(), x0=np.zeros(3))
print(res.fun, res.x)
See the Python API chapter for the full cyipopt-compatible interface.
From Pyomo
import pyomo_pounce # registers 'pounce'
from pyomo.environ import SolverFactory
SolverFactory('pounce').solve(model)
See the Pyomo chapter for details.
Full help
pounce --help
Running Solves
The pounce command-line driver solves built-in TNLPs and AMPL .nl
files. Its console output mirrors upstream ipopt’s banner,
per-iteration table, and final summary, so anyone used to reading
ipopt logs can read pounce logs unchanged.
Basic usage
pounce problem.nl
pounce problem.nl print_level=8 max_iter=500 tol=1e-10
pounce problem.nl linear_solver=ma57 # with --features ma57
pounce problem.nl --options-file ipopt.opt # upstream-format options file
Trailing KEY=VALUE pairs follow the same syntax and semantics as the
upstream Ipopt CLI; they override values loaded from --options-file.
See Solver Options.
Built-in problems
pounce --list-problems
pounce --problem quadratic
pounce --problem rosenbrock
quadratic—min (x[0]-3)² + (x[1]-4)²(unconstrained, optimum(3, 4)).rosenbrock—min 100·(x[1]-x[0]²)² + (1-x[0])²(unconstrained, optimum(1, 1)).bounded-quadratic—quadraticwith box bounds0 ≤ x ≤ 2(optimum at the upper corner(2, 2)).eq-quadratic—min x[0]² + x[1]²s.t.x[0] + x[1] = 1(a single equality).circle—min x[0]s.t.x[0]² + x[1]² = 1(a nonlinear equality).infeasible-eq— two contradictory equalities (x[0]+x[1]=1and=2); exercises the infeasibility-detection path.
Run pounce --list-problems for the authoritative list.
Built-in problems have no .nl stub, so they only write a .sol file
when --sol-output is given explicitly.
Degenerate / MPCC NLPs — the ℓ₁-exact penalty-barrier wrapper
For problems where the standard IPM thrashes in restoration because LICQ fails at the iterate (degenerate equalities, MPCC-like complementarity), enable the Thierry–Biegler ℓ₁-exact penalty-barrier wrapper:
pounce problem.nl l1_exact_penalty_barrier=yes
The wrapper turns every equality row c_i(x) = g_i into a
slack-relaxed c_i(x) − p_i + n_i = g_i with (p_i, n_i) ≥ 0,
augments the objective by ρ · Σ(p + n), and runs a
Byrd–Nocedal–Waltz outer loop that escalates ρ until the slacks
collapse (constraints satisfied) or saturate (locally infeasible
problem detected). The user-visible (x*, λ*) are reported in the
original variable space.
For everyday use, the simpler form is an auto-fallback:
pounce problem.nl l1_fallback_on_restoration_failure=yes
POUNCE first runs the standard solve. If it terminates in
Restoration_Failed, Infeasible_Problem_Detected,
Solved_To_Acceptable_Level, Maximum_Iterations_Exceeded, or
Not_Enough_Degrees_Of_Freedom, the wrapper is invoked transparently
and the result is promoted to Solve_Succeeded only if the retry
succeeds. Otherwise the original status is preserved.
The tuning knobs are listed under Solver Options.
AMPL / Pyomo solver mode
AMPL drivers — and Pyomo’s ASL interface — invoke a solver as
solver problem.nl -AMPL. Pass -AMPL to run pounce that way:
pounce problem.nl -AMPL
It changes nothing about the solve itself; it switches the process to
the AMPL exit-code contract (see below), so the driver reads the
termination from the .sol file rather than the exit status. The
pyomo-pounce package builds on top of this mode.
Exit codes
0—Solve_Succeeded(orSolved_To_Acceptable_Level).- non-zero — any other
ApplicationReturnStatus.
In AMPL solver mode (-AMPL) the exit code instead follows the AMPL
contract: 0 for any solve that ran and produced a .sol file —
limit-reached, infeasible, even a failed solve — since the termination
is carried by the file’s solve_result_num. Genuine startup failures
(unreadable .nl, bad option) still exit non-zero.
Diagnostics & introspection
pounce --about # version, build info, features, backends
pounce problem.nl --dump kkt:5-10 --dump iterate # dump per-iteration diagnostics
pounce problem.nl --dump kkt --dump-dir /tmp/d # override the dump root
--about— print version, build info, enabled features, and linear-solver backends, then exit.--dump <cat>[:<spec>]— write the diagnostic category to per-iteration files (JSONL). Wired categories arekktanditerate; an optional:<spec>selects iterations (e.g.kkt:5,kkt:2-10,iterate:all).--dump-dir <path>— override the dump root (default./pounce-dump-<timestamp>).--dump-format <fmt>— dump format (defaultjsonl).
Help
pounce --help
pounce --version # also -v, -V
Solver Options
POUNCE accepts options the same way upstream Ipopt does. Option names
and semantics follow Ipopt’s, so an existing Ipopt options file or
KEY=VALUE invocation works unchanged.
Setting options
On the command line — append KEY=VALUE pairs after the input:
pounce problem.nl tol=1e-10 max_iter=500 print_level=8
From an options file — upstream ipopt.opt format:
pounce problem.nl --options-file ipopt.opt
Command-line KEY=VALUE pairs override values loaded from the options
file.
Commonly used options
| Option | Meaning |
|---|---|
tol | Overall convergence tolerance on the KKT error. |
max_iter | Maximum number of outer iterations. |
print_level | Console verbosity, 0 (silent) – 12 (maximum debug). |
linear_solver | KKT linear-solver backend. ma57 requires the ma57 feature build. |
mu_strategy | Barrier-parameter update strategy (monotone / adaptive). |
For the full upstream option catalogue, see the Ipopt options reference; POUNCE reuses those names.
For scaling-specific options (nlp_scaling_method, target-gradient
overrides, linear_system_scaling), see the Scaling
reference page. For nonlinear bound tightening (presolve_fbbt,
fbbt_tol, fbbt_max_iter, fbbt_max_constraints), see the
FBBT reference page.
Barrier-parameter (μ) strategy
The barrier parameter μ controls the inner subproblem’s relaxation of
complementarity. The two strategies are monotone (default — geometric
schedule) and adaptive (quality-function oracle picks each μ from the
current iterate’s complementarity). See
μ-strategy for when to switch.
| Option | Default | Meaning |
|---|---|---|
mu_strategy | monotone | monotone (Fiacco–McCormick schedule) or adaptive (oracle-driven). |
mu_oracle | quality-function | Adaptive oracle: quality-function / loqo / probing. |
mu_init | 0.1 | Seed value for μ at the first iterate. |
mu_min | 1e-11 | Floor on μ; the solver stops decreasing past this. |
mu_max | 1e5 | Cap on μ (adaptive mode). When set explicitly it overrides the mu_max_fact initialization. |
mu_max_fact | 1e3 | Initializes mu_max as mu_max_fact · curr_avrg_compl at the first iterate (adaptive mode). |
mu_target | 0.0 | Stop target for μ in monotone mode. |
mu_linear_decrease_factor | 0.2 | κ_μ in μ ← min(κ_μ · μ, μ^θ_μ). |
mu_superlinear_decrease_power | 1.5 | θ_μ in the same formula. |
barrier_tol_factor | 10.0 | Inner-subproblem tolerance scales as barrier_tol_factor · μ. |
sigma_max | 1e2 | Upper clamp on σ chosen by the quality-function oracle. |
sigma_min | 1e-6 | Lower clamp on σ (raising this to 1e-2 can break a stair-stepping stall on some problems). |
adaptive_mu_globalization | obj-constr-filter | Adaptive-mode globalization: kkt-error, obj-constr-filter, or never-monotone-mode. |
Quality-function oracle (adaptive-μ details)
These are only consumed when mu_strategy=adaptive and
mu_oracle=quality-function. Defaults mirror upstream
IpQualityFunctionMuOracle::RegisterOptions.
| Option | Default | Meaning |
|---|---|---|
quality_function_norm_type | 2-norm-squared | Norm used to aggregate KKT components inside q(σ): 1-norm, 2-norm, 2-norm-squared, max-norm. |
quality_function_centrality | none | Centrality penalty term: none, log, reciprocal, cubed-reciprocal. |
quality_function_balancing_term | none | Balancing penalty when complementarity ≪ infeasibilities: none or cubic. |
quality_function_max_section_steps | 8 | Cap on golden-section iterations when picking σ. |
quality_function_section_sigma_tol | 1e-2 | Width tolerance in σ-space terminating the golden-section search. |
quality_function_section_qf_tol | 0.0 | Relative flatness tolerance on q(σ) terminating golden section. |
Adaptive-μ globalization
Tuning the safeguards that fall back to monotone-μ mode when the
adaptive oracle stops making progress. Defaults mirror upstream
IpAdaptiveMuUpdate::RegisterOptions.
| Option | Default | Meaning |
|---|---|---|
adaptive_mu_safeguard_factor | 0.0 | LOQO safeguard floor on the oracle’s μ candidate. |
adaptive_mu_monotone_init_factor | 0.8 | Multiplier on avrg_compl when seeding monotone mode after a bailout. |
adaptive_mu_restore_previous_iterate | no | Restore the latest free-mode iterate when switching to fixed mode. |
adaptive_mu_kkterror_red_iters | 4 | Window length for the kkt-error globalization history. |
adaptive_mu_kkterror_red_fact | 0.9999 | Required relative KKT-error reduction over that window. |
adaptive_mu_kkt_norm_type | 2-norm-squared | Norm used to score the iterate in adaptive globalization decisions. |
ℓ₁ penalty-barrier wrapper options
These tune the degenerate-NLP wrapper described in Running Solves. All are default-tuned and rarely need overriding:
| Option | Default | Meaning |
|---|---|---|
l1_exact_penalty_barrier | no | Run the ℓ₁-exact penalty-barrier wrapper unconditionally. |
l1_fallback_on_restoration_failure | no | Retry with the wrapper only when the standard solve fails. |
l1_penalty_init | 1.0 | Initial penalty weight ρ. |
l1_penalty_max | 1e6 | Maximum penalty weight before declaring infeasibility. |
l1_penalty_increase_factor | 8.0 | Multiplier applied to ρ each outer iteration. |
l1_penalty_max_outer_iter | 8 | Maximum penalty outer iterations. |
l1_slack_tol | 1e-6 | Slack tolerance for “constraints satisfied”. |
l1_steering_factor | 10.0 | Steering-rule factor for ρ escalation. |
NLP Presolve
POUNCE’s TNLP-wrapper presolve pipeline runs before the IPM starts. It tightens variable bounds, drops redundant rows, and (optionally) eliminates square auxiliary-equality sub-systems structurally. All are off by default — set the master switch first:
| Option | Default | Meaning |
|---|---|---|
presolve | no | Master switch for the whole presolve layer. Off → wrapper is a no-op. |
presolve_bound_tightening | yes | Phase 1 — Andersen-style bound propagation from linear rows. |
presolve_redundant_constraint_removal | yes | Phase 2 — drop linear constraints already implied by current bounds. |
presolve_linear_eq_reduction | no | Phase ≥2 — eliminate fixed singleton variables exposed by linear equalities. |
presolve_licq_check | yes | Phase 3 — detect rank-deficient equality blocks before the IPM starts. |
presolve_licq_action | warn | What to do on degeneracy: warn (just report) or auto_l1 (turn on ℓ₁). |
presolve_warm_z_bounds | yes | Phase 4 — warm-start bound multipliers when bounds get tightened by Phase 1. |
presolve_bound_mult_init_val | 1.0 | Value used by Phase 4 for those warm-start hints. |
presolve_max_passes | 3 | Fixed-point iteration cap across the bound-tightening passes. |
presolve_print_level | 0 | Per-pass verbosity (0 silent, 5 per-pass, 8 per-transformation). |
Feasibility-based bound tightening (Phase 1b)
Interval-arithmetic propagation through nonlinear constraint
expression DAGs (see FBBT). Available today for
.nl-loaded problems via NlTnlp; other TNLP sources opt out
silently.
| Option | Default | Meaning |
|---|---|---|
presolve_fbbt | no | Master switch. Requires presolve=yes and an ExpressionProvider. |
fbbt_tol | 1e-6 | Minimum per-variable bound improvement to keep iterating. |
fbbt_max_iter | 10 | Outer-sweep cap. |
fbbt_max_constraints | 0 | Per-sweep cap on constraints inspected (0 = unlimited). |
Auxiliary-equality preprocessing (Phase 0)
A separate set of options controls the structural elimination pass documented in Auxiliary-Equality Preprocessing:
| Option | Default | Meaning |
|---|---|---|
presolve_auxiliary | no | Master switch for the Phase-0 structural elimination pass. |
presolve_auxiliary_coupling | safe | Which coupling classes are eligible: none / safe / aggressive. |
presolve_auxiliary_tol | 1e-8 | Residual tolerance for accepting a candidate block solve. |
presolve_auxiliary_max_block_dim | 8 | Largest block the lightweight Newton solver will attempt (larger blocks rejected in v1). |
presolve_auxiliary_wall_time_fraction | 0.1 | Fraction of the solver’s wall-time budget the pass is allowed to spend. |
presolve_auxiliary_diagnostics | no | Emit the diagnostics summary via the journalist after Phase 0 runs. |
FERAL backend tuning
linear_solver=feral (the default — see
Commonly used options) is configurable
through six feral_* options. Defaults are tuned for the IPM
workload and rarely need changing; reach for these when profiling a
specific problem.
| Option | Default | Meaning |
|---|---|---|
feral_ordering | auto | Fill-reducing ordering method (see table below). auto lets feral’s adaptive dispatcher pick per-matrix; auto_race measures the actual symbolic outcome and keeps the best. |
feral_pivtol | 1e-8 | Relative Bunch-Kaufman partial-pivoting threshold u. Analog of ma27_pivtol / ma57_pivtol. Smaller → sparser L, faster, less stable; larger → more 2×2 blocks, denser, more stable. LAPACK’s textbook maximum-stability value is 0.5. |
feral_refine | yes | Iterative refinement on every back-solve. Closes the residual floor from cascade-break’s L-factor perturbation; disable only when timing the bare factor + back-solve in isolation. |
feral_cascade_break | (unset) | Tri-state. Unset → inherit feral’s Phase B default (CB on with bounded delayed-pivot catchment). yes records explicit intent (no behavioural change). no reproduces pre-Phase-B behaviour by surfacing DelayBudgetExceeded on non-root cascade victims. |
feral_fma | no | Dispatch dense kernels through fused multiply-add intrinsics. Roughly 2× throughput on aarch64 / x86_v3, at the cost of per-pivot rounding drift that trips more WrongInertia checks. Turn on when kernel throughput dominates and the IPM tolerates a noisier inertia signal. |
feral_singular_pivot_floor | 1e-20 | Pounce’s analog of MA57’s CNTL(2). After a successful factor, the smallest accepted D-block pivot magnitude (scaled space) is compared against this absolute floor; if it falls below, the factor is reported Singular so the IPM bumps δ_w. 0 disables. |
feral_ordering variants
All six concrete and adaptive options live under the same string
option. feral_ordering also falls back to the
POUNCE_FERAL_ORDERING environment variable when not set on the
OptionsList.
| Value | Strategy |
|---|---|
auto | Default. Adaptive dispatcher: picks a concrete method per matrix from cheap pattern features. Branches: very-large-and-sparse (n > 100 000, avg degree < 5) → AMD; n ≤ 10 000 → AMF; otherwise → MetisND. One symbolic pass; right when the heuristic shape rules apply (the common case). |
auto_race | Race-based dispatcher: runs full symbolic factorization on AMD, MetisND, ScotchND, KahipND and keeps the smallest factor_nnz. ~4× a single symbolic pass, paid once per problem (symbolic factorization is cached across numeric refactorizations with the same pattern). Use when the cheap dispatcher’s guess is suspect — e.g. pinene_3200_0009, where auto picks MetisND (88 s numeric factor) but amd factors in 19.5 s on the same matrix. |
amd | Approximate Minimum Degree (Amestoy/Davis/Duff). Pins AMD regardless of problem shape; robust default for IPM workloads. Best for very-large-and-sparse cases that the adaptive dispatcher already routes here. |
amf | Approximate Minimum Fill (HAMF4 variant of Amestoy 1999). Strong on small-and-sparse populations (n ≤ 10 000); aggregate fill ≈ 0.87× AMD on feral’s IPM small-sparse inventory. |
metis | feral-metis multilevel nested dissection. Tends to produce squarer fronts than AMD on banded / nearly-1D structure; preferred for large structured matrices. |
scotch | feral-scotch nested dissection. Similar regime to METIS; alternative when METIS is unavailable or for cross-validation. |
kahip | feral-kahip flow-based nested dissection with K1 preprocessing. Ties METIS on fill geomean at 4–6× per-call symbolic cost. Reach for it only when ND fill matters and per-call cost is amortized. |
When in doubt: leave feral_ordering at the default. When a hard
problem looks linear-solver-bound, try feral_ordering auto_race
before per-variant manual sweeping — it’s the safe choice when the
per-problem winner is uncertain.
Logging and colored output
POUNCE emits structured logs and a colored iteration table through the
tracing ecosystem. Behavior is governed by
environment variables (not solver options), so they apply to the pounce
CLI, the C/Python frontends, and anything embedding the library.
| Variable | Values | Effect |
|---|---|---|
RUST_LOG | e.g. info, debug, pounce::restoration=debug | Log verbosity / per-target filtering. Default info. Logs go to stderr. |
POUNCE_LOG_FORMAT | text (default) · json | json emits line-delimited JSON on stderr (incl. the per-iteration pounce::iteration stream) for Studio / CI ingestion. |
NO_COLOR | set to any value | Disables ANSI color in the iteration table and logs (see https://no-color.org). |
CLICOLOR_FORCE | set to any value | Forces color even when stdout is not a terminal. |
Filtering by subsystem. Solver internals log under namespaced targets
— pounce::algorithm, pounce::linsol, pounce::mu, pounce::sqp,
pounce::linesearch, pounce::restoration, pounce::presolve,
pounce::py. For example, to trace only the restoration phase:
RUST_LOG=pounce::restoration=debug pounce problem.nl
Program output vs. logs. The iteration table, the final summary, and
--dump diagnostics are program output on stdout; diagnostic and
progress messages are logs on stderr. Redirecting one does not
affect the other:
pounce problem.nl > result.txt 2> solve.log
Color. The iteration table is colored with a tiger/rust theme:
restoration lines take a background that varies by restoration kind
(soft-stay → tan, soft-exit → amber, hard → deep rust), and the row text
shades from black toward red as the primal step length alpha shrinks
(stalling). Color is emitted only when stdout is a terminal; redirected
output and NO_COLOR get plain text with identical column alignment.
Machine-readable iterations. POUNCE_LOG_FORMAT=json turns the
per-iteration records into JSON on stderr:
POUNCE_LOG_FORMAT=json pounce problem.nl 2> iters.jsonl
Solution Output
The .sol file
Following the AMPL solver convention, solving a positional .nl file
writes a sibling <stub>.sol next to it — pounce problem.nl
produces problem.sol. The file carries the primal x and dual
lambda blocks plus an objno line with the AMPL solve_result_num,
so AMPL (or any .sol reader) can pull the solution back:
pounce problem.nl # writes problem.sol
pounce problem.nl --sol-output out.sol # write to an explicit path
pounce problem.nl --no-sol # skip the .sol write
A .sol is written even when the solve fails, so the
solve_result_num is always recoverable. Built-in problems
(--problem …) have no .nl stub, so they only produce a .sol
when --sol-output is given explicitly.
Choosing an output format
| You want… | Use |
|---|---|
| AMPL / Pyomo to read the result back | the .sol file (default) |
| A structured, schema-versioned report for tooling | --json-output (see JSON Solve Report) |
| Just the console summary | --no-sol |
The .sol and JSON outputs are not exclusive — you can request both
in the same run.
JSON Solve Report
Pass --json-output PATH to write a structured solve report alongside
the regular console output:
pounce problem.nl --json-output result.json
pounce problem.nl --json-output result.json --json-detail full
The report carries everything an AMPL .sol file holds — status,
primal x, dual lambda, suffix blocks — plus FAIR-aligned
provenance metadata (Wilkinson et al. 2016, DOI
10.1038/sdata.2016.18) and,
optionally, the per-iteration trajectory.
Detail levels
| Level | Emits |
|---|---|
summary (default) | FAIR metadata, problem dimensions, final solution, aggregate statistics. |
full | The above plus the per-iteration trajectory (iter, objective, inf_pr, inf_du, mu, step norms, alphas, line-search trials) and sensitivity / suffix blocks. |
Choose summary for production logs and batch runs; full for
debugging — it is the JSON equivalent of upstream’s print_level=8.
Schema stability
The schema is versioned (pounce.solve-report/v1) so downstream
tooling can pin against a major version:
- Adding fields is non-breaking — consumers must tolerate unknown fields.
- Removing or renaming a field bumps the major version (
v1→v2).
The Schema v1 Reference documents every field, the FAIR mapping, and the versioning policy in full.
POUNCE solve-report schema, v1
Schema tag: pounce.solve-report/v1
This document is the canonical reference for the JSON solve report
emitted by pounce --json-output PATH and pounce_sens --json-output PATH. The report carries everything an AMPL .sol file holds —
status, primal x, dual lambda, suffix blocks — plus FAIR-aligned
provenance metadata and (optionally) the per-iteration trajectory.
Implementation: the serde structs live in crates/pounce-solve-report/src/lib.rs (per-iteration IterRecord in crates/pounce-nlp/src/solve_statistics.rs); crates/pounce-cli/src/solve_report.rs wires them to the CLI.
Why a structured solve report?
Production NLP workflows often need to (a) capture which solve
produced which numbers for audit / reproducibility, (b) feed solver
output into downstream tooling (notebooks, dashboards, ML pipelines)
that don’t want to parse a free-form .sol file, and (c) compare
runs across versions of pounce. Both upstream Ipopt’s stdout summary
and AMPL’s .sol were designed for human consumption and AMPL’s
reader respectively — neither carries provenance metadata, neither is
schema-versioned, and neither is trivially machine-parseable across
ecosystems.
A versioned JSON schema with FAIR-aligned provenance solves all three.
FAIR alignment
The fair_metadata block maps onto the four FAIR principles
(Wilkinson et al. 2016, “The FAIR Guiding Principles for scientific
data management and stewardship”, Scientific Data 3, 160018, DOI
10.1038/sdata.2016.18; citation
verified via Crossref on 2026-05-14):
| Principle | Mapping in this schema |
|---|---|
| Findable | result_id (<unix_nanos>-<pid>, globally unique and time-ordered), created_at_iso, created_at_unix_nanos. |
| Accessible | Plain-text JSON on disk; no protocol gating; UTF-8. Same trust model as the .sol file. |
| Interoperable | Schema-versioned (pounce.solve-report/v1); JSON primitives only (no binary blobs); units documented per-field below; solution.status is the enum-variant string for cross-language consumption. |
| Reusable | solver (name + version + git commit + target triple), license, input (kind + path + size) capture enough provenance to reproduce a solve. |
Versioning policy
schema is the version tag. Compatibility rules:
- Adding fields is non-breaking. Consumers MUST tolerate unknown fields. New optional fields land between versions; the major version doesn’t bump.
- Removing or renaming fields bumps the major version (
v1→v2). Consumers should pin against a major version (schema starts_with "pounce.solve-report/v1"). - Changing field semantics without a rename is forbidden. If semantics need to change, add a new field and deprecate the old.
The pre-1.0 phase of POUNCE itself does NOT relax this rule for the schema. Once a solve-report version ships, its field set is frozen even while the rest of the solver is under churn.
Top-level shape
{
"schema": "pounce.solve-report/v1",
"fair_metadata": { ... },
"problem": { ... },
"solution": { ... },
"statistics": { ... },
"iterations": [ ... ], // optional, omitted when empty
"linear_solver": { ... } // optional, omitted when backend did not report
}
Fields
schema (string, required)
Identifier for this schema version. Always
"pounce.solve-report/v1" for v1. Major-version bumps change the
prefix; minor / patch (additive) changes do not.
fair_metadata (object, required)
| Field | Type | Notes |
|---|---|---|
result_id | string | Format: <unix_nanos>-<process_id>. Monotonically ordered within a process, globally unique across processes. No external UUID library needed. |
created_at_iso | string | Solve start time as ISO-8601 UTC: YYYY-MM-DDTHH:MM:SS.sssZ. |
created_at_unix_nanos | integer | Same instant as Unix nanoseconds since 1970-01-01 UTC. Provided alongside the ISO string for consumers that prefer integer arithmetic. |
elapsed_seconds | float | Wallclock seconds the solve took (matches statistics.total_wallclock_time_secs modulo float precision). |
solver | object | See below. |
license | string | SPDX identifier. Always "EPL-2.0" for this version. |
input | object | See Input descriptor below. |
solver sub-object
| Field | Type | Notes |
|---|---|---|
name | string | Always "pounce". |
version | string | Crate version (e.g. "0.1.0"). Read from CARGO_PKG_VERSION at build time. |
git_commit | string | omitted | Build-time git revision. Omitted when the build environment did not set POUNCE_GIT_COMMIT (e.g. development builds). Set via POUNCE_GIT_COMMIT=$(git rev-parse HEAD) cargo build to populate. |
target_triple | string | Build target triple (e.g. "x86_64-apple-darwin"); falls back to "unknown" when Cargo did not expose TARGET at build time. |
Input descriptor (input)
Tagged enum keyed on kind. Possible shapes:
{ "kind": "nl-file", "path": "/path/to/foo.nl", "size_bytes": 366 }
{ "kind": "builtin", "name": "rosenbrock" }
{ "kind": "tnlp-direct" }
nl-file— the input came from.nlfile atpath.size_bytesis present when the file’s metadata is readable; consumers that want bit-exact provenance can hash the file themselves.builtin— the input was a built-in problem named byname(e.g.pounce --problem rosenbrock).tnlp-direct— used by library callers building a TNLP in-process without a.nlround-trip.
problem (object, required)
Problem dimensions reported by the TNLP at get_nlp_info().
| Field | Type | Notes |
|---|---|---|
n_variables | integer | Number of primal variables. |
n_constraints | integer | Number of constraints (equalities + inequalities). |
n_objectives | integer | Number of objectives. The IPM uses objective 0; extras are read but ignored. |
minimize | boolean | true for minimization (the AMPL default). |
nnz_jac_g | integer | omitted | Number of declared non-zeros in the constraint Jacobian. |
nnz_h_lag | integer | omitted | Number of declared non-zeros in the lower triangle of the Lagrangian Hessian. |
solution (object, required)
| Field | Type | Notes |
|---|---|---|
status | string | ApplicationReturnStatus enum variant name verbatim (e.g. "SolveSucceeded", "MaximumIterationsExceeded"). |
solve_result_num | integer | AMPL-style solve-result code (Gay 2005, “Hooking Your Solver to AMPL” §5, p. 23 table): 0 = solved, 100-range = warning, 200-range = infeasible, 400-range = limit reached, 500-range = failure. |
objective | float | Final unscaled objective value. 0.0 (not NaN) when the solve never completed; check statistics.iteration_count > 0 to distinguish. |
x | array of float | empty | Primal vector, length problem.n_variables. Empty when the binary doesn’t capture the final iterate (currently: pounce on the newton_driver fast-path). Omitted from JSON when empty. |
lambda | array of float | empty | Constraint multipliers, length problem.n_constraints. Same omission convention as x. |
suffixes | array of object | empty | sIPOPT-style suffix blocks; emitted only at --json-detail full. See below. |
Suffix entries
{
"name": "sens_sol_state_1",
"target": "var",
"kind": "real",
"values": [0.576..., 0.378..., -0.046..., 4.5, 1.0]
}
| Field | Type | Notes |
|---|---|---|
name | string | AMPL suffix name. |
target | string | One of "var", "con", "obj", "problem". Matches AMPL’s Sufkind_* enum. |
kind | string | "real" or "int". Selects which payload array is populated. |
values | array of float | Dense values (length = target dimension). Present when kind = "real". |
int_values | array of integer | Present when kind = "int". |
statistics (object, required)
Projection of pounce_nlp::solve_statistics::SolveStatistics minus
the per-iteration history (which lives at the top level when present).
| Field | Type | Notes |
|---|---|---|
iteration_count | integer | Number of accepted outer iterations. |
final_objective | float | Unscaled. Matches solution.objective. |
final_scaled_objective | float | Scaled by the IPM’s internal NLP scaling. Equal to final_objective when no scaling is in effect. |
final_dual_inf | float | ` |
final_constr_viol | float | ` |
final_compl | float | Max complementarity over the four bound blocks. |
final_kkt_error | float | Overall KKT error reported by the convergence check. |
num_obj_evals | integer | eval_f call count. |
num_constr_evals | integer | eval_g call count. |
num_obj_grad_evals | integer | eval_grad_f count. |
num_constr_jac_evals | integer | eval_jac_g count. |
num_hess_evals | integer | eval_h count. |
total_wallclock_time_secs | float | Wall time spent inside optimize_*. |
restoration_calls | integer | Number of restoration-phase entries (pounce#12). |
restoration_inner_iters | integer | Cumulative inner-IPM iterations across all restoration calls. |
restoration_outer_iters | integer | Outer iterations that ran in restoration mode (R-line equivalents). |
restoration_wall_secs | float | Wall time spent inside perform_restoration. |
Eval counters (num_*_evals) populate only on the .nl-file path
because the pounce binary’s CountingTnlp wrapper tracks them.
Library callers using IpoptApplication::optimize_tnlp directly see
zeros there; the underlying counts are still available through
upstream’s IpoptCalculatedQuantities if needed.
iterations (array of object, optional)
Per-iteration trajectory. Emitted only at --json-detail full (when
IpoptApplication::enable_iter_history() was called). Omitted from
JSON entirely when empty.
Each row maps to one line of the upstream-formatted console iter table. Fields:
| Field | Type | Notes |
|---|---|---|
iter | integer | 0-based iteration index. |
objective | float | f(x_k) at the start of iter k (unscaled). |
inf_pr | float | Primal infeasibility ` |
inf_du | float | Dual infeasibility ` |
mu | float | Barrier parameter μ_k (not log10; consumers can take log10 if they want the console format). |
d_norm | float | ` |
regularization | float | Hessian regularization δ_w applied this iter; 0.0 when none was needed. |
alpha_dual | float | Dual step length. |
alpha_primal | float | Primal step length. |
alpha_primal_char | string (1 char) | Single-character tag (f, h, r, etc.) matching the alpha-primal column of upstream’s iter table. |
ls_trials | integer | Number of backtracking line-search trials this iter. |
linear_solver (object, optional)
Aggregate post-mortem from the symmetric-indefinite linear backend
that solved the KKT systems. Populated only when the backend
self-instruments (the default FERAL backend does; HSL MA57 and
custom backends plugged through set_linear_backend_factory do not).
Omitted from JSON when no backend reported.
| Field | Type | Notes |
|---|---|---|
solver_name | string | Backend identifier (e.g. "feral"). |
n_factors | integer | Total numeric factorizations performed. |
n_pattern_reuse | integer | Factor calls that reused the existing symbolic pattern. |
n_pattern_changes | integer | Factor calls that triggered a re-analysis. |
max_fill_ratio | float | omitted | Peak nnz(L) / nnz(A) observed across all factorizations. |
min_abs_pivot | float | omitted | Smallest absolute pivot magnitude seen across all factorizations (diagnostic for near-singularity). |
max_abs_pivot | float | omitted | Largest absolute pivot magnitude. |
last_inertia | [int, int, int] | omitted | (positive, negative, zero) inertia of the final factor. Should match (n, m, 0) at a regular KKT optimum. |
last_nnz_a | integer | omitted | Non-zero count of the assembled KKT matrix at the final factor. |
last_nnz_l | integer | omitted | Non-zero count of the L-factor at the final factor. |
Detail levels
The --json-detail LEVEL flag selects how much detail is emitted.
Levels map to verbosity in the same spirit as upstream’s print_level
(0 silent → 12 maximum debug):
| Level | What’s emitted | What’s omitted |
|---|---|---|
summary (default) | FAIR metadata, problem, solution scalars + arrays, aggregate statistics | iterations, solution.suffixes |
full | All of the above plus per-iteration trajectory and suffix blocks | nothing — full detail |
summary is the right choice for production logs and batch runs.
full is the debugging equivalent of upstream’s print_level=8.
Worked example
pounce_sens crates/pounce-cli/tests/fixtures/parametric.nl out.sol --json-output result.json --json-detail full produces (truncated for brevity):
{
"schema": "pounce.solve-report/v1",
"fair_metadata": {
"result_id": "1778777029606881000-76543",
"created_at_iso": "2026-05-14T16:43:49.606Z",
"created_at_unix_nanos": 1778777029606881000,
"elapsed_seconds": 0.011,
"solver": {
"name": "pounce",
"version": "0.1.0",
"target_triple": "x86_64-apple-darwin"
},
"license": "EPL-2.0",
"input": {
"kind": "nl-file",
"path": "crates/pounce-cli/tests/fixtures/parametric.nl",
"size_bytes": 366
}
},
"problem": { "n_variables": 5, "n_constraints": 4, "n_objectives": 1, "minimize": true },
"solution": {
"status": "SolveSucceeded",
"solve_result_num": 0,
"objective": 0.5510204081632656,
"x": [0.6326530575201161, 0.3877551079678144, 0.020408165487930466, 5.0, 1.0],
"lambda": [-0.16326530000405073, -0.28571431357898697, -0.16326530000405073, 0.18075803406303625],
"suffixes": [{
"name": "sens_sol_state_1",
"target": "var",
"kind": "real",
"values": [0.5765305974643309, 0.3775510440570709, -0.04591835847859835, 4.5, 1.0]
}]
},
"statistics": { "iteration_count": 9, "final_dual_inf": 2.89e-14, "...": "..." },
"iterations": [
{ "iter": 0, "objective": 0.0451, "inf_pr": 5.0, "inf_du": 0.407, "mu": 0.1,
"d_norm": 0.0, "regularization": 0.0, "alpha_dual": 0.0, "alpha_primal": 0.0,
"alpha_primal_char": " ", "ls_trials": 0 },
{ "iter": 1, "objective": 0.957, "inf_pr": 0.212, "...": "..." }
]
}
Consumer guidance
- Pin the major version. Check
schema.startswith("pounce.solve-report/v1")before consuming. - Tolerate unknown fields. New optional fields will land between minor versions of pounce. Use
serde(default)/ equivalent. - Distinguish “no solve” from “solve produced zero”. Pre-solve, scalar fields are
0.0(notNaN, because JSON has no NaN literal).statistics.iteration_count == 0is the signal that no solve occurred. solution.x/solution.lambdamay be empty. When the binary couldn’t capture the final iterate (currently: thepouncebinary on itsnewton_driverfast-path form=0, n≤1000problems), the arrays are empty and the keys are omitted from JSON entirely.pounce_sensalways populates them.
References
- Wilkinson et al. (2016). “The FAIR Guiding Principles for scientific data management and stewardship.” Scientific Data 3, 160018. DOI 10.1038/sdata.2016.18. (Verified via Crossref 2026-05-14.)
- Gay (2005). “Hooking Your Solver to AMPL.” https://ampl.com/REFS/hooking2.pdf. §5 (Returning Results to AMPL) for the
.solbaseline this schema is structured around. - SPDX license identifiers: https://spdx.org/licenses/.
Sensitivity Analysis
POUNCE includes a parametric sensitivity capability compatible with
upstream Ipopt’s contrib/sIPOPT/ (Pirnay, López-Negrete & Biegler
2012, DOI
10.1007/s12532-012-0043-2).
It computes the first-order change in the optimal primal solution with
respect to a problem parameter, reusing the KKT factorization from the
converged solve. Three entry points cover the common workflows.
AMPL CLI
The main pounce driver auto-detects the sIPOPT suffixes
(sens_state_1, sens_state_value_1, sens_init_constr) in an input
.nl, runs a post-optimal sensitivity step after the solve, and
writes the perturbed primal back as a sens_sol_state_1 suffix — no
separate binary or flag needed:
pounce problem.nl # writes problem.sol
pounce problem.nl out.sol --json-output result.json --json-detail full
pounce_sens is retained as a thin backward-compatibility alias:
pounce_sens in.nl out.sol is identical to pounce in.nl out.sol, so
existing AMPL / solver scripts keep working unchanged.
Related flags:
--sens-boundcheck/--sens-bound-eps EPS— clamp the perturbed primalx* + Δxonto the declared[x_l, x_u]box.--compute-red-hessian/--rh-eigendecomp— compute the reduced Hessian (and its eigendecomposition) over the variables tagged by thered_hessianinteger var-suffix.
Rust library
SensSolve is a builder that wraps the on_converged callback
plumbing into a single call:
#![allow(unused)]
fn main() {
use pounce_sensitivity::SensSolve;
let result = SensSolve::new(vec![2, 3])
.with_deltas(vec![0.05, 0.0])
.with_reduced_hessian()
.run(&mut app, tnlp);
// result.dx, result.reduced_hessian, result.status
}
with_reduced_hessian_eigen() adds the eigendecomposition;
with_boundcheck(eps) enables the bound projection.
Python
solve_with_sens exposes the same capability from the
cyipopt-compatible Python wrapper:
# pin_constraint_indices is required; pass deltas=..., compute_reduced_hessian=True,
# or both. Returns (x, info) — sensitivity outputs live in the info dict.
x, info = prob.solve_with_sens(x0, pin_constraint_indices=[2, 3],
deltas=[0.05, 0.0], sens_boundcheck=True)
# info["dx"], info["reduced_hessian"], info["reduced_hessian_eigenvalues"], ...
compute_reduced_hessian=True returns the reduced Hessian in
info["reduced_hessian"]; rh_eigendecomp=True adds its
eigendecomposition; sens_bound_eps=… tunes the bound projection. See
python/notebooks/04_sensitivity.ipynb
for a walkthrough.
Verification
All three entry points are verified against upstream sIPOPT 3.14.19’s
parametric_cpp golden output to within roughly 6e-9 per component.
The bound projection is a single-pass clamp; upstream’s iterative
Schur refinement (re-factorize on each violation) is intentionally not
ported.
Sessions: Factor-Once / Solve-Many
POUNCE’s IPM converges to a KKT linear system that, once factored, answers a number of useful follow-up questions cheaply: parametric steps, reduced Hessians, custom back-solves. The session APIs let you hold that factor alive between operations, rather than rebuilding it on every call. The same machinery serves two workloads:
- Sensitivity / many-RHS. After one solve, issue many cheap operations against the converged factor — parametric steps for several parameter perturbations, reduced Hessians over several pinned-row sets, raw KKT back-solves.
- Factor-only. For non-IPM uses (shift-invert eigensolves, custom
Newton iterations) the underlying [
Factorization] handle inpounce-linsolexposes factor / refactor / back-solve directly, without the IPM in the loop.
Which layer do I want?
| You want… | Use |
|---|---|
| One solve plus a few sensitivity queries, from Python | pounce.Solver (Python) |
| The same, from C | IpoptSolver (C ABI) |
| The same, from Rust | pounce_sensitivity::Solver |
| Just a sparse symmetric factor — no IPM involved | pounce_linsol::Factorization |
| A one-shot sensitivity computation with a fluent builder | pounce_sensitivity::SensSolve (Rust) or Problem.solve_with_sens (Python) |
The session API does not rebuild the IPM. Each solve() call runs
the full barrier method from scratch. What it reuses is the factor
that exists at convergence: KKT back-solves and sensitivity
operations skip the symbolic factor, AMD ordering, and numeric
factorization.
Python
import pounce
problem = pounce.Problem(...)
solver = pounce.Solver(problem)
x, info = solver.solve(x0=x0)
assert solver.converged
# Parametric step ∂x*/∂p · Δp, with p pinned by g(x) row indices.
dx = solver.parametric_step([2, 3], [-0.5, 0.0])
# Reduced Hessian B K⁻¹ Bᵀ over the same pinned-row set.
hr = solver.reduced_hessian([2, 3])
# Raw KKT back-solve, useful for custom workflows.
dim = solver.kkt_dim
rhs = np.zeros(dim)
lhs = solver.kkt_solve(rhs)
The KKT compound vector is laid out as
x || s || y_c || y_d || z_l || z_u || v_l || v_u. pin indices in
parametric_step / reduced_hessian are 0-based row indices into
g(x); they are shifted internally to the y_c block.
pounce.Problem.solve() and Problem.solve_with_sens() still work
unchanged — each internally builds a fresh session — but new code that
issues more than one sensitivity query per solve should prefer
pounce.Solver to skip rebuilding the application.
C
IpoptProblem prob = CreateIpoptProblem(...);
AddIpoptStrOption(prob, "linear_solver", "feral");
/* Consumes prob — the IpoptSolver is now the sole owner.
prob is NULLed; calling FreeIpoptProblem(prob) on the now-null
pointer is harmless. */
IpoptSolver sol = IpoptCreateSolver(&prob);
double x[n], obj;
IpoptSolverSolve(sol, x, NULL, &obj, NULL, NULL, NULL, user_data);
Index dim = IpoptSolverGetKktDim(sol); /* compound KKT dim */
double rhs[dim], lhs[dim]; /* memset rhs as needed */
IpoptSolverKktSolve(sol, rhs, lhs);
Index pins[2] = {2, 3};
double deltas[2] = {-0.5, 0.0};
double dx[n];
IpoptSolverParametricStep(sol, 2, pins, deltas, dx);
double hr[2 * 2]; /* column-major dense */
IpoptSolverReducedHessian(sol, 2, pins, 1.0, hr);
IpoptFreeSolver(sol);
The classic IpoptSolve API is unchanged and unaffected; the session
handle lives alongside it.
Rust
#![allow(unused)]
fn main() {
use pounce_sensitivity::Solver;
let mut solver = Solver::new(app, tnlp);
solver.solve();
assert!(solver.converged().is_some());
let dx = solver.parametric_step(&[2, 3], &[-0.5, 0.0])?;
let hr = solver.compute_reduced_hessian(&[2, 3], 1.0)?;
let mut lhs = vec![0.0; solver.kkt_dim().unwrap()];
solver.kkt_solve(&rhs, &mut lhs)?;
}
For purely linear-algebra uses with no IPM in the loop:
#![allow(unused)]
fn main() {
use pounce_linsol::Factorization;
let mut fact = Factorization::new(dim, ia, ja, &values, backend)?;
fact.solve(&mut rhs, 1)?; // back-substitute in place
fact.refactor(&new_values)?; // pattern preserved; numeric reuse
fact.solve_one(&mut another_rhs)?;
}
What’s preserved across operations
- Symbolic factor / AMD ordering. Owned by the linear-solver
backend; reused on every back-solve and on
refactor(). - Numeric factor. Reused on every back-solve until you refactor.
- The converged primal-dual state (
x*, multipliers,g(x*), iteration stats).
What’s not preserved across solve() calls
The session is currently a factor-and-query value: one solve,
many follow-up operations. A separate resolve() that re-runs the
IPM while reusing the symbolic factor + AMD ordering across top-level
solves (for MPC / B&B / warm-start workloads) is planned but not yet
implemented. Each solve() call today runs a fresh IPM.
Verification
All session entry points are tested for numerical equivalence with the corresponding one-shot APIs:
pounce.Solver.solve≡Problem.solve(1e-12).pounce.Solver.parametric_step≡Problem.solve_with_sens(deltas=…)['dx'](1e-10).pounce.Solver.reduced_hessian≡Problem.solve_with_sens(compute_reduced_hessian=True)['reduced_hessian'](1e-10).pounce_sensitivity::Solver::parametric_step≡SensSolve::with_deltas(1e-10).
See python/tests/test_solver_session.py and
crates/pounce-sensitivity/tests/solver_session.rs for the full test
matrix.
Interactive Solver Debugger
POUNCE ships an interactive debugger for the interior-point loop — a pdb for the IPM. You can pause the solve at well-defined points, inspect and mutate the live mathematical state (the iterate, multipliers, the barrier parameter μ), set breakpoints (by iteration, on a numeric condition, or on a solver event), step through an iteration’s internal phases, rewind to an earlier iterate, re-solve from a saved point with new options, and drop in automatically when a solve fails.
It has two front ends sharing one command engine:
- a human REPL (
--debug) with history, Ctrl-R search, and Tab completion, and - a newline-delimited JSON protocol (
--debug-json) that an LLM agent, a script, or a visual debugger (e.g. a VS Code Debug Adapter) can drive programmatically.
No production NLP solver ships anything like this; if you have used
ipopt you have had print_level and a log. This is a live debugger.
The debugger has zero effect on the solve when it is not attached. The checkpoint fire-sites short-circuit when no debugger is installed, so the standard regression suite is bit-for-bit identical with and without the feature compiled in.
Quick start
pounce problem.nl --debug # human REPL, pauses at iteration 0
pounce problem.nl --debug-json # JSON protocol on stdin/stdout
pounce problem.nl --debug-on-error # run freely; drop in only if it fails
pounce problem.nl --debug-on-interrupt # run; Ctrl-C drops you in
A 30-second session (human REPL):
$ pounce --problem rosenbrock --debug
── pounce-dbg ── iter 0 @iter_start mu=1.000e-1 obj=2.420000e1 inf_pr=0.00e0 inf_du=1.00e2
pounce-dbg> info
iter = 0
mu = 1.000000e-1
objective = 2.42000000e1
...
pounce-dbg> print x
x = [-1.200000e0, 1.000000e0]
pounce-dbg> break if inf_du<1e-6
conditional breakpoint: inf_du<1e-6
pounce-dbg> continue
... solver runs ...
── pounce-dbg ── iter 21 @iter_start mu=... inf_du=8.7e-7
↳ inf_du<1e-6
pounce-dbg> quit
The prompt is on stderr; the solver’s own iteration table stays on stdout, so a redirected log is unaffected.
The two front ends
--debug (REPL) | --debug-json | |
|---|---|---|
| Audience | human at a terminal | agent / script / GUI |
| Channel | prompt + output on stderr | pure JSON on stdout |
| Line editing | rustyline: history (~/.pounce_dbg_history), Ctrl-R, Tab completion | n/a (caller supplies UI) |
| Solver table | shown on stdout | suppressed (print_level 0) |
| Commands | bare strings | bare strings or {"cmd":…,"args":[…],"id":…} |
On a non-TTY stdin (a pipe), the REPL falls back to a plain line reader (no history/Tab) but otherwise behaves identically — handy for scripted tests.
The JSON protocol is documented in full below.
Pausing and flow control
Checkpoints
The loop fires the debugger at these points (a pause reports which one
via its checkpoint field):
| Checkpoint | Fires | What’s fresh |
|---|---|---|
iter_start | top of each outer iteration | the accepted iterate from the previous step |
after_mu | μ updated for this iteration | the new barrier parameter |
after_search_dir | Newton step δ solved | the step (dx …), regularization, KKT inertia |
after_step | trial accepted | the step lengths α, the new iterate |
step_rejected | line search gave up (tiny step / all backtracks failed), before restoration | the search direction δ and the un-accepted iterate |
pre_restoration_entry | just before restoration | the iterate that tripped restoration |
post_restoration_exit | restoration returned | what restoration produced |
terminated | once, before the solve returns | the final / failing iterate + status |
By default the debugger only stops at iter_start (and terminated).
The sub-iteration checkpoints fire every iteration but resume immediately
unless you ask to stop at them.
Stepping into restoration. The same debugger drives the restoration
inner IPM: when the solve enters restoration, the inner solve’s
checkpoints fire too. A step/stepi that lands on an inner iteration
pauses there with in_restoration: true (REPL banner shows
[restoration]), and print x shows the restoration sub-NLP iterate.
stop-at resto (pre_restoration_entry) is the easy way to catch the
hand-off and then step inward.
Stepping
| Command | Effect |
|---|---|
step / s / n | run to the next iter_start |
step sub / stepi / si | run to the next checkpoint of any kind (walk an iteration’s phases) |
continue / c | run to the next breakpoint (or to completion) |
run N / r N | run until iteration N |
stop-at <cp> | always pause at checkpoint <cp> |
detach | stop pausing; run to completion |
quit / q | stop the solve now |
stop-at takes a checkpoint name or a friendly alias:
stop-at after_search_dir # or: stop-at kkt
stop-at pre_restoration_entry # or: stop-at resto
stop-at # list active stop-at checkpoints
stop-at clear
Aliases: mu → after_mu, kkt/search_dir → after_search_dir,
step → after_step, resto → pre_restoration_entry, resto_exit →
post_restoration_exit.
Breakpoints
Three kinds, all reported in break and surfaced as the pause reason.
By iteration
break 12 # pause at iteration 12 (alias: b 12)
tbreak 12 # one-shot: pause at 12, then delete itself (alias: tb)
break # list all breakpoints
break del 12 # remove
break clear # remove everything (iters + conditions + events)
Watchpoints (data breakpoints)
watchpoint x[3] # pause when x[3] changes (alias: wp)
watchpoint x 1e-3 # pause when any x component moves by > 1e-3
watchpoint # list; watchpoint del x[3]; watchpoint clear
Distinct from watch (which only displays): a watchpoint pauses the
solve when the watched value changes by more than its threshold (default
0 = any change) between iterations. Useful for a component expected to
stay put (e.g. a variable pinned at a bound).
Breakpoint command lists
Attach commands to a breakpoint that run automatically when it hits — semicolon-separated, ending with a flow command to auto-resume:
break 5
commands 5 print kkt ; set mu 0.1 ; continue # at iter 5: inspect, tweak μ, go
commands 5 clear # remove
commands # list all
When iteration 5 is reached, the debugger emits the pause, runs the
attached commands (each result is reported), and if one of them
resumes/stops, honors it without dropping to the prompt — otherwise it
falls through to the interactive prompt as usual.
Conditional (with compound predicates)
break if inf_pr<1e-6
break if mu<1e-4 && inf_pr>1e-3
break if iter>10 && (inf_du>1e-2 || obj<0)
break clear cond
- Metrics:
mu,inf_pr,inf_du,obj,err(overall NLP error),iter. - Operators:
<,<=,>,>=,==(==is float-tolerant). - Compound:
&&and||, evaluated strictly left-to-right with no precedence; parentheses are accepted but stripped (they don’t group). For real grouping, register several conditions — any one that holds fires.
Conditions are evaluated at iter_start.
On a solver event
break on regularized
break on resto_entered
break clear events
| Event | Fires when |
|---|---|
resto_entered | the algorithm enters restoration |
resto_exited | restoration returns |
regularized | the KKT system needed regularization (δ_w > 0 — inertia correction) |
tiny_step | the primal step is numerically negligible (‖dx‖∞ < 1e-10) |
ls_rejected | the line search tried more than one trial point |
mu_stalled | μ held (to tolerance) for 3 consecutive iterations |
nan | the NLP error or objective became non-finite |
Events fire at whatever checkpoint makes them observable (e.g.
regularized at after_search_dir), and pause with
reason: "event: <name>".
Inspecting state
info # one-line summary: iter, mu, obj, inf_pr, inf_du, nlp_error, dims
print x # a primal/dual block (alias: p x)
print dx # a search-direction block (d + block name)
print mu # a scalar: mu|obj|inf_pr|inf_du|err|compl|iter
print kkt # KKT inertia + regularization (see below)
print rank # SVD numerical rank of the equality Jacobian J_c (see below)
print active # which bound categories are near-active (small slack)
watch mu # auto-print a target at every pause (alias: display)
watch # list watches; watch del mu; watch clear
watch <target> registers any print target (block, dx, scalar,
kkt) to be shown automatically at every subsequent pause — the
debugger’s equivalent of gdb’s display. In JSON mode the values arrive
in the pause event’s watches array.
Blocks (the eight components of the primal-dual iterate):
| Name | Meaning |
|---|---|
x | primal variables |
s | inequality slacks |
y_c | equality-constraint multipliers |
y_d | inequality-constraint multipliers |
z_l, z_u | bound multipliers on x |
v_l, v_u | bound multipliers on s |
Prefix any block with d (dx, dz_l, …) to print the corresponding
block of the most recent Newton step.
Model names (.col / .row)
A solver-internal diagnostic that says “variable 132 in equation 3 looks
singular” is far less actionable than one that says “T_reactor in
energy_balance”. Lee et al. (2024) identify this gap — between
detecting an issue numerically and tracing it back to a named equation
in the modeling environment — as a central roadblock for debugging
equation-oriented models.1
AMPL .nl files carry no names, but AMPL emits two optional sibling
files when the modeler sets option auxfiles rc;:
| File | Contents |
|---|---|
stub.col | one variable name per line, in column order |
stub.row | one constraint name per line, in row order |
When these sit next to the .nl, pounce captures them
(NlProblem::var_names / con_names) and exposes them through the
ExpressionProvider::variable_name / constraint_name seam. Missing or
malformed name files are non-fatal — names are a diagnostic aid, never
load-blocking, so the debugger simply falls back to index labels.
print residuals uses these names directly. Residual values live in the
solver’s split space (equalities and inequalities separated, fixed
variables removed), so a name only labels the right row if it is carried
through the same permutations. The TNLP publishes its .col/.row names
under the conventional idx_names metadata key, and OrigIpoptNlp
projects them into split space (x_not_fixed_map for variables, c_map
for equalities, d_map for inequalities) — the debugger reads the result
via DebugCtx::split_names. So a near-singular equality residual prints as
c[energy_balance] = +3.142e-04 |3.142e-04|
instead of c[3]. The same idx_names pool labels grad_x_L[...]
(variable names) and grad_s_L[...] / d-s[...] (inequality names). The
JSON payload keeps the numeric index and adds a name field.
Status. Capture, exposure, and
print residualslabeling are live on the AMPL.nlpath with names projected through the bound / c-d-split permutations. Presolve renumbers rows, soPresolveTnlpdeclinesidx_namesrather than risk mislabeling a permuted row — under presolve the debugger safely falls back to index labels. Carrying names through the presolve map and decoratingprint activeare the next steps built on this foundation.
print equation — the algebra of a named constraint
Naming a culprit row is only half the story; the next question is always
what does that equation actually say? Lee et al. (2024) make this the
core of actionable equation-oriented diagnostics — a debugger should
surface the named equation, not just a row index.1 print equation closes that loop: once print residuals points at, say,
c[energy_balance], you read the constraint’s source algebra directly.
(dbg) print equation energy_balance
energy_balance: T_reactor*flow - 300*flow - Q = 0
(dbg) print equation 14 # by original .nl row index
c[14]: x[3]^2 + x[7]^2 <= 1
A constraint is addressable by its model name (preferred, and robust
to row reordering) or its original .nl row index. With no argument,
print equation reports how many equations are available. The renderer
works from the faithful Expr DAG the .nl parser built — not the lossy
evaluation tape — so common-subexpressions, imported functions, and
piecewise/conditional forms render as written. The affine part is printed
with tidy signs (a - 2*b, not a + -2*b), zero-coefficient Jacobian
placeholders are suppressed, and bounds render in their natural relation
(= rhs, lo <= body <= hi, >= lo, <= hi). The JSON payload carries
{index, name, equation}.
Equations are static model data in original .nl row order, so unlike
residuals they need no split-space projection — print equation works
regardless of presolve. It is available whenever a model was loaded from
an .nl file; the JSON name field is present only when a .row auxfile
supplied one.
print kkt — inertia and regularization
Available at/after after_search_dir (use stop-at kkt). This is the
view a solver expert reaches for when a step looks wrong:
pounce-dbg> stop-at kkt
pounce-dbg> continue
── pounce-dbg ── iter 3 @after_search_dir ...
pounce-dbg> print kkt
dim = 3
inertia = n+=2 n-=1 (expected n-=1) → correct
delta_w = 0.000000e0 (primal regularization)
delta_c = 0.000000e0 (dual regularization)
status = Success
The augmented (KKT) system has expected inertia (n₊ = n, n₋ = m, n₀ = 0) where m is the number of equality + inequality multipliers.
A mismatch — or a nonzero delta_w/delta_c — is the classic signal
that the step is being stabilized (the solver added regularization to
fix the inertia).
For the matrix and factor themselves:
viz kkt # the assembled augmented-system matrix (triplets) + inertia
viz L # the LDLᵀ factor (strict-lower triplets + values)
viz kkt writes the KKT matrix as 1-based lower-triangle triplets
(dim, irn, jcn, vals) alongside the inertia summary — point
$POUNCE_DBG_VIEWER at a heatmap script. viz L writes the LDLᵀ
factor (n, fill-reducing perm, strict-lower l_irn/l_jcn/l_vals
in permuted coordinates), read out of the factor the solver actually
computed.
Both are read-only and always show the most recent factorization:
the current iteration’s system at an after_search_dir stop, or the
previous iteration’s at the default iter_start pause (the step
that produced where you’re standing). The matrix and factor are captured
every iteration while the debugger is stepping; once you detach (run
free) the capture is dropped — so on a large problem a free run doesn’t
pay the O(nnz) assembly. If you viz kkt/viz L right after a free run,
step once to re-capture.
print rank — numerical rank of the equality Jacobian
print kkt tells you that the dual system needed regularization
(delta_c > 0) or that the inertia was wrong; the structural_singularity
finding names equations that are dependent by sparsity pattern. print rank closes the last gap: a rank-revealing SVD of the equality Jacobian
J_c at the current iterate. It factors the matrix the solver actually
sees (constraint scaling already applied), so it localizes the dependency
to specific equations — including dependencies that are numerical only
(values that cancel over a full sparsity pattern), which the structural
Dulmage–Mendelsohn pass cannot detect.
It doesn’t just name the culprit equations — it prints them. When a
.nl model is loaded, each implicated row’s source algebra is rendered
directly beneath it (the same DAG-faithful text print equation shows), so
you read the dependency without a second command:
pounce-dbg> print rank
equality Jacobian J_c: 3 row(s) × 4 column(s)
numerical rank = 2 / 3 (deficiency 1)
σ_max = 3.162e0 σ_min = 0.000e0 cond = inf (σ_min = 0) (rank tol τ = 1.40e-15)
singular values: [3.162e0, 1.414e0, 0.000e0]
rank-deficient: 1 equation(s) lie in the near-null space (linearly dependent / redundant) — the source of δ_c regularization:
c[mass_balance] (participation 0.50)
x[0] + x[1] - 10 = 0
c[mass_balance_dup] (participation 0.50)
x[0] + x[1] - 10 = 0
The two equations print identically — that is the redundancy, now visible on its face.
For the SVD J_c = U Σ Vᵀ, the left singular vectors u_k whose singular
value σ_k ≈ 0 span the left null space — the row combinations u_kᵀ J_c ≈ 0 that vanish. Each row’s participation w_i = Σ_{k : σ_k ≤ τ} u[i,k]² ∈ [0, 1] localizes the dependency: a redundancy shared between two
equations splits ≈ 0.5/0.5, while w_i = 1 means row i lies entirely in
the null space. The numerical-rank threshold is the standard LAPACK/NumPy
τ = σ_max · max(m, n) · ε; the implicated rows are resolved to model
names through the same .row plumbing as print residuals / print equation.
The inline algebra is resolved by model name, so it appears for named
rows. The rank report’s row index is the split equality position, not the
original .nl row the equation source keys on, so an unnamed row can’t be
mapped — there print rank falls back to a print equation <name> hint
instead of guessing. When J_c has full row rank, that is reported as a
positive signal (J_c has full row rank at this iterate.) with the
σ_min/cond witnessing how far it is from degenerate — silence would be
ambiguous. The command is available whenever the iterate has an equality
block; a problem with no equality constraints returns a short explanatory
error. The JSON payload is {iter, n_rows, n_cols, rank, deficiency, rank_deficient, sigma_max, sigma_min, cond, tol, singular_values, culprits: [{row, kind, index, name, label, weight, equation}]} (equation is the
rendered source or null when unresolved; cond is null when σ_min = 0, since JSON has no infinity).
diagnose — a live, named health report
info, print residuals, and print kkt each expose one facet of the
current iterate. diagnose (alias diag) runs a panel of heuristics over
all of them at once and returns a ranked list of findings — and, crucially,
names the culprit equation or variable behind each numerical symptom.
That last step is the actionable-diagnostics path of Lee et al.
(2024):1 a report that says “mass_balance is the worst
constraint residual” is worth far more than “row 13 is infeasible.”
pounce-dbg> diagnose
[ error] primal_infeasible: Primal infeasibility 1.70e+02; worst constraint
residual is c[mass_balance] = +1.701e+02. Inspect this equation's
feasibility and scaling (`print equation mass_balance`).
[warning] dual_infeasible: Dual infeasibility 9.84e-01; largest stationarity
residual is grad_x_L[T_reactor] = -9.838e-01.
[warning] inertia_wrong: KKT inertia is wrong (n-=2 vs expected 1): the system
was indefinite/singular and the step had to be stabilized.
[ info] bounds_pinned: 3 variable bound(s) are active (slack < 1e-6).
This is the live counterpart to the pounce-studio diagnose tool,
which runs temporal heuristics over a finished solve report. The two
share a {severity, code, message} shape so
a client can treat them uniformly, but the live command sees what a saved
report cannot: the current KKT inertia and regularization, and the
named primal/dual residuals at this exact point. Findings are sorted
error → warning → info; a clean iterate yields a single healthy
finding. The checks:
| code | severity | fires when |
|---|---|---|
primal_infeasible | error/warning | inf_pr above tol → names the worst constraint residual |
dual_infeasible | warning | inf_du above tol → names the worst stationarity residual |
inertia_wrong | warning | KKT inertia ≠ expected (rank-deficient Jacobian / indefinite Hessian) |
heavy_regularization | info | primal δ_w applied (Hessian indefinite) |
dual_regularization | warning | dual δ_c applied (linearly dependent / redundant equalities) |
structural_singularity | warning | a subset of equalities is over-determined → names the dependent equations |
rank_deficient_jacobian | warning | SVD of J_c is numerically rank-deficient → names the equations in the near-null space (catches value-only dependencies too) |
large_multipliers | warning | a multiplier exceeds 1e8 (constraint-qualification / scaling) |
bounds_pinned | info | variables pressed against their bounds |
tiny_step | warning | accepted α_pr collapsed |
heavy_line_search | warning | ≥10 backtracking trials for the accepted step |
in_restoration | warning | currently inside feasibility restoration |
mu_stalled | warning | μ flat for ≥3 consecutive iterations |
KKT-derived findings (inertia_wrong, *_regularization) need a computed
search direction, so they appear at/after after_search_dir. Names follow
the same rule as print residuals: present on the .nl path with
.col/.row files, index labels (c[13]) under presolve. The JSON payload
is {iter, findings: [{severity, code, message}], n_findings}.
Structural rank: naming the dependent equations
inertia_wrong and dual_regularization detect a rank-deficient
Jacobian, but only as a scalar — they tell you a redundancy exists, not
which equations are redundant. structural_singularity closes that gap
with a Dulmage–Mendelsohn decomposition of the equality Jacobian’s
sparsity pattern (the same structural check at the heart of IDAES’s
DiagnosticsToolbox). A maximum bipartite matching between equality rows
and variables partitions the system; any over-determined block — more
equations than the variables they jointly touch — forces at least one of
those equations to be redundant or mutually inconsistent (LICQ fails). The
finding lists those equations by model name, e.g.:
pounce-dbg> diagnose
[warning] structural_singularity: Constraint Jacobian is structurally singular
(Dulmage–Mendelsohn): 2 equation(s) over-determine the 1 variable(s)
they jointly touch (flow_rate), so ≥1 of them must be redundant or
mutually inconsistent (LICQ fails on this block). Candidate dependent
equations: mass_balance, mass_balance_dup. Inspect them with
`print equation <name>`; this names the rows behind any δ_c
dual-regularization / wrong-inertia signal.
This is the named-culprit payoff of Lee et al. (2024):1
reporting “mass_balance and mass_balance_dup are linearly dependent”
rather than “the Jacobian is singular.” The check is iterate-independent
(it reads only the sparsity pattern), so unlike the KKT-derived findings it
fires from iteration 0 — it can flag a structurally broken model before the
solver ever stalls on it. It is suppressed for well-posed problems: an NLP
with more variables than equality constraints is the normal case (the spare
degrees of freedom are pinned by the objective, bounds, and inequalities),
so only the over-determined side is reported, never the under-determined
one. Available on the .nl path; names fall back to c[i]/x[i] when no
.col/.row auxiliary files were emitted.
Numerical rank: the value-dependency the structure can’t see
structural_singularity reads only the sparsity pattern, so it is blind
to a redundancy that lives in the values — three equations whose every
entry is nonzero (a structurally full-rank pattern) but whose rows satisfy
row₂ = row₀ + row₁ numerically. rank_deficient_jacobian is the
numerical complement: it runs the same SVD as print rank over J_c at
the current iterate and, when the numerical rank falls short, names the
equations in the near-null space:
pounce-dbg> diagnose
[warning] rank_deficient_jacobian: Equality Jacobian J_c is numerically
rank-deficient at this iterate: rank 2/3 (deficiency 1),
σ_min=0.00e0, cond=inf (σ_min = 0). Linearly dependent or redundant
equality constraints — the root cause behind δ_c regularization /
wrong inertia. Implicated equations: c[mass_balance],
c[mass_balance_dup].
Unlike the structural check, this one is iterate-dependent — it factors
J_c at the current x, so it reflects the matrix the solver is actually
regularizing and catches dependencies that only appear at certain points.
The two checks are deliberately layered: structural_singularity fires
from iteration 0 on the pattern alone; rank_deficient_jacobian confirms
it numerically and, more importantly, surfaces the value-only dependencies
the structural pass provably cannot. See print rank for the SVD
math and the per-equation participation weights.
Mutating state
Mutations feed straight back into the solve.
set mu 0.5 # overwrite the barrier parameter
set x[2] 1.5 # overwrite one component of a block
set x 1.0,2.0,3.0 # overwrite a whole block (comma-separated)
Setting any block works (set z_l[0] 1e-3, …). Iterate edits rebuild the
iterate with a fresh change-tag, so the cached derived quantities
(curr_f, slacks, σ, …) invalidate correctly and the next step is
computed from the new point — exactly as if the line search had produced
it.
Staging a solver option (validated against the registry):
set opt mu_strategy adaptive
set opt linear_solver ma57
Staged options are not applied to the strategies already built for
the running solve (they don’t re-read options mid-iteration). They take
effect on a resolve or the next solve.
Discovering options
opt # list every registered option
opt mu # filter by name/category substring
complete pri # completion candidates for a prefix
opt <exact-name> also prints the long description. In the REPL, Tab
completes command verbs, block names, metric names (after break if),
checkpoint names (after stop-at), event names (after break on),
option names (after set opt / opt), and filesystem paths (after
load / sweep / save / source — directories get a trailing /).
The same contexts are available programmatically via the complete <prefix…> command (JSON complete), so an agent or GUI can offer the
same completions.
Time travel
Rewind (goto / restart)
The debugger snapshots the primal-dual state (x, s, multipliers, μ,
τ) every iteration. goto rewinds to a captured iteration and stays
paused so you can re-tune before resuming:
goto 3 # rewind to the start of iteration 3
restart # rewind to the earliest snapshot
Caveat — this is a soft rewind. Only the primal-dual state is restored; strategy history (the filter, the adaptive-μ oracle, the quasi-Newton memory) is not rolled back. So continuing from a rewound point is “resume from here,” not a bit-exact replay of the original run.
Re-solve from a saved point
resolve re-runs the solve from the current x with any
set opt edits applied — a primal warm start with new options. Use it
for “what if I change mu_strategy from here?”:
pounce-dbg> set opt mu_strategy adaptive
pounce-dbg> resolve
re-solving from current x with 1 staged option override(s)…
── pounce-dbg ── iter 0 @iter_start ... # fresh solve, seeded from the captured x
Because each solve rebuilds its strategies from the options, the changes do take effect on the re-solve. The seed is dropped (falling back to the problem’s own start) if presolve / fixed-variable elimination changed the coordinate count.
Saving and visualizing artifacts
save # write the current iterate + residuals to a temp JSON
save /tmp/iter3.json # explicit path
viz x # write a block and open it in an external viewer
viz dx # a search-direction block
viz kkt # the KKT inertia/regularization report
save writes every non-empty block, the search-direction blocks, and the
residual scalars (iter, mu, objective, inf_pr, inf_du,
nlp_error) — a self-contained artifact for external analysis.
load — the inverse of save
Typing a start point by hand is fine for a 2-variable toy and miserable for
anything real. load reads a block straight into the live iterate, so you
generate the point once (a prior solve, a surrogate, a sampler) and pull it
in:
load /tmp/it0.json # a `save` artifact: every block it contains is loaded
load start.csv # a plain numeric file → x (comma/space/newline sep)
load start.csv s # … into a named block instead of x
Two input shapes are accepted:
- A
saveartifact (JSON). Blocks are read from the top level or from aniterateobject; every block present (x,s, multipliers, …) is written, each validated against the current dimension. Sosave→loadround-trips a full point, and you can lift just the part that fits if dimensions changed. - A plain numeric file — values separated by commas, whitespace, or
newlines — written into the named block (default
x). This is the many-variable escape hatch:numpy.savetxt("start.csv", x0)thenload start.csv.
A loaded x becomes the seed for the next step (or for resolve — a
warm start from an externally-computed point with no typing).
Interactive figures (pounce-dbg-viz)
viz writes a JSON artifact and hands it to a viewer. The Python package
ships an interactive Plotly viewer that renders these properly —
a spy/heatmap for viz kkt (the augmented matrix, colored by value, with
the inertia/regularization in the title) and viz L (the LDLᵀ factor),
and a bar chart for vector blocks (viz x, viz dx):
pip install 'pounce-solver[viz]' # installs the `pounce-dbg-viz` script
When pounce-dbg-viz is on PATH, viz uses it automatically (opening
an interactive figure in your browser). The launch order is:
$POUNCE_DBG_VIEWER— a command template ({}← the artifact path), if set;pounce-dbg-viz— the bundled Plotly viewer, if installed;- the OS opener (
xdg-open/open) on the raw JSON.
So export POUNCE_DBG_VIEWER='python my_plot.py {}' overrides with your
own plotter, and with nothing set + the viz extra installed it just
works. The same pounce-dbg-viz <file.json> also renders a save
artifact (the full iterate).
Multi-start and initialization sensitivity
Interior-point methods find a local solution, and which one depends on
where you start. Two commands turn the debugger into an
initialization-sensitivity probe: they run many full solves — each from a
different start — and tabulate where each one ends up. Both build on the
same re-solve machinery as resolve (so
they need the restart cell the CLI wires by default; they error in
contexts without it), and both leave you at a normal prompt on the final
solve afterward.
sweep <file> — explicit starts
Run one solve per start point listed in a file (one start per line,
comma/whitespace-separated; #/// comments and blank lines skipped):
pounce-dbg> sweep starts.txt
sweep 1/4: Success iters=21 obj=3.743990e-21 inf_pr=0.00e0
sweep 2/4: Success iters=15 obj=1.233088e-28 inf_pr=0.00e0
sweep 3/4: Success iters=14 obj=1.328861e-28 inf_pr=0.00e0
sweep 4/4: Success iters=29 obj=2.982346e-18 inf_pr=0.00e0
── sweep complete ── 4 solves, 4 succeeded, 1 distinct minima
# status iters objective inf_pr
0 Success 21 3.743990e-21 0.00e0
1 Success 15 1.233088e-28 0.00e0
2 Success 14 1.328861e-28 0.00e0
3 Success 29 2.982346e-18 0.00e0
best: solve #2 obj=1.32886077e-28
Each start must have the same length as x (mismatches are reported with
the line number). The summary clusters successful objectives to a relative
1e-6 to count distinct minima and flags the best (lowest-objective)
solve. This is the “is this solve fragile to its start, and to which basins
does it fall?” diagnostic — and unlike a black-box global search it leaves
every solve’s trajectory observable: set a break on resto_entered or a
stop-at kkt first and the sweep will pause inside whichever solve trips
it.
multistart <N> [rel] — sampled restarts
When you don’t have a file of starts, multistart generates N of them:
pounce-dbg> multistart 8 # 8 starts
pounce-dbg> multistart 8 0.3 # wider jitter on any unbounded vars
Each variable that has a finite box [x_Lᵢ, x_Uᵢ] is sampled
uniformly inside it — a genuine box multistart. Variables that are
unbounded on either side fall back to a relative jitter ±rel·(|xᵢ|+1)
around the current point (rel default 0.1, with a floor so components at
zero still move). The command reports the split, e.g.
multistart 8 (box 5/7 vars; 2 unbounded → jitter rel=0.1).
Start 0 is always the unperturbed current x (so the run includes where
you already are), and the sampler is a fixed-seed PRNG, so a multistart
run reproduces exactly.
The bounds are the ones the algorithm sees — full-length, post-scaling,
after any bound_relax_factor — so every sampled start is a valid seed.
For a problem with no finite bounds (a pure unconstrained NLP) multistart
degrades to jitter around x; sweep an external sample if you want a
specific spread there.
Driving a sweep from a file with load
The pieces compose. To seed a sweep from points computed elsewhere, write
them with numpy.savetxt and sweep the file directly — or, for a single
externally-computed warm start, load it and resolve:
import numpy as np
np.savetxt("starts.txt", sampler(n=32), delimiter=",") # 32 starts, one per row
pounce-dbg> sweep starts.txt
sweep vs. find_minima
sweep/multistart are diagnostics: they show you how a handful of
starts behave, with full visibility into each solve’s path. For an
automated global search — Sobol sampling, deduplication, minimum
certification (PSD Hessian), redundant-descent avoidance — reach for the
Python pounce.find_minima, whose multistart and
mlsl methods are the production tools. Rule of thumb: debugger sweep
when you’re asking why a solve is start-sensitive; find_minima when
you want the minima themselves.
Ask Claude about the state
ask [question] packages the current paused state — checkpoint,
residuals, step lengths, dimensions, and the KKT inertia/regularization —
into a prompt and runs it through Claude Code (claude -p, headless
print mode), printing the reply inline. It’s AI-assisted debugging
without leaving the loop:
pounce-dbg> stop-at kkt
pounce-dbg> continue
pounce-dbg> ask why is the dual infeasibility stalling?
# → Claude's analysis of the state + suggested options to try
With no question it defaults to “explain the current state and suggest
what to try next.” The command is configurable via $POUNCE_DBG_LLM
(default claude -p); the prompt is fed on the tool’s stdin, or
substituted into a {} placeholder if the template has one:
export POUNCE_DBG_LLM='claude -p' # default
export POUNCE_DBG_LLM='llm -m claude-opus' # any prompt-on-stdin CLI
export POUNCE_DBG_LLM='mytool --ask {}' # prompt as an argument
In JSON mode the reply comes back in the result event’s data.reply.
Attaching to a run
You don’t have to single-step from iteration 0.
- Drop in on failure —
--debug-on-errorruns the solve freely and pauses at theterminatedcheckpoint only if the solve did not succeed, leaving you at the failing iterate for a post-mortem. (Plain--debugalso pauses atterminatedfor a final-point inspect.) - Attach with Ctrl-C —
--debug-on-interruptruns normally but installs a SIGINT handler; a first Ctrl-C drops you in at the next iteration (reason: "interrupt (Ctrl-C)"), a second Ctrl-C aborts. Ctrl-C also breaks into any other debug mode mid-continue.
Ctrl-C at the prompt. At a rustyline prompt Ctrl-C arrives as input,
not a signal, so it has its own analogous double-tap: the first Ctrl-C
cancels the current input line (readline convention), a second in a row
stops the solve (a clean UserRequestedStop, same as quit). So
whether you are running or sitting at the prompt, two Ctrl-Cs always get you
out; quit/q and Ctrl-D (EOF, which detaches and finishes) remain the
explicit exits.
Scripting
Run a sequence of debugger commands from a file — one per line, # and
// comments and blank lines skipped:
# warmup.pdbg
break if inf_pr<1e-6
watch mu
stop-at after_search_dir
continue
pounce problem.nl --debug-script warmup.pdbg # run at the first pause
pounce-dbg> source warmup.pdbg # or interactively
A script runs top-to-bottom and stops early if a command resumes or
stops the solve (so ending with continue hands control back at the
first breakpoint). --debug-script implies --debug when no --debug*
mode is given, and runs once at the first pause (not on a resolve).
Example: a scripted initialization-sensitivity run
Because load, sweep, and set opt are ordinary commands, a whole
diagnostic fits in a script file. This one watches each solve’s path and
sweeps a set of externally-generated starts:
# sensitivity.pdbg — generate starts.txt first (e.g. numpy.savetxt)
break on resto_entered # surface any start that falls into restoration
sweep starts.txt # one solve per row; tabulated at the end
pounce model.nl --debug-script sensitivity.pdbg
Or compare a baseline against a what-if on the same starts by staging an option before the sweep:
# adaptive-vs-monotone.pdbg
set opt mu_strategy adaptive
multistart 16 0.2 # 16 sampled restarts, all under adaptive μ
Example: drive a multistart from a program (JSON protocol)
For many variables and many starts, hold the x0s as arrays in a driver
program and let it assemble the commands — no point is ever typed. The
--debug-json protocol emits a sweep_result per solve and a final
sweep_summary:
import subprocess, json, numpy as np
p = subprocess.Popen(["pounce", "big.nl", "--debug-json"],
stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
send = lambda c, **k: (p.stdin.write(json.dumps({"cmd": c, **k}) + "\n"), p.stdin.flush())
recv = lambda: json.loads(p.stdout.readline())
recv() # hello
recv() # initial pause
# Option A — let the debugger sample: N restarts (uniform in finite boxes).
send("multistart", args=["32", "0.25"])
# Option B — supply your own starts via a file and sweep it:
# np.savetxt("starts.txt", my_sampler(n=32), delimiter=",")
# send("sweep", args=["starts.txt"])
results = []
for line in p.stdout:
ev = json.loads(line)
if ev.get("event") == "sweep_result":
results.append((ev["status"], ev["objective"]))
elif ev.get("event") == "sweep_summary":
print(f"{ev['succeeded']}/{ev['solves']} ok, "
f"{ev['distinct_minima']} distinct minima, "
f"best obj {ev['best_objective']:.6e}")
break
Each sweep_result carries index, status, iters, objective,
inf_pr, and the seed it started from; the sweep_summary adds
distinct_minima, best_index, and best_objective. A client can
feature-detect support via hello.capabilities.sweep.
Exit model
| Path | Result |
|---|---|
quit | stops now → UserRequestedStop |
| Ctrl-C ×2 at the prompt | cancel line, then stop → UserRequestedStop |
Ctrl-C ×2 mid-continue | break in, then abort (exit 130) |
continue / detach | run to natural completion |
| stdin EOF, REPL (Ctrl-D) | detach and finish (pdb convention) |
| stdin EOF, JSON (pipe closed) | abort — the controlling client is gone |
| external SIGKILL | process dies (no terminated event) |
Every non-kill path ends with a terminated event in JSON mode.
Command reference
| Command (aliases) | Summary |
|---|---|
help (h, ?) | list commands |
info (i) | current-iterate summary |
print <what> (p) | block, d-block, scalar, kkt, or residuals |
print equation <name|row> | source algebra of a constraint, by model name or .nl row |
step (s, n) | run to next iter_start |
step sub / stepi (si) | run to next checkpoint of any kind |
continue (c) | run to next breakpoint |
run N (r) | run until iteration N |
break … (b) | iteration / if / on breakpoints; list; clear; del N |
stop-at <cp> | always pause at a checkpoint |
set mu/x/<block>/opt … | mutate μ, the iterate, or stage an option |
opt [filter] | list/search registered options |
complete <prefix> | completion candidates |
viz <target> | open an artifact in a viewer |
save [path] | dump the iterate to JSON |
load <file> [block] | read a block (default x) from a save artifact / numeric file |
sweep <file> | one solve per start in <file>; tabulate outcomes |
multistart <N> [rel] | N restarts (uniform in each finite box; jitter elsewhere); tabulate |
watch <target> (display) | auto-print a target at every pause |
tbreak N (tb) | one-shot iteration breakpoint |
watchpoint <blk>[<i>] [τ] (wp) | pause when a value changes by > τ |
diff | what changed in the iterate since the last iteration |
diagnose (diag) | live health report: named culprit residuals, KKT inertia, stalls |
source <file> | run debugger commands from a file |
goto N / restart | soft-rewind to a captured iteration |
resolve | re-solve from current x with staged options |
ask [question] | ask Claude Code (claude -p / $POUNCE_DBG_LLM) about the state |
progress [on/off] | toggle JSON progress events |
detach | stop pausing; run to completion |
quit (q, exit) | stop the solve |
The JSON protocol
--debug-json makes stdout a pure stream of newline-delimited JSON
objects (the banner, problem stats, and final summary are routed to
stderr, and print_level is forced to 0). A program reads one JSON
object per line.
Session lifecycle
hello— emitted once, up front. The handshake.pause— at each stop.result— one per command, echoing the client’srequest_id.progress— one per iteration while running between pauses.sweep_result/sweep_summary— during asweep/multistart: onesweep_resultper completed solve, then asweep_summaryat the end.terminated— once, after the solve.
Commands
Write one per line to stdin, either a bare string or an object:
{"cmd": "print", "args": ["x"], "id": 7}
{"cmd": "break if inf_pr<1e-6", "id": 8}
"continue"
id (any JSON value) is echoed back as request_id on the matching
result, for async correlation.
hello
{"event":"hello","protocol":"pounce-dbg/1","pounce_version":"0.2.0",
"capabilities":{"inspect":true,"mutate_iterate":true,"mutate_mu":true,
"conditional_breakpoints":"compound","request_ids":true,
"viz":["block","delta"],"save":true,"load":true,"sweep":true,
"kkt_inspect":true,
"rewind":"primal_dual","resolve":true,"terminal_checkpoint":true,
"interruptible":true,"progress_events":true,"async_pause":"checkpoint"},
"checkpoints":["iter_start","after_mu","after_search_dir","after_step",
"pre_restoration_entry","post_restoration_exit","terminated"],
"events":["resto_entered","resto_exited","regularized","tiny_step",
"ls_rejected","nan"],
"commands":[…],"blocks":[…],"metrics":[…]}
A client should feature-detect off capabilities / checkpoints /
events rather than the protocol string — those lists are additive as
the debugger grows.
pause
{"event":"pause","checkpoint":"iter_start","status":null,
"iter":3,"mu":2.0e-2,"objective":5.05,"inf_pr":0.0,"inf_du":2.7e-14,
"nlp_error":0.0237,"dims":{"x":2,"s":0,"y_c":0,"y_d":0,"z_l":2,"z_u":2,
"v_l":0,"v_u":0},"breakpoints":[],"conditions":[],"reason":"mu<0.05"}
status is non-null only at the terminated checkpoint. reason
carries the firing breakpoint / condition / event / interrupt.
result
{"event":"result","request_id":7,"command":"print x","ok":true,
"output":["x = [-1.18e0, 1.38e0]"],"data":{"name":"x","values":[-1.18,1.38]}}
output is human-readable lines; data is the structured payload
(present for inspection commands).
progress
{"event":"progress","iter":42,"mu":1.0e-5,"inf_pr":3.2e-7,"inf_du":1.1e-6,"obj":12.34}
Emitted once per outer iteration during a continue, so a UI can show
live progress instead of a hang. Default on; toggle with the progress
command.
terminated
{"event":"terminated","status":"SolveSucceeded",
"status_message":"Optimal Solution Found.","iterations":6,
"objective":4.9999999,"evals":{"obj":7,"obj_grad":7,"constr":1,
"constr_jac":12,"hess":6}}
Async pause
A running continue can be interrupted two ways, both pausing at the
next checkpoint with a reason:
- SIGINT —
process.kill(pid, "SIGINT")(or Ctrl-C). This is what a Debug Adapter’s pause button maps to. Reason:"interrupt (Ctrl-C)". - In-band command — send
{"cmd":"pause"}on stdin while the solve is running (JSON mode). No signals, so it works on Windows. Reason:"pause (requested)".
hello.capabilities.async_pause is "checkpoint", and
pause_command is true.
Tutorials
1. Why did this problem go to restoration?
$ pounce hard.nl --debug-json
{"cmd":"break on resto_entered"}
{"cmd":"continue"}
# → pause at checkpoint "pre_restoration_entry", reason "event: resto_entered"
{"cmd":"info"} # how infeasible is the iterate?
{"cmd":"print kkt"} # was the KKT singular / heavily regularized?
{"cmd":"print x"}
2. Catch a step that gets regularized
break on regularized
continue
# → pause at after_search_dir when delta_w > 0
print kkt # inertia n- vs expected; delta_w / delta_c
print dx # the (stabilized) Newton step
3. What-if: try a different μ strategy from here
break 5
continue # stop at iteration 5
set opt mu_strategy adaptive
resolve # re-solve from the iter-5 point with adaptive μ
4. Post-mortem on a failure
pounce maybe-infeasible.nl --debug-on-error
Runs unattended; if the solve returns anything but success you land at the final iterate:
── pounce-dbg ── TERMINATED (LocalInfeasibility) iter 11 obj=1.13e0 inf_pr=5.0e-1 inf_du=1.2e-8
pounce-dbg> print x
pounce-dbg> print kkt
5. Drive it from a program / agent
import subprocess, json
p = subprocess.Popen(["pounce", "hs071.nl", "--debug-json"],
stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
def send(cmd, **kw): p.stdin.write(json.dumps({"cmd": cmd, **kw}) + "\n"); p.stdin.flush()
def recv(): return json.loads(p.stdout.readline())
hello = recv() # capabilities / vocabulary
print(recv()) # initial pause
send("break if inf_du<1e-6", id=1)
print(recv()) # result, request_id=1
send("continue")
for line in p.stdout: # progress … pause … terminated
ev = json.loads(line)
if ev["event"] == "terminated": break
6. Is this solve sensitive to its start?
break on resto_entered # flag any start that falls into restoration
multistart 16 # 16 restarts (uniform in each finite box)
# → per-solve lines, then a table: succeeded / distinct minima / best
Swap multistart 16 for sweep starts.txt to run your own start
points (numpy.savetxt("starts.txt", X0, delimiter=",")). See
Multi-start and initialization sensitivity.
Limitations
- Soft rewind only.
goto/restartrestore the primal-dual state, not strategy history (see the caveat above). set optis staged, not hot-applied to a running solve; it takes effect onresolve/ the next solve.
-
A. Lee, R. B. Parker, S. Poon, D. Gunter, A. W. Dowling, and B. Nicholson, “Model Diagnostics for Equation-Oriented Models: Roadblocks and the Path Forward,” Systems and Control Transactions 3:966–974 (2024). https://doi.org/10.69997/sct.147875 ↩ ↩2 ↩3 ↩4
Pyomo
Because POUNCE speaks the AMPL NL/SOL protocol, it drops into Pyomo through the AMPL Solver Library interface — exactly how Pyomo drives Ipopt.
The pyomo-pounce
package registers pounce as a Pyomo SolverFactory solver:
import pyomo_pounce # registers 'pounce'
from pyomo.environ import ConcreteModel, Var, Objective, SolverFactory
model = ConcreteModel()
model.x = Var(bounds=(-10, 10))
model.obj = Objective(expr=(model.x - 3) ** 2)
solver = SolverFactory('pounce')
solver.solve(model)
Options pass through the usual Pyomo mechanism:
solver.solve(model, options={'tol': 1e-10, 'max_iter': 500})
Under the hood, Pyomo writes the model to an AMPL .nl file, invokes
pounce problem.nl -AMPL, and reads the result back from the .sol
file. See Running Solves for the -AMPL solver mode.
Python API
POUNCE ships a Python wrapper that is intentionally cyipopt-compatible: code written for cyipopt typically runs against POUNCE by changing only the import.
Install
cd python
pip install maturin
maturin develop --release # builds the native extension into your venv
Optional extras:
pip install -e .[jax] # JAX integration
pip install -e .[dev] # tests + jax + scipy
cyipopt-style interface
import numpy as np
import pounce
class HS071:
def objective(self, x):
return x[0]*x[3]*(x[0]+x[1]+x[2]) + x[2]
def gradient(self, x):
return np.array([
x[0]*x[3] + x[3]*(x[0]+x[1]+x[2]),
x[0]*x[3],
x[0]*x[3] + 1.0,
x[0]*(x[0]+x[1]+x[2]),
])
def constraints(self, x):
return np.array([np.prod(x), np.dot(x, x)])
def jacobianstructure(self):
return (np.repeat([0, 1], 4), np.tile([0, 1, 2, 3], 2))
def jacobian(self, x):
return np.array([
x[1]*x[2]*x[3], x[0]*x[2]*x[3], x[0]*x[1]*x[3], x[0]*x[1]*x[2],
2*x[0], 2*x[1], 2*x[2], 2*x[3],
])
prob = pounce.Problem(
n=4, m=2,
problem_obj=HS071(),
lb=[1]*4, ub=[5]*4,
cl=[25, 40], cu=[2e19, 40],
)
prob.add_option('tol', 1e-8)
x, info = prob.solve(x0=np.array([1.0, 5.0, 5.0, 1.0]))
print(info['status_msg'], info['obj_val'], x)
scipy.optimize-style
import numpy as np
from pounce import minimize
res = minimize(lambda x: (x - 1) @ (x - 1) + 1, x0=np.zeros(5))
print(res.fun, res.x)
Finding multiple minima
pounce.find_minima is the global-search companion to minimize: it drives
the same solver in a loop to discover many distinct minima (flooding,
deflation, tunneling, multistart, MLSL, basin-hopping). See
Finding Multiple Minima for the methods and references,
Choosing a Method for selection guidance
(including high-dimensional behavior), and notebooks
15,
16,
17
for the three families.
from pounce import find_minima
r = find_minima(fun, x0, method="deflation", jac=jac, hess=hess,
bounds=bounds, n_minima=6)
print(r.status, len(r), "minima; best f =", r.fun)
JAX integration
The pounce.jax subpackage provides five entry points:
| Surface | Use it for |
|---|---|
from_jax(f, g, …) | Build a one-shot pounce.Problem from JAX-traced f(x) and g(x). |
solve(p, …) | custom_vjp-wrapped differentiable solve over a parameter p. |
solve_with_warm(p, …, warm_start=) | solve + dual-triple (x, λ, z) warm-start hand-off across calls. |
vmap_solve(p_batch, …) / vmap_solve_parallel(…) | Batched solve over a leading axis of p; the _parallel variant uses a ThreadPoolExecutor and releases the GIL inside each solve. |
JaxProblem(f, g, n, m, p_example=, …) | Build-once / solve-many handle that caches JIT artefacts, the sparsity probe, and the underlying pounce.Problem across calls. |
One-shot build with from_jax
import jax.numpy as jnp
from pounce.jax import from_jax
def f(x): return jnp.sum((x - 1) ** 2)
def g(x): return jnp.stack([jnp.sum(x) - 5.0])
prob = from_jax(f, g, n=4, m=1, lb=jnp.zeros(4), ub=jnp.full(4, 10.0),
cl=jnp.zeros(1), cu=jnp.zeros(1))
x, info = prob.solve(x0=jnp.ones(4))
Sparse Jacobian/Hessian compression (sparse=)
By default the constraint Jacobian and the Lagrangian Hessian are
computed densely — jax.jacrev/jacfwd/hessian build the full
matrix, which is then sliced to the detected sparsity pattern. The
reported structure is sparse, but the AD work and memory are O(m·n)
(Jacobian) and O(n²) (Hessian) regardless of how sparse the true
matrices are. On a 10,000-variable banded system that means computing
~10⁸ entries per iteration to keep ~50,000.
Passing sparse=True switches both derivatives to CPR-style colored
AD (pounce#83): structurally-orthogonal columns are colored, one
JVP (Jacobian) / HVP (Hessian) is taken per color — k ≪ n colors —
and the compressed result is scattered back to the known nonzeros. The
per-iteration cost drops from O(n) to O(k) AD passes. This is the
same compression strategy the Rust .nl tape path already uses for its
Hessian.
prob = from_jax(f, g, n=4, m=1, lb=jnp.zeros(4), ub=jnp.full(4, 10.0),
cl=jnp.zeros(1), cu=jnp.zeros(1),
sparse=True) # colored JVP/HVP instead of dense slice
The flag is also accepted by JaxProblem,
where it applies to both the single-solve and the batched
block-diagonal paths. The reported structure, the values, and the
solution are identical to the dense path either way — only the cost of
producing the derivative values changes. The differentiable backward
(factor_reuse / implicit diff) is unaffected.
When to use it. sparse=True wins on problems whose Jacobian/Hessian
are genuinely sparse with bounded per-row fill (banded, block, finite
differences/elements, PDE-constrained, separable). On a dense problem
the coloring finds no orthogonality (k = n) and the flag is a small,
bounded overhead, so it is opt-in rather than the default. Measured
on a banded family (python/benchmarks/bench_sparse_ad_83.py):
| n | colors (Jac / Hess) | per-eval Jacobian | per-eval Hessian | full solve |
|---|---|---|---|---|
| 800 | 2 / 3 | 6.2× faster | 2.0× faster | 1.3× faster |
| 2000 | 2 / 3 | 18.4× faster | 5.4× faster | 7.6× faster |
| 5000 | 2 / 3 | 560× faster | 200× faster | — |
The color count stays constant in n while the dense path grows
linearly, so the gap widens without bound as the problem scales.
Pattern detection. Sparsity is found by probing the dense derivative
at random points and recording where entries are nonzero. Under
sparse=True a mis-probe is costlier — it corrupts the compression
seed, not just a reported nonzero — so detection unions 3 probes by
default (vs 1 for the dense path). Override with n_probes=. Truly
value-dependent structure (branchy where/abs) should still be
hand-rolled via the Problem API.
Differentiable solve
pounce.jax.solve(p, f=, g=, …) is a custom_vjp-wrapped solve that
differentiates x*(p) through the implicit function theorem on the
converged KKT system. Inequality rows that are not active at x*
are dropped from the KKT block before the implicit-diff back-solve, so
the gradient matches the analytic active-set sensitivity even on
slack-inequality problems (pounce#73).
import jax, jax.numpy as jnp
from pounce.jax import solve as psolve
def f(x, p): return jnp.sum((x - p) ** 2)
def g(x, p): return jnp.stack([x[0] + x[1] - 1.0]) # equality
def x_star(p):
return psolve(
p, f=f, g=g, x0=jnp.zeros(2), n=2, m=1,
lb=jnp.full(2, -10.0), ub=jnp.full(2, 10.0),
cl=jnp.zeros(1), cu=jnp.zeros(1),
options={"tol": 1e-10, "print_level": 0},
)
# Gradient of the L2 distance to the target as p moves:
loss = lambda p: jnp.sum(x_star(p) ** 2)
print(jax.grad(loss)(jnp.array([0.3, 0.7])))
Warm-start across a parameter trajectory
solve_with_warm returns the full primal-dual triple alongside x*,
and consumes one on the next call. The warm-state is opaque from the
JAX side (pytree of jnp arrays) but maps directly onto the
x0 / λ0 / z0 ports of the underlying solver — for a sequence of
nearby p values this often cuts solver iterations by an order of
magnitude (pounce#74).
from pounce.jax import solve_with_warm
trajectory = [jnp.array([0.3 + 0.01 * k, 0.7 - 0.01 * k]) for k in range(50)]
x, warm = solve_with_warm(
trajectory[0], f=f, g=g, x0=jnp.zeros(2), n=2, m=1,
lb=jnp.full(2, -10.0), ub=jnp.full(2, 10.0),
cl=jnp.zeros(1), cu=jnp.zeros(1),
warm_start=None, # first call → cold start
options={"tol": 1e-10, "print_level": 0},
)
xs = [x]
for p_k in trajectory[1:]:
x, warm = solve_with_warm(
p_k, f=f, g=g, x0=x, n=2, m=1,
lb=jnp.full(2, -10.0), ub=jnp.full(2, 10.0),
cl=jnp.zeros(1), cu=jnp.zeros(1),
warm_start=warm, # reuse λ, z
options={"tol": 1e-10, "print_level": 0},
)
xs.append(x)
Batched solve (vmap_solve / vmap_solve_parallel)
vmap_solve runs one solve per row of p_batch sequentially.
vmap_solve_parallel is the same surface but dispatches each row to a
ThreadPoolExecutor; the underlying Rust solve releases the GIL via
py.allow_threads, so workers actually run in parallel on multi-core
CPUs (pounce#74).
import numpy as np
from pounce.jax import vmap_solve_parallel
rng = np.random.default_rng(0)
batch = jnp.asarray(rng.standard_normal((32, 2)))
X = vmap_solve_parallel(
batch, f=f, g=g, x0=jnp.zeros(2), n=2, m=1,
lb=jnp.full(2, -10.0), ub=jnp.full(2, 10.0),
cl=jnp.zeros(1), cu=jnp.zeros(1),
workers=8, # ThreadPoolExecutor size
options={"tol": 1e-9, "print_level": 0},
)
assert X.shape == (32, 2)
Both batched surfaces are custom_vjp-wrapped, so a downstream
jax.grad/jax.jacobian over a batched loss works end-to-end.
Build once, solve many: JaxProblem
For iterative use — a parameter trajectory in a continuation loop, a
training step that calls the solver inside a batch, a notebook cell
that sweeps a knob — from_jax/solve rebuild the JIT artefacts, the
sparsity probe, and the underlying pounce.Problem on every call.
JaxProblem does that work once at construction and exposes the same
four method shapes against the cached state. On the
pounce#75 microbench shape (n=5, m=6, 20 sequential solves at
different p) this is roughly a 14× speedup, taking per-solve time
from ~96 ms down to ~7 ms (pounce#75).
from pounce.jax import JaxProblem
jp = JaxProblem(
f=f, g=g, n=2, m=1, p_example=jnp.zeros(2), # p_example fixes shape/dtype only
lb=jnp.full(2, -10.0), ub=jnp.full(2, 10.0),
cl=jnp.zeros(1), cu=jnp.zeros(1),
options={"tol": 1e-9, "print_level": 0},
# sparse=True, # colored AD on sparse problems (see above)
)
# Sequential, differentiable:
x = jp.solve(jnp.array([0.3, 0.7]), x0=jnp.zeros(2))
# Dual-warm-start trajectory (composes warm-state hand-off with reuse):
x, warm = jp.solve_with_warm(trajectory[0], x0=jnp.zeros(2), warm_start=None)
for p_k in trajectory[1:]:
x, warm = jp.solve_with_warm(p_k, x0=x, warm_start=warm)
# Batched parallel solve over a row-axis of p_batch:
X = jp.vmap_solve_parallel(batch, x0=jnp.zeros(2), workers=8)
Each worker thread in vmap_solve_parallel keeps its own cached
pounce.Problem via threading.local, so the per-thread build cost
is paid at most once per worker rather than once per batch row.
Factor-reuse backward (factor_reuse=)
JaxProblem.solve and solve_with_warm default to a k_aug-style
backward that reuses the IPM’s converged compound KKT factor
(pounce.Solver.kkt_solve) instead of assembling a dense
(n+m) × (n+m) block and running jnp.linalg.solve on it
(pounce#76). The held LDLᵀ factor turns the bwd back-solve from
O((n+m)³) into O(nnz(L)) and drops the explicit active-set masking
that the dense path does — the barrier rows on the bound multipliers
(z_l, z_u) already encode “active bounds force Δx_i = 0” exactly,
and the (v_l, v_u) rows do the same for slack inequalities. The
accuracy of the resulting gradient is O(μ) at the IPM barrier
parameter, which sits well below tol after convergence.
jp = JaxProblem(..., factor_reuse=True) # default; reuse the IPM factor
jp = JaxProblem(..., factor_reuse=False) # dense JAX backward
Pick factor_reuse=False when you want higher-order differentiation
(jax.grad(jax.grad(...)) through the solver) — the dense backward
stays JAX-traced and is itself differentiable, the factor-reuse one
crosses to the Rust host via pure_callback and is opaque to a
second-order trace.
When to pick which on batched_solve workloads (pounce#77)
factor_reuse=False is itself a form of factor reuse — it builds
the per-block (n+m) × (n+m) KKT at pounce’s converged
(x*, λ*, μ_l*, μ_u*) (saved in the custom_vjp residual) and
solves it under jax.vmap with a JIT-fused per-block
jnp.linalg.solve. So both modes reuse pounce’s converged solution;
they differ only in what they back-solve:
factor_reuse=True— back-solves pounce’s held LDLᵀ factor of the full stacked KKT (Rust-side, via FFI through a single-thread executor pin).factor_reuse=False— back-solves a freshly assembled per-block dense KKT in JAX, fused undervmap.
For batched_solve + jax.jacrev / jax.vmap minibatch projections
factor_reuse=False is faster at every scale we measured
(n = 3 through 48 per block, B = 64 stacked):
n=3 reuse bwd = 16.6 ms dense bwd = 20.6 ms reuse/dense = 0.80×
n=8 reuse bwd = 52.5 ms dense bwd = 38.5 ms reuse/dense = 1.36×
n=16 reuse bwd = 157.6 ms dense bwd = 57.2 ms reuse/dense = 2.76×
n=32 reuse bwd = 558.6 ms dense bwd = 103.6 ms reuse/dense = 5.39×
n=48 reuse bwd =1262.9 ms dense bwd = 137.4 ms reuse/dense = 9.19×
The dense path scales as B · (n+m)³; the factor-reuse path scales
as N · kkt_dim ≈ B² · n · (n+m) because jax.jacrev fans out
N = B·n cotangents and each triggers a back-solve of the full
stacked LDLᵀ even though only one block has nonzero signal.
Guidance:
- Single solve + many sensitivities —
jax.jacrev(jp.solve, argnums=0)(p, x0)and friends — keepfactor_reuse=True. One LDLᵀ back-solve per cotangent against the held factor beats JAX dense-solving a fresh(n+m) × (n+m)block. - Batched solve + jacrev / vmap —
jax.jacrev(lambda P: jp.batched_solve(P, x0))(pb)— setfactor_reuse=False. Treat the dense path as the default for minibatch projections.
Each fwd registers its converged factor in a bounded LRU on the
JaxProblem (default capacity 128). For very long-running training
loops with many distinct forward solves you can drop the cache
explicitly:
jp.clear_solver_cache()
Off-thread dispatch (training loops, jit(value_and_grad(...)))
pounce.Solver is a !Send PyO3 type (it holds an
Rc<RefCell<dyn TNLP>> interior), so any attempt to touch the held
factor from a thread other than the one that built it raises a PyO3
panic. JAX hits this whenever the bwd pure_callback lands on an XLA
worker thread — typical for jax.jit(jax.value_and_grad(...)) inside
a training step.
JaxProblem(factor_reuse=True) defends against this by routing every
pounce.Solver interaction (fwd register, warm-start solve, batched
solve, bwd kkt_solve) through a dedicated single-thread
ThreadPoolExecutor owned by the JaxProblem (pounce#77). All solver
touches are pinned to that one worker thread regardless of which
thread JAX dispatches from. vmap_solve_parallel bypasses the pin
(it doesn’t register with the factor cache), so its B-way thread
concurrency is preserved.
Pickle / distributed training
JaxProblem round-trips through pickle.dumps / pickle.loads, so
it works with the realistic distributed-training paths:
multiprocessing(start_method='spawn')— the default on macOS and whattorch.utils.data.DataLoader(num_workers>0)uses;- Ray and Dask actors via
cloudpickle; - Naive checkpointing for resume.
The per-process runtime state (JIT’d closures, threading.Lock,
threading.local, the factor-reuse executor, the held LDLᵀ factor
registry) is dropped from the pickle and rebuilt on the receiving
side. The sparsity-pattern arrays survive the round trip, so the
worker doesn’t redo the one-shot JAX probe. Held factors do not
survive — a fresh process has no history of fwd solves, so the
receiver’s registry starts empty and the bwd factor-reuse path picks
up from the next solve.
User-side requirement: f and g must themselves be picklable.
Module-level functions work with stdlib pickle; lambdas / inner
functions need cloudpickle (which is what Ray, Dask, and
torch.multiprocessing use by default anyway).
multiprocessing(start_method='fork') is not supported — JAX
itself warns that os.fork() is incompatible with its threading;
use spawn instead.
Stacked block-diagonal batched solve (batched_solve)
JaxProblem.batched_solve(p_batch, x0) runs one IPM solve over a
single NLP whose variables are [x^(1); ...; x^(B)], constraints are
concat(g(x^(k), p^(k))), and objective is Σ_k f(x^(k), p^(k)).
The Jacobian and Lagrangian Hessian are block-diagonal — each block-k
constraint touches only the block-k slice of X, and the objective
is a pure sum, so there’s no cross-block coupling. The IPM sees one
big sparse problem but does only B × (per-block factor cost) work
on the linear system.
p_batch = jnp.array([[0.3, 0.7], [0.5, 0.5], [-0.1, 0.4]])
x_batch = jp.batched_solve(p_batch, x0=jnp.zeros(2)) # (B, n)
custom_vjp-wrapped, so jax.grad/jax.jacobian through the
batched solve work end-to-end:
def loss(P):
return jnp.sum(jp.batched_solve(P, x0=jnp.zeros(2)) ** 2)
dloss_dP = jax.grad(loss)(p_batch) # (B, p_shape)
The backward path follows factor_reuse=:
factor_reuse=True(default) — oneSolver.kkt_solveagainst the stacked held LDLᵀ factor; the per-block∂²L/∂x∂p/∂g/∂parejax.vmap’d autodiff over the user’sf/g, then contracted with the per-blocku_x/u_gslices of the single back-solve. Composes (A) and (B) — one factor for both forward and per-batch sensitivities (pounce#76).factor_reuse=False—jax.vmapof the per-element dense(n+m) × (n+m)JAX KKT solve. Exact for the same reason: block- diagonal coupling means∂x^(k)*/∂p^(j) = 0fork ≠ j.
When to pick batched_solve vs the existing batched surfaces:
| Surface | Wins when |
|---|---|
vmap_solve | Long batches, want one solve per iterate sequentially. |
vmap_solve_parallel | Batch elements have very different convergence behaviour — slow blocks don’t drag fast ones (B independent IPMs in worker threads, GIL released per solve). |
batched_solve | Blocks have similar convergence behaviour (shared barrier homotopy and symbolic factorisation amortise) and B is large enough that the per-call Python overhead of B fwd dispatches becomes visible (one Rust crossing instead of B). |
Per-block lb/ub/cl/cu are tiled across the batch; the
parameter p is what varies, not the feasible region. Stacked
Problems are cached per (thread, B) in a tiny LRU (cap 4), so
calls in a loop with one or two batch sizes pay the build cost at
most once per worker.
Post-solve Jacobian and sensitivities (batched_solve_with_jacobian)
When you need the explicit per-block Jacobian J[k] = ∂x^(k)*/∂p^(k)
as a first-class result — for validation, linear-update layers, or
diagnostics — batched_solve_with_jacobian returns it directly from
the held KKT factor instead of wrapping batched_solve in
jax.jacrev:
x_star, (lam, zL, zU), J = jp.batched_solve_with_jacobian(p_batch, x0)
# x_star : (B, n) J : (B, n, p_dim) duals match batched_solve_with_warm
J’s row i is the reverse-mode VJP at cotangent e_i (the KKT
system is symmetric), so the whole Jacobian is one multi-RHS back-solve
against the held LDLᵀ factor — no NLP re-solve, no repeated public
jax.vjp calls. Pass wrt_cols (1-D p only) to keep just the
parameter columns you care about, e.g. wrt_cols=slice(0, ny) to drop
context columns; J then has trailing dim len(wrt_cols).
For the linear-update pattern — anchor once, then apply several nearby
sensitivity products — pin the factor with an AnchorState and reuse it:
with jp.anchor(p_batch, x0, wrt_cols=slice(0, ny)) as state:
dx = jp.batched_jvp_from_state(state, dp) # J @ dp (forward)
dp_bar = jp.batched_vjp_from_state(state, x_bar) # J^T @ x_bar (reverse)
batched_jvp_from_state is the cheap path for linear updates that only
need the directional sensitivity delta_x = J @ delta_p and never the
full J: it assembles the parameter-side RHS [∂²L/∂x∂p · dp; ∂g/∂p · dp]
and back-solves once against the held factor. When the state was anchored
with wrt_cols, pass the reduced dp (one entry per selected column);
otherwise pass a full (B,) + p_shape perturbation (zero out the columns
you don’t want to move).
anchor(...) (and batched_solve_with_jacobian(..., return_state=True))
return an AnchorState that holds the factor across calls. Prefer the
context-manager form; for handles that must outlive a single block
(e.g. stored on a projection layer), use explicit ownership:
state = jp.anchor(p_batch, x0)
... # later calls reuse `state`
state.reanchor(p_new, x0) # swap the solve in place (closes prior pin)
state.close() # release the held factor
Pinned factors are exempt from the backward LRU but capped
(_pinned_capacity, default 16) so a missed close() fails loudly
rather than leaking; a weakref finalizer reclaims the factor if a
handle is garbage-collected without close(). A worked example —
projection layer, full Jacobian, JVP/VJP-from-state, and the lifetime
patterns — is in
notebooks/13_post_solve_jacobian.ipynb.
Notebooks
The notebooks under
python/notebooks/
work through getting started, JAX autodiff, implicit differentiation,
sensitivity analysis, the Pyomo integration,
NLP scaling
(set_problem_scaling + nlp_scaling_method=user-scaling), and
FBBT
(nonlinear bound tightening via presolve_fbbt=yes on Pyomo
models).
Finding Multiple Minima
pounce.minimize finds a single local minimum from a starting point.
pounce.find_minima is its global-search companion: it drives the same
local solver in a loop to discover many distinct minima, or the global
one among them.
import pounce
result = pounce.find_minima(
fun, x0,
method="deflation", # see the method families below
jac=jac, hess=hess, # same as minimize; analytic derivatives recommended
bounds=bounds,
n_minima=6, # target number of distinct minima
max_solves=None, # budget; default 8 * n_minima
patience=8, # give up after this many solves with nothing new
dedup=1e-3, # minima closer than this are "the same"
seed=0,
)
result.minima # list of minima, sorted by objective (lowest first)
result.values # their objective values
result.x # the best (lowest) minimum
result.status # "target_reached" | "converged" | "budget_exhausted"
result.n_solves # solver calls used
result.trace # per-solve diagnostics
Every method reuses minimize, so bounds and constraints carry through
unchanged, and the acceptance test is shared: each candidate is polished
on the clean objective, checked against the bounds, and — when a Hessian is
supplied — certified as a true minimum (positive-semidefinite Hessian, so
saddles and maxima are rejected) before being de-duplicated and recorded.
The six methods fall into three families by how they escape a minimum they have already found.
Repulsion — transform the problem and re-solve
These modify the problem so the solver can no longer settle where it just did, then re-solve. They share the lineage of the filled-function method and metadynamics: make the found minimum unattractive.
flooding
Add a repulsive Gaussian bump to the objective at each found minimum
x*_k:
F(x) = f(x) + Σ_k A_k · exp(−‖x − x*_k‖² / 2σ_k²)
The bump does not move the stationary point (a Gaussian is flat on top); it
flips its curvature. The minimum turns into a saddle once the bump is
taller than the basin’s curvature — precisely when
A/σ² > λ_min(∇²f(x*)) — and the solver rolls off it into a new
basin. The bump is smooth with an analytic gradient and Hessian, so the
flooded problem is as solvable as the original.
- Knobs (
strategy_kw):sigma(width) andamplitude(height). Both are"auto"by default.sigmais per-dimension — a fraction (sigma_frac, default 0.1) of each variable’s bounds range — so variables on very different scales are handled automatically.amplitudeis set per minimum from the local curvature (amp_margin × μ_min, the well-tempered escape height, whereμ_minis the smallest generalized eigenvalue of the Hessian against the bump metric) and raised adaptively if the solver returns to a flooded basin — so no manual energy scale is needed (a steep PES whose wells are ~150 deep needs noamplitude=150). Override either with a scalar or a length-nvector (sigma). - Best for broad enumeration of all minima of a smooth objective.
- References. Ge, R. “A filled function method for finding a global minimizer of a function of several variables.” Mathematical Programming 46, 191–204 (1990). doi:10.1007/BF01585737. Laio, A. & Parrinello, M. “Escaping free-energy minima.” PNAS 99(20), 12562–12566 (2002). doi:10.1073/pnas.202427399. Grubmüller, H. “Predicting slow structural transitions in macromolecular systems: Conformational flooding.” Phys. Rev. E 52(3), 2893–2906 (1995). doi:10.1103/PhysRevE.52.2893. Adaptive bump heights: Barducci, A., Bussi, G. & Parrinello, M. “Well-tempered metadynamics.” Phys. Rev. Lett. 100, 020603 (2008). doi:10.1103/PhysRevLett.100.020603.
deflation
Instead of a finite local bump, add a singular pole penalty:
F(x) = f(x) + Σ_k η / (‖x − x*_k‖² + s)^(p/2)
Each found minimum becomes infinitely costly. The pole reaches further than
a Gaussian (it decays as 1/r^p rather than vanishing exponentially), so
it can clear a basin a narrow Gaussian would miss. This is the additive,
minimization-friendly realization of the deflation idea, whose original
form multiplies the residual of a nonlinear system by a deflation operator
to exclude known roots for a Newton iteration.
- Knobs:
eta(penalty strength),powerp,softs(softening that keeps the pole finite), andlength— the per-dimension pole scale, also"auto"from the bounds range by default (scalar or vector to override). - Best for enumeration on problems where the longer-reach repulsion helps; the most Newton/IPM-native of the repulsion methods.
- References. Brown, K.M. & Gearhart, W.B. “Deflation techniques for the calculation of further solutions of a nonlinear system.” Numerische Mathematik 16, 334–342 (1971). doi:10.1007/BF02165004. Farrell, P.E., Birkisson, Á. & Funke, S.W. “Deflation techniques for finding distinct solutions of nonlinear partial differential equations.” SIAM J. Sci. Comput. 37(4), A2026–A2045 (2015). doi:10.1137/140984798.
tunneling
Rather than climb out of a basin, tunneling crosses sideways at constant height to a point past the barrier, then descends. Between local solves it seeks a point at the height of the most-recently found minimum while being repelled from all known minima, and then re-minimizes there. The result is a monotonically non-increasing sequence of minima.
- Knobs:
eta,power,soft(the repelling poles). - Best for finding the global minimum and a descending trail to it, not exhaustive enumeration.
- Reference. Levy, A.V. & Montalvo, A. “The tunneling algorithm for the global minimization of functions.” SIAM J. Sci. Stat. Comput. 6(1), 15–29 (1985). doi:10.1137/0906002.
A worked example of all three is in
python/notebooks/15_find_minima_repulsion.ipynb.
Restart — choose the next start cleverly
These leave the objective untouched and only change where each local solve begins.
multistart
Random (or Sobol low-discrepancy) sampling of the bounds box, one local solve per start. Simple, a strong baseline, and embarrassingly parallel.
- Knobs:
sobol(low-discrepancy sampling, on by default),restart_jitter(used when no bounds box is given). - Best for a robust default, especially when local solves are cheap and can be parallelized.
mlsl
Multi-Level Single Linkage grows a pool of sample points and starts a local solve from a sample only when (a) no better sample lies within a shrinking “reduced distance,” and (b) it is not near an already-found minimum. The effect is that each basin is descended approximately once, instead of many times as plain multistart re-discovers knowns.
- Knobs:
samples_per_round,gamma(reduced-distance scale). - Best for expensive local solves on funneling landscapes, where avoiding redundant descents matters.
- Reference. Rinnooy Kan, A.H.G. & Timmer, G.T. “Stochastic global optimization methods part II: Multi level methods.” Mathematical Programming 39, 57–78 (1987). doi:10.1007/BF02592071.
See python/notebooks/16_find_minima_restart.ipynb,
which shows multistart spending ~15 solves (9 redundant) to find all six
camel minima where MLSL needs ~6 (0 redundant).
Hopping — a Markov chain over minima
basinhopping
From the current minimum, apply a random perturbation, locally minimize to a neighboring minimum, and accept or reject by a Metropolis rule on the objective. The chain is biased downhill, so it reliably reaches the global minimum while collecting the distinct minima it visits.
- Knobs:
step(perturbation size),temperature(acceptance). - Best for the global minimum on rugged, high-dimensional landscapes — the workhorse of cluster and protein optimization.
- References. Li, Z. & Scheraga, H.A. “Monte Carlo-minimization approach to the multiple-minima problem in protein folding.” PNAS 84(19), 6611–6615 (1987). doi:10.1073/pnas.84.19.6611. Wales, D.J. & Doye, J.P.K. “Global optimization by basin-hopping…” J. Phys. Chem. A 101(28), 5111–5116 (1997). doi:10.1021/jp970984n. Cousin with history feedback: Goedecker, S. “Minima hopping…” J. Chem. Phys. 120(21), 9911–9917 (2004). doi:10.1063/1.1724816.
See python/notebooks/17_find_minima_hopping.ipynb.
Beyond minima: saddle points and critical points
The same ideas extend to every stationary point of f — saddles
(transition states) and maxima included. A critical point has ∇f(x) = 0;
its Morse index (the number of negative Hessian eigenvalues) classifies
it: 0 = minimum, 1 = transition state, …, n = maximum. Two entry
points are provided.
find_critical_points — enumerate and classify
Stationary points are the roots of ∇f(x) = 0, which are exactly the minima
of the gradient-norm merit ½‖∇f(x)‖² (zero there). So find_critical_points
runs find_minima on that merit — using any enumeration method
("deflation", "multistart", …) — then keeps the points where ‖∇f‖ is
truly zero and labels each by its Morse index. This treats pounce as a
root-finder and reuses the whole find_minima machine.
r = pounce.find_critical_points(
fun, x0, grad=grad, hess=hess, bounds=bounds,
method="deflation", n_points=12, dedup=1e-2,
)
r.minima # index 0
r.saddles # 0 < index < n (transition states)
r.maxima # index n
for p in r.points:
print(p.kind, p.x, p.f, p.index)
find_saddles — eigenvector following
A saddle is a minimum in most directions and a maximum along a few. By
walking uphill along the index softest Hessian eigenvectors and Newton-
downhill in the rest, eigenvector following lands directly on an
index-index saddle; multistart enumerates several.
s = pounce.find_saddles(fun, x0, grad=grad, hess=hess, bounds=bounds,
index=1, n_saddles=4)
Together with the minima, the index-1 saddles between them form the transition-state network / disconnectivity graph — flooding fills the basins, and the saddles are the barriers crossed between filled basins.
reaction_network — states, barriers, and connectivity in one call
reaction_network packages the whole workflow: it finds the minima (stable
states), finds the index-1 transition states, and connects each
transition state to the two minima it joins — by descending its unstable
mode into each adjacent basin — returning the barrier table and the
minimum-energy paths.
net = pounce.reaction_network(
fun, x0, grad=grad, hess=hess, bounds=bounds,
n_states=3, n_transition_states=2,
minima_kw={"sigma": 0.4, "amplitude": 150.0}, # find_minima tuning
saddle_kw={"max_step": 0.05}, # find_saddles tuning
)
print(net.summary())
net.minima # stable states, sorted by energy (CriticalPoint)
net.transition_states # index-1 saddles
net.connections # each: .ts, .minima=(i,j), .barrier=(fwd,rev), .path (MEP)
net.barrier(i, j) # lowest single-step barrier from state i to state j
net.neighbors(i) # states reachable from i over one transition state
net.path_between(i, j) # the connecting minimum-energy path, oriented i -> j
This is the natural high-level entry point for reaction-barrier and
energy-landscape work: the connectivity it returns is the reaction network
(equivalently, a disconnectivity graph), and the barrier of an elementary
step i → j is E(transition state) − E(state i).
- References. Cerjan, C.J. & Miller, W.H. “On finding transition states.” J. Chem. Phys. 75, 2800 (1981). Henkelman, G. & Jónsson, H. “A dimer method for finding saddle points…” J. Chem. Phys. 111, 7010 (1999). doi:10.1063/1.480097. Henkelman, G., Uberuaga, B.P. & Jónsson, H. “A climbing image nudged elastic band method…” J. Chem. Phys. 113, 9901 (2000). doi:10.1063/1.1329672. E, W. & Zhou, X. “The gentlest ascent dynamics.” Nonlinearity 24, 1831 (2011). doi:10.1088/0951-7715/24/6/008.
Runnable demos: a landscape with 4 minima, 4 saddles, and 1 maximum in
python/examples/critical_points.py,
and a molecular reaction barrier on the Müller-Brown potential — one
reaction_network call locating the stable states and the transition states
between them, then reading off barrier heights and the minimum-energy path —
in python/examples/reaction_barrier.py.
Termination
The search stops on whichever fires first, reported in result.status:
| condition | meaning | status |
|---|---|---|
n_minima distinct minima found | got what you asked for | target_reached |
patience solves in a row find nothing new | landscape appears exhausted | converged |
max_solves reached | spent the budget | budget_exhausted |
patience is what makes the “fewer minima exist than requested” case
efficient: ask for 6, find 2, try a few more times, and stop with
converged rather than burning the whole budget. find_minima always
returns however many minima it actually found — falling short of n_minima
is not an error.
A solve is many function evaluations; max_solves counts solver calls. A
true per-evaluation ceiling belongs inside each solve via
options={"max_iter": ...}.
Choosing a method
See Choosing a Multiple-Minima Method, including how the families behave as the dimension grows.
Scope
find_minima covers methods that drive pounce’s local solver as their inner
loop. Rigorous deterministic global optimization (branch-and-bound, DIRECT),
population/stochastic globals (differential evolution, CMA-ES — already in
SciPy), and homotopy continuation (all stationary points of polynomial
systems) are different machinery and out of scope.
Choosing a Multiple-Minima Method
All six find_minima methods drive the same local solver; they differ in
how they leave a minimum once found. Use this page to pick one.
By goal
| Your goal | Prefer | Why |
|---|---|---|
| Enumerate all minima of a smooth, low-dimensional objective | flooding, deflation | repulsion clears each basin so the next solve finds a new one; analytic derivatives keep the inner solve fast |
| Just the global minimum | basinhopping, tunneling | both are biased downhill and do not try to cover the whole space |
| A robust, parallel baseline | multistart | independent starts, trivially parallel, no tuning |
| Expensive solves on a funneling landscape | mlsl | clustering avoids re-descending basins it has already mapped |
| Rugged, high-dimensional landscape (clusters, conformers) | basinhopping | a local random walk over minima; the standard tool at scale |
By problem structure
- Have an analytic Hessian? Repulsion methods (
flooding,deflation) exploit it directly and certify each result as a true minimum. Without a Hessian, saddle rejection is skipped and the restart/hopping methods are a safer default. - Constrained problem? All methods pass
bounds/constraintsthrough. Repulsion only touches the objective, so it is the cleanest with general constraints; restart and hopping sample/perturb inside the bounds box. - No bounds?
multistart/mlslfall back to jittering aroundx0(give aboundsbox for genuine global coverage).flooding/deflationandbasinhoppingwork without bounds. - Variables on very different scales? Handled automatically. The
repulsion bump widths (
sigma/length) are per-dimension and"auto"by default — sized to each variable’s bounds range — and the default dedup metric measures distance in that same scaled space, so a singlededuptolerance is scale-free. Giveboundsso the scales can be inferred; pass an explicit scalar or length-nvector to override. - Symmetric or periodic coordinates (e.g. a periodic box): pass a custom
distance=metric so that images of the same minimum de-duplicate correctly.
Tuning cheat-sheet
| method | key knobs | rule of thumb |
|---|---|---|
flooding | sigma, amplitude | both "auto" by default (sigma per-dimension from the bounds; amplitude per-minimum from local curvature) — leave them; override only to force a specific width/height |
deflation | eta, power, soft, length | length is per-dimension "auto" by default; raise eta if the solver returns to a known minimum |
tunneling | eta, power | increase patience; it descends in a chain |
multistart | sobol | leave Sobol on for coverage |
mlsl | samples_per_round, gamma | more samples/round on rugged landscapes |
basinhopping | step, temperature | step ≈ basin spacing; raise temperature to cross higher barriers |
If a run stops at converged with fewer minima than you wanted, raise
patience (search longer before giving up) and/or max_solves. If it stops
at budget_exhausted, raise max_solves.
Scaling to high dimensions
The honest headline: enumerating all minima is intractable in high
dimensions for every method here — and that is a property of the problem,
not of the solver. The number of local minima typically grows exponentially
with dimension (Rastrigin has on the order of k^n; molecular energy
landscapes grow exponentially with the number of atoms). No method can list
exponentially many minima. What changes with dimension is which goal
remains reachable and which methods stay efficient.
Two costs scale independently:
- Cost per local solve. This is just pounce’s interior-point solve and
scales well with sparse, large
n— provided the objective stays sparse. Here is the catch for repulsion methods: each Gaussian or pole term adds a densen×nHessian contribution, so withKfound minima the augmented Hessian is the sparse base plusKdense updates. On large sparse problems this destroys sparsity and the inner solve slows sharply. Restart and hopping never modify the objective, so they keep the original sparsity and the per-solve cost scales asminimizeitself does. - Number of solves needed. For coverage this grows exponentially for all methods. For the global minimum it grows much more slowly for the downhill-biased methods, which is why they remain usable at scale.
How each family behaves as n grows:
- Repulsion (
flooding,deflation). Two problems compound. A Gaussian of widthσcovers a vanishing volume fraction~ σⁿ, so filling space needs exponentially many bumps; and the bumps densify the Hessian (above). The standard high-dimensional fix — used by metadynamics in practice — is to flood in a low-dimensional collective-variable subspace rather than allncoordinates. In full coordinates these are best kept to roughlyn ≲ 10–20. - Restart (
multistart,mlsl). Each start is cheap and parallel, but the number of starts to cover (or to hit the global basin) grows exponentially. MLSL’s clustering relies on a reduced radius∝ (ln N / N)^(1/n); asngrows that exponent→ 0, distances concentrate, and MLSL degenerates toward plain multistart. So MLSL’s advantage is a low-to-moderate-dimension phenomenon; in high dimension prefer plain parallelmultistart(for the global basin) and spend the budget on more starts. - Hopping (
basinhopping). This is the family that scales best in practice, and it is exactly what the chemistry/physics community uses for hundreds to thousands of degrees of freedom. It performs a local random walk in minimum-space — it never tries to cover the domain — keeps the objective (and its sparsity) untouched, and the Metropolis bias funnels toward low minima. Pair it with multiple independent chains for parallelism.
Practical guidance:
n ≲ 10–20, want all minima:flooding,deflation, ormlsl.- High
n, want the global (or a few good) minima:basinhoppingfirst;multistartwith many parallel starts as a baseline;tunnelingfor a descending trail. - High
nand you still want flooding-style biasing: restrict the bumps to a handful of collective variables, as in metadynamics, rather than the full coordinate vector. - Always: each individual solve inherits pounce’s scalability; the bottleneck is the number of solves and, for repulsion, the loss of sparsity — not the local solver.
Finding Multiple Minima from the CLI
The pounce command line solves one problem from one starting point. The
--minima family turns that single solve into a global search: it drives
the same interior-point solver in a loop, escaping each minimum it finds, and
collects the distinct local minima into a deduplicated archive. It is the
pure-Rust counterpart of the Python find_minima API and
needs no Python — it works on built-in problems and on AMPL .nl files alike.
$ pounce model.nl --minima flooding --n-minima 10
Methods
--minima <method> selects one of six strategies. They share the same local
solver and acceptance test and differ only in how they leave a minimum once
found:
| method | how it escapes a found minimum | reference |
|---|---|---|
multistart | independent random / Sobol’ starts across the box | — |
mlsl | Multi-Level Single-Linkage clustering of sampled starts | Rinnooy Kan & Timmer (1987) |
basinhopping | Metropolis random walk over minima | Wales & Doye (1997) |
flooding | repulsive Gaussian bumps added at found minima (filled-function) | Ge (1990) |
deflation | softened 1/‖x−x*‖^p poles added at found minima | Farrell, Birkisson & Funke (2015) |
tunneling | equal-height tunnel term between descents | Levy & Montalvo (1985) |
--multistart is shorthand for --minima multistart. For help choosing,
see Choosing a Method — the guidance there applies
unchanged to the CLI.
Shared options
| flag | default | meaning |
|---|---|---|
--n-minima <N> | 10 | target number of distinct minima (a stop condition) |
--max-solves <N> | 8 × n-minima | hard cap on solver calls |
--patience <N> | 8 | stop after N solves in a row that find nothing new |
--dedup <d> | 1e-4 | minima within this per-dimension-scaled distance are the same |
--psd-tol <t> | 1e-6 | smallest Hessian eigenvalue tolerated by the saddle-rejection check |
--seed <S> | 0 | seed for sampling / Sobol’ scramble (runs are reproducible) |
--sobol / --no-sobol | on | use a scrambled Sobol’ sequence for box sampling |
A candidate is accepted when its solve converged, the point is finite and
inside the bounds, its objective Hessian is positive semidefinite within
--psd-tol (saddle rejection; skipped when no Hessian is available or the
problem is large), and it is not already within --dedup of an archived
minimum. The dedup distance is measured in a per-dimension-scaled space
(‖(a−b)/L‖, with L the box width per variable), so a single tolerance is
scale-free.
The search stops at the first of: target_reached (--n-minima found),
converged (--patience consecutive empty solves), or
budget_exhausted (--max-solves reached).
Strategy knobs
Each is optional and used only by the relevant method; omit them to take the
defaults (which mirror find_minima). "auto" widths are sized per dimension
from the bounds.
| flags | method |
|---|---|
--sigma, --sigma-frac, --amplitude, --amp-margin | flooding |
--eta, --power, --soft, --length, --length-frac | deflation, tunneling |
--gamma, --samples-per-round | mlsl |
--step, --temperature | basinhopping |
--restart-jitter | all (perturbation scale for restart fallbacks) |
The repulsion methods (
flooding,deflation,tunneling) run each escape solve underhessian_approximation = limited-memory— the analytic penalty term is added to the objective and its gradient, and the quasi-Newton update supplies curvature, so the dense augmented Hessian is never assembled. Each accepted point is then polished by re-solving the clean objective with the exact Hessian, so the reported minima sit on the true problem.
Output
The console prints a ranked table of the distinct minima (rank, objective, and scaled distance to the best), followed by the stop status and the number of solves:
find-minima: 6 distinct minima in 17 solves (target_reached)
rank objective dist-to-best
0 -1.03162845e0 0.000000e0
1 -1.03162845e0 4.772232e-1
...
Solution files. With .sol output enabled, the global best minimum is
written to the usual <stub>.sol (preserving the AMPL contract), and the
remaining minima, ranked by objective, to siblings <stub>.min001.sol,
<stub>.min002.sol, … (so min001 is the second-best point).
JSON report. --json-output writes the standard single-solve report for
the best minimum, plus a backward-compatible minima section:
"minima": {
"method": "multistart",
"status": "target_reached",
"n_solves": 17,
"n_minima": 6,
"minima": [{ "x": [...], "objective": -1.0316 }, ...],
"values": [-1.0316, -1.0316, -0.2155, ...]
}
Omitting --minima leaves the default single-solve output completely
unchanged.
Example
The six-hump camel function has six local minima (two global at
f ≈ −1.0316). Searching for all of them from an .nl model:
$ pounce sixhump.nl --minima multistart --n-minima 6 \
--max-solves 120 --patience 40 --dedup 1e-3 --seed 0
References
- Ge, R. “A filled function method for finding a global minimizer of a function of several variables.” Mathematical Programming 46, 191–204 (1990).
- Rinnooy Kan, A.H.G. & Timmer, G.T. “Stochastic global optimization methods part II: Multi level methods.” Mathematical Programming 39, 57–78 (1987).
- Levy, A.V. & Montalvo, A. “The tunneling algorithm for the global minimization of functions.” SIAM J. Sci. Stat. Comput. 6(1), 15–29 (1985). doi:10.1137/0906002.
- Wales, D.J. & Doye, J.P.K. “Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms.” J. Phys. Chem. A 101(28), 5111–5116 (1997).
- Farrell, P.E., Birkisson, Á. & Funke, S.W. “Deflation techniques for finding distinct solutions of nonlinear partial differential equations.” SIAM J. Sci. Comput. 37(4), A2026–A2045 (2015). doi:10.1137/140984798.
Algorithm & Workspace
Algorithm
POUNCE implements the interior-point filter line-search algorithm of Wächter & Biegler (2006) — the same algorithm upstream Ipopt uses. A solve proceeds as a sequence of barrier subproblems: for a decreasing sequence of barrier parameters μ, it takes primal-dual Newton steps on the perturbed KKT system, accepting each step through a filter line-search that balances objective descent against constraint infeasibility. When a regular step cannot be found, a restoration phase minimizes constraint violation to return the iterate to a filter-acceptable region.
See Acknowledgments for the papers behind each component.
Workspace layout
POUNCE is a Cargo workspace. Each crate maps onto a part of the upstream Ipopt source tree:
| Crate | Purpose |
|---|---|
pounce-common | Types, exceptions, journalist, options, tagged objects, cached results (Ipopt src/Common). |
pounce-linalg | BLAS-1, dense/compound vectors and matrices, triplet storage, CSC conversion (Ipopt src/LinAlg). |
pounce-linsol | Symmetric linear-solver trait layer — no FFI; backends plug in below. |
pounce-feral | Pure-Rust sparse symmetric LDLᵀ backend. The default. |
pounce-hsl | MA57 backend via libcoinhsl (optional, behind the ma57 feature). |
pounce-nlp | TNLP trait, TNLPAdapter, IpoptApplication entry point (Ipopt src/Interfaces). |
pounce-algorithm | IteratesVector, IpoptData, calculated quantities, KKT, line search, μ update, convergence check, main loop (Ipopt src/Algorithm). |
pounce-restoration | Restoration phase (Ipopt Algorithm/Resto*). |
pounce-presolve | Presolve / problem-reduction pass run before the IPM. |
pounce-l1penalty | ℓ₁-exact penalty-barrier wrapper for degenerate / MPCC NLPs. |
pounce-sensitivity | Parametric sensitivity (port of Ipopt contrib/sIPOPT). |
pounce-cinterface | C ABI shim — CreateIpoptProblem / IpoptSolve / FreeIpoptProblem. |
pounce-py | Python bindings (the pounce Python package). |
pounce-cli | The pounce command-line driver. |
The C ABI shim lets existing PyIpopt / cyipopt / JuMP / AMPL clients link against POUNCE in place of Ipopt.
Color Theme
POUNCE’s terminal output uses one tiger / rust / warm palette across
every colored surface — the iteration table, the branded wordmark, and
the interactive debugger. This page is the single reference for what the
colors mean; the palette itself lives in
pounce-common::style
(a pure, unit-tested module — no I/O, no globals).
For the environment variables that turn color on/off (NO_COLOR,
CLICOLOR_FORCE, RUST_LOG, POUNCE_LOG_FORMAT) see
Solver Options → Logging and colored output.
The palette
| Name | Hex | Role |
|---|---|---|
ALPHA_COOL | #000000 | iteration-row text at α = 1 (full Newton step) |
ALPHA_HOT | #cc2200 | iteration-row text at α → 0 (stalling); molten-claw base |
TAN | #8a6d3b | restoration soft-stay row background (s) |
AMBER | #b56a12 | restoration soft-exit row background (S) |
RUST_DEEP | #6e260e | restoration hard row background (R / resto-phase rows) |
CREAM | #f5e6c8 | restoration-row text at α = 1 |
BRIGHT_YEL | #ffe03a | restoration-row text at α → 0; molten-claw top |
TIGER_ORANGE | #e87a1e | WARN logs, banner accents, molten-claw mid |
Two further surfaces reuse these or a small extension:
| Name | Hex | Role |
|---|---|---|
| steel-hi → steel-lo | #d2d6dc → #5c6068 | wordmark letter sheen, top row → bottom row |
| gold | #ffb000 | debugger banner highlight (interior-point debugger, help) |
| dim | #7a7e88 | debugger banner gloss text |
Where the colors appear
The iteration table
Two orthogonal channels encode solver state on each row:
- Background = restoration kind, keyed off the row’s
alpha_primal_chartag:ssoft-stay → tan,Ssoft-exit → amber,Rhard (and the dedicated restoration phase’sr-suffixed rows) → deep rust.- Normal (non-restoration) rows have no background.
- Tiny-step tags (
t/T) deliberately get no background — that stall is shown by the foreground instead.
- Foreground = a smooth gradient on the primal step length α ∈ [0, 1]
(a visual stalling cue):
- Normal rows: black (α = 1, full step) → hot red (α → 0).
- Restoration rows: cream (α = 1) → bright yellow (α → 0), so the text stays legible on the dark background.
- α is clamped to
[0, 1]; a non-finite α is treated as a full step (no false stalling alarm).
So at a glance: a dark row means restoration (its shade tells you which kind), and redder / yellower text means a shorter step (the solver is struggling to move).
The branded wordmark (pounce logo)
Printed atop a normal solve and at the top of the debugger REPL. The
POUNCE block letters carry a top-lit steel sheen (light silver
#d2d6dc at the top row fading to dark steel #5c6068 at the bottom),
and three diagonal molten claw slashes rake across them, glowing
bright yellow → tiger-orange → deep red top-to-bottom — the project
logo’s forged-metal-with-lava look.
The interactive debugger
The REPL open banner (--debug) reuses the same wordmark, then a command
cheat-sheet whose shortcut keys are tiger-orange, the
interior-point debugger line and the help hint are gold, and the
descriptive gloss is dim grey. Pause banners and command output are
otherwise uncolored. (--debug-json emits no color — its stdout is a
pure JSON channel.)
viz kkt / viz L open in the external Plotly viewer
(pounce-dbg-viz),
which is a separate visual language: the sparse-matrix heatmaps use a
diverging red–blue scale keyed on entry value (sign + magnitude),
not the terminal palette.
Logs
WARN-level log lines (on stderr) take the tiger-orange accent;
other levels use the subscriber’s defaults.
Terminal support & downgrade
- Truecolor (24-bit) is used when the terminal advertises it via
COLORTERM— every color above is emitted as exact RGB. - 256-color terminals get a graceful fallback: each RGB color snaps
to the nearest xterm 6×6×6 cube color (
downgrade/nearest_ansi256). The theme still reads correctly, just quantized.
When color is emitted
Color is opt-out and TTY-aware:
- The iteration table is colored only when stdout is a terminal
(via
anstream::AutoStream, which strips escapes from redirected output while keeping identical column alignment). - The debugger banner is colored only when stderr is a terminal.
NO_COLOR(any value) disables color everywhere;CLICOLOR_FORCEforces it even into a non-terminal sink. See Solver Options.
Because the policy is consistent, redirected logs/output are always plain text — safe to diff, grep, and ingest.
For contributors
Add or change colors in pounce-common::style, never with inline ANSI:
the constants, the α-gradient (alpha_gradient_rgb), the restoration
mapping (resto_background_rgb), the composed iteration_row_style, and
the truecolor downgrade all live there and are unit-tested without a
TTY. Print sites style through anstyle + anstream (or, for the
debugger banner on stderr, gate on stderr().is_terminal() and
NO_COLOR). Keep the two iteration-table channels — background =
restoration kind, foreground = step length — orthogonal.
NLP and Linear-System Scaling
Optimization problems whose objective, constraints, or KKT system span many orders of magnitude often converge poorly — or not at all — without some form of rescaling. pounce inherits two independent scaling layers from Ipopt and adds a third option at the linear-system level (see issue #61).
The two layers are conceptually separate:
| Layer | Option | What it touches |
|---|---|---|
| NLP scaling | nlp_scaling_method | The objective f and each constraint row c_i, before the IPM sees them. Changes algorithmic behavior (filter, tol, μ). |
| Linear-system scaling | linear_system_scaling | Symmetric scaling of the KKT augmented system D K D for the factorization. Purely numerical — the IPM sees the same iterates. |
You can configure them independently. Defaults match upstream Ipopt:
nlp_scaling_method = gradient-based, linear_system_scaling = none.
NLP-level scaling
| Option | Default | Effect |
|---|---|---|
nlp_scaling_method | gradient-based | none / gradient-based / user-scaling. |
nlp_scaling_max_gradient | 100.0 | Cutoff above which gradient-based scaling applies. Per-row scale = min(1, max_gradient / ‖∇c_i‖_∞). |
nlp_scaling_min_value | 1e-8 | Floor on computed scale factors — prevents inverting near-zero gradients. |
nlp_scaling_obj_target_gradient | 0.0 | When > 0, pins the scaled objective gradient ∞-norm to this value. Overrides the max_gradient cutoff. |
nlp_scaling_constr_target_gradient | 0.0 | Same as above, per constraint row. |
obj_scaling_factor | 1.0 | Constant multiplier on the objective, applied after the automatic factor. |
gradient-based (default)
Evaluates ∇f and ∇c_i once at the starting point x_0 and
chooses per-row scales that pull each gradient ∞-norm into a
reasonable band. Single-shot is mandatory — recomputing per iteration
would invalidate the filter’s history (Wächter, 2013).
The clamp at 1.0 means scaling never amplifies a small row; it only damps large ones.
user-scaling
The TNLP is asked for obj_scaling, a per-variable x_scaling, and a
per-constraint g_scaling via the get_scaling_parameters callback.
Use this when you know the natural units of your problem (e.g. mass in
kg vs. distance in mm) and can supply better scales than the
gradient-based heuristic.
Note: pounce’s
OrigIpoptNlpcurrently honorsobj_scalingand per-constraintg_scaling. Thex_scalingrequest channel is accepted but not yet acted on. Mirrors the design in issue #61.
If the TNLP’s get_scaling_parameters returns false (the default),
pounce falls back to no automatic scaling.
Setting user scaling
- From C — call
SetIpoptProblemScaling(problem, obj, x_scaling, g_scaling)thenAddIpoptStrOption("nlp_scaling_method", "user-scaling"). Seecrates/pounce-cinterface/include/pounce.h. - From Rust — implement
TNLP::get_scaling_parameterson your problem type. - From Python —
pounce.Problem.set_problem_scaling(obj_scaling, x_scaling=None, g_scaling=None), followed byadd_option("nlp_scaling_method", "user-scaling"). Walked through end-to-end inpython/notebooks/07_scaling.ipynb.
Target-gradient overrides
nlp_scaling_obj_target_gradient and
nlp_scaling_constr_target_gradient are subtle. When set to a
positive value, they override the max_gradient cutoff and the 1.0
clamp: the scaling is computed unconditionally as
target / max_gradient_norm, so the scaled gradient ∞-norm becomes
exactly the target. Useful when you have a specific numeric range you
want the IPM to see.
The default 0.0 means “use the cutoff path” — i.e. only scale rows
that are above nlp_scaling_max_gradient.
Linear-system-level scaling
| Option | Default | Effect |
|---|---|---|
linear_system_scaling | none | none / ruiz. mc19 and slack-based are accepted by the option registry but not yet implemented — both fall back to none. |
linear_scaling_on_demand | yes | Defer scaling computation until a linear solve is poor; reduces overhead for well-conditioned KKT systems. |
The KKT augmented system is symmetric; all linear-system scalers in
pounce use the symmetric form D K D (single diagonal) to preserve
that structure for the downstream factorization (MA57, MUMPS,
FERAL/SSIDS).
none— first-class choice. The inner linear solver (MA57, MUMPS, FERAL) often does its own scaling under some configurations; stacking pounce-level scaling on top can hurt. Default. Usema57_automatic_scaling=yesto get MA57’s internal scaling instead.ruiz— iterative symmetric ∞-norm equilibration (Ruiz, CERFACS TR/PA/01/14). Pure Rust, no Fortran dependency. Converges geometrically; capped at 10 iterations. The only implemented scaler today; recommended starting point when MA57’s internal scaling is off.mc19(not yet implemented) — intended HSL MC19 row/column scaling (Curtis-Reid 1972; minimizes Σ log²|a_ij|). Accepted by the registry but currently logs a warning and falls back tonone.slack-based(not yet implemented) — intended slack-aware scaling. Accepted by the registry but falls back tonone.
Worked example — nql180
nql180 is one of the Mittelmann NLP benchmarks where both default
pounce and default Ipopt fail to clear the strict tol gate (see
issue #25). Forcing
Ruiz symmetric equilibration on the augmented KKT system is enough to
push pounce all the way to “Optimal Solution Found”:
pounce nql180.nl presolve=yes linear_system_scaling=ruiz \
linear_scaling_on_demand=no
| default | + Ruiz (forced) | |
|---|---|---|
| Exit status | Solved To Acceptable Level | Optimal Solution Found |
| Iterations | 41 | 50 |
| Primal infeasibility | 4.0e-11 | 1.2e-15 |
| Dual infeasibility | 1.0e-5 | 3.1e-4 |
| Complementarity | 1.2e-9 | 9.9e-10 |
| Overall NLP error | 2.4e-7 | 9.9e-10 |
The four-orders-of-magnitude primal-feasibility improvement and ~3
orders on the overall NLP error are the textbook Ruiz benefit:
symmetric ∞-norm equilibration lowers the condition number of the KKT
matrix enough that the back-solve residuals drop the extra fractional
digits needed to clear tol. The extra nine iterations are well spent
— the 50-iter Ruiz solution is mathematically of strictly higher
quality than the 41-iter unscaled “acceptable” solution.
linear_scaling_on_demand=no forces always-on Ruiz; the default
(yes) defers scaling computation until the linear solver flags an
iterate as poorly scaled, which is the right behavior for problems
that don’t need it (most of the Mittelmann set, where the iter count
is unchanged with or without Ruiz).
Reporting
All scaling effects are undone before the solve report (final objective, multipliers, dual residuals, KKT termination metric) is handed back to the user. You always see quantities in the natural units of your TNLP.
Internally, the IPM operates in scaled space: stopping criteria
(tol, acceptable_tol) compare scaled values, the barrier parameter
μ is in scaled units, and the filter’s history is built from scaled
function values.
When to override the defaults
Reach for non-default scaling when:
- The constraint Jacobian has entries spanning many orders of magnitude
(chemistry, power-flow, mixed-unit mechanics). Try
mc19orruizat the linear-system level, after disabling MA57’s internal scaling. - The IPM stalls with small step sizes but no clear infeasibility.
Worth turning
nlp_scaling_method=noneto see whether the default gradient scaling is doing the wrong thing; then re-enable with problem-specific target gradients. - You know the natural units of your problem better than the solver
can infer from gradients at
x_0. Wireuser-scaling.
Otherwise the upstream-Ipopt-style defaults (gradient-based at the
NLP level, none at the linear-system level with MA57’s internal
scaling on) are a reasonable starting point.
References
- Wächter, A. On the effects of scaling on the performance of Ipopt. arXiv:1301.7283 (2013). https://arxiv.org/abs/1301.7283
- Ruiz, D. A scaling algorithm to equilibrate both rows and columns norms in matrices. CERFACS TR/PA/01/14. https://cerfacs.fr/wp-content/uploads/2017/06/14_DanielRuiz.pdf
- Curtis, A. R. and Reid, J. K. On the Automatic Scaling of Matrices for Gaussian Elimination. (1972). HSL MC19 reference.
- pounce issue #61.
Feasibility-Based Bound Tightening (FBBT)
pounce supports feasibility-based bound tightening on nonlinear
constraints: interval-arithmetic propagation through the constraint
expression DAG to discover variable bounds the user did not write
down (e.g. x² + y² ≤ 1 ⇒ x ∈ [-1, 1], exp(x) ≤ 10 ⇒
x ≤ ln 10). It pairs with the linear bound-tightening already in
the presolve pipeline (which only handles linear constraints).
Tracks issue #62. References: Belotti, Cafieri, Lee, Liberti (2010).
When it helps
- The Jacobian / objective row magnitudes are wildly different from what the user-declared bounds suggest.
- A nonlinear equality or one-sided inequality is much tighter than
the user’s
[lo, hi]box. - Loose bounds were inherited from a modeling tool that doesn’t propagate constraints back to variable boxes (most modeling tools don’t).
FBBT cannot help when:
- The TNLP has no structural-expression representation. Today only
.nl-loaded problems (NlTnlp) expose one. Python (PyTnlp), C-callback (CCallbackTnlp), and Rust closure-based problems silently opt out. - The expression uses operators FBBT doesn’t reason about
(
Funcallto AMPL imported functions, variable-exponent powers,sin/cosreverse pass). Those subtrees become opaque and block tightening through them, but the rest of the constraint still propagates normally.
Options
| Option | Default | Effect |
|---|---|---|
presolve_fbbt | no | Master switch. Requires presolve=yes and an ExpressionProvider. |
fbbt_tol | 1e-6 | Minimum per-variable bound improvement to keep iterating. |
fbbt_max_iter | 10 | Outer-sweep cap. |
fbbt_max_constraints | 0 | Per-sweep cap on constraints inspected (0 = unlimited). |
FBBT runs after the linear bound-tightening (Phase 1) and before the redundant-constraint pass (Phase 2), so any FBBT-derived tightening feeds forward into row drops, the LICQ check, and the bound-multiplier warm starts.
Reading the presolve banner
With presolve_fbbt=yes, the per-solve presolve banner prints two
lines instead of one:
Presolve: tightened 170 bounds (82 newly-finite), dropped 46 redundant rows, LICQ=Full
Presolve FBBT: 10 sweeps, 1362 variable tightenings (Σ|Δ|=7.5e20)
Fields:
sweeps— number of outer iterations actually executed (≤fbbt_max_iter). Hitting the cap is informational, not an error.variable tightenings— total count of per-variable(x_lo[j], x_hi[j])updates that strictly improved the box.Σ|Δ|— sum of absolute bound improvements across all updates. Provided as a coarse “how much did we move” signal — not part of the FBBT algorithm.
If FBBT detects infeasibility (the constraint bound is disjoint
from the interval enclosure at the current variable box), it stops
and emits pounce: FBBT detected infeasibility (witness constraint N). The solve continues with the partially-updated bounds — the
IPM will then report infeasibility through its own channels.
Should I turn it on?
The issue’s design says: default off until benchmark evidence justifies a flip. Today’s evidence:
- On small problems (e.g.
tutorial_flow_density.nl): FBBT moves iteration count slightly, sometimes up, sometimes down. - On larger problems (e.g.
gaslib11_steady.nl): FBBT enables additional redundant-row drops and can promote the LICQ verdict fromStructuralRanktoFull, but the iteration count change is mixed.
So: try it on your problem. If you see fewer iterations or a
cleaner LICQ verdict, keep it on; if it costs iterations, turn it
off again. The cost of FBBT itself is small (one pass over the
expression DAGs per sweep, capped at fbbt_max_iter).
Soundness guarantees
FBBT uses outward-rounded interval arithmetic. Every operation
widens its result by one ULP outward so accumulated floating-point
error always increases the interval, never shrinks it. The
consequence: FBBT may produce a looser tightening than ideal, but
it cannot drop a feasible point. The pointwise soundness fuzz tests
in crates/pounce-presolve/src/fbbt/{forward,reverse,orchestrator}.rs
verify this property on random sample grids.
Operator support
Forward + reverse rules cover the operators that account for ~all nonlinear constraints in practice:
| Operator | Forward | Reverse |
|---|---|---|
+ - * / neg | ✓ | ✓ |
pow (integer constant) | ✓ | ✓ (branch-selecting for even powers) |
pow (variable / non-integer) | opaque | opaque |
sqrt exp ln abs | ✓ | ✓ (with domain clipping) |
sin cos | ✓ (loose) | declines to tighten |
log10 | rewritten as ln / ln(10) | follows the rewrite |
AMPL imported Funcall | opaque | opaque |
n-ary Sum | folded into binary Add | follows the fold |
Opaque slots evaluate to [-∞, +∞] on the forward pass and block
reverse propagation through them — they don’t pollute the rest of
the constraint.
Extending support to new TNLP sources
FBBT consumes the pounce_nlp::expression_provider::ExpressionProvider
trait. Any TNLP can opt in by implementing:
#![allow(unused)]
fn main() {
impl ExpressionProvider for MyTnlp {
fn constraint_expression(&self, i: usize) -> Option<pounce_nlp::FbbtTape> {
// Build a tape from your problem's symbolic structure.
// Return None to decline (FBBT becomes a no-op on that
// constraint).
}
}
}
FbbtTape is a flat tape of FbbtOp nodes; the existing
NlTnlp implementation in crates/pounce-cli/src/nl_fbbt_translate.rs
is the canonical template (it walks an AMPL Expr tree, preserving
CSE sharing via Rc::as_ptr keying). Building a similar tape from
a Pyomo, JAX, or sympy expression is a finite-effort project.
References
- Belotti, Cafieri, Lee, Liberti. On feasibility based bounds tightening. (2010). https://enac.hal.science/hal-00935464v1/document
- Liberti et al. Feasibility-based bounds tightening via fixed points. COCOA 2010. https://www.lix.polytechnique.fr/~liberti/fbbt-cocoa10.pdf
- Puranik, Sahinidis. Domain reduction techniques for global NLP and MINLP optimization. Constraints 22 (2017). https://arxiv.org/pdf/1706.08601
- pounce issue #62.
Auxiliary-Equality Preprocessing
POUNCE’s auxiliary-equality preprocessing pass identifies small, self-contained equality sub-systems in an NLP and solves them before the IPM starts. Variables determined by those sub-systems are pinned to their values; the equality rows are dropped from the problem the IPM sees. The IPM then handles the reduced problem, which is smaller, often better-conditioned, and sometimes solvable in zero iterations.
The pass is a port of ripopt PR #32 by
David Bernal Neira to pounce’s TNLP wrapper. It lives
entirely inside pounce-presolve and is enabled by setting two
options:
pounce problem.nl presolve=yes presolve_auxiliary=yes
What it does, in words
For each call to the inner TNLP, the wrapper:
- Builds a bipartite graph between equality constraint rows and variables, using the Jacobian sparsity.
- Finds a maximum matching (Hopcroft-Karp).
- Runs a Dulmage-Mendelsohn partition, slicing the graph into three pieces: overdetermined, underdetermined, and square (the piece where rows and variables pair up one-to-one).
- Decomposes the square piece into independent connected components, and each component into an ordered sequence of blocks via Tarjan SCC.
- Classifies each block by how it’s coupled to the rest of the problem: pure equality, objective-coupled, inequality-coupled, or both.
- Solves each pure-equality (or, with
aggressivecoupling, objective-coupled) block via a small dense-LU Newton step and verifies the full-space residual is within tolerance. - Applies accepted blocks by clamping the fixed variables’
bounds (
x_l = x_u = value) and dropping the dropped rows. - After the IPM finishes, recovers the Lagrange multipliers for the dropped rows via a small dense-LU stationarity solve, and hands the user back a complete full-space KKT solution.
If the model has no eliminable structure, the pass is a tested no-op and the IPM runs as usual.
When it helps
The pass is most valuable when an NLP contains:
- Algebraic auxiliary variables that appear in one or two linear constraints with no other coupling (common in process-engineering and energy-system models).
- Internal chains where one variable is defined as a function of
another (e.g.
T_out = T_in + delta_TwithT_inalready known). - Mass-balance equalities that form a small square block on a subset of stream variables.
ripopt reports gaslib11_steady going from 204 / 200 vars / cons to
140 / 136 vars / cons under this pass, and tutorial_flow_density
going from 6–7 IPM iterations to 0.
Coupling classes
Every candidate block is classified by what it touches:
| Class | Touches inequality? | Touches objective grad? | Eliminated under safe? | Eliminated under aggressive? |
|---|---|---|---|---|
PureEquality | no | no | yes | yes |
ObjectiveCoupled | no | yes | no | yes (postsolve candidate) |
InequalityCoupled | yes | no | no | no |
ObjectiveAndInequalityCoupled | yes | yes | no | no |
safe is the default. Inequality-coupled blocks are never
eliminated in v1 — fixing such a variable could violate the
inequality.
Options
See Solver Options → NLP Presolve for the full list. The two switches you most often touch are:
| Option | Default | Effect |
|---|---|---|
presolve_auxiliary | no | Master switch. Off → pass is a no-op. |
presolve_auxiliary_coupling | safe | none / safe / aggressive policy. |
Diagnostics
The pass populates an
AuxiliaryPreprocessingDiagnostics
struct on every call. From Rust:
#![allow(unused)]
fn main() {
use pounce_presolve::{wrap_with_presolve, PresolveOptions};
let opts = PresolveOptions { enabled: true, auxiliary: true, ..PresolveOptions::defaults() };
let wrapped = wrap_with_presolve(inner, opts)?;
// ... run a solve ...
// Access via the typed handle returned by PresolveTnlp::new:
// let diag = typed.auxiliary_diagnostics();
// println!("{diag}");
}
The Display impl produces output like:
auxiliary-preprocessing: 1 of 1 candidate block(s) eliminated, fixing 2 variable(s) and dropping 2 row(s) in 0 ms
max block dim: 2, max residual: 0.000e0
coupling: pure=1, obj=0, ineq=0, both=0
Per-stage timings (stage_time_ms.incidence_ms /
matching_ms / dm_ms / components_ms / btf_ms /
block_solve_ms / residual_check_ms) and per-class accept counts
are also available.
From the command line, set presolve_auxiliary_diagnostics=yes to
have the same summary emitted to stderr automatically after every
Phase-0 pass:
pounce problem.nl presolve=yes presolve_auxiliary=yes \
presolve_auxiliary_diagnostics=yes
Limitations (v1)
Both linear and nonlinear blocks are eliminated. The linear path reuses the pre-fetched Jacobian; the nonlinear path drives Newton through TNLP callbacks.
Fixed variables are assumed to be interior to their original bounds at the optimum; postsolve sets their bound multipliers to zero implicitly. Lifting this assumption — handling the case where a fixed variable is at an original bound — is a known follow-up.
The pass currently runs once, at the start of the solve. Iterative re-elimination (running the pass again on the reduced problem) is not supported in v1.
Interaction with the rest of presolve
The auxiliary pass runs before the existing bound-tightening
phase (presolve_bound_tightening=yes). The two phases interact at
the bounds: aux clamps x_l[i] = x_u[i] = value for variables it
fixes; bound tightening then propagates the remaining constraints.
The orchestrator filters out aux-dropped rows before tightening
runs, so they can’t propagate contradictions back over the clamps.
If tighten_bounds still flags infeasibility — for example because
an aux-fixed value disagrees with a kept-row’s bound — the
orchestrator rolls back the aux pass for that solve and re-runs
tightening on the unfiltered rows. A one-line warning lands on
stderr when this happens.
Interaction with sensitivity / reduced-Hessian post-processing
When the input .nl file carries sensitivity suffixes
(sens_init_constr / sens_state_*) or the CLI is invoked with
--compute-reduced-hessian, the entire presolve layer — including
auxiliary preprocessing — is silently disabled. The user sees a
single warning on stderr (pounce: disabling presolve — ...) and
the solve proceeds without any presolve transformation. This is
because the existing sensitivity / reduced-Hessian code paths
assume the IPM’s variable and row indices match the user’s
original .nl. Lifting this restriction is tracked separately
(pounce#19).
Caveat: nonconvex problems can land at a different local optimum
When the auxiliary pass eliminates a block, it pins the block’s
variables to a specific feasible point of the equality system —
the one Newton converges to from the probe point. On convex
problems this is the unique local optimum and the IPM would reach
the same values anyway. On nonconvex problems with multiple
feasible solutions to the equality system, the auxiliary pass may
fix variables to a feasible point in a different basin of
attraction than where the un-presolved IPM would eventually settle.
The full-space objective then differs between
presolve_auxiliary=yes and presolve_auxiliary=no, both
solutions remain feasible and locally optimal.
The vendored gaslib11_steady.nl benchmark in
crates/pounce-cli/tests/fixtures/aux_presolve/ exhibits exactly
this — presolve_auxiliary=yes converges to objective ≈ 1.825e-02
while the un-presolved path settles at ≈ 3.286e-02. Both points
satisfy the model’s KKT conditions; aux just lands in a different
basin. The regression test for gaslib11_steady deliberately does
NOT assert objective parity for this reason; the test name and
comments document the constraint.
If matching the un-presolved path’s local optimum is important
for your workflow, leave presolve_auxiliary=no until iterative
re-elimination or a multiple-basin-aware policy lands (tracked on
pounce#53).
Worked example
Run any of these to see the pipeline in action:
cargo run -p pounce-presolve --example pipeline_demo
cargo run -p pounce-presolve --example phase0_via_tnlp
The first runs the algorithmic pipeline directly on a hand-crafted
problem and prints each stage’s output. The second wraps a real
TNLP with presolve_auxiliary=yes and exercises the end-to-end
elimination + multiplier recovery.
References
- Issue tracking the port: pounce#53.
- Upstream: ripopt PR #32 by David Bernal Neira
(@bernalde). The
tutorial_flow_density{,_perturbed}.nlandgaslib11_steady.nlfixtures vendored intocrates/pounce-cli/tests/fixtures/aux_presolve/originate from that ripopt PR. - Design notes:
dev-notes/auxiliary-equality-preprocessing.mdin the pounce repo.
Troubleshooting Recipes
When a pounce solve fails, stalls, or settles for “acceptable” instead of “optimal”, the default options aren’t always the best fit. This page collects concrete, reproducible recipes that turn failures into successes (or improve already-successful solves) on real problems.
Each entry follows the same shape:
- When to try it — symptoms in the iter table or the final report that point to this knob.
- The knob — exact option(s) and CLI invocation.
- Worked example — before/after table on a named problem so you can verify the recipe reproduces on your machine.
A recipe earns a place on this page when there’s a named problem where it demonstrably helps. “Should help in theory” entries belong in the reference pages (Scaling, FBBT, Options), not here. If you find a new win, the contribution guide (CONTRIBUTING.md) walks through adding it.
Quick lookup by symptom
| Symptom | Recipe |
|---|---|
| Exit “Solved To Acceptable Level” but you need strict optimality | Ruiz linear-system scaling |
| Hundreds of small steps, slow convergence on a problem with loose bounds | FBBT on nonlinear constraints |
Search Direction is becoming Too Small early in the iter table | Ruiz linear-system scaling, then μ-strategy switch |
| Restoration phase fires repeatedly | ℓ₁ exact-penalty wrapper |
| Iterates wander on an LP-like / linearly constrained problem | mehrotra_algorithm=yes |
| Hundreds of iterations, monotone μ stair-steps slowly toward optimal | mu_strategy=adaptive |
| Iter count looks fine but seconds-per-iter is dominated by the linear solve on a hard QCQP / banded problem | feral_ordering=auto_race |
Presolve: bound-tightening and row drops
presolve=yes (start here)
The pounce presolve pipeline drops fixed variables, propagates bounds from linear rows, detects empty / redundant constraints, and warm-starts bound multipliers. It is off by default to match upstream Ipopt’s no-surprises behavior; turn it on for any non-trivial NLP.
pounce problem.nl presolve=yes
Cheap, almost always helpful, and a prerequisite for FBBT.
FBBT (feasibility-based bound tightening)
Interval propagation through the nonlinear constraint DAG to discover
variable bounds the user did not write down (x² + y² ≤ 1 ⇒
x ∈ [-1, 1], exp(x) ≤ 10 ⇒ x ≤ ln 10, etc.). Full reference
in Feasibility-Based Bound Tightening.
When to try it. Hundreds of small steps in the iter table, the
primal infeasibility stuck against a bound, or a problem that’s
clearly under-constrained from the modeler’s side. Requires a
structural-expression representation, which today means an .nl
input.
The knob.
pounce problem.nl presolve=yes presolve_fbbt=yes
Worked example — clnlbeam (Mittelmann):
presolve=yes | + presolve_fbbt=yes | |
|---|---|---|
| Exit status | Optimal Solution Found | Optimal Solution Found |
| Iterations | 552 | 65 |
| Wall time | 41.4 s | 8.2 s |
FBBT discovers tight nonlinear bounds the linear sweep missed; the IPM then has a much smaller feasibility gap to close and converges in roughly one-eighth the iterations.
Not every problem benefits. On corkscrw and arki0003 FBBT
produces no measurable change or a slight regression — the
infrastructure is cheap (one pass per constraint per outer sweep,
capped at fbbt_max_iter=10), so the worst case is a few percent of
extra presolve time.
Scaling
Full reference in Scaling. The two layers are independent.
Ruiz scaling on the augmented KKT system
When to try it. Exit status is “Solved To Acceptable Level” with
small step sizes near the end, or dual_inf plateaus several orders
above tol while primal feasibility is already at machine epsilon.
That pattern signals a poorly-conditioned KKT augmented matrix — the
back-solve loses the last few fractional digits the convergence check
needs.
The knob.
pounce problem.nl presolve=yes linear_system_scaling=ruiz \
linear_scaling_on_demand=no
linear_scaling_on_demand=no forces always-on Ruiz; the default
(yes) defers scaling until the linear solver flags an iterate as
poorly scaled. For diagnostic runs, force it on.
Worked example — nql180 (Mittelmann):
| default | + linear_system_scaling=ruiz | |
|---|---|---|
| Exit status | Solved To Acceptable Level | Optimal Solution Found |
| Iterations | 41 | 50 |
| Primal infeasibility | 4.0e-11 | 1.2e-15 |
| Dual infeasibility | 1.0e-5 | 3.1e-4 |
| Complementarity | 1.2e-9 | 9.9e-10 |
| Overall NLP error | 2.4e-7 | 9.9e-10 |
Symmetric ∞-norm equilibration improves primal feasibility by four
orders of magnitude and overall NLP error by ~3 orders, letting the
solver clear the strict tol gate. The extra nine iterations are
well spent. Resolves issue #25.
Worked example — WM_CFy (Mittelmann ampl-nlp, n=8709, m=12850):
| default | + linear_system_scaling=ruiz | |
|---|---|---|
| Exit status | Optimal Solution Found | Optimal Solution Found |
| Iterations | 605 | 241 |
| Wall time | ~2300 s | ~543 s |
| Overall NLP error | 3.4e-9 | 2.6e-9 |
A 4× wall-time speedup on a problem that previously sat in the “hard
W-B” bucket: every Ipopt + linear-solver combination tried in
issue #29 had failed
to converge within a 600 s budget. Ruiz wasn’t just an iteration-count
win — at 605 iters / 2300 s default-pounce was the only configuration
that even finished; Ruiz cuts that to under ten minutes. Same
underlying mechanism as nql180: the augmented KKT system is
ill-conditioned enough that the back-solve burns iterations chasing
residuals symmetric ∞-norm equilibration fixes in one preconditioning
pass.
Pairing mu_strategy=adaptive with Ruiz on this problem solves to a
~50× tighter NLP error (5e-11) but takes twice as long (491 iters,
1100 s). For a tighter solution at any cost, use both; for a fast
solve, Ruiz alone wins.
NLP-level scaling: when the default hurts
The gradient-based default at the NLP level is computed once at
x_0 and is sometimes the wrong fingerprint of the problem — for
instance when the starting point lives near a flat region of the
objective. If the IPM stalls with no clear infeasibility and the
unscaled gradients in the report look reasonable, try turning NLP
scaling off:
pounce problem.nl nlp_scaling_method=none
Or, if you know the natural units of your problem better than the
solver does, supply user-scaling (see Scaling for the
end-to-end recipe).
μ-strategy
Monotone vs. adaptive
Monotone (the default) decreases the barrier parameter μ in geometric steps; adaptive uses a quality-function oracle to pick each new μ based on the current iterate’s complementarity. Adaptive is more aggressive in well-conditioned regions and more conservative near degeneracy.
When to try it. Convex or nearly-convex problems where the monotone schedule wastes iterations stair-stepping toward a μ that the iterate clearly accepts; alternately, ill-conditioned problems where monotone overshoots and triggers restoration.
The knob.
pounce problem.nl mu_strategy=adaptive
Pair with mu_oracle=quality-function (the default) or
mu_oracle=probing for the Mehrotra-style affine probe.
Worked example — arki0009 (Mittelmann):
mu_strategy=monotone (default) | mu_strategy=adaptive | |
|---|---|---|
| Exit status | Optimal Solution Found | Optimal Solution Found |
| Iterations | 358 | 108 |
A 70 % iteration-count reduction with no quality regression. The quality-function oracle picks larger μ-decrements when the complementarity gap is well-balanced, skipping the slow stair-step that monotone is forced into on this instance.
nql180 is also rescued by mu_strategy=adaptive alone
(Acceptable → Optimal in 61 iters) — so for that problem you have a
choice between the Ruiz recipe (above) and the adaptive-μ recipe.
Ruiz gives a numerically cleaner solution (primal infeasibility
1.2e-15 vs ~5e-12); adaptive μ is one knob instead of two and has no
linear-system overhead.
Mehrotra predictor-corrector
For problems that are LP-like (linear or mildly nonlinear constraints, quadratic objective), the Mehrotra predictor-corrector mode short-circuits the filter line search and accepts every trial step:
pounce problem.nl mehrotra_algorithm=yes
This sets a Mehrotra-canonical configuration (adaptive_mu_globalization=never-monotone-mode,
accept_every_trial_step=yes, alpha_for_y=bound_mult, larger
bound_push and bound_mult_init_val). On well-conditioned LP-like
problems it routinely cuts iteration counts in half. On nonconvex
NLPs it can destabilize — see
issue #58 for the
trade-off discussion.
Restoration & ℓ₁ exact-penalty wrapper
When restoration fires repeatedly, the standard IPM is stuck on an infeasible subproblem the filter cannot accept. The ℓ₁ exact-penalty wrapper rephrases the constraints as an additive penalty term and solves a sequence of bound-constrained subproblems instead:
pounce problem.nl l1_exact_penalty_barrier=yes
Or, only invoke the wrapper as a fallback when standard restoration fails:
pounce problem.nl l1_fallback_on_restoration_failure=yes
This is the recipe for problems with rank-deficient constraints, ill-defined bounds at the starting point, or pathological LICQ violations — anywhere the filter’s history rules out feasibility restoration paths the wrapper can still find.
Worked example: certifying genuine infeasibility
The built-in infeasible-eq problem is the smallest fixture that
exercises the fallback end-to-end:
min x0^2 + x1^2
s.t. x0 + x1 = 1 (g0)
x0 + x1 = 2 (g1)
The two equalities are mutually contradictory, so no x exists with
||g(x)||_∞ = 0. The standard solve diagnoses this without the
wrapper:
$ pounce --problem infeasible-eq
...
EXIT: Converged to a point of local infeasibility. Problem may be infeasible.
That message is the filter giving up: it found an iterate where the constraint gradients are linearly dependent and no admissible step reduces infeasibility further. The output does not tell you whether the problem is genuinely infeasible or whether the filter rejected a feasible neighborhood that another method could reach. Re-run with the wrapper to find out:
$ pounce --problem infeasible-eq l1_fallback_on_restoration_failure=yes
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) ...
0 0.0000000e+00 2.00e+00 0.00e+00 -1.0 0.00e+00 - ...
1 1.1250000e+00 5.00e-01 4.22e-09 -1.0 7.50e-01 - ...
2r 1.1250000e+00 5.00e-01 9.99e+02 -0.3 0.00e+00 - ... ← restoration
...
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) ... ← second inner solve
0 3.0202000e+00 9.90e-03 0.00e+00 -1.0 0.00e+00 - ...
...
6 1.5000000e+00 2.22e-16 2.53e-14 -8.6 1.88e-06 - ... ← wrapper converges
in the slacked
problem
EXIT: Converged to a point of local infeasibility. Problem may be infeasible.
Read this trace carefully. The wrapper’s inner solve converges to
KKT tolerance on the slacked problem — inf_pr falls to 1e-16 in
six iterations because the added slack variables s+, s- absorb the
inconsistency g0 ≠ g1. But pounce reports the overall verdict on
the original constraints, so the final Constraint violation = 0.5
is unchanged: that’s the irreducible gap (g1 − g0)/2. Two
independent solvers (filter IPM and ℓ₁-penalty barrier) landing on
the same least-infeasible iterate, from different starting strategies,
is what makes this an infeasibility certificate rather than a
diagnosis of solver fragility.
The recipe in plain English:
- Standard solve says “local infeasibility” → may or may not be a real obstruction; could be filter history, LICQ degeneracy, or a bad starting point.
- Wrapper agrees on the same least-infeasible iterate → trust the certificate; reformulate the model.
- Wrapper promotes to
Solve_Succeeded→ the standard filter was rejecting a feasible neighborhood it could not reach; the model itself is fine.
Implementation note — running this case used to panic with
restoration factory invoked more than oncebecause the CLI wired a one-shot restoration factory into the application. The fix (pounce#24) routes through a multi-pass provider so the wrapper can mint a fresh restoration phase per inner solve. The regression test that guards it (crates/pounce-cli/tests/l1_fallback_no_panic.rs) uses this sameinfeasible-eqbuiltin.
Linear solver choice
linear_solver=ma57 (when built with HSL):
pounce problem.nl linear_solver=ma57
For problems that go many hundreds of iterations, the round-off chain of the inner sparse factorization matters — MUMPS, FERAL/SSIDS, and MA57 do not produce bitwise-identical iterates, and on the worst-case instances the difference can be the difference between convergence and a μ-reset spiral (issue #58, issue #64).
Pair with ma57_automatic_scaling=yes (default in HSL builds) and
leave linear_system_scaling=none — MA57’s internal scaling and a
pounce-level Ruiz pass should not be stacked.
FERAL ordering: when the adaptive dispatcher guesses wrong
When linear_solver=feral (the default) and per-iter wall time is
dominated by the linear solve — typical on dense / quadratically-
coupled KKT systems where iteration counts look reasonable but
seconds-per-iter are high — the fill-reducing ordering choice often
matters more than any other knob. By default, feral_ordering=auto
picks AMD / AMF / METIS from cheap pattern features. This is right
in the common case but can miss badly on a single hard problem.
The safe recipe is to measure the right ordering rather than guess:
pounce problem.nl feral_ordering=auto_race
This runs symbolic factorization on AMD, METIS, SCOTCH and KaHIP and
keeps the one with the smallest factor_nnz. Costs ~4× a single
symbolic pass — paid once per problem because symbolic factorization
is cached across numeric refactorizations with the same pattern, so
the overhead is invisible to the per-iter cost on anything but a
one-iter problem.
feral_ordering=amd (concrete pin) is the right escalation when the
race itself is showing AMD winning consistently — pinning skips the
race entirely on subsequent runs. See the full
feral_ordering table for the
other variants.
Diagnosing before you reach for a knob
Before trying recipes, dump the per-iter diagnostic categories that pounce supports:
pounce problem.nl --dump kkt --dump iterate \
--dump-dir /tmp/dump-problem
The dumps land as JSONL under /tmp/dump-problem/. Two categories
have wired dump sites today:
--dump kkt— KKT residuals and condition-number proxy; large values motivate Ruiz scaling.--dump iterate— primal/dual values; needed to spot whether a small step is bound-snapping or infeasibility-driven.
The
--dump muand--dump restocategories are accepted by the CLI but not yet wired to a dump site, so they currently emit no data. For the μ trajectory and restoration entries/exits, use the Studio queries below (which read the iteration stream from the solve report).
The Studio MCP (pounce-studio) wraps these dumps in higher-level
diagnostic queries (diagnose, find_stalls, restoration_windows),
which is the recommended workflow when iterating on options.
Logs, colors, and machine-readable output
POUNCE routes diagnostics through tracing.
The knobs are environment variables (see
Options › Logging and colored output),
not solver options.
When to try it
- You want more detail than the iteration table shows (which phase fired, why restoration triggered, linear-solver fallbacks).
- A downstream tool (Studio, CI) needs to parse per-iteration data.
- Color is garbling a log file, or you want color forced through a pipe.
The knobs
| Goal | Invocation |
|---|---|
| Verbose, everything | RUST_LOG=debug pounce problem.nl |
| Just the restoration phase | RUST_LOG=pounce::restoration=debug pounce problem.nl |
| Separate logs from results | pounce problem.nl > result.txt 2> solve.log |
| Plain text (no color) | NO_COLOR=1 pounce problem.nl |
| Force color through a pipe | `CLICOLOR_FORCE=1 pounce problem.nl |
| Line-delimited JSON iterations | POUNCE_LOG_FORMAT=json pounce problem.nl 2> iters.jsonl |
Logs go to stderr; the iteration table, final summary, and --dump
output are program output on stdout. The colored table uses a
tiger/rust theme — restoration lines get a kind-dependent background and
the row text reddens as the step length alpha shrinks, so a stalling or
restoration-heavy solve is visible at a glance. When stdout is not a
terminal (or NO_COLOR is set) the table is emitted as plain text with
the same column layout.
Contributing a new recipe
A recipe earns a place here when:
- There is a named, reproducible problem where the recipe
demonstrably helps. Mittelmann benchmark (
benchmarks/mittelmann/nl/) is preferred but any committed.nlworks. - The before/after numbers are captured at
print_level=3or higher and pasted into the worked-example table. - The recipe is not a special case of an existing one. (If your problem needs three knobs together, write one entry; if your problem benefits from a knob already documented here, file a PR to add a second worked example under that entry.)
Open a PR adding to this file with the table populated. The
maintainer-side review checks that the numbers reproduce against the
current main and that the recipe really is a recipe — not a
problem-specific accident.
Benchmarks
The benchmarks/ directory contains comparison harnesses that run
POUNCE against upstream Ipopt across several test suites: the Vanderbei
CUTE-in-AMPL collection, Mittelmann ampl-nlp, CHO parameter estimation,
GasLib pipelines, water-network design, electrolyte thermodynamics,
AC optimal power flow, and large-scale synthetic NLPs. Every suite is
.nl-driven — a directory of AMPL .nl files solved by both pounce
and ipopt.
Common targets:
make benchmark # full sweep: every suite + composite report
make benchmark-report # regenerate benchmarks/BENCHMARK_REPORT.md
make benchmark-cho # one suite at a time
make benchmark-gas
make benchmark-water
make benchmark-mittelmann
make benchmark-vanderbei # Vanderbei CUTE-in-AMPL collection (733 problems)
The benchmark inputs themselves — the .nl problem files — and the
per-run logs and JSON results are regenerated locally and not tracked in
the repository. See
benchmarks/README.md
for the full list and per-suite details.
Acknowledgments
POUNCE is a Rust port of Ipopt, the interior-point nonlinear programming solver by Andreas Wächter, Lorenz T. Biegler, and the COIN-OR community. Its algorithm, console output, and option semantics are modeled directly on that codebase, which is released under the EPL-2.0.
It is a sibling of ripopt, an earlier memory-safe interior-point NLP optimizer in Rust by the same author (DOI 10.5281/zenodo.19542664).
Contributors
- David Bernal Neira (@bernalde)
designed and prototyped the auxiliary-equality preprocessing pass
in ripopt PR #32.
POUNCE’s
pounce-presolve::auxiliaryPhase-0 orchestrator (issue #53) is a port of that work — Hopcroft-Karp matching, Dulmage-Mendelsohn partition, Tarjan SCC, block-triangular reduction, damped-Newton block solver, reduction frame with multiplier recovery — and ships with thetutorial_flow_density{,_perturbed}.nlandgaslib11_steady.nltest fixtures David vendored.
Key references
- Wächter, A., Biegler, L.T. “On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming.” Mathematical Programming 106(1), 25–57 (2006). DOI 10.1007/s10107-004-0559-y — the algorithm POUNCE implements.
- Wächter, A., Biegler, L.T. “Line search filter methods for nonlinear programming: Motivation and global convergence.” SIAM Journal on Optimization 16(1), 1–31 (2005). DOI 10.1137/S1052623403426556
- Wächter, A., Biegler, L.T. “Line search filter methods for nonlinear programming: Local convergence.” SIAM Journal on Optimization 16(1), 32–48 (2005). DOI 10.1137/S1052623403426544
- Fletcher, R., Leyffer, S. “Nonlinear programming without a penalty function.” Mathematical Programming 91(2), 239–269 (2002). DOI 10.1007/s101070100244 — the filter concept underlying the line search.
- Pirnay, H., López-Negrete, R., Biegler, L.T. “Optimal sensitivity
based on IPOPT.” Mathematical Programming Computation 4(4),
307–331 (2012). DOI
10.1007/s12532-012-0043-2
— the sIPOPT method behind
pounce-sensitivity. - Duff, I.S. “MA57—a code for the solution of sparse symmetric
definite and indefinite systems.” ACM Transactions on Mathematical
Software 30(2), 118–144 (2004). DOI
10.1145/992200.992202 — the
optional
ma57linear-solver backend. - Wilkinson, M.D. et al. “The FAIR Guiding Principles for scientific data management and stewardship.” Scientific Data 3, 160018 (2016). DOI 10.1038/sdata.2016.18 — the provenance model behind the JSON solve report.