Introduction
feral is a sparse symmetric indefinite direct solver written in pure Rust, with certified inertia counts. It is a clean-room implementation from published papers and BSD-licensed references, MIT-licensed, with no BLAS, LAPACK, or Fortran dependencies in the core solver.
This book is the narrative companion to the API reference. The book covers concepts, semantics, and design choices; the API reference covers types and functions.
What it does
- Sparse and dense symmetric indefinite factorization (LDLᵀ with Bunch–Kaufman pivoting).
- Certified inertia
(n_pos, n_neg, n_zero)alongside every factor. - Single and batched (multi-RHS) solves — factor once, solve against
many right-hand sides, with register-blocked panel kernels for wide
nrhs. - Iterative refinement and optional MC64 scaling for ill-conditioned and near-singular systems.
- Unsymmetric LU basis engine (
feral::lu) — a separate factorization family for revised-simplex bases, with cheap rank-1 column-replacement updates and warmftran/btransolves (no inertia). - Python bindings (
feral-solveron PyPI) andscipy.sparseinterop.
Where to go next
- Getting started — install, link, and run sparse, batched, and dense solves.
- Inertia semantics — what
Inertiameans and what feral guarantees on singular matrices. - Unsymmetric LU basis engine — the separate
feral::lufamily for simplex bases:P B Q = L U, rank-1 updates, and warm solves. - Python bindings — install, quickstart, and batched solves from Python.
- API reference — the rustdoc output for the
feralcrate and workspace members.
Project links
- Repository: https://github.com/jkitchin/feral
- Crate: https://crates.io/crates/feral
- Issue tracker: https://github.com/jkitchin/feral/issues
Getting started
Add feral to your Cargo.toml:
[dependencies]
feral = "0.10"
feral is primarily a sparse symmetric indefinite solver. The
ergonomic entry point is the Solver type: build a matrix,
factor it once, then solve it against one or many right-hand sides.
Sparse quickstart
use feral::Solver;
use feral::numeric::solver::FactorStatus;
use feral::sparse::csc::CscMatrix;
// Build a symmetric matrix from triplets of its LOWER triangle
// (row, col, value), `n` rows/cols. Duplicate entries are summed.
let n = 5;
let rows = [0, 1, 2, 3, 4, 1, 2, 3, 4];
let cols = [0, 0, 0, 0, 0, 1, 2, 3, 4];
let vals = [10.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
let a = CscMatrix::from_triplets(n, &rows, &cols, &vals)?;
// Factor once. Pass `None` to skip the inertia check, or
// `Some(expected)` to verify the matrix has the inertia you expect.
let mut solver = Solver::new();
let status = solver.factor(&a, None);
assert_eq!(status, FactorStatus::Success);
// The factorization carries a certified inertia (n_pos, n_neg, n_zero).
println!("inertia: {:?}", solver.inertia());
// Solve A·x = b for a single right-hand side.
let b = [1.0, 2.0, 3.0, 4.0, 5.0];
let x = solver.solve(&b)?;
solver.factor returns a FactorStatus; match on it to
distinguish Success, Singular, WrongInertia, and FatalError.
Once factored, the Solver keeps the factor, so any number of solves
reuse it.
Many right-hand sides (batched solve)
The same factorization can be solved against many right-hand sides at
once. solve_many shares the supernodal traversal across columns, so it
is substantially cheaper than looping a single-RHS solve — and for wide
nrhs it runs each supernode's dense panel as register-blocked
TRSM + GEMM kernels.
B and X are column-major n × nrhs matrices, stored as flat
slices of length n * nrhs (column c occupies [c*n .. (c+1)*n]):
let nrhs = 64;
// b_many[c*n + i] is row i of right-hand-side column c.
let b_many: Vec<f64> = make_columns(n, nrhs);
let x_many = solver.solve_many(&b_many, nrhs)?; // length n * nrhs
This is the path behind batched KKT back-solves, jax.jacrev over a
solve, sensitivity analysis, and parameter sweeps. On 2-D Laplacians it
is roughly 3–6× faster per RHS than looping single-RHS solves (the exact
factor depends on size, CPU SIMD width, and cache). See GitHub issue #57
and the 02_multi_rhs_batched
notebook.
Iterative refinement
With ZeroPivotAction::ForceAccept (the default), an unrefined solve can
leave a residual on near-singular pivots. solve_refined runs a few
steps of iterative refinement against the original matrix and returns
the best iterate:
let x = solver.solve_refined(&a, &b)?; // single RHS
let x_many = solver.solve_many_refined(&a, &b_many, nrhs)?; // batched
solve_many_refined keeps per-column best-iterate convergence but, for
wide right-hand sides, refines through the same batched panel kernel as
solve_many — one batched solve per refinement step over the still-
unconverged columns — so the refined path amortizes too (issue #58).
Dense path
For small dense systems there is a direct API that mirrors the sparse
one. factor returns the factors and the certified inertia:
use feral::{factor, solve, BunchKaufmanParams, SymmetricMatrix};
// Lower-triangle entries (row >= col) of an n×n symmetric matrix.
let a = SymmetricMatrix::from_lower_triangle(n, &entries);
let (factors, inertia) = factor(&a, &BunchKaufmanParams::default())?;
let x = solve(&factors, &b)?;
More
- Runnable Rust programs:
examples/exercises the dense and sparse paths, scaling, and refinement. - Python bindings and notebooks: see Python bindings.
- Inertia guarantees on singular matrices: see Inertia semantics.
Inertia semantics
Inertia is the triple (n_pos, n_neg, n_zero) — the number of positive,
negative, and zero eigenvalues of a symmetric matrix. feral reports an inertia
alongside every factorization.
Guarantees
- Non-singular matrices: inertia is exact.
- Singular matrices: on matrices where the canonical Fortran direct solvers (MUMPS 5.8.2 and SPRAL SSIDS) disagree on inertia, feral agrees with at least one of them.
The corpus consensus framework lives under
external_benchmarks/consensus/
and tags matrices with no 3-of-4-oracle agreement as excluded; those matrices
are not part of the inertia gate.
See Inertia in the API reference for the type itself.
Scaling
Symmetric scaling replaces the system A·x = b with a congruent system
that is better conditioned, then maps the solution back. feral applies a
single per-variable scaling vector s symmetrically:
(D·A·D)·(D⁻¹·x) = D·b , D = diag(s)
so the same vector s pre-scales the right-hand side and post-scales
the solution — the D⁻¹ cancels algebraically. Concretely, each matrix
entry a[i,j] is multiplied by s[i]·s[j] during frontal assembly, the
RHS is multiplied by s before the forward sweep, and the solution is
multiplied by s after the backward sweep. Good scaling pulls the
matrix entries toward unit magnitude, which improves pivot quality and
keeps the certified inertia trustworthy on KKT-shaped systems.
Choosing a strategy
The strategy is a feral::scaling::ScalingStrategy, set on the Solver:
use feral::Solver;
use feral::scaling::ScalingStrategy;
let solver = Solver::new().with_scaling(ScalingStrategy::Auto);
The variants:
| Variant | What it computes |
|---|---|
Auto (default) | Picks Mc64Symmetric or InfNorm per matrix; see below. |
InfNorm | Knight–Ruiz iterative ∞-norm equilibration. |
Mc64Symmetric | Matching-based (Duff–Koster) symmetric scaling. |
Identity | No scaling (all ones) — the fast path. |
External(Vec<f64>) | A scaling vector you supply, in user ordering. |
InfNorm — equilibration
InfNorm runs an iterative ∞-norm equilibration in the symmetry-preserving
style of Ruiz: it computes a diagonal D so that every row (and, by
symmetry, every column) of D·A·D has ∞-norm near one. It needs only a
handful of sparse mat-vec passes, requires no matching solve, and
preserves symmetry exactly. It is the right default for matrices that are
already roughly balanced.
See Ruiz (2001) and Knight, Ruiz & Uçar (2014) below.
Mc64Symmetric — matching-based
Mc64Symmetric builds the scaling from a maximum-weight bipartite
matching on the log-magnitude graph (the MC64 family). The asymmetric
dual variables u, v of the matching are symmetrized into a single
per-variable factor
s_i = exp((u_i + v_i) / 2)
which is exactly the scaling MUMPS and SPRAL SSIDS use for symmetric indefinite matrices. It is more expensive than equilibration (it solves a matching) but dramatically better on arrow-shaped KKT systems, where a few dense coupling rows otherwise dominate the norms.
See Duff & Koster (1999, 2001) and Duff & Pralet (2005) below.
Auto — adaptive routing
Auto (the default) inspects the sparsity pattern and routes to
Mc64Symmetric only when the matrix looks like an arrow/saddle-point
KKT — specifically when both:
- a large fraction of columns are structurally diagonal-only (slack columns), and
- at least one column is dense relative to the rest (an arrow head).
Otherwise it routes to InfNorm. This split matters: arrow-KKT matrices
can improve by one to two orders of magnitude under MC64, while banded
KKTs can regress under it, so the router keeps each on the strategy
that helps. Auto additionally applies fallback guards — if the matrix
is already well equilibrated, or if the matching produces a degenerate
(over-wide) scaling, it falls back to InfNorm.
The exact thresholds are implementation details that have been tuned against the benchmark corpus and may change between releases; treat
Autoas "feral picks a reasonable scaling for this matrix" and reach for an explicit variant only when you have measured a better choice.
References
Matching-based scaling (MC64 family):
- I. S. Duff and J. Koster. The Design and Use of Algorithms for Permuting Large Entries to the Diagonal of Sparse Matrices. SIAM J. Matrix Anal. Appl. 20(4):889–901, 1999. doi:10.1137/S0895479897317661
- I. S. Duff and J. Koster. On Algorithms for Permuting Large Entries to the Diagonal of a Sparse Matrix. SIAM J. Matrix Anal. Appl. 22(4):973–996, 2001. doi:10.1137/S0895479899358443
- I. S. Duff and S. Pralet. Strategies for Scaling and Pivoting for Sparse Symmetric Indefinite Problems. SIAM J. Matrix Anal. Appl. 27(2):313–340, 2005. doi:10.1137/04061043X
Equilibration (Ruiz / Knight–Ruiz family):
- D. Ruiz. A Scaling Algorithm to Equilibrate Both Rows and Columns Norms in Matrices. Tech. Report RAL-TR-2001-034, Rutherford Appleton Laboratory, 2001. PDF
- P. A. Knight, D. Ruiz, and B. Uçar. A Symmetry Preserving Algorithm for Matrix Scaling. SIAM J. Matrix Anal. Appl. 35(3):931–955, 2014. doi:10.1137/110825753
Fill-reducing ordering
Before factoring, feral permutes the matrix to reduce fill-in — the
new nonzeros created during elimination. A good ordering can change the
factor size and factorization time by orders of magnitude, so the choice
of ordering is the single most important tuning knob for large sparse
problems. All orderings are clean-room implementations in pure Rust
(workspace crates feral_amd, feral_amf, feral_metis,
feral_scotch, feral_kahip); none link the original C/Fortran
libraries.
Choosing a method
The method is a feral::symbolic::OrderingMethod, set on the Solver:
use feral::Solver;
use feral::symbolic::OrderingMethod;
let solver = Solver::new().with_ordering(OrderingMethod::Auto);
The variants:
| Variant | Algorithm |
|---|---|
Amd | Approximate Minimum Degree. |
Amf | Approximate Minimum Fill (HAMF, the MUMPS variant). |
MetisND | METIS-style multilevel nested dissection. |
ScotchND | Scotch-style nested dissection. |
KahipND | KaHIP flow-based nested dissection with data reduction. |
Auto | Adaptive: pick a method from cheap pattern metrics. |
AutoRace | Run several candidates and keep the smallest factor. |
Local (greedy) orderings
Amd— Approximate Minimum Degree eliminates the vertex of smallest approximate degree at each step, using a quotient-graph model with element absorption and supervariable detection. It is fast, low-memory, and the strongest general-purpose choice for small and medium matrices. This is the default forsymbolic_factorize.Amf— Approximate Minimum Fill scores by an estimate of the fill a pivot would create rather than its degree. It is the HAMF4 variant used by MUMPS, and tends to beat AMD slightly on the medium KKT systems that dominate interior-point workloads.
Nested dissection orderings
Nested dissection recursively finds a small vertex separator, numbers it last, and recurses on the two halves. It produces asymptotically better orderings than greedy methods on large, mesh-like graphs.
MetisND— multilevel nested dissection: coarsen the graph by heavy-edge matching, bisect at the coarsest level, then uncoarsen with Fiduccia–Mattheyses refinement, falling back to AMD on small leaves.ScotchND— multilevel nested dissection with graph compression and direct vertex-separator refinement.KahipND— flow-based nested dissection: it computes separators with max-flow/min-cut local improvement and applies exhaustive data reduction (degree-1/2, twin, and indistinguishable-vertex rules) before dissecting. Highest quality and highest setup cost; it is never chosen automatically.
Automatic selection
Autochooses a method from a few cheap pattern metrics: very large, very sparse graphs are sent toAmd(nested dissection's separator search does not pay off there), and everything else follows the size-based default —Amffor smaller matrices,MetisNDfor larger ones. The crossover and the "very large, very sparse" gate are tuned against the benchmark corpus and may change between releases.AutoRaceruns the symbolic analysis for several candidate orderings (Amd,MetisND,ScotchND,KahipND) and keeps the one whose factor is predicted smallest. It costs roughly the sum of those symbolic passes — worth it when one factorization is reused for many solves, since symbolic analysis is amortized and the numeric phase dominates.
Symbolic analysis (ordering + supernode detection) runs once and is cached on the
Solver; refactorizing the same pattern with new values reuses it. So an expensive ordering likeKahipNDorAutoRaceis cheap in amortized terms across a Newton run or a refactorization loop.
References
Greedy orderings:
- P. R. Amestoy, T. A. Davis, and I. S. Duff. An Approximate Minimum Degree Ordering Algorithm. SIAM J. Matrix Anal. Appl. 17(4):886–905, 1996. doi:10.1137/S0895479894278952
- P. R. Amestoy, T. A. Davis, and I. S. Duff. Algorithm 837: AMD, an Approximate Minimum Degree Ordering Algorithm. ACM Trans. Math. Softw. 30(3):381–388, 2004. doi:10.1145/1024074.1024081
- A. George and J. W. H. Liu. Computer Solution of Large Sparse Positive Definite Systems. Prentice-Hall, 1981. ISBN 0-13-165274-5. (Minimum-fill and elimination-tree foundations.)
Nested dissection:
- A. George. Nested Dissection of a Regular Finite Element Mesh. SIAM J. Numer. Anal. 10(2):345–363, 1973. doi:10.1137/0710032
- R. J. Lipton, D. J. Rose, and R. E. Tarjan. Generalized Nested Dissection. SIAM J. Numer. Anal. 16(2):346–358, 1979. doi:10.1137/0716027
- G. Karypis and V. Kumar. A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs. SIAM J. Sci. Comput. 20(1):359–392, 1998. doi:10.1137/S1064827595287997
- F. Pellegrini and J. Roman. Scotch: A Software Package for Static Mapping by Dual Recursive Bipartitioning of Process and Architecture Graphs. HPCN Europe, LNCS 1067:493–498, 1996. doi:10.1007/3-540-61142-8_588
- P. Sanders and C. Schulz. Engineering Multilevel Graph Partitioning Algorithms. ESA 2011, LNCS 6942:469–480. doi:10.1007/978-3-642-23719-5_40
- W. Ost, C. Schulz, and D. Strash. Engineering Data Reduction for Nested Dissection. ALENEX 2021, 113–127. doi:10.1137/1.9781611976472.9
Refinement primitive:
- C. M. Fiduccia and R. M. Mattheyses. A Linear-Time Heuristic for Improving Network Partitions. 19th Design Automation Conference, 175–181, 1982. doi:10.1109/DAC.1982.1585498
Unsymmetric LU basis engine
Most of feral is a symmetric indefinite LDLᵀ solver with certified inertia.
The feral::lu module is a separate factorization family: an unsymmetric
LU built to drive a revised-simplex basis. It is additive — the LDLᵀ solver
and every one of its code paths are untouched — and it deliberately does not
compute inertia (an unsymmetric basis has no symmetric eigenvalue structure to
certify).
The distinguishing requirement is not the one-shot factor/solve but the rank-1
update: a simplex iteration replaces one basic column, and the engine folds
that change into the existing factors in O(nnz) rather than refactoring.
The factorization
For a square nonsingular basis B (m × m), feral computes
P B Q = L U
where P is a row permutation (threshold partial pivoting, for stability), Q
is a fill-reducing column permutation (sparse path only; Q = I on the dense
path), L is unit lower triangular, and U is upper triangular. The two hot
operations of revised simplex follow directly:
ftran: solve B x = a (forward transformation, B⁻¹a)
btran: solve Bᵀ x = a (backward transformation, B⁻ᵀa)
Dense path
For the small bases that dominate (e.g. OBBT bases on a handful of variables),
DenseLu factors a general column-major matrix with right-looking
LU and threshold partial pivoting:
use feral::{DenseLu, LuParams};
// `cols[j]` is column j of the m×m basis (length m each).
let m = 3;
let cols = vec![
vec![2.0, 4.0, 8.0],
vec![1.0, 3.0, 7.0],
vec![1.0, 3.0, 9.0],
];
let mut lu = DenseLu::factor(&cols, m, LuParams::default())?;
// ftran / btran overwrite the right-hand side in place.
let mut x = vec![1.0, 2.0, 3.0];
lu.ftran(&mut x)?; // x ← B⁻¹ x
let mut y = vec![1.0, 0.0, 0.0];
lu.btran(&mut y)?; // y ← B⁻ᵀ y
Sparse path
For larger bases, SparseLu is a left-looking Gilbert–Peierls LU
with an output-sensitive depth-first reach (so the factor cost is O(flops),
not O(n²)). The fill-reducing column order is a reusable symbolic handle
computed by SparseLuSymbolic — feral's in-tree AMD run on the
AᵀA (column-intersection) pattern, a stand-in for COLAMD. Because the pattern
is invariant under the row permutation and scaling, the same order is valid for
numerically different but structurally identical bases:
use feral::{SparseColMatrix, SparseLu, SparseLuSymbolic, LuParams};
let b: SparseColMatrix = /* general CSC basis */;
let symbolic = SparseLuSymbolic::analyze(&b)?; // reusable across refactors
let mut lu = SparseLu::factor(&b, &symbolic, LuParams::default())?;
let mut x = vec![/* … length m … */];
lu.ftran(&mut x)?;
should_use_dense_lu(m, nnz, ¶ms) mirrors the symmetric router: tiny bases
go dense unconditionally, small dense-enough bases go dense by a density gate,
and the rest go sparse.
Rank-1 column-replacement update
This is the reason the engine exists. update (and update_sparse, which takes
the entering column already in sparse form) replaces one basic column and folds
the change into the factors in place:
let leaving_slot = 2; // basis column to evict
let entering = vec![0.5, 0.0, 1.5]; // new column aₙₑw
match lu.update(leaving_slot, &entering) {
Ok(()) => { /* factors now reflect the new basis */ }
Err(feral::FeralError::NeedsRefactor) => {
// Budget or stability limit reached; `lu` is unchanged.
lu.refactor(&b, &symbolic)?; // (DenseLu::refactor takes the new columns)
}
Err(e) => return Err(e),
}
- Dense updates use Bartels–Golub re-triangularization (spike → cyclic
column shift to upper-Hessenberg → Gauss sweep, folded into
L/U). - Sparse updates use a Forrest–Tomlin / Bartels–Golub–Reid scheme: the bump
is re-triangularized by sparse Gaussian elimination with partial pivoting,
recorded as a replayable eta and applied between the
L- andU-solves. The work is bump-local, so a warm solve after a localized update stays sparse instead of degrading toward a full re-solve.
An update returns NeedsRefactor — leaving the factorization
unchanged — when the update count (max_updates) or growth monitor
(max_growth) trips, so the caller can refactor rather than accept an unstable
factor. A vanished bump pivot returns
SingularBasis so the simplex can repair the basis instead of
receiving a garbage solve.
Robustness: scaling and refinement
The wrong-answer bugs that historically bit downstream simplex code were all
scaling/tolerance, never the update math — so the robustness layer is
load-bearing. LuParams::scaling selects a two-sided strategy:
LuScaling::InfNorm— Knight–Ruiz ∞-norm equilibration (separate row/column scalings).LuScaling::Mc64— unsymmetric MC64 (max-weight bipartite matching) that places large entries on the diagonal, with a partial-matching fall back toInfNorm.
Scaling wraps the core solve; the factorization factors the scaled matrix
D_row Π B D_col. For ill-conditioned bases, ftran_refined / btran_refined
run residual-based iterative refinement against the original basis:
let params = LuParams {
scaling: feral::LuScaling::Mc64,
refine_steps: 2,
refine_tol: 1e-14,
..LuParams::default()
};
let mut lu = SparseLu::factor(&b, &symbolic, params)?;
let mut x = a.clone();
lu.ftran_refined(&b, &mut x)?; // drives the true residual ‖Bx − a‖ down
Scope
The LU engine is a Rust API today; it is not exposed through the
Python bindings yet (no inertia, different update-centric
surface). The downstream BasisEngine integration and reference (UMFPACK/KLU)
benchmarks are tracked as later phases — see dev/plans/unsymmetric-lu-epic.md
in the repository.
Python bindings
feral ships Python bindings as the feral-solver
package (import name feral). They wrap the same Rust solver, so the
certified inertia and the batched multi-RHS performance carry over.
Install
pip install feral-solver # plain
pip install 'feral-solver[scipy]' # with scipy.sparse adapters
uv add feral-solver # via uv
Wheels cover CPython 3.10+ on Linux x86_64/aarch64, macOS universal2, and Windows x86_64.
Quickstart
import numpy as np
import feral
A = feral.CscMatrix.from_dense(np.array([
[4.0, 1.0, 0.0],
[1.0, 3.0, 2.0],
[0.0, 2.0, 5.0],
]))
solver = feral.Solver()
status, inertia = solver.factor(A)
assert status == feral.FactorStatus.SUCCESS
print(inertia) # Inertia(n_pos=3, n_neg=0, n_zero=0)
b = np.array([1.0, 2.0, 3.0])
x = solver.solve(b)
print(np.allclose(A.symv(x), b)) # True
Batched multi-RHS solves
Pass Solver.solve a 2-D (n, nrhs) array and it returns an
(n, nrhs) solution, solving every column against the one shared
factorization. For wide nrhs this engages the register-blocked panel
kernels automatically — no API change, just a 2-D right-hand side:
B = np.random.default_rng(0).standard_normal((n, nrhs))
X = solver.solve(B) # (n, nrhs), one batched call
# X[:, j] == solver.solve(B[:, j]) to machine precision
On 2-D Laplacians this is roughly 3–6× faster per RHS than looping the
single-RHS solve. The
02_multi_rhs_batched
notebook walks through a steady-state heat-conduction example with a
correctness check, a visualization, and a looped-vs-batched timing.
Iterative refinement batches too: pass a 2-D B to
solver.solve_refined(A, B) and each refinement step runs through the
panel kernel over the still-unconverged columns (issue #58), so the
refined path — the default for KKT back-solves — is ~2–3× faster per RHS
than looping the single-RHS refined solve. The batched path amortizes
the solves; the per-column residual B − A·X is still un-batched, so
on dense Hessians (where the matrix–vector product rivals the solve) the
gain is smaller. The notebook's final section measures it.
scipy.sparse interop
import scipy.sparse as sp
A = feral.from_scipy(A_scipy, symmetric="full") # reads the lower triangle
# ... factor, solve ...
A_back = feral.to_scipy(A) # round-trips to scipy
Unsymmetric LU basis engine
LuFactor factors a general (unsymmetric) square matrix and solves
A x = b (ftran) and Aᵀ y = c (btran), with simplex-style
product-form column updates. It auto-routes between a dense and a
sparse engine via the same should_use_dense_lu heuristic the Rust core
uses; pass force_dense=True/False to override the routing.
import numpy as np
import feral
A = np.array([[2.0, 1.0, 0.0],
[0.0, 3.0, 1.0],
[1.0, 0.0, 4.0]])
lu = feral.LuFactor(feral.LuMatrix.from_dense(A))
x = lu.ftran(np.array([1.0, 2.0, 3.0])) # solve A x = b
y = lu.btran(np.array([1.0, 0.0, 0.0])) # solve Aᵀ y = c
lu.update(1, np.array([0.0, 5.0, 1.0])) # replace basis column 1
print(lu.updates_since_refactor) # 1
# Factor identity P A Q = L U :
# A[lu.perm][:, lu.qcol] == lu.l_array() @ lu.u_array()
A singular basis raises SingularBasisError (a subclass of
FactorError, so existing except FactorError handlers keep working);
an exhausted update budget raises NeedsRefactorError — recover by
calling lu.refactor(new_matrix), which resets
updates_since_refactor to 0.
Factor access and introspection
After Solver.factor, the assembled factor and its statistics are
available without re-solving. All of the following are additive — the
original factor/solve surface is unchanged.
s = feral.Solver(ordering="amd", profiling=True)
s.factor(A)
fac = s.factors() # Factors snapshot (None before factor)
indptr, indices, data = fac.l_csc() # unit-lower L as CSC (factorization order)
d_diag, d_subdiag = fac.d_blocks() # block-diagonal D (2×2 where d_subdiag != 0)
L_scipy = fac.to_scipy_l() # optional scipy.sparse.csc_matrix
# Reconstruction identity (in factorization order):
# L @ D @ Lᵀ == P (S A S) Pᵀ
# using fac.perm and the per-row fac.scaling vector.
stats = s.last_factor_stats() # nnz_a, nnz_l, fill_ratio, inertia, pivots
print(s.min_pivot_magnitude, s.max_pivot_magnitude)
print(s.scaling_info.kind) # "applied" | "mc64_fallback_to_infnorm" | ...
print(s.profile_report()) # populated only when profiling=True
Solver.symbolic() — and the standalone feral.analyze(A, ordering=...),
which runs no numeric factorization — return a SymbolicAnalysis
exposing the resolved ordering, perm/perm_inv, num_supernodes,
factor_nnz_estimate (a slack-inflated upper bound on the realized
factor_nnz), col_counts, and the elimination-tree etree_parent
array (roots marked by root_sentinel).
New Solver(...) keyword arguments — all optional, each defaulting to
the prior behavior — expose the tuning knobs: ordering ("amd",
"amf", "metis", "scotch", "kahip", "auto", "auto_race"),
mc64_cache, profiling, partial_singular_warning, and
auto_cascade_break. Counters mc64_fallback_count,
mc64_scaling_fallback_count, mc64_retry_attempt_count, and
mc64_cache_hit_count round out the introspection surface, alongside
invalidate_factors() / invalidate_symbolic_cache().
Conversion conveniences
A = feral.CscMatrix.from_dense(M, triangle="lower") # or "upper" / "full"
dense = A.to_dense() # full symmetric 2-D array
indptr, indices = A.symmetric_pattern() # full structural pattern
from_dense's triangle keyword defaults to "lower" (unchanged
behavior); "upper" mirrors the upper triangle and "full" reads the
whole array.
Interior-point methods
feral.ipm.KktSolver wraps Solver with the Wächter–Biegler 2006 §3.1
perturbation-escalation loop and caches the symbolic analysis across a
Newton run. See the
Python README
and the 01–05 example notebooks for the full surface — 05 walks
through the LU engine, factor/symbolic access, and introspection added
in 0.11.0.
API reference
The rustdoc-generated API reference is published alongside this book:
feral— the main crate.feral_amd— clean-room AMD ordering.feral_amf— AMF ordering.feral_metis— METIS ordering wrapper.feral_scotch— Scotch ordering wrapper.feral_kahip— KaHIP ordering wrapper.feral_ordering_core— shared ordering types.
If a link is missing, it usually means the workspace member did not produce rustdoc on the most recent deploy — check the Pages workflow.