Introduction

feral is a sparse symmetric indefinite direct solver written in pure Rust, with certified inertia counts. It is a clean-room implementation from published papers and BSD-licensed references, MIT-licensed, with no BLAS, LAPACK, or Fortran dependencies in the core solver.

This book is the narrative companion to the API reference. The book covers concepts, semantics, and design choices; the API reference covers types and functions.

What it does

  • Sparse and dense symmetric indefinite factorization (LDLᵀ with Bunch–Kaufman pivoting).
  • Certified inertia (n_pos, n_neg, n_zero) alongside every factor.
  • Single and batched (multi-RHS) solves — factor once, solve against many right-hand sides, with register-blocked panel kernels for wide nrhs.
  • Iterative refinement and optional MC64 scaling for ill-conditioned and near-singular systems.
  • Unsymmetric LU basis engine (feral::lu) — a separate factorization family for revised-simplex bases, with cheap rank-1 column-replacement updates and warm ftran/btran solves (no inertia).
  • Python bindings (feral-solver on PyPI) and scipy.sparse interop.

Where to go next

  • Getting started — install, link, and run sparse, batched, and dense solves.
  • Inertia semantics — what Inertia means and what feral guarantees on singular matrices.
  • Unsymmetric LU basis engine — the separate feral::lu family for simplex bases: P B Q = L U, rank-1 updates, and warm solves.
  • Python bindings — install, quickstart, and batched solves from Python.
  • API reference — the rustdoc output for the feral crate and workspace members.

Getting started

Add feral to your Cargo.toml:

[dependencies]
feral = "0.10"

feral is primarily a sparse symmetric indefinite solver. The ergonomic entry point is the Solver type: build a matrix, factor it once, then solve it against one or many right-hand sides.

Sparse quickstart

use feral::Solver;
use feral::numeric::solver::FactorStatus;
use feral::sparse::csc::CscMatrix;

// Build a symmetric matrix from triplets of its LOWER triangle
// (row, col, value), `n` rows/cols. Duplicate entries are summed.
let n = 5;
let rows = [0, 1, 2, 3, 4, 1, 2, 3, 4];
let cols = [0, 0, 0, 0, 0, 1, 2, 3, 4];
let vals = [10.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
let a = CscMatrix::from_triplets(n, &rows, &cols, &vals)?;

// Factor once. Pass `None` to skip the inertia check, or
// `Some(expected)` to verify the matrix has the inertia you expect.
let mut solver = Solver::new();
let status = solver.factor(&a, None);
assert_eq!(status, FactorStatus::Success);

// The factorization carries a certified inertia (n_pos, n_neg, n_zero).
println!("inertia: {:?}", solver.inertia());

// Solve A·x = b for a single right-hand side.
let b = [1.0, 2.0, 3.0, 4.0, 5.0];
let x = solver.solve(&b)?;

solver.factor returns a FactorStatus; match on it to distinguish Success, Singular, WrongInertia, and FatalError. Once factored, the Solver keeps the factor, so any number of solves reuse it.

Many right-hand sides (batched solve)

The same factorization can be solved against many right-hand sides at once. solve_many shares the supernodal traversal across columns, so it is substantially cheaper than looping a single-RHS solve — and for wide nrhs it runs each supernode's dense panel as register-blocked TRSM + GEMM kernels.

B and X are column-major n × nrhs matrices, stored as flat slices of length n * nrhs (column c occupies [c*n .. (c+1)*n]):

let nrhs = 64;
// b_many[c*n + i] is row i of right-hand-side column c.
let b_many: Vec<f64> = make_columns(n, nrhs);
let x_many = solver.solve_many(&b_many, nrhs)?;   // length n * nrhs

This is the path behind batched KKT back-solves, jax.jacrev over a solve, sensitivity analysis, and parameter sweeps. On 2-D Laplacians it is roughly 3–6× faster per RHS than looping single-RHS solves (the exact factor depends on size, CPU SIMD width, and cache). See GitHub issue #57 and the 02_multi_rhs_batched notebook.

Iterative refinement

With ZeroPivotAction::ForceAccept (the default), an unrefined solve can leave a residual on near-singular pivots. solve_refined runs a few steps of iterative refinement against the original matrix and returns the best iterate:

let x = solver.solve_refined(&a, &b)?;                       // single RHS
let x_many = solver.solve_many_refined(&a, &b_many, nrhs)?;  // batched

solve_many_refined keeps per-column best-iterate convergence but, for wide right-hand sides, refines through the same batched panel kernel as solve_many — one batched solve per refinement step over the still- unconverged columns — so the refined path amortizes too (issue #58).

Dense path

For small dense systems there is a direct API that mirrors the sparse one. factor returns the factors and the certified inertia:

use feral::{factor, solve, BunchKaufmanParams, SymmetricMatrix};

// Lower-triangle entries (row >= col) of an n×n symmetric matrix.
let a = SymmetricMatrix::from_lower_triangle(n, &entries);
let (factors, inertia) = factor(&a, &BunchKaufmanParams::default())?;
let x = solve(&factors, &b)?;

More

  • Runnable Rust programs: examples/ exercises the dense and sparse paths, scaling, and refinement.
  • Python bindings and notebooks: see Python bindings.
  • Inertia guarantees on singular matrices: see Inertia semantics.

Inertia semantics

Inertia is the triple (n_pos, n_neg, n_zero) — the number of positive, negative, and zero eigenvalues of a symmetric matrix. feral reports an inertia alongside every factorization.

Guarantees

  • Non-singular matrices: inertia is exact.
  • Singular matrices: on matrices where the canonical Fortran direct solvers (MUMPS 5.8.2 and SPRAL SSIDS) disagree on inertia, feral agrees with at least one of them.

The corpus consensus framework lives under external_benchmarks/consensus/ and tags matrices with no 3-of-4-oracle agreement as excluded; those matrices are not part of the inertia gate.

See Inertia in the API reference for the type itself.

Scaling

Symmetric scaling replaces the system A·x = b with a congruent system that is better conditioned, then maps the solution back. feral applies a single per-variable scaling vector s symmetrically:

    (D·A·D)·(D⁻¹·x) = D·b ,   D = diag(s)

so the same vector s pre-scales the right-hand side and post-scales the solution — the D⁻¹ cancels algebraically. Concretely, each matrix entry a[i,j] is multiplied by s[i]·s[j] during frontal assembly, the RHS is multiplied by s before the forward sweep, and the solution is multiplied by s after the backward sweep. Good scaling pulls the matrix entries toward unit magnitude, which improves pivot quality and keeps the certified inertia trustworthy on KKT-shaped systems.

Choosing a strategy

The strategy is a feral::scaling::ScalingStrategy, set on the Solver:

use feral::Solver;
use feral::scaling::ScalingStrategy;

let solver = Solver::new().with_scaling(ScalingStrategy::Auto);

The variants:

VariantWhat it computes
Auto (default)Picks Mc64Symmetric or InfNorm per matrix; see below.
InfNormKnight–Ruiz iterative ∞-norm equilibration.
Mc64SymmetricMatching-based (Duff–Koster) symmetric scaling.
IdentityNo scaling (all ones) — the fast path.
External(Vec<f64>)A scaling vector you supply, in user ordering.

InfNorm — equilibration

InfNorm runs an iterative ∞-norm equilibration in the symmetry-preserving style of Ruiz: it computes a diagonal D so that every row (and, by symmetry, every column) of D·A·D has ∞-norm near one. It needs only a handful of sparse mat-vec passes, requires no matching solve, and preserves symmetry exactly. It is the right default for matrices that are already roughly balanced.

See Ruiz (2001) and Knight, Ruiz & Uçar (2014) below.

Mc64Symmetric — matching-based

Mc64Symmetric builds the scaling from a maximum-weight bipartite matching on the log-magnitude graph (the MC64 family). The asymmetric dual variables u, v of the matching are symmetrized into a single per-variable factor

    s_i = exp((u_i + v_i) / 2)

which is exactly the scaling MUMPS and SPRAL SSIDS use for symmetric indefinite matrices. It is more expensive than equilibration (it solves a matching) but dramatically better on arrow-shaped KKT systems, where a few dense coupling rows otherwise dominate the norms.

See Duff & Koster (1999, 2001) and Duff & Pralet (2005) below.

Auto — adaptive routing

Auto (the default) inspects the sparsity pattern and routes to Mc64Symmetric only when the matrix looks like an arrow/saddle-point KKT — specifically when both:

  • a large fraction of columns are structurally diagonal-only (slack columns), and
  • at least one column is dense relative to the rest (an arrow head).

Otherwise it routes to InfNorm. This split matters: arrow-KKT matrices can improve by one to two orders of magnitude under MC64, while banded KKTs can regress under it, so the router keeps each on the strategy that helps. Auto additionally applies fallback guards — if the matrix is already well equilibrated, or if the matching produces a degenerate (over-wide) scaling, it falls back to InfNorm.

The exact thresholds are implementation details that have been tuned against the benchmark corpus and may change between releases; treat Auto as "feral picks a reasonable scaling for this matrix" and reach for an explicit variant only when you have measured a better choice.

References

Matching-based scaling (MC64 family):

  • I. S. Duff and J. Koster. The Design and Use of Algorithms for Permuting Large Entries to the Diagonal of Sparse Matrices. SIAM J. Matrix Anal. Appl. 20(4):889–901, 1999. doi:10.1137/S0895479897317661
  • I. S. Duff and J. Koster. On Algorithms for Permuting Large Entries to the Diagonal of a Sparse Matrix. SIAM J. Matrix Anal. Appl. 22(4):973–996, 2001. doi:10.1137/S0895479899358443
  • I. S. Duff and S. Pralet. Strategies for Scaling and Pivoting for Sparse Symmetric Indefinite Problems. SIAM J. Matrix Anal. Appl. 27(2):313–340, 2005. doi:10.1137/04061043X

Equilibration (Ruiz / Knight–Ruiz family):

  • D. Ruiz. A Scaling Algorithm to Equilibrate Both Rows and Columns Norms in Matrices. Tech. Report RAL-TR-2001-034, Rutherford Appleton Laboratory, 2001. PDF
  • P. A. Knight, D. Ruiz, and B. Uçar. A Symmetry Preserving Algorithm for Matrix Scaling. SIAM J. Matrix Anal. Appl. 35(3):931–955, 2014. doi:10.1137/110825753

Fill-reducing ordering

Before factoring, feral permutes the matrix to reduce fill-in — the new nonzeros created during elimination. A good ordering can change the factor size and factorization time by orders of magnitude, so the choice of ordering is the single most important tuning knob for large sparse problems. All orderings are clean-room implementations in pure Rust (workspace crates feral_amd, feral_amf, feral_metis, feral_scotch, feral_kahip); none link the original C/Fortran libraries.

Choosing a method

The method is a feral::symbolic::OrderingMethod, set on the Solver:

use feral::Solver;
use feral::symbolic::OrderingMethod;

let solver = Solver::new().with_ordering(OrderingMethod::Auto);

The variants:

VariantAlgorithm
AmdApproximate Minimum Degree.
AmfApproximate Minimum Fill (HAMF, the MUMPS variant).
MetisNDMETIS-style multilevel nested dissection.
ScotchNDScotch-style nested dissection.
KahipNDKaHIP flow-based nested dissection with data reduction.
AutoAdaptive: pick a method from cheap pattern metrics.
AutoRaceRun several candidates and keep the smallest factor.

Local (greedy) orderings

  • Amd — Approximate Minimum Degree eliminates the vertex of smallest approximate degree at each step, using a quotient-graph model with element absorption and supervariable detection. It is fast, low-memory, and the strongest general-purpose choice for small and medium matrices. This is the default for symbolic_factorize.
  • Amf — Approximate Minimum Fill scores by an estimate of the fill a pivot would create rather than its degree. It is the HAMF4 variant used by MUMPS, and tends to beat AMD slightly on the medium KKT systems that dominate interior-point workloads.

Nested dissection orderings

Nested dissection recursively finds a small vertex separator, numbers it last, and recurses on the two halves. It produces asymptotically better orderings than greedy methods on large, mesh-like graphs.

  • MetisND — multilevel nested dissection: coarsen the graph by heavy-edge matching, bisect at the coarsest level, then uncoarsen with Fiduccia–Mattheyses refinement, falling back to AMD on small leaves.
  • ScotchND — multilevel nested dissection with graph compression and direct vertex-separator refinement.
  • KahipND — flow-based nested dissection: it computes separators with max-flow/min-cut local improvement and applies exhaustive data reduction (degree-1/2, twin, and indistinguishable-vertex rules) before dissecting. Highest quality and highest setup cost; it is never chosen automatically.

Automatic selection

  • Auto chooses a method from a few cheap pattern metrics: very large, very sparse graphs are sent to Amd (nested dissection's separator search does not pay off there), and everything else follows the size-based default — Amf for smaller matrices, MetisND for larger ones. The crossover and the "very large, very sparse" gate are tuned against the benchmark corpus and may change between releases.
  • AutoRace runs the symbolic analysis for several candidate orderings (Amd, MetisND, ScotchND, KahipND) and keeps the one whose factor is predicted smallest. It costs roughly the sum of those symbolic passes — worth it when one factorization is reused for many solves, since symbolic analysis is amortized and the numeric phase dominates.

Symbolic analysis (ordering + supernode detection) runs once and is cached on the Solver; refactorizing the same pattern with new values reuses it. So an expensive ordering like KahipND or AutoRace is cheap in amortized terms across a Newton run or a refactorization loop.

References

Greedy orderings:

  • P. R. Amestoy, T. A. Davis, and I. S. Duff. An Approximate Minimum Degree Ordering Algorithm. SIAM J. Matrix Anal. Appl. 17(4):886–905, 1996. doi:10.1137/S0895479894278952
  • P. R. Amestoy, T. A. Davis, and I. S. Duff. Algorithm 837: AMD, an Approximate Minimum Degree Ordering Algorithm. ACM Trans. Math. Softw. 30(3):381–388, 2004. doi:10.1145/1024074.1024081
  • A. George and J. W. H. Liu. Computer Solution of Large Sparse Positive Definite Systems. Prentice-Hall, 1981. ISBN 0-13-165274-5. (Minimum-fill and elimination-tree foundations.)

Nested dissection:

  • A. George. Nested Dissection of a Regular Finite Element Mesh. SIAM J. Numer. Anal. 10(2):345–363, 1973. doi:10.1137/0710032
  • R. J. Lipton, D. J. Rose, and R. E. Tarjan. Generalized Nested Dissection. SIAM J. Numer. Anal. 16(2):346–358, 1979. doi:10.1137/0716027
  • G. Karypis and V. Kumar. A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs. SIAM J. Sci. Comput. 20(1):359–392, 1998. doi:10.1137/S1064827595287997
  • F. Pellegrini and J. Roman. Scotch: A Software Package for Static Mapping by Dual Recursive Bipartitioning of Process and Architecture Graphs. HPCN Europe, LNCS 1067:493–498, 1996. doi:10.1007/3-540-61142-8_588
  • P. Sanders and C. Schulz. Engineering Multilevel Graph Partitioning Algorithms. ESA 2011, LNCS 6942:469–480. doi:10.1007/978-3-642-23719-5_40
  • W. Ost, C. Schulz, and D. Strash. Engineering Data Reduction for Nested Dissection. ALENEX 2021, 113–127. doi:10.1137/1.9781611976472.9

Refinement primitive:

  • C. M. Fiduccia and R. M. Mattheyses. A Linear-Time Heuristic for Improving Network Partitions. 19th Design Automation Conference, 175–181, 1982. doi:10.1109/DAC.1982.1585498

Unsymmetric LU basis engine

Most of feral is a symmetric indefinite LDLᵀ solver with certified inertia. The feral::lu module is a separate factorization family: an unsymmetric LU built to drive a revised-simplex basis. It is additive — the LDLᵀ solver and every one of its code paths are untouched — and it deliberately does not compute inertia (an unsymmetric basis has no symmetric eigenvalue structure to certify).

The distinguishing requirement is not the one-shot factor/solve but the rank-1 update: a simplex iteration replaces one basic column, and the engine folds that change into the existing factors in O(nnz) rather than refactoring.

The factorization

For a square nonsingular basis B (m × m), feral computes

P B Q = L U

where P is a row permutation (threshold partial pivoting, for stability), Q is a fill-reducing column permutation (sparse path only; Q = I on the dense path), L is unit lower triangular, and U is upper triangular. The two hot operations of revised simplex follow directly:

ftran:  solve  B x = a      (forward transformation,  B⁻¹a)
btran:  solve  Bᵀ x = a     (backward transformation, B⁻ᵀa)

Dense path

For the small bases that dominate (e.g. OBBT bases on a handful of variables), DenseLu factors a general column-major matrix with right-looking LU and threshold partial pivoting:

use feral::{DenseLu, LuParams};

// `cols[j]` is column j of the m×m basis (length m each).
let m = 3;
let cols = vec![
    vec![2.0, 4.0, 8.0],
    vec![1.0, 3.0, 7.0],
    vec![1.0, 3.0, 9.0],
];
let mut lu = DenseLu::factor(&cols, m, LuParams::default())?;

// ftran / btran overwrite the right-hand side in place.
let mut x = vec![1.0, 2.0, 3.0];
lu.ftran(&mut x)?;          // x ← B⁻¹ x
let mut y = vec![1.0, 0.0, 0.0];
lu.btran(&mut y)?;          // y ← B⁻ᵀ y

Sparse path

For larger bases, SparseLu is a left-looking Gilbert–Peierls LU with an output-sensitive depth-first reach (so the factor cost is O(flops), not O(n²)). The fill-reducing column order is a reusable symbolic handle computed by SparseLuSymbolic — feral's in-tree AMD run on the AᵀA (column-intersection) pattern, a stand-in for COLAMD. Because the pattern is invariant under the row permutation and scaling, the same order is valid for numerically different but structurally identical bases:

use feral::{SparseColMatrix, SparseLu, SparseLuSymbolic, LuParams};

let b: SparseColMatrix = /* general CSC basis */;
let symbolic = SparseLuSymbolic::analyze(&b)?;   // reusable across refactors
let mut lu = SparseLu::factor(&b, &symbolic, LuParams::default())?;

let mut x = vec![/* … length m … */];
lu.ftran(&mut x)?;

should_use_dense_lu(m, nnz, &params) mirrors the symmetric router: tiny bases go dense unconditionally, small dense-enough bases go dense by a density gate, and the rest go sparse.

Rank-1 column-replacement update

This is the reason the engine exists. update (and update_sparse, which takes the entering column already in sparse form) replaces one basic column and folds the change into the factors in place:

let leaving_slot = 2;                 // basis column to evict
let entering = vec![0.5, 0.0, 1.5];   // new column aₙₑw
match lu.update(leaving_slot, &entering) {
    Ok(()) => { /* factors now reflect the new basis */ }
    Err(feral::FeralError::NeedsRefactor) => {
        // Budget or stability limit reached; `lu` is unchanged.
        lu.refactor(&b, &symbolic)?;  // (DenseLu::refactor takes the new columns)
    }
    Err(e) => return Err(e),
}
  • Dense updates use Bartels–Golub re-triangularization (spike → cyclic column shift to upper-Hessenberg → Gauss sweep, folded into L/U).
  • Sparse updates use a Forrest–Tomlin / Bartels–Golub–Reid scheme: the bump is re-triangularized by sparse Gaussian elimination with partial pivoting, recorded as a replayable eta and applied between the L- and U-solves. The work is bump-local, so a warm solve after a localized update stays sparse instead of degrading toward a full re-solve.

An update returns NeedsRefactor — leaving the factorization unchanged — when the update count (max_updates) or growth monitor (max_growth) trips, so the caller can refactor rather than accept an unstable factor. A vanished bump pivot returns SingularBasis so the simplex can repair the basis instead of receiving a garbage solve.

Robustness: scaling and refinement

The wrong-answer bugs that historically bit downstream simplex code were all scaling/tolerance, never the update math — so the robustness layer is load-bearing. LuParams::scaling selects a two-sided strategy:

  • LuScaling::InfNorm — Knight–Ruiz ∞-norm equilibration (separate row/column scalings).
  • LuScaling::Mc64 — unsymmetric MC64 (max-weight bipartite matching) that places large entries on the diagonal, with a partial-matching fall back to InfNorm.

Scaling wraps the core solve; the factorization factors the scaled matrix D_row Π B D_col. For ill-conditioned bases, ftran_refined / btran_refined run residual-based iterative refinement against the original basis:

let params = LuParams {
    scaling: feral::LuScaling::Mc64,
    refine_steps: 2,
    refine_tol: 1e-14,
    ..LuParams::default()
};
let mut lu = SparseLu::factor(&b, &symbolic, params)?;
let mut x = a.clone();
lu.ftran_refined(&b, &mut x)?;   // drives the true residual ‖Bx − a‖ down

Scope

The LU engine is a Rust API today; it is not exposed through the Python bindings yet (no inertia, different update-centric surface). The downstream BasisEngine integration and reference (UMFPACK/KLU) benchmarks are tracked as later phases — see dev/plans/unsymmetric-lu-epic.md in the repository.

Python bindings

feral ships Python bindings as the feral-solver package (import name feral). They wrap the same Rust solver, so the certified inertia and the batched multi-RHS performance carry over.

Install

pip install feral-solver           # plain
pip install 'feral-solver[scipy]'  # with scipy.sparse adapters
uv add feral-solver                # via uv

Wheels cover CPython 3.10+ on Linux x86_64/aarch64, macOS universal2, and Windows x86_64.

Quickstart

import numpy as np
import feral

A = feral.CscMatrix.from_dense(np.array([
    [4.0, 1.0, 0.0],
    [1.0, 3.0, 2.0],
    [0.0, 2.0, 5.0],
]))

solver = feral.Solver()
status, inertia = solver.factor(A)
assert status == feral.FactorStatus.SUCCESS
print(inertia)                       # Inertia(n_pos=3, n_neg=0, n_zero=0)

b = np.array([1.0, 2.0, 3.0])
x = solver.solve(b)
print(np.allclose(A.symv(x), b))     # True

Batched multi-RHS solves

Pass Solver.solve a 2-D (n, nrhs) array and it returns an (n, nrhs) solution, solving every column against the one shared factorization. For wide nrhs this engages the register-blocked panel kernels automatically — no API change, just a 2-D right-hand side:

B = np.random.default_rng(0).standard_normal((n, nrhs))
X = solver.solve(B)                  # (n, nrhs), one batched call
# X[:, j] == solver.solve(B[:, j]) to machine precision

On 2-D Laplacians this is roughly 3–6× faster per RHS than looping the single-RHS solve. The 02_multi_rhs_batched notebook walks through a steady-state heat-conduction example with a correctness check, a visualization, and a looped-vs-batched timing.

Iterative refinement batches too: pass a 2-D B to solver.solve_refined(A, B) and each refinement step runs through the panel kernel over the still-unconverged columns (issue #58), so the refined path — the default for KKT back-solves — is ~2–3× faster per RHS than looping the single-RHS refined solve. The batched path amortizes the solves; the per-column residual B − A·X is still un-batched, so on dense Hessians (where the matrix–vector product rivals the solve) the gain is smaller. The notebook's final section measures it.

scipy.sparse interop

import scipy.sparse as sp

A = feral.from_scipy(A_scipy, symmetric="full")   # reads the lower triangle
# ... factor, solve ...
A_back = feral.to_scipy(A)                         # round-trips to scipy

Unsymmetric LU basis engine

LuFactor factors a general (unsymmetric) square matrix and solves A x = b (ftran) and Aᵀ y = c (btran), with simplex-style product-form column updates. It auto-routes between a dense and a sparse engine via the same should_use_dense_lu heuristic the Rust core uses; pass force_dense=True/False to override the routing.

import numpy as np
import feral

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])

lu = feral.LuFactor(feral.LuMatrix.from_dense(A))
x = lu.ftran(np.array([1.0, 2.0, 3.0]))     # solve A x = b
y = lu.btran(np.array([1.0, 0.0, 0.0]))     # solve Aᵀ y = c

lu.update(1, np.array([0.0, 5.0, 1.0]))     # replace basis column 1
print(lu.updates_since_refactor)            # 1

# Factor identity  P A Q = L U :
#   A[lu.perm][:, lu.qcol] == lu.l_array() @ lu.u_array()

A singular basis raises SingularBasisError (a subclass of FactorError, so existing except FactorError handlers keep working); an exhausted update budget raises NeedsRefactorError — recover by calling lu.refactor(new_matrix), which resets updates_since_refactor to 0.

Factor access and introspection

After Solver.factor, the assembled factor and its statistics are available without re-solving. All of the following are additive — the original factor/solve surface is unchanged.

s = feral.Solver(ordering="amd", profiling=True)
s.factor(A)

fac = s.factors()                       # Factors snapshot (None before factor)
indptr, indices, data = fac.l_csc()     # unit-lower L as CSC (factorization order)
d_diag, d_subdiag = fac.d_blocks()      # block-diagonal D (2×2 where d_subdiag != 0)
L_scipy = fac.to_scipy_l()              # optional scipy.sparse.csc_matrix

# Reconstruction identity (in factorization order):
#   L @ D @ Lᵀ  ==  P (S A S) Pᵀ
# using fac.perm and the per-row fac.scaling vector.

stats = s.last_factor_stats()           # nnz_a, nnz_l, fill_ratio, inertia, pivots
print(s.min_pivot_magnitude, s.max_pivot_magnitude)
print(s.scaling_info.kind)              # "applied" | "mc64_fallback_to_infnorm" | ...
print(s.profile_report())               # populated only when profiling=True

Solver.symbolic() — and the standalone feral.analyze(A, ordering=...), which runs no numeric factorization — return a SymbolicAnalysis exposing the resolved ordering, perm/perm_inv, num_supernodes, factor_nnz_estimate (a slack-inflated upper bound on the realized factor_nnz), col_counts, and the elimination-tree etree_parent array (roots marked by root_sentinel).

New Solver(...) keyword arguments — all optional, each defaulting to the prior behavior — expose the tuning knobs: ordering ("amd", "amf", "metis", "scotch", "kahip", "auto", "auto_race"), mc64_cache, profiling, partial_singular_warning, and auto_cascade_break. Counters mc64_fallback_count, mc64_scaling_fallback_count, mc64_retry_attempt_count, and mc64_cache_hit_count round out the introspection surface, alongside invalidate_factors() / invalidate_symbolic_cache().

Conversion conveniences

A = feral.CscMatrix.from_dense(M, triangle="lower")   # or "upper" / "full"
dense = A.to_dense()                                  # full symmetric 2-D array
indptr, indices = A.symmetric_pattern()               # full structural pattern

from_dense's triangle keyword defaults to "lower" (unchanged behavior); "upper" mirrors the upper triangle and "full" reads the whole array.

Interior-point methods

feral.ipm.KktSolver wraps Solver with the Wächter–Biegler 2006 §3.1 perturbation-escalation loop and caches the symbolic analysis across a Newton run. See the Python README and the 0105 example notebooks for the full surface — 05 walks through the LU engine, factor/symbolic access, and introspection added in 0.11.0.

API reference

The rustdoc-generated API reference is published alongside this book:

If a link is missing, it usually means the workspace member did not produce rustdoc on the most recent deploy — check the Pages workflow.