Python bindings

feral ships Python bindings as the feral-solver package (import name feral). They wrap the same Rust solver, so the certified inertia and the batched multi-RHS performance carry over.

Install

pip install feral-solver           # plain
pip install 'feral-solver[scipy]'  # with scipy.sparse adapters
uv add feral-solver                # via uv

Wheels cover CPython 3.10+ on Linux x86_64/aarch64, macOS universal2, and Windows x86_64.

Quickstart

import numpy as np
import feral

A = feral.CscMatrix.from_dense(np.array([
    [4.0, 1.0, 0.0],
    [1.0, 3.0, 2.0],
    [0.0, 2.0, 5.0],
]))

solver = feral.Solver()
status, inertia = solver.factor(A)
assert status == feral.FactorStatus.SUCCESS
print(inertia)                       # Inertia(n_pos=3, n_neg=0, n_zero=0)

b = np.array([1.0, 2.0, 3.0])
x = solver.solve(b)
print(np.allclose(A.symv(x), b))     # True

Batched multi-RHS solves

Pass Solver.solve a 2-D (n, nrhs) array and it returns an (n, nrhs) solution, solving every column against the one shared factorization. For wide nrhs this engages the register-blocked panel kernels automatically — no API change, just a 2-D right-hand side:

B = np.random.default_rng(0).standard_normal((n, nrhs))
X = solver.solve(B)                  # (n, nrhs), one batched call
# X[:, j] == solver.solve(B[:, j]) to machine precision

On 2-D Laplacians this is roughly 3–6× faster per RHS than looping the single-RHS solve. The 02_multi_rhs_batched notebook walks through a steady-state heat-conduction example with a correctness check, a visualization, and a looped-vs-batched timing.

Iterative refinement batches too: pass a 2-D B to solver.solve_refined(A, B) and each refinement step runs through the panel kernel over the still-unconverged columns (issue #58), so the refined path — the default for KKT back-solves — is ~2–3× faster per RHS than looping the single-RHS refined solve. The batched path amortizes the solves; the per-column residual B − A·X is still un-batched, so on dense Hessians (where the matrix–vector product rivals the solve) the gain is smaller. The notebook's final section measures it.

scipy.sparse interop

import scipy.sparse as sp

A = feral.from_scipy(A_scipy, symmetric="full")   # reads the lower triangle
# ... factor, solve ...
A_back = feral.to_scipy(A)                         # round-trips to scipy

Unsymmetric LU basis engine

LuFactor factors a general (unsymmetric) square matrix and solves A x = b (ftran) and Aᵀ y = c (btran), with simplex-style product-form column updates. It auto-routes between a dense and a sparse engine via the same should_use_dense_lu heuristic the Rust core uses; pass force_dense=True/False to override the routing.

import numpy as np
import feral

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])

lu = feral.LuFactor(feral.LuMatrix.from_dense(A))
x = lu.ftran(np.array([1.0, 2.0, 3.0]))     # solve A x = b
y = lu.btran(np.array([1.0, 0.0, 0.0]))     # solve Aᵀ y = c

lu.update(1, np.array([0.0, 5.0, 1.0]))     # replace basis column 1
print(lu.updates_since_refactor)            # 1

# Factor identity  P A Q = L U :
#   A[lu.perm][:, lu.qcol] == lu.l_array() @ lu.u_array()

A singular basis raises SingularBasisError (a subclass of FactorError, so existing except FactorError handlers keep working); an exhausted update budget raises NeedsRefactorError — recover by calling lu.refactor(new_matrix), which resets updates_since_refactor to 0.

Factor access and introspection

After Solver.factor, the assembled factor and its statistics are available without re-solving. All of the following are additive — the original factor/solve surface is unchanged.

s = feral.Solver(ordering="amd", profiling=True)
s.factor(A)

fac = s.factors()                       # Factors snapshot (None before factor)
indptr, indices, data = fac.l_csc()     # unit-lower L as CSC (factorization order)
d_diag, d_subdiag = fac.d_blocks()      # block-diagonal D (2×2 where d_subdiag != 0)
L_scipy = fac.to_scipy_l()              # optional scipy.sparse.csc_matrix

# Reconstruction identity (in factorization order):
#   L @ D @ Lᵀ  ==  P (S A S) Pᵀ
# using fac.perm and the per-row fac.scaling vector.

stats = s.last_factor_stats()           # nnz_a, nnz_l, fill_ratio, inertia, pivots
print(s.min_pivot_magnitude, s.max_pivot_magnitude)
print(s.scaling_info.kind)              # "applied" | "mc64_fallback_to_infnorm" | ...
print(s.profile_report())               # populated only when profiling=True

Solver.symbolic() — and the standalone feral.analyze(A, ordering=...), which runs no numeric factorization — return a SymbolicAnalysis exposing the resolved ordering, perm/perm_inv, num_supernodes, factor_nnz_estimate (a slack-inflated upper bound on the realized factor_nnz), col_counts, and the elimination-tree etree_parent array (roots marked by root_sentinel).

New Solver(...) keyword arguments — all optional, each defaulting to the prior behavior — expose the tuning knobs: ordering ("amd", "amf", "metis", "scotch", "kahip", "auto", "auto_race"), mc64_cache, profiling, partial_singular_warning, and auto_cascade_break. Counters mc64_fallback_count, mc64_scaling_fallback_count, mc64_retry_attempt_count, and mc64_cache_hit_count round out the introspection surface, alongside invalidate_factors() / invalidate_symbolic_cache().

Conversion conveniences

A = feral.CscMatrix.from_dense(M, triangle="lower")   # or "upper" / "full"
dense = A.to_dense()                                  # full symmetric 2-D array
indptr, indices = A.symmetric_pattern()               # full structural pattern

from_dense's triangle keyword defaults to "lower" (unchanged behavior); "upper" mirrors the upper triangle and "full" reads the whole array.

Interior-point methods

feral.ipm.KktSolver wraps Solver with the Wächter–Biegler 2006 §3.1 perturbation-escalation loop and caches the symbolic analysis across a Newton run. See the Python README and the 0105 example notebooks for the full surface — 05 walks through the LU engine, factor/symbolic access, and introspection added in 0.11.0.