Convexity detection#
discopt ships a convexity detector that walks the expression DAG of
a Model and labels every subexpression as CONVEX, CONCAVE, AFFINE,
or UNKNOWN. The downstream solver uses these labels to gate αBB
convexification, decide when outer-approximation cuts apply, and choose
between convex and spatial branching strategies.
This notebook documents how the detector works and what guarantees it does — and does not — provide.
Soundness, not completeness#
The detector is sound by construction: a CONVEX or CONCAVE
verdict is a mathematical proof derived from the disciplined convex
programming composition rule [Grant et al., 2006] and the atom table of
Boyd and Vandenberghe [2004], §3.1. The detector is deliberately incomplete — when
no rule applies it returns UNKNOWN rather than guess. An UNKNOWN
verdict is a conservative label, not a claim of nonconvexity.
The design follows the SUSPECT detector of Ceccon et al. [2020], which demonstrated that a syntactic walk over the expression tree, augmented with sign propagation, recovers the majority of textbook convex MINLP structure at negligible cost.
Three lattices#
The walker propagates three lattices up the DAG; each node publishes a tuple that downstream rules consume.
Curvature —
CONVEX/CONCAVE/AFFINE/UNKNOWN.Sign —
NEG/NONPOS/ZERO/NONNEG/POS/UNKNOWN, derived from variable bounds, parameter values, or algebraic structure (e.g.,exp(·)is alwaysPOS).Monotonicity —
NONDEC/NONINC/CONST/UNKNOWN. This one is not propagated up the tree; it is a property of outer atoms, consulted via an atom table at composition sites.
The DCP composition rule#
For a composite expression \(h(x) = f(g(x))\) the rule is [Grant et al., 2006]:
\(h\) is convex when \(f\) is convex and either
\(g\) is affine, or
\(f\) is nondecreasing and \(g\) is convex, or
\(f\) is nonincreasing and \(g\) is concave.
The concave rule is symmetric.
Sign propagation matters because many atoms’ monotonicity depends on the
sign of the argument: |x| is nondecreasing on \(x \ge 0\) and
nonincreasing on \(x \le 0\). Without the sign lattice, these atoms cannot
be used in the composition rule at all.
Worked examples#
import discopt.modeling as dm
from discopt._jax.convexity import classify_expr, classify_model
from discopt.modeling.core import FunctionCall, Model
m = Model("example")
x = m.continuous("x", lb=0.1, ub=10.0)
y = m.continuous("y", lb=-5.0, ub=5.0)
print("x**2 + exp(y):", classify_expr(x**2 + dm.exp(y), m))
print("log(x): ", classify_expr(dm.log(x), m))
print("1/x: ", classify_expr(1.0 / x, m))
print("x*y: ", classify_expr(x * y, m))
print("max(x**2,y**2):", classify_expr(FunctionCall("max", x**2, y**2), m))
x**2 + exp(y): Curvature.CONVEX
log(x): Curvature.CONCAVE
1/x: Curvature.CONVEX
x*y: Curvature.UNKNOWN
max(x**2,y**2): Curvature.CONVEX
1/x is detected as CONVEX because the sign lattice proves the
denominator is strictly positive; on a mixed-sign domain the same
expression returns UNKNOWN (the reciprocal is undefined through zero).
max(x**2, y**2) is classified by the n-ary rule: the pointwise maximum
of convex functions is convex.
A sign-aware reciprocal trap#
The reciprocal rule must look at the curvature of its argument, not
just its sign. For the logistic denominator \(1 + e^{-x}\) the sign is
provably positive, but the denominator itself is convex; composing a
CONVEX, NONINC outer with a convex inner does not satisfy the DCP
rule, so the sound verdict is UNKNOWN:
z = m.continuous("z", lb=-5.0, ub=5.0)
sigmoid = 1.0 / (1.0 + dm.exp(-z))
classify_expr(sigmoid, m)
<Curvature.UNKNOWN: 'unknown'>
This is the correct answer — the sigmoid has inflection points at \(z = \pm\infty\) and is neither globally convex nor concave.
Model-level classification#
classify_model(model) returns (is_convex, per_constraint_mask). A
maximization problem is “convex” when the objective body is concave or
affine, reflecting the equivalence \(\max f \equiv \min (-f)\).
m2 = Model("convex_nlp")
x = m2.continuous("x", lb=0.1, ub=10.0)
y = m2.continuous("y", lb=0.1, ub=10.0)
m2.maximize(dm.log(x) + dm.log(y)) # concave body -> convex problem
m2.subject_to(x + y <= 10)
is_cvx, mask = classify_model(m2)
print("is_convex:", is_cvx)
print("constraint mask:", mask)
is_convex: True
constraint mask: [True]
How soundness is verified#
Every verdict the detector can emit is checked by two independent numerical oracles in the test suite:
Jensen-inequality fuzz. For each
CONVEXverdict, random sample pairs \((x_1, x_2)\) in the box and a random \(\lambda \in [0, 1]\) are tested against $\(f(\lambda x_1 + (1-\lambda) x_2) \le \lambda f(x_1) + (1-\lambda) f(x_2).\)$ The symmetric inequality is used forCONCAVE. A single violation is a detector bug.Numerical Hessian spectrum. At sampled points the Hessian is computed via JAX automatic differentiation and its eigenvalues are inspected. A negative eigenvalue on a
CONVEXverdict disproves the claim [Boyd and Vandenberghe, 2004], §3.1.4.
Sound box-local certificate#
The syntactic detector is cheap and sound, but incomplete: many
expressions that are convex on a specific box go unrecognised because
no composition rule fires. certify_convex complements the syntactic
walker with a numerical certificate that is also sound — any CONVEX or
CONCAVE verdict it returns is backed by a proof on the box — but can
recognise cases the rules miss.
The certificate chains three interval-arithmetic layers:
Outward-rounded interval arithmetic [Moore, 1966, Neumaier, 1990] on scalar and elementary functions. Every operation nudges its endpoints outward by one ULP so floating-point roundoff never breaks the enclosure.
Interval forward-mode AD [Griewank and Walther, 2008] over the expression DAG. A triple
(value, gradient, hessian)is propagated where every entry is itself an interval; the Hessian matrix \(\mathbf{H}\) encloses the pointwise Hessian \(\nabla^2 f(x)\) for every \(x\) in the box.Interval Gershgorin eigenvalue bound [Gershgorin, 1931]. For a symmetric interval matrix, the minimum eigenvalue over every concrete realisation is bounded below by \(\min_i \bigl(\inf \mathbf{H}_{ii} - \sum_{j\ne i} \max |\mathbf{H}_{ij}|\bigr)\), computed with outward rounding. A non-negative bound is a proof that every \(\nabla^2 f(x)\) on the box is PSD; a tighter Hertz–Rohn vertex enumeration [Hertz, 1992, Rohn, 1994] is available as a follow-up.
The certificate strictly never contradicts a syntactic verdict: it either agrees or abstains. Its value is in box-local reasoning — when FBBT [Belotti et al., 2009] or branching has tightened the variable box, expressions that were UNKNOWN at the root can become provably convex locally.
from discopt._jax.convexity import certify_convex
from discopt._jax.convexity.interval import Interval
m3 = Model("certificate")
x = m3.continuous("x", lb=-3.0, ub=3.0)
expr = x**4 - 2.0 * x**2
print("Syntactic walker: ", classify_expr(expr, m3))
print("Certificate (root box):", certify_convex(expr, m3))
narrow = {x: Interval.from_bounds(1.0, 2.0)}
print("Certificate (tight box):", certify_convex(expr, m3, box=narrow))
Syntactic walker: Curvature.UNKNOWN
Certificate (root box): None
Certificate (tight box): Curvature.CONVEX
On the wide root box the expression \(x^4 - 2x^2\) is not convex — it
has inflection points at \(x = \pm 1/\sqrt{2}\). The syntactic walker
stays UNKNOWN and the certificate abstains. On the tighter box
\([1, 2]\), well away from the inflection, the interval Hessian proves
PSD and certify_convex returns CONVEX — a proof, not a guess.
References#
Grant et al. [2006] — disciplined convex programming; the composition rule implemented by the syntactic walker.
Boyd and Vandenberghe [2004], §3.1 — atom table and second-order conditions.
Ceccon et al. [2020] — SUSPECT detector; the reference implementation this module is aligned with.
Moore [1966], Neumaier [1990] — interval analysis foundations for the sound certificate.
Griewank and Walther [2008] — forward-mode automatic differentiation.
Gershgorin [1931] — eigenvalue localisation used by the sound λ_min bound; Hertz [1992], Rohn [1994] give the tighter vertex-enumeration alternative.
Adjiman et al. [1998], Adjiman et al. [1998] — αBB convexification and the interval-Hessian theory this certificate operationalises.
Belotti et al. [2009] — FBBT bounds tightening used to sharpen sign information and shrink the certificate’s box.