Convexity detection#

discopt ships a convexity detector that walks the expression DAG of a Model and labels every subexpression as CONVEX, CONCAVE, AFFINE, or UNKNOWN. The downstream solver uses these labels to gate αBB convexification, decide when outer-approximation cuts apply, and choose between convex and spatial branching strategies.

This notebook documents how the detector works and what guarantees it does — and does not — provide.

Soundness, not completeness#

The detector is sound by construction: a CONVEX or CONCAVE verdict is a mathematical proof derived from the disciplined convex programming composition rule [Grant et al., 2006] and the atom table of Boyd and Vandenberghe [2004], §3.1. The detector is deliberately incomplete — when no rule applies it returns UNKNOWN rather than guess. An UNKNOWN verdict is a conservative label, not a claim of nonconvexity.

The design follows the SUSPECT detector of Ceccon et al. [2020], which demonstrated that a syntactic walk over the expression tree, augmented with sign propagation, recovers the majority of textbook convex MINLP structure at negligible cost.

Three lattices#

The walker propagates three lattices up the DAG; each node publishes a tuple that downstream rules consume.

  • CurvatureCONVEX / CONCAVE / AFFINE / UNKNOWN.

  • SignNEG / NONPOS / ZERO / NONNEG / POS / UNKNOWN, derived from variable bounds, parameter values, or algebraic structure (e.g., exp(·) is always POS).

  • MonotonicityNONDEC / NONINC / CONST / UNKNOWN. This one is not propagated up the tree; it is a property of outer atoms, consulted via an atom table at composition sites.

The DCP composition rule#

For a composite expression \(h(x) = f(g(x))\) the rule is [Grant et al., 2006]:

  • \(h\) is convex when \(f\) is convex and either

    • \(g\) is affine, or

    • \(f\) is nondecreasing and \(g\) is convex, or

    • \(f\) is nonincreasing and \(g\) is concave.

  • The concave rule is symmetric.

Sign propagation matters because many atoms’ monotonicity depends on the sign of the argument: |x| is nondecreasing on \(x \ge 0\) and nonincreasing on \(x \le 0\). Without the sign lattice, these atoms cannot be used in the composition rule at all.

Worked examples#

import discopt.modeling as dm
from discopt._jax.convexity import classify_expr, classify_model
from discopt.modeling.core import FunctionCall, Model

m = Model("example")
x = m.continuous("x", lb=0.1, ub=10.0)
y = m.continuous("y", lb=-5.0, ub=5.0)

print("x**2 + exp(y):", classify_expr(x**2 + dm.exp(y), m))
print("log(x):        ", classify_expr(dm.log(x), m))
print("1/x:           ", classify_expr(1.0 / x, m))
print("x*y:           ", classify_expr(x * y, m))
print("max(x**2,y**2):", classify_expr(FunctionCall("max", x**2, y**2), m))
x**2 + exp(y): Curvature.CONVEX
log(x):         Curvature.CONCAVE
1/x:            Curvature.CONVEX
x*y:            Curvature.UNKNOWN
max(x**2,y**2): Curvature.CONVEX

1/x is detected as CONVEX because the sign lattice proves the denominator is strictly positive; on a mixed-sign domain the same expression returns UNKNOWN (the reciprocal is undefined through zero). max(x**2, y**2) is classified by the n-ary rule: the pointwise maximum of convex functions is convex.

A sign-aware reciprocal trap#

The reciprocal rule must look at the curvature of its argument, not just its sign. For the logistic denominator \(1 + e^{-x}\) the sign is provably positive, but the denominator itself is convex; composing a CONVEX, NONINC outer with a convex inner does not satisfy the DCP rule, so the sound verdict is UNKNOWN:

z = m.continuous("z", lb=-5.0, ub=5.0)
sigmoid = 1.0 / (1.0 + dm.exp(-z))
classify_expr(sigmoid, m)
<Curvature.UNKNOWN: 'unknown'>

This is the correct answer — the sigmoid has inflection points at \(z = \pm\infty\) and is neither globally convex nor concave.

Model-level classification#

classify_model(model) returns (is_convex, per_constraint_mask). A maximization problem is “convex” when the objective body is concave or affine, reflecting the equivalence \(\max f \equiv \min (-f)\).

m2 = Model("convex_nlp")
x = m2.continuous("x", lb=0.1, ub=10.0)
y = m2.continuous("y", lb=0.1, ub=10.0)
m2.maximize(dm.log(x) + dm.log(y))  # concave body -> convex problem
m2.subject_to(x + y <= 10)

is_cvx, mask = classify_model(m2)
print("is_convex:", is_cvx)
print("constraint mask:", mask)
is_convex: True
constraint mask: [True]

How soundness is verified#

Every verdict the detector can emit is checked by two independent numerical oracles in the test suite:

  1. Jensen-inequality fuzz. For each CONVEX verdict, random sample pairs \((x_1, x_2)\) in the box and a random \(\lambda \in [0, 1]\) are tested against $\(f(\lambda x_1 + (1-\lambda) x_2) \le \lambda f(x_1) + (1-\lambda) f(x_2).\)$ The symmetric inequality is used for CONCAVE. A single violation is a detector bug.

  2. Numerical Hessian spectrum. At sampled points the Hessian is computed via JAX automatic differentiation and its eigenvalues are inspected. A negative eigenvalue on a CONVEX verdict disproves the claim [Boyd and Vandenberghe, 2004], §3.1.4.

Sound box-local certificate#

The syntactic detector is cheap and sound, but incomplete: many expressions that are convex on a specific box go unrecognised because no composition rule fires. certify_convex complements the syntactic walker with a numerical certificate that is also sound — any CONVEX or CONCAVE verdict it returns is backed by a proof on the box — but can recognise cases the rules miss.

The certificate chains three interval-arithmetic layers:

  1. Outward-rounded interval arithmetic [Moore, 1966, Neumaier, 1990] on scalar and elementary functions. Every operation nudges its endpoints outward by one ULP so floating-point roundoff never breaks the enclosure.

  2. Interval forward-mode AD [Griewank and Walther, 2008] over the expression DAG. A triple (value, gradient, hessian) is propagated where every entry is itself an interval; the Hessian matrix \(\mathbf{H}\) encloses the pointwise Hessian \(\nabla^2 f(x)\) for every \(x\) in the box.

  3. Interval Gershgorin eigenvalue bound [Gershgorin, 1931]. For a symmetric interval matrix, the minimum eigenvalue over every concrete realisation is bounded below by \(\min_i \bigl(\inf \mathbf{H}_{ii} - \sum_{j\ne i} \max |\mathbf{H}_{ij}|\bigr)\), computed with outward rounding. A non-negative bound is a proof that every \(\nabla^2 f(x)\) on the box is PSD; a tighter Hertz–Rohn vertex enumeration [Hertz, 1992, Rohn, 1994] is available as a follow-up.

The certificate strictly never contradicts a syntactic verdict: it either agrees or abstains. Its value is in box-local reasoning — when FBBT [Belotti et al., 2009] or branching has tightened the variable box, expressions that were UNKNOWN at the root can become provably convex locally.

from discopt._jax.convexity import certify_convex
from discopt._jax.convexity.interval import Interval

m3 = Model("certificate")
x = m3.continuous("x", lb=-3.0, ub=3.0)
expr = x**4 - 2.0 * x**2

print("Syntactic walker:      ", classify_expr(expr, m3))
print("Certificate (root box):", certify_convex(expr, m3))

narrow = {x: Interval.from_bounds(1.0, 2.0)}
print("Certificate (tight box):", certify_convex(expr, m3, box=narrow))
Syntactic walker:       Curvature.UNKNOWN
Certificate (root box): None
Certificate (tight box): Curvature.CONVEX

On the wide root box the expression \(x^4 - 2x^2\) is not convex — it has inflection points at \(x = \pm 1/\sqrt{2}\). The syntactic walker stays UNKNOWN and the certificate abstains. On the tighter box \([1, 2]\), well away from the inflection, the interval Hessian proves PSD and certify_convex returns CONVEX — a proof, not a guess.

References#

  • Grant et al. [2006] — disciplined convex programming; the composition rule implemented by the syntactic walker.

  • Boyd and Vandenberghe [2004], §3.1 — atom table and second-order conditions.

  • Ceccon et al. [2020] — SUSPECT detector; the reference implementation this module is aligned with.

  • Moore [1966], Neumaier [1990] — interval analysis foundations for the sound certificate.

  • Griewank and Walther [2008] — forward-mode automatic differentiation.

  • Gershgorin [1931] — eigenvalue localisation used by the sound λ_min bound; Hertz [1992], Rohn [1994] give the tighter vertex-enumeration alternative.

  • Adjiman et al. [1998], Adjiman et al. [1998] — αBB convexification and the interval-Hessian theory this certificate operationalises.

  • Belotti et al. [2009] — FBBT bounds tightening used to sharpen sign information and shrink the certificate’s box.