Problem Class Auto-Detection in discopt

Problem Class Auto-Detection in discopt#

discopt automatically classifies optimization problems into one of six classes and routes each to a specialized solver backend. This means a pure LP is solved by a Mehrotra predictor-corrector interior-point method [Mehrotra, 1992] (not the general NLP IPM), and a MILP uses LP relaxations at each Branch & Bound node [Land and Doig, 1960] (not full NLP relaxations).

This notebook demonstrates:

The six problem classes and how auto-detection works
Concrete examples of each class
How to verify which class your model falls into
The performance impact of specialized dispatch

Problem Classes#

Class	Objective	Constraints	Variables	Solver Path
LP	Linear	Linear	Continuous	LP IPM (Mehrotra)
QP	Quadratic	Linear	Continuous	QP IPM
MILP	Linear	Linear	Has integer/binary	B&B + LP IPM
MIQP	Quadratic	Linear	Has integer/binary	B&B + QP IPM
NLP	Nonlinear	Any	Continuous	General NLP IPM / Ipopt
MINLP	Nonlinear	Any	Has integer/binary	B&B + NLP

Classification uses Rust-based expression analysis (is_linear, is_quadratic) via PyO3 bindings to determine the algebraic degree of the objective and constraints.

import os

os.environ["JAX_PLATFORMS"] = "cpu"
os.environ["JAX_ENABLE_X64"] = "1"

import discopt.modeling as dm
import numpy as np

# Internal API for demonstration — not part of the public interface
from discopt._jax.problem_classifier import ProblemClass, classify_problem

print("discopt loaded successfully")

discopt loaded successfully

1. Linear Program (LP)#

A production planning problem: maximize profit from two products subject to resource constraints.

\[\max_{x_1, x_2} \; 5x_1 + 4x_2 \quad \text{s.t.} \quad 6x_1 + 4x_2 \le 24, \; x_1 + 2x_2 \le 6, \; x_1, x_2 \ge 0\]

Known optimal: \(x_1^* = 3, \; x_2^* = 1.5, \; f^* = -21\) (negated because discopt minimizes).

When discopt detects an LP, it extracts c, A_eq, b_eq via autodiff and solves with a specialized LP IPM [Mehrotra, 1992] — no NLP evaluator, no Hessians, no nonlinear iteration.

m = dm.Model("production_planning")
x1 = m.continuous("x1", lb=0, ub=10)
x2 = m.continuous("x2", lb=0, ub=10)

m.minimize(-5 * x1 - 4 * x2)  # maximize 5x1 + 4x2
m.subject_to(6 * x1 + 4 * x2 <= 24)
m.subject_to(x1 + 2 * x2 <= 6)

# Verify classification
pc = classify_problem(m)
print(f"Problem class: {pc.value} (expected: lp)")
assert pc == ProblemClass.LP

result = m.solve()
print(f"Status:    {result.status}")
print(f"Objective: {result.objective:.4f}  (expected: -21.0)")
print(f"x1 = {result.x['x1']:.4f}  (expected: 3.0)")
print(f"x2 = {result.x['x2']:.4f}  (expected: 1.5)")
print(f"Time: {result.wall_time:.4f}s")

Problem class: lp (expected: lp)
Status:    optimal
Objective: -21.0000  (expected: -21.0)
x1 = 3.0000  (expected: 3.0)
x2 = 1.5000  (expected: 1.5)
Time: 0.0181s

2. Quadratic Program (QP)#

A portfolio optimization problem with 3 assets. Minimize variance (quadratic) subject to a return constraint (linear) [Nocedal and Wright, 2006].

\[\min_{w} \; w^T \Sigma \, w \quad \text{s.t.} \quad \mu^T w \ge r_{\min}, \; \sum w_i = 1, \; w_i \ge 0\]

The objective is quadratic and all constraints are linear, so this is a QP.

# Covariance matrix and expected returns for 3 assets
Sigma = np.array([[0.04, 0.006, 0.002], [0.006, 0.09, 0.018], [0.002, 0.018, 0.16]])
mu = np.array([0.10, 0.12, 0.15])
r_min = 0.11

m = dm.Model("portfolio_optimization")
w = m.continuous("w", shape=(3,), lb=0, ub=1)

# Quadratic objective: w^T Sigma w
variance = dm.sum(
    lambda i: dm.sum(lambda j: Sigma[i, j] * w[i] * w[j], over=range(3)), over=range(3)
)
m.minimize(variance)

# Linear constraints
m.subject_to(dm.sum(w) == 1)
m.subject_to(dm.sum(lambda i: mu[i] * w[i], over=range(3)) >= r_min)

pc = classify_problem(m)
print(f"Problem class: {pc.value} (expected: qp)")
assert pc == ProblemClass.QP

result = m.solve()
print(f"Status:    {result.status}")
print(f"Variance:  {result.objective:.6f}")
print(f"Weights:   {result.x['w']}")
print(f"Return:    {np.dot(mu, result.x['w']):.4f} (target >= {r_min})")
print(f"Time: {result.wall_time:.4f}s")

Problem class: qp (expected: qp)

Status:    optimal
Variance:  0.027019
Weights:   [0.63408936 0.23094836 0.13496228]
Return:    0.1114 (target >= 0.11)
Time: 0.0237s

3. Mixed-Integer Linear Program (MILP)#

A facility location problem: decide which of 3 facilities to open (binary) and how much demand to assign from 4 customers (continuous), minimizing total cost.

\[\min \; \sum_j f_j y_j + \sum_{ij} c_{ij} x_{ij} \quad \text{s.t.} \quad \sum_j x_{ij} = 1 \;\forall i, \quad x_{ij} \le y_j, \quad x_{ij} \ge 0, \quad y_j \in \{0,1\}\]

Linear objective + linear constraints + binary variables = MILP. discopt uses B&B [Land and Doig, 1960] with LP relaxation solves at each node.

n_facilities = 3
n_customers = 4
fixed_cost = [10.0, 15.0, 12.0]
# Transport cost from customer i to facility j
transport = [[2.0, 5.0, 3.0], [4.0, 1.0, 6.0], [3.0, 3.0, 2.0], [5.0, 2.0, 4.0]]

m = dm.Model("facility_location")
y = m.binary("y", shape=(n_facilities,))  # open facility j?
x = m.continuous("x", shape=(n_customers, n_facilities), lb=0, ub=1)  # assignment

# Objective: fixed costs + transport costs
obj = dm.sum(lambda j: fixed_cost[j] * y[j], over=range(n_facilities)) + dm.sum(
    lambda i: dm.sum(lambda j: transport[i][j] * x[i, j], over=range(n_facilities)),
    over=range(n_customers),
)
m.minimize(obj)

# Each customer assigned to exactly one facility
for i in range(n_customers):
    m.subject_to(dm.sum(lambda j: x[i, j], over=range(n_facilities)) == 1)

# Can only assign to open facilities
for i in range(n_customers):
    for j in range(n_facilities):
        m.subject_to(x[i, j] <= y[j])

pc = classify_problem(m)
print(f"Problem class: {pc.value} (expected: milp)")
assert pc == ProblemClass.MILP

result = m.solve()
print(f"Status:    {result.status}")
print(f"Objective: {result.objective:.4f}")
print(f"Facilities open: {result.x['y']}")
print(f"Nodes explored: {result.node_count}")
print(f"Time: {result.wall_time:.4f}s")

Problem class: milp (expected: milp)
Status:    optimal
Objective: 24.0000
Facilities open: [1. 0. 0.]
Nodes explored: 1
Time: 0.0176s

4. Mixed-Integer Quadratic Program (MIQP)#

Cardinality-constrained portfolio: minimize variance but limit the number of assets held to at most \(k\) using binary indicator variables.

\[\min_{w, z} \; w^T \Sigma \, w \quad \text{s.t.} \quad w_i \le z_i, \; \sum z_i \le k, \; \mu^T w \ge r_{\min}, \; \sum w_i = 1\]

Quadratic objective + linear constraints + binary variables = MIQP.

k = 2  # at most 2 assets

m = dm.Model("cardinality_portfolio")
w = m.continuous("w", shape=(3,), lb=0, ub=1)
z = m.binary("z", shape=(3,))  # asset selection indicators

# Same quadratic variance objective
variance = dm.sum(
    lambda i: dm.sum(lambda j: Sigma[i, j] * w[i] * w[j], over=range(3)), over=range(3)
)
m.minimize(variance)

# Constraints
m.subject_to(dm.sum(w) == 1)
m.subject_to(dm.sum(lambda i: mu[i] * w[i], over=range(3)) >= r_min)
m.subject_to(dm.sum(z) <= k)  # cardinality
for i in range(3):
    m.subject_to(w[i] <= z[i])  # linking

pc = classify_problem(m)
print(f"Problem class: {pc.value} (expected: miqp)")
assert pc == ProblemClass.MIQP

result = m.solve()
print(f"Status:    {result.status}")
print(f"Variance:  {result.objective:.6f}")
print(f"Weights:   {result.x['w']}")
print(f"Selected:  {result.x['z']}")
print(f"Return:    {np.dot(mu, result.x['w']):.4f}")
print(f"Nodes: {result.node_count}, Time: {result.wall_time:.4f}s")

Problem class: miqp (expected: miqp)

Status:    optimal
Variance:  0.032640
Weights:   [7.99999995e-01 1.00000000e-12 2.00000005e-01]
Selected:  [ 1.e+00 -1.e-12  1.e+00]
Return:    0.1100
Nodes: 11, Time: 0.7165s

5. Nonlinear Program (NLP)#

A reactor design problem with exponential terms. Find the optimal temperature and concentration to maximize yield.

\[\min_{T, c} \; -c \cdot (1 - e^{-k_0 \cdot e^{-E_a/T} \cdot \tau}) \quad \text{s.t.} \quad T + c \le 6\]

The exp() makes this nonlinear in both objective and constraints (if we add a nonlinear constraint). discopt uses the general NLP IPM [Nocedal and Wright, 2006].

m = dm.Model("reactor_design")
T = m.continuous("T", lb=1.0, ub=5.0)  # temperature (scaled)
c = m.continuous("c", lb=0.1, ub=5.0)  # concentration

# Simplified reaction rate: conversion = 1 - exp(-k0 * exp(-Ea/T) * tau)
k0 = 10.0
Ea = 2.0
tau = 1.0
rate = k0 * dm.exp(-Ea / T) * tau
conversion = 1.0 - dm.exp(-rate)
yield_val = c * conversion

m.minimize(-yield_val)  # maximize yield
m.subject_to(T + c <= 6)

pc = classify_problem(m)
print(f"Problem class: {pc.value} (expected: nlp)")
assert pc == ProblemClass.NLP

result = m.solve()
print(f"Status:    {result.status}")
print(f"Objective: {result.objective:.6f}")
print(f"T = {result.x['T']:.4f}")
print(f"c = {result.x['c']:.4f}")
print(f"Yield = {-result.objective:.4f}")
print(f"Time: {result.wall_time:.4f}s")

Problem class: nlp (expected: nlp)

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

Status:    optimal
Objective: -4.183281
T = 1.4362
c = 4.5638
Yield = 4.1833
Time: 0.1419s

6. Mixed-Integer Nonlinear Program (MINLP)#

A process synthesis problem: decide which process units to activate (binary) and optimize continuous operating conditions with nonlinear cost [Belotti et al., 2013].

\[\min \; \sum_j f_j y_j + \sum_j (e^{0.5 x_j} + x_j) \quad \text{s.t.} \quad x_j \le M \cdot y_j, \; \sum_j x_j \ge 2, \; x_j \ge 0\]

Nonlinear objective + binary variables = MINLP.

n_units = 3
fixed_costs = [5.0, 8.0, 6.0]
M = 5.0  # big-M

m = dm.Model("process_synthesis")
x = m.continuous("x", shape=(n_units,), lb=0, ub=M)
y = m.binary("y", shape=(n_units,))

# Nonlinear objective: fixed costs + nonlinear operating costs
obj = dm.sum(lambda j: fixed_costs[j] * y[j], over=range(n_units)) + dm.sum(
    lambda j: dm.exp(x[j] * 0.5) + x[j], over=range(n_units)
)
m.minimize(obj)

# Linking constraints
for j in range(n_units):
    m.subject_to(x[j] <= M * y[j])

# Minimum total production
m.subject_to(dm.sum(x) >= 2)

pc = classify_problem(m)
print(f"Problem class: {pc.value} (expected: minlp)")
assert pc == ProblemClass.MINLP

result = m.solve()
print(f"Status:    {result.status}")
print(f"Objective: {result.objective:.4f}")
print(f"x = {result.x['x']}")
print(f"y = {result.x['y']}")
print(f"Nodes: {result.node_count}, Time: {result.wall_time:.4f}s")

Problem class: minlp (expected: minlp)

Status:    optimal
Objective: 11.7183
x = [2.00000000e+00 1.48188512e-12 1.90868445e-12]
y = [1.00000000e+00 1.04114480e-12 1.09580246e-12]
Nodes: 7, Time: 3.9999s

7. Performance Impact of Specialized Dispatch#

To see why auto-detection matters, we compare two formulations of the same LP:

A clean LP that triggers the specialized LP IPM [Mehrotra, 1992]
The same LP with a dummy 0 * exp(x[0]) term that forces it into the NLP path

The LP path avoids building an NLP evaluator, computing Hessians, and running the general NLP iteration.

import time

n = 20
np.random.seed(42)
c_vec = np.random.randn(n)
A_mat = np.random.randn(10, n)
b_vec = np.abs(A_mat @ np.ones(n)) + 1.0


def build_lp(add_nonlinear=False):
    m = dm.Model("timing_lp")
    x = m.continuous("x", shape=(n,), lb=0, ub=10)
    obj = dm.sum(lambda i: c_vec[i] * x[i], over=range(n))
    if add_nonlinear:
        obj = obj + 0 * dm.exp(x[0])  # force NLP detection
    m.minimize(obj)
    for k in range(10):
        m.subject_to(dm.sum(lambda i: A_mat[k, i] * x[i], over=range(n)) <= b_vec[k])
    return m


# --- LP path ---
m_lp = build_lp(add_nonlinear=False)
pc_lp = classify_problem(m_lp)
t0 = time.perf_counter()
r_lp = m_lp.solve()
t_lp = time.perf_counter() - t0

# --- NLP path (forced) ---
m_nlp = build_lp(add_nonlinear=True)
pc_nlp = classify_problem(m_nlp)
t0 = time.perf_counter()
r_nlp = m_nlp.solve()
t_nlp = time.perf_counter() - t0

print(f"LP path:  class={pc_lp.value}, obj={r_lp.objective:.6f}, time={t_lp:.4f}s")
print(f"NLP path: class={pc_nlp.value}, obj={r_nlp.objective:.6f}, time={t_nlp:.4f}s")
print(f"\nSpeedup: {t_nlp / t_lp:.1f}x (LP path vs forced NLP path)")
print(f"Objectives match: {abs(r_lp.objective - r_nlp.objective) < 1e-3}")

LP path:  class=lp, obj=-58.932888, time=0.0138s
NLP path: class=nlp, obj=-58.932888, time=1.0674s

Speedup: 77.1x (LP path vs forced NLP path)
Objectives match: True

Summary#

Problem Class	Example	Detected?	Solver Path
LP	Production planning	LP	LP IPM (Mehrotra)
QP	Portfolio optimization	QP	QP IPM
MILP	Facility location	MILP	B&B + LP IPM
MIQP	Cardinality portfolio	MIQP	B&B + QP IPM
NLP	Reactor design	NLP	General NLP IPM
MINLP	Process synthesis	MINLP	B&B + NLP IPM

Key takeaway: You don’t need to specify the problem class. discopt analyzes the expression DAG using Rust-based structure detection and automatically dispatches to the most efficient solver. The classification happens inside Model.solve() before any NLP evaluator is constructed.