Mixed-Integer Nonlinear Programming (MINLP)

Mixed-Integer Nonlinear Programming (MINLP)#

A capstone tutorial for first-year PhD students on formulating and solving mixed-integer nonlinear programs.

Learning Objectives#

By the end of this notebook you will be able to:

Explain why MINLP is the hardest class of deterministic optimization problems
Formulate process synthesis, reactor design, and facility location problems as MINLPs
Use indicator constraints and Generalized Disjunctive Programming (GDP)
Understand convex relaxations (McCormick, alphaBB) and their role in spatial Branch-and-Bound
Compare solver backends and tune solver options
Load and solve standard .nl benchmark instances

Prerequisites: Familiarity with LP, QP, MILP, and basic nonlinear programming. This tutorial builds on the LP, QP, MILP, and MIQP tutorials in this series.

References: This tutorial draws on the survey by [Belotti et al., 2013] and the textbook by [Grossmann, 2021].

import os

os.environ["JAX_PLATFORMS"] = "cpu"
os.environ["JAX_ENABLE_X64"] = "1"

import discopt.modeling as dm
import numpy as np

1. What is MINLP?#

A Mixed-Integer Nonlinear Program (MINLP) combines continuous and integer variables with nonlinear objective and/or constraint functions:

\[\min_{x, y} \; f(x, y) \quad \text{s.t.} \quad g(x, y) \leq 0, \quad x \in \mathbb{R}^n, \quad y \in \mathbb{Z}^p\]

where \(f\) or at least one component of \(g\) is nonlinear. MINLP subsumes all simpler problem classes:

Class	Continuous vars	Integer vars	Nonlinear
LP	yes	–	–
QP	yes	–	quadratic obj
NLP	yes	–	yes
MILP	yes	yes	–
MIQP	yes	yes	quadratic obj
MINLP	yes	yes	yes

MINLP is NP-hard in general, and even deciding feasibility of a nonconvex MINLP is undecidable in the limit. The dominant solution approach is spatial Branch-and-Bound [Land and Doig, 1960]: the integer variables are handled by branching (as in MILP), while the nonconvex nonlinearities are handled by constructing convex relaxations of the feasible region at each node.

Why is MINLP hard?#

MINLP inherits the combinatorial difficulty of integer programming and the analytical difficulty of nonlinear programming. Unlike MILP, where the LP relaxation provides a tight convex underestimator, the NLP relaxation of a nonconvex MINLP may itself be nonconvex, requiring additional convexification techniques (McCormick envelopes, alphaBB underestimators, outer approximation).

2. Example 1: Process Synthesis#

A chemical plant considers building up to 4 processing units. For each unit \(i\):

A binary decision \(y_i \in \{0, 1\}\): build unit \(i\) or not
Continuous capacity \(x_i\) and flow variables \(f_k\)
Fixed construction cost plus variable operating cost

The nonlinearity arises from a yield function: the output flow depends logarithmically on the capacity of one unit, plus a linear contribution from another. This models diminishing returns from scale — a common phenomenon in chemical process design [Grossmann, 2021].

We maximize profit (revenue minus fixed and variable costs) subject to mass balance constraints, yield limits, and logical constraints linking build decisions to capacity bounds.

# Problem data
fixed_costs = np.array([1000.0, 1500.0, 800.0, 1200.0])
variable_costs = np.array([2.5, 3.0, 1.8, 2.2])
revenue = 10.0
n_units = 4
n_flows = 6

print(f"Processing units: {n_units}")
print(f"Fixed costs: {fixed_costs}")
print(f"Variable costs per unit capacity: {variable_costs}")
print(f"Revenue per unit output: {revenue}")

Processing units: 4
Fixed costs: [1000. 1500.  800. 1200.]
Variable costs per unit capacity: [2.5 3.  1.8 2.2]
Revenue per unit output: 10.0

m = dm.Model("process_synthesis")

# Binary build decisions
y = m.binary("build", shape=(n_units,))

# Continuous capacities and flows
x = m.continuous("capacity", shape=(n_units,), lb=0, ub=500)
f = m.continuous("flow", shape=(n_flows,), lb=0, ub=1000)

# Maximize profit = revenue * output - fixed costs - variable costs
m.maximize(
    revenue * f[5]
    - dm.sum(lambda i: fixed_costs[i] * y[i], over=range(n_units))
    - dm.sum(lambda i: variable_costs[i] * x[i], over=range(n_units))
)

# Mass balance constraints (linear)
m.subject_to(f[0] <= x[0] + x[1], name="balance_in")
m.subject_to(f[1] + f[2] <= f[0], name="balance_split")
m.subject_to(f[3] <= f[1] + x[2], name="balance_unit2")
m.subject_to(f[4] <= f[2] + x[3], name="balance_unit3")
m.subject_to(f[5] <= f[3] + f[4], name="balance_out")

# Nonlinear yield constraint: output limited by log-capacity + linear term
m.subject_to(f[5] <= dm.log(1 + x[2]) * 100 + x[3] * 0.85, name="yield_nonlinear")

# Indicator constraints: if unit i is built, capacity must be in [10, 500]
for i in range(n_units):
    m.if_then(y[i], [x[i] >= 10, x[i] <= 500], name=f"unit_{i}_active")

# Big-M linking: if unit is not built, capacity is zero
for i in range(n_units):
    m.subject_to(x[i] <= 500 * y[i], name=f"bigM_{i}")

# At least 2 units must be built
m.subject_to(dm.sum(lambda i: y[i], over=range(n_units)) >= 2, name="min_units")

print(m.summary())

Model: process_synthesis
  Variables: 14 (10 continuous, 4 integer/binary)
  Constraints: 19
  Objective: maximize (((10 * flow[6][5]) - Σ[4 terms]) - Σ[4 terms])
  Parameters: 0

from discopt._jax.problem_classifier import classify_problem

problem_class = classify_problem(m)
print(f"Problem class: {problem_class}")
print("(Confirmed MINLP due to dm.log in the yield constraint)")

Problem class: ProblemClass.MINLP
(Confirmed MINLP due to dm.log in the yield constraint)

result = m.solve(time_limit=60)

print(f"Status:     {result.status}")
profit = result.objective if result.objective is not None else None
print(f"Profit:     {profit:.2f}" if profit is not None else "Profit:     N/A")
print(f"Gap:        {result.gap}")
print(f"Nodes:      {result.node_count}")
print(f"Wall time:  {result.wall_time:.3f}s")
print()

if result.x is not None:
    # Which units are built?
    build_decisions = result.value(y)
    for i in range(n_units):
        status = "BUILT" if build_decisions[i] > 0.5 else "not built"
        cap = result.value(x)[i]
        print(f"  Unit {i}: {status}, capacity={cap:.1f}")

    flow_out = result.value(f)[5]
    print(f"\nOutput flow (f[5]): {flow_out:.2f}")
    print(f"Revenue:  {revenue * flow_out:.2f}")
    fixed = sum(fixed_costs[i] * build_decisions[i] for i in range(n_units))
    variable = sum(variable_costs[i] * result.value(x)[i] for i in range(n_units))
    print(f"Fixed:    {fixed:.2f}")
    print(f"Variable: {variable:.2f}")
else:
    print("  No feasible solution found within the time limit.")

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit https://github.com/coin-or/Ipopt
******************************************************************************

Status:     optimal
Profit:     6000.00
Gap:        0.0
Nodes:      3
Wall time:  5.977s

  Unit 0: not built, capacity=0.0
  Unit 1: not built, capacity=0.0
  Unit 2: BUILT, capacity=500.0
  Unit 3: BUILT, capacity=500.0

Output flow (f[5]): 1000.00
Revenue:  10000.00
Fixed:    2000.00
Variable: 2000.00

Interpreting the results#

The solver simultaneously decides:

Which units to build (discrete decisions)
How much capacity to allocate to each built unit (continuous decisions)
How to route flows through the network (continuous decisions)

The nonlinear yield constraint f[5] <= log(1 + x[2]) * 100 + x[3] * 0.85 creates diminishing returns: doubling capacity does not double output. This is what makes the problem a genuine MINLP rather than a MILP.

3. Example 2: Reactor Cascade Design#

Design a cascade of 3 continuous stirred-tank reactors (CSTRs) in series. For each stage \(i\), choose the operating temperature \(T_i\) and reactor volume \(V_i\) to minimize the total volume while achieving a target overall conversion.

The reaction rate follows the Arrhenius equation:

\[k(T) = k_0 \exp\left(-\frac{E_a}{R T}\right)\]

This exponential dependence on temperature makes the problem inherently nonlinear [Nocedal and Wright, 2006].

# Reaction kinetics parameters
k0 = 1e6  # pre-exponential factor [1/s]
Ea = 50000.0  # activation energy [J/mol]
R_gas = 8.314  # gas constant [J/(mol*K)]
n_stages = 3

print(f"Stages: {n_stages}")
print(f"k0 = {k0:.1e}, Ea = {Ea:.0f} J/mol, R = {R_gas:.3f} J/(mol*K)")
print(f"\nAt T=500 K: k = {k0 * np.exp(-Ea / (R_gas * 500)):.4f} 1/s")
print(f"At T=700 K: k = {k0 * np.exp(-Ea / (R_gas * 700)):.4f} 1/s")

Stages: 3
k0 = 1.0e+06, Ea = 50000 J/mol, R = 8.314 J/(mol*K)

At T=500 K: k = 5.9751 1/s
At T=700 K: k = 185.7033 1/s

m = dm.Model("reactor_cascade")

# Decision variables
T = [m.continuous(f"T_{i}", lb=300, ub=800) for i in range(n_stages)]
V = [m.continuous(f"V_{i}", lb=0.1, ub=10) for i in range(n_stages)]
F = m.continuous("F", lb=0.1, ub=100)

# Minimize total reactor volume
m.minimize(V[0] + V[1] + V[2])

# Conversion constraint at each stage:
# rate_constant(T_i) * V_i >= F * 0.3 (each stage achieves 30% conversion)
for i in range(n_stages):
    rate_constant = k0 * dm.exp(-Ea / (R_gas * T[i]))
    m.subject_to(rate_constant * V[i] >= F * 0.3, name=f"conversion_stage_{i}")

# Temperature safety limit
for i in range(n_stages):
    m.subject_to(T[i] <= 750, name=f"temp_limit_{i}")

# Minimum feed rate
m.subject_to(F >= 10, name="min_feed")

print(m.summary())

Model: reactor_cascade
  Variables: 7 (7 continuous, 0 integer/binary)
  Constraints: 7
  Objective: minimize ((V_0 + V_1) + V_2)
  Parameters: 0

result = m.solve(time_limit=30, nlp_solver="ipm")

print(f"Status:     {result.status}")
print(f"Objective:  {result.objective:.4f} (total volume)")
print(f"Wall time:  {result.wall_time:.3f}s")
print()

F_val = result.value(F)
print(f"Feed rate F = {F_val:.2f}")
print()
for i in range(n_stages):
    T_val = result.value(T[i])
    V_val = result.value(V[i])
    k_val = k0 * np.exp(-Ea / (R_gas * T_val))
    print(f"  Stage {i}: T={T_val:.1f} K, V={V_val:.4f}, k(T)={k_val:.4f}")

Status:     optimal
Objective:  0.3000 (total volume)
Wall time:  0.196s

Feed rate F = 25.16

  Stage 0: T=690.2 K, V=0.1000, k(T)=164.3622
  Stage 1: T=690.2 K, V=0.1000, k(T)=164.3622
  Stage 2: T=690.2 K, V=0.1000, k(T)=164.3622

Discussion#

The Arrhenius exponential creates a strong nonlinearity: small increases in temperature can dramatically increase the reaction rate, reducing the required volume. The solver must find the optimal trade-off between running at high temperature (fast kinetics, small reactors) and staying within safety limits.

Note: this is a pure NLP (no integer variables). Adding integer decisions — for example, choosing the number of stages from \(\{1, 2, 3, 4\}\) or selecting a reactor type — would make it a full MINLP.

4. Example 3: Nonlinear Facility Location#

Locate up to 4 facilities to serve 8 customers. The shipping cost exhibits economies of scale: the per-unit cost decreases with the volume shipped, modeled by a square-root function.

\[\text{shipping cost}_{ij} = d_{ij} \cdot \sqrt{s_{ij} + 1}\]

where \(d_{ij}\) is the distance and \(s_{ij}\) is the shipment volume. The \(\sqrt{\cdot}\) makes the cost concave in the shipment volume (economies of scale), creating a nonconvex MINLP.

n_fac = 4
n_cust = 8

# Facility fixed costs
fac_fixed = np.array([500.0, 600.0, 450.0, 550.0])

# Customer demands
demands = np.array([15.0, 20.0, 10.0, 25.0, 18.0, 12.0, 22.0, 8.0])

# Distances (facility x customer)
np.random.seed(42)
distances = np.random.uniform(1, 10, size=(n_fac, n_cust)).round(1)

# Facility capacities
capacities = np.array([60.0, 70.0, 50.0, 65.0])

print(f"Facilities: {n_fac}, Customers: {n_cust}")
print(f"Fixed costs: {fac_fixed}")
print(f"Demands: {demands} (total={demands.sum()})")
print(f"Capacities: {capacities} (total={capacities.sum()})")

Facilities: 4, Customers: 8
Fixed costs: [500. 600. 450. 550.]
Demands: [15. 20. 10. 25. 18. 12. 22.  8.] (total=130.0)
Capacities: [60. 70. 50. 65.] (total=245.0)

m = dm.Model("facility_nonlinear")

# Binary: open facility i?
open_fac = m.binary("open", shape=(n_fac,))

# Continuous: shipment from facility i to customer j
ship = m.continuous("ship", shape=(n_fac, n_cust), lb=0, ub=100)

# Minimize: fixed costs + nonlinear shipping costs
m.minimize(
    dm.sum(lambda i: fac_fixed[i] * open_fac[i], over=range(n_fac))
    + dm.sum(
        lambda i: dm.sum(lambda j: distances[i, j] * dm.sqrt(ship[i, j] + 1), over=range(n_cust)),
        over=range(n_fac),
    )
)

# Demand satisfaction: each customer must be fully served
for j in range(n_cust):
    m.subject_to(dm.sum(lambda i: ship[i, j], over=range(n_fac)) >= demands[j], name=f"demand_{j}")

# Capacity constraints: total shipment from facility i <= capacity * open[i]
for i in range(n_fac):
    m.subject_to(
        dm.sum(lambda j: ship[i, j], over=range(n_cust)) <= capacities[i] * open_fac[i],
        name=f"capacity_{i}",
    )

# Linking: ship[i,j] = 0 if facility i is not open
for i in range(n_fac):
    for j in range(n_cust):
        m.subject_to(ship[i, j] <= demands[j] * open_fac[i], name=f"link_{i}_{j}")

print(m.summary())

Model: facility_nonlinear
  Variables: 36 (32 continuous, 4 integer/binary)
  Constraints: 44
  Objective: minimize (Σ[4 terms] + Σ[4 terms])
  Parameters: 0

result = m.solve(time_limit=60)

print(f"Status:     {result.status}")
obj_str = f"{result.objective:.2f}" if result.objective is not None else "N/A"
print(f"Objective:  {obj_str}")
print(f"Gap:        {result.gap}")
print(f"Nodes:      {result.node_count}")
print(f"Wall time:  {result.wall_time:.3f}s")
print()

if result.x is not None:
    open_vals = result.value(open_fac)
    ship_vals = result.value(ship)

    for i in range(n_fac):
        if open_vals[i] > 0.5:
            total_shipped = sum(ship_vals[i, j] for j in range(n_cust))
            print(f"  Facility {i}: OPEN, total shipped = {total_shipped:.1f} / {capacities[i]}")
        else:
            print(f"  Facility {i}: closed")
else:
    print("  No feasible solution found within the time limit.")

/Users/jkitchin/Dropbox/projects/discopt/python/discopt/solver.py:1052: RuntimeWarning: divide by zero encountered in matmul
  b_ub = J @ solution - g_vals
/Users/jkitchin/Dropbox/projects/discopt/python/discopt/solver.py:1052: RuntimeWarning: overflow encountered in matmul
  b_ub = J @ solution - g_vals
/Users/jkitchin/Dropbox/projects/discopt/python/discopt/solver.py:1052: RuntimeWarning: invalid value encountered in matmul
  b_ub = J @ solution - g_vals

Status:     feasible
Objective:  1370.73
Gap:        2.263198497309067
Nodes:      25
Wall time:  62.283s

  Facility 0: OPEN, total shipped = 60.0 / 60.0
  Facility 1: OPEN, total shipped = 70.0 / 70.0
  Facility 2: closed
  Facility 3: closed

Economies of scale in optimization#

The square-root cost function \(d_{ij} \sqrt{s_{ij} + 1}\) is concave in the shipment volume. This means the marginal cost decreases with volume — shipping more to a single customer from one facility is cheaper per unit than splitting the shipment across facilities.

Concave cost minimization is nonconvex: the solver must avoid local optima where shipments are split unnecessarily. This is a classic application for spatial Branch-and-Bound with convex relaxations.

5. Generalized Disjunctive Programming (GDP)#

GDP is a modeling framework that uses logical disjunctions and Boolean variables instead of big-M constraints [Grossmann, 2021]. A disjunction states that exactly one of several sets of constraints must hold:

\[\begin{split}\bigvee_{k \in K} \left[ Y_k \;\; \begin{array}{c} g_k(x) \leq 0 \\ c_k(x) = 0 \end{array} \right]\end{split}\]

where \(Y_k\) is a Boolean variable indicating which disjunct is active.

GDP models are typically reformulated into MINLP via:

Big-M reformulation [Duran and Grossmann, 1986]: simple but can produce weak relaxations
Hull reformulation: tighter but introduces more variables and constraints

discopt provides m.either_or() for two-term disjunctions and m.if_then() for conditional constraints.

# GDP example: equipment mode selection
# A piece of equipment can operate in one of two modes:
#   Mode A: small scale, x in [10, 50], cost >= 100 + 2*x
#   Mode B: large scale, x in [100, 500], cost >= 500 + 0.5*x

m2 = dm.Model("gdp_demo")
x = m2.continuous("x", lb=0, ub=500)
cost = m2.continuous("cost", lb=0, ub=10000)

m2.minimize(cost)

m2.either_or(
    [
        [x >= 10, x <= 50, cost >= 100 + 2 * x],  # Mode A
        [x >= 100, x <= 500, cost >= 500 + 0.5 * x],  # Mode B
    ],
    name="mode",
)

# Additional constraint: we need at least 30 units of output
m2.subject_to(x >= 30, name="min_output")

result2 = m2.solve()

print(f"Status:    {result2.status}")
print(f"Optimal x: {result2.value(x):.2f}")
print(f"Min cost:  {result2.value(cost):.2f}")
print(f"Objective: {result2.objective:.2f}")

Status:    optimal
Optimal x: 30.00
Min cost:  160.00
Objective: 160.00

Interpreting the GDP result#

The solver chose the mode (A or B) that yields the lowest cost while satisfying the minimum output requirement of 30 units. The either_or construct automatically introduces the necessary binary variables and big-M constraints to enforce that exactly one set of constraints is active.

GDP is particularly natural for modeling:

Process superstructures: which unit operations to include in a process flowsheet
Equipment sizing: choosing among discrete technology options
Scheduling with modes: tasks that can run at different speeds/configurations

6. Convex Relaxations#

The key to solving nonconvex MINLPs to global optimality is constructing convex relaxations that underestimate the original problem. The tighter the relaxation, the better the lower bounds and the fewer Branch-and-Bound nodes needed.

McCormick envelopes#

For a bilinear term \(w = x_1 x_2\) with \(x_1 \in [a, b]\) and \(x_2 \in [c, d]\), the tightest convex relaxation is the McCormick envelope [McCormick, 1976]:

\[w \geq ac + cx_1 + ax_2 - ac, \quad w \geq bd + dx_1 + bx_2 - bd\]

\[w \leq ad + dx_1 + ax_2 - ad, \quad w \leq bc + cx_1 + bx_2 - bc\]

alphaBB underestimators#

For general twice-differentiable functions, the alphaBB method [Androulakis et al., 1995] constructs a convex underestimator by adding a scaled quadratic perturbation:

\[L(x) = f(x) + \sum_i \alpha_i (x_i^L - x_i)(x_i - x_i^U)\]

where \(\alpha_i \geq 0\) is chosen large enough to make \(L\) convex. This approach works for any factorable function.

Spatial partitioning#

Piecewise McCormick relaxations subdivide the variable domains into smaller intervals, computing tighter McCormick envelopes on each piece. The partitions parameter controls how many subdivisions to use.

# Demonstrate the effect of partitioning on bound quality
# Re-solve the process synthesis problem with different partition counts

m_relax = dm.Model("relaxation_demo")
x_r = m_relax.continuous("x", lb=1, ub=10)
y_r = m_relax.continuous("y", lb=1, ub=10)
z_r = m_relax.binary("z")

# Nonconvex objective with bilinear and logarithmic terms
m_relax.minimize(x_r * y_r - 5 * dm.log(x_r + 1) + 10 * z_r)
m_relax.subject_to(x_r + y_r >= 5, name="lb")
m_relax.subject_to(x_r + y_r <= 15, name="ub")
m_relax.subject_to(x_r <= 8 * z_r, name="link")

# Solve with no partitions
r0 = m_relax.solve(partitions=0, cutting_planes=False, time_limit=10)
print(f"Partitions=0: obj={r0.objective:.4f}, bound={r0.bound:.4f}, gap={r0.gap}")

# Solve with 4 partitions (tighter relaxation)
r4 = m_relax.solve(partitions=4, cutting_planes=False, time_limit=10)
print(f"Partitions=4: obj={r4.objective:.4f}, bound={r4.bound:.4f}, gap={r4.gap}")

Partitions=0: obj=5.9528, bound=5.0349, gap=0.1541952262776656

Partitions=4: obj=5.9528, bound=5.7235, gap=0.038519142848856956

# Adding cutting planes for further tightening
r_cut = m_relax.solve(partitions=4, cutting_planes=True, time_limit=10)
print(f"Partitions=4 + cuts: obj={r_cut.objective:.4f}, bound={r_cut.bound:.4f}, gap={r_cut.gap}")

print("\n--- Relaxation comparison ---")
print(f"{'Setting':<25} {'Objective':>10} {'Bound':>10} {'Gap':>10}")
print(f"{'No partitions, no cuts':<25} {r0.objective:>10.4f} {r0.bound:>10.4f} {str(r0.gap):>10}")
print(f"{'4 partitions, no cuts':<25} {r4.objective:>10.4f} {r4.bound:>10.4f} {str(r4.gap):>10}")
label = "4 partitions + cuts"
print(f"{label:<25} {r_cut.objective:>10.4f} {r_cut.bound:>10.4f} {str(r_cut.gap):>10}")

Partitions=4 + cuts: obj=5.9528, bound=5.7235, gap=0.03851914284885954

--- Relaxation comparison ---
Setting                    Objective      Bound        Gap
No partitions, no cuts        5.9528     5.0349 0.1541952262776656
4 partitions, no cuts         5.9528     5.7235 0.038519142848856956
4 partitions + cuts           5.9528     5.7235 0.03851914284885954

The relaxation–partitioning trade-off#

More partitions produce tighter convex relaxations (better bounds), but each partition increases the LP/NLP subproblem size. Similarly, cutting planes tighten the relaxation without branching, but add constraints that slow down each node solve.

The optimal strategy depends on the problem structure:

Bilinear problems (e.g., pooling): partitioning is very effective
Highly nonlinear problems: alphaBB underestimators may be needed
Large-scale problems: cutting planes can be more efficient than fine partitioning

7. Solver Options#

discopt supports two NLP solver backends for solving the continuous relaxations at each Branch-and-Bound node:

Backend	Description	Best for
`nlp_solver="ipm"`	Pure-JAX interior-point method	JIT compilation, batched solves, GPU
`nlp_solver="ipopt"`	IPOPT via cyipopt [Wächter and Biegler, 2006]	Mature, robust on large NLPs

The JAX IPM backend benefits from JIT warm-up: the first solve compiles the computation graph, but subsequent solves (e.g., at B&B nodes) reuse it and are much faster.

# Compare solver backends on the reactor cascade problem
m_cmp = dm.Model("reactor_compare")

T_cmp = [m_cmp.continuous(f"T_{i}", lb=300, ub=800) for i in range(n_stages)]
V_cmp = [m_cmp.continuous(f"V_{i}", lb=0.1, ub=10) for i in range(n_stages)]
F_cmp = m_cmp.continuous("F", lb=0.1, ub=100)

m_cmp.minimize(V_cmp[0] + V_cmp[1] + V_cmp[2])

for i in range(n_stages):
    rate = k0 * dm.exp(-Ea / (R_gas * T_cmp[i]))
    m_cmp.subject_to(rate * V_cmp[i] >= F_cmp * 0.3, name=f"conv_{i}")
    m_cmp.subject_to(T_cmp[i] <= 750, name=f"tlim_{i}")
m_cmp.subject_to(F_cmp >= 10, name="min_feed")

# Solve with IPM
r_ipm = m_cmp.solve(nlp_solver="ipm", time_limit=30)
print(f"IPM:   obj={r_ipm.objective:.4f}, time={r_ipm.wall_time:.3f}s")

# Solve with IPOPT
r_ipopt = m_cmp.solve(nlp_solver="ipopt", time_limit=30)
print(f"IPOPT: obj={r_ipopt.objective:.4f}, time={r_ipopt.wall_time:.3f}s")

print(f"\nObjective difference: {abs(r_ipm.objective - r_ipopt.objective):.6f}")

IPM:   obj=0.3000, time=0.193s

IPOPT: obj=0.3000, time=0.008s

Objective difference: 0.000000

8. Loading .nl Files#

The AMPL .nl format is a standard interchange format for nonlinear optimization problems. The MINLPLib benchmark library provides hundreds of MINLP instances in this format.

discopt can load .nl files directly using dm.from_nl(), which parses the file via a Rust-based parser and constructs the expression DAG.

from pathlib import Path

nl_path = Path("../python/tests/data/minlplib/nvs03.nl")
print(f"Loading: {nl_path}")
print(f"File exists: {nl_path.exists()}")

if nl_path.exists():
    m_nl = dm.from_nl(str(nl_path))
    print("\nLoaded model:")
    print(m_nl.summary())

    # Solve the benchmark instance
    result_nl = m_nl.solve(time_limit=30)
    print(f"\nStatus:    {result_nl.status}")
    print(f"Objective: {result_nl.objective:.6f}")
    print(f"Gap:       {result_nl.gap}")
    print(f"Nodes:     {result_nl.node_count}")
    print(f"Time:      {result_nl.wall_time:.3f}s")
else:
    print("Benchmark file not found. Skipping.")

Loading: ../python/tests/data/minlplib/nvs03.nl
File exists: False
Benchmark file not found. Skipping.

9. Exercise: Haverly Pooling Problem#

The Haverly pooling problem is a classic MINLP benchmark with bilinear constraints arising from quality blending. It models a system with:

Sources with known qualities (sulfur content)
A blending pool where sources are mixed
Products with quality specifications (maximum sulfur)

The bilinear terms arise because the pool quality equals the flow-weighted average of source qualities, which involves products of flow fractions and the pool quality variable.

Data:

Source 1: quality 3, max supply 100, cost 6
Source 2: quality 1, max supply 200, cost 16
Source 3: quality 2, max supply 100, cost 10
Product A: max sulfur 2.5, demand up to 100, price 9
Product B: max sulfur 1.5, demand up to 200, price 15

Your task: formulate the pooling problem as an MINLP. The key bilinear constraint is:

\[q_{\text{pool}} \cdot f_{\text{pool} \to j} \leq \text{max\_sulfur}_j \cdot f_{\text{pool} \to j}\]

where \(q_{\text{pool}}\) is the pool quality (a continuous variable) and \(f_{\text{pool} \to j}\) is the flow from pool to product \(j\).

# Haverly pooling problem data
source_qualities = np.array([3.0, 1.0, 2.0])
source_max_supply = np.array([100.0, 200.0, 100.0])
source_costs = np.array([6.0, 16.0, 10.0])

product_max_sulfur = np.array([2.5, 1.5])
product_max_demand = np.array([100.0, 200.0])
product_prices = np.array([9.0, 15.0])

n_sources = 3
n_products = 2

print("Haverly Pooling Problem")
print(f"Sources: {n_sources} (qualities: {source_qualities})")
print(f"Products: {n_products} (max sulfur: {product_max_sulfur})")

# --- YOUR CODE HERE ---
# Hint: define flows from sources to pool, from pool to products,
# and a pool quality variable q_pool.
# The bilinear constraint is: q_pool * flow_to_product[j] <= max_sulfur[j] * flow_to_product[j]

Haverly Pooling Problem
Sources: 3 (qualities: [3. 1. 2.])
Products: 2 (max sulfur: [2.5 1.5])

# Solution: Haverly pooling problem
mp = dm.Model("haverly_pooling")

# Flows from sources to pool
f_in = mp.continuous("f_in", shape=(n_sources,), lb=0, ub=200)

# Flows from pool to products
f_out = mp.continuous("f_out", shape=(n_products,), lb=0, ub=200)

# Direct flows from source 3 to products (bypasses pool)
f_direct = mp.continuous("f_direct", shape=(n_products,), lb=0, ub=100)

# Pool quality (flow-weighted average of source qualities)
q_pool = mp.continuous("q_pool", lb=1, ub=3)

# Maximize profit = revenue - costs
mp.maximize(
    dm.sum(lambda j: product_prices[j] * (f_out[j] + f_direct[j]), over=range(n_products))
    - dm.sum(lambda i: source_costs[i] * f_in[i], over=range(n_sources))
    - source_costs[2] * dm.sum(lambda j: f_direct[j], over=range(n_products))
)

# Pool mass balance: total in = total out
mp.subject_to(
    dm.sum(lambda i: f_in[i], over=range(n_sources))
    == dm.sum(lambda j: f_out[j], over=range(n_products)),
    name="pool_balance",
)

# Quality definition (bilinear): q_pool * sum(f_in) = sum(quality_i * f_in_i)
mp.subject_to(
    q_pool * dm.sum(lambda i: f_in[i], over=range(n_sources))
    == dm.sum(lambda i: source_qualities[i] * f_in[i], over=range(n_sources)),
    name="quality_def",
)

# Product quality specs (bilinear): q_pool * f_out[j] <= max_sulfur[j] * f_out[j]
for j in range(n_products):
    mp.subject_to(
        q_pool * f_out[j] + source_qualities[2] * f_direct[j]
        <= product_max_sulfur[j] * (f_out[j] + f_direct[j]),
        name=f"quality_spec_{j}",
    )

# Supply limits
for i in range(n_sources):
    mp.subject_to(f_in[i] <= source_max_supply[i], name=f"supply_{i}")

# Demand limits
for j in range(n_products):
    mp.subject_to(f_out[j] + f_direct[j] <= product_max_demand[j], name=f"demand_{j}")

result_pool = mp.solve(time_limit=60)

print(f"Status:     {result_pool.status}")
profit_pool = result_pool.objective if result_pool.objective is not None else None
print(f"Profit:     {profit_pool:.2f}" if profit_pool is not None else "Profit:     N/A")
print(f"Gap:        {result_pool.gap}")
if result_pool.x is not None:
    print(f"Pool quality: {result_pool.value(q_pool):.3f}")
    print(f"\nFlows into pool:  {result_pool.value(f_in)}")
    print(f"Flows out of pool: {result_pool.value(f_out)}")
    print(f"Direct flows:      {result_pool.value(f_direct)}")

Status:     optimal
Profit:     100.00
Gap:        0.0
Pool quality: 2.913

Flows into pool:  [ 5.00000000e+01 -8.74704822e-09  4.73457335e+00]
Flows out of pool: [ 5.47345733e+01 -9.55302477e-09]
Direct flows:      [4.52654267e+01 4.65096148e-08]

10. Summary#

This tutorial covered MINLP, the most general class of deterministic mathematical programming:

Topic	Key takeaway
MINLP definition	Combines discrete decisions with nonlinear functions; solved via spatial B&B
Process synthesis	Binary build/no-build with nonlinear yield: a canonical MINLP application
Reactor cascade	Arrhenius kinetics create strongly nonlinear NLP subproblems
Facility location	Economies-of-scale costs (concave objectives) make the problem nonconvex
GDP	Disjunctive constraints model either/or logic cleanly
Convex relaxations	McCormick and alphaBB provide valid lower bounds for spatial B&B
Solver options	JAX IPM for speed; IPOPT for robustness
Haverly pooling	Classic bilinear MINLP arising from quality blending

The problem-class hierarchy#

Class	Difficulty	Key algorithm
LP	Polynomial	Simplex / IPM
QP	Polynomial (convex)	IPM
MILP	NP-hard	Branch-and-Bound + LP relaxation
MIQP	NP-hard	Branch-and-Bound + QP relaxation
MINLP	NP-hard + nonconvex	Spatial B&B + convex relaxations

When to use MINLP#

Use MINLP when your problem has both discrete decisions (build/select/assign) and nonlinear physics or economics (reaction kinetics, fluid mechanics, diminishing returns, risk). If the nonlinearities can be approximated as piecewise-linear, consider MILP instead. If there are no integer variables, use NLP.

Next steps#

Solver backends: See the IPM and IPOPT tutorial notebooks for details on each backend
Advanced features: Cutting planes, GNN branching, learned relaxations in notebooks/advanced_features.ipynb
Benchmarks: Performance comparisons across problem classes in notebooks/benchmarks_by_class.ipynb

References#

[Belotti et al., 2013] — Mixed-integer nonlinear optimization (survey)
[Grossmann, 2021] — Advanced Optimization for Process Systems Engineering
[McCormick, 1976] — Convex underestimating problems
[Androulakis et al., 1995] — alphaBB global optimization
[Duran and Grossmann, 1986] — Outer-approximation for MINLP
[Land and Doig, 1960] — Branch-and-Bound algorithm
[Nocedal and Wright, 2006] — Numerical Optimization
[Wächter and Biegler, 2006] — IPOPT implementation