Constrained optimization with Lagrange multipliers and autograd

| categories: autograd, optimization | tags:

Constrained optimization is common in engineering problems solving. A prototypical example (from Greenberg, Advanced Engineering Mathematics, Ch 13.7) is to find the point on a plane that is closest to the origin. The plane is defined by the equation \(2x - y + z = 3\), and we seek to minimize \(x^2 + y^2 + z^2\) subject to the equality constraint defined by the plane. scipy.optimize.minimize provides a pretty convenient interface to solve a problem like this, ans shown here.

import numpy as np
from scipy.optimize import minimize

def objective(X):
    x, y, z = X
    return x**2 + y**2 + z**2

def eq(X):
    x, y, z = X
    return 2 * x - y + z - 3

sol = minimize(objective, [1, -0.5, 0.5], constraints={'type': 'eq', 'fun': eq})
    fun: 1.5
    jac: array([ 2.00000001, -0.99999999,  1.00000001])
message: 'Optimization terminated successfully.'
   nfev: 5
    nit: 1
   njev: 1
 status: 0
success: True
      x: array([ 1. , -0.5,  0.5])

I like the minimize function a lot, although I am not crazy for how the constraints are provided. The alternative used to be that there was an argument for equality constraints and another for inequality constraints. Analogous to scipy.integrate.solve_ivp event functions, they could have also used function attributes.

Sometimes, it might be desirable to go back to basics though, especially if you are unaware of the minimize function or perhaps suspect it is not working right and want an independent answer. Next we look at how to construct this constrained optimization problem using Lagrange multipliers. This converts the problem into an augmented unconstrained optimization problem we can use fsolve on. The gist of this method is we formulate a new problem:

\(F(X) = f(X) - \lambda g(X)\)

and then solve the simultaneous resulting equations:

\(F_x(X) = F_y(X) = F_z(X) = g(X) = 0\) where \(F_x\) is the derivative of \(f*\) with respect to \(x\), and \(g(X)\) is the equality constraint written so it is equal to zero. Since we end up with four equations that equal zero, we can simply use fsolve to get the solution. Many years ago I used a finite difference approximation to the derivatives. Today we use autograd to get the desired derivatives. Here it is.

import autograd.numpy as np
from autograd import grad

def F(L):
    'Augmented Lagrange function'
    x, y, z, _lambda = L
    return objective([x, y, z]) - _lambda * eq([x, y, z])

# Gradients of the Lagrange function
dfdL = grad(F, 0)

# Find L that returns all zeros in this function.
def obj(L):
    x, y, z, _lambda = L
    dFdx, dFdy, dFdz, dFdlam = dfdL(L)
    return [dFdx, dFdy, dFdz, eq([x, y, z])]

from scipy.optimize import fsolve
x, y, z, _lam = fsolve(obj, [0.0, 0.0, 0.0, 1.0])
print(f'The answer is at {x, y, z}')
The answer is at (1.0, -0.5, 0.5)

That is the same answer as before. Note we have still relied on some black box solver inside of fsolve (instead of inside minimize), but it might be more clear what problem we are solving (e.g. finding zeros). It takes a bit more work to set this up, since we have to construct the augmented function, but autograd makes it pretty convenient to set up the final objective function we want to solve.

How do we know we are at a minimum? We can check that the Hessian is positive definite in the original function we wanted to minimize. You can see here the array is positive definite, e.g. all the eigenvalues are positive. autograd makes this easy too.

from autograd import hessian
h = hessian(objective, 0)
h(np.array([x, y, z]))
array([[ 2.,  0.,  0.],
       [ 0.,  2.,  0.],
       [ 0.,  0.,  2.]])

In case it isn't evident from that structure that the eigenvalues are all positive, here we compute them:

np.linalg.eig(h(np.array([x, y, z])))[0]
array([ 2.,  2.,  2.])

In summary, autograd continues to enable advanced engineering problems to be solved.

Copyright (C) 2018 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.1.14

Discuss on Twitter

Finding the maximum power of a photovoltaic device.

| categories: python, optimization | tags:

A photovoltaic device is characterized by a current-voltage relationship. Let us say, for argument's sake, that the relationship is known and defined by

\(i = 0.5 - 0.5 * V^2\)

The voltage is highest when the current is equal to zero, but of course then you get no power. The current is highest when the voltage is zero, i.e. short-circuited, but there is again no power. We seek the highest power condition, which is to find the maximum of \(i V\). This is a constrained optimization. We solve it by creating an objective function that returns the negative of (\i V\), and then find the minimum.

First, let us examine the i-V relationship.

import matplotlib.pyplot as plt
import numpy as np

V = np.linspace(0, 1)

def i(V):
    return 0.5 - 0.5 * V**2

plt.plot(V, i(V))
<matplotlib.figure.Figure object at 0x11193ec18>
[<matplotlib.lines.Line2D object at 0x111d43668>]

Now, let us be sure there is a maximum in power.

import matplotlib.pyplot as plt
import numpy as np

V = np.linspace(0, 1)

def i(V):
    return 0.5 - 0.5 * V**2

plt.plot(V, i(V) * V)
[<matplotlib.lines.Line2D object at 0x111d437f0>]

You can see in fact there is a maximum, near V=0.6. We could solve this problem analytically by taking the appropriate derivative and solving it for zero. That still might require solving a nonlinear problem though. We will directly setup and solve the constrained optimization.

from scipy.optimize import fmin_slsqp
import numpy as np
import matplotlib.pyplot as plt

def objective(X):
    i, V = X
    return - i * V

def eqc(X):
    'equality constraint'
    i, V = X
    return (0.5 - 0.5 * V**2) - i

X0 = [0.2, 0.6]
X = fmin_slsqp(objective, X0, eqcons=[eqc])

imax, Vmax = X

V = np.linspace(0, 1)

def i(V):
    return 0.5 - 0.5 * V**2

plt.plot(V, i(V), Vmax, imax, 'ro')
Optimization terminated successfully.    (Exit mode 0)
            Current function value: -0.192450127337
            Iterations: 5
            Function evaluations: 20
            Gradient evaluations: 5
[<matplotlib.lines.Line2D object at 0x111946470>, <matplotlib.lines.Line2D object at 0x11192c518>]

You can see the maximum power is approximately 0.2 (unspecified units), at the conditions indicated by the red dot in the figure above.

Copyright (C) 2016 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.10

Discuss on Twitter

Constrained fits to data

| categories: optimization, data analysis | tags:

Our objective here is to fit a quadratic function in the least squares sense to some data, but we want to constrain the fit so that the function has specific values at the end-points. The application is to fit a function to the lattice constant of an alloy at different compositions. We constrain the fit because we know the lattice constant of the pure metals, which are at the end-points of the fit and we want these to be correct.

We define the alloy composition in terms of the mole fraction of one species, e.g. \(A_xB_{1-x}\). For \(x=0\), the alloy is pure B, whereas for \(x=1\) the alloy is pure A. According to Vegard's law the lattice constant is a linear composition weighted average of the pure component lattice constants, but sometimes small deviations are observed. Here we will fit a quadratic function that is constrained to give the pure metal component lattice constants at the end points.

The quadratic function is \(y = a x^2 + b x + c\). One constraint is at \(x=0\) where \(y = c\), or \(c\) is the lattice constant of pure B. The second constraint is at \(x=1\), where \(a + b + c\) is equal to the lattice constant of pure A. Thus, there is only one degree of freedom. \(c = LC_B\), and \(b = LC_A - c - a\), so \(a\) is our only variable.

We will solve this problem by minimizing the summed squared error between the fit and the data. We use the fmin function in scipy.optimize. First we create a fit function that encodes the constraints. Then we create an objective function that will be minimized. We have to make a guess about the value of \(a\) that minimizes the summed squared error. A line fits the data moderately well, so we guess a small value, i.e. near zero, for \(a\). Here is the solution.

import numpy as np
import matplotlib.pyplot as plt

# Data to fit to
# x=0 is pure B
# x=1 is pure A
X = np.array([0.0, 0.1,  0.25, 0.5,  0.6,  0.8,  1.0])
Y = np.array([3.9, 3.89, 3.87, 3.78, 3.75, 3.69, 3.6])

def func(a, XX):
    LC_A = 3.6
    LC_B = 3.9

    c = LC_B
    b = LC_A - c - a

    yfit = a * XX**2 + b * XX + c
    return yfit

def objective(a):
    'function to minimize'
    SSE = np.sum((Y - func(a, X))**2)
    return SSE

from scipy.optimize import fmin

a_fit = fmin(objective, 0)
plt.plot(X, Y, 'bo ')

x = np.linspace(0, 1)
plt.plot(x, func(a_fit, x))
Optimization terminated successfully.
         Current function value: 0.000445
         Iterations: 19
         Function evaluations: 38

Here is the result:

You can see that the end points go through the end-points as prescribed.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter

Gibbs energy minimization and the NIST webbook

| categories: optimization | tags: thermodynamics

Matlab post In Post 1536 we used the NIST webbook to compute a temperature dependent Gibbs energy of reaction, and then used a reaction extent variable to compute the equilibrium concentrations of each species for the water gas shift reaction.

Today, we look at the direct minimization of the Gibbs free energy of the species, with no assumptions about stoichiometry of reactions. We only apply the constraint of conservation of atoms. We use the NIST Webbook to provide the data for the Gibbs energy of each species.

As a reminder we consider equilibrium between the species \(CO\), \(H_2O\), \(CO_2\) and \(H_2\), at 1000K, and 10 atm total pressure with an initial equimolar molar flow rate of \(CO\) and \(H_2O\).

import numpy as np

T = 1000  # K
R = 8.314e-3 # kJ/mol/K

P = 10.0 # atm, this is the total pressure in the reactor
Po = 1.0 # atm, this is the standard state pressure

We are going to store all the data and calculations in vectors, so we need to assign each position in the vector to a species. Here are the definitions we use in this work.

1  CO
2  H2O
3  CO2
4  H2
species = ['CO', 'H2O', 'CO2', 'H2']

# Heats of formation at 298.15 K

Hf298 = [
    -110.53,  # CO
    -241.826, # H2O
    -393.51,  # CO2
       0.0]   # H2

# Shomate parameters for each species
#           A          B           C          D          E            F          G       H
WB = [[25.56759,  6.096130,     4.054656,  -2.671301,  0.131021, -118.0089, 227.3665,   -110.5271],  # CO
      [30.09200,  6.832514,     6.793435,  -2.534480,  0.082139, -250.8810, 223.3967,   -241.8264],  # H2O
      [24.99735,  55.18696,   -33.69137,    7.948387, -0.136638, -403.6075, 228.2431,   -393.5224],  # CO2
      [33.066178, -11.363417,  11.432816,  -2.772874, -0.158558, -9.980797, 172.707974,    0.0]]     # H2

WB = np.array(WB)

# Shomate equations
t = T/1000
T_H = np.array([t,  t**2 / 2.0, t**3 / 3.0, t**4 / 4.0, -1.0 / t, 1.0, 0.0, -1.0])
T_S = np.array([np.log(t), t,  t**2 / 2.0,  t**3 / 3.0, -1.0 / (2.0 * t**2), 0.0, 1.0, 0.0])

H =, T_H)        # (H - H_298.15) kJ/mol
S =, T_S/1000.0) # absolute entropy kJ/mol/K

Gjo = Hf298 + H - T*S      # Gibbs energy of each component at 1000 K

Now, construct the Gibbs free energy function, accounting for the change in activity due to concentration changes (ideal mixing).

def func(nj):
    nj = np.array(nj)
    Enj = np.sum(nj);
    Gj =  Gjo / (R * T) + np.log(nj / Enj * P / Po)
    return, Gj)

We impose the constraint that all atoms are conserved from the initial conditions to the equilibrium distribution of species. These constraints are in the form of \(A_{eq} n = b_{eq}\), where \(n\) is the vector of mole numbers for each species.

Aeq = np.array([[ 1,    0,    1,    0],  # C balance
                [ 1,    1,    2,    0],  # O balance
                [ 0,    2,    0,    2]]) # H balance

# equimolar feed of 1 mol H2O and 1 mol CO
beq = np.array([1,  # mol C fed
                2,  # mol O fed
                2]) # mol H fed

def ec1(nj):
    'conservation of atoms constraint'
    return, nj) - beq

Now we are ready to solve the problem.

from scipy.optimize import fmin_slsqp

n0 = [0.5, 0.5, 0.5, 0.5]  # initial guesses
N = fmin_slsqp(func, n0, f_eqcons=ec1)
print N
>>> >>> Optimization terminated successfully.    (Exit mode 0)
            Current function value: -91.204832308
            Iterations: 2
            Function evaluations: 13
            Gradient evaluations: 2
[ 0.45502309  0.45502309  0.54497691  0.54497691]

1 Compute mole fractions and partial pressures

The pressures here are in good agreement with the pressures found by other methods. The minor disagreement (in the third or fourth decimal place) is likely due to convergence tolerances in the different algorithms used.

yj = N / np.sum(N)
Pj = yj * P

for s, y, p in zip(species, yj, Pj):
    print '{0:10s}: {1:1.2f} {2:1.2f}'.format(s, y, p)
>>> >>> ... ... CO        : 0.23 2.28
H2O       : 0.23 2.28
CO2       : 0.27 2.72
H2        : 0.27 2.72

2 Computing equilibrium constants

We can compute the equilibrium constant for the reaction \(CO + H_2O \rightleftharpoons CO_2 + H_2\). Compared to the value of K = 1.44 we found at the end of Post 1536 , the agreement is excellent. Note, that to define an equilibrium constant it is necessary to specify a reaction, even though it is not necessary to even consider a reaction to obtain the equilibrium distribution of species!

nuj = np.array([-1, -1, 1, 1])  # stoichiometric coefficients of the reaction
K =**nuj)
print K
>>> 1.43446295961

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter

Finding equilibrium composition by direct minimization of Gibbs free energy on mole numbers

| categories: optimization | tags: thermodynamics

Matlab post Adapted from problem 4.5 in Cutlip and Shacham Ethane and steam are fed to a steam cracker at a total pressure of 1 atm and at 1000K at a ratio of 4 mol H2O to 1 mol ethane. Estimate the equilibrium distribution of products (CH4, C2H4, C2H2, CO2, CO, O2, H2, H2O, and C2H6).

Solution method: We will construct a Gibbs energy function for the mixture, and obtain the equilibrium composition by minimization of the function subject to elemental mass balance constraints.

import numpy as np

R = 0.00198588 # kcal/mol/K
T = 1000 # K

species = ['CH4', 'C2H4', 'C2H2', 'CO2', 'CO', 'O2', 'H2', 'H2O', 'C2H6']

# $G_^\circ for each species. These are the heats of formation for each
# species.
Gjo = np.array([4.61, 28.249, 40.604, -94.61, -47.942, 0, 0, -46.03, 26.13]) # kcal/mol

1 The Gibbs energy of a mixture

We start with \(G=\sum\limits_j n_j \mu_j\). Recalling that we define \(\mu_j = G_j^\circ + RT \ln a_j\), and in the ideal gas limit, \(a_j = y_j P/P^\circ\), and that \(y_j = \frac{n_j}{\sum n_j}\). Since in this problem, P = 1 atm, this leads to the function \(\frac{G}{RT} = \sum\limits_{j=1}^n n_j\left(\frac{G_j^\circ}{RT} + \ln \frac{n_j}{\sum n_j}\right)\).

import numpy as np

def func(nj):
    nj = np.array(nj)
    Enj = np.sum(nj);
    G = np.sum(nj * (Gjo / R / T + np.log(nj / Enj)))
    return G

2 Linear equality constraints for atomic mass conservation

The total number of each type of atom must be the same as what entered the reactor. These form equality constraints on the equilibrium composition. We express these constraints as: \(A_{eq} n = b\) where \(n\) is a vector of the moles of each species present in the mixture. CH4 C2H4 C2H2 CO2 CO O2 H2 H2O C2H6

Aeq = np.array([[0,   0,    0,   2,   1,  2,  0,  1,   0],      # oxygen balance
                [4,   4,    2,   0,   0,  0,  2,  2,   6],      # hydrogen balance
                [1,   2,    2,   1,   1,  0,  0,  0,   2]])     # carbon balance

# the incoming feed was 4 mol H2O and 1 mol ethane
beq = np.array([4,  # moles of oxygen atoms coming in
                14, # moles of hydrogen atoms coming in
                2]) # moles of carbon atoms coming in

def ec1(n):
    'equality constraint'
    return, n) - beq

def ic1(n):
    '''inequality constraint
       all n>=0
    return n

Now we solve the problem.

# initial guess suggested in the example
n0 = [1e-3, 1e-3, 1e-3, 0.993, 1.0, 1e-4, 5.992, 1.0, 1e-3] 

n0 = [0.066, 8.7e-08, 2.1e-14, 0.545, 1.39, 5.7e-14, 5.346, 1.521, 1.58e-7]

from scipy.optimize import fmin_slsqp

X = fmin_slsqp(func, n0, f_eqcons=ec1,f_ieqcons=ic1, iter=300, acc=1e-12)

for s,x in zip(species, X):
    print '{0:10s} {1:1.4g}'.format(s, x)

# check that constraints were met
print, X) - beq
print np.all( np.abs(, X) - beq) < 1e-12)
>>> >>> >>> >>> >>> >>> Optimization terminated successfully.    (Exit mode 0)
            Current function value: -104.403951524
            Iterations: 16
            Function evaluations: 193
            Gradient evaluations: 15
>>> ... ... CH4        0.06644
C2H4       9.48e-08
C2H2       1.487e-13
CO2        0.545
CO         1.389
O2         3.096e-13
H2         5.346
H2O        1.521
C2H6       1.581e-07
... [  0.00000000e+00   0.00000000e+00   4.44089210e-16]

I found it necessary to tighten the accuracy parameter to get pretty good matches to the solutions found in Matlab. It was also necessary to increase the number of iterations. Even still, not all of the numbers match well, especially the very small numbers. You can, however, see that the constraints were satisfied pretty well.

Interestingly there is a distribution of products! That is interesting because only steam and ethane enter the reactor, but a small fraction of methane is formed! The main product is hydrogen. The stoichiometry of steam reforming is ideally \(C_2H_6 + 4H_2O \rightarrow 2CO_2 + 7 H2\). Even though nearly all the ethane is consumed, we do not get the full yield of hydrogen. It appears that another equilibrium, one between CO, CO2, H2O and H2, may be limiting that, since the rest of the hydrogen is largely in the water. It is also of great importance that we have not said anything about reactions, i.e. how these products were formed.

The water gas shift reaction is: \(CO + H_2O \rightleftharpoons CO_2 + H_2\). We can compute the Gibbs free energy of the reaction from the heats of formation of each species. Assuming these are the formation energies at 1000K, this is the reaction free energy at 1000K.

G_wgs = Gjo[3] + Gjo[6] - Gjo[4] - Gjo[7]
print G_wgs

K = np.exp(-G_wgs / (R*T))
print K
>>> >>> 1.37887528109

3 Equilibrium constant based on mole numbers

One normally uses activities to define the equilibrium constant. Since there are the same number of moles on each side of the reaction all factors that convert mole numbers to activity, concentration or pressure cancel, so we simply consider the ratio of mole numbers here.

print (X[3] * X[6]) / (X[4] * X[7])

This is very close to the equilibrium constant computed above.

Clearly, there is an equilibrium between these species that prevents the complete reaction of steam reforming.

4 Summary

This is an appealing way to minimize the Gibbs energy of a mixture. No assumptions about reactions are necessary, and the constraints are easy to identify. The Gibbs energy function is especially easy to code.

Copyright (C) 2014 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.7c

Discuss on Twitter
Next Page ยป