Update on finding the minimum distance from a point to a curve

| categories: optimization | tags:

Almost 10 years ago I wrote about finding the minimum distance from a point to a curve using a constrained optimization. At that time, the way to do this used scipy.optimize.fmin_coblya. I learned today from a student, that sometimes this method fails! I reproduce the code here, updated for Python 3, some style updates, and to show it does indeed fail sometimes, notably when the point is "outside" the parabola.

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import fmin_cobyla

def f(x):
    return x**2

for P in np.array([[0.5, 2],
                   [2, 2],
                   [-1, 2],
                   [-2, 2],
                   [0, 0.5],
                   [0, -0.5]]):
    
    def objective(X):
        X = np.array(X)
        return np.linalg.norm(X - P)

    def c1(X):
        x,y = X
        return f(x) - y

    X = fmin_cobyla(objective, x0=[P[0], f(P[0])], cons=[c1])

    print(f'The minimum distance is {objective(X):1.2f}. Constraint satisfied = {c1(X) < 1e-6}')

    # Verify the vector to this point is normal to the tangent of the curve
    # position vector from curve to point
    v1 = np.array(P) - np.array(X)
    # position vector
    v2 = np.array([1, 2.0 * X[0]])
    print('dot(v1, v2) = ', np.dot(v1, v2))

    x = np.linspace(-2, 2, 100)

    plt.plot(x, f(x), 'r-', label='f(x)')
    plt.plot(P[0], P[1], 'bo', label='point')
    plt.plot([P[0], X[0]], [P[1], X[1]], 'b-', label='shortest distance')
    plt.plot([X[0], X[0] + 1], [X[1], X[1] + 2.0 * X[0]], 'g-', label='tangent')
    plt.axis('equal')
    plt.xlabel('x')
    plt.ylabel('y')    

The minimum distance is 0.86. Constraint satisfied = True dot(v1, v2) = 0.0002913487659186309 The minimum distance is 0.00. Constraint satisfied = False dot(v1, v2) = 0.00021460906432962284 The minimum distance is 0.39. Constraint satisfied = True dot(v1, v2) = 0.00014271520451364372 The minimum distance is 0.00. Constraint satisfied = False dot(v1, v2) = -0.0004089466778209598 The minimum distance is 0.50. Constraint satisfied = True dot(v1, v2) = 1.9999998429305957e-12 The minimum distance is 0.00. Constraint satisfied = False dot(v1, v2) = 8.588744170160093e-06

So, sure enough, the optimizer is failing to find a solution that meets the constraint. It is strange it does not work on the outside. That is almost certainly an algorithm problem. Here we solve it nearly identically with the more modern scipy.optimize.minimize function, and it converges every time.

from scipy.optimize import minimize

for P in np.array([[0.5, 2],
                   [2, 2],
                   [-1, 2],
                   [-2, 2],
                   [0, 0.5],
                   [0, -0.5]]):
    
    def objective(X):
        X = np.array(X)
        return np.linalg.norm(X - P)

    def c1(X):
        x,y = X
        return f(x) - y

    sol = minimize(objective, x0=[P[0], f(P[0])], constraints={'type': 'eq', 'fun': c1})
    X = sol.x

    print(f'The minimum distance is {objective(X):1.2f}. Constraint satisfied = {sol.status < 1e-6}')

    # Verify the vector to this point is normal to the tangent of the curve
    # position vector from curve to point
    v1 = np.array(P) - np.array(X)
    # position vector
    v2 = np.array([1, 2.0 * X[0]])
    print('dot(v1, v2) = ', np.dot(v1, v2))

    x = np.linspace(-2, 2, 100)

    plt.plot(x, f(x), 'r-', label='f(x)')
    plt.plot(P[0], P[1], 'bo', label='point')
    plt.plot([P[0], X[0]], [P[1], X[1]], 'b-', label='shortest distance')
    plt.plot([X[0], X[0] + 1], [X[1], X[1] + 2.0 * X[0]], 'g-', label='tangent')
    plt.axis('equal')
    plt.xlabel('x')
    plt.ylabel('y')

The minimum distance is 0.86. Constraint satisfied = True dot(v1, v2) = 1.0701251773603815e-08 The minimum distance is 0.55. Constraint satisfied = True dot(v1, v2) = -0.0005793028003104883 The minimum distance is 0.39. Constraint satisfied = True dot(v1, v2) = -1.869272921939391e-05 The minimum distance is 0.55. Constraint satisfied = True dot(v1, v2) = 0.0005792953298950909 The minimum distance is 0.50. Constraint satisfied = True dot(v1, v2) = 0.0 The minimum distance is 0.50. Constraint satisfied = True dot(v1, v2) = 0.0

There is no wisdom in fixing the first problem, here I just tried a newer optimization method. Out of the box with default settings it just worked. I did learn the answer is sensitive to the initial guess, so it could make sense to sample the function and find the point that is closest as the initial guess, but here the simple heuristic guess I used worked fine.

Copyright (C) 2023 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.5.5

Discuss on Twitter

Constrained optimization with Lagrange multipliers and autograd

| categories: autograd, optimization | tags:

Constrained optimization is common in engineering problems solving. A prototypical example (from Greenberg, Advanced Engineering Mathematics, Ch 13.7) is to find the point on a plane that is closest to the origin. The plane is defined by the equation \(2x - y + z = 3\), and we seek to minimize \(x^2 + y^2 + z^2\) subject to the equality constraint defined by the plane. scipy.optimize.minimize provides a pretty convenient interface to solve a problem like this, ans shown here.

import numpy as np
from scipy.optimize import minimize

def objective(X):
    x, y, z = X
    return x**2 + y**2 + z**2

def eq(X):
    x, y, z = X
    return 2 * x - y + z - 3

sol = minimize(objective, [1, -0.5, 0.5], constraints={'type': 'eq', 'fun': eq})
sol
    fun: 1.5
    jac: array([ 2.00000001, -0.99999999,  1.00000001])
message: 'Optimization terminated successfully.'
   nfev: 5
    nit: 1
   njev: 1
 status: 0
success: True
      x: array([ 1. , -0.5,  0.5])

I like the minimize function a lot, although I am not crazy for how the constraints are provided. The alternative used to be that there was an argument for equality constraints and another for inequality constraints. Analogous to scipy.integrate.solve_ivp event functions, they could have also used function attributes.

Sometimes, it might be desirable to go back to basics though, especially if you are unaware of the minimize function or perhaps suspect it is not working right and want an independent answer. Next we look at how to construct this constrained optimization problem using Lagrange multipliers. This converts the problem into an augmented unconstrained optimization problem we can use fsolve on. The gist of this method is we formulate a new problem:

\(F(X) = f(X) - \lambda g(X)\)

and then solve the simultaneous resulting equations:

\(F_x(X) = F_y(X) = F_z(X) = g(X) = 0\) where \(F_x\) is the derivative of \(f*\) with respect to \(x\), and \(g(X)\) is the equality constraint written so it is equal to zero. Since we end up with four equations that equal zero, we can simply use fsolve to get the solution. Many years ago I used a finite difference approximation to the derivatives. Today we use autograd to get the desired derivatives. Here it is.

import autograd.numpy as np
from autograd import grad

def F(L):
    'Augmented Lagrange function'
    x, y, z, _lambda = L
    return objective([x, y, z]) - _lambda * eq([x, y, z])

# Gradients of the Lagrange function
dfdL = grad(F, 0)

# Find L that returns all zeros in this function.
def obj(L):
    x, y, z, _lambda = L
    dFdx, dFdy, dFdz, dFdlam = dfdL(L)
    return [dFdx, dFdy, dFdz, eq([x, y, z])]

from scipy.optimize import fsolve
x, y, z, _lam = fsolve(obj, [0.0, 0.0, 0.0, 1.0])
print(f'The answer is at {x, y, z}')
The answer is at (1.0, -0.5, 0.5)


That is the same answer as before. Note we have still relied on some black box solver inside of fsolve (instead of inside minimize), but it might be more clear what problem we are solving (e.g. finding zeros). It takes a bit more work to set this up, since we have to construct the augmented function, but autograd makes it pretty convenient to set up the final objective function we want to solve.

How do we know we are at a minimum? We can check that the Hessian is positive definite in the original function we wanted to minimize. You can see here the array is positive definite, e.g. all the eigenvalues are positive. autograd makes this easy too.

from autograd import hessian
h = hessian(objective, 0)
h(np.array([x, y, z]))
array([[ 2.,  0.,  0.],
       [ 0.,  2.,  0.],
       [ 0.,  0.,  2.]])

In case it isn't evident from that structure that the eigenvalues are all positive, here we compute them:

np.linalg.eig(h(np.array([x, y, z])))[0]
array([ 2.,  2.,  2.])

In summary, autograd continues to enable advanced engineering problems to be solved.

Copyright (C) 2018 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.1.14

Discuss on Twitter

Finding the maximum power of a photovoltaic device.

| categories: python, optimization | tags:

A photovoltaic device is characterized by a current-voltage relationship. Let us say, for argument's sake, that the relationship is known and defined by

\(i = 0.5 - 0.5 * V^2\)

The voltage is highest when the current is equal to zero, but of course then you get no power. The current is highest when the voltage is zero, i.e. short-circuited, but there is again no power. We seek the highest power condition, which is to find the maximum of \(i V\). This is a constrained optimization. We solve it by creating an objective function that returns the negative of (\i V\), and then find the minimum.

First, let us examine the i-V relationship.

import matplotlib.pyplot as plt
import numpy as np

V = np.linspace(0, 1)

def i(V):
    return 0.5 - 0.5 * V**2

plt.figure()
plt.plot(V, i(V))
plt.savefig('images/iV.png')
<matplotlib.figure.Figure object at 0x11193ec18>
[<matplotlib.lines.Line2D object at 0x111d43668>]

Now, let us be sure there is a maximum in power.

import matplotlib.pyplot as plt
import numpy as np

V = np.linspace(0, 1)

def i(V):
    return 0.5 - 0.5 * V**2

plt.plot(V, i(V) * V)
plt.savefig('images/P1.png')
[<matplotlib.lines.Line2D object at 0x111d437f0>]

You can see in fact there is a maximum, near V=0.6. We could solve this problem analytically by taking the appropriate derivative and solving it for zero. That still might require solving a nonlinear problem though. We will directly setup and solve the constrained optimization.

from scipy.optimize import fmin_slsqp
import numpy as np
import matplotlib.pyplot as plt

def objective(X):
    i, V = X
    return - i * V

def eqc(X):
    'equality constraint'
    i, V = X
    return (0.5 - 0.5 * V**2) - i

X0 = [0.2, 0.6]
X = fmin_slsqp(objective, X0, eqcons=[eqc])

imax, Vmax = X


V = np.linspace(0, 1)

def i(V):
    return 0.5 - 0.5 * V**2

plt.plot(V, i(V), Vmax, imax, 'ro')
plt.savefig('images/P2.png')
Optimization terminated successfully.    (Exit mode 0)
            Current function value: -0.192450127337
            Iterations: 5
            Function evaluations: 20
            Gradient evaluations: 5
[<matplotlib.lines.Line2D object at 0x111946470>, <matplotlib.lines.Line2D object at 0x11192c518>]

You can see the maximum power is approximately 0.2 (unspecified units), at the conditions indicated by the red dot in the figure above.

Copyright (C) 2016 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.10

Discuss on Twitter

Constrained fits to data

| categories: data analysis, optimization | tags:

Our objective here is to fit a quadratic function in the least squares sense to some data, but we want to constrain the fit so that the function has specific values at the end-points. The application is to fit a function to the lattice constant of an alloy at different compositions. We constrain the fit because we know the lattice constant of the pure metals, which are at the end-points of the fit and we want these to be correct.

We define the alloy composition in terms of the mole fraction of one species, e.g. \(A_xB_{1-x}\). For \(x=0\), the alloy is pure B, whereas for \(x=1\) the alloy is pure A. According to Vegard's law the lattice constant is a linear composition weighted average of the pure component lattice constants, but sometimes small deviations are observed. Here we will fit a quadratic function that is constrained to give the pure metal component lattice constants at the end points.

The quadratic function is \(y = a x^2 + b x + c\). One constraint is at \(x=0\) where \(y = c\), or \(c\) is the lattice constant of pure B. The second constraint is at \(x=1\), where \(a + b + c\) is equal to the lattice constant of pure A. Thus, there is only one degree of freedom. \(c = LC_B\), and \(b = LC_A - c - a\), so \(a\) is our only variable.

We will solve this problem by minimizing the summed squared error between the fit and the data. We use the fmin function in scipy.optimize. First we create a fit function that encodes the constraints. Then we create an objective function that will be minimized. We have to make a guess about the value of \(a\) that minimizes the summed squared error. A line fits the data moderately well, so we guess a small value, i.e. near zero, for \(a\). Here is the solution.

import numpy as np
import matplotlib.pyplot as plt

# Data to fit to
# x=0 is pure B
# x=1 is pure A
X = np.array([0.0, 0.1,  0.25, 0.5,  0.6,  0.8,  1.0])
Y = np.array([3.9, 3.89, 3.87, 3.78, 3.75, 3.69, 3.6])

def func(a, XX):
    LC_A = 3.6
    LC_B = 3.9

    c = LC_B
    b = LC_A - c - a

    yfit = a * XX**2 + b * XX + c
    return yfit

def objective(a):
    'function to minimize'
    SSE = np.sum((Y - func(a, X))**2)
    return SSE


from scipy.optimize import fmin

a_fit = fmin(objective, 0)
plt.plot(X, Y, 'bo ')

x = np.linspace(0, 1)
plt.plot(x, func(a_fit, x))
plt.savefig('images/constrained-quadratic-fit.png')
Optimization terminated successfully.
         Current function value: 0.000445
         Iterations: 19
         Function evaluations: 38

Here is the result:

You can see that the end points go through the end-points as prescribed.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter

Gibbs energy minimization and the NIST webbook

| categories: optimization | tags: thermodynamics

Matlab post In Post 1536 we used the NIST webbook to compute a temperature dependent Gibbs energy of reaction, and then used a reaction extent variable to compute the equilibrium concentrations of each species for the water gas shift reaction.

Today, we look at the direct minimization of the Gibbs free energy of the species, with no assumptions about stoichiometry of reactions. We only apply the constraint of conservation of atoms. We use the NIST Webbook to provide the data for the Gibbs energy of each species.

As a reminder we consider equilibrium between the species \(CO\), \(H_2O\), \(CO_2\) and \(H_2\), at 1000K, and 10 atm total pressure with an initial equimolar molar flow rate of \(CO\) and \(H_2O\).

import numpy as np

T = 1000  # K
R = 8.314e-3 # kJ/mol/K

P = 10.0 # atm, this is the total pressure in the reactor
Po = 1.0 # atm, this is the standard state pressure

We are going to store all the data and calculations in vectors, so we need to assign each position in the vector to a species. Here are the definitions we use in this work.

1  CO
2  H2O
3  CO2
4  H2
species = ['CO', 'H2O', 'CO2', 'H2']

# Heats of formation at 298.15 K

Hf298 = [
    -110.53,  # CO
    -241.826, # H2O
    -393.51,  # CO2
       0.0]   # H2

# Shomate parameters for each species
#           A          B           C          D          E            F          G       H
WB = [[25.56759,  6.096130,     4.054656,  -2.671301,  0.131021, -118.0089, 227.3665,   -110.5271],  # CO
      [30.09200,  6.832514,     6.793435,  -2.534480,  0.082139, -250.8810, 223.3967,   -241.8264],  # H2O
      [24.99735,  55.18696,   -33.69137,    7.948387, -0.136638, -403.6075, 228.2431,   -393.5224],  # CO2
      [33.066178, -11.363417,  11.432816,  -2.772874, -0.158558, -9.980797, 172.707974,    0.0]]     # H2

WB = np.array(WB)

# Shomate equations
t = T/1000
T_H = np.array([t,  t**2 / 2.0, t**3 / 3.0, t**4 / 4.0, -1.0 / t, 1.0, 0.0, -1.0])
T_S = np.array([np.log(t), t,  t**2 / 2.0,  t**3 / 3.0, -1.0 / (2.0 * t**2), 0.0, 1.0, 0.0])

H = np.dot(WB, T_H)        # (H - H_298.15) kJ/mol
S = np.dot(WB, T_S/1000.0) # absolute entropy kJ/mol/K

Gjo = Hf298 + H - T*S      # Gibbs energy of each component at 1000 K

Now, construct the Gibbs free energy function, accounting for the change in activity due to concentration changes (ideal mixing).

def func(nj):
    nj = np.array(nj)
    Enj = np.sum(nj);
    Gj =  Gjo / (R * T) + np.log(nj / Enj * P / Po)
    return np.dot(nj, Gj)

We impose the constraint that all atoms are conserved from the initial conditions to the equilibrium distribution of species. These constraints are in the form of \(A_{eq} n = b_{eq}\), where \(n\) is the vector of mole numbers for each species.

Aeq = np.array([[ 1,    0,    1,    0],  # C balance
                [ 1,    1,    2,    0],  # O balance
                [ 0,    2,    0,    2]]) # H balance

# equimolar feed of 1 mol H2O and 1 mol CO
beq = np.array([1,  # mol C fed
                2,  # mol O fed
                2]) # mol H fed

def ec1(nj):
    'conservation of atoms constraint'
    return np.dot(Aeq, nj) - beq

Now we are ready to solve the problem.

from scipy.optimize import fmin_slsqp

n0 = [0.5, 0.5, 0.5, 0.5]  # initial guesses
N = fmin_slsqp(func, n0, f_eqcons=ec1)
print N
>>> >>> Optimization terminated successfully.    (Exit mode 0)
            Current function value: -91.204832308
            Iterations: 2
            Function evaluations: 13
            Gradient evaluations: 2
[ 0.45502309  0.45502309  0.54497691  0.54497691]

1 Compute mole fractions and partial pressures

The pressures here are in good agreement with the pressures found by other methods. The minor disagreement (in the third or fourth decimal place) is likely due to convergence tolerances in the different algorithms used.

yj = N / np.sum(N)
Pj = yj * P

for s, y, p in zip(species, yj, Pj):
    print '{0:10s}: {1:1.2f} {2:1.2f}'.format(s, y, p)
>>> >>> ... ... CO        : 0.23 2.28
H2O       : 0.23 2.28
CO2       : 0.27 2.72
H2        : 0.27 2.72

2 Computing equilibrium constants

We can compute the equilibrium constant for the reaction \(CO + H_2O \rightleftharpoons CO_2 + H_2\). Compared to the value of K = 1.44 we found at the end of Post 1536 , the agreement is excellent. Note, that to define an equilibrium constant it is necessary to specify a reaction, even though it is not necessary to even consider a reaction to obtain the equilibrium distribution of species!

nuj = np.array([-1, -1, 1, 1])  # stoichiometric coefficients of the reaction
K = np.prod(yj**nuj)
print K
>>> 1.43446295961

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter
Next Page ยป