TOC

Introduction to nonlinear algebra#

  • KEYWORDS: scipy.optimize.root, scipy.misc.derivative, list comprehension

In non-linear algebra, we seek solutions to the equation \(f(x) = 0\) where \(f(x)\) is non-linear in \(x\). These are examples of non-linear algebraic equations:

  • \(e^x=4\)

  • \(x^2 + x - 1 = 0\)

  • \(f(F_A) = F_{A0} - F_{A} - k (F_A / \nu)^2 V = 0\)

Sometimes you can derive an analytical solution for these, and that is desirable because then we just evaluate the solution. Often it is not easy to do that, and we need solutions anyway.

In some cases we can tell how many solutions there will be. For example, we know how many roots a polynomial will have, the same number as the polynomial order. In other cases though, there is not a general theory for whether there is a solution, multiple solutions, or no solution to nonlinear algebraic equations. For example,

\(sin(x) = 2\) has no solution. We define \(f(x) = sin(x) - 2\) and plot it. You can see there no intersections with the x-axis at y=0, meaning no solutions.

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10)


def f(x):
    return np.sin(x) - 2


plt.plot(x, f(x))
plt.axhline(0, ls="--")
plt.xlabel("x")
plt.ylabel("y");
../../_images/c06207ec8000c0f1c1fb1b5be4abf78d595d1101b744cea300d47bb5f9180a48.png

In contrast, \(sin(x) = 0.5\) will have an infinite number of solutions, everywhere the function intersects the x-axis.

def f2(x):
    return np.sin(x) - 0.5


plt.plot(x, f2(x))
plt.axhline(0, ls="--")
plt.xlabel("x")
plt.ylabel("y");
../../_images/b753f01b38915e92d442c09e3cae65fccca0871efe0313a6d7330b4a1d80193d.png

Finally, \(sin(x) = x - 1\) has only one solution.

def f3(x):
    return np.sin(x) - (x - 1)


X = np.linspace(-20, 20)
plt.plot(x, f3(X))
plt.axhline(0, ls="--")
plt.xlabel("x")
plt.ylabel("y");
../../_images/8b092ffeede4d55f34d710740c0f7d139e21b7afa7360b7aa3ebf5f5fc270641.png

The equation \(e^{-0.5 x} \sin(x) = 0.5\), evidently has two solutions, but other versions of this equation might have 0, 1, multiple or infinite solutions.

def f3(x):
    return np.exp(-0.5 * x) * np.sin(x) - 0.50


x = np.linspace(0, 10, 500)

plt.plot(x, f3(x))
plt.axhline(0, ls="--")
plt.title("$e^{-0.5 x} \sin(x) = 0.5$")
plt.xlabel("x")
plt.ylabel("objective");
../../_images/a9e31d1b006af5ff71f672ea450bf7e09f3450f444f9392b67bcb23258f2d8ca.png

exercise modify the equation to see 0, 1, many or infinite solutions.

Graphical methods like this are invaluable to visually assess if there are any solutions, and if so how many solutions at least over some range of solutions. Sometimes, this is the fastest way to estimate a solution. Here we focus on nonlinear algebra problems that cannot be analytically solved. These kinds of problems require an iterative solution approach.

Root finding with solve_ivp#

When we find solutions to these equations, it is called root finding. Typically we look for roots where the function is equal to zero, and set up the function so that these are solutions we want. We previously saw in a homework that you can use solve_ivp for this by deriving the derivative of the function, then integrating with an initial condition and an event function to detect the zeros.

This approach has advantages:

  1. If the integrator takes small steps, we can get all the roots in an integration span.

It also has some disadvantages:

  1. We have to derive and implement a derivative function and initial condition, which may be tedious or difficult, and in some cases may not be possible.

Today we explore alternative approaches that avoid the disadvantage above, but lack the advantage too. These methods are iterative, and require an initial guess for the solution.

Newton-Raphson method for finding solutions#

Notes adapted from https://en.wikipedia.org/wiki/Newton%27s_method.

The key idea is that we start with a guess that is close to the solution, and then the function is approximated by a line tangent to the function to find where the line intersects the x-axis. For well-behaved functions, this is a better estimate of where the function equals zero than the first guess. Then, we repeat this until we get sufficiently close to zero.

image.png

So, we start with the point \((x_0, f(x_0))\), and we compute \(f'(x_0)\), which is the slope of the line tangent to \(f(x_0)\). We can express an equation for this line as: \(y - f(x_0) = f'(x_0)(x - x_0)\) If we now solve this for the \(x\) where \(y=0\) leads to:

\(0 = f'(x_0)(x - x_0) + f(x_0)\)

which leads to

\(x = x_0 - f(x_0) / f'(x_0)\)

To implement this, we need to decide what is the tolerance for defining 0, and what is the maximum number of iterations we want to consider?

We will first consider what is the square root of 612? This is equivalent to finding a solution to \(x^2 = 612\)

\(f(x) = x^2 - 612\)

Let’s start with a guess of x=25, since we know \(x^2=625\). We also know \(f'(x) = 2x\).

The approach is iterative, so we will specify the maximum number of steps to iterate to, and a criteria for stopping.

24**2, 25**2  # These bracket the solution
(576, 625)

I like to plot and visualize where the solutions might be. Here we see two solutions (intersections with zero) near \(\pm 24\).

X = np.linspace(-25, 25)
plt.plot(X, X**2 - 612)
plt.axhline(0, color='k')
plt.xlabel('x')
plt.ylabel('f(x)');
../../_images/a535d17147447832c906c83f067951b22a02edf5ac940188082252b906ca4ff2.png
x0 = 24  # initial guess

Nmax = 25  # stop if we hit this many iterations
TOL = 1e-12  # stop if we are less than this number


def f(x):
    "The function to solve."
    return x**2 - 612


def fprime(x):
    "Derivative of the function to solve."
    return 2 * x


# Here is the iterative solution
for i in range(Nmax):
    xnew = x0 - f(x0) / fprime(x0)
    x0 = xnew

    print(f"{i}: {xnew}")
    if np.abs(f(xnew)) < TOL:
        break

print(xnew, xnew**2)
0: 24.75
1: 24.738636363636363
2: 24.7386337537061
3: 24.73863375370596
24.73863375370596 611.9999999999999

That is pretty remarkable, it only took a few iterations. That is partly because we started pretty close to the answer. Try this again with different initial guesses and see how the number of iterations changes. Also try with negative numbers. There are two solutions that are possible, and the one you get depends on the initial guess.

One reason it takes so few iterations here is that Newton’s method converges quadratically when you are close to the solution, and in this simple case we have a quadratic function, so we get to the answer in just a few steps.

Problem problems#

There are pathological situations you can get into. Consider this simple looking polynomial:

def f(x):
    return x**3 - 2 * x + 2


def fprime(x):
    return 3 * x**2 - 2


x = np.linspace(-2, 2)
plt.plot(x, f(x))
plt.axhline(0)
plt.xlabel("x")
plt.ylabel("f(x)");
../../_images/85ae5f1f21f083a73a6778596cd572fde0fee18b45ca4690ea6cea81d71ee5a3.png

It seems obvious there is a root near -1.7. But if you use a guess around x=0, the algorithm simply oscillates back and forth and never converges. Let’s see:

x0 = 1

for i in range(Nmax):
    xnew = x0 - f(x0) / fprime(x0)
    x0 = xnew
    print(f"{i}: {xnew}")
    if np.abs(f(xnew)) < TOL:
        break


print(xnew)
0: 0.0
1: 1.0
2: 0.0
3: 1.0
4: 0.0
5: 1.0
6: 0.0
7: 1.0
8: 0.0
9: 1.0
10: 0.0
11: 1.0
12: 0.0
13: 1.0
14: 0.0
15: 1.0
16: 0.0
17: 1.0
18: 0.0
19: 1.0
20: 0.0
21: 1.0
22: 0.0
23: 1.0
24: 0.0
0.0

Exercise: Try several initial guesses, and see which ones converge.

You can also run into problems when:

  • \(f'(x) = 0\) at the initial guess, or a subsequent update, then you get a singularity in the update.

  • The first derivative is discontinuous at the root. Then you may not converge because the update can bounce back and forth.

  • The first derivative is undefined at the root

We do not frequently run into these issues, but they do occur from time to time. The solution is usually to use a better initial guess.

Derivatives of functions#

When you can derive an analytical derivative, you should probably consider doing that, because otherwise we have to approximate the derivatives numerically using finite differences, which is less accurate and computationally more expensive, or we need to use advance libraries that are capable of finding derivatives automatically. We will first see how to use the finite differences approach, and later learn about the automatic approach.

scipy has deprecated scipy.misc.derivative function and will remove it soon. We will consider numdifftools instead. You provide a function, an x-value that you want the derivative at, and a dx to use in a finite-difference formula. By default, three points are used in the difference formula. You want to use a small dx to get an accurate result, but not too small or you can get numerical errors.

exercise: Try this out with different values of dx from 0.1 to 1e-15.

import numdifftools as nd
nd.Derivative?
import numpy as np

def f(x):
    return x**3


x0 = 12  # we want f'(12)

for dx in np.geomspace(1e-15, 1, 10):
    print(f"{dx:1.2e}", nd.Derivative(f, step=dx)(x0))
    
# Note the default step size is better in this case
nd.Derivative(f)(x0), 3 * x0**2  # the numerical and analytical derivative
1.00e-15 614.4
4.64e-14 431.1578947368421
2.15e-12 432.05771410903844
1.00e-10 432.00085265121237
4.64e-09 431.9999464213735
2.15e-07 432.0000003133145
1.00e-05 431.999999975131
4.64e-04 432.00000021491917
2.15e-02 432.0004641588817
1.00e+00 433.0
(array(432.), 432)

Let’s wrap the Newton method in a function and use nd.Derivative to get the derivative.

def newton(func, x0, tolerance=1e-6, nmax=10):
    if nd.Derivative(func)(x0) == 0:
        print("Division by zero!!!")
        return None
    for i in range(nmax):
        xnew = x0 - func(x0) / nd.Derivative(func)(x0)
        x0 = xnew
        if np.abs(func(xnew)) < tolerance:
            return xnew

    print("Max iterations exceeded")
    return None

Now, we have a pretty convenient way to solve equations:

def f(x):
    return x**2 - 612


newton(f, 25), np.sqrt(612)
(np.float64(24.738633753705965), np.float64(24.73863375370596))

To make this work on an array of inputs, we have to use a list comprehension, or an explicit loop. Here we try a bunch of different guesses.

[newton(f, g) for g in np.linspace(0, 30, 10)]
Division by zero!!!
[None,
 np.float64(24.738633753706004),
 np.float64(24.738633753705987),
 np.float64(24.738633753766084),
 np.float64(24.73863375370596),
 np.float64(24.73863375392133),
 np.float64(24.738633753705976),
 np.float64(24.73863375373235),
 np.float64(24.738633753899723),
 np.float64(24.738633753705965)]

This is the basic idea behind nonlinear algebra solvers. Similar to the ode solver we used, there are functions in scipy written to solve nonlinear equations. We consider these next.

Tip

A common theme in this course will be exploring how numerical algorithms work and implementing basic versions of them, and then transitioning to more sophisticated library functions that are more robust and reliable, and which frequently have much more functionality.

root#

scipy.optimize.root is the main function we will use to solve nonlinear algebra problems. root can be used with functions where you have the derivative, and where you don’t.

from scipy.optimize import root
??root

Let’s see the simplest example.

Solve \(e^x = 2\).

It is a good idea to plot this to get a good initial guess.

def objective(x):
    return np.exp(x) - 2  # equal to zero at the solution


x = np.linspace(0, 2)
plt.plot(x, objective(x))
plt.xlabel("x")
plt.ylabel("objective(x)")
plt.axhline(0, ls="--");
../../_images/a420c5e1824d5748c771b34a18e07b5d766dee3309717bf24e6ea807fb72395b.png
root(objective, 0.6), np.log(2)
( message: The solution converged.
  success: True
   status: 1
      fun: [ 0.000e+00]
        x: [ 6.931e-01]
   method: hybr
     nfev: 9
     fjac: [[-1.000e+00]]
        r: [-2.000e+00]
      qtf: [-1.001e-10],
 np.float64(0.6931471805599453))

Note that the result is an array. We can unpack the array with this syntax. Note the comma. Why a comma? it indicates to Python that the results should be unpacked into the variable in a special way, i.e. the first value of the result goes into the first variable. That is all there is in this case.

This is the preferred way to get the value of the solution into x:

sol = root(objective, 2)
print(f"x = {sol.x[0]:1.2f}")
x = 0.69

Here are two checks on the answer. Both should be practically zero.

objective(sol.x), sol.x - np.log(2)
(array([4.4408921e-16]), array([2.22044605e-16]))

Here is an example with no solution, and a different status flag.

def objective2(x):
    return np.exp(x) + 2


root(objective2, 2)
 message: The iteration is not making good progress, as measured by the 
           improvement from the last ten iterations.
 success: False
  status: 5
     fun: [ 2.000e+00]
       x: [-2.807e+01]
  method: hybr
    nfev: 22
    fjac: [[ 1.000e+00]]
       r: [ 6.750e-13]
     qtf: [ 2.000e+00]

You can see in this plot there is no zero.

X = np.linspace(-30, 3)
plt.plot(X, objective2(X))
plt.xlabel('x');
plt.ylabel('objective(x)');
../../_images/c500f6519d2b0b4f583d795c652e6fdb9a367bc53c64b641b4994e114093b897.png

A worked example#

We can integrate root with a variety of other problems. For example, here is an integral equation we need to solve in engineering problems. The volume of a plug flow reactor can be defined by this equation: \(V = \int_{Fa(V=0)}^{Fa} \frac{1}{r_a} dFa\) where \(r_a\) is the rate law. Suppose we know the reactor volume is 100 L, the inlet concentration of A is 1 mol/L, the volumetric flow is 10 L/min, and \(r_a = -k Ca\), with \(k=0.23\) 1/min. What is the exit molar flow rate? We need to solve the following equation:

\[100 = \int_{Fa(V=0)}^{Fa} \frac{1}{-k Fa/\nu} dFa\]

The equation to solve here is:

\(f(Fa) = 100 - \int_{Fa(V=0)}^{Fa} \frac{1}{-k Fa/\nu} dFa = 0\).

Note

This is a good example of a problem where it would be tedious (but not impossible) to compute a derivative. You would have to employ the fundamental theorem of Calculus to do that here.

import numpy as np
from scipy.integrate import quad
from scipy.optimize import root

k = 0.23  # 1 / min
nu = 10.0  # L / min
Cao = 1.0  # mol / L
Fa0 = Cao * nu


def integrand(Fa):
    return -1.0 / (k * Fa / nu)


def objective(Fa):
    integral, err = quad(integrand, Fa0, Fa)
    return 100.0 - integral


Fa0
10.0

To make a plot, there is a subtlety. We cannot integrate an array of \(F_A\) values. Previously, we used a for loop to get around this. There is another syntax called list comprehension that we can also use:

[objective(fa) for fa in [0.01, 0.1, 1, 2]]
[-200.33718604270166,
 -100.22479069513437,
 -0.11239534756747105,
 30.024438589821685]

You can already see the answer must be between 1 and 2 because the sign changes between these two values, and that it is closer to 1 than 2.

fa = np.linspace(0.01, 2)
obj = [objective(f) for f in fa]
plt.plot(fa, obj)
plt.xlabel("Molar flow rate (mol/min)")
plt.ylabel("objective")
plt.axhline(0, color="k", linestyle="--")
plt.axvline(1, color="k", linestyle="--");
../../_images/5b2a35b19c00fd1d2f838dba0f63cc7a4531ecc38e1ea5220d4bbae8aa7fe478.png

You can see there is one answer in this range, near a flow rate of 1 mol/min. We use that as an initial guess for root:

# Advanced method to make a function act like it is vectorized
vobj = np.vectorize(objective)
plt.plot(fa, vobj(fa))
plt.xlabel("Molar flow rate (mol/min)")
plt.ylabel("objective")
plt.axhline(0, color="k", linestyle="--")
plt.axvline(1, color="k", linestyle="--");
../../_images/5b2a35b19c00fd1d2f838dba0f63cc7a4531ecc38e1ea5220d4bbae8aa7fe478.png
Fa_guess = 1.0
sol = root(objective, Fa_guess)
Fa_exit = sol.x[0]
print(f"The exit flow rate is {Fa_exit:1.4f} mol/min.")
The exit flow rate is 1.0026 mol/min.
objective(Fa_exit)  # this should be close to zero.
-3.268496584496461e-13
sol
 message: The solution converged.
 success: True
  status: 1
     fun: -3.268496584496461e-13
       x: [ 1.003e+00]
  method: hybr
    nfev: 7
    fjac: [[-1.000e+00]]
       r: [-4.337e+01]
     qtf: [ 1.873e-07]

Parameterized objective functions#

Now, suppose we want to see how our solution varies with a parameter value. For example, we can change the rate constant by changing the temperature. Say we want to compute the exit molar flow rate at a range of rate constants, e.g. from 0.02 to 2 1/min. In other words, we treat the rate constant as a parameter and use it in an additional argument.

def integrand(Fa, k):
    return -1.0 / (k * Fa / nu)


def objective(Fa, k):
    integral, err = quad(integrand, Fa0, Fa, args=(k,))
    return 100.0 - integral


KRANGE = np.linspace(0.02, 2)

fa_exit = np.zeros(KRANGE.shape)

guess = 0.1

for i, k in enumerate(KRANGE):
    sol = root(objective, guess, args=(k,))
    if sol.status == 1:
        fa_exit[i] = sol.x
        guess = sol.x
    else:
        print(f"k = {k} failed. {msg}")

plt.plot(KRANGE, fa_exit)
plt.xlabel("k (1/min)")
plt.ylabel("Conversion");
/tmp/ipykernel_2678/250610355.py:19: DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)
  fa_exit[i] = sol.x
../../_images/9158a2baddb9d2d8e01af43affdaf66a687de06cfb8f6f6c0ae79d753829702c.png

You can see here that any rate constant above about 0.5 1/min leads to near complete conversion, so heating above the temperature required for this would be wasteful.

# alternative way to use variable scoping and nested function definitions
KRANGE = np.linspace(0.02, 2)
fa_exit = np.zeros(KRANGE.shape)
guess = 0.1

for i, k in enumerate(KRANGE):

    def integrand(Fa):
        return -1 / (k * Fa / nu)

    def objective(Fa):
        integral, err = quad(integrand, Fa0, Fa)
        return 100.0 - integral

    sol = root(objective, guess)
    if sol.status == 1:
        fa_exit[i] = sol.x[0]
        guess = sol.x

plt.plot(KRANGE, fa_exit)
plt.xlabel("k (1/min)")
plt.ylabel("Conversion");
../../_images/9158a2baddb9d2d8e01af43affdaf66a687de06cfb8f6f6c0ae79d753829702c.png

Summary#

In this lecture we reviewed methods to solve non-linear algebraic equations (they also work on linear algebra, but it is considered wasteful since there are more efficient methods to solve those).

  • The key idea is to create a function that is equal to zero at the solution, and then use scipy.optimize.root with an initial guess to find the solution.

  • We introduced list comprehension which is a convenient syntax for for loops.

  • We also looked at numdifftools.Derivative which is a convenient way to numerically estimate the derivative of a function by finite difference formulas.

from jupyterquiz import display_quiz
display_quiz('.quiz.json')