Solving ODEs with a neural network and autograd

| categories: ode, autograd | tags: |

In the last post I explored using a neural network to solve a BVP. Here, I expand the idea to solving an initial value ordinary differential equation. The idea is basically the same; we just have a slightly different objective function. The equation to solve is:

\(dCa/dt = -k Ca(t)\) where \(Ca(t=0) = 2.0\).

Here is the code that solves this equation, along with a comparison to the analytical solution: \(Ca(t) = Ca0 \exp(-k t)\).

import autograd.numpy as np
from autograd import grad, elementwise_grad
import autograd.numpy.random as npr
from autograd.misc.optimizers import adam

def init_random_params(scale, layer_sizes, rs=npr.RandomState(0)):
    """Build a list of (weights, biases) tuples, one for each layer."""
    return [(rs.randn(insize, outsize) * scale,   # weight matrix
             rs.randn(outsize) * scale)           # bias vector
            for insize, outsize in zip(layer_sizes[:-1], layer_sizes[1:])]


def swish(x):
    "see https://arxiv.org/pdf/1710.05941.pdf"
    return x / (1.0 + np.exp(-x))


def Ca(params, inputs):
    "Neural network function for Ca(t); the last layer is linear."
    for W, b in params:
        outputs = np.dot(inputs, W) + b
        inputs = swish(outputs)
    return outputs

    
# Here is our initial guess of params:
params = init_random_params(0.1, layer_sizes=[1, 8, 1])

# Derivative of the network output with respect to its time input (argument 1)
dCadt = elementwise_grad(Ca, 1)

k = 0.23
Ca0 = 2.0
t = np.linspace(0, 10).reshape((-1, 1))

# This is the function we seek to minimize
def objective(params, step):
    # These should all be zero at the solution
    # dCadt = -k * Ca(t)
    zeq = dCadt(params, t) - (-k * Ca(params, t))
    ic = Ca(params, 0) - Ca0  # initial condition residual
    return np.mean(zeq**2) + ic**2

def callback(params, step, g):
    if step % 1000 == 0:
        print("Iteration {0:3d} objective {1}".format(step,
                                                      objective(params, step)))

params = adam(grad(objective), params,
              step_size=0.001, num_iters=5001, callback=callback) 


tfit = np.linspace(0, 20).reshape(-1, 1)
import matplotlib.pyplot as plt
plt.plot(tfit, Ca(params, tfit), label='soln')
plt.plot(tfit, Ca0 * np.exp(-k * tfit), 'r--', label='analytical soln')
plt.legend()
plt.xlabel('time')
plt.ylabel('$C_A$')
plt.xlim([0, 20])
plt.savefig('nn-ode.png')
Iteration   0 objective [[ 3.20374053]]
Iteration 1000 objective [[  3.13906829e-05]]
Iteration 2000 objective [[  1.95894699e-05]]
Iteration 3000 objective [[  1.60381564e-05]]
Iteration 4000 objective [[  1.39930673e-05]]
Iteration 5000 objective [[  1.03554970e-05]]

Huh. Those two solutions are nearly indistinguishable. Since we used a neural network, let's hype it up and say we learned the solution to a differential equation! But seriously, note that although we got an "analytical" solution, we should only rely on it in the region we trained it on. You can see the solution above is not that good past t=10, perhaps even going negative (which is not physically correct). That is a reminder that the function we have for the solution is not the same as the analytical solution; it just approximates it really well over the region we solved on. Of course, you can expand that region to the region you care about, but the main point is: don't rely on the solution outside where you know it is good.
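As a quick check on that, we can evaluate the ODE residual \(dCa/dt + k Ca\) of the trained network on successively larger intervals. This is a sketch that reuses params, Ca, dCadt and k from above; the residual should grow noticeably past the training interval.

# Evaluate the ODE residual on successively larger intervals.
for tmax in (10.0, 15.0, 20.0):
    tcheck = np.linspace(0, tmax).reshape((-1, 1))
    residual = dCadt(params, tcheck) + k * Ca(params, tcheck)
    print('max |residual| on [0, {0}] = {1:1.2e}'.format(tmax,
                                                         np.max(np.abs(residual))))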

This idea isn't new. There are several papers in the literature on using neural networks to solve differential equations, e.g. http://www.sciencedirect.com/science/article/pii/S0255270102002076 and https://arxiv.org/pdf/physics/9705023.pdf, and other blog posts that are similar (https://becominghuman.ai/neural-networks-for-solving-differential-equations-fa230ac5e04c, even using autograd). That means to me that there is some merit to continuing to investigate this approach to solving differential equations.

There are some interesting challenges for engineers to consider with this approach though. When is the solution accurate enough? How reliable are derivatives of the solution? What network architecture is appropriate or best? How do you know how good the solution is? Is it possible to build in solution features, e.g. asymptotes, constraints on derivatives, or that the solution should be monotonic? These would help us trust the solutions not to do weird things, and to extrapolate more reliably.
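For example, one way to build in the initial condition exactly is the trial-function approach used in the Lagaris et al. paper linked above: write the solution as \(Ca0 + t N(t)\), where \(N\) is the network, so the initial condition holds by construction. Here is a minimal sketch of what that looks like with the functions defined earlier; I have not verified it trains any better, it just illustrates the idea.

def Ca_t(params, t):
    "Trial solution Ca0 + t * N(t); satisfies Ca_t(0) = Ca0 by construction."
    return Ca0 + t * Ca(params, t)

dCa_t_dt = elementwise_grad(Ca_t, 1)

def objective2(params, step):
    # Only the ODE residual is needed; the initial condition is built in.
    zeq = dCa_t_dt(params, t) + k * Ca_t(params, t)
    return np.mean(zeq**2)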

Copyright (C) 2017 by John Kitchin. See the License for information about copying.


Uncertainty in the solution of an ODE

| categories: ode, uncertainty | tags: |

Our objective in this post is to examine the effects of uncertainty in parameters that define an ODE on the integrated solution of the ODE. My favorite method for numerical uncertainty analysis is Monte Carlo simulation because it is easy to code and usually easy to understand. We take that approach first.

The problem is to estimate the conversion after one hour of reaction in a constant-volume batch reactor, for the second-order reaction \(A \rightarrow B\) with rate law \(-r_A = k C_A^2\). There is 5% uncertainty in the rate constant \(k=0.001\) and in the initial concentration \(C_{A0}=1\).

The relevant differential equation is:

\(\frac{dX}{dt} = -r_A / C_{A0}\).

Since \(C_A = C_{A0}(1 - X)\), this is \(\frac{dX}{dt} = k C_{A0} (1 - X)^2\).

We have to assume that 5% uncertainty refers to a normal distribution of error that has a standard deviation of 5% of the mean value.

from scipy.integrate import odeint
import numpy as np

N = 1000

K = np.random.normal(0.001, 0.05*0.001, N)
CA0 = np.random.normal(1, 0.05*1, N)

X = [] # to store answer in
for k, Ca0 in zip(K, CA0):
    # define the ODE; k and Ca0 come from the enclosing loop
    def ode(X, t):
        ra = -k * (Ca0 * (1 - X))**2
        return -ra / Ca0
    
    X0 = 0  # initial conversion
    tspan = np.linspace(0, 3600)  # one hour, in seconds

    sol = odeint(ode, X0, tspan)

    X += [sol[-1][0]]

s = 'Final conversion at one hour is {0:1.3f} +- {1:1.3f} (1 sigma)'
print(s.format(np.average(X), np.std(X)))
Final conversion at one hour is 0.782 +- 0.013 (1 sigma)

See, it is not too difficult to write. It is, however, a little on the expensive side to run, since we typically need 1e3-1e6 samples to get reasonable statistics. Let us try the uncertainties package too. For this we have to wrap a function that takes uncertainties and returns a single float.

from scipy.integrate import odeint
import numpy as np
import uncertainties as u

k = u.ufloat(0.001, 0.05*0.001)
Ca0 = u.ufloat(1.0, 0.05)

@u.wrap
def func(k, Ca0):
    # define the ODE
    def ode(X, t):
        ra = -k * (Ca0 * (1 - X))**2
        return -ra / Ca0
    
    X0 = 0 # initial condition
    tspan = np.linspace(0, 3600)
    # integrate it
    sol = odeint(ode, X0, tspan)
    return sol[-1][0]

result = func(k, Ca0)
s = 'Final conversion at one hour is {0}(1 sigma)'
print(s.format(result))
Final conversion at one hour is 0.783+/-0.012(1 sigma)

This is about the same amount of code as the Monte Carlo approach, but it runs much faster and gets approximately the same result. You have to remember the wrapping technique, since the uncertainties package does not work natively with the odeint function.
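If you want a sanity check on that linear propagation without the package, central finite differences on a plain-float version of the function give about the same answer. This is a sketch; conversion is a hypothetical helper I am introducing here, not part of the uncertainties API.

from scipy.integrate import odeint
import numpy as np

def conversion(k, Ca0):
    "Plain-float version of func, for finite differencing."
    def ode(X, t):
        ra = -k * (Ca0 * (1 - X))**2
        return -ra / Ca0
    return odeint(ode, 0.0, np.linspace(0, 3600))[-1][0]

k0, sk = 0.001, 0.05 * 0.001
C0, sC = 1.0, 0.05

# first-order propagation: sigma^2 = (df/dk * sk)^2 + (df/dCa0 * sC)^2
dfdk = (conversion(k0 + sk, C0) - conversion(k0 - sk, C0)) / (2 * sk)
dfdC = (conversion(k0, C0 + sC) - conversion(k0, C0 - sC)) / (2 * sC)
sigma = np.sqrt((dfdk * sk)**2 + (dfdC * sC)**2)

print('Final conversion at one hour is {0:1.3f} +- {1:1.3f} (1 sigma)'.format(
    conversion(k0, C0), sigma))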

Copyright (C) 2013 by John Kitchin. See the License for information about copying.


Linear algebra approaches to solving systems of constant coefficient ODEs

| categories: ode | tags: |

Matlab post Today we consider how to solve a system of first-order, constant-coefficient ordinary differential equations using linear algebra. These equations could be solved numerically, but in this case analytical solutions can be derived. The equations we will solve are:

\(y'_1 = -0.02 y_1 + 0.02 y_2\)

\(y'_2 = 0.02 y_1 - 0.02 y_2\)

We can express this set of equations in matrix form as: \(\left[\begin{array}{c}y'_1\\y'_2\end{array}\right] = \left[\begin{array}{cc} -0.02 & 0.02 \\ 0.02 & -0.02\end{array}\right] \left[\begin{array}{c}y_1\\y_2\end{array}\right]\)

The general solution to this set of equations is

\(\left[\begin{array}{c}y_1\\y_2\end{array}\right] = \left[\begin{array}{cc}v_1 & v_2\end{array}\right] \left[\begin{array}{cc} c_1 & 0 \\ 0 & c_2\end{array}\right] \exp\left(\left[\begin{array}{cc} \lambda_1 & 0 \\ 0 & \lambda_2\end{array}\right] \left[\begin{array}{c}t\\t\end{array}\right]\right)\)

where \(\left[\begin{array}{cc} \lambda_1 & 0 \\ 0 & \lambda_2\end{array}\right]\) is a diagonal matrix of the eigenvalues of the constant coefficient matrix, \(\left[\begin{array}{cc}v_1 & v_2\end{array}\right]\) is a matrix of eigenvectors where the \(i^{th}\) column corresponds to the eigenvector of the \(i^{th}\) eigenvalue, and \(\left[\begin{array}{cc} c_1 & 0 \\ 0 & c_2\end{array}\right]\) is a matrix determined by the initial conditions.

In this example, we evaluate the solution using linear algebra. The initial conditions we will consider are \(y_1(0)=0\) and \(y_2(0)=150\).

import numpy as np

A = np.array([[-0.02,  0.02],
              [ 0.02, -0.02]])

# A is symmetric, so we can use eigh, which returns the eigenvalues and
# eigenvectors of a Hermitian or symmetric matrix.
evals, evecs = np.linalg.eigh(A)
print(evals)
print(evecs)
[-0.04  0.  ]
[[ 0.70710678  0.70710678]
 [-0.70710678  0.70710678]]

The eigenvectors are the columns of evecs.

Compute the \(c\) matrix

The initial conditions require \(V c = y(0)\), with \(y(0) = [0, 150]^T\):

Y0 = [0, 150]

c = np.diag(np.linalg.solve(evecs, Y0))
print(c)
[[-106.06601718    0.        ]
 [   0.          106.06601718]]

Constructing the solution

We will create a vector of time values and stack two copies of it, one row for each solution, \(y_1(t)\) and \(y_2(t)\).

import matplotlib.pyplot as plt

t = np.linspace(0, 100)
T = np.row_stack([t, t])

D = np.diag(evals)

# y = V*c*exp(D*T);
y = np.dot(np.dot(evecs, c), np.exp(np.dot(D, T)))

# y has a shape of (2, 50) so we have to transpose it
plt.plot(t, y.T)
plt.xlabel('t')
plt.ylabel('y')
plt.legend(['$y_1$', '$y_2$'])
plt.savefig('images/ode-la.png')
plt.show()
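As a sanity check (not in the original post), we can compare this linear algebra solution to a direct numerical integration of the same system:

from scipy.integrate import odeint

def rhs(Y, t):
    return np.dot(A, Y)

ynum = odeint(rhs, [0.0, 150.0], t)
print(np.allclose(ynum, y.T, atol=1e-4))  # should print True if they agree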

Copyright (C) 2013 by John Kitchin. See the License for information about copying.


Another way to parameterize an ODE - nested function

| categories: ode | tags: |

Matlab post We saw one method to parameterize an ODE: create an ode function that takes an extra parameter argument, then make a function handle that has the syntax required by the solver and passes the parameter to the ode function.

Here we define the ODE function inside a loop. Since the nested function is defined in the enclosing namespace, it can “see” the values of the variables there. We will use this method to look at the solution to the van der Pol equation for several different values of mu.

import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

MU = [0.1, 1, 2, 5]
tspan = np.linspace(0, 100, 5000)
Y0 = [0, 3]

for mu in MU:
    # define the ODE
    def vdpol(Y, t):
        x, y = Y
        dxdt = y
        dydt = -x + mu * (1 - x**2) * y  # mu comes from the enclosing loop
        return [dxdt, dydt]
    
    Y = odeint(vdpol, Y0, tspan)
    
    x = Y[:,0]; y = Y[:,1]
    plt.plot(x, y, label='mu={0:1.2f}'.format(mu))

plt.axis('equal')
plt.legend(loc='best')
plt.savefig('images/ode-nested-parameterization.png')
plt.show()

You can see the solution changes dramatically for different values of mu. The point here is not to understand why, but to show an easy way to study a parameterized ODE with a nested function. Nested functions are a great way to “share” variables between functions, especially for ODE solving, nonlinear equation solving, or any other application where one function needs parameters that are defined in another.
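For instance, here is a minimal sketch of the same closure idea applied to nonlinear equation solving with fsolve, solving \(x^2 = a\) for several values of \(a\):

from scipy.optimize import fsolve

for a in [2.0, 3.0, 5.0]:
    def f(x):
        return x**2 - a  # a comes from the enclosing loop
    print(a, fsolve(f, 1.0)[0])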

Copyright (C) 2013 by John Kitchin. See the License for information about copying.


Yet another way to parameterize an ODE

| categories: ode | tags: |

Matlab post We previously examined a way to parameterize an ODE. In those methods, we either used an anonymous function to parameterize an ode function, or we used a nested function that used variables from the shared workspace.

We want a convenient way to solve \(dCa/dt = -k Ca\) for multiple values of \(k\). Here we use a trick to pass a parameter to an ODE through the initial conditions. We expand the ode function definition to include this parameter, and set its derivative to zero, effectively making it a constant.

import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt

def ode(F, t):
    Ca, k = F
    dCadt = -k * Ca
    dkdt = 0.0  # k is constant, so its derivative is zero
    return [dCadt, dkdt]

tspan = np.linspace(0, 4)

Ca0 = 1.0
K = [2.0, 3.0]
for k in K:
    F = odeint(ode, [Ca0, k], tspan)
    Ca = F[:,0]
    plt.plot(tspan, Ca, label='k={0}'.format(k))
plt.xlabel('time')
plt.ylabel('$C_A$')
plt.legend(loc='best')
plt.savefig('images/ode-parameterized-1.png')
plt.show()

I do not think this is a very elegant way to pass parameters around compared to the previous methods, but it nicely illustrates that there is more than one way to do it. And who knows, maybe it will be useful in some other context one day!

Copyright (C) 2013 by John Kitchin. See the License for information about copying.
