<?xml version="1.0" encoding="UTF-8"?>

<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
     xmlns:atom="http://www.w3.org/2005/Atom"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:wfw="http://wellformedweb.org/CommentAPI/"
     >
  <channel>
    <atom:link href="http://kitchingroup.cheme.cmu.edu/blog/feed/index.xml" rel="self" type="application/rss+xml" />
    <title>The Kitchin Research Group</title>
    <link>https://kitchingroup.cheme.cmu.edu/blog</link>
    <description>Chemical Engineering at Carnegie Mellon University</description>
    <pubDate>Sat, 01 Nov 2025 13:47:46 GMT</pubDate>
    <generator>Blogofile</generator>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    
    <item>
      <title>Solving differential algebraic equations with help from autograd</title>
      <link>https://kitchingroup.cheme.cmu.edu/blog/2019/09/22/Solving-differential-algebraic-equations-with-help-from-autograd</link>
      <pubDate>Sun, 22 Sep 2019 12:59:25 EDT</pubDate>
      <category><![CDATA[dae]]></category>
      <category><![CDATA[autograd]]></category>
      <category><![CDATA[ode]]></category>
      <guid isPermaLink="false">eATX_e4VOuQRCAt5B_CujmZtj4w=</guid>
      <description>Solving differential algebraic equations with help from autograd</description>
      <content:encoded><![CDATA[


&lt;p&gt;
This problem is adapted from one in "Problem Solving in Chemical Engineering with Numerical Methods" by Michael B. Cutlip and Mordechai Shacham.
&lt;/p&gt;

&lt;p&gt;
In the binary batch distillation of benzene (1) and toluene (2), the number of moles of liquid remaining, \(L\), as a function of the mole fraction of toluene, \(x_2\), is given by:
&lt;/p&gt;

&lt;p&gt;
\(\frac{dL}{dx_2} = \frac{L}{x_2 (k_2 - 1)}\)
&lt;/p&gt;

&lt;p&gt;
where \(k_2\) is the vapor-liquid equilibrium ratio for toluene. This can be computed as:
&lt;/p&gt;

&lt;p&gt;
\(k_i = P_i / P\), where \(P_i = 10^{A_i + \frac{B_i}{T + C_i}}\). The pressures are in mmHg and the temperature is in degrees Celsius.
&lt;/p&gt;

&lt;p&gt;
One difficulty in solving this problem is that the temperature is not constant; it changes with the composition. We know that the temperature changes to satisfy the constraint \(k_1(T) x_1 + k_2(T) x_2 = 1\).
&lt;/p&gt;

&lt;p&gt;
Sometimes one can solve for \(T\) directly and substitute it into the first ODE, but that is not possible here. One way you might solve this is to use the constraint to find \(T\) inside the ODE function, but that is tricky: nonlinear algebra solvers need a guess and don't always converge, or may converge to non-physical solutions. They are also iterative, so they will be slower than an approach where we just integrate. A better way is to derive a second ODE, \(dT/dx_2\), from the constraint. The constraint is implicit in \(T\), so we compute it as \(\frac{dT}{dx_2} = -\frac{\partial f / \partial x_2}{\partial f / \partial T}\), where \(f(x_2, T) = k_1(T) x_1 + k_2(T) x_2 - 1 = 0\) and \(x_1 = 1 - x_2\). This is the same equation that is used to compute the bubble point temperature. Note, it is possible to derive these derivatives analytically, but who wants to? We can use autograd to get them for us instead.
&lt;/p&gt;

&lt;p&gt;
The following information is given:
&lt;/p&gt;

&lt;p&gt;
The total pressure is fixed at 1.2 atm, and the distillation starts at \(x_2=0.4\). There are initially 100 moles of liquid.
&lt;/p&gt;

&lt;table border="2" cellspacing="0" cellpadding="6" rules="groups" frame="hsides"&gt;


&lt;colgroup&gt;
&lt;col  class="org-left" /&gt;

&lt;col  class="org-right" /&gt;

&lt;col  class="org-right" /&gt;

&lt;col  class="org-right" /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th scope="col" class="org-left"&gt;species&lt;/th&gt;
&lt;th scope="col" class="org-right"&gt;A&lt;/th&gt;
&lt;th scope="col" class="org-right"&gt;B&lt;/th&gt;
&lt;th scope="col" class="org-right"&gt;C&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td class="org-left"&gt;benzene&lt;/td&gt;
&lt;td class="org-right"&gt;6.90565&lt;/td&gt;
&lt;td class="org-right"&gt;-1211.033&lt;/td&gt;
&lt;td class="org-right"&gt;220.79&lt;/td&gt;
&lt;/tr&gt;

&lt;tr&gt;
&lt;td class="org-left"&gt;toluene&lt;/td&gt;
&lt;td class="org-right"&gt;6.95464&lt;/td&gt;
&lt;td class="org-right"&gt;-1344.8&lt;/td&gt;
&lt;td class="org-right"&gt;219.482&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
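
&lt;p&gt;
As a quick illustration of the formula for \(k_i\), here is a minimal sketch that evaluates the vapor pressures and equilibrium ratios from the constants in the table. The helper names &lt;code&gt;vapor_pressure&lt;/code&gt; and &lt;code&gt;k_ratio&lt;/code&gt; and the temperature of 100 degC are only illustrative assumptions; they are not part of the original problem.
&lt;/p&gt;

```python
import math

# Constants from the table above (pressure in mmHg, temperature in degC).
ANTOINE = {
    "benzene": (6.90565, -1211.033, 220.79),
    "toluene": (6.95464, -1344.8, 219.482),
}
P_TOTAL = 1.2 * 760.0  # total pressure of 1.2 atm expressed in mmHg


def vapor_pressure(species, T):
    """Pure-component vapor pressure P_i in mmHg at temperature T in degC."""
    A, B, C = ANTOINE[species]
    return 10.0 ** (A + B / (T + C))


def k_ratio(species, T, P=P_TOTAL):
    """Vapor-liquid equilibrium ratio k_i = P_i / P."""
    return vapor_pressure(species, T) / P


T_example = 100.0  # degC, an arbitrary temperature chosen for illustration
for name in ANTOINE:
    print(f"{name}: P_i = {vapor_pressure(name, T_example):.1f} mmHg, "
          f"k_i = {k_ratio(name, T_example):.3f}")
```

&lt;p&gt;
At a temperature between the two boiling points, the more volatile benzene should have \(k_1\) above one and toluene should have \(k_2\) below one, which is what makes it possible for the constraint to be satisfied.
&lt;/p&gt;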

&lt;p&gt;
We have to start by finding the initial temperature from the constraint.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; autograd.numpy &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; np
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; autograd &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; grad
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; scipy.integrate &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; solve_ivp
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; scipy.optimize &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; fsolve
%matplotlib inline
&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; matplotlib.pyplot &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; plt

&lt;span style="color: #BA36A5;"&gt;P&lt;/span&gt; = 760 * 1.2 &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;mmHg&lt;/span&gt;
&lt;span style="color: #BA36A5;"&gt;A1&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;B1&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;C1&lt;/span&gt; = 6.90565, -1211.033,  220.79
&lt;span style="color: #BA36A5;"&gt;A2&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;B2&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;C2&lt;/span&gt; = 6.95464, -1344.8, 219.482

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;k1&lt;/span&gt;(T):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; 10**(A1 + B1 / (C1 + T)) / P

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;k2&lt;/span&gt;(T):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; 10**(A2 + B2 / (C2 + T)) / P

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;f&lt;/span&gt;(x2, T):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;x1&lt;/span&gt; = 1 - x2
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; k1(T) * x1 + k2(T) * x2 - 1

T0, = fsolve(&lt;span style="color: #0000FF;"&gt;lambda&lt;/span&gt; T: f(0.4, T), 96)
&lt;span style="color: #0000FF;"&gt;print&lt;/span&gt;(f&lt;span style="color: #008000;"&gt;'The initial temperature is {T0:1.2f} degC.'&lt;/span&gt;)
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
The initial temperature is 95.59 degC.
&lt;/p&gt;

&lt;p&gt;
Next, we compute the derivative we need. Because it is derived from the constraint, integrating it should make the temperature change as required to keep the constraint satisfied.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;dfdx2&lt;/span&gt; = grad(f, 0)
&lt;span style="color: #BA36A5;"&gt;dfdT&lt;/span&gt; = grad(f, 1)

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;dTdx2&lt;/span&gt;(x2, T):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; -dfdx2(x2, T) / dfdT(x2, T)

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;ode&lt;/span&gt;(x2, X):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;L&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;T&lt;/span&gt; = X
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;dLdx2&lt;/span&gt; = L / (x2 * (k2(T) - 1))
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; [dLdx2, dTdx2(x2, T)]
&lt;/pre&gt;
&lt;/div&gt;
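
&lt;p&gt;
For readers who do want the analytical form, the following sketch is one way to check the implicit derivative without autograd. It writes out \(\partial f/\partial x_2 = k_2 - k_1\) and \(\partial f/\partial T = x_1 k_1' + x_2 k_2'\) by hand, and compares the resulting slope to a central finite difference of the bubble point temperature. The bisection helper &lt;code&gt;bubble_point&lt;/code&gt;, its bracket of 50 to 150 degC, and the step size are assumptions made for this illustration only.
&lt;/p&gt;

```python
import math

P = 760 * 1.2  # mmHg
A1, B1, C1 = 6.90565, -1211.033, 220.79   # benzene
A2, B2, C2 = 6.95464, -1344.8, 219.482    # toluene
LN10 = math.log(10.0)


def k(T, A, B, C):
    """Equilibrium ratio k_i(T) = 10**(A + B / (C + T)) / P."""
    return 10.0 ** (A + B / (C + T)) / P


def dk_dT(T, A, B, C):
    """Temperature derivative of k_i, obtained with the chain rule."""
    return -LN10 * B / (C + T) ** 2 * k(T, A, B, C)


def f(x2, T):
    """Constraint residual k1 * x1 + k2 * x2 - 1 with x1 = 1 - x2."""
    return k(T, A1, B1, C1) * (1 - x2) + k(T, A2, B2, C2) * x2 - 1


def dTdx2_analytic(x2, T):
    """dT/dx2 = -(df/dx2) / (df/dT) from the implicit constraint."""
    df_dx2 = k(T, A2, B2, C2) - k(T, A1, B1, C1)
    df_dT = dk_dT(T, A1, B1, C1) * (1 - x2) + dk_dT(T, A2, B2, C2) * x2
    return -df_dx2 / df_dT


def bubble_point(x2, lo=50.0, hi=150.0, iterations=100):
    """Bisection on f(x2, T) = 0, assuming the root lies between lo and hi."""
    f_lo = f(x2, lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(x2, mid)
        if math.copysign(1.0, f_mid) == math.copysign(1.0, f_lo):
            lo, f_lo = mid, f_mid   # root is in the upper half
        else:
            hi = mid                # root is in the lower half
    return 0.5 * (lo + hi)


x2 = 0.4
T = bubble_point(x2)
h = 1e-5  # finite difference step in x2
slope_fd = (bubble_point(x2 + h) - bubble_point(x2 - h)) / (2 * h)
print(f"bubble point at x2 = {x2}: {T:.2f} degC")
print(f"dT/dx2 analytic          : {dTdx2_analytic(x2, T):.4f}")
print(f"dT/dx2 finite difference : {slope_fd:.4f}")
```

&lt;p&gt;
The two slopes should agree closely, and both should be positive, since the bubble point rises as the liquid becomes richer in the less volatile toluene.
&lt;/p&gt;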

&lt;p&gt;
Next we solve and plot the ODE.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;x2span&lt;/span&gt; = (0.4, 0.8)
&lt;span style="color: #BA36A5;"&gt;X0&lt;/span&gt; = (100, T0)
&lt;span style="color: #BA36A5;"&gt;sol&lt;/span&gt; = solve_ivp(ode, x2span, X0, max_step=0.01)

plt.plot(sol.t, sol.y.T)
plt.legend([&lt;span style="color: #008000;"&gt;'L'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'T'&lt;/span&gt;]);
plt.xlabel(&lt;span style="color: #008000;"&gt;'$x_2$'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'L, T'&lt;/span&gt;)
&lt;span style="color: #BA36A5;"&gt;x2&lt;/span&gt; = sol.t
&lt;span style="color: #BA36A5;"&gt;L&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;T&lt;/span&gt; = sol.y
&lt;span style="color: #0000FF;"&gt;print&lt;/span&gt;(f&lt;span style="color: #008000;"&gt;'At x2={x2[-1]:1.2f} there are {L[-1]:1.2f} moles of liquid left at {T[-1]:1.2f} degC'&lt;/span&gt;)
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
At x2=0.80 there are 14.04 moles of liquid left at 108.57 degC
&lt;/p&gt;

&lt;pre class="example"&gt;
&amp;lt;Figure size 432x288 with 1 Axes&amp;gt;
&lt;/pre&gt;


&lt;p&gt;
&lt;figure&gt;&lt;img src="/media/a75e63c53e3c2cb02c40c808789084c337e174ff.png"&gt;&lt;/figure&gt; 
&lt;/p&gt;

&lt;p&gt;
You can see that the amount of liquid drops, and the temperature rises.
&lt;/p&gt;

&lt;p&gt;
Let's double check that the constraint is actually met. We do that qualitatively here by plotting it, and quantitatively by showing all values are close to 0.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;constraint&lt;/span&gt; = k1(T) * (1 - x2) + k2(T) * x2 - 1
plt.plot(x2, constraint)
plt.ylim([-1, 1])
plt.xlabel(&lt;span style="color: #008000;"&gt;'$x_2$'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'constraint value'&lt;/span&gt;)
&lt;span style="color: #0000FF;"&gt;print&lt;/span&gt;(np.allclose(constraint, np.zeros_like(constraint)))
constraint
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
True
&lt;/p&gt;

&lt;pre class="example"&gt;
array([ 2.22044605e-16,  4.44089210e-16,  2.22044605e-16,  0.00000000e+00,
        1.11022302e-15,  0.00000000e+00,  6.66133815e-16,  0.00000000e+00,
       -2.22044605e-16,  1.33226763e-15,  8.88178420e-16, -4.44089210e-16,
        4.44089210e-16,  1.11022302e-15, -2.22044605e-16,  0.00000000e+00,
       -2.22044605e-16, -1.11022302e-15,  4.44089210e-16,  0.00000000e+00,
       -4.44089210e-16,  4.44089210e-16, -6.66133815e-16, -4.44089210e-16,
        4.44089210e-16, -1.11022302e-16, -8.88178420e-16, -8.88178420e-16,
       -9.99200722e-16, -3.33066907e-16, -7.77156117e-16, -2.22044605e-16,
       -9.99200722e-16, -1.11022302e-15, -3.33066907e-16, -1.99840144e-15,
       -1.33226763e-15, -2.44249065e-15, -1.55431223e-15, -6.66133815e-16,
       -2.22044605e-16])
&lt;/pre&gt;


&lt;pre class="example"&gt;
&amp;lt;Figure size 432x288 with 1 Axes&amp;gt;
&lt;/pre&gt;


&lt;p&gt;
&lt;figure&gt;&lt;img src="/media/bb2b32002658b8724d214f2441c9f55a97c565c8.png"&gt;&lt;/figure&gt; 
&lt;/p&gt;


&lt;p&gt;
So indeed, the constraint is met! The residuals are on the order of 1e-15, which is floating-point round-off. Once again, autograd comes to the rescue in making a computable derivative from an algebraic constraint so that we can solve a DAE as a set of ODEs using our regular machinery. Nice work, autograd!
&lt;/p&gt;
&lt;p&gt;Copyright (C) 2019 by John Kitchin. See the &lt;a href="/copying.html"&gt;License&lt;/a&gt; for information about copying.&lt;/p&gt;
&lt;p&gt;&lt;a href="/org/2019/09/22/Solving-differential-algebraic-equations-with-help-from-autograd.org"&gt;org-mode source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Org-mode version = 9.2.3&lt;/p&gt;]]></content:encoded>
    </item>
    <item>
      <title>Sensitivity analysis with odeint and autograd</title>
      <link>https://kitchingroup.cheme.cmu.edu/blog/2019/09/13/Sensitivity-analysis-with-odeint-and-autograd</link>
      <pubDate>Fri, 13 Sep 2019 09:56:09 EDT</pubDate>
      <category><![CDATA[autograd]]></category>
      <category><![CDATA[ode]]></category>
      <guid isPermaLink="false">a-lpqe22WfbPZV59JCJkZAVvMR0=</guid>
      <description>Sensitivity analysis with odeint and autograd</description>
      <content:encoded><![CDATA[


&lt;p&gt;
In this &lt;a href="http://kitchingroup.cheme.cmu.edu/blog/2018/10/11/A-differentiable-ODE-integrator-for-sensitivity-analysis/"&gt;previous post&lt;/a&gt; I showed a way to do sensitivity analysis of the solution of a differential equation to parameters in the equation using autograd. The basic approach was to write a differentiable integrator, and then use it in a function so that autograd could take the derivative.
&lt;/p&gt;

&lt;p&gt;
Since that time, autograd has added &lt;a href="https://github.com/HIPS/autograd/blob/master/autograd/scipy/integrate.py"&gt;derivative support&lt;/a&gt; for &lt;code&gt;scipy.integrate.odeint&lt;/code&gt;. In this post we examine that. As usual with autograd, we have to import the autograd version of numpy, and the autograd version of odeint. We will find the derivative of the solution to an ODE (which is an array), so we also need to import the jacobian function. Finally, there is a subtle and non-obvious requirement: we need to import the autograd version of &lt;code&gt;tuple&lt;/code&gt;. That ensures that the variables remain differentiable through the tuple we use for the arguments.
&lt;/p&gt;

&lt;p&gt;
The differential equation we solve returns the concentration of a species as a function of time, and the solution depends on two parameters, i.e. \(C = f(t; k_1, k_{-1})\). We are interested in the time-dependent sensitivity of \(C\) with respect to those parameters. The approach we use is to define a function that has those parameters as arguments. The function solves the ODE and returns the time-dependent solution. First we compute that solution, mostly to see that the autograd version of odeint works.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; autograd.numpy &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; np
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; autograd.scipy.integrate &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; odeint
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; autograd &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; jacobian
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; autograd.builtins &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; &lt;span style="color: #006FE0;"&gt;tuple&lt;/span&gt;

&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; matplotlib.pyplot &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; plt

&lt;span style="color: #BA36A5;"&gt;Ca0&lt;/span&gt; = 1.0
&lt;span style="color: #BA36A5;"&gt;k1&lt;/span&gt; = &lt;span style="color: #BA36A5;"&gt;k_1&lt;/span&gt; = 3.0

&lt;span style="color: #BA36A5;"&gt;tspan&lt;/span&gt; = np.linspace(0, 0.5)

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;C&lt;/span&gt;(K):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;k1&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;k_1&lt;/span&gt; = K
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;dCdt&lt;/span&gt;(Ca, t, k1, k_1):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; -k1 * Ca + k_1 * (Ca0 - Ca)
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;sol&lt;/span&gt; = odeint(dCdt, Ca0, tspan, &lt;span style="color: #006FE0;"&gt;tuple&lt;/span&gt;((k1, k_1)))
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; sol

plt.plot(tspan, C([k1, k_1]))
plt.xlim([tspan.&lt;span style="color: #006FE0;"&gt;min&lt;/span&gt;(), tspan.&lt;span style="color: #006FE0;"&gt;max&lt;/span&gt;()])
plt.xlabel(&lt;span style="color: #008000;"&gt;'t'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'C'&lt;/span&gt;);
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
&amp;lt;Figure size 432x288 with 1 Axes&amp;gt;
&lt;/pre&gt;


&lt;p&gt;
&lt;figure&gt;&lt;img src="/media/bca9e95a16f361ce6d92dd6efe90a2e653e014ef.png"&gt;&lt;/figure&gt; 
&lt;/p&gt;


&lt;p&gt;
Now, the solution is an array, and we want the derivative of C with respect to the parameters at each time point. That means we want the Jacobian of the output with respect to the input. Here is the autograd approach to doing that. The jacobian function returns a function that we can evaluate to get the derivatives.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; time
&lt;span style="color: #BA36A5;"&gt;t0&lt;/span&gt; = time.time()
&lt;span style="color: #BA36A5;"&gt;dCdk&lt;/span&gt; = jacobian(C, 0)


&lt;span style="color: #BA36A5;"&gt;k_sensitivity&lt;/span&gt; = dCdk(np.array([k1, k_1]))

&lt;span style="color: #BA36A5;"&gt;k1_sensitivity&lt;/span&gt; = k_sensitivity[:, 0, 0]
&lt;span style="color: #BA36A5;"&gt;k_1_sensitivity&lt;/span&gt; = k_sensitivity[:, 0, 1]

plt.plot(tspan, np.&lt;span style="color: #006FE0;"&gt;abs&lt;/span&gt;(k1_sensitivity), label=&lt;span style="color: #008000;"&gt;'dC/dk1'&lt;/span&gt;)
plt.plot(tspan, np.&lt;span style="color: #006FE0;"&gt;abs&lt;/span&gt;(k_1_sensitivity), label=&lt;span style="color: #008000;"&gt;'dC/dk_1'&lt;/span&gt;)
plt.legend(loc=&lt;span style="color: #008000;"&gt;'best'&lt;/span&gt;)
plt.xlabel(&lt;span style="color: #008000;"&gt;'t'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'sensitivity'&lt;/span&gt;)
&lt;span style="color: #0000FF;"&gt;print&lt;/span&gt;(f&lt;span style="color: #008000;"&gt;'Elapsed time = {time.time() - t0:1.1f} seconds'&lt;/span&gt;)
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
Elapsed time = 38.2 seconds
&lt;/p&gt;

&lt;pre class="example"&gt;
&amp;lt;Figure size 432x288 with 1 Axes&amp;gt;
&lt;/pre&gt;


&lt;p&gt;
&lt;figure&gt;&lt;img src="/media/3a0a58bb6d4b3e1b215c2918d511f3a8a3a2ca3d.png"&gt;&lt;/figure&gt; 
&lt;/p&gt;

&lt;p&gt;
That looks similar to the results from before. It is pretty slow, I think; it took more than half a minute to work out. That is still faster and probably more correct than if I had to do it by hand. In contrast, however, the finite difference code below is comparatively very fast! I don't know what is slow in the autograd implementation. I guess it is an implementation detail.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; numdifftools &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; nd
&lt;span style="color: #BA36A5;"&gt;t0&lt;/span&gt; = time.time()

&lt;span style="color: #BA36A5;"&gt;fdk1&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;fdk_1&lt;/span&gt; = nd.Jacobian(C)([k1, k_1]).T
&lt;span style="color: #0000FF;"&gt;print&lt;/span&gt;(f&lt;span style="color: #008000;"&gt;'Elapsed time = {time.time() - t0:1.1f} seconds'&lt;/span&gt;)

plt.plot(tspan, np.&lt;span style="color: #006FE0;"&gt;abs&lt;/span&gt;(fdk1), label=&lt;span style="color: #008000;"&gt;'fd dC/dk1'&lt;/span&gt;)
plt.plot(tspan, np.&lt;span style="color: #006FE0;"&gt;abs&lt;/span&gt;(fdk_1), label=&lt;span style="color: #008000;"&gt;'fd dC/dk_1'&lt;/span&gt;)
plt.plot(tspan, np.&lt;span style="color: #006FE0;"&gt;abs&lt;/span&gt;(k1_sensitivity), &lt;span style="color: #008000;"&gt;'y--'&lt;/span&gt;, label=&lt;span style="color: #008000;"&gt;'dC/dk1'&lt;/span&gt;)
plt.plot(tspan, np.&lt;span style="color: #006FE0;"&gt;abs&lt;/span&gt;(k_1_sensitivity),&lt;span style="color: #008000;"&gt;'m--'&lt;/span&gt;, label=&lt;span style="color: #008000;"&gt;'dC/dk_1'&lt;/span&gt;)
plt.legend(loc=&lt;span style="color: #008000;"&gt;'best'&lt;/span&gt;);
plt.xlabel(&lt;span style="color: #008000;"&gt;'t'&lt;/span&gt;);
plt.ylabel(&lt;span style="color: #008000;"&gt;'sensitivity'&lt;/span&gt;);
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
Elapsed time = 0.1 seconds
&lt;/p&gt;

&lt;pre class="example"&gt;
&amp;lt;Figure size 432x288 with 1 Axes&amp;gt;
&lt;/pre&gt;


&lt;p&gt;
&lt;figure&gt;&lt;img src="/media/be7bf4798396d6a27938715f6bb0e22b8f3e0b1c.png"&gt;&lt;/figure&gt; 
&lt;/p&gt;

&lt;p&gt;
You can see the two results are visually indistinguishable. Even the code is pretty similar. I would tend to prefer the autograd way since it should be less sensitive to finite difference artifacts, but it is nice to have an independent way to test if it is working.
&lt;/p&gt;
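
&lt;p&gt;
Because this particular ODE is linear, there is a third independent check: it has a closed-form solution, \(C(t) = C_{a0} (k_{-1} + k_1 e^{-(k_1 + k_{-1}) t}) / (k_1 + k_{-1})\), which can be differentiated by hand. The sketch below is not part of the workflow above; the function names &lt;code&gt;Ca&lt;/code&gt;, &lt;code&gt;dCa_dk1&lt;/code&gt; and &lt;code&gt;dCa_dk_1&lt;/code&gt; are made up for this illustration. Note that the plots above show absolute values, while these expressions keep the sign.
&lt;/p&gt;

```python
import math

Ca0 = 1.0
k1 = k_1 = 3.0


def Ca(t, k1, k_1):
    """Closed-form solution of dCa/dt = -k1 * Ca + k_1 * (Ca0 - Ca), Ca(0) = Ca0."""
    s = k1 + k_1
    return Ca0 * (k_1 + k1 * math.exp(-s * t)) / s


def dCa_dk1(t, k1, k_1):
    """Partial derivative of the closed-form solution with respect to k1."""
    s = k1 + k_1
    E = math.exp(-s * t)
    return Ca0 * (-k_1 * (1 - E) - k1 * s * t * E) / s**2


def dCa_dk_1(t, k1, k_1):
    """Partial derivative of the closed-form solution with respect to k_1."""
    s = k1 + k_1
    E = math.exp(-s * t)
    return Ca0 * k1 * ((1 - E) - s * t * E) / s**2


for t in (0.0, 0.1, 0.25, 0.5):
    print(f"t = {t:4.2f}  C = {Ca(t, k1, k_1):.4f}  "
          f"dC/dk1 = {dCa_dk1(t, k1, k_1):+.4f}  "
          f"dC/dk_1 = {dCa_dk_1(t, k1, k_1):+.4f}")
```

&lt;p&gt;
Increasing \(k_1\) consumes the species faster, so that sensitivity is negative, while increasing \(k_{-1}\) regenerates it, so that sensitivity is positive.
&lt;/p&gt;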
&lt;p&gt;Copyright (C) 2019 by John Kitchin. See the &lt;a href="/copying.html"&gt;License&lt;/a&gt; for information about copying.&lt;/p&gt;
&lt;p&gt;&lt;a href="/org/2019/09/13/Sensitivity-analysis-with-odeint-and-autograd.org"&gt;org-mode source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Org-mode version = 9.2.3&lt;/p&gt;]]></content:encoded>
    </item>
    <item>
      <title>Solving coupled ODEs with a neural network and autograd</title>
      <link>https://kitchingroup.cheme.cmu.edu/blog/2018/11/02/Solving-coupled-ODEs-with-a-neural-network-and-autograd</link>
      <pubDate>Fri, 02 Nov 2018 19:53:00 EDT</pubDate>
      <category><![CDATA[autograd]]></category>
      <category><![CDATA[ode]]></category>
      <guid isPermaLink="false">Ry5Ux3UbG7_HZnK_dVp9iHlLtpE=</guid>
      <description>Solving coupled ODEs with a neural network and autograd</description>
      <content:encoded><![CDATA[


&lt;div id="table-of-contents"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;div id="text-table-of-contents"&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#orgfefaa95"&gt;1. The standard numerical solution&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#orge5dacb7"&gt;2. Can a neural network learn the solution?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#orga332637"&gt;3. Given a neural network function how do we get the right derivatives?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#orgf85faff"&gt;4. Solving the system of ODEs with a neural network&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#orgbbded67"&gt;5. Summary&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;
In a previous &lt;a href="http://kitchingroup.cheme.cmu.edu/blog/2017/11/28/Solving-ODEs-with-a-neural-network-and-autograd/index.html"&gt;post&lt;/a&gt; I wrote about using ideas from machine learning to solve an ordinary differential equation using a neural network for the solution. A friend recently tried to apply that idea to coupled ordinary differential equations, without success. It seems like that should work, so here we diagnose the issue and figure it out. This is a long post, but it works in the end.
&lt;/p&gt;

&lt;p&gt;
In the classic series reaction \(A \rightarrow B \rightarrow C\) in a batch reactor, we get the set of coupled mole balances:
&lt;/p&gt;

&lt;p&gt;
\(dC_A/dt = -k_1 C_A\)
&lt;/p&gt;

&lt;p&gt;
\(dC_B/dt = k_1 C_A - k_2 C_B\)
&lt;/p&gt;

&lt;p&gt;
\(dC_C/dt = k_2 C_B\)
&lt;/p&gt;

&lt;div id="outline-container-orgfefaa95" class="outline-2"&gt;
&lt;h2 id="orgfefaa95"&gt;&lt;span class="section-number-2"&gt;1&lt;/span&gt; The standard numerical solution&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-1"&gt;
&lt;p&gt;
Here is the standard numerical solution to this problem. This will give us a reference for what the solution should look like.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; scipy.integrate &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; solve_ivp

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;ode&lt;/span&gt;(t, C):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;Ca&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;Cb&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;Cc&lt;/span&gt; = C
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;dCadt&lt;/span&gt; = -k1 * Ca
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;dCbdt&lt;/span&gt; = k1 * Ca - k2 * Cb
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;dCcdt&lt;/span&gt; = k2 * Cb
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; [dCadt, dCbdt, dCcdt]

&lt;span style="color: #BA36A5;"&gt;C0&lt;/span&gt; = [1.0, 0.0, 0.0]
&lt;span style="color: #BA36A5;"&gt;k1&lt;/span&gt; = 1
&lt;span style="color: #BA36A5;"&gt;k2&lt;/span&gt; = 1

&lt;span style="color: #BA36A5;"&gt;sol&lt;/span&gt; = solve_ivp(ode, (0, 10), C0)

%matplotlib inline
&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; matplotlib.pyplot &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; plt

plt.plot(sol.t, sol.y.T)
plt.legend([&lt;span style="color: #008000;"&gt;'A'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'B'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'C'&lt;/span&gt;])
plt.xlabel(&lt;span style="color: #008000;"&gt;'Time'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'C'&lt;/span&gt;)
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
&lt;img src="/media/d0abffb7b8615837cad7f2cceb378aac-65837xDK.png"&gt; 
&lt;/p&gt;
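
&lt;p&gt;
For the special case used here, \(k_1 = k_2 = 1\) with only A present initially, the mole balances also have a simple closed-form solution: \(C_A = e^{-t}\), \(C_B = t e^{-t}\), and \(C_C = 1 - C_A - C_B\). The sketch below tabulates it as an extra reference; the function name &lt;code&gt;series_solution&lt;/code&gt; is only an illustrative choice, and the formula for \(C_B\) does not apply when the two rate constants differ.
&lt;/p&gt;

```python
import math

k = 1.0    # k1 = k2 = 1, as in the numerical solution above
Ca0 = 1.0  # initial concentrations are [1.0, 0.0, 0.0]


def series_solution(t):
    """Closed-form concentrations of A, B and C, valid only when k1 == k2 == k."""
    Ca = Ca0 * math.exp(-k * t)
    Cb = Ca0 * k * t * math.exp(-k * t)
    Cc = Ca0 - Ca - Cb  # total moles are conserved
    return Ca, Cb, Cc


for t in (0.0, 1.0, 5.0, 10.0):
    Ca, Cb, Cc = series_solution(t)
    print(f"t = {t:5.1f}  A = {Ca:.4f}  B = {Cb:.4f}  C = {Cc:.4f}")
```

&lt;p&gt;
In this case the intermediate B peaks at \(t = 1/k\), which is a handy landmark to look for in the plot.
&lt;/p&gt;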
&lt;/div&gt;
&lt;/div&gt;

&lt;div id="outline-container-orge5dacb7" class="outline-2"&gt;
&lt;h2 id="orge5dacb7"&gt;&lt;span class="section-number-2"&gt;2&lt;/span&gt; Can a neural network learn the solution?&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-2"&gt;
&lt;p&gt;
The first thing I want to show is that you can train a neural network to reproduce this solution. That is certainly a prerequisite to the idea working. We use the same code I used before, but this time our neural network will output three values, one for each concentration.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; autograd.numpy &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; np
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; autograd &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; grad, elementwise_grad, jacobian
&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; autograd.numpy.random &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; npr
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; autograd.misc.optimizers &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; adam

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;init_random_params&lt;/span&gt;(scale, layer_sizes, rs=npr.RandomState(0)):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #036A07;"&gt;"""Build a list of (weights, biases) tuples, one for each layer."""&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; [(rs.randn(insize, outsize) * scale,   &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;weight matrix&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;rs.randn(outsize) * scale)           &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;bias vector&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;for&lt;/span&gt; insize, outsize &lt;span style="color: #0000FF;"&gt;in&lt;/span&gt; &lt;span style="color: #006FE0;"&gt;zip&lt;/span&gt;(layer_sizes[:-1], layer_sizes[1:])]

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;swish&lt;/span&gt;(x):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #036A07;"&gt;"see https://arxiv.org/pdf/1710.05941.pdf"&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; x / (1.0 + np.exp(-x))

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;C&lt;/span&gt;(params, inputs):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #036A07;"&gt;"Neural network functions"&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;for&lt;/span&gt; W, b &lt;span style="color: #0000FF;"&gt;in&lt;/span&gt; params:
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;outputs&lt;/span&gt; = np.dot(inputs, W) + b
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;inputs&lt;/span&gt; = swish(outputs)
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; outputs

&lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;initial guess for the weights and biases&lt;/span&gt;
&lt;span style="color: #BA36A5;"&gt;params&lt;/span&gt; = init_random_params(0.1, layer_sizes=[1, 8, 3])
&lt;/pre&gt;
&lt;/div&gt;
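
&lt;p&gt;
To make the shapes concrete, here is a small sketch of the same forward pass using plain NumPy instead of &lt;code&gt;autograd.numpy&lt;/code&gt; (an assumption made only so the snippet runs without autograd; no derivatives are taken here). It feeds a column of 25 arbitrary time points through the network and prints the parameter and output shapes.
&lt;/p&gt;

```python
import numpy as np  # plain NumPy is enough for a forward pass


def init_random_params(scale, layer_sizes, rs=None):
    """Build a list of (weights, biases) tuples, one for each layer."""
    rs = np.random.RandomState(0) if rs is None else rs
    return [(rs.randn(m, n) * scale, rs.randn(n) * scale)
            for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]


def swish(x):
    return x / (1.0 + np.exp(-x))


def C(params, inputs):
    """Forward pass; the value returned is the last layer before activation."""
    for W, b in params:
        outputs = np.dot(inputs, W) + b
        inputs = swish(outputs)
    return outputs


params = init_random_params(0.1, layer_sizes=[1, 8, 3])
t = np.linspace(0, 10, 25).reshape([-1, 1])  # 25 time points as a column

print([(W.shape, b.shape) for W, b in params])  # [((1, 8), (8,)), ((8, 3), (3,))]
print(C(params, t).shape)                       # (25, 3)
```

&lt;p&gt;
Each row of the output holds the three concentrations at one time point. Because the function returns &lt;code&gt;outputs&lt;/code&gt; rather than the activated values, the final layer is linear, and the whole network has 43 adjustable parameters.
&lt;/p&gt;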

&lt;p&gt;
Now, we train our network to reproduce the solution. I ran this block manually a bunch of times, but eventually you see that we can train a network with one hidden layer of 8 nodes to output all three concentrations pretty accurately. So there is no issue there: a neural network can represent the solution.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;objective_soln&lt;/span&gt;(params, step):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; np.&lt;span style="color: #006FE0;"&gt;sum&lt;/span&gt;((sol.y.T - C(params, sol.t.reshape([-1, 1])))**2)

&lt;span style="color: #BA36A5;"&gt;params&lt;/span&gt; = adam(grad(objective_soln), params,
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt; step_size=0.001, num_iters=500)

plt.plot(sol.t.reshape([-1, 1]), C(params, sol.t.reshape([-1, 1])),
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;sol.t, sol.y.T, &lt;span style="color: #008000;"&gt;'o'&lt;/span&gt;)
plt.legend([&lt;span style="color: #008000;"&gt;'A'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'B'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'C'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'Ann'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'Bnn'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'Cnn'&lt;/span&gt;])
plt.xlabel(&lt;span style="color: #008000;"&gt;'Time'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'C'&lt;/span&gt;)
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
&lt;img src="/media/d0abffb7b8615837cad7f2cceb378aac-65837YpQ.png"&gt; 
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;div id="outline-container-orga332637" class="outline-2"&gt;
&lt;h2 id="orga332637"&gt;&lt;span class="section-number-2"&gt;3&lt;/span&gt; Given a neural network function how do we get the right derivatives?&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-3"&gt;
&lt;p&gt;
The next issue is how to get the relevant derivatives. The solution method I developed here relies on using optimization to find a set of weights that produces a neural network whose derivatives are consistent with the ODEs. So, we need to be able to get the derivatives that appear in the equations.
&lt;/p&gt;

&lt;p&gt;
The neural network outputs three concentrations, and we need the time derivatives of them. Autograd provides three options: &lt;code&gt;grad&lt;/code&gt;, &lt;code&gt;elementwise_grad&lt;/code&gt; and &lt;code&gt;jacobian&lt;/code&gt;. We cannot use &lt;code&gt;grad&lt;/code&gt; because our function is not scalar. We cannot use &lt;code&gt;elementwise_grad&lt;/code&gt; because that will give the wrong shape (I think it may be the sum of the gradients). That leaves us with the &lt;code&gt;jacobian&lt;/code&gt;. This, however, gives an initially unintuitive (i.e. it isn't what we need out of the box) result. The output is 4-dimensional in this case, consistent with the documentation of that function.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;jacC&lt;/span&gt; = jacobian(C, 1)
jacC(params, sol.t.reshape([-1, 1])).shape
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
(17, 3, 17, 1)

&lt;/pre&gt;


&lt;p&gt;
Why does it have this shape? The time input we used has 17 time values, in a column vector. That leads to an output from the NN with a shape of (17, 3), i.e. the concentrations of each species at each time. The jacobian will output an array of shape (17, 3, 17, 1), and we have to extract the pieces we want from that. The first and third dimensions are related to the time steps. The second dimension is the species, and the last dimension has length one and is there only because the input is a column. The output at one time depends only on the input at that same time, so the derivatives we want are the entries where the first and third indices are equal. I use some fancy indexing on the array to get the desired arrays of the derivatives. This is not obvious out of the box. I only figured this out by direct comparison of the data from a numerical solution and the output of the jacobian. Here I show how to do that, and make sure that the derivatives we pull out are comparable to the derivatives defined by the ODEs above. Parity here means they are comparable.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;i&lt;/span&gt; = np.arange(&lt;span style="color: #006FE0;"&gt;len&lt;/span&gt;(sol.t))
plt.plot(jacC(params, sol.t.reshape([-1, 1]))[i, 0, i, 0],   -k1 * sol.y[0], &lt;span style="color: #008000;"&gt;'ro'&lt;/span&gt;)
plt.plot(jacC(params, sol.t.reshape([-1, 1]))[i, 1, i, 0],   -k2 * sol.y[1] + k1 * sol.y[0], &lt;span style="color: #008000;"&gt;'bo'&lt;/span&gt;)
plt.plot(jacC(params, sol.t.reshape([-1, 1]))[i, 2, i, 0],   k2 * sol.y[1], &lt;span style="color: #008000;"&gt;'go'&lt;/span&gt;)
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
[&amp;lt;matplotlib.lines.Line2D at 0x118a2e860&amp;gt;]

&lt;/pre&gt;



&lt;p&gt;
&lt;img src="/media/d0abffb7b8615837cad7f2cceb378aac-65837yLF.png"&gt; 
&lt;/p&gt;
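&lt;p&gt;
The index pattern can also be checked on a small stand-in problem that needs only NumPy. This is a sketch under stated assumptions, not the network from this post: the function &lt;code&gt;C&lt;/code&gt; below is a hypothetical closed-form replacement for the network, and &lt;code&gt;full_jacobian&lt;/code&gt; is a hypothetical central-difference helper that lays out its result the same way as the jacobian above (output shape followed by input shape).
&lt;/p&gt;

```python
import numpy as np

def C(t):
    """Hypothetical stand-in for the network: (n, 1) times in, (n, 3) out."""
    t = t[:, 0]
    e = np.exp(-t)
    return np.stack([e, t * e, 1 - e - t * e], axis=1)

def full_jacobian(f, t, h=1e-6):
    """Central-difference Jacobian with shape output shape + input shape,
    here (n, 3, n, 1)."""
    J = np.zeros(f(t).shape + t.shape)
    for j in range(t.shape[0]):
        dt = np.zeros_like(t)
        dt[j, 0] = h  # perturb only the j-th time
        J[:, :, j, 0] = (f(t + dt) - f(t - dt)) / (2 * h)
    return J

t = np.linspace(0, 2, 5).reshape((-1, 1))
J = full_jacobian(C, t)

# Keep only the entries where the output time index equals the input time index.
i = np.arange(len(t))
dCdt_fd = J[i, :, i, 0]

# Closed-form time derivatives of the three columns of C, for comparison.
tt = t[:, 0]
e = np.exp(-tt)
dCdt_exact = np.stack([-e, (1 - tt) * e, tt * e], axis=1)

print(J.shape, dCdt_fd.shape)
print(np.allclose(dCdt_fd, dCdt_exact, atol=1e-6))
```

&lt;p&gt;
The blocks of &lt;code&gt;J&lt;/code&gt; that pair different times are zero here, which is why most of the jacobian is wasted work.
&lt;/p&gt;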

&lt;p&gt;
Note this is pretty inefficient. It requires a lot of calculations to create the jacobian (here it has &lt;code&gt;17 * 3 * 17 = 867&lt;/code&gt; elements), and we don't need most of them. You could avoid this by creating separate neural networks for each species, and then just use elementwise_grad on each one. Alternatively, one might be able to more efficiently compute some vector-jacobian product. Nevertheless, it looks like we can get the correct derivatives out of the neural network; we just need a convenient function to return them. Here is one such function for this problem, using a fancier slicing and reshaping to get the derivative array.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;Derivatives&lt;/span&gt;
&lt;span style="color: #BA36A5;"&gt;jac&lt;/span&gt; = jacobian(C, 1)

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;dCdt&lt;/span&gt;(params, t):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;i&lt;/span&gt; = np.arange(&lt;span style="color: #006FE0;"&gt;len&lt;/span&gt;(t))
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; jac(params, t)[i, :, i].reshape((&lt;span style="color: #006FE0;"&gt;len&lt;/span&gt;(t), 3))
&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;


&lt;div id="outline-container-orgf85faff" class="outline-2"&gt;
&lt;h2 id="orgf85faff"&gt;&lt;span class="section-number-2"&gt;4&lt;/span&gt; Solving the system of ODEs with a neural network&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-4"&gt;
&lt;p&gt;
Finally, we are ready to try solving the ODEs solely by the neural network approach. We reinitialize the neural network first, and define a time grid to solve it on.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;t&lt;/span&gt; = np.linspace(0, 10, 25).reshape((-1, 1))
&lt;span style="color: #BA36A5;"&gt;params&lt;/span&gt; = init_random_params(0.1, layer_sizes=[1, 8, 3])
&lt;span style="color: #BA36A5;"&gt;i&lt;/span&gt; = 0    &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;number of training steps&lt;/span&gt;
&lt;span style="color: #BA36A5;"&gt;N&lt;/span&gt; = 501  &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;epochs for training&lt;/span&gt;
&lt;span style="color: #BA36A5;"&gt;et&lt;/span&gt; = 0.0 &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;total elapsed time&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
We define our objective function. This function will be zero at the perfect solution, and has contributions for each mole balance and the initial conditions. It could make sense to put additional penalties for things like negative concentrations, or the sum of concentrations is a constant, but we do not do that here, and it does not seem to be necessary.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;objective&lt;/span&gt;(params, step):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;Ca&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;Cb&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;Cc&lt;/span&gt; = C(params, t).T
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;dCadt&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;dCbdt&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;dCcdt&lt;/span&gt; = dCdt(params, t).T

&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;z1&lt;/span&gt; = np.&lt;span style="color: #006FE0;"&gt;sum&lt;/span&gt;((dCadt + k1 * Ca)**2)
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;z2&lt;/span&gt; = np.&lt;span style="color: #006FE0;"&gt;sum&lt;/span&gt;((dCbdt - k1 * Ca + k2 * Cb)**2)
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;z3&lt;/span&gt; = np.&lt;span style="color: #006FE0;"&gt;sum&lt;/span&gt;((dCcdt - k2 * Cb)**2)
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;ic&lt;/span&gt; = np.&lt;span style="color: #006FE0;"&gt;sum&lt;/span&gt;((np.array([Ca[0], Cb[0], Cc[0]]) - C0)**2)  &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;initial conditions&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; z1 + z2 + z3 + ic

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;callback&lt;/span&gt;(params, step, g):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;if&lt;/span&gt; step % 100 == 0:
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;print&lt;/span&gt;(&lt;span style="color: #008000;"&gt;"Iteration {0:3d} objective {1}"&lt;/span&gt;.&lt;span style="color: #006FE0;"&gt;format&lt;/span&gt;(step,
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt; objective(params, step)))

objective(params, 0)  &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;make sure the objective is scalar&lt;/span&gt;
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
5.2502237371050295

&lt;/pre&gt;
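&lt;p&gt;
To see that an objective of this form really is zero at the perfect solution, one can evaluate it on the closed-form solution of the series reaction A → B → C. This is only a sketch: the values of &lt;code&gt;k1&lt;/code&gt;, &lt;code&gt;k2&lt;/code&gt; and &lt;code&gt;C0&lt;/code&gt; below are assumptions chosen for illustration (with only A present initially and &lt;code&gt;k1&lt;/code&gt; different from &lt;code&gt;k2&lt;/code&gt;), and &lt;code&gt;residual_objective&lt;/code&gt; is a hypothetical helper that takes the concentrations and derivatives directly instead of network parameters.
&lt;/p&gt;

```python
import numpy as np

# Assumed values for illustration only; the post sets its own k1, k2 and C0.
k1, k2 = 1.0, 0.5
C0 = np.array([1.0, 0.0, 0.0])
t = np.linspace(0, 10, 25)

def residual_objective(Ca, Cb, Cc, dCadt, dCbdt, dCcdt):
    """Sum of squared mole balance residuals plus the initial condition error."""
    z1 = np.sum((dCadt + k1 * Ca)**2)
    z2 = np.sum((dCbdt - k1 * Ca + k2 * Cb)**2)
    z3 = np.sum((dCcdt - k2 * Cb)**2)
    ic = np.sum((np.array([Ca[0], Cb[0], Cc[0]]) - C0)**2)
    return z1 + z2 + z3 + ic

# Closed-form solution of A -> B -> C and its time derivatives.
ea, eb = np.exp(-k1 * t), np.exp(-k2 * t)
Ca = C0[0] * ea
Cb = C0[0] * k1 / (k2 - k1) * (ea - eb)
Cc = C0[0] - Ca - Cb
dCadt = -k1 * C0[0] * ea
dCbdt = C0[0] * k1 / (k2 - k1) * (-k1 * ea + k2 * eb)
dCcdt = -(dCadt + dCbdt)

exact = residual_objective(Ca, Cb, Cc, dCadt, dCbdt, dCcdt)
# Scaling Ca by 1% breaks the first two balances and the initial condition.
perturbed = residual_objective(1.01 * Ca, Cb, Cc, dCadt, dCbdt, dCcdt)
print(exact, perturbed)
```

&lt;p&gt;
The exact solution gives an objective at the level of round-off, while the perturbed one is clearly nonzero, which is the behavior the optimizer relies on.
&lt;/p&gt;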

&lt;p&gt;
Finally, we run the optimization. I also manually ran this block several times.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; time
&lt;span style="color: #BA36A5;"&gt;t0&lt;/span&gt; = time.time()

&lt;span style="color: #BA36A5;"&gt;params&lt;/span&gt; = adam(grad(objective), params,
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt; step_size=0.001, num_iters=N, callback=callback)

&lt;span style="color: #BA36A5;"&gt;i&lt;/span&gt; += N
&lt;span style="color: #BA36A5;"&gt;t1&lt;/span&gt; = (time.time() - t0) / 60
&lt;span style="color: #BA36A5;"&gt;et&lt;/span&gt; += t1

plt.plot(t, C(params, t), sol.t, sol.y.T, &lt;span style="color: #008000;"&gt;'o'&lt;/span&gt;)
plt.legend([&lt;span style="color: #008000;"&gt;'Ann'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'Bnn'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'Cnn'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'A'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'B'&lt;/span&gt;, &lt;span style="color: #008000;"&gt;'C'&lt;/span&gt;])
plt.xlabel(&lt;span style="color: #008000;"&gt;'Time'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'C'&lt;/span&gt;)
&lt;span style="color: #0000FF;"&gt;print&lt;/span&gt;(f&lt;span style="color: #008000;"&gt;'{t1:1.1f} minutes elapsed this time. Total time = {et:1.2f} min. Total epochs = {i}.'&lt;/span&gt;)
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
Iteration   0 objective 0.00047651643957525214
Iteration 100 objective 0.0004473301532609342
Iteration 200 objective 0.00041218410058863227
Iteration 300 objective 0.00037161526137030344
Iteration 400 objective 0.000327567400443358
Iteration 500 objective 0.0002836975879675981
0.6 minutes elapsed this time. Total time = 4.05 min. Total epochs = 3006.


&lt;/pre&gt;


&lt;p&gt;
&lt;img src="/media/d0abffb7b8615837cad7f2cceb378aac-65837AXS.png"&gt; 
&lt;/p&gt;

&lt;p&gt;
The effort seems to have been worth it though; we get a pretty good solution from our neural network.
&lt;/p&gt;

&lt;p&gt;
We can check the accuracy of the derivatives by noting the sum of the derivatives in this case should be zero. Here you can see that the sum is pretty small. It would take additional optimization to a lower error to get this to be smaller.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;plt.plot(t, np.&lt;span style="color: #006FE0;"&gt;sum&lt;/span&gt;(dCdt(params, t), axis=1))
plt.xlabel(&lt;span style="color: #008000;"&gt;'Time'&lt;/span&gt;)
plt.ylabel(r&lt;span style="color: #008000;"&gt;'$\Sigma dC/dt$'&lt;/span&gt;)
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
&lt;img src="/media/d0abffb7b8615837cad7f2cceb378aac-65837NhY.png"&gt; 
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;



&lt;div id="outline-container-orgbbded67" class="outline-2"&gt;
&lt;h2 id="orgbbded67"&gt;&lt;span class="section-number-2"&gt;5&lt;/span&gt; Summary&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-5"&gt;
&lt;p&gt;
In the end, this method also works for systems of ODEs. There is some subtlety in how to get the relevant derivatives from the jacobian, but after that, it is essentially the same. I think it would be &lt;i&gt;much&lt;/i&gt; faster to do this with separate neural networks for each function in the solution because then you do not need the jacobian; you can use elementwise_grad.
&lt;/p&gt;

&lt;p&gt;
This is not faster than direct numerical integration. One benefit to this solution over a numerical solution is we get an actual continuous function as the solution, rather than an array of data.  This solution is not reliable at longer times, but then again neither is extrapolation of numeric data. It could be interesting to explore if this has any benefits for stiff equations. Maybe another day. For now, I am declaring victory for autograd on this problem.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Copyright (C) 2018 by John Kitchin. See the &lt;a href="/copying.html"&gt;License&lt;/a&gt; for information about copying.&lt;/p&gt;
&lt;p&gt;&lt;a href="/org/2018/11/02/Solving-coupled-ODEs-with-a-neural-network-and-autograd.org"&gt;org-mode source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Org-mode version = 9.1.14&lt;/p&gt;]]></content:encoded>
    </item>
    <item>
      <title>A differentiable ODE integrator for sensitivity analysis</title>
      <link>https://kitchingroup.cheme.cmu.edu/blog/2018/10/11/A-differentiable-ODE-integrator-for-sensitivity-analysis</link>
      <pubDate>Thu, 11 Oct 2018 12:13:01 EDT</pubDate>
      <category><![CDATA[autograd]]></category>
      <category><![CDATA[sensitivity]]></category>
      <category><![CDATA[ode]]></category>
      <guid isPermaLink="false">x1b-cquyAyl6Ic4uE4zs0jj8a6U=</guid>
      <description>A differentiable ODE integrator for sensitivity analysis</description>
      <content:encoded><![CDATA[


&lt;p&gt;
&lt;a href="http://kitchingroup.cheme.cmu.edu/blog/2018/10/10/Autograd-and-the-derivative-of-an-integral-function/"&gt;Last time&lt;/a&gt; I wrote about using automatic differentiation to find the derivative of an integral function. A related topic is finding derivatives of functions that are defined by differential equations. We typically use a numerical integrator to solve these equations. That leaves us with a numeric solution, which we then have to use to approximate derivatives. What if the integrator itself were differentiable? It is, after all, just a program, and automatic differentiation should be able to tell us the derivatives of functions that use it. This is not a new idea; there is already a differentiable ODE solver in &lt;a href="https://www.tensorflow.org/versions/r1.1/api_docs/python/tf/contrib/integrate/odeint"&gt;Tensorflow&lt;/a&gt;. Here I will implement a simple Runge-Kutta integrator and then show how we can use automatic differentiation to do &lt;i&gt;sensitivity analysis&lt;/i&gt; on the numeric solution.
&lt;/p&gt;

&lt;p&gt;
I previously used autograd for sensitivity analysis on &lt;i&gt;analytical&lt;/i&gt; solutions in this &lt;a href="http://kitchingroup.cheme.cmu.edu/blog/2017/11/15/Sensitivity-analysis-using-automatic-differentiation-in-Python/"&gt;post&lt;/a&gt;. Here I will compare those results to the results from sensitivity analysis on the &lt;i&gt;numerical solutions&lt;/i&gt;.
&lt;/p&gt;

&lt;p&gt;
First, we need an autograd compatible ODE integrator. Here is one implementation of a simple, fourth order Runge-Kutta integrator. Usually, I would store the solution by assigning into an array by index, but that was not compatible with autograd, so I just accumulate the solution in a list. This is a limitation of autograd, and it is probably not an issue with Tensorflow, for example, or probably pytorch. Those packages are more sophisticated than autograd, and more difficult to use. Here I am just prototyping an idea, so we stick with autograd.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; autograd.numpy &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; np
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; autograd &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; grad
%matplotlib inline
&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; matplotlib.pyplot &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; plt

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;rk4&lt;/span&gt;(f, tspan, y0, N=50):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;x&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;h&lt;/span&gt; = np.linspace(*tspan, N, retstep=&lt;span style="color: #D0372D;"&gt;True&lt;/span&gt;)
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;y&lt;/span&gt; = []
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;y&lt;/span&gt; = y + [y0]
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;for&lt;/span&gt; i &lt;span style="color: #0000FF;"&gt;in&lt;/span&gt; &lt;span style="color: #006FE0;"&gt;range&lt;/span&gt;(0, &lt;span style="color: #006FE0;"&gt;len&lt;/span&gt;(x) - 1):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;k1&lt;/span&gt; = h * f(x[i], y[i])
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;k2&lt;/span&gt; = h * f(x[i] + h / 2, y[i] + k1 / 2)
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;k3&lt;/span&gt; = h * f(x[i] + h / 2, y[i] + k2 / 2)
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;k4&lt;/span&gt; = h * f(x[i + 1], y[i] + k3)
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;y&lt;/span&gt; += [y[-1] + (k1 + (2 * k2) + (2 * k3) + k4) / 6]
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; x, y
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
Now, we just check that it works as expected:
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;Ca0&lt;/span&gt; = 1.0
&lt;span style="color: #BA36A5;"&gt;k1&lt;/span&gt; = &lt;span style="color: #BA36A5;"&gt;k_1&lt;/span&gt; = 3.0

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;dCdt&lt;/span&gt;(t, Ca):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; -k1 * Ca + k_1 * (Ca0 - Ca)

&lt;span style="color: #BA36A5;"&gt;t&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;Ca&lt;/span&gt; = rk4(dCdt, (0, 0.5), Ca0)

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;analytical_A&lt;/span&gt;(t, k1, k_1):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; Ca0 / (k1 + k_1) * (k1 * np.exp(-(k1 + k_1) * t) + k_1)

plt.plot(t, Ca, label=&lt;span style="color: #008000;"&gt;'RK4'&lt;/span&gt;)
plt.plot(t, analytical_A(t, k1, k_1), &lt;span style="color: #008000;"&gt;'r--'&lt;/span&gt;, label=&lt;span style="color: #008000;"&gt;'analytical'&lt;/span&gt;)
plt.xlabel(&lt;span style="color: #008000;"&gt;'t'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'[A]'&lt;/span&gt;)
plt.xlim([0, 0.5])
plt.ylim([0.5, 1])
plt.legend()
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
&lt;img src="/media/6a1c5e4c896d855655b8da8b54214af3-90490Zdl.png"&gt; 
&lt;/p&gt;

&lt;p&gt;
That looks fine, we cannot visually distinguish the two solutions, and they both look like Figure 1 in this &lt;a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.428.6699&amp;amp;rep=rep1&amp;amp;type=pdf"&gt;paper&lt;/a&gt;. Note the analytical solution is not that complex, but it would not take much variation of the rate law to make this solution difficult to derive.
&lt;/p&gt;
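&lt;p&gt;
The visual agreement can also be put into a number. The sketch below repeats the same fixed-step scheme with plain NumPy (no autograd is needed just to compare values) and reports the largest absolute difference from the analytical solution on the same 50 point grid. The stage names &lt;code&gt;s1&lt;/code&gt; to &lt;code&gt;s4&lt;/code&gt; and the variable &lt;code&gt;max_err&lt;/code&gt; are introduced here for illustration.
&lt;/p&gt;

```python
import numpy as np

def rk4(f, tspan, y0, N=50):
    """Fixed-step fourth order Runge-Kutta, same scheme as in the post."""
    x, h = np.linspace(*tspan, N, retstep=True)
    y = [y0]
    for i in range(N - 1):
        s1 = h * f(x[i], y[i])
        s2 = h * f(x[i] + h / 2, y[i] + s1 / 2)
        s3 = h * f(x[i] + h / 2, y[i] + s2 / 2)
        s4 = h * f(x[i + 1], y[i] + s3)
        y.append(y[i] + (s1 + 2 * s2 + 2 * s3 + s4) / 6)
    return x, np.array(y)

Ca0 = 1.0
k1 = k_1 = 3.0

def dCdt(t, Ca):
    return -k1 * Ca + k_1 * (Ca0 - Ca)

def analytical_A(t, k1, k_1):
    return Ca0 / (k1 + k_1) * (k1 * np.exp(-(k1 + k_1) * t) + k_1)

t, Ca = rk4(dCdt, (0, 0.5), Ca0)
max_err = np.max(np.abs(Ca - analytical_A(t, k1, k_1)))
print(f"max |RK4 - analytical| = {max_err:.2e}")
```

&lt;p&gt;
For this mildly varying problem the fixed step is small compared to the time scale of the reaction, so the difference is far below what a plot can show.
&lt;/p&gt;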

&lt;p&gt;
Next, to do sensitivity analysis, we need to define a function for \(A\) that depends on the rate constants, so we can take a derivative of it with respect to the parameters we want the sensitivity from. We seek the derivatives: \(\frac{dC_A}{dk_1}\) and \(\frac{dC_A}{dk_{-1}}\). Here is a function that does that. It will return the value of [A] at \(t\) given an initial concentration and the rate constants.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;A&lt;/span&gt;(Ca0, k1, k_1, t):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;dCdt&lt;/span&gt;(t, Ca):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; -k1 * Ca + k_1 * (Ca0 - Ca)
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;t&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;Ca_&lt;/span&gt; = rk4(dCdt, (0, t), Ca0)
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; Ca_[-1]

&lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;Here are the two derivatives we seek.&lt;/span&gt;
&lt;span style="color: #BA36A5;"&gt;dCadk1&lt;/span&gt; = grad(A, 1)
&lt;span style="color: #BA36A5;"&gt;dCadk_1&lt;/span&gt; = grad(A, 2)
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
We also use autograd to get the derivatives from the analytical solution for comparison.
&lt;/p&gt;
&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;dAdk1&lt;/span&gt; = grad(analytical_A, 1)
&lt;span style="color: #BA36A5;"&gt;dAdk_1&lt;/span&gt; = grad(analytical_A, 2)
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
Now, we can plot the sensitivities over the time range and compare them. I use list comprehensions here because the AD functions aren't vectorized.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;tspan&lt;/span&gt; = np.linspace(0, 0.5)

&lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;From the numerical solutions&lt;/span&gt;
&lt;span style="color: #BA36A5;"&gt;k1_sensitivity&lt;/span&gt; = [dCadk1(1.0, 3.0, 3.0, t) &lt;span style="color: #0000FF;"&gt;for&lt;/span&gt; t &lt;span style="color: #0000FF;"&gt;in&lt;/span&gt; tspan]
&lt;span style="color: #BA36A5;"&gt;k_1_sensitivity&lt;/span&gt; = [dCadk_1(1.0, 3.0, 3.0, t) &lt;span style="color: #0000FF;"&gt;for&lt;/span&gt; t &lt;span style="color: #0000FF;"&gt;in&lt;/span&gt; tspan]

&lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;from the analytical solutions&lt;/span&gt;
&lt;span style="color: #BA36A5;"&gt;ak1_sensitivity&lt;/span&gt; = [dAdk1(t, 3.0, 3.0) &lt;span style="color: #0000FF;"&gt;for&lt;/span&gt; t &lt;span style="color: #0000FF;"&gt;in&lt;/span&gt; tspan]
&lt;span style="color: #BA36A5;"&gt;ak_1_sensitivity&lt;/span&gt; = [dAdk_1(t, 3.0, 3.0) &lt;span style="color: #0000FF;"&gt;for&lt;/span&gt; t &lt;span style="color: #0000FF;"&gt;in&lt;/span&gt; tspan]

plt.plot(tspan, np.&lt;span style="color: #006FE0;"&gt;abs&lt;/span&gt;(ak1_sensitivity), &lt;span style="color: #008000;"&gt;'b-'&lt;/span&gt;, label=&lt;span style="color: #008000;"&gt;'k1 analytical'&lt;/span&gt;)
plt.plot(tspan, np.&lt;span style="color: #006FE0;"&gt;abs&lt;/span&gt;(k1_sensitivity), &lt;span style="color: #008000;"&gt;'y--'&lt;/span&gt;, label=&lt;span style="color: #008000;"&gt;'k1 numerical'&lt;/span&gt;)

plt.plot(tspan, np.&lt;span style="color: #006FE0;"&gt;abs&lt;/span&gt;(ak_1_sensitivity), &lt;span style="color: #008000;"&gt;'r-'&lt;/span&gt;, label=&lt;span style="color: #008000;"&gt;'k_1 analytical'&lt;/span&gt;)
plt.plot(tspan, np.&lt;span style="color: #006FE0;"&gt;abs&lt;/span&gt;(k_1_sensitivity), &lt;span style="color: #008000;"&gt;'k--'&lt;/span&gt;, label=&lt;span style="color: #008000;"&gt;'k_1 numerical'&lt;/span&gt;)

plt.xlim([0, 0.5])
plt.ylim([0, 0.1])
plt.legend()
plt.xlabel(&lt;span style="color: #008000;"&gt;'t'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'sensitivity'&lt;/span&gt;)
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;
&lt;img src="/media/6a1c5e4c896d855655b8da8b54214af3-90490mnr.png"&gt; 
&lt;/p&gt;



&lt;p&gt;
The two approaches are indistinguishable in the plot. I will note that it takes a lot longer to make the graph from the numerical solution than from the analytical solution because at each point you have to reintegrate the solution from the beginning, which is certainly not efficient. That is an implementation detail that could probably be solved, at the expense of making the code look different than the way I would normally think about the problem.
&lt;/p&gt;
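&lt;p&gt;
As an independent cross-check that does not rely on automatic differentiation at all, the same sensitivities can be estimated by central finite differences through the integrator and compared with hand-derived expressions. This is a sketch only: finite differences are swapped in for autograd so that it runs with NumPy alone, and &lt;code&gt;rk4_final&lt;/code&gt;, &lt;code&gt;central_diff&lt;/code&gt;, &lt;code&gt;dA_dk1_exact&lt;/code&gt; and &lt;code&gt;dA_dk_1_exact&lt;/code&gt; are hypothetical helper names. The closed-form expressions come from applying the quotient rule to the analytical solution, with \(s = k_1 + k_{-1}\).
&lt;/p&gt;

```python
import numpy as np

def rk4_final(f, t_end, y0, N=50):
    """Fixed-step RK4 from 0 to t_end on N grid points; returns the final value."""
    h = t_end / (N - 1)
    y = y0
    for i in range(N - 1):
        x = i * h
        s1 = h * f(x, y)
        s2 = h * f(x + h / 2, y + s1 / 2)
        s3 = h * f(x + h / 2, y + s2 / 2)
        s4 = h * f(x + h, y + s3)
        y = y + (s1 + 2 * s2 + 2 * s3 + s4) / 6
    return y

def A(Ca0, k1, k_1, t):
    """[A] at time t from the numerical solution."""
    def dCdt(t, Ca):
        return -k1 * Ca + k_1 * (Ca0 - Ca)
    return rk4_final(dCdt, t, Ca0)

def central_diff(fun, x, eps=1e-6):
    """Central finite-difference estimate of d fun / d x."""
    return (fun(x + eps) - fun(x - eps)) / (2 * eps)

def dA_dk1_exact(t, k1, k_1, Ca0=1.0):
    s = k1 + k_1
    e = np.exp(-s * t)
    return Ca0 * ((e - k1 * t * e) * s - (k1 * e + k_1)) / s**2

def dA_dk_1_exact(t, k1, k_1, Ca0=1.0):
    s = k1 + k_1
    e = np.exp(-s * t)
    return Ca0 * ((1 - k1 * t * e) * s - (k1 * e + k_1)) / s**2

tspan = np.linspace(0, 0.5, 11)
fd_k1 = np.array([central_diff(lambda k: A(1.0, k, 3.0, t), 3.0) for t in tspan])
fd_k_1 = np.array([central_diff(lambda k: A(1.0, 3.0, k, t), 3.0) for t in tspan])

err_k1 = np.max(np.abs(fd_k1 - dA_dk1_exact(tspan, 3.0, 3.0)))
err_k_1 = np.max(np.abs(fd_k_1 - dA_dk_1_exact(tspan, 3.0, 3.0)))
print(err_k1, err_k_1)
```

&lt;p&gt;
Finite differences need two integrations per parameter and a step size that has to be chosen by hand, which is part of what makes the automatic differentiation route attractive.
&lt;/p&gt;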

&lt;p&gt;
On the other hand, it is remarkable we get derivatives from the numerical solution, &lt;i&gt;and they look really good&lt;/i&gt;! That means we could do sensitivity analysis on more complex reactions, and still have a reasonable way to get sensitivity. The work here is a long way from that. My simple Runge-Kutta integrator isn't directly useful for systems of ODEs, it wouldn't work well on stiff problems, the step size isn't adaptive, etc. The Tensorflow implementation might be more suitable for this though, and maybe this post is motivation to learn how to use it!
&lt;/p&gt;
&lt;p&gt;Copyright (C) 2018 by John Kitchin. See the &lt;a href="/copying.html"&gt;License&lt;/a&gt; for information about copying.&lt;/p&gt;
&lt;p&gt;&lt;a href="/org/2018/10/11/A-differentiable-ODE-integrator-for-sensitivity-analysis.org"&gt;org-mode source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Org-mode version = 9.1.13&lt;/p&gt;]]></content:encoded>
    </item>
    <item>
      <title>Compressibility factor variation from the van der Waals equation by three different approaches</title>
      <link>https://kitchingroup.cheme.cmu.edu/blog/2018/10/07/Compressibility-factor-variation-from-the-van-der-Waals-equation-by-three-different-approaches</link>
      <pubDate>Sun, 07 Oct 2018 13:08:11 EDT</pubDate>
      <category><![CDATA[autograd]]></category>
      <category><![CDATA[ode]]></category>
      <category><![CDATA[nonlinear algebra]]></category>
      <category><![CDATA[python]]></category>
      <guid isPermaLink="false">nzuQ2552fiegwa_OqTuF-vQ3pMs=</guid>
      <description>Compressibility factor variation from the van der Waals equation by three different approaches</description>
      <content:encoded><![CDATA[


&lt;div id="table-of-contents"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;div id="text-table-of-contents"&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#org2fd7cfa"&gt;1. Method 1 - fsolve&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#org7ade82a"&gt;2. Method 2 - solve_ivp&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#orge63b16e"&gt;3. Method 3 - autograd + solve_ivp&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;
In the book &lt;span class="underline"&gt;Problem solving in chemical and biochemical engineering with POLYMATH, Excel and Matlab&lt;/span&gt; by Cutlip and Shacham there is a problem (7.1) where you want to plot the compressibility factor for CO&lt;sub&gt;2&lt;/sub&gt; over a range of \(0.1 \le P_r \le 10\) for a constant \(T_r=1.1\) using the van der Waals equation of state. There are two standard ways to do this:
&lt;/p&gt;

&lt;ol class="org-ol"&gt;
&lt;li&gt;Solve a nonlinear equation for different values of \(P_r\).&lt;/li&gt;
&lt;li&gt;Solve a nonlinear equation for one value of \(P_r\), then derive an ODE for how the compressibility varies with \(P_r\) and integrate it over the relevant range.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;
In this post, we compare and contrast the two methods, and consider a variation of the second method that uses automatic differentiation.
&lt;/p&gt;

&lt;div id="outline-container-org2fd7cfa" class="outline-2"&gt;
&lt;h2 id="org2fd7cfa"&gt;&lt;span class="section-number-2"&gt;1&lt;/span&gt; Method 1 - fsolve&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-1"&gt;
&lt;p&gt;
The van der Waals equation of state is:
&lt;/p&gt;

&lt;p&gt;
\(P = \frac{R T}{V - b} - \frac{a}{V^2}\).
&lt;/p&gt;

&lt;p&gt;
We define the reduced pressure as \(P_r = P / P_c\), and the reduced temperature as \(T_r = T / T_c\).
&lt;/p&gt;

&lt;p&gt;
So, we simply solve for \(V\) at a given \(P_r\), and then compute the compressibility factor \(Z = \frac{P V}{R T}\). There is a subtle trick needed to make this easy to solve: multiply each side of the equation by \((V - b)\) to avoid a singularity when \(V = b\), which happens in this case near \(P_r \approx 7.5\).
&lt;/p&gt;
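
&lt;p&gt;
Written out, multiplying through by \((V - b)\) and moving everything to one side gives the residual that the solver drives to zero. This is only a restatement of the step above, using \(P = P_r P_c\) and \(T = T_r T_c\):
&lt;/p&gt;

&lt;p&gt;
\(P (V - b) - R T + \frac{a}{V^2} (V - b) = 0\)
&lt;/p&gt;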

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; scipy.optimize &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; fsolve
&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; numpy &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; np
%matplotlib inline
&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; matplotlib.pyplot &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; plt

&lt;span style="color: #BA36A5;"&gt;R&lt;/span&gt; = 0.08206
&lt;span style="color: #BA36A5;"&gt;Pc&lt;/span&gt; = 72.9
&lt;span style="color: #BA36A5;"&gt;Tc&lt;/span&gt; = 304.2

&lt;span style="color: #BA36A5;"&gt;a&lt;/span&gt; = 27 * R**2 * Tc**2 / (Pc * 64)
&lt;span style="color: #BA36A5;"&gt;b&lt;/span&gt; = R * Tc / (8 * Pc)

&lt;span style="color: #BA36A5;"&gt;Tr&lt;/span&gt; = 1.1

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;objective&lt;/span&gt;(V, Pr):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;P&lt;/span&gt; = Pr * Pc
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;T&lt;/span&gt; = Tr * Tc
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; P * (V - b) - (R * T)  +  a / V**2 * (V - b)


&lt;span style="color: #BA36A5;"&gt;Pr_range&lt;/span&gt; = np.linspace(0.1, 10)
&lt;span style="color: #BA36A5;"&gt;V&lt;/span&gt; = [fsolve(objective, 3, args=(Pr,))[0] &lt;span style="color: #0000FF;"&gt;for&lt;/span&gt; Pr &lt;span style="color: #0000FF;"&gt;in&lt;/span&gt; Pr_range]

&lt;span style="color: #BA36A5;"&gt;T&lt;/span&gt; = Tr * Tc
&lt;span style="color: #BA36A5;"&gt;P_range&lt;/span&gt; = Pr_range * Pc
&lt;span style="color: #BA36A5;"&gt;Z&lt;/span&gt; = P_range * V / (R * T)

plt.plot(Pr_range, Z)
plt.xlabel(&lt;span style="color: #008000;"&gt;'$P_r$'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'Z'&lt;/span&gt;)
plt.xlim([0, 10])
plt.ylim([0, 2])
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
(0, 2)

&lt;/pre&gt;



&lt;p&gt;
&lt;img src="/media/13bc1d996aa4bd032faad00425793120-90490byl.png"&gt; 
&lt;/p&gt;

&lt;p&gt;
That looks like Figure 7-1 in the book. This approach is fine, but the equation did require a little algebraic finesse to solve, and you have to use some iteration to get the solution.
&lt;/p&gt;
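
&lt;p&gt;
As a cross-check that does not depend on an initial guess, note that multiplying the van der Waals equation through by \(V^2 (V - b)\) gives a cubic in \(V\): \(P V^3 - (P b + R T) V^2 + a V - a b = 0\). The block below is a minimal sketch of that idea, not the approach in the book. It assumes the same constants as above, uses &lt;code&gt;np.roots&lt;/code&gt;, and the helper name &lt;code&gt;molar_volume&lt;/code&gt; is made up for this illustration.
&lt;/p&gt;

```python
import numpy as np

# Same constants as in the fsolve example
R = 0.08206
Pc = 72.9
Tc = 304.2
Tr = 1.1

a = 27 * R**2 * Tc**2 / (Pc * 64)
b = R * Tc / (8 * Pc)
T = Tr * Tc


def molar_volume(Pr):
    """Hypothetical helper: real root of the van der Waals cubic at reduced pressure Pr."""
    P = Pr * Pc
    # Coefficients of P V^3 - (P b + R T) V^2 + a V - a b = 0
    roots = np.roots([P, -(P * b + R * T), a, -a * b])
    # Above the critical temperature only one root is real; keep it.
    real_roots = roots[np.isclose(roots.imag, 0.0)].real
    return real_roots.max()


Pr_range = np.linspace(0.1, 10)
V = np.array([molar_volume(Pr) for Pr in Pr_range])
Z = Pr_range * Pc * V / (R * T)
```

&lt;p&gt;
This only works because the van der Waals equation happens to be a polynomial in \(V\). For an equation of state that is not, an iterative solver like fsolve is still needed.
&lt;/p&gt;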
&lt;/div&gt;
&lt;/div&gt;

&lt;div id="outline-container-org7ade82a" class="outline-2"&gt;
&lt;h2 id="org7ade82a"&gt;&lt;span class="section-number-2"&gt;2&lt;/span&gt; Method 2 - solve_ivp&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-2"&gt;
&lt;p&gt;
In this method, you have to derive an expression for \(\frac{dV}{dP_r}\). That derivation goes like this:
&lt;/p&gt;

&lt;p&gt;
\(\frac{dV}{dP_r} = \frac{dV}{dP} \frac{dP}{dP_r}\)
&lt;/p&gt;

&lt;p&gt;
The first term \(\frac{dV}{dP}\) is \((\frac{dP}{dV})^{-1}\), which we can derive directly from the van der Waals equation, and the second term is just a constant: \(P_c\) from the definition of \(P_r\).
&lt;/p&gt;

&lt;p&gt;
Differentiating the van der Waals equation with respect to \(V\) at constant \(T\) gives:
&lt;/p&gt;

&lt;p&gt;
\(\frac{dP}{dV} = -\frac{R T}{(V - b)^2} + \frac{2 a}{V^3}\)
&lt;/p&gt;

&lt;p&gt;
We need to solve for one V, at the beginning of the range of \(P_r\) we are interested in.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;V0, = fsolve(objective, 3, args=(0.1,))
V0
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
3.6764763125625461

&lt;/pre&gt;

&lt;p&gt;
Now, we can define the functions, and integrate them to get the same solution. I defined these pretty verbosely, just for readability.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; scipy.integrate &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; solve_ivp

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;dPdV&lt;/span&gt;(V):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; -R * T / (V - b)**2 + 2 * a / V**3

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;dVdP&lt;/span&gt;(V):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; 1 / dPdV(V)

&lt;span style="color: #BA36A5;"&gt;dPdPr&lt;/span&gt; = Pc

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;dVdPr&lt;/span&gt;(Pr, V):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; dVdP(V) * dPdPr

&lt;span style="color: #BA36A5;"&gt;Pr_span&lt;/span&gt; = (0.1, 10)
&lt;span style="color: #BA36A5;"&gt;Pr_eval&lt;/span&gt;, &lt;span style="color: #BA36A5;"&gt;h&lt;/span&gt; = np.linspace(*Pr_span, retstep=&lt;span style="color: #D0372D;"&gt;True&lt;/span&gt;)

&lt;span style="color: #BA36A5;"&gt;sol&lt;/span&gt; = solve_ivp(dVdPr, Pr_span, (V0,), dense_output=&lt;span style="color: #D0372D;"&gt;True&lt;/span&gt;, max_step=h)

&lt;span style="color: #BA36A5;"&gt;V&lt;/span&gt; = sol.y[0]
&lt;span style="color: #BA36A5;"&gt;P&lt;/span&gt; = sol.t * Pc
&lt;span style="color: #BA36A5;"&gt;Z&lt;/span&gt; = P * V / (R * T)
plt.plot(sol.t, Z)
plt.xlabel(&lt;span style="color: #008000;"&gt;'$P_r$'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'Z'&lt;/span&gt;)
plt.xlim([0, 10])
plt.ylim([0, 2])
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
(0, 2)

&lt;/pre&gt;



&lt;p&gt;
&lt;img src="/media/13bc1d996aa4bd032faad00425793120-90490o8r.png"&gt; 
&lt;/p&gt;

&lt;p&gt;
This also looks like Figure 7-1. It is arguably a better approach since we only need an initial condition, and after that have a reliable integration (rather than many iterative solutions from an initial guess of the solution in fsolve).
&lt;/p&gt;
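
&lt;p&gt;
The call above requests &lt;code&gt;dense_output=True&lt;/code&gt; but only plots the solver's own steps. As a sketch of how the dense interpolant could be used, the block below evaluates &lt;code&gt;sol.sol&lt;/code&gt; on an evenly spaced grid and checks that the integrated volumes still satisfy the van der Waals equation. The tolerances &lt;code&gt;rtol&lt;/code&gt; and &lt;code&gt;atol&lt;/code&gt; and the name &lt;code&gt;P_vdw&lt;/code&gt; are choices made for this illustration, not values from the original solution.
&lt;/p&gt;

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import fsolve

R = 0.08206
Pc = 72.9
Tc = 304.2
Tr = 1.1
T = Tr * Tc

a = 27 * R**2 * Tc**2 / (Pc * 64)
b = R * Tc / (8 * Pc)


def P_vdw(V):
    """van der Waals pressure at the fixed temperature T (illustrative name)."""
    return R * T / (V - b) - a / V**2


def objective(V, Pr):
    # Same rearranged equation as in Method 1
    return Pr * Pc * (V - b) - R * T + a / V**2 * (V - b)


def dVdPr(Pr, V):
    dPdV = -R * T / (V - b)**2 + 2 * a / V**3
    return Pc / dPdV


Pr_span = (0.1, 10)
V0, = fsolve(objective, 3, args=(Pr_span[0],))

# Tight tolerances are an assumption made here so the residual check is meaningful.
sol = solve_ivp(dVdPr, Pr_span, (V0,), dense_output=True, rtol=1e-8, atol=1e-10)

Pr_eval = np.linspace(*Pr_span, 50)
V = sol.sol(Pr_eval)[0]          # interpolated molar volumes
Z = Pr_eval * Pc * V / (R * T)   # compressibility factor

# Largest relative mismatch between the requested and the recomputed pressure
residual = np.max(np.abs(P_vdw(V) / (Pr_eval * Pc) - 1))
print(residual)
```

&lt;p&gt;
A small residual here suggests the integration stayed on the isotherm. It is a consistency check, not a proof of accuracy.
&lt;/p&gt;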

&lt;p&gt;
The only downside to this approach (in my opinion) is the need to derive and implement derivatives. As equations of state get more complex, this gets more tedious and complicated.
&lt;/p&gt;
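
&lt;p&gt;
One way to guard against algebra mistakes in a hand-derived derivative is to compare it with a finite difference. The following is a minimal sketch using a central difference; the step size &lt;code&gt;dV&lt;/code&gt; and the test volumes are arbitrary choices for illustration.
&lt;/p&gt;

```python
import numpy as np

R = 0.08206
Pc = 72.9
Tc = 304.2
T = 1.1 * Tc

a = 27 * R**2 * Tc**2 / (Pc * 64)
b = R * Tc / (8 * Pc)


def P(V):
    return R * T / (V - b) - a / V**2


def dPdV(V):
    # Hand-derived derivative from Method 2
    return -R * T / (V - b)**2 + 2 * a / V**3


V_test = np.array([0.06, 0.1, 0.5, 1.0, 3.0])  # arbitrary volumes larger than b
dV = 1e-6                                      # arbitrary small step

# Central difference approximation to dP/dV
fd = (P(V_test + dV) - P(V_test - dV)) / (2 * dV)

print(np.allclose(fd, dPdV(V_test), rtol=1e-5))
```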
&lt;/div&gt;
&lt;/div&gt;

&lt;div id="outline-container-orge63b16e" class="outline-2"&gt;
&lt;h2 id="orge63b16e"&gt;&lt;span class="section-number-2"&gt;3&lt;/span&gt; Method 3 - autograd + solve_ivp&lt;/h2&gt;
&lt;div class="outline-text-2" id="text-3"&gt;
&lt;p&gt;
The whole point of automatic differentiation is to get derivatives of functions that are written as programs. We explore here the possibility of using this to solve this problem. The idea is to use autograd to define the derivative \(dP/dV\), and then solve the ODE like we did before.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; autograd &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; grad

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;P&lt;/span&gt;(V):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; R * T / (V - b) - a / V**2

&lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;autograd.grad returns a callable that acts like a function&lt;/span&gt;
&lt;span style="color: #BA36A5;"&gt;dPdV&lt;/span&gt; = grad(P, 0)

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;dVdPr&lt;/span&gt;(Pr, V):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; 1 / dPdV(V) * Pc

&lt;span style="color: #BA36A5;"&gt;sol&lt;/span&gt; = solve_ivp(dVdPr,  Pr_span, (V0,), dense_output=&lt;span style="color: #D0372D;"&gt;True&lt;/span&gt;, max_step=h)

V, = sol.y
&lt;span style="color: #BA36A5;"&gt;P&lt;/span&gt; = sol.t * Pc
&lt;span style="color: #BA36A5;"&gt;Z&lt;/span&gt; = P * V / (R * T)
plt.plot(sol.t, Z)
plt.xlabel(&lt;span style="color: #008000;"&gt;'$P_r$'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'Z'&lt;/span&gt;)
plt.xlim([0, 10])
plt.ylim([0, 2])
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
(0, 2)

&lt;/pre&gt;



&lt;p&gt;
&lt;img src="/media/13bc1d996aa4bd032faad00425793120-90490O2H.png"&gt; 
&lt;/p&gt;

&lt;p&gt;
Not surprisingly, this answer looks the same as the previous ones. I think this solution is pretty awesome. We only had to implement the van der Waals equation, and then let autograd do its job to get the relevant derivative. We don't get a free pass on calculus here; we still have to know which derivatives are important. We also need some knowledge about how to use autograd, but with that, this problem becomes pretty easy to solve.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Copyright (C) 2018 by John Kitchin. See the &lt;a href="/copying.html"&gt;License&lt;/a&gt; for information about copying.&lt;/p&gt;
&lt;p&gt;&lt;a href="/org/2018/10/07/Compressibility-factor-variation-from-the-van-der-Waals-equation-by-three-different-approaches.org"&gt;org-mode source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Org-mode version = 9.1.13&lt;/p&gt;]]></content:encoded>
    </item>
    <item>
      <title>A new ode integrator function in scipy</title>
      <link>https://kitchingroup.cheme.cmu.edu/blog/2018/09/04/A-new-ode-integrator-function-in-scipy</link>
      <pubDate>Tue, 04 Sep 2018 21:20:58 EDT</pubDate>
      <category><![CDATA[scipy]]></category>
      <category><![CDATA[ode]]></category>
      <guid isPermaLink="false">e7L6DytRVe1VWizNIhScM2uYiQs=</guid>
      <description>A new ode integrator function in scipy</description>
      <content:encoded><![CDATA[


&lt;p&gt;
I learned recently about a new way to solve ODEs in scipy: &lt;a href="https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.solve_ivp.html"&gt;scipy.integrate.solve_ivp&lt;/a&gt;. This new function is recommended instead of &lt;code&gt;scipy.integrate.odeint&lt;/code&gt; for new code. This function caught my eye because it added functionality that was previously missing, and that I had written into my pycse package. That functionality is events.
&lt;/p&gt;

&lt;p&gt;
To explore how to use this new function, I will recreate an old &lt;a href="http://kitchingroup.cheme.cmu.edu/blog/2013/01/28/Mimicking-ode-events-in-python/"&gt;blog post&lt;/a&gt; where I used events to count the number of roots in a function. Spoiler alert: it may not be ready for production.
&lt;/p&gt;

&lt;p&gt;
The question at hand is how many roots there are in \(f(x) = x^3 + 6x^2 - 4x - 24\), and what they are. Now, I know there are three roots and that you can use &lt;code&gt;np.roots&lt;/code&gt; for this, but that only works for polynomials. Here they are, so we know what we are looking for.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; numpy &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; np
np.roots([1, 6, -4, -24])
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
array([-6.,  2., -2.])

&lt;/pre&gt;

&lt;p&gt;
The point of this is to find a more general way to count roots in an interval. We do it by integrating the derivative of the function, and using an event function to  count when the function is equal to zero. First, we define the derivative:
&lt;/p&gt;

&lt;p&gt;
\(f'(x) = 3x^2 + 12x - 4\), and the value of our original function at some value that is the beginning of the range we want to consider, say \(f(-8) = -120\). Now, we have an ordinary differential equation that can be integrated. Our event function is simple: it is just the function value \(y\). In the next block, I include an optional t_eval arg so we can see the solution at more points.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;fprime&lt;/span&gt;(x, y):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; 3 * x**2 + 12 * x - 4

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;event&lt;/span&gt;(x, y):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; y

&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; numpy &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; np
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; scipy.integrate &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; solve_ivp
&lt;span style="color: #BA36A5;"&gt;sol&lt;/span&gt; = solve_ivp(fprime, (-8, 4), np.array([-120]), t_eval=np.linspace(-8, 4, 10), events=[event])
sol
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
 message: 'The solver successfully reached the interval end.'
    nfev: 26
    njev: 0
     nlu: 0
     sol: None
  status: 0
 success: True
       t: array([-8.        , -6.66666667, -5.33333333, -4.        , -2.66666667,
      -1.33333333,  0.        ,  1.33333333,  2.66666667,  4.        ])
t_events: [array([-6.])]
       y: array([[-120.        ,  -26.96296296,   16.2962963 ,   24.        ,
         10.37037037,  -10.37037037,  -24.        ,  -16.2962963 ,
         26.96296296,  120.        ]])

&lt;/pre&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;sol.t_events
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
[array([-6.])]

&lt;/pre&gt;

&lt;p&gt;
Huh. That is not what I expected. There should be three values in sol.t_events, but there is only one. Looking at sol.y, you can see there are three sign changes, which means three zeros. The graph here confirms that.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;%matplotlib inline
&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; matplotlib.pyplot &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; plt
plt.plot(sol.t, sol.y[0])
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
[&amp;lt;matplotlib.lines.Line2D at 0x151281d860&amp;gt;]

&lt;/pre&gt;



&lt;p&gt;
&lt;img src="/media/e56c3df20f7d52f874861f0041da6fd5-18185E.png"&gt; 
&lt;/p&gt;
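
&lt;p&gt;
The sign changes can also be counted programmatically instead of by eye. This sketch evaluates the polynomial directly on the same ten-point grid used for t_eval, rather than reusing sol, so that it stands on its own:
&lt;/p&gt;

```python
import numpy as np


def f(x):
    return x**3 + 6 * x**2 - 4 * x - 24


x = np.linspace(-8, 4, 10)  # same grid as t_eval above
y = f(x)

# A root lies between neighboring points whose signs differ
signs = np.sign(y)
n_sign_changes = np.count_nonzero(np.diff(signs))
print(n_sign_changes)
```

&lt;p&gt;
Like the event approach, this can miss roots if the grid is coarse enough that two roots fall between neighboring points.
&lt;/p&gt;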

&lt;p&gt;
What appears to be happening is that the events are only called during the solver steps, which are &lt;i&gt;different&lt;/i&gt; than the t_eval steps. It appears a workaround is to specify a max_step that can be taken by the solver to force the event functions to be evaluated more often. Adding this seems to create a new cryptic warning.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;sol&lt;/span&gt; = solve_ivp(fprime, (-8, 4), np.array([-120]), events=[event], max_step=1.0)
sol
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
/Users/jkitchin/anaconda/lib/python3.6/site-packages/scipy/integrate/_ivp/rk.py:145: RuntimeWarning: divide by zero encountered in double_scalars
  max(1, SAFETY * error_norm ** (-1 / (order + 1))))


&lt;/pre&gt;

&lt;pre class="example"&gt;
 message: 'The solver successfully reached the interval end.'
    nfev: 80
    njev: 0
     nlu: 0
     sol: None
  status: 0
 success: True
       t: array([-8.        , -7.89454203, -6.89454203, -5.89454203, -4.89454203,
      -3.89454203, -2.89454203, -1.89454203, -0.89454203,  0.10545797,
       1.10545797,  2.10545797,  3.10545797,  4.        ])
t_events: [array([-6., -2.,  2.])]
       y: array([[-120.        , -110.49687882,  -38.94362768,    3.24237128,
         22.06111806,   23.51261266,   13.59685508,   -1.68615468,
        -16.33641662,  -24.35393074,  -19.73869704,    3.50928448,
         51.39001383,  120.        ]])

&lt;/pre&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;sol.t_events
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
[array([-6., -2.,  2.])]

&lt;/pre&gt;

&lt;p&gt;
That is more like it. Here, I happen to know the answers, so we are safe setting a max_step of 1.0, but that feels awkward and unreliable. You don't want this max_step to be too small, because it probably makes for more computations. On the other hand, it can't be too large either because you might miss roots. It seems there is room for improvement on this.
&lt;/p&gt;
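
&lt;p&gt;
To get a feel for that trade-off, here is a small sketch that repeats the integration for a few max_step values and records how many function evaluations were used and which events were found. The particular values tried are arbitrary, and the functions return a list and a scalar (a small change from the versions above) to keep the shapes explicit.
&lt;/p&gt;

```python
import numpy as np
from scipy.integrate import solve_ivp


def fprime(x, y):
    return [3 * x**2 + 12 * x - 4]


def event(x, y):
    return y[0]


results = {}
for max_step in (0.5, 1.0, 2.0):
    sol = solve_ivp(fprime, (-8, 4), np.array([-120.0]),
                    events=[event], max_step=max_step)
    results[max_step] = (sol.nfev, sol.t_events[0])
    print(max_step, sol.nfev, sol.t_events[0])
```

&lt;p&gt;
Smaller steps cost more evaluations, so in practice max_step would need to be chosen from what is known about how closely the roots can be spaced.
&lt;/p&gt;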

&lt;p&gt;
It also seems odd that solve_ivp only returns the t_events, and not also the corresponding solution values. I guess in this case, we know the solution values are zero at t_events, but, supposing you instead were looking for a maximum value by getting a derivative that was equal to zero, you might end up getting stuck solving for it somehow.
&lt;/p&gt;

&lt;p&gt;
Let's consider this parabola with a maximum at \(x=2\), where \(y=2\):
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;x&lt;/span&gt; = np.linspace(0, 4)
plt.plot(x, 2 - (x - 2)**2)
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
[&amp;lt;matplotlib.lines.Line2D at 0x1512dad9e8&amp;gt;]

&lt;/pre&gt;



&lt;p&gt;
&lt;img src="/media/e56c3df20f7d52f874861f0041da6fd5-181K3p.png"&gt; 
&lt;/p&gt;

&lt;p&gt;
We can find the maximum like this.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;yprime&lt;/span&gt;(x, y):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; -2  * (x - 2)

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;maxevent&lt;/span&gt;(x, y):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; yprime(x, y)

&lt;span style="color: #BA36A5;"&gt;sol&lt;/span&gt; = solve_ivp(yprime, (0, 4), np.array([-2]), events=[maxevent])
sol
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
/Users/jkitchin/anaconda/lib/python3.6/site-packages/scipy/integrate/_ivp/rk.py:145: RuntimeWarning: divide by zero encountered in double_scalars
  max(1, SAFETY * error_norm ** (-1 / (order + 1))))


&lt;/pre&gt;

&lt;pre class="example"&gt;
 message: 'The solver successfully reached the interval end.'
    nfev: 20
    njev: 0
     nlu: 0
     sol: None
  status: 0
 success: True
       t: array([ 0.        ,  0.08706376,  0.95770136,  4.        ])
t_events: [array([ 2.])]
       y: array([[-2.        , -1.65932506,  0.91361355, -2.        ]])

&lt;/pre&gt;

&lt;p&gt;
Clearly, we found the maximum at x=2, but now what?  Re-solve the ODE and use t_eval with the t_events values? Use a fine t_eval array, and interpolate the solution? That doesn't seem smart. You could make the event terminal, so that it stops at the max, and then read off the last value, but this will not work if you want to count more than one maximum, for example.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-ipython"&gt;&lt;span style="color: #BA36A5;"&gt;maxevent.terminal&lt;/span&gt; = &lt;span style="color: #D0372D;"&gt;True&lt;/span&gt;
solve_ivp(yprime, (0, 4), (-2,), events=[maxevent])
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
/Users/jkitchin/anaconda/lib/python3.6/site-packages/scipy/integrate/_ivp/rk.py:145: RuntimeWarning: divide by zero encountered in double_scalars
  max(1, SAFETY * error_norm ** (-1 / (order + 1))))


&lt;/pre&gt;

&lt;pre class="example"&gt;
 message: 'A termination event occurred.'
    nfev: 20
    njev: 0
     nlu: 0
     sol: None
  status: 1
 success: True
       t: array([ 0.        ,  0.08706376,  0.95770136,  2.        ])
t_events: [array([ 2.])]
       y: array([[-2.        , -1.65932506,  0.91361355,  2.        ]])

&lt;/pre&gt;

&lt;p&gt;
Internet: am I missing something obvious here?
&lt;/p&gt;
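
&lt;p&gt;
One possible answer, sketched here rather than offered as the recommended way, is to request dense output and evaluate the interpolant at the event times. This avoids re-solving the ODE and does not require the event to be terminal. The functions are restated so the block runs on its own.
&lt;/p&gt;

```python
import numpy as np
from scipy.integrate import solve_ivp


def yprime(x, y):
    return [-2 * (x - 2)]


def maxevent(x, y):
    return -2 * (x - 2)


sol = solve_ivp(yprime, (0, 4), [-2.0], events=[maxevent], dense_output=True)

x_max = sol.t_events[0]      # where the derivative crosses zero
y_max = sol.sol(x_max)[0]    # interpolated solution at those points
print(x_max, y_max)
```

&lt;p&gt;
Newer SciPy releases also appear to report the solution at each event as &lt;code&gt;sol.y_events&lt;/code&gt;; check the documentation for the version you have installed before relying on it.
&lt;/p&gt;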
&lt;p&gt;Copyright (C) 2018 by John Kitchin. See the &lt;a href="/copying.html"&gt;License&lt;/a&gt; for information about copying.&lt;/p&gt;
&lt;p&gt;&lt;a href="/org/2018/09/04/A-new-ode-integrator-function-in-scipy.org"&gt;org-mode source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Org-mode version = 9.1.13&lt;/p&gt;]]></content:encoded>
    </item>
    <item>
      <title>Solving ODEs with a neural network and autograd</title>
      <link>https://kitchingroup.cheme.cmu.edu/blog/2017/11/28/Solving-ODEs-with-a-neural-network-and-autograd</link>
      <pubDate>Tue, 28 Nov 2017 07:23:03 EST</pubDate>
      <category><![CDATA[autograd]]></category>
      <category><![CDATA[ode]]></category>
      <guid isPermaLink="false">9_0DaFnIWtKTkjNoCBbYQlWrdEU=</guid>
      <description>Solving ODEs with a neural network and autograd</description>
      <content:encoded><![CDATA[


&lt;p&gt;
In the last &lt;a href="http://kitchingroup.cheme.cmu.edu/blog/2017/11/27/Solving-BVPs-with-a-neural-network-and-autograd/"&gt;post&lt;/a&gt; I explored using a neural network to solve a BVP. Here, I expand the idea to solving an initial value ordinary differential equation. The idea is basically the same; we just have a slightly different objective function.
&lt;/p&gt;

&lt;p&gt;
\(dCa/dt = -k Ca(t)\) where \(Ca(t=0) = 2.0\).
&lt;/p&gt;

&lt;p&gt;
Here is the code that solves this equation, along with a comparison to the analytical solution: \(Ca(t) = Ca_0 \exp(-k t)\).
&lt;/p&gt;

&lt;div class="org-src-container"&gt;
&lt;pre class="src src-python"&gt;&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; autograd.numpy &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; np
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; autograd &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; grad, elementwise_grad
&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; autograd.numpy.random &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; npr
&lt;span style="color: #0000FF;"&gt;from&lt;/span&gt; autograd.misc.optimizers &lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; adam

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;init_random_params&lt;/span&gt;(scale, layer_sizes, rs=npr.RandomState(0)):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #036A07;"&gt;"""Build a list of (weights, biases) tuples, one for each layer."""&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; [(rs.randn(insize, outsize) * scale,   &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;weight matrix&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;rs.randn(outsize) * scale)           &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;bias vector&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;for&lt;/span&gt; insize, outsize &lt;span style="color: #0000FF;"&gt;in&lt;/span&gt; &lt;span style="color: #006FE0;"&gt;zip&lt;/span&gt;(layer_sizes[:-1], layer_sizes[1:])]


&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;swish&lt;/span&gt;(x):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #036A07;"&gt;"see https://arxiv.org/pdf/1710.05941.pdf"&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; x / (1.0 + np.exp(-x))


&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;Ca&lt;/span&gt;(params, inputs):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #036A07;"&gt;"Neural network functions"&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;for&lt;/span&gt; W, b &lt;span style="color: #0000FF;"&gt;in&lt;/span&gt; params:
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;outputs&lt;/span&gt; = np.dot(inputs, W) + b
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;inputs&lt;/span&gt; = swish(outputs)    
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; outputs

&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   
&lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;Here is our initial guess of params:&lt;/span&gt;
&lt;span style="color: #BA36A5;"&gt;params&lt;/span&gt; = init_random_params(0.1, layer_sizes=[1, 8, 1])

&lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;Derivatives&lt;/span&gt;
&lt;span style="color: #BA36A5;"&gt;dCadt&lt;/span&gt; = elementwise_grad(Ca, 1)

&lt;span style="color: #BA36A5;"&gt;k&lt;/span&gt; = 0.23
&lt;span style="color: #BA36A5;"&gt;Ca0&lt;/span&gt; = 2.0
&lt;span style="color: #BA36A5;"&gt;t&lt;/span&gt; = np.linspace(0, 10).reshape((-1, 1))

&lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;This is the function we seek to minimize&lt;/span&gt;
&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;objective&lt;/span&gt;(params, step):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;These should all be zero at the solution&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #8D8D84;"&gt;# &lt;/span&gt;&lt;span style="color: #8D8D84; font-style: italic;"&gt;dCadt = -k * Ca(t)&lt;/span&gt;
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;zeq&lt;/span&gt; = dCadt(params, t) - (-k * Ca(params, t))
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #BA36A5;"&gt;ic&lt;/span&gt; = Ca(params, 0) - Ca0
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;return&lt;/span&gt; np.mean(zeq**2) + ic**2

&lt;span style="color: #0000FF;"&gt;def&lt;/span&gt; &lt;span style="color: #006699;"&gt;callback&lt;/span&gt;(params, step, g):
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;if&lt;/span&gt; step % 1000 == 0:
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #0000FF;"&gt;print&lt;/span&gt;(&lt;span style="color: #008000;"&gt;"Iteration {0:3d} objective {1}"&lt;/span&gt;.&lt;span style="color: #006FE0;"&gt;format&lt;/span&gt;(step,
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt; objective(params, step)))

&lt;span style="color: #BA36A5;"&gt;params&lt;/span&gt; = adam(grad(objective), params,
&lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt;   &lt;span style="color: #9B9B9B; background-color: #EDEDED;"&gt; &lt;/span&gt; step_size=0.001, num_iters=5001, callback=callback) 


&lt;span style="color: #BA36A5;"&gt;tfit&lt;/span&gt; = np.linspace(0, 20).reshape(-1, 1)
&lt;span style="color: #0000FF;"&gt;import&lt;/span&gt; matplotlib.pyplot &lt;span style="color: #0000FF;"&gt;as&lt;/span&gt; plt
plt.plot(tfit, Ca(params, tfit), label=&lt;span style="color: #008000;"&gt;'soln'&lt;/span&gt;)
plt.plot(tfit, Ca0 * np.exp(-k * tfit), &lt;span style="color: #008000;"&gt;'r--'&lt;/span&gt;, label=&lt;span style="color: #008000;"&gt;'analytical soln'&lt;/span&gt;)
plt.legend()
plt.xlabel(&lt;span style="color: #008000;"&gt;'time'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #008000;"&gt;'$C_A$'&lt;/span&gt;)
plt.xlim([0, 20])
plt.savefig(&lt;span style="color: #008000;"&gt;'nn-ode.png'&lt;/span&gt;)
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
Iteration   0 objective [[ 3.20374053]]
Iteration 1000 objective [[  3.13906829e-05]]
Iteration 2000 objective [[  1.95894699e-05]]
Iteration 3000 objective [[  1.60381564e-05]]
Iteration 4000 objective [[  1.39930673e-05]]
Iteration 5000 objective [[  1.03554970e-05]]

&lt;/pre&gt;


&lt;p&gt;
&lt;img src="/media/nn-ode.png"&gt; 
&lt;/p&gt;

&lt;p&gt;
Huh. Those two solutions are nearly indistinguishable. Since we used a neural network, let's hype it up and say we learned the solution to a differential equation! But seriously, note that although we got an "analytical" solution, we should only rely on it in the region we trained the solution on. You can see the solution above is not that good past t=10, even perhaps going negative (which is not even physically correct). That is a reminder that the function we have for the solution &lt;i&gt;is not the same as the analytical solution&lt;/i&gt;, it just approximates it really well over the region we solved over. Of course, you can expand that region to the region you care about, but the main point is don't rely on the solution outside where you know it is good.
&lt;/p&gt;

&lt;p&gt;
This idea isn't new. There are several papers in the literature on using neural networks to solve differential equations, e.g. &lt;a href="http://www.sciencedirect.com/science/article/pii/S0255270102002076"&gt;http://www.sciencedirect.com/science/article/pii/S0255270102002076&lt;/a&gt; and &lt;a href="https://arxiv.org/pdf/physics/9705023.pdf"&gt;https://arxiv.org/pdf/physics/9705023.pdf&lt;/a&gt;, and other blog posts that are similar (&lt;a href="https://becominghuman.ai/neural-networks-for-solving-differential-equations-fa230ac5e04c"&gt;https://becominghuman.ai/neural-networks-for-solving-differential-equations-fa230ac5e04c&lt;/a&gt;, even using autograd). That means to me that there is some merit to continuing to investigate this approach to solving differential equations.
&lt;/p&gt;

&lt;p&gt;
There are some interesting challenges for engineers to consider with this approach though. When is the solution accurate enough? How reliable are derivatives of the solution? What network architecture is appropriate or best? How do you know how good the solution is? Is it possible to build in solution features, e.g. asymptotes, constraints on derivatives, or a requirement that the solution be monotonic? These would help us trust the solutions not to do weird things, and to extrapolate more reliably.
&lt;/p&gt;
&lt;p&gt;Copyright (C) 2017 by John Kitchin. See the &lt;a href="/copying.html"&gt;License&lt;/a&gt; for information about copying.&lt;/p&gt;
&lt;p&gt;&lt;a href="/org/2017/11/28/Solving-ODEs-with-a-neural-network-and-autograd.org"&gt;org-mode source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Org-mode version = 9.1.2&lt;/p&gt;]]></content:encoded>
    </item>
    <item>
      <title>Uncertainty in the solution of an ODE</title>
      <link>https://kitchingroup.cheme.cmu.edu/blog/2013/07/14/Uncertainty-in-the-solution-of-an-ODE</link>
      <pubDate>Sun, 14 Jul 2013 13:36:36 EDT</pubDate>
      <category><![CDATA[uncertainty]]></category>
      <category><![CDATA[ode]]></category>
      <guid isPermaLink="false">vgxLM1eNdDWFxYoYzKL_cdL_bP8=</guid>
      <description>Uncertainty in the solution of an ODE</description>
      <content:encoded><![CDATA[



&lt;p&gt;
Our objective in this post is to examine the effects of uncertainty in parameters that define an ODE on the integrated solution of the ODE. My favorite method for numerical uncertainty analysis is Monte Carlo simulation because it is easy to code and usually easy to understand. We take that approach first.
&lt;/p&gt;

&lt;p&gt;
The problem to solve is to estimate the conversion in a constant volume batch reactor with a second order reaction \(A \rightarrow B\), and the rate law: \(-r_A = k C_A^2\), after one hour of reaction. There is 5% uncertainty in the rate constant \(k=0.001\) and in the initial concentration \(C_{A0}=1\). 
&lt;/p&gt;

&lt;p&gt;
The relevant differential equation is:
&lt;/p&gt;

&lt;p&gt;
\(\frac{dX}{dt} = -r_A /C_{A0}\).
&lt;/p&gt;

&lt;p&gt;
We have to assume that 5% uncertainty refers to a normal distribution of error that has a standard deviation of 5% of the mean value. 
&lt;/p&gt;

&lt;div class="org-src-container"&gt;

&lt;pre class="src src-python"&gt;&lt;span style="color: #8b0000;"&gt;from&lt;/span&gt; scipy.integrate &lt;span style="color: #8b0000;"&gt;import&lt;/span&gt; odeint
&lt;span style="color: #8b0000;"&gt;import&lt;/span&gt; numpy &lt;span style="color: #8b0000;"&gt;as&lt;/span&gt; np

&lt;span style="color: #8b008b;"&gt;N&lt;/span&gt; = 1000

&lt;span style="color: #8b008b;"&gt;K&lt;/span&gt; = np.random.normal(0.001, 0.05*0.001, N)
&lt;span style="color: #8b008b;"&gt;CA0&lt;/span&gt; = np.random.normal(1, 0.05*1, N)

&lt;span style="color: #8b008b;"&gt;X&lt;/span&gt; = [] &lt;span style="color: #ff0000; font-weight: bold;"&gt;# &lt;/span&gt;&lt;span style="color: #ff0000; font-weight: bold;"&gt;to store answer in&lt;/span&gt;
&lt;span style="color: #8b0000;"&gt;for&lt;/span&gt; k, Ca0 &lt;span style="color: #8b0000;"&gt;in&lt;/span&gt; &lt;span style="color: #cd0000;"&gt;zip&lt;/span&gt;(K, CA0):
    &lt;span style="color: #ff0000; font-weight: bold;"&gt;# &lt;/span&gt;&lt;span style="color: #ff0000; font-weight: bold;"&gt;define ODE&lt;/span&gt;
    &lt;span style="color: #8b0000;"&gt;def&lt;/span&gt; &lt;span style="color: #8b2323;"&gt;ode&lt;/span&gt;(X, t):
        &lt;span style="color: #8b008b;"&gt;ra&lt;/span&gt; = -k * (Ca0 * (1 - X))**2
        &lt;span style="color: #8b0000;"&gt;return&lt;/span&gt; -ra / Ca0
    
    &lt;span style="color: #8b008b;"&gt;X0&lt;/span&gt; = 0
    &lt;span style="color: #8b008b;"&gt;tspan&lt;/span&gt; = np.linspace(0,3600)

    &lt;span style="color: #8b008b;"&gt;sol&lt;/span&gt; = odeint(ode, X0, tspan)

    &lt;span style="color: #8b008b;"&gt;X&lt;/span&gt; += [sol[-1][0]]

&lt;span style="color: #8b008b;"&gt;s&lt;/span&gt; = &lt;span style="color: #228b22;"&gt;'Final conversion at one hour is {0:1.3f} +- {1:1.3f} (1 sigma)'&lt;/span&gt;
&lt;span style="color: #8b0000;"&gt;print&lt;/span&gt;(s.&lt;span style="color: #cd0000;"&gt;format&lt;/span&gt;(np.average(X),
               np.std(X)))
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
Final conversion at one hour is 0.782 +- 0.013 (1 sigma)
&lt;/pre&gt;

&lt;p&gt;
See, it is not too difficult to write. It is, however, a little on the expensive side to run, since we typically need 1e3-1e6 samples to get reasonable statistics. Let us try the uncertainties package too. For this we have to wrap a function that takes uncertainties and returns a single float number.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;

&lt;pre class="src src-python"&gt;&lt;span style="color: #8b0000;"&gt;from&lt;/span&gt; scipy.integrate &lt;span style="color: #8b0000;"&gt;import&lt;/span&gt; odeint
&lt;span style="color: #8b0000;"&gt;import&lt;/span&gt; numpy &lt;span style="color: #8b0000;"&gt;as&lt;/span&gt; np
&lt;span style="color: #8b0000;"&gt;import&lt;/span&gt; uncertainties &lt;span style="color: #8b0000;"&gt;as&lt;/span&gt; u

&lt;span style="color: #8b008b;"&gt;k&lt;/span&gt; = u.ufloat(0.001, 0.05*0.001)
&lt;span style="color: #8b008b;"&gt;Ca0&lt;/span&gt; = u.ufloat(1.0, 0.05)

&lt;span style="color: #4682b4;"&gt;@u.wrap&lt;/span&gt;
&lt;span style="color: #8b0000;"&gt;def&lt;/span&gt; &lt;span style="color: #8b2323;"&gt;func&lt;/span&gt;(k, Ca0):
    &lt;span style="color: #ff0000; font-weight: bold;"&gt;# &lt;/span&gt;&lt;span style="color: #ff0000; font-weight: bold;"&gt;define the ODE&lt;/span&gt;
    &lt;span style="color: #8b0000;"&gt;def&lt;/span&gt; &lt;span style="color: #8b2323;"&gt;ode&lt;/span&gt;(X, t):
        &lt;span style="color: #8b008b;"&gt;ra&lt;/span&gt; = -k * (Ca0 * (1 - X))**2
        &lt;span style="color: #8b0000;"&gt;return&lt;/span&gt; -ra / Ca0
    
    &lt;span style="color: #8b008b;"&gt;X0&lt;/span&gt; = 0 &lt;span style="color: #ff0000; font-weight: bold;"&gt;# &lt;/span&gt;&lt;span style="color: #ff0000; font-weight: bold;"&gt;initial condition&lt;/span&gt;
    &lt;span style="color: #8b008b;"&gt;tspan&lt;/span&gt; = np.linspace(0, 3600)
    &lt;span style="color: #ff0000; font-weight: bold;"&gt;# &lt;/span&gt;&lt;span style="color: #ff0000; font-weight: bold;"&gt;integrate it&lt;/span&gt;
    &lt;span style="color: #8b008b;"&gt;sol&lt;/span&gt; = odeint(ode, X0, tspan)
    &lt;span style="color: #8b0000;"&gt;return&lt;/span&gt; sol[-1][0]

&lt;span style="color: #8b008b;"&gt;result&lt;/span&gt; = func(k, Ca0)
&lt;span style="color: #8b008b;"&gt;s&lt;/span&gt; = &lt;span style="color: #228b22;"&gt;'Final conversion at one hour is {0}(1 sigma)'&lt;/span&gt;
&lt;span style="color: #8b0000;"&gt;print&lt;/span&gt;(s.&lt;span style="color: #cd0000;"&gt;format&lt;/span&gt;(result))
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
Final conversion at one hour is 0.783+/-0.012(1 sigma)
&lt;/pre&gt;

&lt;p&gt;
This is about the same amount of code as the Monte Carlo approach, but it runs much faster, and gets approximately the same results. You have to remember the wrapping technique, since the uncertainties package does not run natively with the odeint function. 
&lt;/p&gt;
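
&lt;p&gt;
As a cross-check on both results, this particular ODE can be integrated by hand: with \(-r_A = k C_{A0}^2 (1 - X)^2\) and \(X(0) = 0\), separating variables gives \(X = \frac{k C_{A0} t}{1 + k C_{A0} t}\). The sketch below is not part of the original analysis. It assumes that closed form, a hypothetical helper named conversion, and an arbitrary random seed, and uses them to run a vectorized Monte Carlo and a first-order error propagation without calling odeint.
&lt;/p&gt;

```python
import numpy as np


def conversion(k, Ca0, t=3600.0):
    """Closed-form conversion for dX/dt = k * Ca0 * (1 - X)**2 with X(0) = 0.

    Hypothetical helper for illustration; works on floats or numpy arrays.
    """
    kt = k * Ca0 * t
    return kt / (1.0 + kt)


# Conversion at the mean parameter values
X_nominal = conversion(0.001, 1.0)

# Vectorized Monte Carlo: no ODE solver call is needed per sample
rng = np.random.default_rng(0)  # the seed is an arbitrary choice
N = 100000
K = rng.normal(0.001, 0.05 * 0.001, N)
CA0 = rng.normal(1.0, 0.05 * 1.0, N)
X = conversion(K, CA0)

# First-order (linear) error propagation.
# X depends only on u = k * Ca0 * t, and dX/du = 1 / (1 + u)**2.
# For independent k and Ca0 the relative variances of the product add.
u = 0.001 * 1.0 * 3600.0
sigma_u = u * np.sqrt(0.05**2 + 0.05**2)
sigma_X = sigma_u / (1.0 + u)**2

print('Nominal conversion: {0:1.3f}'.format(X_nominal))
print('Monte Carlo: {0:1.3f} +- {1:1.3f} (1 sigma)'.format(np.mean(X), np.std(X)))
print('Linear propagation sigma: {0:1.3f}'.format(sigma_X))
```

&lt;p&gt;
The linear propagation step is a hand-written version of the same idea the uncertainties package applies, so it is expected to land close to the wrapped odeint result above. A closed form is a luxury of this simple rate law; for most ODEs the solver-based approaches are still needed.
&lt;/p&gt;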
&lt;p&gt;Copyright (C) 2013 by John Kitchin. See the &lt;a href="/copying.html"&gt;License&lt;/a&gt; for information about copying.&lt;/p&gt;&lt;p&gt;&lt;a href="/org/2013/07/14/Uncertainty-in-the-solution-of-an-ODE.org"&gt;org-mode source&lt;/a&gt;&lt;/p&gt;]]></content:encoded>
    </item>
    <item>
      <title>Linear algebra approaches to solving systems of constant coefficient ODEs</title>
      <link>https://kitchingroup.cheme.cmu.edu/blog/2013/02/27/Linear-algebra-approaches-to-solving-systems-of-constant-coefficient-ODEs</link>
      <pubDate>Wed, 27 Feb 2013 14:33:11 EST</pubDate>
      <category><![CDATA[ode]]></category>
      <guid isPermaLink="false">zNDsNinXh264YgGTqRhzeDTMzpk=</guid>
      <description>Linear algebra approaches to solving systems of constant coefficient ODEs</description>
      <content:encoded><![CDATA[


&lt;p&gt;
&lt;a href="http://matlab.cheme.cmu.edu/2011/10/20/linear-algebra-approaches-to-solving-systems-of-constant-coefficient-odes" &gt;Matlab post&lt;/a&gt;

Today we consider how to solve a system of first order, constant coefficient ordinary differential equations using linear algebra. These equations could be solved numerically, but in this case there are analytical solutions that can be derived. The equations we will solve are:
&lt;/p&gt;

&lt;p&gt;
\(y'_1 = -0.02 y_1 + 0.02 y_2\)
&lt;/p&gt;

&lt;p&gt;
\(y'_2 = 0.02 y_1 - 0.02 y_2\)
&lt;/p&gt;

&lt;p&gt;
We can express this set of equations in matrix form as: \(\left[\begin{array}{c}y'_1\\y'_2\end{array}\right] = \left[\begin{array}{cc} -0.02 &amp; 0.02 \\ 0.02 &amp; -0.02\end{array}\right] \left[\begin{array}{c}y_1\\y_2\end{array}\right]\)
&lt;/p&gt;

&lt;p&gt;
The general solution to this set of equations is
&lt;/p&gt;

&lt;p&gt;
\(\left[\begin{array}{c}y_1\\y_2\end{array}\right] = \left[\begin{array}{cc}v_1 &amp; v_2\end{array}\right] \left[\begin{array}{cc} c_1 &amp; 0 \\ 0 &amp; c_2\end{array}\right] \exp\left(\left[\begin{array}{cc} \lambda_1 &amp; 0 \\ 0 &amp; \lambda_2\end{array}\right] \left[\begin{array}{c}t\\t\end{array}\right]\right)\)
&lt;/p&gt;

&lt;p&gt;
where \(\left[\begin{array}{cc} \lambda_1 &amp; 0 \\ 0 &amp; \lambda_2\end{array}\right]\) is a diagonal matrix of the eigenvalues of the constant coefficient matrix, \(\left[\begin{array}{cc}v_1 &amp; v_2\end{array}\right]\) is a matrix of eigenvectors where the \(i^{th}\) column corresponds to the eigenvector of the \(i^{th}\) eigenvalue, and \(\left[\begin{array}{cc} c_1 &amp; 0 \\ 0 &amp; c_2\end{array}\right]\) is a matrix determined by the initial conditions.
&lt;/p&gt;

&lt;p&gt;
In this example, we evaluate the solution using linear algebra. The initial conditions we will consider are \(y_1(0)=0\) and \(y_2(0)=150\).
&lt;/p&gt;

&lt;div class="org-src-container"&gt;

&lt;pre class="src src-python"&gt;&lt;span style="color: #8b0000;"&gt;import&lt;/span&gt; numpy &lt;span style="color: #8b0000;"&gt;as&lt;/span&gt; np

A = np.array([[-0.02,  0.02],
              [ 0.02, -0.02]])

&lt;span style="color: #ff0000; font-weight: bold;"&gt;# &lt;/span&gt;&lt;span style="color: #ff0000; font-weight: bold;"&gt;Return the eigenvalues and eigenvectors of a Hermitian or symmetric matrix.&lt;/span&gt;
evals, evecs = np.linalg.eigh(A)
&lt;span style="color: #8b0000;"&gt;print&lt;/span&gt;(evals)
&lt;span style="color: #8b0000;"&gt;print&lt;/span&gt;(evecs)
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
[-0.04  0.  ]
[[ 0.70710678  0.70710678]
 [-0.70710678  0.70710678]]
&lt;/pre&gt;

&lt;p&gt;
The eigenvectors are the &lt;i&gt;columns&lt;/i&gt; of evecs.
&lt;/p&gt;

&lt;p&gt;
Compute the \(c\) matrix
&lt;/p&gt;

&lt;p&gt;
V*c = Y0
&lt;/p&gt;

&lt;div class="org-src-container"&gt;

&lt;pre class="src src-python"&gt;Y0 = [0, 150]

c = np.diag(np.linalg.solve(evecs, Y0))
&lt;span style="color: #8b0000;"&gt;print&lt;/span&gt;(c)
&lt;/pre&gt;
&lt;/div&gt;

&lt;pre class="example"&gt;
[[-106.06601718    0.        ]
 [   0.          106.06601718]]
&lt;/pre&gt;

&lt;p&gt;
Constructing the solution
&lt;/p&gt;

&lt;p&gt;
We will create a vector of time values, and stack them for each solution, \(y_1(t)\) and \(y_2(t)\).
&lt;/p&gt;

&lt;div class="org-src-container"&gt;

&lt;pre class="src src-python"&gt;&lt;span style="color: #8b0000;"&gt;import&lt;/span&gt; matplotlib.pyplot &lt;span style="color: #8b0000;"&gt;as&lt;/span&gt; plt

t = np.linspace(0, 100)
T = np.vstack([t, t])

D = np.diag(evals)

&lt;span style="color: #ff0000; font-weight: bold;"&gt;# &lt;/span&gt;&lt;span style="color: #ff0000; font-weight: bold;"&gt;y = V*c*exp(D*T);&lt;/span&gt;
y = np.dot(np.dot(evecs, c), np.exp(np.dot(D, T)))

&lt;span style="color: #ff0000; font-weight: bold;"&gt;# &lt;/span&gt;&lt;span style="color: #ff0000; font-weight: bold;"&gt;y has a shape of (2, 50) so we have to transpose it&lt;/span&gt;
plt.plot(t, y.T)
plt.xlabel(&lt;span style="color: #228b22;"&gt;'t'&lt;/span&gt;)
plt.ylabel(&lt;span style="color: #228b22;"&gt;'y'&lt;/span&gt;)
plt.legend([&lt;span style="color: #228b22;"&gt;'$y_1$'&lt;/span&gt;, &lt;span style="color: #228b22;"&gt;'$y_2$'&lt;/span&gt;])
plt.savefig(&lt;span style="color: #228b22;"&gt;'images/ode-la.png'&lt;/span&gt;)
plt.show()
&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;&lt;img src="/img/./images/ode-la.png"&gt;&lt;/p&gt;
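
&lt;p&gt;
One way to gain confidence in the eigenvalue construction is to compare it with the matrix exponential, since the solution of \(y' = A y\) can also be written as \(y(t) = e^{A t} y(0)\). The following is a minimal sketch of that check, not part of the original post. It assumes scipy.linalg.expm is available and uses the illustrative names y_eig and y_expm.
&lt;/p&gt;

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-0.02,  0.02],
              [ 0.02, -0.02]])
Y0 = np.array([0.0, 150.0])

# Eigen-decomposition solution: y(t) = sum_i c_i * v_i * exp(lambda_i * t)
evals, evecs = np.linalg.eigh(A)
c = np.linalg.solve(evecs, Y0)  # coefficients from V c = Y0

t = np.linspace(0, 100)
# np.outer(evals, t) has shape (2, 50); row i holds lambda_i * t
y_eig = evecs @ (c[:, None] * np.exp(np.outer(evals, t)))

# Reference solution: apply the matrix exponential at each time point
y_expm = np.array([expm(A * ti) @ Y0 for ti in t]).T

print(np.allclose(y_eig, y_expm))
# The columns of A sum to zero, so y_1 + y_2 should stay at its initial value
print(np.allclose(y_eig.sum(axis=0), Y0.sum()))
```

&lt;p&gt;
The loop over expm is slower than the eigenvalue approach, which is part of the appeal of diagonalizing once and reusing the result for every time point.
&lt;/p&gt;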
&lt;p&gt;Copyright (C) 2013 by John Kitchin. See the &lt;a href="/copying.html"&gt;License&lt;/a&gt; for information about copying.&lt;/p&gt;&lt;p&gt;&lt;a href="/org/2013/02/27/Linear-algebra-approaches-to-solving-systems-of-constant-coefficient-ODEs.org"&gt;org-mode source&lt;/a&gt;&lt;/p&gt;]]></content:encoded>
    </item>
    <item>
      <title>Another way to parameterize an ODE - nested function</title>
      <link>https://kitchingroup.cheme.cmu.edu/blog/2013/02/27/Another-way-to-parameterize-an-ODE-nested-function</link>
      <pubDate>Wed, 27 Feb 2013 14:31:51 EST</pubDate>
      <category><![CDATA[ode]]></category>
      <guid isPermaLink="false">u3mJNx-1UEuGfUuSR_D1-Gz473I=</guid>
      <description>Another way to parameterize an ODE - nested function</description>
      <content:encoded><![CDATA[


&lt;p&gt;
&lt;a href="http://matlab.cheme.cmu.edu/2011/09/18/another-way-to-parameterize-an-ode-nested-function/" &gt;Matlab post&lt;/a&gt;

We saw one method to parameterize an ODE: create an ode function that takes an extra parameter argument, then make a function handle that has the syntax required for the solver and passes the parameter to the ode function.
&lt;/p&gt;

&lt;p&gt;
Here we define the ODE function inside a loop. Because the function is defined in the same namespace as the loop, it can &amp;ldquo;see&amp;rdquo; the current values of the variables there, including the loop variable. We will use this method to look at the solution to the van der Pol equation for several different values of mu.
&lt;/p&gt;

&lt;div class="org-src-container"&gt;

&lt;pre class="src src-python"&gt;&lt;span style="color: #8b0000;"&gt;import&lt;/span&gt; numpy &lt;span style="color: #8b0000;"&gt;as&lt;/span&gt; np
&lt;span style="color: #8b0000;"&gt;from&lt;/span&gt; scipy.integrate &lt;span style="color: #8b0000;"&gt;import&lt;/span&gt; odeint
&lt;span style="color: #8b0000;"&gt;import&lt;/span&gt; matplotlib.pyplot &lt;span style="color: #8b0000;"&gt;as&lt;/span&gt; plt

MU = [0.1, 1, 2, 5]
tspan = np.linspace(0, 100, 5000)
Y0 = [0, 3]

&lt;span style="color: #8b0000;"&gt;for&lt;/span&gt; mu &lt;span style="color: #8b0000;"&gt;in&lt;/span&gt; MU:
    &lt;span style="color: #ff0000; font-weight: bold;"&gt;# &lt;/span&gt;&lt;span style="color: #ff0000; font-weight: bold;"&gt;define the ODE&lt;/span&gt;
    &lt;span style="color: #8b0000;"&gt;def&lt;/span&gt; &lt;span style="color: #8b2323;"&gt;vdpol&lt;/span&gt;(Y, t):
        x,y = Y
        dxdt = y
        dydt = -x + mu * (1 - x**2) * y
        &lt;span style="color: #8b0000;"&gt;return&lt;/span&gt;  [dxdt, dydt]
    
    Y = odeint(vdpol, Y0, tspan)
    
    x = Y[:,0]; y = Y[:,1]
    plt.plot(x, y, label=&lt;span style="color: #228b22;"&gt;'mu={0:1.2f}'&lt;/span&gt;.format(mu))

plt.axis(&lt;span style="color: #228b22;"&gt;'equal'&lt;/span&gt;)
plt.legend(loc=&lt;span style="color: #228b22;"&gt;'best'&lt;/span&gt;)
plt.savefig(&lt;span style="color: #228b22;"&gt;'images/ode-nested-parameterization.png'&lt;/span&gt;)
plt.show()
&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img src="/img/./images/ode-nested-parameterization.png"&gt;&lt;/p&gt;

&lt;p&gt;
You can see the solution changes dramatically for different values of mu. The point here is not to understand why, but to show an easy way to study a parameterized ODE with a nested function. Nested functions can be a great way to &amp;ldquo;share&amp;rdquo; variables between functions, especially for ODE solving and nonlinear algebra solving, or any other application where you need a lot of parameters defined in one function to be available in another function.
&lt;/p&gt;
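
&lt;p&gt;
For comparison, the extra-argument approach mentioned at the start of this post can be sketched with the args keyword of odeint, which forwards additional parameters to the ODE function. The example below is an illustration rather than the original code. The names vdpol_nested, Y_nested and Y_args are made up for it, and it uses a shorter time span to keep it quick. It checks that the nested-function and args versions give the same trajectories.
&lt;/p&gt;

```python
import numpy as np
from scipy.integrate import odeint


def vdpol(Y, t, mu):
    """van der Pol equation with mu passed explicitly as an extra argument."""
    x, y = Y
    dxdt = y
    dydt = -x + mu * (1 - x**2) * y
    return [dxdt, dydt]


tspan = np.linspace(0, 20, 500)
Y0 = [0, 3]

results = {}
for mu in [0.1, 1, 2, 5]:
    # nested function: picks up the current value of mu from the loop
    def vdpol_nested(Y, t):
        return vdpol(Y, t, mu)

    Y_nested = odeint(vdpol_nested, Y0, tspan)

    # same problem, with mu forwarded by the solver through args
    Y_args = odeint(vdpol, Y0, tspan, args=(mu,))

    results[mu] = np.allclose(Y_nested, Y_args)

print(results)
```

&lt;p&gt;
Which style to prefer is mostly a matter of taste: args keeps the ODE function reusable at module level, while the nested function avoids threading many parameters through the solver call.
&lt;/p&gt;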
&lt;p&gt;Copyright (C) 2013 by John Kitchin. See the &lt;a href="/copying.html"&gt;License&lt;/a&gt; for information about copying.&lt;/p&gt;&lt;p&gt;&lt;a href="/org/2013/02/27/Another-way-to-parameterize-an-ODE---nested-function.org"&gt;org-mode source&lt;/a&gt;&lt;/p&gt;]]></content:encoded>
    </item>
  </channel>
</rss>
