Numerical Simpsons rule

| categories: math, integration | tags: | View Comments

A more accurate numerical integration than the trapezoid method is Simpson's rule. The syntax is similar to trapz, but the method is in scipy.integrate.

import numpy as np
from scipy.integrate import simps, romb

a = 0.0; b = np.pi / 4.0;
N = 10  # this is the number of intervals

x = np.linspace(a, b, N)
y = np.cos(x)

t = np.trapz(y, x)
s = simps(y, x)
a = np.sin(b) - np.sin(a)

print
print 'trapz = {0} ({1:%} error)'.format(t, (t - a)/a)
print 'simps = {0} ({1:%} error)'.format(s, (s - a)/a)
print 'analy = {0}'.format(a)
>>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>>
trapz = 0.70665798038 (-0.063470% error)
simps = 0.707058914216 (-0.006769% error)
analy = 0.707106781187

You can see the Simpson's method is more accurate than the trapezoid method.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Read and Post Comments

Integrating the Fermi distribution to compute entropy

| categories: dft, integration, gotcha | tags: | View Comments

The Fermi distribution is defined by \(f(\epsilon) = \frac{1}{e^{(\epsilon - \mu)/(k T)} + 1}\). This function describes the occupation of energy levels at temperatures above absolute zero. We use this function to compute electronic entropy in a metal, which contains an integral of \(\int n(\epsilon) (f \ln f + (1 - f) \ln (1-f)) d\epsilon\), where \(n(\epsilon)\) is the electronic density of states. Here we plot the Fermi distribution function. It shows that well below the Fermi level the states are fully occupied, and well above the Fermi level, they are unoccupied. Near the Fermi level, the states go from occupied to unoccupied smoothly.

import numpy as np
import matplotlib.pyplot as plt

mu = 0
k = 8.6e-5
T = 1000

def f(e):
    return 1.0 / (np.exp((e - mu)/(k*T)) + 1)

espan = np.linspace(-10, 10, 200)
plt.plot(espan, f(espan))
plt.ylim([-0.1, 1.1])
plt.savefig('images/fermi-entropy-integrand-1.png')

Let us consider a simple density of states function, just a parabola. This could represent a s-band for example. We will use this function to explore the integral.

import numpy as np
import matplotlib.pyplot as plt

mu = 0
k = 8.6e-5
T = 1000

def f(e):
    return 1.0 / (np.exp((e - mu)/(k*T)) + 1)

def dos(e):
    d = (np.ones(e.shape) - 0.03 * e**2) 
    return d * (d > 0)
espan = np.linspace(-10, 10)

plt.plot(espan, dos(espan), label='Total dos')
plt.plot(espan, f(espan) * dos(espan), label='Occupied states')
plt.legend(loc='best')
plt.savefig('images/fermi-entropy-integrand-2.png')

Now, we consider the integral to compute the electronic entropy. The entropy is proportional to this integral.

\( \int n(\epsilon) (f \ln f + (1 - f) \ln (1-f)) d\epsilon \)

It looks straightforward to compute, but it turns out there is a wrinkle. Evaluating the integrand leads to nan elements because the ln(0) is -∞.

import numpy as np
mu = 0
k = 8.6e-5
T = 100

def fermi(e):
    return 1.0 / (np.exp((e - mu)/(k*T)) + 1)

espan = np.array([-20, -10, -5, 0.0, 5, 10])
f = fermi(espan)

print f * np.log(f)
print (1 - f) * np.log(1 - f)
[  0.00000000e+000   0.00000000e+000   0.00000000e+000  -3.46573590e-001
  -1.85216532e-250               nan]
[        nan         nan         nan -0.34657359  0.          0.        ]

In this case, these nan elements should be equal to zero (x ln(x) goes to zero as x goes to zero). So, we can just ignore those elements in the integral. Here is how to do that.

import numpy as np
import matplotlib.pyplot as plt

mu = 0
k = 8.6e-5
T = 1000

def fermi(e):
    return 1.0 / (np.exp((e - mu)/(k*T)) + 1)

def dos(e):
    d = (np.ones(e.shape) - 0.03 * e**2) 
    return d * (d > 0)

espan = np.linspace(-20, 10)
f = fermi(espan)
n = dos(espan)

g = n * (f * np.log(f) + (1 - f) * np.log(1 - f))

print np.trapz(espan, g) # nan because of the nan in the g vector
print g

plt.plot(espan, g)
plt.savefig('images/fermi-entropy-integrand-3.png')

# find the elements that are not nan
ind = np.logical_not(np.isnan(g))

# evaluate the integrand for only those points
print np.trapz(espan[ind], g[ind])
nan
[             nan              nan              nan              nan
              nan              nan              nan              nan
              nan              nan              nan              nan
              nan              nan              nan              nan
              nan              nan              nan              nan
              nan              nan              nan              nan
              nan              nan              nan              nan
  -9.75109643e-14  -1.05987106e-10  -1.04640574e-07  -8.76265644e-05
  -4.92684641e-02  -2.91047740e-01  -7.75652579e-04  -1.00962241e-06
  -1.06972936e-09  -1.00527877e-12  -8.36436686e-16  -6.48930917e-19
  -4.37946336e-22  -2.23285389e-25  -1.88578082e-29   0.00000000e+00
   0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
   0.00000000e+00   0.00000000e+00]
0.208886080897

The integrand is pretty well behaved in the figure above. You do not see the full range of the x-axis, because the integrand evaluates to nan for very negative numbers. This causes the trapz function to return nan also. We can solve the problem by only integrating the parts that are not nan. We have to use numpy.logicalnot to get an element-wise array of which elements are not nan. In this example, the integrand is not well sampled, so the area under that curve may not be very accurate.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Read and Post Comments

The trapezoidal method of integration

| categories: math, integration | tags: | View Comments

Matlab post See http://en.wikipedia.org/wiki/Trapezoidal_rule

$$\int_a^b f(x) dx \approx \frac{1}{2}\displaystyle\sum\limits_{k=1}^N(x_{k+1}-x_k)(f(x_{k+1}) + f(x_k))$$

Let us compute the integral of sin(x) from x=0 to \(\pi\). To approximate the integral, we need to divide the interval from \(a\) to \(b\) into \(N\) intervals. The analytical answer is 2.0.

We will use this example to illustrate the difference in performance between loops and vectorized operations in python.

import numpy as np
import time

a = 0.0; b = np.pi;
N = 1000; # this is the number of intervals

h = (b - a)/N; # this is the width of each interval
x = np.linspace(a, b, N) 
y = np.sin(x); # the sin function is already vectorized

t0 = time.time()
f = 0.0
for k in range(len(x) - 1):
    f += 0.5 * ((x[k+1] - x[k]) * (y[k+1] + y[k]))

tf = time.time() - t0
print 'time elapsed = {0} sec'.format(tf)

print f
>>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> ... ... >>> >>> time elapsed = 0.0780000686646 sec
>>> 1.99999835177
t0 = time.time()
Xk = x[1:-1] - x[0:-2] # vectorized version of (x[k+1] - x[k])
Yk = y[1:-1] + y[0:-2] # vectorized version of (y[k+1] + y[k])

f = 0.5 * np.sum(Xk * Yk) # vectorized version of the loop above
tf = time.time() - t0
print 'time elapsed = {0} sec'.format(tf)

print f
>>> >>> >>> >>> >>> time elapsed = 0.077999830246 sec
>>> 1.99999340709

In the last example, there may be loop buried in the sum command. Let us do one final method, using linear algebra, in a single line. The key to understanding this is to recognize the sum is just the result of a dot product of the x differences and y sums.

t0 = time.time()
f = 0.5 * np.dot(Xk, Yk)
tf = time.time() - t0
print 'time elapsed = {0} sec'.format(tf)

print f
>>> >>> time elapsed = 0.0310001373291 sec
>>> 1.99999340709

The loop method is straightforward to code, and looks alot like the formula that defines the trapezoid method. the vectorized methods are not as easy to read, and take fewer lines of code to write. However, the vectorized methods are much faster than the loop, so the loss of readability could be worth it for very large problems.

The times here are considerably slower than in Matlab. I am not sure if that is a totally fair comparison. Here I am running python through emacs, which may result in slower performance. I also used a very crude way of timing the performance which lumps some system performance in too.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Read and Post Comments

On the quad or trapz'd in ChemE heaven

| categories: python, integration | tags: | View Comments

Matlab post

What is the difference between quad and trapz? The short answer is that quad integrates functions (via a function handle) using numerical quadrature, and trapz performs integration of arrays of data using the trapezoid method.

Let us look at some examples. We consider the example of computing \(\int_0^2 x^3 dx\). the analytical integral is \(1/4 x^4\), so we know the integral evaluates to 16/4 = 4. This will be our benchmark for comparison to the numerical methods.

We use the scipy.integrate.quad command to evaluate this \(\int_0^2 x^3 dx\).

from scipy.integrate import quad

ans, err = quad(lambda x: x**3, 0, 2)
print ans
4.0

you can also define a function for the integrand.

from scipy.integrate import quad

def integrand(x):
    return x**3

ans, err = quad(integrand, 0, 2)
print ans
4.0

1 Numerical data integration

if we had numerical data like this, we use trapz to integrate it

import numpy as np

x = np.array([0, 0.5, 1, 1.5, 2])
y = x**3

i2 = np.trapz(y, x)

error = (i2 - 4)/4

print i2, error
4.25 0.0625

Note the integral of these vectors is greater than 4! You can see why here.

import numpy as np
import matplotlib.pyplot as plt
x = np.array([0, 0.5, 1, 1.5, 2])
y = x**3

x2 = np.linspace(0, 2)
y2 = x2**3

plt.plot(x, y, label='5 points')
plt.plot(x2, y2, label='50 points')
plt.legend()
plt.savefig('images/quad-1.png')

The trapezoid method is overestimating the area significantly. With more points, we get much closer to the analytical value.

import numpy as np

x2 = np.linspace(0, 2, 100)
y2 = x2**3

print np.trapz(y2, x2)
4.00040812162

2 Combining numerical data with quad

You might want to combine numerical data with the quad function if you want to perform integrals easily. Let us say you are given this data:

x = [0 0.5 1 1.5 2]; y = [0 0.1250 1.0000 3.3750 8.0000];

and you want to integrate this from x = 0.25 to 1.75. We do not have data in those regions, so some interpolation is going to be needed. Here is one approach.

from scipy.interpolate import interp1d
from scipy.integrate import quad
import numpy as np

x = [0, 0.5, 1, 1.5, 2]
y = [0,    0.1250,    1.0000,    3.3750,    8.0000]

f = interp1d(x, y)

# numerical trapezoid method
xfine = np.linspace(0.25, 1.75)
yfine = f(xfine)
print np.trapz(yfine, xfine)

# quadrature with interpolation
ans, err = quad(f, 0.25, 1.75)
print ans
2.53199187838
2.53125

These approaches are very similar, and both rely on linear interpolation. The second approach is simpler, and uses fewer lines of code.

3 Summary

trapz and quad are functions for getting integrals. Both can be used with numerical data if interpolation is used. The syntax for the quad and trapz function is different in scipy than in Matlab.

Finally, see this post for an example of solving an integral equation using quad and fsolve.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Read and Post Comments

Integrating equations in python

| categories: python, integration | tags: | View Comments

A common need in engineering calculations is to integrate an equation over some range to determine the total change. For example, say we know the volumetric flow changes with time according to \(d\nu/dt = \alpha t\), where \(\alpha = 1\) L/min and we want to know how much liquid flows into a tank over 10 minutes if the volumetric flowrate is \(\nu_0 = 5\) L/min at \(t=0\). The answer to that question is the value of this integral: \(V = \int_0^{10} \nu_0 + \alpha t dt\).

import scipy
from scipy.integrate import quad

nu0 = 5     # L/min
alpha = 1.0 # L/min
def integrand(t):
    return nu0 + alpha * t

t0 = 0.0
tfinal = 10.0
V, estimated_error = quad(integrand, t0, tfinal)
print('{0:1.2f} L flowed into the tank over 10 minutes'.format(V))
100.00 L flowed into the tank over 10 minutes

That is all there is too it!

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Read and Post Comments

Next Page ยป