Uncertainty in an integral equation

| categories: math, uncertainty | tags:

In a previous example, we solved for the time to reach a specific conversion in a batch reactor. However, it is likely there is uncertainty in the rate constant, and possibly in the initial concentration. Here we examine the effects of that uncertainty on the time to reach the desired conversion.

To do this we have to write a function that takes arguments with uncertainty, and wrap the function with the uncertainties.wrap decorator. The function must return a single float number (current limitation of the uncertainties package). Then, we simply call the function, and the uncertainties from the inputs will be automatically propagated to the outputs. Let us say there is about 10% uncertainty in the rate constant, and 1% uncertainty in the initial concentration.

from scipy.integrate import quad
import uncertainties as u

k = u.ufloat((1.0e-3, 1.0e-4))
Ca0 = u.ufloat((1.0, 0.01))# mol/L

@u.wrap
def func(k, Ca0):
    def integrand(X):
        return 1./(k*Ca0)*(1./(1-X)**2)
    integral, abserr = quad(integrand, 0, 0.9)
    return integral

sol = func(k, Ca0)
print 't = {0} seconds ({1} hours)'.format(sol, sol/3600)
t = 9000.0+/-904.488801332 seconds (2.5+/-0.251246889259 hours)

The result shows about a 10% uncertainty in the time, which is similar to the largest uncertainty in the inputs. This information should certainly be used in making decisions about how long to actually run the reactor to be sure of reaching the goal. For example, in this case, running the reactor for 3 hours (that is roughly + 2σ) would ensure at a high level of confidence (approximately 95% confidence) that you reach at least 90% conversion.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter

New paper on SO2 tolerance of CO2 sorbent accepted in I&ECR

| categories: news | tags:

Our paper "Effects of O2 and SO2 on the capture capacity of a primary-amine based polymeric CO2 sorbent" by Alexander Hallenbeck and John R. Kitchin was accepted today in Industrial & Engineering Chemistry Research. In this paper we showed that the ion exchange resin OC1065 is susceptible to poisoning by SO2, but that it can be partially chemically regenerated. It can also be damaged by long term exposure to air at elevated temperatures.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter

for-else loops

| categories: programming | tags:

I just learned of for/else loops (http://pyvideo.org/video/1780/transforming-code-into-beautiful-idiomatic-pytho). They are interesting enough to write about. The idea is that there is an "else" clause of a for loop that is only executed if the loop completes without a break statement. The use case is to avoid using a flag. For example, let us say we want to loop through a list and determine if a number exists. Here is a typical way you might think to do it:

def f():
    flag = False
    for i in range(10):
        if i == 5:
            flag = True
            break

    return flag

print f()
True

A for/else loop does this in a different way. Essentially, the else clause runs if the loop completes, otherwise if the break occurs it is skipped. In this example the break statement occurs, so the else statement is skipped.

def f():
    for i in range(10):
        if i == 5:
            break
    else:
        return False

    return True

print f()
True

In this example no break statement occurs, so the else clause is executed.

def f():
    for i in range(10):
        if i == 15:
            break
    else:
        return False

    return True

print f()
False

It is hard to say if this is an improvement over the flag. They both use the same number of lines of code, and I find it debatable if the else statement is intuitive in its meaning. Maybe if there were multiple potential breaks this would be better.

Needless to say, go watch http://pyvideo.org/video/1780/transforming-code-into-beautiful-idiomatic-pytho. You will learn a lot of interesting things!

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter

The and-or trick in python

| categories: logic, programming | tags:

The boolean logic commands and and or have return values in python. Let us first review briefly what these operators do by examples. The typical usage is in conditional statements. First, we look at what kind of values evaluate to "True" or "False" in python. Anything that is "empty" usually evaluates to False, along with the integer 0 and the boolean value of False.

for value in ('', 0, None, [], (), {}, False):
    if value:
        print value, "True"
    else:
        print value, "False"
 False
0 False
None False
[] False
() False
{} False
False False

Objects that are not empty evaluate to "True", along with numbers not equal to 0, and the boolean value True.

for value in (' ', 2, 1, "a", [1], (3,), True):
    if value:
        print value, "True"
    else:
        print value, "False"
  True
2 True
1 True
a True
[1] True
(3,) True
True True

The and and or operators compare two objects. and evaluates to "True" if both objects evaluate to "True" and or evaluates to "True" if either object evaluates to "True". Here are some examples.

a = None
b = 5

if a and b:
    print True
else:
    print False
False
a = None
b = 5

if a or b:
    print True
else:
    print False
True

Now the interesting part. The and and or operators actually return values! With the and operator, each argument is evaluated, and if they all evaluate to True, the last argument is returned. Otherwise the first False argument is returned.

a = 1
b = 5
print a and b
print b and a
print a and False
print a and True
print a and None
print False and a
print None and a
print True and 'a' and 0 and True # first False item is zero
5
1
False
True
None
False
None
0

The or operator returns the first True value or the last value if nothing is True. Note that if a True value is found, the values in the expressions after that value are not evaluated.

print 2 or False
print 0 or False
print 0 or False or 4 or {}
2
False
4

One way you might see this is to set variables depending on what command-line arguments were used in a script. For example:

import sys

# replace this:
if 'plot' in sys.argv:
    PLOT = True
else:
    PLOT = False

# with this
PLOT = 'plot' in sys.argv or False

# later in your code:
if PLOT: 
    # do something to make a plot

Now we get to the and-or trick. The trick is to assign a variable one value if some boolean value is True, and another value if the expression is False.

a = True
b = True

if a and b:
    c = "value1"
else: 
    c = "value2"

print c
value1

We can replace the if/else code above with this one line expression:

a = True
b = True

c = (a and b) and "value1" or "value2"
print c
value1

There is a problem. If the first value evaluates to False, you will not get what you expect:

a = True
b = True

c = (a and b) and None or "value2"
print c
value2

In this case, (a and b) evaluates to True, so we would expect the value of c to be the first value. However, None evaluates to False, so the or operator returns the first "True" value, which is the second value. We have to modify the code so that both the or arguments are True. We do this by putting both arguments inside a list, which will then always evaluate to True. This will assign the first list to c if the expression is True, and the second list if it is False. We wrap the whole thing in parentheses, and then index the returned list to get the contents of the list.

a = True
b = True

c = ((a and b) and [None] or ["value2"])[0]

print c
None
a = True
b = True

c = (not (a and b) and [None] or ["value2"])[0]

print c
value2

This is definitely a trick. I find the syntax difficult to read, especially compared to the more verbose if/else statements. It is interesting though, and there might be places where the return value of the boolean operators might be useful, now that you know you can get them.

Here is a tough example of using this to update a dictionary entry. Previously we used a dictionary to count unique entries in a list.

d = {}

d['key'] = (d.get('key', None) and [d['key'] + 1] or [1])[0]

print d

d['key'] = (d.get('key', None) and [d['key'] + 1] or [1])[0]
print d
{'key': 1}
{'key': 2}

This works because the .get function on a dictionary returns None if the key does not exist, resulting in assigning the value of 1 to that key, or it returns something that evaluates to True if the key does exist, so the key gets incremented.

Let us see this trick in action. Before we used if/else statements to achieve our goal, checking if the key is in the dictionary and incrementing by one if it is, and if not, setting the key to 1.

L = ['a', 'a', 'b','d', 'e', 'b', 'e', 'a']

# old method
d = {}
for el in L:
    if el in d:
        d[el] += 1
    else:
        d[el] = 1

print d
{'a': 3, 'b': 2, 'e': 2, 'd': 1}

Here is the new method:

# new method:
L = ['a', 'a', 'b','d', 'e', 'b', 'e', 'a']
d = {}
for el in L:
    d[el] = (d.get(el, None) and [d[el] + 1] or [1])[0]
print d
{'a': 3, 'b': 2, 'e': 2, 'd': 1}

We have in (an admittedly hard to read) a single single line replaced the if/else statement! I have for a long time thought this should possible. I am somewhat disappointed that it is not easier to read though.

Update 7/8/2013

1 Using more modern python syntax than the and-or trick

@Mark_ pointed out in a comment the more modern syntax in python is "value1" if a else "value2". Here is how it works.

a = True
c = "value1" if a else "value2"
print c
value1
a = ''
c = "value1" if a else "value2"
print c
value2

This is indeed very clean to read. This leads to a cleaner and easier to read implementation of the counting code.

L = ['a', 'a', 'b','d', 'e', 'b', 'e', 'a']
d = {}
for el in L:
    d[el] = (d[el] + 1) if (el in d) else 1
print d
{'a': 3, 'b': 2, 'e': 2, 'd': 1}

See the next section for an even cleaner implementation.

2 using defaultdict

@Mark_ also suggested the use of defaultdict for the counting code. That is pretty concise! It is not obvious that the default value is equal to zero, but int() returns zero. This is much better than the and-or trick.

from collections import defaultdict
print int()

L = ['a', 'a', 'b','d', 'e', 'b', 'e', 'a']
d = defaultdict(int)
for el in L:
    d[el] += 1
print d
0
defaultdict(<type 'int'>, {'a': 3, 'b': 2, 'e': 2, 'd': 1})

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter

Uncertainty in polynomial roots - Part II

| categories: data analysis, uncertainty | tags:

We previously looked at uncertainty in polynomial roots where we had an analytical formula for the roots of the polynomial, and we knew the uncertainties in the polynomial parameters. It would be inconvenient to try this for a cubic polynomial, although there may be formulas for the roots. I do not know of there are general formulas for the roots of a 4th order polynomial or higher.

Unfortunately, we cannot use the uncertainties package out of the box directly here.

import uncertainties as u
import numpy as np
c, b, a = [-0.99526746, -0.011546,    1.00188999]
sc, sb, sa = [ 0.0249142,   0.00860025,  0.00510128]

A = u.ufloat((a, sa))
B = u.ufloat((b, sb))
C = u.ufloat((c, sc))

print np.roots([A, B, C])
>>> >>> >>> >>> >>> >>> >>> >>> Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Users\jkitchin\AppData\Local\Enthought\Canopy\User\lib\site-packages\numpy\lib\polynomial.py", line 218, in roots
    p = p.astype(float)
  File "c:\Users\jkitchin\AppData\Local\Enthought\Canopy\User\lib\site-packages\uncertainties\__init__.py", line 1257, in raise_error
    % (self.__class__, coercion_type))
TypeError: can't convert an affine function (<class 'uncertainties.Variable'>) to float; use x.nominal_value

To make some progress, we have to understand how the numpy.roots function works. It constructs a Companion matrix, and the eigenvalues of that matrix are the same as the roots of the polynomial.

import numpy as np

c0, c1, c2 = [-0.99526746, -0.011546,    1.00188999]

p = np.array([c2, c1, c0])
N = len(p)

# we construct the companion matrix like this
# see https://github.com/numpy/numpy/blob/v1.7.0/numpy/lib/polynomial.py#L220
# for this code.
# build companion matrix and find its eigenvalues (the roots)
A = np.diag(np.ones((N-2,), p.dtype), -1)
A[0, :] = -p[1:] / p[0]

print A

roots = np.linalg.eigvals(A)
print roots
[[ 0.01152422  0.99338996]
 [ 1.          0.        ]]
[ 1.00246827 -0.99094405]

This definition of the companion matrix is a little different than the one here, but primarily in the scaling of the coefficients. That does not seem to change the eigenvalues, or the roots.

Now, we have a path to estimate the uncertainty in the roots. Since we know the polynomial coefficients and their uncertainties from the fit, we can use Monte Carlo sampling to estimate the uncertainty in the roots.

import numpy as np
import uncertainties as u

c, b, a = [-0.99526746, -0.011546,    1.00188999]
sc, sb, sa = [ 0.0249142,   0.00860025,  0.00510128]

NSAMPLES = 100000
A = np.random.normal(a, sa, (NSAMPLES, ))
B = np.random.normal(b, sb, (NSAMPLES, ))
C = np.random.normal(c, sc, (NSAMPLES, ))

roots = [[] for i in range(NSAMPLES)]

for i in range(NSAMPLES):
    p = np.array([A[i], B[i], C[i]])
    N = len(p)
    
    M = np.diag(np.ones((N-2,), p.dtype), -1)
    M[0, :] = -p[1:] / p[0]
    r = np.linalg.eigvals(M)
    r.sort()  # there is no telling what order the values come out in
    roots[i] = r
    
avg = np.average(roots, axis=0)
std = np.std(roots, axis=0)

for r, s in zip(avg, std):
    print '{0: f} +/- {1: f}'.format(r, s)
-0.990949 +/-  0.013435
 1.002443 +/-  0.013462

Compared to our previous approach with the uncertainties package where we got:

: -0.990944048037+/-0.0134208013339
:  1.00246826738 +/-0.0134477390832

the agreement is quite good! The advantage of this approach is that we do not have to know the formula for the roots of higher order polynomials to estimate the uncertainty in the roots. The downside is we have to evaluate the eigenvalues of a matrix a large number of times to get good estimates of the uncertainty. For high power polynomials this could be problematic. I do not currently see a way around this, unless it becomes possible to get the uncertainties package to propagate through the numpy.eigvals function. It is possible to wrap some functions with uncertainties, but so far only functions that return a single number.

There are some other potential problems with this approach. It is assumed that the accuracy of the eigenvalue solver is much better than the uncertainty in the polynomial parameters. You have to use some judgment in using these uncertainties. We are approximating the uncertainties of a nonlinear problem. In other words, the uncertainties of the roots are not linearly dependent on the uncertainties of the polynomial coefficients.

It is possible to wrap some functions with uncertainties, but so far only functions that return a single number. Here is an example of getting the nth root and its uncertainty.

import uncertainties as u
import numpy as np

@u.wrap
def f(n=0, *P):
    ''' compute the nth root of the polynomial P and the uncertainty of the root'''
    p =  np.array(P)
    N = len(p)
    
    M = np.diag(np.ones((N-2,), p.dtype), -1)
    M[0, :] = -p[1:] / p[0]
    r = np.linalg.eigvals(M)
    r.sort()  # there is no telling what order the values come out in
    return r[n]

# our polynomial coefficients and standard errors
c, b, a = [-0.99526746, -0.011546,    1.00188999]
sc, sb, sa = [ 0.0249142,   0.00860025,  0.00510128]

A = u.ufloat((a, sa))
B = u.ufloat((b, sb))
C = u.ufloat((c, sc))

for result in [f(n, A, B, C) for n in [0, 1]]:
    print result
-0.990944048037+/-0.013420800377
1.00246826738+/-0.0134477388218

It is good to see this is the same result we got earlier, with a lot less work (although we do have to solve it for each root, which is a bit redundant)! It is a bit more abstract though, and requires a specific formulation of the function for the wrapper to work.

Copyright (C) 2013 by John Kitchin. See the License for information about copying.

org-mode source

Discuss on Twitter
« Previous Page -- Next Page »