Solving nonlinear algebra problems with internal state information

| categories: python, nonlinear algebra | tags: | View Comments

In engineering, we often need to solve an equation in one variable, and then use the solution to compute other variables. For example, we might want the bubble point temperature of a mixture, and then to determine the composition of the vapor phase that has formed. In other words, we compute the temperature, and then have to use it in a subsequent step to get the composition. Here is a bubble point computation adapted from example 10.2 in Smith and van Ness, Introduction to Chemical Engineering Thermodynamics.

Given a solution of acetone (x1=0.3), acetonitrile (x2=0.45) and nitromethane (x3=0.25) at a total pressure of 80 kPa, compute the bubble point temperature and gas phase composition.

The key here is to find a temperature where the gas-phase mole fractions sum to one. The gas phase mole fractions are defined by:

\(y_i = x_i Pvap_i(T) / P\)

The typical way I would teach students how solve this looks like this. It uses the Antoine equation coded below to estimate the vapor pressure of each component as a function of temperature, and then uses fsolve to find a temperature where the gas-phase mole fractions sum to one.

import numpy as np
from scipy.optimize import fsolve

acetone = (14.5463, 2940.46, 237.22)
acetonitrile = (14.2724, 2945.47,224)
nitromethane = (14.2043, 2972.64, 209)

def antoine(T, A, B, C):
    T = float(T) # there is some subtle issue that comes up when T is an array,
                 # as passed in from fsolve. It needs to be a float, or you get
                 # the wrong answer.
    return np.exp(A - B / (T + C))

x = np.array([0.3, 0.45, 0.25])
P = 80

def objective(T):
    Pvap = np.array([antoine(T, *pars) for pars in [acetone, acetonitrile, nitromethane]])
    y = x * Pvap / P
    return 1 - y.sum()

Tans, = fsolve(objective, 70)

# This is where we end up repeating code
Pvap = np.array([antoine(Tans, *pars) for pars in [acetone, acetonitrile, nitromethane]])
y = x * Pvap / P

print(f'The bubble point temperature is {Tans:1.2f} degC, and the gas phase compositions are {np.round(y, 4)}.')
The bubble point temperature is 68.60 degC, and the gas phase compositions are [ 0.5196  0.3773  0.1031].


This solution works fine, but there is in my opinion, an issue with the small amount of repeated code at the end that is required to get the composition of the gas-phase. This is a small problem here, but as the problems get bigger it is more and more tedious to correctly repeat all the code to see what the state of a system is at the solution, and it seems wasteful to have to repeat the computations; they were known in the objective function. In the following subsections, I explore some alternative approaches to reduce the repetition.

1 First approach

There are two small chunks of repeated code in the example above. One way to minimize the amount of repeated code is to pull these out into reusable functions. Here, we do that, and only have to repeat one function call at the end to get the system composition out.

import numpy as np
from scipy.optimize import fsolve

acetone = (14.5463, 2940.46, 237.22)
acetonitrile = (14.2724, 2945.47,224)
nitromethane = (14.2043, 2972.64, 209)

def antoine(T, A, B, C):
    T = float(T) # there is some subtle issue that comes up when T is an array,
                 # as passed in from fsolve. It needs to be a float, or you get
                 # the wrong answer.
    return np.exp(A - B / (T + C))

x = np.array([0.3, 0.45, 0.25])
P = 80

def Pvap(T):
    return np.array([antoine(T, *pars) for pars in [acetone, acetonitrile, nitromethane]])

def y(T):
    return x * Pvap(T) / P

def objective(T):
    return 1 - y(T).sum()

Tans, = fsolve(objective, 70)

yans = y(Tans) # minimal repetition of a calculation to get the composition state.

print(f'The bubble point temperature is {Tans:1.2f} degC, and the gas phase compositions are {np.round(yans, 4)}.')
The bubble point temperature is 68.60 degC, and the gas phase compositions are [ 0.5196  0.3773  0.1031].


That is a small improvement. The code is not much shorter, just reorganized for easier reuse. It would be easy in this case to also get the vapor pressures of each species at this temperature, just by calling the Pvap function. Still, it feels annoying we have to recalculate the answer to something you know must have already been known when the objective function was evaluated.

2 Second approach - use a state dictionary as an arg in the objective function

In this approach, we will use a dictionary to store the state of the objective function. The dictionary will be in the global namespace, and we will just update it each time the objective function is called.

import numpy as np
from scipy.optimize import fsolve

acetone = (14.5463, 2940.46, 237.22)
acetonitrile = (14.2724, 2945.47,224)
nitromethane = (14.2043, 2972.64, 209)

def antoine(T, A, B, C):
    return np.exp(A - B / (T + C))

x = np.array([0.3, 0.45, 0.25])

state = {}

P = 80


def objective(T, state):
    T = float(T)
    Pvap = np.array([antoine(T, *pars) for pars in [acetone, acetonitrile, nitromethane]])
    y = x * Pvap / P
    state.update({'y': y,
                  'T':  T,
                  'Pvap': Pvap,
                  'z': 1 - y.sum()})
    return state['z']

Tans, = fsolve(objective, 70, args=(state,))

print(f'The bubble point temperature is {Tans:1.2f} degC, and the gas phase compositions are {np.round(state["y"], 4)}.')
print(Tans- state['T']) # check to make sure last value from objective is the same as the solution
state
The bubble point temperature is 68.60 degC, and the gas phase compositions are [ 0.5196  0.3773  0.1031].
0.0


{'Pvap': array([ 138.5620209 ,   67.07966082,   32.98218545]),
 'T': 68.60064626680659,
 'y': array([ 0.51960758,  0.37732309,  0.10306933]),
 'z': -3.4194869158454821e-14}

What we see in the state dictionary is the result from the last time that the objective function was called. It appears that the list time it was called is also where the solution comes from, so the other variables stored here should be consistent. Now you can see we have access to both the Pvap and y composition data from the solution without needing any further computations. This approach could be easily extended to store any derived quantities that represent internal states you want. We don't store any history in this, but you could by appending to lists in the dictionary.

One feature of this is the state dictionary is updated by side effect, and you have to use the state dictionary as an argument parameter to the function.

3 third approach - a callable object

A standard approach to tracking state data is to encapsulate it in a class. fsolve requires a callable function that returns zero at the solution. It is possible to make an object act like a callable function if we define a __call__ method on it. Then, in this method, we can set attributes on the object to keep track of the state, similar to what we did with the dictionary. Since we have a class, we can define some other special dunder methods, e.g. to print the solution. Here is one implementation.

import numpy as np
from scipy.optimize import fsolve

class Objective(object):
    acetone = (14.5463, 2940.46, 237.22)
    acetonitrile = (14.2724, 2945.47,224)
    nitromethane = (14.2043, 2972.64, 209)

    def __init__(self, x, P):
        self.x = np.array(x)
        self.P = P

    def antoine(self, T, A, B, C):
        return np.exp(A - B / (T + C))

    def __str__(self):
        s = f'The bubble point temperature is {self.T:1.2f} degC, and the gas phase compositions are {np.round(self.y, 4)}.'
        return s

    def __call__(self, T):
        T = float(T)
        self.T = T
        self.Pvap = np.array([self.antoine(T, *pars) for pars in [self.acetone, self.acetonitrile, self.nitromethane]])
        self.y = self.x * self.Pvap / self.P
        return 1 - self.y.sum()

obj = Objective(x=np.array([0.3, 0.45, 0.25]), P=80)
ans, = fsolve(obj, 60)

print(obj)
The bubble point temperature is 68.60 degC, and the gas phase compositions are [ 0.5196  0.3773  0.1031].


Similar to the state dictionary approach, there is no repeated code here, and no repeated evaluations to get to the state after the solution. This is a bit more advanced Python than the state dictionary. Note, this implementation doesn't have any checking in it, so if you try to print the object before calling fsolve, you will get an error because the attributes don't exist until after the object has been called. That is also an issue with the state dictionary above.

There are many choices you could make in constructing this example. Maybe this one has gone too far in encapsulating the antoine function as a method. That limits its reusability for another problem. On the other hand, you can reuse it for any other pressure or liquid composition of acetone, acetonitrile and nitromethane very readily.

4 Summary

We looked at three ways to reduce having redundant code in the solution to problems involving nonlinear algebra. The first approach is conceptually simple; you break out as much as you can into reusable functions, and then at most have repeated function calls. These computations are usually not expensive, so repeating them is mostly tedious and provides opportunities for mistakes. This is probably what I will stick to for teaching students that are just seeing this for the first time.

The second approach used a dictionary (other data structures could work too) as an argument to the objective function, and internal states were kept in the dictionary so that after the problem was solved, you have immediate access to them. This is more advanced than the first approach because it requires understanding that the dictionary is modified as a side effect of solving the problem.

Finally, we considered an object-oriented class encapsulation of the information we wanted. I consider this the most advanced Python solution, since it requires some understanding of classes, dunder methods and attributes, and how to make an instance of a class.

The last two methods seem like candidates for an advanced class in problem solving. Thoughts?

Copyright (C) 2018 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.1.13

Read and Post Comments

Integration of the heat capacity

| categories: uncategorized | tags: | View Comments

From thermodynamics, the heat capacity is defined as \(C_p = \left(\frac{dH}{dT}\right)_P\). That means we can calculate the heat required to change the temperature of some material from the following integral:

\(H_2 - H_1 = Q = \int_{T_1}^{T_2} C_p(T) dT\)

In the range of 298-1200K, the heat capacity of CO2 is given by a Shomate polynomial:

\(C_p(t) = A + B t + C t^2 + D t^3 + E/t^2\) with units of J/mol/K.

where \(t = T / 1000\), and \(T\) is the temperature in K. The constants in the equation are

  value
A 24.99735
B 55.18696
C -33.69137
D 7.948387
E -0.136638
F -403.6075
G 228.2431
H -393.5224

1 Integrate the heat capacity

Use this information to compute the energy (Q in kJ/mol) required to raise the temperature of CO2 from 300K to 600K. You should use scipy.integrate.quad to perform the integration.


1.1 solution   solution

A =  24.99735
B =  55.18696
C = -33.69137
D =  7.948387
E = -0.136638
F = -403.6075
G =  228.2431
H = -393.5224

def Cp(T):
    t = T / 1000
    return A + B*t + C*t**2 + D*t**3 + E / t**2

from scipy.integrate import quad

dH, _ = quad(Cp, 300, 600)
print(f'The change in enthalpy is {dH / 1000:1.3f} kJ/mol')
The change in enthalpy is 12.841 kJ/mol


2 Verify via Δ H

The change in enthalpy (in kJ / mol) from standard state is

\(dH − dH_{298.15}= A t + B t^2/2 + C t^3/3 + D t^4/4 − E/t + F − H\)

again, \(t = T / 1000\).

Use this equation to compute the change in enthalpy when you increase the temperature from 300 K to 600 K.


2.1 solution   solution

def dH(T):
    t = T / 1000
    return A * t + B*t**2 / 2 + C * t**3 / 3 + D * t**4 / 4 - E/t + F - H

print(f'The change in enthalpy is {dH(600) - dH(300):1.3f} kJ/mol')
The change in enthalpy is 12.841 kJ/mol


Copyright (C) 2018 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.1.13

Read and Post Comments

A new ode integrator function in scipy

| categories: ode, scipy | tags: | View Comments

I learned recently about a new way to solve ODEs in scipy: scipy.integrate.solve_ivp. This new function is recommended instead of scipy.integrate.odeint for new code. This function caught my eye because it added functionality that was previously missing, and that I had written into my pycse package. That functionality is events.

To explore how to use this new function, I will recreate an old blog post where I used events to count the number of roots in a function. Spoiler alert: it may not be ready for production.

The question at hand is how many roots are there in \(f(x) = x^3 + 6x^2 - 4x - 24\), and what are they. Now, I know there are three roots and that you can use np.roots for this, but that only works for polynomials. Here they are, so we know what we are looking for.

import numpy as np
np.roots([1, 6, -4, -24])
array([-6.,  2., -2.])

The point of this is to find a more general way to count roots in an interval. We do it by integrating the derivative of the function, and using an event function to count when the function is equal to zero. First, we define the derivative:

\(f'(x) = 3x^2 + 12x - 4\), and the value of our original function at some value that is the beginning of the range we want to consider, say \(f(-8) = -120\). Now, we have an ordinary differential equation that can be integrated. Our event function is simply, it is just the function value \(y\). In the next block, I include an optional t_eval arg so we can see the solution at more points.

def fprime(x, y):
    return 3 * x**2 + 12 * x - 4

def event(x, y):
    return y

import numpy as np
from scipy.integrate import solve_ivp
sol = solve_ivp(fprime, (-8, 4), np.array([-120]), t_eval=np.linspace(-8, 4, 10), events=[event])
sol
 message: 'The solver successfully reached the interval end.'
    nfev: 26
    njev: 0
     nlu: 0
     sol: None
  status: 0
 success: True
       t: array([-8.        , -6.66666667, -5.33333333, -4.        , -2.66666667,
      -1.33333333,  0.        ,  1.33333333,  2.66666667,  4.        ])
t_events: [array([-6.])]
       y: array([[-120.        ,  -26.96296296,   16.2962963 ,   24.        ,
         10.37037037,  -10.37037037,  -24.        ,  -16.2962963 ,
         26.96296296,  120.        ]])

sol.t_events
[array([-6.])]

Huh. That is not what I expected. There should be three values in sol.t_events, but there is only one. Looking at sol.y, you can see there are three sign changes, which means three zeros. The graph here confirms that.

%matplotlib inline
import matplotlib.pyplot as plt
plt.plot(sol.t, sol.y[0])
[<matplotlib.lines.Line2D at 0x151281d860>]

What appears to be happening is that the events are only called during the solver steps, which are different than the t_eval steps. It appears a workaround is to specify a max_step that can be taken by the solver to force the event functions to be evaluated more often. Adding this seems to create a new cryptic warning.

sol = solve_ivp(fprime, (-8, 4), np.array([-120]), events=[event], max_step=1.0)
sol
/Users/jkitchin/anaconda/lib/python3.6/site-packages/scipy/integrate/_ivp/rk.py:145: RuntimeWarning: divide by zero encountered in double_scalars
  max(1, SAFETY * error_norm ** (-1 / (order + 1))))


 message: 'The solver successfully reached the interval end.'
    nfev: 80
    njev: 0
     nlu: 0
     sol: None
  status: 0
 success: True
       t: array([-8.        , -7.89454203, -6.89454203, -5.89454203, -4.89454203,
      -3.89454203, -2.89454203, -1.89454203, -0.89454203,  0.10545797,
       1.10545797,  2.10545797,  3.10545797,  4.        ])
t_events: [array([-6., -2.,  2.])]
       y: array([[-120.        , -110.49687882,  -38.94362768,    3.24237128,
         22.06111806,   23.51261266,   13.59685508,   -1.68615468,
        -16.33641662,  -24.35393074,  -19.73869704,    3.50928448,
         51.39001383,  120.        ]])

sol.t_events
[array([-6., -2.,  2.])]

That is more like it. Here, I happen to know the answers, so we are safe setting a max_step of 1.0, but that feels awkward and unreliable. You don't want this max_step to be too small, because it probably makes for more computations. On the other hand, it can't be too large either because you might miss roots. It seems there is room for improvement on this.

It also seems odd that the solve_ivp only returns the t_events, and not also the corresponding solution values. I guess in this case, we know the solution values are zero at t_events, but, supposing you instead were looking for a maximum value by getting a derivative that was equal to zero, you might end up getting stuck solving for it some how.

Let's consider this parabola with a maximum at \(x=2\), where \(y=2\):

x = np.linspace(0, 4)
plt.plot(x, 2 - (x - 2)**2)
[<matplotlib.lines.Line2D at 0x1512dad9e8>]

We can find the maximum like this.

def yprime(x, y):
    return -2  * (x - 2)

def maxevent(x, y):
    return yprime(x, y)

sol = solve_ivp(yprime, (0, 4), np.array([-2]), events=[maxevent])
sol
/Users/jkitchin/anaconda/lib/python3.6/site-packages/scipy/integrate/_ivp/rk.py:145: RuntimeWarning: divide by zero encountered in double_scalars
  max(1, SAFETY * error_norm ** (-1 / (order + 1))))


 message: 'The solver successfully reached the interval end.'
    nfev: 20
    njev: 0
     nlu: 0
     sol: None
  status: 0
 success: True
       t: array([ 0.        ,  0.08706376,  0.95770136,  4.        ])
t_events: [array([ 2.])]
       y: array([[-2.        , -1.65932506,  0.91361355, -2.        ]])

Clearly, we found the maximum at x=2, but now what? Re-solve the ODE and use t_eval with the t_events values? Use a fine t_eval array, and interpolate the solution? That doesn't seem smart. You could make the event terminal, so that it stops at the max, and then read off the last value, but this will not work if you want to count more than one maximum, for example.

maxevent.terminal = True
solve_ivp(yprime, (0, 4), (-2,), events=[maxevent])
/Users/jkitchin/anaconda/lib/python3.6/site-packages/scipy/integrate/_ivp/rk.py:145: RuntimeWarning: divide by zero encountered in double_scalars
  max(1, SAFETY * error_norm ** (-1 / (order + 1))))


 message: 'A termination event occurred.'
    nfev: 20
    njev: 0
     nlu: 0
     sol: None
  status: 1
 success: True
       t: array([ 0.        ,  0.08706376,  0.95770136,  2.        ])
t_events: [array([ 2.])]
       y: array([[-2.        , -1.65932506,  0.91361355,  2.        ]])

Internet: am I missing something obvious here?

Copyright (C) 2018 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.1.13

Read and Post Comments

Getting geo-tagged information from photos for blogging

| categories: geotag, emacs, orgmode | tags: | View Comments

I am kind of late to this game, but recently I turned on location services for the camera on my phone. That means the location of the photo is stored in the photo, and we can use that to create urls to the photo location in a map for example. While traveling, I thought this would be a good application for org-mode to add functionality to documents with photos in them, e.g. to be able to click on them to see where they are from, or to automate creation of html pages with links to maps, etc. In this post I explore some ways to achieve those ideas. What I would like is a custom org link that shows me a thumbnail of the image, and which exports to show the image in an html file with a link to a pin on Google maps.

So, let's dig in. Imagemagick provides an identify command that can extract the information stored in the images. Here we consider just the GPS information. I some pictures on a recent vacation, and one is unimaginatively named IMG_1759.JPG. Let's see where it was taken.

identify -verbose IMG_1759.JPG | grep GPS
exif:GPSAltitude: 14426/387    
exif:GPSAltitudeRef: 0    
exif:GPSDateStamp: 2018:06:30    
exif:GPSDestBearing: 11767/80    
exif:GPSDestBearingRef: T    
exif:GPSImgDirection: 11767/80    
exif:GPSImgDirectionRef: T    
exif:GPSInfo: 1632    
exif:GPSLatitude: 22/1, 11/1, 614/100
exif:GPSLatitudeRef: N    
exif:GPSLongitude: 159/1, 40/1, 4512/100
exif:GPSLongitudeRef: W    
exif:GPSSpeed: 401/100    
exif:GPSSpeedRef: K    
exif:GPSTimeStamp: 3/1, 44/1, 3900/100

The interpretation here is that I took that photo at latitude 22° 11' 6.14" N, and longitude 159° 40' 45.12" W. Evidently I was moving at 4.01 in some unit; I can confirm that I was at least moving, I was on a ship when I took that picture, and it was moving.

According to this you can make a url to a Google maps pin in satellite picture mode that looks like this: http://maps.google.com/maps?q=22 11 6.14N,159 40 45.12W&t=k. It doesn't seem possible to set the zoom in this url (at least setting the zoom doesn't do anything, and I didn't feel like trying all the other variations that are reported to sometimes work). I guess that is ok for now, it adds some suspense that you have to zoom out to see where the image is in some cases.

We need a little function to take an image file and generate that link. We have to do some algebra on the latitude and longitude which are stored as integers with a division operator. I am going to pipe this through an old unix utility called bc mostly because it is simple, and I won't have to parse it much. bc is a little archaic, you have to set the scale first, which tells it how many decimal places to output. The degrees and minutes are integers, so we will have to deal with that later.

echo "scale=2; 614/100" | bc
6.14

Here is our function. I filter out the lines with GPS in them into an a-list. Then, I grab out the specific quantities I want and construct the url. There is a little hackery since it appears the degrees and minutes should be integers in the url formulation used here, so I convert them to numbers and then take the floor. The function is a little longer than I thought, but it isn't too bad I guess. It is a little repetitious, but not enough to justify refactoring.

(defun iphoto-map-url (fname)
  (let* ((gps-lines (-keep (lambda (line)
                             (when (s-contains? "GPS" line) (s-trim line)))
                           (process-lines "identify" "-verbose" fname)))
         (gps-alist (mapcar (lambda (s) (s-split ": " s t))  gps-lines))
         (latitude (mapcar
                    (lambda (s)
                      (s-trim (shell-command-to-string
                               (format "echo \"scale=2;%s\" | bc" s))))
                    (s-split "," (cadr (assoc "exif:GPSLatitude" gps-alist)))))
         (latitude-ref (cadr (assoc "exif:GPSLatitudeRef" gps-alist)))
         (longitude (mapcar
                     (lambda (s)
                       (s-trim
                        (shell-command-to-string
                         (format "echo \"scale=2;%s\" | bc" s))))
                     (s-split "," (cadr (assoc "exif:GPSLongitude" gps-alist)))))
         (longitude-ref (cadr (assoc "exif:GPSLongitudeRef" gps-alist))))
    (s-format "http://maps.google.com/maps?q=$0 $1 $2$3,$4 $5 $6$7&t=k"
              'elt
              (list
               (floor (string-to-number (nth 0 latitude)))
               (floor (string-to-number (nth 1 latitude)))
               (nth 2 latitude)
               latitude-ref
               (floor (string-to-number (nth 0 longitude)))
               (floor (string-to-number (nth 1 longitude)))
               (nth 2 longitude)
               longitude-ref))))
iphoto-map-url

Here is the function in action, making the url.

(iphoto-map-url "IMG_1759.JPG")
http://maps.google.com/maps?q=22 11 6.14N,159 40 45.12W&t=k

It is kind of slow, but that is because the identify shell command is kind of slow when you run it with the -verbose tag. Now, I would like the following things to happen when I publish it to html:

  1. I want the image wrapped in an img tag inside a figure environment.
  2. I want the image to by hyperlinked to its location in Google maps.

In the org file, I want a thumbnail overlay on it, so I can see the image while writing, and I want it to toggle like other images. I use an iPhone to take the photos, so we will call it an iphoto link.

Here is the html export function I will use. It is a little hacky that I hard code the width in at 300 pixels, but I didn't feel like figuring out how to get it from an #+attr_html line right now. It probably requires a filter function where you have access to the actual org-elements. I put the url to the image location in a figure caption here.

(defun iphoto-export (path desc backend)
  (cond
   ((eq 'html backend)
    (format "<figure>
<img src=\"%s\" width=\"300\">
%s
</figure>"
            path
            (format "<figcaption>%s <a href=\"%s\">map</a></figcaption>"
                    (or desc "")
                    (iphoto-map-url path))))))
iphoto-export

Ok, the last detail I want is to put an image overlay on my new link so I can see it. I want this to work with org-toggle-inline-images so I can turn the images on and off like regular image links with C-c C-x C-v. This function creates overlays as needed, and ties into the org-inline-image-overlays so they get deleted on toggling. We have to advise the display function to redraw these, which we clumsily do by restarting the org font-lock machinery which will redraw the thumbnails from the activate-func property of the links. I also hard code the thumbnail width in this function, when it could be moved out to a variable or attribute.

(defun iphoto-thumbnails (start end imgfile bracketp)
  (unless bracketp
    (when (and
           ;; it is an image
           (org-string-match-p (image-file-name-regexp) imgfile)
           ;; and it exists
           (f-exists? imgfile)
           ;; and there is no overlay here.
           (not (ov-at start)))
      (setq img (create-image (expand-file-name imgfile)
                              'imagemagick nil :width 300
                              :background "lightgray"))
      (setq ov (make-overlay start end))
      (overlay-put ov 'display img)
      (overlay-put ov 'face 'default)
      (overlay-put ov 'org-image-overlay t)
      (overlay-put ov 'modification-hooks
                   (list
                    `(lambda (&rest args)
                       (org-display-inline-remove-overlay ,ov t ,start ,end))))
      (push ov org-inline-image-overlays))))

(defun iphoto-redraw-thumbnails (&rest args)
  (org-restart-font-lock))

;; this redisplays these thumbnails on image toggling
(advice-add 'org-display-inline-images :after 'iphoto-redraw-thumbnails)

Next, we define the link with a follow, export, tooltip and activate-func (which puts the overlay on).

(org-link-set-parameters
 "iphoto"
 :follow (lambda (path) (browse-url (iphoto-map-url path)))
 :export 'iphoto-export
 :help-echo "Click me to see where this photo is on a map."
 :activate-func 'iphoto-thumbnails)

So finally, here is the mysterious image.

map

Now, in org-mode, I see the image in an overlay, and I can toggle it on and off. If I click on the image, it opens a browser to Google maps with a pin at the spot I took it. When I export it, it wraps the image in a <figure> tag, and puts a url in the caption to the map. If you click on it, and zoom out, you will see this is a picture of the Nāpali Coast on Kauai in Hawaii, and I was in fact out at sea when I took the picture. It was spectacular. Here is another one. This one is a little more obvious with the zoom. Here, I was on land. Since this link is bracketed, it does not show the overlay however in the org-file.

Another vacation picture, this time with a caption. map

Overall, this was easier than I expected. It might be faster to outsource reading the exif data to some dedicate library, perhaps in python that would return everything you want in an easy to parse json data structure. The speed of computing the url is only annoying when you export or click on the links though.

I didn't build in any error handling, e.g. if you do this on a photo with no GPS data it will probably not handle it gracefully. I also haven't tested this on any other images, e.g. south of the equator, from other cameras, etc. I assume this exif data is pretty standard, but it is a wild world out there… It would still be nice to find a way to get a string representing the nearest known location somehow, that would help the caption be more useful.

There is one little footnote to speak of, and that is I had to do a little hackery to get this to work with my blog machinery. You can see what it is in the org-source, I buried it in a noexport subheading, because it isn't that interesting in the grand scheme of things. It was just necessary because I export these org-files to blogofile, which then builds the html pages, instead of just exporting them. The images have to be copied to a source directory, and paths changed in the html to point to them. See, boring stuff. Otherwise, the code above should be fine for regular org and html files! Now, my vacation is over so it is time to get back to work.

Copyright (C) 2018 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.1.13

Read and Post Comments

Literate programming with python doctests

| categories: python, noweb, orgmode | tags: | View Comments

On the org-mode mailing list we had a nice discussion about using noweb and org-mode in literate programming. The results of that discussion were blogged about here. I thought of a different application of this for making doctests in Python functions. I have to confess I have never liked these because I have always thought they were a pain to write since you basically have to put code and results into a docstring. The ideas developed in the discussion above led me to think of a new way to write these that seems totally reasonable.

The idea is just to put noweb placeholders in the function docstring for the doctests. The placeholders will be expanded when you tangle the file, and they will get their contents from other src-blocks where you have written and run examples to test them.

This video might make the rest of this post easier to follow:

I will illustrate the idea using org-mode and the ob-ipython I have in scimax. The defaults of my ob-ipython setup are not useful for this example because it puts the execution count and mime types of output in the output. These are not observed in a REPL, and so we turn this off by setting these variables.

(setq ob-ipython-suppress-execution-count t
      ob-ipython-show-mime-types nil)

Now, we make an example function that takes a single argument and returns one divided by that argument. This block is runnable, and the function is then defined in the jupyter kernel. The docstring contains several noweb references to doctest blocks we define later. For now, they don't do anything. See The noweb doctest block section for the block that is used to expand these. This block also has a tangle header which indicates the file to tangle the results to. When I run this block, it is sent to a Jupyter kernel and saved in memory for use in subsequent blocks.

Here is the block with no noweb expansion. Note that this is easier to read in the original org source than it is to read in the published blog format.

def func(a):
    """A function to divide one by a.

    <<doctest("doctest-1")>>

    <<doctest("doctest-2")>>

    <<doctest("doctest-3")>>

    Returns: 1 / a.
    """
    return 1 / a

Now, we can write a series of named blocks that define various tests we might want to use as doctests. You can run these blocks here, and verify they are correct. Later, when we tangle the document, these will be incorporated into the tangled file in the docstring we defined above.

func(5) == 0.2
True

This next test will raise an Exception, and we just run it to make sure it does.

func(0)

ZeroDivisionErrorTraceback (most recent call last)
<ipython-input-6-ba0cd5a88f0a> in <module>()
----> 1 func(0)

<ipython-input-1-eafd354a3163> in func(a)
     18     Returns: 1 / a.
     19     """
---> 20     return 1 / a

ZeroDivisionError: division by zero

This is just a doctest with indentation to show how it is used.

for i in range(1, 4):
    print(func(i))
1.0
0.5
0.3333333333333333


That concludes the examples I want incorporated into the doctests. Each one of these blocks has a name, which is used as an argument to the noweb references in the function docstring.

1 Add a way to run the tests

This is a common idiom to enable easy running of the doctests. This will get tangled out to the file.

if __name__ == "__main__":
    import doctest
    doctest.testmod()

2 Tangle the file

So far, the Python code we have written only exists in the org-file, and in memory. Tangling is the extraction of the code into a code file.

We run this command, which extracts the code blocks marked for tangling, and expands the noweb references in them.

(org-babel-tangle)
test.py

Here is what we get:

def func(a):
    """A function to divide one by a.

    >>> func(5) == 0.2
    True

    >>> func(0)
    Traceback (most recent call last):
    ZeroDivisionError: division by zero

    >>> for i in range(1, 4):
    ...     print(func(i))
    1.0
    0.5
    0.3333333333333333


    Returns: 1 / a.
    """
    return 1 / a

if __name__ == "__main__":
    import doctest
    doctest.testmod()

That looks like a reasonable python file. You can see the doctest blocks have been inserted into the docstring, as desired. The proof of course is that we can run these doctests, and use the python module. We show that next.

3 Run the tests

Now, we can check if the tests pass in a fresh run (i.e. not using the version stored in the jupyter kernel.) The standard way to run the doctests is like this:

python test.py -v

Well, that's it! It worked fine. Now we have a python file we can import and reuse, with some doctests that show how it works. For example, here it is in a small Python script.

from test import func
print(func(3))
0.3333333333333333

There are surely some caveats to keep in mind here. This was just a simple proof of concept idea that isn't tested beyond this example. I don't know how many complexities would arise from more complex doctests. But, it seems like a good idea to continue pursuing if you like using doctests, and like using org-mode and interactive/literate programming techniques.

It is definitely an interesting way to use noweb to build up better code files in my opinion.

4 The noweb doctest block

These blocks are used in the noweb expansions. Each block takes a variable which is the name of a block. This block grabs the body of the named src block and formats it as if it was in a REPL.

We also grab the results of the named block and format it for the doctest. We use a heuristic to detect Tracebacks and modify the output to be consistent with it. In that case we assume the relevant Traceback is on the last line.

Admittedly, this does some fragile feeling things, like trimming whitespace here and there to remove blank lines, and quoting quotes (which was not actually used in this example), and removing the ": " pieces of ob-ipython results. Probably other ways of running the src-blocks would not be that suitable for this.

(org-babel-goto-named-src-block name)
(let* ((src (s-trim-right (org-element-property :value (org-element-context))))
       (src-lines (split-string src "\n"))
       body result)
  (setq body
        (s-trim-right
         (s-concat ">>> " (car src-lines) "\n"
                   (s-join "\n" (mapcar (lambda (s)
                                          (concat "... " s))
                                        (cdr src-lines))))))
  ;; now the results
  (org-babel-goto-named-result name)
  (let ((result (org-element-context)))
    (setq result
          (thread-last
              (buffer-substring (org-element-property :contents-begin result)
                                (org-element-property :contents-end result))
            (s-trim)
            ;; remove ": " from beginning of lines
            (replace-regexp-in-string "^: *" "")
            ;; quote quotes
            (replace-regexp-in-string "\\\"" "\\\\\"")))
    (when (string-match "Traceback" result)
      (setq result (format
                    "Traceback (most recent call last):\n%s"
                    (car (last (split-string result "\n"))))))
    (concat body "\n" result)))

Copyright (C) 2018 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.1.13

Read and Post Comments

« Previous Page -- Next Page »