Generating an alphabetized list of collaborators from the past five years

Posted February 20, 2016 at 05:03 PM | categories: python, scopus | tags:

Updated February 20, 2016 at 05:24 PM

Almost every proposal I write requires some list of my coauthors from the past several years. Some want the list alphabetized, and some want affiliations too. It has always bothered me to make this list, mostly because it is tedious, and it seems like something that should not be hard to generate. It turns out it is not too hard. I have been developing a Python interface (https://github.com/jkitchin/scopus) to Scopus that more or less enables me to script this.

Scopus is not free. You need either a license, or institutional access to use it. Here is the strategy to generate my list of coauthors. First, we need to get the articles for the past 5 years that are mine, and for each paper we get the coauthors. I use my Scopus author id in the query, and then sort the names alphabetically into a table. Then, I use that table as input to a second code block that does an author query in Scopus to get the current affiliations. Here is the code.

from scopus.scopus_api import ScopusAbstract
from scopus.scopus_search import ScopusSearch

s = ScopusSearch('AU-ID(7004212771) AND PUBYEAR > 2010')

coauthors = {}
for eid in s.EIDS:
    ab = ScopusAbstract(eid)
    for au in ab.authors:
        if au.auid not in coauthors and au.auid != '7004212771':
            coauthors[au.auid] = au.indexed_name

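# Sorted [auid, name] pairs. In org-mode this list is the block's return value
# and is passed to the next block as `data`; here we assign it directly.
data = sorted([[auid, name] for auid, name in coauthors.items()], key=lambda x: x[1])
for auid, name in data:
    print('{0}\t{1}'.format(auid, name))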

52463103500	Akhade S.A.
6506329719	Albenze E.
36472906200	Alesi W.R.
56963752500	Anna S.L.
56522803500	Boes J.R.
26433085700	Calle-Vallejo F.
54973276000	Chao R.
7201800897	Collins T.J.
54883867200	Curnan M.T.
7003584159	Damodaran K.
55328415000	Demeter E.L.
37005464900	Dsilva C.
18037364800	Egbebi A.
35603120700	Eslick J.C.
56673468200	Fan Q.
24404182600	Frenkel A.I.
35514271900	Gellman A.J.
12803603300	Gerdes K.
54585146800	Gumuslu G.
55569145100	Hallenbeck A.P.
24316829300	Hansen H.A.
56009239000	Hilburg S.L.
55676869000	Hopkinson D.
56674328100	Illes S.M.
23479647900	Inoglu N.G.
6603398169	Jaramillo T.F.
8054222900	Joshi Y.V.
47962378000	Keturakis C.
57056061900	Kondratyuk P.
55391991800	Kondratyuk P.
7006205398	Koper M.T.M.
23004637900	Kusuma V.A.
35787409400	Landon J.
55005205100	Lee A.S.
6701399651	Luebke D.R.
35491189200	Man I.C.
27467500000	Mantripragada H.
55373026900	Mao J.X.
55210428500	Marks A.
27667815700	Martinez J.I.
56071079300	Mehta P.
56673592900	Michael J.D.
55772901000	Miller D.C.
7501599910	Miller J.B.
26032231600	Miller S.D.
35576929100	Morreale B.
55308251800	Munprom R.
14036290400	Myers C.R.
7007042214	Norskov J.K.
24081524800	Nulwala H.B.
56347288000	Petrova R.
7006208748	Pushkarev V.V.
56591664500	Raman S.
7004217247	Resnik K.P.
47962694800	Richard Alesi Jr. W.
9742604300	Rossmeisl J.
7201763336	Rubin E.S.
6602471339	Sabolsky E.M.
7004541416	Salvador P.A.
22981503200	Shi W.
55885836600	Siefert N.S.
25224517700	Su H.-Y.
57016792200	Thirumalai H.
8724572500	Thompson R.L.
8238710700	Vasic R.
37081979100	Versteeg P.
7006804734	Wachs I.E.
6701692232	Washburn N.R.
56542538800	Watkins J.D.
55569461200	Xu Z.
56424861600	Yin C.
56969809500	Zhou X.

It is worth inspecting this list for duplicates. I see at least two duplicates. That is a limitation of almost every indexing service I have seen. Names are hard to disambiguate. I will live with it. Now, we will use another query to get affiliations, and the names. Since we use a sorted list from above, these names are in alphabetical order. We exclude co-authors from Carnegie Mellon University since these are often my students, or colleagues, and they are obvious conflicts of interest for proposal reviewing anyway. I split the current affiliation on a comma, since it appears the institution comes first, followed by the department. We only need an institution here.

from scopus.scopus_author import ScopusAuthor

coauthors = [ScopusAuthor(auid) for auid, name in data]

print(', '.join(['{0} ({1})'.format(au.name, au.current_affiliation.split(',')[0])
                 for au in coauthors
                 if au.current_affiliation.split(',')[0] != 'Carnegie Mellon University']))

Sneha A. Akhade (Pennsylvania State University), Erik J. Albenze (National Energy Technology Laboratory), Federico Calle-Vallejo (Leiden Institute of Chemistry), Robin Chao (National Energy Technology Laboratory), Krishnan V. Damodaran (University of Pittsburgh), Carmeline J. Dsilva (Princeton University), Adefemi A. Egbebi (URS), John C. Eslick (National Energy Technology Laboratory), Anatoly I. Frenkel (Yeshiva University), Kirk R. Gerdes (National Energy Technology Laboratory), Heine Anton Hansen (Danmarks Tekniske Universitet), David P. Hopkinson (National Energy Technology Laboratory), Thomas Francisco Jaramillo (Fermi National Accelerator Laboratory), Yogesh V. Joshi (Exxon Mobil Research and Engineering), Christopher J. Keturakis (Lehigh University), Marc T M Koper (Leiden Institute of Chemistry), Victor A. Kusuma (National Energy Technology Laboratory), James Landon (University of Kentucky), David R. Luebke (Liquid Ion Solutions), Isabelacostinela Man (Universitatea din Bucuresti), James X. Mao (University of Pittsburgh), José Ignacio Martínez (CSIC - Instituto de Ciencia de Materiales de Madrid (ICMM)), David C M Miller (National Energy Technology Laboratory), Bryan D. Morreale (National Energy Technology Laboratory), Christina R. Myers (National Energy Technology Laboratory), Jens Kehlet Nørskov (Stanford Linear Accelerator Center), Rumyana V. Petrova (International Iberian Nanotechnology Laboratory), Vladimir V. Pushkarev (Dow Corning Corporation), Sumathy Raman (Exxon Mobil Research and Engineering), Kevin P. Resnik (URS), Walter Richard Alesi (National Energy Technology Laboratory), Jan Rossmeisl (Kobenhavns Universitet), Edward M. Sabolsky (West Virginia University), Wei Shi (University of Pittsburgh), Nicholas S. Siefert (National Energy Technology Laboratory), Haiyan Su (Dalian Institute of Chemical Physics Chinese Academy of Sciences), Robert Lee Thompson (University of Pittsburgh Medical Center), Relja Vasić (SUNY College of Nanoscale Science and Engineering), Israel E. Wachs (Lehigh University), John D. Watkins (National Energy Technology Laboratory), Chunrong Yin (United States Department of Energy), Xu Zhou (Liquid Ion Solutions)

This is pretty sweet. I could pretty easily create a query that had all the PIs on a proposal, and alphabetize everyone's coauthors, or print them to a CSV file for import to Excel, or whatever format is required for conflict of interest reporting. The list is not perfect, but it is easy to manually fix it here.
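As a rough illustration of the CSV export and the duplicate inspection mentioned above, here is a minimal sketch using only the standard library. It assumes Python 3 and a handful of rows copied from the coauthor table; the helper names `possible_duplicates` and `to_csv` are made up for this example and are not part of the scopus package.

```python
import csv
import io
from collections import defaultdict

# A few (Scopus author id, indexed name) rows copied from the table above.
rows = [
    ("57056061900", "Kondratyuk P."),
    ("55391991800", "Kondratyuk P."),
    ("36472906200", "Alesi W.R."),
    ("47962694800", "Richard Alesi Jr. W."),
    ("7006205398", "Koper M.T.M."),
]


def possible_duplicates(rows):
    """Return {name: [author ids]} for names that map to more than one id."""
    by_name = defaultdict(list)
    for auid, name in rows:
        by_name[name].append(auid)
    return {name: ids for name, ids in by_name.items() if len(ids) > 1}


def to_csv(rows):
    """Return the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["auid", "indexed_name"])
    writer.writerows(rows)
    return buf.getvalue()


dups = possible_duplicates(rows)
csv_text = to_csv(rows)
print(dups)
print(csv_text)
```

Exact name matching only catches the repeated "Kondratyuk P." entry. A variant spelling such as "Alesi W.R." versus "Richard Alesi Jr. W." still has to be spotted by eye.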

That little bit of code is wrapped in a command-line utility in the scopus Python package. Just run it every time you need an updated list of coauthors! It isn't super flexible for now, e.g. excluding multiple affiliations or including multiple authors isn't fully supported. You use it like this.

./scopus_coauthors 7004212771 2010 --exclude-affiliation="Carnegie Mellon University"

Sneha A. Akhade (Pennsylvania State University), Erik J. Albenze (National Energy Technology Laboratory), Federico Calle-Vallejo (Leiden Institute of Chemistry), Robin Chao (National Energy Technology Laboratory), Krishnan V. Damodaran (University of Pittsburgh), Carmeline J. Dsilva (Princeton University), Adefemi A. Egbebi (URS), John C. Eslick (National Energy Technology Laboratory), Anatoly I. Frenkel (Yeshiva University), Kirk R. Gerdes (National Energy Technology Laboratory), Heine Anton Hansen (Danmarks Tekniske Universitet), David P. Hopkinson (National Energy Technology Laboratory), Thomas Francisco Jaramillo (Fermi National Accelerator Laboratory), Yogesh V. Joshi (Exxon Mobil Research and Engineering), Christopher J. Keturakis (Lehigh University), Marc T M Koper (Leiden Institute of Chemistry), Victor A. Kusuma (National Energy Technology Laboratory), James Landon (University of Kentucky), David R. Luebke (Liquid Ion Solutions), Isabelacostinela Man (Universitatea din Bucuresti), James X. Mao (University of Pittsburgh), José Ignacio Martínez (CSIC - Instituto de Ciencia de Materiales de Madrid (ICMM)), David C M Miller (National Energy Technology Laboratory), Bryan D. Morreale (National Energy Technology Laboratory), Christina R. Myers (National Energy Technology Laboratory), Jens Kehlet Nørskov (Stanford Linear Accelerator Center), Rumyana V. Petrova (International Iberian Nanotechnology Laboratory), Vladimir V. Pushkarev (Dow Corning Corporation), Sumathy Raman (Exxon Mobil Research and Engineering), Kevin P. Resnik (URS), Walter Richard Alesi (National Energy Technology Laboratory), Jan Rossmeisl (Kobenhavns Universitet), Edward M. Sabolsky (West Virginia University), Wei Shi (University of Pittsburgh), Nicholas S. Siefert (National Energy Technology Laboratory), Haiyan Su (Dalian Institute of Chemical Physics Chinese Academy of Sciences), Robert Lee Thompson (University of Pittsburgh Medical Center), Relja Vasić (SUNY College of Nanoscale Science and Engineering), Israel E. Wachs (Lehigh University), John D. Watkins (National Energy Technology Laboratory), Chunrong Yin (United States Department of Energy), Xu Zhou (Liquid Ion Solutions)
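For what it is worth, here is one way a command-line front end could accept several excluded affiliations. This is a hypothetical sketch written for this post with argparse (Python 3 assumed); it is not the code of the actual scopus_coauthors utility, and the `keep` helper is an invented name. It reuses the same "split on the first comma" rule for the institution as the code above.

```python
import argparse


def build_parser():
    """Hypothetical parser: an author id, a year, and repeatable exclusions."""
    parser = argparse.ArgumentParser(prog="scopus_coauthors")
    parser.add_argument("auid", help="Scopus author id")
    parser.add_argument("year", type=int,
                        help="include papers published after this year")
    # action="append" collects every occurrence of the flag into a list.
    parser.add_argument("--exclude-affiliation", action="append", default=[],
                        help="institution to leave out (may be repeated)")
    return parser


def keep(affiliation, excluded):
    """True if the institution (text before the first comma) is not excluded."""
    institution = affiliation.split(",")[0].strip()
    return institution not in excluded


args = build_parser().parse_args([
    "7004212771", "2010",
    "--exclude-affiliation=Carnegie Mellon University",
    "--exclude-affiliation=URS",
])
print(args.exclude_affiliation)
print(keep("University of Pittsburgh, Department of Chemistry",
           args.exclude_affiliation))
```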

Org-mode version = 8.2.10

Interactive figures in blog posts with mpld3

Posted February 08, 2016 at 07:33 AM | categories: python, interactive, plotting | tags:

Continuing the exploration of interactive figures, today we consider the Python plotting library mpld3. We will again use our own published data. We wrote this great paper on core level shifts (CLS) in Cu-Pd alloys [boes-2015-core-cu]. I want an interactive figure that shows the name of the calculation on each point as a tooltip. This data is all stored in the supporting information file, and you can see how we use it here. This figure shows how the core level shift of a Cu atom changes depending on the number of nearest neighbor Cu atoms. Just hover your mouse over a point to see the name and CLS for that point.

1 Data and code

You can check out our preprint at https://github.com/KitchinHUB/kitchingroup-51. We are going to adapt the code to make Figure 6a in the manuscript interactive. The code needed a somewhat surprising amount of adaptation. Apparently the ase database interface has changed a lot since we wrote that paper, so the code here looks a bit different than what we published. The biggest difference is due to name-mangling: each key that started with a number now starts with _, and periods are replaced by _ also. The rest of the script is nearly unchanged. At the end is the very small bit of mpld3 code that generates the figure for html. We will add tooltips onto the data points to show the name associated with each one. Here is the code.

import matplotlib.pyplot as plt
from ase.db import connect

# loads the ASE database and select certain keywords
db = connect('~/Desktop/cappa/kitchingroup-51/supporting-information/data.json')

keys = ['bcc', 'GS', '_54atom', 'ensam']

CLS, IMP, labels = [], [], []
for k in db.select(keys + ['_1cl']):
    name = k.keywords[-2]

    Cu0 = db.select('bcc,GS,_72atom,_0cl,_1_00Cu').next().energy
    Cu1 = db.select('bcc,GS,_72atom,_1cl,_1_00Cu').next().energy
    x0 = db.select(','.join(keys + [name, '_0cl'])).next().energy
    x1 = k.energy

    cls0 = x0 - Cu0
    cls1 = x1 - Cu1

    IMP.append(int(name[1]))
    CLS.append(cls1 - cls0)
    labels += ['{0} ({1}, {2})'.format(name, int(name[1]), cls1 - cls0)]

Cu0 = db.select(','.join(['bcc', 'GS', '_72atom',
                          '_0cl', '_1_00Cu'])).next().energy
Cu1 = db.select(','.join(['bcc', 'GS', '_72atom',
                          '_1cl', '_1_00Cu'])).next().energy

x0 = db.select(','.join(['bcc', 'GS', '_54atom',
                         '_0cl', '_1'])).next().energy
x1 = db.select(','.join(['bcc', 'GS', '_54atom',
                         '_1cl', '_1'])).next().energy

cls0 = x0 - Cu0
cls1 = x1 - Cu1

IMP.append(1)
CLS.append(cls1 - cls0)
labels += ['(1, {0})'.format(cls1 - cls0)]

Cu0 = db.select(','.join(['bcc', 'GS', '_72atom',
                          '_0cl', '_1_00Cu'])).next().energy
Cu1 = db.select(','.join(['bcc', 'GS', '_72atom',
                          '_1cl', '_1_00Cu'])).next().energy

x0 = db.select(','.join(['bcc', 'GS', '_54atom',
                         '_0cl', '_0'])).next().energy
x1 = db.select(','.join(['bcc', 'GS', '_54atom',
                         '_1cl', '_0'])).next().energy

cls0 = x0 - Cu0
cls1 = x1 - Cu1

IMP.append(0)
CLS.append(cls1 - cls0)
labels += ['(0, {0})'.format(cls1 - cls0)]

fig = plt.figure()

p = plt.scatter(IMP, CLS, c='g', marker='o', s=25)
ax1 = plt.gca()
ax1.set_ylim(-1.15, -0.6)
ax1.set_xlim(-0.1, 5.1)

ax1.set_xlabel('# Cu Nearest neighbors')
ax1.set_ylabel('Cu 2p(3/2) Core Level Shift (eV)')

ax1.set_title('Hover over a point to see the calculation name')

# Now the mpld3 stuff.
import mpld3
from mpld3 import plugins

tooltip = plugins.PointHTMLTooltip(p, labels, voffset=0, hoffset=10)
plugins.connect(fig, tooltip)

print mpld3.fig_to_html(fig)
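The key renaming described before the script can be written down as a small function. This is only my reading of the rule as stated in the text (a leading digit gets a `_` prefix, periods become `_`); the function name `mangle` is hypothetical and this is not code taken from ase.

```python
def mangle(key):
    """Apply the renaming rule described in the text to one keyword."""
    key = key.replace(".", "_")  # periods are replaced by underscores
    if key[:1].isdigit():        # keys that start with a number get a leading _
        key = "_" + key
    return key


for original in ["bcc", "54atom", "1cl", "1.00Cu"]:
    print(original, "->", mangle(original))
```

Under this rule a published keyword like `1.00Cu` corresponds to the `_1_00Cu` used in the queries above, and `54atom` corresponds to `_54atom`.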

I like this workflow pretty well. It seems less functional than plotly and Bokeh (e.g. it does not look like you can export the data from the html here), but it is well integrated with Matplotlib and with my blogging style, and it does not require a server or an account. The code outputs html that is self-contained in the body of the html. The smooth integration with Matplotlib means I could potentially have static images in org-mode and dynamic images in HTML. Overall, this is a nice tool for making interactive plots in blog posts.

2 References

Bibliography

[boes-2015-core-cu] Jacob Boes, Peter Kondratyuk, Chunrong Yin, James Miller, Andrew Gellman & John Kitchin, Core Level Shifts in Cu-Pd Alloys As a Function of Bulk Composition and Structure, Surface Science, 640, 127-132 (2015).


Interactive Bokeh plots in HTML

Posted February 07, 2016 at 10:53 AM | categories: python, interactive, plotting | tags:

Updated February 07, 2016 at 11:24 AM

1. The data and code
2. References

In our last post we examined the use of plotly to generate interactive plots in HTML. Today we expand the idea, and use Bokeh. One potential issue with plotly is the need for an account and API-key, some limitations on how many times a graph can be viewed per day (although I should aspire to have my graphs viewed 1000+ times a day!), and who knows what happens to the graphs if plotly ever goes out of business. While the static images we usually use have limited utility, at least they stick around.

So, today we look at Bokeh, which allows you to embed some JSON data in your HTML, which is made interactive by your browser with more JavaScript magic. We get straight to the image here so you can see what this is all about. Briefly, this data shows trends (or the lack of them) in the adsorption energies of some atoms on the atop and fcc sites of several transition metals as a function of adsorbate coverage [xu-2014-probin-cover]. The code to do this is found here.

Using Bokeh does not integrate very smoothly with my blog workflow, which only generates the body of HTML posts. Bokeh needs some javascript injected into the header to work. To get around that, I show the plot in a frame here. You can see a full HTML version here: bokeh-plot.html.

This is somewhat similar to the plotly concept. The data is embedded in the html in this case, which is different. For very large plots, I actually had some trouble exporting the blog post (it was taking a long time to export and I killed it)! I suspect that is a limitation of the org-mode exporter though, because I could save the html files from Python and view them fine. I also noted that having all the javascript in the org-file makes font-lock very slow. It would be better to generate that only on export.

Note to make this work, we need these lines in our HTML header:

#+HTML_HEAD: <link rel="stylesheet" href="http://cdn.pydata.org/bokeh/release/bokeh-0.11.1.min.css" type="text/css" />
#+HTML_HEAD: <script type="text/javascript" src="http://cdn.pydata.org/bokeh/release/bokeh-0.11.1.min.js"></script>

Since we do not host those locally, if they ever disappear, our plots will not show ;(

1 The data and code

We will get the data from our paper on coverage dependent adsorption energies [xu-2014-probin-cover]. There are some data rich figures there that would benefit from some interactivity. You can get the data here: http://pubs.acs.org/doi/suppl/10.1021/jp508805h. Extract the supporting-information.org and energies.json files to follow along. We will make Figure 2a in the SI document here, and make it interactive with hover tooltips.

import json

from collections import OrderedDict
from bokeh import mpl
from bokeh.plotting import *
from bokeh.models import HoverTool
from bokeh.embed import components

with open('/users/jkitchin/Desktop/energies.json', 'r') as f:
    data = json.load(f)


# color for metal
# letter symbol for adsorbate
colors = {'Cu':'Orange',
          'Ag':'Silver',
          'Au':'Yellow',
          'Pd':'Green',
          'Pt':'Red',
          'Rh':'Blue',
          'Ir':'Purple'}

all_ads = ['O', 'S']

TOOLS="crosshair,pan,wheel_zoom,box_zoom,reset,hover,previewsave"
p = figure(title="Correlation between atop and fcc sites", tools=TOOLS)

for metal in ['Rh', 'Pd', 'Cu', 'Ag']:
    for adsorbate in all_ads:
        E1, E2, labels = [], [], []
        for coverage in '0.25', '0.5', '0.75', '1.0':
            if (isinstance(data[metal][adsorbate]['ontop'][coverage], float) and
                isinstance(data[metal][adsorbate]['fcc'][coverage], float)):
                E1.append(data[metal][adsorbate]['ontop'][coverage])
                E2.append(data[metal][adsorbate]['fcc'][coverage])
                # build the label with the point so the two lists stay
                # aligned when a coverage is skipped
                labels.append('{0}-{1} {2} ML'.format(metal, adsorbate,
                                                      coverage))
        # one data source shared by the line and the markers
        source = ColumnDataSource(data={'x': E1,
                                        'y': E2,
                                        'label': labels})
        p.line('x', 'y', color=colors[metal], source=source)
        p.circle('x', 'y', color=colors[metal], source=source)


hover = p.select({'type': HoverTool})
hover.tooltips = OrderedDict([("(atop,fcc)", "(@x, @y)"),
                              ("label", "@label")])

p.xaxis.axis_label = 'Adsorption energy on the atop site'
p.yaxis.axis_label = 'Adsorption energy on the fcc site'

script, div = components(p)
script = '\n'.join(['#+HTML_HEAD_EXTRA: ' + line for line in script.split('\n')])

print '''{script}

#+BEGIN_HTML
<a name="figure"></a>
{div}
#+END_HTML
'''.format(script=script, div=div)
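The last few lines of the script are just string handling, so they can be tried without Bokeh installed. Here is a minimal Python 3 sketch of that step with stand-in strings in place of the real `components(p)` output; the function name `org_embed` and the sample `script` and `div` values are made up for illustration.

```python
def org_embed(script, div, anchor="figure"):
    """Prefix each script line for the org header and wrap the div as HTML."""
    head = "\n".join("#+HTML_HEAD_EXTRA: " + line
                     for line in script.split("\n"))
    body = "\n".join(["#+BEGIN_HTML",
                      '<a name="{0}"></a>'.format(anchor),
                      div,
                      "#+END_HTML"])
    return head + "\n\n" + body + "\n"


# Stand-ins for what bokeh.embed.components would return.
script = '<script type="text/javascript">\nvar plot = 1;\n</script>'
div = '<div class="plotdiv" id="example"></div>'

output = org_embed(script, div)
print(output)
```

Every line of the script has to carry its own `#+HTML_HEAD_EXTRA:` prefix because org-mode reads these keywords one line at a time.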

2 References

Bibliography

[xu-2014-probin-cover] Zhongnan Xu & John Kitchin, Probing the Coverage Dependence of Site and Adsorbate Configurational Correlations on (111) Surfaces of Late Transition Metals, J. Phys. Chem. C, 118(44), 25597-25602 (2014).


Interactive plots in HTML with Plotly

Posted February 06, 2016 at 12:44 PM | categories: python, interactive, plotting | tags:

Most of the plots in this blog are static. Today, I look at making them interactive. I will use https://plot.ly for this. I want to use some data from a paper we published on the relative stabilities of oxide polymorphs [mehta-2015-ident-poten]. We will make an interactive figure showing the relative stabilities of the RuO₂ polymorphs. When you hover on a point, it will show you which polymorph the point refers to. Let's see the figure first here. If you think it's interesting, read on to see how we made it!

We get our data source here: http://pubs.acs.org/doi/suppl/10.1021/am4059149/suppl_file/am4059149_si_001.pdf.

Now, we extract the data files:

pdftk ~/Desktop/am4059149_si_001.pdf  unpack_files

That extracts a json file called supporting-information.json. We use it as suggested in the SI pdf to plot the equations of state for RuO₂ for several polymorphs.

# coding=utf-8

import plotly.plotly as py
import plotly.graph_objs as go
import plotly.tools as tls
import numpy as np

import json
import matplotlib.pyplot as plt
from ase.utils.eos import EquationOfState
with open('supporting-information.json', 'rb') as f:
    d = json.loads(f.read())

BO2 = 'RuO2'
xc = 'PBE'

layout = go.Layout(title='Energy vs. Volume for RuO<sub>2</sub> polymorphs',
                   xaxis=dict(title='Volume (Å<sup>3</sup>)'),
                   yaxis=dict(title='Energy (eV)'))

traces = []

for polymorph in ['rutile','anatase','brookite','columbite','pyrite','fluorite']:

    # number of atoms in the unit cell - used to normalize
    natoms= len(d[BO2][polymorph][xc]['EOS']['calculations']
                [0]['atoms']['symbols'])
    volumes = [entry['data']['volume']*3./natoms for entry in
               d[BO2][polymorph][xc]['EOS']['calculations']]
    energies =  [entry['data']['total_energy']*3./natoms for entry in
                 d[BO2][polymorph][xc]['EOS']['calculations']]

    trace = go.Scatter(x=np.array(volumes),
                       y=np.array(energies),
                       mode='lines+markers',
                       name=polymorph,
                       text=polymorph)

    traces += [trace]

fig = go.Figure(data=traces, layout=layout)
plot_url = py.plot(fig, filename='ruo2-2')

print tls.get_embed(plot_url)

Pretty nice, now we should have an interactive plot in our browser with the data points labeled with tags, zooming, etc… That is nice for the blog. It isn't so nice for daily work, as there is no visual version of the plot in my org-file. Of course, I can visit the url to see the plot in my browser, it is just different from what I am used to. For everyone else, this is probably better. It looks like you can actually get the data from the web page, including some minimal analysis like regression, and save your view to an image! That could be pretty nice for some data sets.
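One detail in the script worth spelling out is the factor `3./natoms`. RuO₂ has three atoms per formula unit, so multiplying a per-cell quantity by 3/natoms puts every polymorph on a per-formula-unit basis regardless of how many atoms its unit cell holds. Here is a small sketch of that arithmetic; the cell values are invented numbers, not data from the paper, and `per_formula_unit` is a name I made up.

```python
def per_formula_unit(value, natoms, atoms_per_formula_unit=3):
    """Scale a per-cell quantity to one formula unit (3 atoms for RuO2)."""
    return value * float(atoms_per_formula_unit) / natoms


# Hypothetical cell with 6 atoms, i.e. two RuO2 formula units.
cell_volume = 62.0    # made-up volume in cubic angstroms
cell_energy = -44.0   # made-up total energy in eV

volume_fu = per_formula_unit(cell_volume, natoms=6)
energy_fu = per_formula_unit(cell_energy, natoms=6)
print(volume_fu, energy_fu)
```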

1 Using Plotly yourself

First, go to https://plot.ly and sign up for an account. You will want to register your API key like this, which will save it in a file for your convenience. Then you can do things like I did above too.

import plotly.tools as tls
tls.set_credentials_file(username='jkitchin', api_key='xxxxxxx')

2 References

Bibliography

[mehta-2015-ident-poten] Prateek Mehta, Paul Salvador & John Kitchin, Identifying Potential BO₂ Oxide Polymorphs for Epitaxial Growth Candidates, ACS Appl. Mater. Interfaces, 6(5), 3630-3639 (2015).


Asynchronously running python blocks in org-mode

Posted November 20, 2015 at 11:46 AM | categories: orgmode, emacs, python | tags:

Updated November 20, 2015 at 07:30 PM

If you run long Python blocks from org-mode, you might want to keep working while it runs. Currently Emacs gets blocked and you have to wait patiently. In this post we consider some ways to avoid this that run our code asynchronously, but still put results where they belong in the org-buffer.

This is a long post. You may want to see the video: https://www.youtube.com/watch?v=VDyoN8yipSE, or skip to the end where the best and final version is shown.

1 The async module

Here we consider an approach that uses the https://github.com/jwiegley/emacs-async module. The idea is to tangle the Python block at point to a temp file, then asynchronously run it. We capture the output and put it back in the buffer. We use a uuid to find the place to put the results in org-mode format. Here is the code that implements this idea.

(require 'async)
(require 's)  ; provides s-split, used below

(defun org-babel-async-execute ()
  "Run a python block at point asynchronously."
  (interactive)

  (let ((current-file (buffer-file-name))
        (uuid (org-id-uuid))
        (temporary-file-directory "./")
        (tempfile (make-temp-file "py-")))

    (org-babel-tangle '(4) tempfile)
    (org-babel-remove-result)
    (save-excursion
      (re-search-forward "#\\+END_SRC")
      (insert (format
               "\n\n#+RESULTS: %s\n: %s"
               (or (org-element-property :name (org-element-context))
                   "")
               uuid)))

    (async-start
     ;; what to start
     `(lambda ()
        ;; now we run the command then cleanup
        (prog1
            (shell-command-to-string (format "python %s" ,tempfile))
          (delete-file ,tempfile)))

     `(lambda (result)
        "Code that runs when the async function finishes."
        (save-window-excursion
          (save-excursion
            (save-restriction
              (with-current-buffer (find-file-noselect ,current-file)
                (goto-char (point-min))
                (re-search-forward ,uuid)
                (beginning-of-line)
                (kill-line)
                (insert (mapconcat
                         (lambda (x)
                           (format ": %s" x))
                         (butlast (s-split "\n" result))
                         "\n"))))))))))

org-babel-async-execute

Here is a block to test it on. We can run the block, and keep on working while the code runs. The results seem to get inserted correctly at the right point even if I am in another window or frame! We don't get easy access to continuous output of the command. This wouldn't work if we close Emacs, but who does that?

print 'hello world'
import time
time.sleep(5)

import os
print os.getcwd()
print time.asctime()

hello world
/Users/jkitchin/blogofile-jkitchin.github.com/_blog
Fri Nov 20 10:17:53 2015
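The placeholder idea is independent of Emacs, so here is a rough Python 3 analogue for readers who do not read elisp. It is a sketch of the same pattern and not a translation of the function above: write the code to a temp file, run it in the background, and swap a uuid placeholder line for the ": "-prefixed output. The names `run_code` and `fill_placeholder` and the sample document text are invented for the example.

```python
import os
import subprocess
import sys
import tempfile
import uuid
from concurrent.futures import ThreadPoolExecutor


def run_code(code):
    """Write code to a temporary file, run it, and return its output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        done = subprocess.run([sys.executable, path],
                              stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT,
                              universal_newlines=True)
        return done.stdout
    finally:
        os.remove(path)  # clean up the temp file, as the elisp version does


def fill_placeholder(document, placeholder, output):
    """Replace the ': <placeholder>' line with ': '-prefixed output lines."""
    results = "\n".join(": " + line for line in output.splitlines())
    return document.replace(": " + placeholder, results)


placeholder = str(uuid.uuid4())
document = "#+END_SRC\n\n#+RESULTS: \n: " + placeholder + "\n\nLater text."
code = "print('hello world')\nprint(1 + 1)\n"

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(run_code, code)
    # Other work could happen here while the script runs in the background.
    document = fill_placeholder(document, placeholder, future.result())

print(document)
```

The uuid matters because the text around the placeholder may change while the job runs. Searching for a unique string finds the right spot no matter what was edited in the meantime.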

There are some limitations to this approach. One of them is it assumes the src block is a stand-alone block that will run on its own. That is usually how I run mine, but I could see having other modules that should be tangled out of a file too. I think the script is being run in the current working directory, so it probably will find any local imports it needs.

You don't get any intermediate feedback on this process. It seems to be possible to do that with a different approach that puts some output in a new buffer, e.g. with start-process. But, you still need some clever code like the async model to know when to insert the results back into this buffer. We consider Emacs processes and sentinels next.

2 Emacs process approach with tangling

We can start a process in Emacs, and attach a sentinel function to it that runs after the process completes. Here is an example of that. We still tangle the src-block here.

(defun org-babel-async-execute ()
  (interactive)
  (let* ((current-file (buffer-file-name))
        (uuid (org-id-uuid))
        (temporary-file-directory "./")
        (tempfile (make-temp-file "py-"))
        (pbuffer (format "*%s*" uuid))
        process)

    (org-babel-tangle '(4) tempfile)
    (org-babel-remove-result)

    (save-excursion
      (re-search-forward "#\\+END_SRC")
      (insert (format
               "\n\n#+RESULTS: %s\n: %s"
               (or (org-element-property :name (org-element-context))
                   "")
               uuid)))

    (setq process (start-process
                   uuid
                   pbuffer
                   "python"
                   tempfile))

    (set-process-sentinel
     process
     `(lambda (process event)
        (when (string= "finished\n" event)
          (delete-file ,tempfile)
          (save-window-excursion
            (save-excursion
              (save-restriction
                (with-current-buffer (find-file-noselect ,current-file)
                  (goto-char (point-min))
                  (re-search-forward ,uuid)
                  (beginning-of-line)
                  (kill-line)
                  (insert (mapconcat
                           (lambda (x)
                             (format ": %s" x))
                           (split-string
                            (with-current-buffer ,pbuffer (buffer-string))
                            "\n")
                           "\n")))))))
        (kill-buffer ,pbuffer)))))

org-babel-async-execute

print 'hello world'
import time
time.sleep(10)

import os
print os.getcwd()
print time.asctime()

hello world
/Users/jkitchin/blogofile-jkitchin.github.com/_blog
Fri Nov 20 10:20:01 2015

That works well from what I can see. There are some limitations. I doubt this will work if you use variables in the src block header. Next we consider an approach that does not do the tangling, and that will show us code output as it goes.

3 Emacs process approach with no tangling

As an alternative to tangling to a file, here we just copy the code to a file and then run it. This allows us to use :var in the header to pass data in at run time. At the moment, this code only supports printed output from code blocks, not the value for :results.

(defun org-babel-async-execute:python ()
  "Execute the python src-block at point asynchronously.
:var headers are supported.
:results output is all that is supported for output.

A new window will pop up showing you the output as it appears,
and the output in that window will be put in the RESULTS section
of the code block."
  (interactive)
  (let* ((current-file (buffer-file-name))
         (uuid (org-id-uuid))
         (code (org-element-property :value (org-element-context)))
         (temporary-file-directory ".")
         (tempfile (make-temp-file "py-"))
         (pbuffer (format "*%s*" uuid))
         (varcmds (org-babel-variable-assignments:python
                   (nth 2 (org-babel-get-src-block-info))))
         process)

    ;; get rid of old results, and put a place-holder for the new results to
    ;; come.
    (org-babel-remove-result)

    (save-excursion
      (re-search-forward "#\\+END_SRC")
      (insert (format
               "\n\n#+RESULTS: %s\n: %s"
               (or (org-element-property :name (org-element-context))
                   "")
               uuid)))

    ;; open the results buffer to see the results in.
    (switch-to-buffer-other-window pbuffer)

    ;; Create temp file containing the code.
    (with-temp-file tempfile
      ;; if there are :var headers insert them.
      (dolist (cmd varcmds)
        (insert cmd)
        (insert "\n"))
      (insert code))

    ;; run the code
    (setq process (start-process
                   uuid
                   pbuffer
                   "python"
                   tempfile))

    ;; when the process is done, run this code to put the results in the
    ;; org-mode buffer.
    (set-process-sentinel
     process
     `(lambda (process event)
        (save-window-excursion
          (save-excursion
            (save-restriction
              (with-current-buffer (find-file-noselect ,current-file)
                (goto-char (point-min))
                (re-search-forward ,uuid)
                (beginning-of-line)
                (kill-line)
                (insert
                 (mapconcat
                  (lambda (x)
                    (format ": %s" x))
                  (butlast (split-string
                            (with-current-buffer
                                ,pbuffer
                              (buffer-string))
                            "\n"))
                  "\n"))))))
        ;; delete the results buffer then delete the tempfile.
        ;; finally, delete the process.
        (when (get-buffer ,pbuffer)
          (kill-buffer ,pbuffer)
          (delete-window))
        (delete-file ,tempfile)
        (delete-process process)))))

org-babel-async-execute:python

Let us try it out again.

print 'hello world'
import time
time.sleep(1)

for i in range(5):
    print i

    time.sleep(0.5)


import os
print os.getcwd()
print time.asctime()

print data

raise IOError('No file!')

hello world
0
1
2
3
4
/Users/jkitchin/blogofile-jkitchin.github.com/_blog
Fri Nov 20 19:30:16 2015
[1, 3]
Traceback (most recent call last):
  File "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/py-84344aa1", line 18, in <module>
    raise IOError('No file!')
IOError: No file!
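The two ingredients of this version, prepending variable assignments to the code and reading output while the process is still running, can also be sketched in plain Python 3. This is an illustration under my own assumptions and not how org-babel does it: the `run_streaming` name, the `variables` dictionary standing in for `:var` headers, and the `on_line` callback are all hypothetical. The `-u` flag asks the child Python for unbuffered output so lines arrive as they are printed.

```python
import os
import subprocess
import sys
import tempfile


def run_streaming(code, variables=None, on_line=print):
    """Run code with variables prepended, reporting each output line as it comes."""
    # Turn {"data": [1, 3]} into the line "data = [1, 3]" ahead of the code.
    header = "".join("{0} = {1!r}\n".format(name, value)
                     for name, value in (variables or {}).items())
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(header + code)
        path = f.name
    lines = []
    try:
        with subprocess.Popen([sys.executable, "-u", path],
                              stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT,
                              universal_newlines=True) as proc:
            for line in proc.stdout:  # yields lines while the process runs
                line = line.rstrip("\n")
                on_line(line)
                lines.append(line)
    finally:
        os.remove(path)
    # Format the collected output the way it appears in a RESULTS section.
    return "\n".join(": " + line for line in lines)


seen = []
results = run_streaming("print(data)\nfor i in range(3):\n    print(i)\n",
                        variables={"data": [1, 3]},
                        on_line=seen.append)
print(results)
```

Because stderr is merged into stdout here, a traceback from the script would be collected along with the normal output, which is roughly what the results above show.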

It works fine for this simple example. We get to see the output as the code executes, which is a pleasant change from the usual way of running python blocks. There is some support for some header arguments, notably the :var header. I don't use :results value in Python, so for now only output is supported. We even support Exceptions in the output finally!

Maybe some org-moders out there can try this and run it through some more rigorous paces?


The Kitchin Research Group

Chemical Engineering at Carnegie Mellon University
