Generating an alphabetized list of collaborators from the past five years

| categories: scopus, python | tags:

Almost every proposal I write requires some list of my coauthors from the past several years. Some want the list alphabetized, and some want affiliations too. It has always bothered me to make this list, mostly because it is tedious, and it seems like something that should not be hard to generate. It turns out it is not too hard. I have been developing a Python interface to Scopus (https://github.com/jkitchin/scopus) that more or less enables me to script this.

Scopus is not free. You need either a license or institutional access to use it. Here is the strategy to generate my list of coauthors. First, we get my articles from the past five years, and for each paper we get the coauthors. I use my Scopus author id in the query, and then sort the names alphabetically into a table. Then, I use that table as input to a second code block that does an author query in Scopus to get the current affiliations. Here is the code.

from scopus.scopus_api import ScopusAbstract
from scopus.scopus_search import ScopusSearch

s = ScopusSearch('AU-ID(7004212771) AND PUBYEAR > 2010')

coauthors = {}
for eid in s.EIDS:
    ab = ScopusAbstract(eid)
    for au in ab.authors:
        if au.auid not in coauthors and au.auid != '7004212771':
            coauthors[au.auid] = au.indexed_name

return sorted([[auid, name] for auid,name in coauthors.items()], key=lambda x:x[1])
52463103500 Akhade S.A.
6506329719 Albenze E.
36472906200 Alesi W.R.
56963752500 Anna S.L.
56522803500 Boes J.R.
26433085700 Calle-Vallejo F.
54973276000 Chao R.
7201800897 Collins T.J.
54883867200 Curnan M.T.
7003584159 Damodaran K.
55328415000 Demeter E.L.
37005464900 Dsilva C.
18037364800 Egbebi A.
35603120700 Eslick J.C.
56673468200 Fan Q.
24404182600 Frenkel A.I.
35514271900 Gellman A.J.
12803603300 Gerdes K.
54585146800 Gumuslu G.
55569145100 Hallenbeck A.P.
24316829300 Hansen H.A.
56009239000 Hilburg S.L.
55676869000 Hopkinson D.
56674328100 Illes S.M.
23479647900 Inoglu N.G.
6603398169 Jaramillo T.F.
8054222900 Joshi Y.V.
47962378000 Keturakis C.
57056061900 Kondratyuk P.
55391991800 Kondratyuk P.
7006205398 Koper M.T.M.
23004637900 Kusuma V.A.
35787409400 Landon J.
55005205100 Lee A.S.
6701399651 Luebke D.R.
35491189200 Man I.C.
27467500000 Mantripragada H.
55373026900 Mao J.X.
55210428500 Marks A.
27667815700 Martinez J.I.
56071079300 Mehta P.
56673592900 Michael J.D.
55772901000 Miller D.C.
7501599910 Miller J.B.
26032231600 Miller S.D.
35576929100 Morreale B.
55308251800 Munprom R.
14036290400 Myers C.R.
7007042214 Norskov J.K.
24081524800 Nulwala H.B.
56347288000 Petrova R.
7006208748 Pushkarev V.V.
56591664500 Raman S.
7004217247 Resnik K.P.
47962694800 Richard Alesi Jr. W.
9742604300 Rossmeisl J.
7201763336 Rubin E.S.
6602471339 Sabolsky E.M.
7004541416 Salvador P.A.
22981503200 Shi W.
55885836600 Siefert N.S.
25224517700 Su H.-Y.
57016792200 Thirumalai H.
8724572500 Thompson R.L.
8238710700 Vasic R.
37081979100 Versteeg P.
7006804734 Wachs I.E.
6701692232 Washburn N.R.
56542538800 Watkins J.D.
55569461200 Xu Z.
56424861600 Yin C.
56969809500 Zhou X.
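A table like this is easy to scan by eye, but a quick script can flag the obvious cases first. Here is a minimal sketch (written for Python 3, and not part of the scopus package) that groups rows by indexed name and reports names that show up under more than one author id. The helper name possible_duplicates is made up for illustration, and the sample rows are copied from the table above. It only catches exact name matches, so variants of the same person spelled differently still need a manual look.

```python
from collections import defaultdict

# A few [auid, indexed_name] rows copied from the table above.
rows = [
    ['57056061900', 'Kondratyuk P.'],
    ['55391991800', 'Kondratyuk P.'],
    ['7006205398', 'Koper M.T.M.'],
]


def possible_duplicates(rows):
    """Return {name: [auid, ...]} for names listed under more than one id.

    Hypothetical helper: it only compares indexed names for exact equality.
    """
    ids_by_name = defaultdict(list)
    for auid, name in rows:
        ids_by_name[name].append(auid)
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}


print(possible_duplicates(rows))
```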

It is worth inspecting this list for duplicates. I see at least two: Kondratyuk P. appears under two author ids, and Alesi W.R. and Richard Alesi Jr. W. look like the same person. That is a limitation of almost every indexing service I have seen. Names are hard to disambiguate. I will live with it. Now, we will use another query to get the names and affiliations. The table above is passed to the next block as the variable data, and since it is already sorted, these names are in alphabetical order. We exclude coauthors from Carnegie Mellon University since these are often my students or colleagues, and they are obvious conflicts of interest for proposal reviewing anyway. I split the current affiliation on a comma, since it appears the institution comes first, followed by the department. We only need the institution here.

from scopus.scopus_author import ScopusAuthor

coauthors = [ScopusAuthor(auid) for auid, name in data]

print(', '.join(['{0} ({1})'.format(au.name, au.current_affiliation.split(',')[0])
                 for au in coauthors
                 if au.current_affiliation.split(',')[0] != 'Carnegie Mellon University']))
Sneha A. Akhade (Pennsylvania State University), Erik J. Albenze (National Energy Technology Laboratory), Federico Calle-Vallejo (Leiden Institute of Chemistry), Robin Chao (National Energy Technology Laboratory), Krishnan V. Damodaran (University of Pittsburgh), Carmeline J. Dsilva (Princeton University), Adefemi A. Egbebi (URS), John C. Eslick (National Energy Technology Laboratory), Anatoly I. Frenkel (Yeshiva University), Kirk R. Gerdes (National Energy Technology Laboratory), Heine Anton Hansen (Danmarks Tekniske Universitet), David P. Hopkinson (National Energy Technology Laboratory), Thomas Francisco Jaramillo (Fermi National Accelerator Laboratory), Yogesh V. Joshi (Exxon Mobil Research and Engineering), Christopher J. Keturakis (Lehigh University), Marc T M Koper (Leiden Institute of Chemistry), Victor A. Kusuma (National Energy Technology Laboratory), James Landon (University of Kentucky), David R. Luebke (Liquid Ion Solutions), Isabelacostinela Man (Universitatea din Bucuresti), James X. Mao (University of Pittsburgh), José Ignacio Martínez (CSIC - Instituto de Ciencia de Materiales de Madrid (ICMM)), David C M Miller (National Energy Technology Laboratory), Bryan D. Morreale (National Energy Technology Laboratory), Christina R. Myers (National Energy Technology Laboratory), Jens Kehlet Nørskov (Stanford Linear Accelerator Center), Rumyana V. Petrova (International Iberian Nanotechnology Laboratory), Vladimir V. Pushkarev (Dow Corning Corporation), Sumathy Raman (Exxon Mobil Research and Engineering), Kevin P. Resnik (URS), Walter Richard Alesi (National Energy Technology Laboratory), Jan Rossmeisl (Kobenhavns Universitet), Edward M. Sabolsky (West Virginia University), Wei Shi (University of Pittsburgh), Nicholas S. 
Siefert (National Energy Technology Laboratory), Haiyan Su (Dalian Institute of Chemical Physics Chinese Academy of Sciences), Robert Lee Thompson (University of Pittsburgh Medical Center), Relja Vasić (SUNY College of Nanoscale Science and Engineering), Israel E. Wachs (Lehigh University), John D. Watkins (National Energy Technology Laboratory), Chunrong Yin (United States Department of Energy), Xu Zhou (Liquid Ion Solutions)

This is pretty sweet. I could pretty easily create a query that had all the PIs on a proposal, and alphabetize everyone's coauthors, or print them to a CSV file for import to Excel, or whatever format is required for conflict of interest reporting. The list is not perfect, but it is easy to manually fix it here.
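As a rough idea of what the CSV route could look like, here is a small sketch using only the standard library (Python 3). It assumes you already have (name, affiliation) pairs, for example from the ScopusAuthor objects above. The names and affiliations below are placeholders, not real query results, and coauthors_to_csv is a hypothetical helper, not something in the scopus package.

```python
import csv
import io

# Placeholder (name, affiliation) pairs; real values would come from Scopus.
people = [
    ('A. Author', 'Example University, Department of Examples'),
    ('B. Author', 'Carnegie Mellon University, Department of Examples'),
]


def coauthors_to_csv(people, exclude=('Carnegie Mellon University',)):
    """Return CSV text with one (name, institution) row per coauthor.

    The institution is taken as the text before the first comma, the same
    assumption used in the affiliation query above.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator='\n')
    writer.writerow(['name', 'institution'])
    for name, affiliation in people:
        institution = affiliation.split(',')[0]
        if institution not in exclude:
            writer.writerow([name, institution])
    return buf.getvalue()


csv_text = coauthors_to_csv(people)
print(csv_text)
```

Writing csv_text to a file gives something Excel can open directly.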

That little bit of code is wrapped in a command-line utility in the scopus Python package. Just run it every time you need an updated list of coauthors! It isn't super flexible for now, e.g. excluding multiple affiliations, including multiple authors, etc. are not fully supported. You use it like this.

./scopus_coauthors 7004212771 2010 --exclude-affiliation="Carnegie Mellon University"
Sneha A. Akhade (Pennsylvania State University), Erik J. Albenze (National Energy Technology Laboratory), Federico Calle-Vallejo (Leiden Institute of Chemistry), Robin Chao (National Energy Technology Laboratory), Krishnan V. Damodaran (University of Pittsburgh), Carmeline J. Dsilva (Princeton University), Adefemi A. Egbebi (URS), John C. Eslick (National Energy Technology Laboratory), Anatoly I. Frenkel (Yeshiva University), Kirk R. Gerdes (National Energy Technology Laboratory), Heine Anton Hansen (Danmarks Tekniske Universitet), David P. Hopkinson (National Energy Technology Laboratory), Thomas Francisco Jaramillo (Fermi National Accelerator Laboratory), Yogesh V. Joshi (Exxon Mobil Research and Engineering), Christopher J. Keturakis (Lehigh University), Marc T M Koper (Leiden Institute of Chemistry), Victor A. Kusuma (National Energy Technology Laboratory), James Landon (University of Kentucky), David R. Luebke (Liquid Ion Solutions), Isabelacostinela Man (Universitatea din Bucuresti), James X. Mao (University of Pittsburgh), José Ignacio Martínez (CSIC - Instituto de Ciencia de Materiales de Madrid (ICMM)), David C M Miller (National Energy Technology Laboratory), Bryan D. Morreale (National Energy Technology Laboratory), Christina R. Myers (National Energy Technology Laboratory), Jens Kehlet Nørskov (Stanford Linear Accelerator Center), Rumyana V. Petrova (International Iberian Nanotechnology Laboratory), Vladimir V. Pushkarev (Dow Corning Corporation), Sumathy Raman (Exxon Mobil Research and Engineering), Kevin P. Resnik (URS), Walter Richard Alesi (National Energy Technology Laboratory), Jan Rossmeisl (Kobenhavns Universitet), Edward M. Sabolsky (West Virginia University), Wei Shi (University of Pittsburgh), Nicholas S. 
Siefert (National Energy Technology Laboratory), Haiyan Su (Dalian Institute of Chemical Physics Chinese Academy of Sciences), Robert Lee Thompson (University of Pittsburgh Medical Center), Relja Vasić (SUNY College of Nanoscale Science and Engineering), Israel E. Wachs (Lehigh University), John D. Watkins (National Energy Technology Laboratory), Chunrong Yin (United States Department of Energy), Xu Zhou (Liquid Ion Solutions)
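If you want to roll your own version of a utility like this, the command-line handling is the easy part. The sketch below is not the source of scopus_coauthors. It is a guess at an equivalent argparse setup based only on the command shown above, so the argument names (author_id, year) are assumptions. It stops at building the Scopus query string and does not contact Scopus.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the command line shown above."""
    parser = argparse.ArgumentParser(prog='scopus_coauthors')
    parser.add_argument('author_id', help='Scopus author id')
    parser.add_argument('year', type=int,
                        help='only include papers with PUBYEAR > year')
    parser.add_argument('--exclude-affiliation', default=None,
                        help='institution to leave out of the list')
    return parser


# Parse the same arguments used in the example command.
args = build_parser().parse_args(
    ['7004212771', '2010',
     '--exclude-affiliation=Carnegie Mellon University'])

# Same query format as the first code block in this post.
query = 'AU-ID({0}) AND PUBYEAR > {1}'.format(args.author_id, args.year)
print(query)
print(args.exclude_affiliation)
```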

Copyright (C) 2016 by John Kitchin. See the License for information about copying.

Org-mode version = 8.2.10

Interactive figures in blog posts with mpld3

| categories: plotting, interactive, python | tags:

Continuing the exploration of interactive figures, today we consider the Python plotting library mpld3. We will again use our own published data. We wrote this great paper on core level shifts (CLS) in Cu-Pd alloys boes-2015-core-cu. I want an interactive figure that shows the name of the calculation on each point as a tooltip. This data is all stored in the supporting information file, and you can see how we use it here. This figure shows how the core level shift of a Cu atom changes depending on the number of nearest neighbor Cu atoms. Just hover your mouse over a point to see the name and CLS for that point.

1 Data and code

You can check out our preprint at https://github.com/KitchinHUB/kitchingroup-51. We are going to adapt the code to make Figure 6a in the manuscript interactive. The code needed a somewhat surprising amount of adaptation. Apparently the ase database interface has changed a lot since we wrote that paper, so the code here looks a bit different than what we published. The biggest difference is due to name-mangling, so each key that started with a number now starts with _, and periods are replaced by _ also. The rest of the script is nearly unchanged. At the end is the very small bit of mpld3 code that generates the figure for html. We will add tooltips onto the data points to show the name associated with each one. Here is the code.

import matplotlib.pyplot as plt
from ase.db import connect

# loads the ASE database and select certain keywords
db = connect('~/Desktop/cappa/kitchingroup-51/supporting-information/data.json')

keys = ['bcc', 'GS', '_54atom', 'ensam']

CLS, IMP, labels = [], [], []
for k in db.select(keys + ['_1cl']):
    name = k.keywords[-2]

    Cu0 = db.select('bcc,GS,_72atom,_0cl,_1_00Cu').next().energy
    Cu1 = db.select('bcc,GS,_72atom,_1cl,_1_00Cu').next().energy
    x0 = db.select(','.join(keys + [name, '_0cl'])).next().energy
    x1 = k.energy

    cls0 = x0 - Cu0
    cls1 = x1 - Cu1

    IMP.append(int(name[1]))
    CLS.append(cls1 - cls0)
    labels += ['{0} ({1}, {2})'.format(name, int(name[1]), cls1 - cls0)]

Cu0 = db.select(','.join(['bcc', 'GS', '_72atom',
                          '_0cl', '_1_00Cu'])).next().energy
Cu1 = db.select(','.join(['bcc', 'GS', '_72atom',
                          '_1cl', '_1_00Cu'])).next().energy

x0 = db.select(','.join(['bcc', 'GS', '_54atom',
                         '_0cl', '_1'])).next().energy
x1 = db.select(','.join(['bcc', 'GS', '_54atom',
                         '_1cl', '_1'])).next().energy

cls0 = x0 - Cu0
cls1 = x1 - Cu1

IMP.append(1)
CLS.append(cls1 - cls0)
labels += ['(1, {0})'.format(cls1 - cls0)]

Cu0 = db.select(','.join(['bcc', 'GS', '_72atom',
                          '_0cl', '_1_00Cu'])).next().energy
Cu1 = db.select(','.join(['bcc', 'GS', '_72atom',
                          '_1cl', '_1_00Cu'])).next().energy

x0 = db.select(','.join(['bcc', 'GS', '_54atom',
                         '_0cl', '_0'])).next().energy
x1 = db.select(','.join(['bcc', 'GS', '_54atom',
                         '_1cl', '_0'])).next().energy

cls0 = x0 - Cu0
cls1 = x1 - Cu1

IMP.append(0)
CLS.append(cls1 - cls0)
labels += ['(0, {0})'.format(cls1 - cls0)]

fig = plt.figure()

p = plt.scatter(IMP, CLS, c='g', marker='o', s=25)
ax1 = plt.gca()
ax1.set_ylim(-1.15, -0.6)
ax1.set_xlim(-0.1, 5.1)

ax1.set_xlabel('# Cu Nearest neighbors')
ax1.set_ylabel('Cu 2p(3/2) Core Level Shift (eV)')

ax1.set_title('Hover over a point to see the calculation name')

# Now the mpld3 stuff.
import mpld3
from mpld3 import plugins

tooltip = plugins.PointHTMLTooltip(p, labels, voffset=0, hoffset=10)
plugins.connect(fig, tooltip)

print(mpld3.fig_to_html(fig))
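The name-mangling mentioned above is easy to state as a rule. Here is a tiny sketch of it as I understand it from the keys used in the script. The function mangle_keyword is hypothetical and is not part of ase, and the "old" key names in the example are inferred by undoing the mangling on keys like _54atom and _1_00Cu, so treat them as assumptions.

```python
def mangle_keyword(key):
    """Hypothetical helper mimicking the key renaming described above.

    Periods become underscores, and a key that starts with a digit gets a
    leading underscore.
    """
    key = key.replace('.', '_')
    if key[:1].isdigit():
        key = '_' + key
    return key


# Assumed original keys, inferred from the mangled names used in the script.
for old in ['54atom', '1cl', '1.00Cu', 'bcc']:
    print(old, '->', mangle_keyword(old))
```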

I like this workflow pretty well. It seems less functional than plotly and Bokeh (e.g. it does not look like you can export the data from the html here), but it is well integrated with Matplotlib and with my blogging style, and it does not require a server or an account. The code outputs html that is self-contained in the body of the html. The smooth integration with Matplotlib means I could potentially have static images in org-mode and dynamic images in HTML. Overall, this is a nice tool for making interactive plots in blog posts.

Copyright (C) 2016 by John Kitchin. See the License for information about copying.


Interactive Bokeh plots in HTML

| categories: plotting, interactive, python | tags:

In our last post we examined the use of plotly to generate interactive plots in HTML. Today we expand the idea, and use Bokeh. One potential issue with plotly is the need for an account and API key, some limitations on how many times a graph can be viewed per day (although I should aspire to have my graphs viewed 1000+ times a day!), and who knows what happens to the graphs if plotly ever goes out of business. While the static images we usually use have limited utility, at least they stick around.

So, today we look at Bokeh, which allows you to embed some JSON data in your HTML, which is made interactive by your browser with more JavaScript magic. We get straight to the image here so you can see what this is all about. Briefly, this data shows trends (or lack of) in the adsorption energies of some atoms on the atop and fcc sites of several transition metals as a function of adsorbate coverage xu-2014-probin-cover. The code to do this is found here.

Using Bokeh does not integrate very smoothly with my blog workflow, which only generates the body of HTML posts. Bokeh needs some JavaScript injected into the header to work. To get around that, I show the plot in a frame here. You can see a full HTML version here: bokeh-plot.html.

This is somewhat similar to the plotly concept. The difference is that in this case the data is embedded in the html. For very large plots, I actually had some trouble exporting the blog post (it was taking a long time to export and I killed it)! I suspect that is a limitation of the org-mode exporter though, because I could save the html files from Python and view them fine. I also noted that having all the JavaScript in the org-file makes font-lock work very slowly. It would be better to generate that only on export.

Note that to make this work, we need these lines in our HTML header:

#+HTML_HEAD: <link rel="stylesheet" href="http://cdn.pydata.org/bokeh/release/bokeh-0.11.1.min.css" type="text/css" />
#+HTML_HEAD: <script type="text/javascript" src="http://cdn.pydata.org/bokeh/release/bokeh-0.11.1.min.js"></script>

Since we do not host those locally, if they ever disappear, our plots will not show ;(

1 The data and code

We will get the data from our paper on coverage dependent adsorption energies xu-2014-probin-cover. There are some data rich figures there that would benefit from some interactivity. You can get the data here: http://pubs.acs.org/doi/suppl/10.1021/jp508805h. Extract the supporting-information.org and energies.json files to follow along. We will make Figure 2a in the SI document here, and make it interactive with hover tooltips.

import json

from collections import OrderedDict
from bokeh import mpl
from bokeh.plotting import *
from bokeh.models import HoverTool
from bokeh.embed import components

with open('/users/jkitchin/Desktop/energies.json', 'r') as f:
    data = json.load(f)


# color for metal
# letter symbol for adsorbate
colors = {'Cu':'Orange',
          'Ag':'Silver',
          'Au':'Yellow',
          'Pd':'Green',
          'Pt':'Red',
          'Rh':'Blue',
          'Ir':'Purple'}

all_ads = ['O', 'S']

TOOLS="crosshair,pan,wheel_zoom,box_zoom,reset,hover,previewsave"
p = figure(title="Correlation between atop and fcc sites", tools=TOOLS)

for metal in ['Rh', 'Pd', 'Cu', 'Ag']:
    for adsorbate in all_ads:
        E1, E2, labels = [], [], []
        for coverage in '0.25', '0.5', '0.75', '1.0':
            atop = data[metal][adsorbate]['ontop'][coverage]
            fcc = data[metal][adsorbate]['fcc'][coverage]
            # Skip coverages where either energy is missing (not a float).
            # The label is built in the same pass so that the x, y and
            # label columns always have the same length.
            if isinstance(atop, float) and isinstance(fcc, float):
                E1.append(atop)
                E2.append(fcc)
                labels.append('{0}-{1} {2} ML'.format(metal, adsorbate,
                                                      coverage))
        # one data source shared by the line and the markers
        source = ColumnDataSource(data={'x': E1,
                                        'y': E2,
                                        'label': labels})
        p.line('x', 'y', color=colors[metal], source=source)
        p.circle('x', 'y', color=colors[metal], source=source)


hover = p.select({'type': HoverTool})
hover.tooltips = OrderedDict([("(atop,fcc)", "(@x, @y)"),
                              ("label", "@label")])

p.xaxis.axis_label = 'Adsorption energy on the atop site'
p.yaxis.axis_label = 'Adsorption energy on the fcc site'

script, div = components(p)
script = '\n'.join(['#+HTML_HEAD_EXTRA: ' + line for line in script.split('\n')])

print('''{script}

#+BEGIN_HTML
<a name="figure"></a>
{div}
#+END_HTML
'''.format(script=script, div=div))
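One thing to watch in the plotting loop above is that every column handed to a ColumnDataSource should have the same length, so the hover labels have to be filtered together with the energies when a coverage is missing. Here is a minimal sketch of that filtering step on its own, with no Bokeh needed. The toy dictionary has the same nested shape as energies.json, but the numbers are made up, None standing in for a missing value is an assumption, and aligned_columns is a hypothetical helper.

```python
# Toy data shaped like energies.json: data[metal][adsorbate][site][coverage].
# The values are invented, and None is an assumed marker for a missing energy.
data = {'Cu': {'O': {'ontop': {'0.25': -1.0, '0.5': -0.5,
                                '0.75': None, '1.0': 0.5},
                     'fcc': {'0.25': -2.0, '0.5': -1.5,
                             '0.75': -1.0, '1.0': None}}}}


def aligned_columns(data, metal, adsorbate,
                    coverages=('0.25', '0.5', '0.75', '1.0')):
    """Return x, y and label lists of equal length, skipping missing points."""
    x, y, labels = [], [], []
    for coverage in coverages:
        atop = data[metal][adsorbate]['ontop'][coverage]
        fcc = data[metal][adsorbate]['fcc'][coverage]
        if isinstance(atop, float) and isinstance(fcc, float):
            x.append(atop)
            y.append(fcc)
            labels.append('{0}-{1} {2} ML'.format(metal, adsorbate, coverage))
    return {'x': x, 'y': y, 'label': labels}


cols = aligned_columns(data, 'Cu', 'O')
print(cols)
```

A dictionary like cols can be passed as the data argument of a ColumnDataSource.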

Copyright (C) 2016 by John Kitchin. See the License for information about copying.


Interactive plots in HTML with Plotly

| categories: plotting, interactive, python | tags:

Most of the plots in this blog are static. Today, I look at making them interactive. I will use https://plot.ly for this. I want to use some data from a paper we published on the relative stabilities of oxide polymorphs mehta-2015-ident-poten. We will make an interactive figure showing the relative stabilities of the RuO2 polymorphs. When you hover on a point, it will show you which polymorph the point refers to. Let's see the figure first here. If you think it's interesting, read on to see how we made it!

We get our data source here: http://pubs.acs.org/doi/suppl/10.1021/am4059149/suppl_file/am4059149_si_001.pdf.

Now, we extract the data files:

pdftk ~/Desktop/am4059149_si_001.pdf  unpack_files

That extracts a json file called supporting-information.json. We use it as suggested in the SI pdf to plot the equations of state for RuO2 for several polymorphs.

# coding=utf-8

import plotly.plotly as py
import plotly.graph_objs as go
import plotly.tools as tls
import numpy as np

import json
import matplotlib.pyplot as plt
from ase.utils.eos import EquationOfState
with open('supporting-information.json', 'rb') as f:
    d = json.loads(f.read())

BO2 = 'RuO2'
xc = 'PBE'

layout = go.Layout(title='Energy vs. Volume for RuO<sub>2</sub> polymorphs',
                   xaxis=dict(title='Volume (Å<sup>3</sup>)'),
                   yaxis=dict(title='Energy (eV)'))

traces = []

for polymorph in ['rutile','anatase','brookite','columbite','pyrite','fluorite']:

    # number of atoms in the unit cell - used to normalize
    natoms= len(d[BO2][polymorph][xc]['EOS']['calculations']
                [0]['atoms']['symbols'])
    volumes = [entry['data']['volume']*3./natoms for entry in
               d[BO2][polymorph][xc]['EOS']['calculations']]
    energies =  [entry['data']['total_energy']*3./natoms for entry in
                 d[BO2][polymorph][xc]['EOS']['calculations']]

    trace = go.Scatter(x=np.array(volumes),
                       y=np.array(energies),
                       mode='lines+markers',
                       name=polymorph,
                       text=polymorph)

    traces += [trace]

fig = go.Figure(data=traces, layout=layout)
plot_url = py.plot(fig, filename='ruo2-2')

print(tls.get_embed(plot_url))

Pretty nice, now we should have an interactive plot in our browser with the data points labeled with tags, zooming, etc… That is nice for the blog. It isn't so nice for daily work, as there is no visual version of the plot in my org-file. Of course, I can visit the url to see the plot in my browser, it is just different from what I am used to. For everyone else, this is probably better. It looks like you can actually get the data from the web page, including some minimal analysis like regression, and save your view to an image! That could be pretty nice for some data sets.
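One detail in the script that is easy to miss is the factor 3./natoms. As far as I can tell, it rescales each cell's volume and total energy to a single RuO2 formula unit (three atoms), so that polymorphs with different unit cell sizes can share one plot. Here is that step as a stand-alone sketch. The function name is hypothetical and the numbers are made up for illustration.

```python
def per_formula_unit(value, natoms, atoms_per_formula_unit=3):
    """Rescale a per-cell quantity to one formula unit.

    Assumes the cell holds a whole number of formula units, e.g. a 6-atom
    RuO2 cell holds two of them.
    """
    return value * atoms_per_formula_unit / float(natoms)


# Made-up example: a 6-atom cell with a volume of 60.0 cubic angstroms.
print(per_formula_unit(60.0, 6))
```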

1 Using Plotly yourself

First, go to https://plot.ly and sign up for an account. You will want to register your API key like this, which will save it in a file for your convenience. Then you can do things like I did above too.

import plotly.tools as tls
tls.set_credentials_file(username='jkitchin', api_key='xxxxxxx')

Copyright (C) 2016 by John Kitchin. See the License for information about copying.


Asynchronously running python blocks in org-mode

| categories: python, orgmode, emacs | tags:

If you run long Python blocks from org-mode, you might want to keep working while they run. Currently Emacs gets blocked and you have to wait patiently. In this post we consider some ways to avoid this that run our code asynchronously, but still put results where they belong in the org-buffer.

This is a long post. You may want to see the video: https://www.youtube.com/watch?v=VDyoN8yipSE, or skip to the end where the best and final version is shown.

1 The async module

Here we consider an approach that uses the emacs-async module (https://github.com/jwiegley/emacs-async). The idea is to tangle the Python block at point to a temp file, then asynchronously run it. We capture the output and put it back in the buffer. We use a uuid to find the place to put the results in org-mode format. Here is the code that implements this idea.

(require 'async)
(require 's)  ; provides s-split, used in the callback below

(defun org-babel-async-execute ()
  "Run a python block at point asynchronously."
  (interactive)

  (let ((current-file (buffer-file-name))
        (uuid (org-id-uuid))
        (temporary-file-directory "./")
        (tempfile (make-temp-file "py-")))

    (org-babel-tangle '(4) tempfile)
    (org-babel-remove-result)
    (save-excursion
      (re-search-forward "#\\+END_SRC")
      (insert (format
               "\n\n#+RESULTS: %s\n: %s"
               (or (org-element-property :name (org-element-context))
                   "")
               uuid)))

    (async-start
     ;; what to start
     `(lambda ()
        ;; now we run the command then cleanup
        (prog1
            (shell-command-to-string (format "python %s" ,tempfile))
          (delete-file ,tempfile)))

     `(lambda (result)
        "Code that runs when the async function finishes."
        (save-window-excursion
          (save-excursion
            (save-restriction
              (with-current-buffer (find-file-noselect ,current-file)
                (goto-char (point-min))
                (re-search-forward ,uuid)
                (beginning-of-line)
                (kill-line)
                (insert (mapconcat
                         (lambda (x)
                           (format ": %s" x))
                         (butlast (s-split "\n" result))
                         "\n"))))))))))
org-babel-async-execute

Here is a block to test it on. We can run the block, and keep on working while the code runs. The results seem to get inserted correctly at the right point even if I am in another window or frame! We don't get easy access to continuous output of the command. This wouldn't work if we close Emacs, but who does that?

print('hello world')
import time
time.sleep(5)

import os
print(os.getcwd())
print(time.asctime())
hello world
/Users/jkitchin/blogofile-jkitchin.github.com/_blog
Fri Nov 20 10:17:53 2015
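If Emacs Lisp is not your thing, the result-insertion step in the callback above can be sketched in a few lines of Python. This is only an illustration of the idea (find the placeholder line that holds the uuid, and replace it with the output formatted as ": " lines), not something the Emacs code uses. The function name and the sample buffer text are made up.

```python
def insert_results(buffer_text, uuid, output):
    """Replace the first line containing uuid with ': '-prefixed output lines.

    Mirrors the callback above: search for the uuid, kill that line, and
    insert the formatted results in its place.
    """
    # Drop the empty string after the final newline, like butlast does.
    out_lines = output.split('\n')[:-1]
    formatted = [': {0}'.format(line) for line in out_lines]

    lines = buffer_text.split('\n')
    for i, line in enumerate(lines):
        if uuid in line:
            lines[i:i + 1] = formatted
            break
    return '\n'.join(lines)


# Made-up buffer contents with a placeholder uuid under the results keyword.
buffer_text = '#+END_SRC\n\n#+RESULTS: \n: 1234-abcd\n\nMore text'
new_text = insert_results(buffer_text, '1234-abcd', 'hello world\n/tmp\n')
print(new_text)
```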

There are some limitations to this approach. One of them is it assumes the src block is a stand-alone block that will run on its own. That is usually how I run mine, but I could see having other modules that should be tangled out of a file too. I think the script is being run in the current working directory, so it probably will find any local imports it needs.

You don't get any intermediate feedback on this process. It seems to be possible to do that with a different approach that puts some output in a new buffer, e.g. with start-process. But, you still need some clever code like the async module to know when to insert the results back into this buffer. We consider Emacs processes and sentinels next.

2 Emacs process approach with tangling

We can start a process in Emacs, and attach a sentinel function to it that runs after the process completes. Here is an example of that. We still tangle the src-block here.

(defun org-babel-async-execute ()
  (interactive)
  (let* ((current-file (buffer-file-name))
        (uuid (org-id-uuid))
        (temporary-file-directory "./")
        (tempfile (make-temp-file "py-"))
        (pbuffer (format "*%s*" uuid))
        process)

    (org-babel-tangle '(4) tempfile)
    (org-babel-remove-result)

    (save-excursion
      (re-search-forward "#\\+END_SRC")
      (insert (format
               "\n\n#+RESULTS: %s\n: %s"
               (or (org-element-property :name (org-element-context))
                   "")
               uuid)))

    (setq process (start-process
                   uuid
                   pbuffer
                   "python"
                   tempfile))

    (set-process-sentinel
     process
     `(lambda (process event)
        (when (string= "finished\n" event)
          (delete-file ,tempfile)
          (save-window-excursion
            (save-excursion
              (save-restriction
                (with-current-buffer (find-file-noselect ,current-file)
                  (goto-char (point-min))
                  (re-search-forward ,uuid)
                  (beginning-of-line)
                  (kill-line)
                  (insert (mapconcat
                           (lambda (x)
                             (format ": %s" x))
                           (split-string
                            (with-current-buffer ,pbuffer (buffer-string))
                            "\n")
                           "\n")))))))
        (kill-buffer ,pbuffer)))))
org-babel-async-execute
print('hello world')
import time
time.sleep(10)

import os
print(os.getcwd())
print(time.asctime())
hello world
/Users/jkitchin/blogofile-jkitchin.github.com/_blog
Fri Nov 20 10:20:01 2015

That works well from what I can see. There are some limitations. I doubt this will work if you use variables in the src block header. Next we consider an approach that does not do the tangling, and that will show us code output as it goes.
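For readers more at home in Python, the process-plus-sentinel pattern used in this section is loosely analogous to starting a subprocess and having a watcher thread call a function when it exits. The sketch below (Python 3, standard library only) is just that analogy. It is not how Emacs implements sentinels, and all of the names in it are made up.

```python
import subprocess
import sys
import threading


def start_with_sentinel(args, on_finish):
    """Start a process and call on_finish(returncode, output) when it exits."""
    process = subprocess.Popen(args,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT,
                               universal_newlines=True)

    def watch():
        # communicate() blocks, but only this watcher thread waits on it.
        output, _ = process.communicate()
        on_finish(process.returncode, output)

    watcher = threading.Thread(target=watch)
    watcher.start()
    return process, watcher


results = []


def sentinel(returncode, output):
    # Like the "finished\n" check in the Emacs sentinel: only keep the
    # output when the process exited normally.
    if returncode == 0:
        results.append(output)


process, watcher = start_with_sentinel(
    [sys.executable, '-c', 'print("hello world")'], sentinel)
# The main thread is free to do other work here.
watcher.join()
print(results)
```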

3 Emacs process approach with no tangling

As an alternative to tangling to a file, here we just copy the code to a file and then run it. This allows us to use :var in the header to pass data in at run time. At the moment, this code only supports printed output from code blocks, not the value for :results.

(defun org-babel-async-execute:python ()
  "Execute the python src-block at point asynchronously.
:var headers are supported.
:results output is all that is supported for output.

A new window will pop up showing you the output as it appears,
and the output in that window will be put in the RESULTS section
of the code block."
  (interactive)
  (let* ((current-file (buffer-file-name))
         (uuid (org-id-uuid))
         (code (org-element-property :value (org-element-context)))
         (temporary-file-directory ".")
         (tempfile (make-temp-file "py-"))
         (pbuffer (format "*%s*" uuid))
         (varcmds (org-babel-variable-assignments:python
                   (nth 2 (org-babel-get-src-block-info))))
         process)

    ;; get rid of old results, and put a place-holder for the new results to
    ;; come.
    (org-babel-remove-result)

    (save-excursion
      (re-search-forward "#\\+END_SRC")
      (insert (format
               "\n\n#+RESULTS: %s\n: %s"
               (or (org-element-property :name (org-element-context))
                   "")
               uuid)))

    ;; open the results buffer to see the results in.
    (switch-to-buffer-other-window pbuffer)

    ;; Create temp file containing the code.
    (with-temp-file tempfile
      ;; if there are :var headers insert them.
      (dolist (cmd varcmds)
        (insert cmd)
        (insert "\n"))
      (insert code))

    ;; run the code
    (setq process (start-process
                   uuid
                   pbuffer
                   "python"
                   tempfile))

    ;; when the process is done, run this code to put the results in the
    ;; org-mode buffer.
    (set-process-sentinel
     process
     `(lambda (process event)
        (save-window-excursion
          (save-excursion
            (save-restriction
              (with-current-buffer (find-file-noselect ,current-file)
                (goto-char (point-min))
                (re-search-forward ,uuid)
                (beginning-of-line)
                (kill-line)
                (insert
                 (mapconcat
                  (lambda (x)
                    (format ": %s" x))
                  (butlast (split-string
                            (with-current-buffer
                                ,pbuffer
                              (buffer-string))
                            "\n"))
                  "\n"))))))
        ;; delete the results buffer then delete the tempfile.
        ;; finally, delete the process.
        (when (get-buffer ,pbuffer)
          (kill-buffer ,pbuffer)
          (delete-window))
        (delete-file ,tempfile)
        (delete-process process)))))
org-babel-async-execute:python

Let us try it out again.

print('hello world')
import time
time.sleep(1)

for i in range(5):
    print(i)

    time.sleep(0.5)


import os
print(os.getcwd())
print(time.asctime())

print(data)

raise IOError('No file!')
hello world
0
1
2
3
4
/Users/jkitchin/blogofile-jkitchin.github.com/_blog
Fri Nov 20 19:30:16 2015
[1, 3]
Traceback (most recent call last):
  File "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/py-84344aa1", line 18, in <module>
    raise IOError('No file!')
IOError: No file!

It works fine for this simple example. We get to see the output as the code executes, which is a pleasant change from the usual way of running python blocks. There is some support for some header arguments, notably the :var header. I don't use :results value in Python, so for now only output is supported. We even support Exceptions in the output finally!
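The two ideas in this last version, putting the :var assignments in front of the code before running it and reading the output as it appears, also translate directly to plain Python. Here is a minimal sketch under a few assumptions: it is Python 3, the temp file goes in the system temp directory rather than the current directory, and run_with_vars is a made-up name. It is an analogy to the Emacs code, not a replacement for it.

```python
import os
import subprocess
import sys
import tempfile


def run_with_vars(code, variables):
    """Prepend variable assignments to code, run it, and stream the output.

    Returns the list of output lines (stdout and stderr interleaved).
    """
    header = ''.join('{0}={1!r}\n'.format(name, value)
                     for name, value in sorted(variables.items()))
    fd, path = tempfile.mkstemp(prefix='py-', suffix='.py')
    with os.fdopen(fd, 'w') as f:
        f.write(header + code)

    lines = []
    try:
        # -u turns off output buffering so lines arrive as they are printed.
        process = subprocess.Popen([sys.executable, '-u', path],
                                   stdout=subprocess.PIPE,
                                   stderr=subprocess.STDOUT,
                                   universal_newlines=True)
        for line in process.stdout:
            sys.stdout.write(line)  # show each line as it appears
            lines.append(line.rstrip('\n'))
        process.wait()
    finally:
        os.remove(path)  # clean up the temp file, as the sentinel does
    return lines


lines = run_with_vars('print(data)\nraise IOError("No file!")\n',
                      {'data': [1, 3]})
```

As in the Emacs version, the traceback from the exception ends up in the captured output along with the printed lines.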

Maybe some org-moders out there can try this and run it through some more rigorous paces?

Copyright (C) 2015 by John Kitchin. See the License for information about copying.
