Org-mode and ipython enhancements in scimax

| categories: ipython, orgmode, emacs | tags:

We have made some improvements to using Ipython in org-mode in the past including:

  1. Inline figures
  2. Export to Jupyter notebooks

Today I will talk about a few new features and improvements I have introduced to scimax for using org-mode and Ipython together.

The video for this post might be more obvious than the post:

1 Some convenience functions

There are a few nice shortcuts in the Jupyter notebook. Now we have some convenient commands in scimax to mimic those. My favorites are adding cells above or below the current cell. You can insert a new src block above the current one with (M-x org-babel-insert-block). You can use a prefix arg to insert it below the current block.

# code
# below
# some code

I am particularly fond of splitting a large block into two smaller blocks. Use (M-x org-babel-split-src-block) to do that and leave the point in the upper block. Use a prefix arg to leave the point in the lower block.

# lots of code in large block
# Even more code
# The end of the long block

You can execute all the blocks up to the current point with (M-x org-babel-execute-to-point).

2 ob-ipython-inspect works

In the original ob-ipython I found that ob-ipython-inspect did not work unless you were in special edit mode. That is too inconvenient. I modified a few functions to work directly from the org-buffer. I bind this to M-. in org-mode.

%matplotlib inline
import numpy as np

import matplotlib.pyplot as plt

# Compute areas and colors
N = 150
r = 2 * np.random.rand(N)
theta = 2 * np.pi * np.random.rand(N)
area = 200 * r**2
colors = theta

ax = plt.subplot(111, projection='polar')
c = ax.scatter(theta, r, c=colors, s=area, cmap='hsv', alpha=0.75)

<matplotlib.figure.Figure at 0x114ded710>

3 Getting selective output from Ipython

Out of the box Ipython returns a lot of results. This block, for example returns a plain text, image and latex result as output.

from sympy import *
# commenting out init_printing() results in no output
init_printing()

var('x y')
x**2 + y

2 x + y

We can select which one we want with a new header argument :ob-ipython-results. For this block you can give it the value of text/plain, text/latex or image/png.

var('x y')
x**2 + y

2 x + y

Or to get the image:

var('x y')
x**2 + y

This shows up with pandas too. This block creates a table of data and then shows the first 5 rows. Ipython returns both plain text and html here.

import pandas as pd
import numpy as np
import datetime as dt

def makeSim(nHosps, nPatients):
    df = pd.DataFrame()
    df['patientid'] = range(nPatients)
    df['hospid'] = np.random.randint(0, nHosps, nPatients)
    df['sex'] = np.random.randint(0, 2, nPatients)
    df['age'] = np.random.normal(65,18, nPatients)
    df['race'] = np.random.randint(0, 4, nPatients)
    df['cptCode'] = np.random.randint(1, 100, nPatients)
    df['rdm30d'] = np.random.uniform(0, 1, nPatients) < 0.1
    df['mort30d'] = np.random.uniform(0, 1, nPatients) < 0.2
    df['los'] = np.random.normal(8, 2, nPatients)
    return df

discharges = makeSim(50, 10000)
discharges.head()

patientid hospid sex age race cptCode rdm30d mort30d los 0 0 10 1 64.311947 0 8 False False 8.036793 1 1 6 0 82.951484 1 73 True False 7.996024 2 2 27 1 53.064501 3 95 False False 9.015144 3 3 37 0 64.799128 0 93 False False 10.099032 4 4 46 0 99.111394 2 25 False False 11.711427

patientid hospid sex age race cptCode rdm30d mort30d los
0 0 10 1 64.311947 0 8 False False 8.036793
1 1 6 0 82.951484 1 73 True False 7.996024
2 2 27 1 53.064501 3 95 False False 9.015144
3 3 37 0 64.799128 0 93 False False 10.099032
4 4 46 0 99.111394 2 25 False False 11.711427

We can use the header to select only the plain text output!

import pandas as pd
import numpy as np
import datetime as dt

def makeSim(nHosps, nPatients):
    df = pd.DataFrame()
    df['patientid'] = range(nPatients)
    df['hospid'] = np.random.randint(0, nHosps, nPatients)
    df['sex'] = np.random.randint(0, 2, nPatients)
    df['age'] = np.random.normal(65,18, nPatients)
    df['race'] = np.random.randint(0, 4, nPatients)
    df['cptCode'] = np.random.randint(1, 100, nPatients)
    df['rdm30d'] = np.random.uniform(0, 1, nPatients) < 0.1
    df['mort30d'] = np.random.uniform(0, 1, nPatients) < 0.2
    df['los'] = np.random.normal(8, 2, nPatients)
    return df

discharges = makeSim(50, 10000)
discharges.head()

patientid hospid sex age race cptCode rdm30d mort30d los 0 0 21 0 73.633836 1 38 False False 7.144019 1 1 16 1 67.518804 3 23 False False 3.340534 2 2 15 0 44.139033 0 8 False False 9.258706 3 3 29 1 45.510276 2 5 False False 10.590245 4 4 7 0 52.974924 2 4 False True 5.811064

4 Where was that error?

A somewhat annoying feature of running cells in org-mode is when there is an exception there has not been a good way to jump to the line that caused the error to edit it. The lines in the src block are not numbered, so in a large block it can be tedious to find the line. In scimax, when you get an exception it will number the lines in the src block, and when you press q in the exception traceback buffer it will jump to the line in the block where the error occurred.

print(1)
#raise Exception('Here')
print(2)

1 2

If you don't like the numbers add this to your init file:

(setq ob-ipython-number-on-exception nil)

5 Asynchronous Ipython

I have made a few improvements to the asynchronous workflow in Ipython. We now have a calculation queue, so you can use C-c C-c to execute several blocks in a row, and they will run asynchronously in the order you ran them. While they are running you can continue using Emacs, e.g. writing that paper, reading email, checking RSS feeds, tetris, … This also lets you run all the blocks up to the current point (M-x org-babel-execute-ipython-buffer-to-point-async) or the whole buffer (of Ipython) blocks asynchronously (M-x org-babel-execute-ipython-buffer-async).

To turn this on by default put this in your init file:

(setq org-babel-async-ipython t)

This requires all src blocks to have a name, and running the block will give it a name if you have not named the block. By default we use human-readable names. While the block is running, there will be a link indicating it is running. You can click on the link to cancel it. Running subsequent blocks will queue them to be run when the first block is done.

Here is an example:

import time
time.sleep(5)
a = 5
print('done')
print(3 * a)

15

Occasionally you will run into an issue. You can clear the queue with org-babel-async-ipython-clear-queue.

Copyright (C) 2017 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 9.0.5

Discuss on Twitter