Editing org-mode python source blocks in an external editor (Canopy)

| categories: orgmode, python | tags:

Continuing from the last post on using Python syntax checkers in org-mode, here we consider using (heresy alert…) an external editor for Python src blocks in org-mode. Why would we consider such insanity? Because, for beginners, environments such as Canopy are (IMHO) easier to use, and better than anything I have used in Emacs. I still want the framework of org-mode for content, just with a better environment for writing Python code.

This problem has some interesting challenges. I would like a command that opens a code block with its contents in the Canopy editor, or that creates a code block if needed. We need to figure out that context based on the cursor position. We will use the same temporary file strategy as used before, so Canopy has something to read and save to. We need to wait for Canopy to finish, which will be tricky because it returns as soon as you run it. Finally, I want the code block to run after it is put back in the org-file, so that the results are captured.

This code block implements the idea, and the comments in the code explain what each section is doing.

(defun edit-in-canopy ()
  (interactive)
  (let* ((eop (org-element-at-point))
         ;; use current directory for temp file so relative paths work
         (temporary-file-directory ".")
         (tempfile))

    ;; create a tempfile. 
    (setq tempfile (make-temp-file "canopy" nil ".py"))

    ;; figure out what to do
    (when
        ;; in an existing source block. we want to edit it.
        (and (eq 'src-block (car eop))
             (string= "python" (org-element-property :language eop)))
          
      ;; put code into tempfile
      (with-temp-file tempfile
        (insert (org-element-property :value eop))))

    ;; open tempfile in canopy
    (shell-command (concat "canopy " tempfile))
    (sleep-for 2) ;; startup time. canopy is slow to showup in
                  ;; ps. This gives it some time to do that. Canopy
                  ;; returns right away, so we sleep while there is
                  ;; evidence that it is open. We get that evidence
                  ;; from ps by searching for canopy.app.main, which
                  ;; seems to exist in the output while Canopy is
                  ;; open.
    (while
        (string-match "canopy\\.app\\.main"
                      (shell-command-to-string "ps aux"))
      ;; pause a while, then check again.
      (sleep-for 1))

    ;; Canopy has closed, so we get the new script contents
    (let ((new-contents (with-temp-buffer
                          (insert-file-contents tempfile)
                          (buffer-string))))
      (cond
       ;; replace existing code block contents
       ((and (eq 'src-block (car eop))
             (string= "python" (org-element-property :language eop)))
        (goto-char (org-element-property :begin eop))
        (search-forward (org-element-property :value eop))
        (replace-match (concat new-contents "\n") t t))
       ;; create new code block
       (t
        (insert
         (format "\n#+BEGIN_SRC python
%s
#+END_SRC
" new-contents))
        ;; go into new block so we can run it.
        (forward-line -2))))

    ;; delete the tempfile so they do not accumulate
    (delete-file tempfile)
    ;; and run the new block to get the results
    (org-babel-execute-src-block)))
edit-in-canopy
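
The waiting step is the least obvious part of the function above, so here is a minimal sketch of the same polling idea in Python 3 (the post itself uses Emacs Lisp). The names `list_processes` and `wait_for_exit` are hypothetical, and the sketch assumes a Unix-like system where `ps aux` is available. Note that the dots in the pattern are escaped so they match literal dots only.

```python
import re
import subprocess
import time


def list_processes():
    """Return the output of `ps aux` as text (Unix-like systems only)."""
    return subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout


def wait_for_exit(pattern, get_listing=list_processes, poll_seconds=1.0):
    """Block until the process listing no longer matches pattern.

    Returns the number of times we had to wait before the match went away.
    """
    polls = 0
    while re.search(pattern, get_listing()):
        polls += 1
        time.sleep(poll_seconds)  # pause a while, then check again
    return polls
```

Calling `wait_for_exit(r"canopy\.app\.main")` would then return once no matching process is left. Passing the listing function as an argument is only a convenience so the loop can be tried out without a real editor running.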

That seems to work. It is difficult to tell from this post that the function works as advertised. You can see it in action here: http://www.youtube.com/watch?v=-noKrT1dfFE .

from scipy.integrate import odeint


def dydx(y, x):
    k = 1
    return -k * y

print odeint(dydx, 1, [0, 1])

import numpy as np
print np.exp(-1)
[[ 1.        ]
 [ 0.36787947]]
0.367879441171

We created this code block externally.

print 'hello'
hello

1 Summary thoughts

Opening Canopy is a little slow (and that is coming from someone who opens Emacs ;). But, once it is open it is pretty nice for writing code, with the interactive IPython console and integrated help. Yes, it is probably possible to get Emacs to do that too, and maybe it will do that one day. Canopy does it today.

Unfortunately, this code will most likely not work on Windows, since it relies on the ps program. There does seem to be a similar tasklist command in Windows, but Canopy appears as pythonw in its output, which is not very specific.

Copyright (C) 2014 by John Kitchin. See the License for information about copying.


Org-mode version = 8.2.7c


Improved debugging of Python code blocks in org-mode

| categories: orgmode, python | tags:

Writing and running code blocks in org-mode is awesome, when it works. I find as the code blocks get past a certain size though, it can be tedious to debug, especially for new users. Since I am teaching 59 students to use Python in org-mode, I see this issue a lot! They lack experience to avoid many simple errors, and to find and fix them. Even in my hands, I do not always want to be switching to Python mode to run and debug blocks.

org-mode src-blocks offer a unique challenge for the usual tools like pylint and pychecker, because the code does not exist in a file. In this post, I will explore developing some functions that do syntax checking on a src block. We will use a simple method which will write the block to a temporary file, and do the checking on that file. Then, we will create temporary buffers with the output.

Here is the first idea. We create a temp file in the working directory, write the code to it, and run pychecker, pyflakes and pep8 on the file.

(defun org-pychecker ()
  "Run pychecker on a source block"
  (interactive)
  (let ((eop (org-element-at-point))
        (temporary-file-directory ".")
        (tempfile))
    (when (and (eq 'src-block (car eop))
               (string= "python" (org-element-property :language eop)))
      (setq tempfile (make-temp-file "pychecker" nil ".py"))
      ;; create code file
      (with-temp-file tempfile
        (insert (org-element-property :value eop)))
      (switch-to-buffer "*pychecker*")
      (erase-buffer)
      (insert "pychecker\n=================\n")
      (insert
       (shell-command-to-string (format "pychecker %s" (file-name-nondirectory tempfile))))
      (insert "\npyflakes\n=================\n")
      (insert
       (shell-command-to-string (format "pyflakes %s" (file-name-nondirectory tempfile))))
      (insert "\npep8\n=================\n")
      (insert
       (shell-command-to-string (format "pep8 %s" (file-name-nondirectory tempfile))))
      (delete-file tempfile))))
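
For readers more comfortable in Python than Emacs Lisp, the same temporary-file strategy can be sketched as follows (Python 3). This is only an illustration: `check_block` is a hypothetical name, and the standard-library `py_compile` module stands in for pychecker, pyflakes and pep8, so it only reports syntax errors and not undefined names or style problems.

```python
import os
import py_compile
import tempfile


def check_block(source):
    """Write source to a temporary file, compile it, and return error messages."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "block.py")
        with open(path, "w") as f:
            f.write(source)
        try:
            # doraise=True makes py_compile raise instead of printing the error
            py_compile.compile(path,
                               cfile=os.path.join(tmpdir, "block.pyc"),
                               doraise=True)
        except py_compile.PyCompileError as err:
            return [err.msg]
    # the temporary directory and its files are removed on exit
    return []
```

Using a temporary directory means the cleanup happens even when the check fails, which is the same concern the `delete-file` call addresses in the function above.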

Here is a sample code block with some errors in it.

a = 5  # a variable we do not use


def f(x, y):  # unused argument
    return x - b # undefined variable

print 6 * c

On the code block above, that function leads to this output.

pychecker
=================
Processing module pychecker63858xo0 (pychecker63858xo0.py)...
  Caught exception importing module pychecker63858xo0:
    File "/Users/jkitchin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pychecker/pcmodules.py", line 540, in setupMainCode()
      module = imp.load_module(self.moduleName, handle, filename, smt)
    File "pychecker63858xo0.py", line 7, in <module>()
      print 6 * c
  NameError: name 'c' is not defined

Warnings...

pychecker63858xo0:1: NOT PROCESSED UNABLE TO IMPORT

pyflakes
=================
pychecker63858xo0.py:5: undefined name 'b'
pychecker63858xo0.py:7: undefined name 'c'

pep8
=================
pychecker63858xo0.py:5:17: E261 at least two spaces before inline comment

That is pretty helpful, but it gives us line numbers we cannot directly access in our code block. We can open the code block in Python mode, and then navigate to them, but that is likely to make the buffer with this information disappear. It would be better if we could just click on a link and go to the right place. Let us explore what we need for that.

We need to parse the output to get the line numbers, and then we can construct org-links to those places in the src block. pyflakes, pep8 and pylint look like the easiest to get. A way to get to the line would be a lisp function that moves to the beginning of the code block, and then moves forward n lines. We will use a regular expression on each line of the output of pyflakes and pep8 to get the line number. We will construct an org-link to go to the source block at the line.
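
The line-number extraction described above can be sketched in Python 3 as follows. The function name `parse_checker_line` is hypothetical; the pattern mirrors the `file:line:rest` shape of the pyflakes and pep8 output shown earlier.

```python
import re


def parse_checker_line(line, filename):
    """Return (line_number, message) for a 'file:N:message' line, else None."""
    pattern = r'^{}:(\d+):(.*)$'.format(re.escape(filename))
    m = re.match(pattern, line)
    if m is None:
        return None
    return int(m.group(1)), m.group(2).strip()


print(parse_checker_line("pychecker63858xo0.py:5: undefined name 'b'",
                         "pychecker63858xo0.py"))
```

For pep8 output the column number stays at the front of the message, since only the first number after the file name is treated as the line number.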

In this long code block, we create a function that will run pyflakes, pep8 and pylint, and create a new buffer with links to the issues it finds. Finally, we apply this as advice on executing org-babel-execute:python so it only runs when we execute a python block in org-mode. This is a long block, because I have made it pretty feature complete.

(defun org-py-check ()
  "Run python check programs on a source block.
Opens a buffer with links to what is found."
  (interactive)
  (let ((eop (org-element-at-point))
        (temporary-file-directory ".")
        (cb (current-buffer))
        (n) ; for line number
        (content) ; error on line
        (pb "*org pycheck*")
        (pyflakes-status nil)
        (link)
        (tempfile))

    (unless (executable-find "pyflakes")
      (error "pyflakes is not installed."))
    
    (unless (executable-find "pep8")
      (error "pep8 not installed"))

    (unless (executable-find "pylint")
      (error "pylint not installed"))

    ;; rm buffer if it exists
    (when (get-buffer pb) (kill-buffer pb))
    
    ;; only run if in a python code-block
    (when (and (eq 'src-block (car eop))
               (string= "python" (org-element-property :language eop)))

      ;; tempfile for the code
      (setq tempfile (make-temp-file "pychecker" nil ".py"))
      ;; create code file
      (with-temp-file tempfile
        (insert (org-element-property :value eop)))
      
      (let ((status (shell-command
                     (format "pyflakes %s" (file-name-nondirectory tempfile))))
            (output (delete "" (split-string
                                (with-current-buffer "*Shell Command Output*"
                                  (buffer-string)) "\n"))))
        (setq pyflakes-status status)
        (kill-buffer "*Shell Command Output*")
        (when output
          (set-buffer (get-buffer-create pb))
          (insert (format "\n* pyflakes output (status=%s)
pyflakes checks your code for errors. You should probably fix all of these.

" status))
          (dolist (line output)
            ;; get the line number
            (if 
                (string-match (format "^%s:\\([0-9]*\\):\\(.*\\)"
                                      (file-name-nondirectory tempfile))
                              line)
                (progn
                  (setq n (match-string 1 line))
                  (setq content (match-string 2 line))
                  (setq link (format "[[elisp:(progn (switch-to-buffer-other-window \"%s\")(goto-char %s)(forward-line %s))][%s]]\n"
                                     cb
                                     (org-element-property :begin eop)
                                     n
                                     (format "Line %s: %s" n content))))
              ;; no match, just insert line
              (setq link (concat line "\n")))
            (insert link))))

      (let ((status (shell-command
                     (format "pep8 %s" (file-name-nondirectory tempfile))))
            (output (delete "" (split-string
                                (with-current-buffer "*Shell Command Output*"
                                  (buffer-string)) "\n"))))
        (kill-buffer "*Shell Command Output*")
        (when output
          (set-buffer (get-buffer-create pb))
          (insert (format "\n\n* pep8 output (status = %s)\n" status))
          (insert "pep8 is the [[http://legacy.python.org/dev/peps/pep-0008][officially recommended style]] for writing Python code. Fixing these will usually make your code more readable and beautiful. Your code will probably run if you do not fix them, but, it will be ugly.

")
          (dolist (line output)
            ;; get the line number
            (if 
                (string-match (format "^%s:\\([0-9]*\\):\\(.*\\)"
                                      (file-name-nondirectory tempfile))
                              line)
                (progn
                  (setq n (match-string 1 line))
                  (setq content (match-string 2 line))
                  (setq link (format "[[elisp:(progn (switch-to-buffer-other-window \"%s\")(goto-char %s)(forward-line %s))][%s]]\n"
                                     cb
                                     (org-element-property :begin eop)
                                     n
                                     (format "Line %s: %s" n content))))
              ;; no match, just insert line
              (setq link (concat line "\n")))
            (insert link))))

      ;; pylint
      (let ((status (shell-command
                     (format "pylint -r no %s" (file-name-nondirectory tempfile))))
            (output (delete "" (split-string
                                (with-current-buffer "*Shell Command Output*"
                                  (buffer-string)) "\n"))))
        (kill-buffer "*Shell Command Output*")
        (when output
          (set-buffer (get-buffer-create pb))
          (insert (format "\n\n* pylint (status = %s)\n" status))
          (insert "pylint checks your code for errors, style and convention. It is complementary to pyflakes and pep8, and usually more detailed.

")

          (dolist (line output)
            ;; pylint gives a line and column number
            (if 
                (string-match "[A-Z]:\\s-+\\([0-9]*\\),\\s-*\\([0-9]*\\):\\(.*\\)"                            
                              line)
                (let ((line-number (match-string 1 line))
                      (column-number (match-string 2 line))
                      (content (match-string 3 line)))
                     
                  (setq link (format "[[elisp:(progn (switch-to-buffer-other-window \"%s\")(goto-char %s)(forward-line %s)(forward-line 0)(forward-char %s))][%s]]\n"
                                     cb
                                     (org-element-property :begin eop)
                                     line-number
                                     column-number
                                     line)))
              ;; no match, just insert line
              (setq link (concat line "\n")))
            (insert link))))
    
      (when (get-buffer pb)
        (switch-to-buffer-other-window pb)
        (goto-char (point-min))
        (insert "Press q to close the window\n")
        (org-mode)       
        (org-cycle '(64))
        ;; make read-only and press q to quit
        (setq buffer-read-only t)
        (use-local-map (copy-keymap org-mode-map))
        (local-set-key "q" #'(lambda () (interactive) (kill-buffer))))

      ;; delete the tempfile first, so it is removed even if we signal an error
      (delete-file tempfile)
      (unless (= 0 pyflakes-status)
        (forward-line 4)
        (error "pyflakes exited non-zero. please fix errors"))
      ;; return to the original buffer
      (switch-to-buffer-other-window cb))))


(defadvice org-babel-execute:python (before pychecker)
  (org-py-check))

(ad-activate 'org-babel-execute:python)
org-babel-execute:python

Now, when I try to run this code block, which has some errors in it:

a = 5  # a variable we do not use


def f(x, y):  # unused argument
    return x - b # undefined

print 6 * c

I get a new buffer with approximately these contents:

Press q to close the window

* pyflakes output (status=1)
pyflakes checks your code for errors. You should probably fix all of these.

Line 5:  undefined name 'b'
Line 7:  undefined name 'c'


* pep8 output (status = 1)
pep8 is the officially recommended style for writing Python code. Fixing these will usually make your code more readable and beautiful. Your code will probably run if you do not fix them, but, it will be ugly.

Line 5: 17: E261 at least two spaces before inline comment


* pylint (status = 22)
pylint checks your code for errors, style and convention. It is complementary to pyflakes and pep8, and usually more detailed.

No config file found, using default configuration
************* Module pychecker68224dkX
C:  1, 0: Invalid module name "pychecker68224dkX" (invalid-name)
C:  1, 0: Missing module docstring (missing-docstring)
C:  1, 0: Invalid constant name "a" (invalid-name)
C:  4, 0: Invalid function name "f" (invalid-name)
C:  4, 0: Invalid argument name "x" (invalid-name)
C:  4, 0: Invalid argument name "y" (invalid-name)
C:  4, 0: Missing function docstring (missing-docstring)
E:  5,15: Undefined variable 'b' (undefined-variable)
W:  4, 9: Unused argument 'y' (unused-argument)
E:  7,10: Undefined variable 'c' (undefined-variable)

Each of those links takes me to either the line, or the position of the error (in the case of pylint)! I have not tested this on more than a handful of code blocks, but it has worked pretty nicely on them so far!
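
The pylint lines carry both a line and a column, which is why those links can jump to the exact position. A minimal Python 3 sketch of that parsing step, assuming the `X: line, column: message` format shown in the output above (the name `parse_pylint_line` is hypothetical):

```python
import re

# message type, line number, column number, then the message text
PYLINT_LINE = re.compile(r'^([A-Z]):\s*(\d+),\s*(\d+):\s*(.*)$')


def parse_pylint_line(line):
    """Return (kind, line, column, message), or None if the line does not match."""
    m = PYLINT_LINE.match(line)
    if m is None:
        return None
    kind, row, col, message = m.groups()
    return kind, int(row), int(col), message


print(parse_pylint_line("E:  5,15: Undefined variable 'b' (undefined-variable)"))
```

Lines that do not match, such as the module banner, return None and can simply be copied into the report unchanged.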

Of course, you must have pyflakes, pep8 and pylint installed. But those are all easily installed with pip as far as I can tell.

Copyright (C) 2014 by John Kitchin. See the License for information about copying.


Org-mode version = 8.2.7c


Generating an atomic stoichiometric matrix

| categories: thermodynamics, python | tags:

In computing thermodynamic properties with species, it is sometimes required to get a matrix that specifies the number of each type of atom in each species. For example, we can create this by hand as follows:

|   | H2O | CO2 | H2 | CO |
|---+-----+-----+----+----|
| H |   2 |   0 |  2 |  0 |
| C |   0 |   1 |  0 |  1 |
| O |   1 |   2 |  0 |  1 |

Here we aim to generate this table from code. Why? 1. We can readily add species to it if we do it right. 2. We are less likely to make mistakes in generation of the table, and if we do, it will be faster to regenerate the table.

We will start with a list of strings that represent the chemical formula of each species. We will need to parse the strings to find the elements, and the number of each. We will use a fairly naive regular expression to parse a chemical formula. Basically, we match a capital letter + an optional lowercase letter, followed by an optional number. Here is a fictitious example to illustrate. Note, this will not work with formulas that have parentheses, or charges.

import re
# \d* (rather than \d?) so that multi-digit counts such as Cu56 are captured
m = re.findall(r'([A-Z][a-z]?)(\d*)', 'ArC2H6Cu56Pd47Co')
print m
[('Ar', ''), ('C', '2'), ('H', '6'), ('Cu', '56'), ('Pd', '47'), ('Co', '')]

Now, we need to loop over the species, and collect all the elements in them. We will just make a list of all of the elements, and then get the set.

import re

# save for future use
cf = re.compile(r'([A-Z][a-z]?)(\d*)')

species = ['H2O', 'CO2', 'H2', 'CO']

all_elements = []

for s in species:
    for el, count in re.findall(cf, s):
        all_elements += [el]

print set(all_elements)
set(['H', 'C', 'O'])

Finally, we can create the table. We need to loop through each element, and then through each species.

import re

# save for future use
cf = re.compile(r'([A-Z][a-z]?)(\d*)')

species = ['H2O', 'CO2', 'H2', 'CO']

all_elements = []

for s in species:
    for el, count in re.findall(cf, s):
        all_elements += [el]

atoms = set(all_elements)

# we put a placeholder in the first row
counts = [[""] + species]
for e in atoms:
    # store the element in the first column
    count = [e]
    for s in species:
        d = dict(re.findall(cf, s))
        n = d.get(e, 0)
        # an element listed without a number appears once
        if n == '':
            n = 1
        count += [int(n)]
    counts += [count]

# this directly returns the array to org-mode
return counts
|   | H2O | CO2 | H2 | CO |
| H |   2 |   0 |  2 |  0 |
| C |   0 |   1 |  0 |  1 |
| O |   1 |   2 |  0 |  1 |

For this simple example it seems like a lot of code. If there were 200 species though, it would be the same code! Only the list of species would be longer. It might be possible to avoid the two sets of looping, if you could represent the stoichiometric matrix as a sparse matrix, i.e. only store non-zero elements. The final comment I have is related to the parsing of the chemical formulas. Here we can only parse simple formulas. To do better than this would require a pretty sophisticated parser, probably built on the grammar of chemical formulas. The example here implements the code above using pyparsing, and could probably be extended to include more complex formulas such as (CH3)3CH.
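
The sparse idea mentioned above could look something like the following Python 3 sketch, which makes a single pass over the species and stores only the non-zero entries in a dictionary keyed by (element, species). The names `FORMULA` and `sparse_atom_counts` are hypothetical, and the same limitation applies: only simple formulas without parentheses or charges are handled.

```python
import re

# capital letter, optional lowercase letter, optional count
FORMULA = re.compile(r'([A-Z][a-z]?)(\d*)')


def sparse_atom_counts(species):
    """Return {(element, species): count} holding only the non-zero entries."""
    counts = {}
    for s in species:
        for el, n in FORMULA.findall(s):
            key = (el, s)
            # a missing number means one atom; repeated elements are summed
            counts[key] = counts.get(key, 0) + (int(n) if n else 1)
    return counts


counts = sparse_atom_counts(['H2O', 'CO2', 'H2', 'CO'])
print(counts.get(('O', 'CO2'), 0))  # 2
print(counts.get(('C', 'H2'), 0))   # 0, because the entry is not stored
```

Looking up a missing key with a default of 0 recovers the full matrix when it is needed.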

Copyright (C) 2014 by John Kitchin. See the License for information about copying.


Org-mode version = 8.2.7c


Finding the maximum power of a photovoltaic device.

| categories: optimization, python | tags:

A photovoltaic device is characterized by a current-voltage relationship. Let us say, for argument's sake, that the relationship is known and defined by

\(i = 0.5 - 0.5 * V^2\)

The voltage is highest when the current is equal to zero, but of course then you get no power. The current is highest when the voltage is zero, i.e. short-circuited, but there is again no power. We seek the highest power condition, which is to find the maximum of \(i V\). This is a constrained optimization. We solve it by creating an objective function that returns the negative of \(i V\), and then finding the minimum.

First, let us examine the i-V relationship.

import matplotlib.pyplot as plt
import numpy as np

V = np.linspace(0, 1)

def i(V):
    return 0.5 - 0.5 * V**2

plt.figure()
plt.plot(V, i(V))
plt.savefig('images/iV.png')
<matplotlib.figure.Figure object at 0x11193ec18>
[<matplotlib.lines.Line2D object at 0x111d43668>]

Now, let us be sure there is a maximum in power.

import matplotlib.pyplot as plt
import numpy as np

V = np.linspace(0, 1)

def i(V):
    return 0.5 - 0.5 * V**2

plt.plot(V, i(V) * V)
plt.savefig('images/P1.png')
[<matplotlib.lines.Line2D object at 0x111d437f0>]

You can see there is in fact a maximum, near V=0.6. We could solve this problem analytically by taking the appropriate derivative and solving it for zero. That still might require solving a nonlinear problem though. We will directly set up and solve the constrained optimization.

from scipy.optimize import fmin_slsqp
import numpy as np
import matplotlib.pyplot as plt

def objective(X):
    i, V = X
    return - i * V

def eqc(X):
    'equality constraint'
    i, V = X
    return (0.5 - 0.5 * V**2) - i

X0 = [0.2, 0.6]
X = fmin_slsqp(objective, X0, eqcons=[eqc])

imax, Vmax = X


V = np.linspace(0, 1)

def i(V):
    return 0.5 - 0.5 * V**2

plt.plot(V, i(V), Vmax, imax, 'ro')
plt.savefig('images/P2.png')
Optimization terminated successfully.    (Exit mode 0)
            Current function value: -0.192450127337
            Iterations: 5
            Function evaluations: 20
            Gradient evaluations: 5
[<matplotlib.lines.Line2D object at 0x111946470>, <matplotlib.lines.Line2D object at 0x11192c518>]

You can see the maximum power is approximately 0.2 (unspecified units), at the conditions indicated by the red dot in the figure above.
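
As a check on the numerical result, the analytical route mentioned earlier happens to be tractable for this particular i-V relationship. A short derivation, assuming the same \(i = 0.5 - 0.5 V^2\) and writing \(P\) for the power:

```latex
\[ P(V) = i V = 0.5 V - 0.5 V^3 \]
\[ \frac{dP}{dV} = 0.5 - 1.5 V^2 = 0 \quad\Rightarrow\quad V = \frac{1}{\sqrt{3}} \approx 0.577 \]
\[ \frac{d^2P}{dV^2} = -3 V < 0 \quad \text{for } V > 0, \text{ so this is a maximum} \]
\[ i = 0.5 - 0.5 \cdot \frac{1}{3} = \frac{1}{3}, \qquad P = \frac{1}{3\sqrt{3}} \approx 0.1925 \]
```

This agrees with the function value of -0.19245 reported by the optimizer (negative because the objective is \(-i V\)).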

Copyright (C) 2016 by John Kitchin. See the License for information about copying.


Org-mode version = 8.2.10


Scheduling tasks on a rotating semester basis

| categories: python | tags:

Let us say we have a list of tasks labeled a through k. We want to schedule these tasks on a rotating basis, so that some tasks are done in even years and some tasks are done in odd years. Within those years, some tasks are done in the Fall, and some are done in the Spring. This post explores how to code those tasks so we can figure out which tasks should be done in some part of some year.

We break the problem down like this. A year is an even year if mod(year,2)=0, and it is odd if mod(year,2)=1. So for a year, we have a bit of information. Now, since there are two times of the year we will do the tasks, we can assign this as another bit, e.g. FALL=0, and SPRING=1. Now, we have the following possibilities:

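| year | time period | binary code | decimal number |
|------+-------------+-------------+----------------|
| 2013 | Fall        |          10 |              2 |
| 2014 | Spring      |          01 |              1 |
| 2014 | Fall        |          00 |              0 |
| 2015 | Spring      |          11 |              3 |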

And then the cycle will repeat. So, if we code each task with an integer of 0, 1, 2 or 3, we can say in a given year and time period whether a task should be completed. If 2 * mod(year, 2) + period_code is equal to the code on the task, then it should be executed.
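
The encoding can be checked with a few lines of Python 3. This is only a sketch to illustrate the formula; the name `period_code` is hypothetical.

```python
FALL, SPRING = 0, 1


def period_code(year, semester):
    """Combine the year parity (high bit) and the semester (low bit)."""
    return 2 * (year % 2) + semester


# reproduce the four combinations listed above
for year, semester in [(2013, FALL), (2014, SPRING), (2014, FALL), (2015, SPRING)]:
    code = period_code(year, semester)
    print(year, 'Fall' if semester == FALL else 'Spring', format(code, '02b'), code)
```

Each year and semester pair maps to exactly one of the four codes, so a task with a given code comes up once every two years.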

Now, we need to start the task sequence. Let us say we start in the Fall of 2013. That is an odd year, so year % 2 = 1, and we use a tag of 0 to represent the Fall semester, giving an overall binary code of 10 which is equal to 2, so all tasks labeled 2 should be executed.

We will assign the codes to each task by enumerating a string of letters, and giving the task a code of mod(letter index, 4). That will loop through the tasks assigning codes of 0, 1, 2 or 3 to each task.

So to schedule these we will loop through a list of years, calculate the code for each year and time period, and then filter the list of tasks with that code.

tasks = [(letter, i % 4) for i,letter in enumerate('abcdefghijk')]

print 'tasks = ',tasks

SEMESTERS = (('FALL',0), ('SPRING',1))

for year in [2013, 2014, 2015, 2016, 2017, 2018]:
    for semester,i in SEMESTERS:
        N = 2 * (year % 2) + i
        print '{0} {1:8s}: {2}'.format(year, semester,
                                    [x[0] for x in 
                                     filter(lambda x: x[1]==N,
                                            tasks)])
tasks =  [('a', 0), ('b', 1), ('c', 2), ('d', 3), ('e', 0), ('f', 1), ('g', 2), ('h', 3), ('i', 0), ('j', 1), ('k', 2)]
2013 FALL    : ['c', 'g', 'k']
2013 SPRING  : ['d', 'h']
2014 FALL    : ['a', 'e', 'i']
2014 SPRING  : ['b', 'f', 'j']
2015 FALL    : ['c', 'g', 'k']
2015 SPRING  : ['d', 'h']
2016 FALL    : ['a', 'e', 'i']
2016 SPRING  : ['b', 'f', 'j']
2017 FALL    : ['c', 'g', 'k']
2017 SPRING  : ['d', 'h']
2018 FALL    : ['a', 'e', 'i']
2018 SPRING  : ['b', 'f', 'j']

This leads to each task being completed every other year. We could also write a function and filter by list comprehension.

tasks = [(letter, i % 4) for i,letter in enumerate('abcdefghijk')]

FALL = 0
SPRING = 1

def execute_p(year, semester, task):
    'year is an integer, semester is 0 for fall, 1 for spring, task is a tuple of (label,code)'
    N = 2 * (year % 2) + semester
    return task[1] == N

YEAR, SEMESTER = 2018, FALL
print '{0} {1:8s}: {2}'.format(YEAR,
                               'FALL' if SEMESTER == FALL else 'SPRING',
                               [task[0] for task in tasks
                                if execute_p(YEAR, SEMESTER, task)])
2018 FALL    : ['a', 'e', 'i']

Now, at any point in the future you can tell what tasks should be done!

Copyright (C) 2014 by John Kitchin. See the License for information about copying.


Org-mode version = 8.2.5h
