Abbreviated journal names in bibtex

| categories: bibtex | tags:

Some journals require abbreviated journal names in the bibliography, and some require full names. Unfortunately, it is not possible to have both in your bibtex file. Or is it…

It is possible to define a @string that is replaced in your bibtex file. If we have the definition of the @string in a separate file, we can specify its definition there, e.g. as an abbreviation, or as the full name. To make this useful, we need a simple way to add new journals, and to generate the definitions.

First, you can find accepted journal name abbreviations here: http://cassi.cas.org/search.jsp .

We are going to define a variable to hold the string definition, journal full name and an abbreviation. You can find our production version of what follows here: https://github.com/jkitchin/jmax/blob/master/jmax-bibtex.el

(defvar bibtex-abbreviations
  '(("ACAT" "ACS Catalysis" "ACS Catal.")
    ("AM" "Acta Materialia" "Acta Mater.")
    ("AMM" "Acta Metallurgica et Materialia" "Acta Metall. Mater.")
    ("AMiner" "American Mineralogist" "Am. Mineral.")
    ("AngC" "Angewandte Chemie-International Edition" "Angew. Chem. Int. Edit.")
    ("APLM" "APL Materials" "APL Mat.")
    ("ACBE" "Applied Catalysis B: Environmental" "Appl. Catal. B-Environ.")
    ("APL" "Applied Physics Letters" "Appl. Phys. Lett.")
    ("ASS" "Applied Surface Science" "Appl. Surf. Sci.")
    ("CL" "Catalysis Letters" "Catal. Lett.")
    ("CT" "Catalysis Today" "Catal. Today")
    ("CPL" "Chemical Physics Letters" "Chem. Phys. Lett.")
    ("CR" "Chemical Reviews" "Chem. Rev.")
    ("CSR" "Chemical Society Reviews" "Chem. Soc. Rev.")
    ("CM" "Chemistry of Materials" "Chem. Mater.")
    ("CSA" "Colloids and Surfaces, A: Physicochemical and Engineering Aspects" "Colloids Surf., A")
    ("CPMS" "Computational Materials Science" "Comp. Mater. Sci.")
    ("CPC" "Computer Physics Communications" "Comput. Phys. Commun.")
    ("CGD" "Crystal Growth \\& Design" "Cryst. Growth Des.")
    ("CEC" "CrystEngComm" "CrystEngComm")
    ("ECST" "ECS Transactions" "ECS Trans.")
    ("EES" "Energy \\& Environmental Science" "Energy Environ. Sci.")
    ("HPR" "High Pressure Research" "High Pressure Res.")
    ("IC" "Inorganic Chemistry" "Inorg. Chem.")
    ("IECR" "Industrial \\& Engineering Chemistry Research" "Ind. Eng. Chem. Res.")
    ("JJAP" "Japanese Journal of Applied Physics" "Jpn. J. Appl. Phys.")
    ("JMatR" "Journal of Materials Research" "J. Mater. Res.")
    ("JALC" "Journal of Alloys and Compounds" "J. Alloy Compd.")
    ("JAC" "Journal of Applied Crystallography" "J. Appl. Crystallogr.")
    ("JAP" "Journal of Applied Physics" "J. Appl. Phys.")
    ("JC" "Journal of Catalysis" "J. Catal.")
    ("JCP" "Journal of Chemical Physics" "J. Chem. Phys.")
    ("JCG" "Journal of Crystal Growth" "J. Cryst. Growth")
    ("JMC" "Journal of Materials Chemistry" "J. Mater. Chem.")
    ("JMSL" "Journal of Materials Science Letters" "J. Mater. Sci. Lett.")
    ("JMS" "Journal of Membrane Science" "J. Membr. Sci.")
    ("JPE" "Journal of Phase Equilibria" "J. Phase Equilib.")
    ("JPCS" "Journal of Physics and Chemistry of Solids" "J. Phys. Chem. Solids")
    ("JPCM" "Journal of Physics: Condensed Matter" "J. Phys.: Condens. Matter")
    ("JSSC" "Journal of Solid State Chemistry" "J. Solid State Chem.")
    ("JACerS" "Journal of the American Ceramic Society" "J. Am. Ceram. Soc.")
    ("JACS" "Journal of the American Chemical Society" "J. Am. Chem. Soc.")
    ("JES" "Journal of The Electrochemical Society" "J. Electrochem. Soc.")
    ("JVST" "Journal of Vacuum Science \\& Technology A" "J. Vac. Sci. Technol. A")
    ("ML" "Materials Letters" "Mater. Lett.")
    ("MSE-BS" "Materials Science and Engineering B" "Mat. Sci. Eng. B-Solid")
    ("MOLSIM" "Molecular Simulation" "Mol. Sim.")
    ("Nature" "Nature" "Nature")
    ("NM" "Nature Materials" "Nat. Mater.")
    ("PML" "Philosophical Magazine Letters" "Phil. Mag. Lett.")
    ("PMA" "Philosophical Magazine A" "Phil. Mag. A")
    ("PA" "Physica A: Statistical Mechanics and its Applications" "Physica A")
    ("PB" "Physica B-Condensed Matter" "Physica B")
    ("PCCP" "Physical Chemistry Chemical Physics" "Phys. Chem. Chem. Phys.")
    ("PSSB" "physica status solidi (b)" "Phys. Status Solidi B")
    ("PRA" "Physical Review A" "Phys. Rev. A")
    ("PRB" "Physical Review B" "Phys. Rev. B")
    ("PRL" "Physical Review Letters" "Phys. Rev. Lett.")
    ("PCM" "Physics and Chemistry of Minerals" "Phys. Chem. Miner.")
    ("PSurfSci" "Progress in Surface Science" "Prog. Surf. Sci.")
    ("Science" "Science" "Science")
    ("SABC" "Sensors and Actuators B: Chemical" "Sensor. Actuat. B-Chem.")
    ("SS" "Surface Science" "Surf. Sci.")
    ("EPJB" "The European Physical Journal B" "Eur. Phys. J. B")
    ("JPC" "The Journal of Physical Chemistry" "J. Phys. Chem.")
    ("JPCB" "The Journal of Physical Chemistry B" "J. Phys. Chem. B")
    ("JPCC" "The Journal of Physical Chemistry C" "J. Phys. Chem. C")
    ("JCP" "The Journal of Chemical Physics" "J. Chem. Phys.")
    ("TSF" "Thin Solid Films" "Thin Solid Films")
    ("TC" "Topics in Catalysis" "Top. Catal.")
    ("WR" "Water Research" "Water Res."))
  "List of (string journal-full-name journal-abbreviation)")
bibtex-abbreviations

This data structure will serve a few purposes.

  1. We will generate the bib files that define the @string definitions
  2. We will use it to modify bibtex files to use those strings.

First, here are some simple functions to generate the @string definitions.

(defun jmax-bibtex-generate-longtitles ()
  (interactive)
  (with-temp-file "longtitles.bib"
    (dolist (row bibtex-abbreviations)
      (insert (format "@string{%s=\"%s\"}\n"
                      (nth 0 row)
                      (nth 1 row))))))

(defun jmax-bibtex-generate-shorttitles ()
  (interactive)
  (with-temp-file "shorttitles.bib"
    (dolist (row bibtex-abbreviations)
      (insert (format "@string{%s=\"%s\"}\n"
                      (nth 0 row)
                      (nth 2 row))))))
jmax-bibtex-generate-shorttitles
(jmax-bibtex-generate-longtitles)
(jmax-bibtex-generate-shorttitles)

Here are the results of running that code: shorttitles.bib and longtitles.bib . This is the first step. We have the @strings defined. Now, we need to convert the names in a bibtex entry to use our string. We want to replace full names and abbreviated names with the @string.

(defun jmax-stringify-journal-name (&optional key start end)
  "Replace the journal name in the entry at point with a @string.
The strings are defined in `bibtex-abbreviations'."
  (interactive)
  (bibtex-beginning-of-entry)
  (when
      (string= "article"
               (downcase
                (cdr (assoc "=type=" (bibtex-parse-entry)))))
    (let* ((full-names (mapcar
                        (lambda (row)
                          (cons  (nth 1 row) (nth 0 row)))
                        bibtex-abbreviations))
           (abbrev-names (mapcar
                          (lambda (row)
                            (cons  (nth 2 row) (nth 0 row)))
                          bibtex-abbreviations))
           (journal (s-trim (bibtex-autokey-get-field "journal")))
           (bstring (or
                     (cdr (assoc journal full-names))
                     (cdr (assoc journal abbrev-names)))))
      (when bstring
        (bibtex-set-field "journal" bstring t)
        (bibtex-fill-entry)))))
jmax-stringify-journal-name
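
The lookup at the heart of that function is easy to try outside Emacs. Here is a minimal Python sketch of the same idea, using three rows copied from the table above. The name stringify_journal is made up for this illustration and is not part of jmax-bibtex.el.

```python
# Three rows copied from the abbreviation table above:
# (bibtex string, full journal name, abbreviated name)
ABBREVIATIONS = [
    ("JACS", "Journal of the American Chemical Society", "J. Am. Chem. Soc."),
    ("PRB", "Physical Review B", "Phys. Rev. B"),
    ("SS", "Surface Science", "Surf. Sci."),
]

# Map both the full name and the abbreviation to the same string key.
FULL_NAMES = {full: key for key, full, _ in ABBREVIATIONS}
ABBREV_NAMES = {abbrev: key for key, _, abbrev in ABBREVIATIONS}


def stringify_journal(journal):
    """Return the @string key for a journal field, or None if unknown."""
    name = journal.strip()
    return FULL_NAMES.get(name) or ABBREV_NAMES.get(name)


print(stringify_journal("Physical Review B"))
print(stringify_journal(" Phys. Rev. B "))
print(stringify_journal("Unknown Journal"))
```

A journal that is not in the table returns None, which corresponds to the elisp function leaving the entry untouched.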

Now, with a single command, we can convert this:

@article{lizzit-2001-surfac-ru,
  author =       {S. Lizzit and A. Baraldi and A. Groso and K. Reuter
                  and M. Ganduglia-Pirovano and C. Stampfl and
                  M. Scheffler and M. Stichler and C. Keller and
                  W. Wurth and D. Menzel},
  title =        {Surface Core-level Shifts of Clean and
                  Oxygen-covered {R}u(0001)},
  journal =      {Physical Review B},
  volume =       63,
  number =       20,
  pages =        {nil},
  year =         2001,
  doi =          {10.1103/physrevb.63.205419},
  url =          {https://doi.org/10.1103/PhysRevB.63.205419},
  month =        5,
}

into this:

@article{lizzit-2001-surfac-ru,
  author =       {S. Lizzit and A. Baraldi and A. Groso and K. Reuter
                  and M. Ganduglia-Pirovano and C. Stampfl and
                  M. Scheffler and M. Stichler and C. Keller and
                  W. Wurth and D. Menzel},
  title =        {Surface Core-level Shifts of Clean and
                  Oxygen-covered {R}u(0001)},
  journal =      PRB,
  volume =       63,
  number =       20,
  pages =        {nil},
  year =         2001,
  doi =          {10.1103/physrevb.63.205419},
  url =          {https://doi.org/10.1103/PhysRevB.63.205419},
  month =        5,
}

If you have a lot of entries you want to modify, you can use bibtex-map-entries like this. Put the elisp form in a comment in the bibtex file, then execute it with C-x C-e.

%% (bibtex-map-entries 'jmax-stringify-journal-name)  <- put cursor here. C-x C-e

This saves some effort. Over time, I will keep adding entries to the abbreviation table. As long as a standard journal name or abbreviation is in your bibtex file, this approach should work pretty well. After you replace the journal names with @string entries, you have to generate the string file, either shorttitles.bib or longtitles.bib, and in your LaTeX file, change your bibliography line to:

\bibliography{shorttitles,references}

The order is important. The @string definitions are in shorttitles.bib, and your bibtex entries in references.bib.
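
To see why swapping one file name switches the whole bibliography between styles, here is a small Python sketch of the generation step for two rows of the table. The helper string_definitions is hypothetical; it only mimics what the two elisp functions above write to longtitles.bib and shorttitles.bib.

```python
# Two rows from the table: (bibtex string, full name, abbreviation)
ROWS = [
    ("PRB", "Physical Review B", "Phys. Rev. B"),
    ("SS", "Surface Science", "Surf. Sci."),
]


def string_definitions(rows, column):
    """Build @string lines; column 1 is the full name, 2 the abbreviation."""
    return "".join('@string{%s="%s"}\n' % (row[0], row[column]) for row in rows)


long_titles = string_definitions(ROWS, 1)    # contents for longtitles.bib
short_titles = string_definitions(ROWS, 2)   # contents for shorttitles.bib
print(long_titles)
print(short_titles)
```

An entry with journal = PRB stays the same in both cases. Only the definition that BibTeX reads first changes.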

Copyright (C) 2014 by John Kitchin. See the License for information about copying.


Org-mode version = 8.2.7c


Editing org-mode python source blocks in an external editor (Canopy)

| categories: orgmode, python | tags:

Continuing on the last post about leveraging org-mode and python syntax checkers, here we consider using (heresy alert…) an external editor for Python src blocks in org-mode. Why would we consider such insanity? Because, for beginners, environments such as Canopy are (IMHO) easier to use, and better than anything I have used in Emacs. And, I still want the framework of org-mode for content, just a better Python code writing environment.

This problem has some interesting challenges. I would like a command that opens a code block with its contents in the Canopy editor, or that creates a code block if needed. We need to figure out that context based on the cursor position. We will use the same temporary file strategy as used before, so Canopy has something to read and save to. We need to wait for Canopy to finish, which will be tricky because it returns as soon as you run it. Finally, I want the code block to run after it is put back in the org-file, so that the results are captured.

This code block implements the idea, and the comments in the code explain what each section is doing.

(defun edit-in-canopy ()
  (interactive)
  (let* ((eop (org-element-at-point))
         ;; use current directory for temp file so relative paths work
         (temporary-file-directory ".")
         (tempfile))

    ;; create a tempfile. 
    (setq tempfile (make-temp-file "canopy" nil ".py"))

    ;; figure out what to do
    (when
        ;; in an existing source block. we want to edit it.
        (and (eq 'src-block (car eop))
             (string= "python" (org-element-property :language eop)))
          
      ;; put code into tempfile
      (with-temp-file tempfile
        (insert (org-element-property :value eop))))

    ;; open tempfile in canopy
    (shell-command (concat "canopy " tempfile))
    (sleep-for 2) ;; startup time. canopy is slow to showup in
                  ;; ps. This gives it some time to do that. Canopy
                  ;; returns right away, so we sleep while there is
                  ;; evidence that it is open. We get that evidence
                  ;; from ps by searching for canopy.app.main, which
                  ;; seems to exist in the output while Canopy is
                  ;; open.
    (while
        (string-match "canopy\\.app\\.main"
                      (shell-command-to-string "ps aux"))
      ;; pause a while, then check again.
      (sleep-for 1))

    ;; Canopy has closed, so we get the new script contents
    (let ((new-contents (with-temp-buffer
                          (insert-file-contents tempfile)
                          (buffer-string))))
      (cond
       ;; replace existing code block contents
       ((and (eq 'src-block (car eop))
             (string= "python" (org-element-property :language eop)))
        (goto-char (org-element-property :begin eop))
        (search-forward (org-element-property :value eop))
        (replace-match (concat new-contents "\n")))
       ;; create new code block
       (t
        (insert
         (format "\n#+BEGIN_SRC python
%s
#+END_SRC
" new-contents))
        ;; go into new block so we can run it. forward-line is preferred
        ;; over the interactive-only previous-line in lisp code.
        (forward-line -2))))

    ;; delete the tempfile so they do not accumulate
    (delete-file tempfile)
    ;; and run the new block to get the results
    (org-babel-execute-src-block)))
edit-in-canopy

That seems to work. It is difficult to tell from this post that the function works as advertised, but you can see it in action here: http://www.youtube.com/watch?v=-noKrT1dfFE .
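
The waiting logic is the trickiest part, so here is a rough Python sketch of just the polling loop. It assumes a Unix-like system where ps aux is available, and the function names are invented for this illustration. The demonstration feeds in canned process listings, so it runs without Canopy.

```python
import re
import subprocess
import time


def ps_output():
    """Return the output of `ps aux` as text (Unix-like systems only)."""
    return subprocess.check_output(["ps", "aux"]).decode(errors="replace")


def wait_until_gone(pattern, list_processes=ps_output, poll_seconds=1.0,
                    sleep=time.sleep):
    """Poll until no process listing matches pattern; return the poll count."""
    regex = re.compile(pattern)
    polls = 0
    while regex.search(list_processes()):
        # the program is still running: pause a while, then check again
        sleep(poll_seconds)
        polls += 1
    return polls


# Demonstration with canned listings instead of a real Canopy process.
listings = iter(["python canopy.app.main", "python canopy.app.main", "bash"])
polls = wait_until_gone(r"canopy\.app\.main",
                        list_processes=lambda: next(listings),
                        sleep=lambda seconds: None)
print(polls)
```

Passing the process lister as an argument keeps the loop testable, and it would also be the place to plug in a different command on a platform without ps.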

from scipy.integrate import odeint


def dydx(y, x):
    k = 1
    return -k * y

print odeint(dydx, 1, [0, 1])

import numpy as np
print np.exp(-1)
[[ 1.        ]
 [ 0.36787947]]
0.367879441171

We created this code block externally.

print 'hello'
hello

1 Summary thoughts

Opening Canopy is a little slow (and that is coming from someone who opens Emacs ;). But once it is open, it is pretty nice for writing code, with the interactive IPython console and integrated help. Yes, it is probably possible to get Emacs to do that too, and maybe it will do that one day. Canopy does it today.

Unfortunately, this code will most likely not work on Windows, since it relies on the ps program. Windows does have a similar tasklist command, but Canopy seems to show up in its output as pythonw, which is not very specific.


Improved debugging of Python code blocks in org-mode

| categories: orgmode, python | tags:

Writing and running code blocks in org-mode is awesome, when it works. I find as the code blocks get past a certain size though, it can be tedious to debug, especially for new users. Since I am teaching 59 students to use Python in org-mode, I see this issue a lot! They lack experience to avoid many simple errors, and to find and fix them. Even in my hands, I do not always want to be switching to Python mode to run and debug blocks.

org-mode src-blocks offer a unique challenge for the usual tools like pylint and pychecker, because the code does not exist in a file. In this post, I will explore developing some functions that do syntax checking on a src block. We will use a simple method which will write the block to a temporary file, and do the checking on that file. Then, we will create temporary buffers with the output.

Here is the first idea. We create a temp file in the working directory, write the code to it, and run pychecker, pyflakes and pep8 on the file.

(defun org-pychecker ()
  "Run pychecker, pyflakes and pep8 on the source block at point."
  (interactive)
  (let ((eop (org-element-at-point))
        (temporary-file-directory ".")
        (tempfile))
    (when (and (eq 'src-block (car eop))
               (string= "python" (org-element-property :language eop)))
      (setq tempfile (make-temp-file "pychecker" nil ".py"))
      ;; create code file
      (with-temp-file tempfile
        (insert (org-element-property :value eop)))
      (switch-to-buffer "*pychecker*")
      (erase-buffer)
      (insert "pychecker\n=================\n")
      (insert
       (shell-command-to-string (format "pychecker %s" (file-name-nondirectory tempfile))))
      (insert "\npyflakes\n=================\n")
      (insert
       (shell-command-to-string (format "pyflakes %s" (file-name-nondirectory tempfile))))
      (insert "\npep8\n=================\n")
      (insert
       (shell-command-to-string (format "pep8 %s" (file-name-nondirectory tempfile))))
      (delete-file tempfile))))
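
The write-to-a-tempfile-then-check pattern is not specific to Emacs. Below is a minimal Python sketch of it. The function check_code is a made-up name, and py_compile (which ships with Python and only catches syntax errors) stands in for pyflakes so the example runs without extra packages. Passing ["pyflakes"] as the command instead assumes that pyflakes is on your path.

```python
import os
import subprocess
import sys
import tempfile


def check_code(code, command):
    """Write code to a temporary .py file, run command on it, return results.

    command is a list such as ["pyflakes"]; the file name is appended to it.
    Returns (exit status, combined stdout and stderr as text).
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "block.py")
        with open(path, "w") as f:
            f.write(code)
        proc = subprocess.run(command + [path], stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT)
        # the directory and the file in it are removed when the block exits
        return proc.returncode, proc.stdout.decode(errors="replace")


# py_compile stands in for a real checker here.
checker = [sys.executable, "-m", "py_compile"]
ok_status, _ = check_code("a = 5\n", checker)
bad_status, bad_output = check_code("def f(:\n", checker)
print(ok_status, bad_status)
```

The temporary directory takes care of cleanup, which the elisp version does by hand with delete-file.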

Here is a sample code block with some errors in it.

a = 5  # a variable we do not use


def f(x, y):  # unused argument
    return x - b # undefined variable

print 6 * c

On the code block above, that function leads to this output.

pychecker
=================
Processing module pychecker63858xo0 (pychecker63858xo0.py)...
  Caught exception importing module pychecker63858xo0:
    File "/Users/jkitchin/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pychecker/pcmodules.py", line 540, in setupMainCode()
      module = imp.load_module(self.moduleName, handle, filename, smt)
    File "pychecker63858xo0.py", line 7, in <module>()
      print 6 * c
  NameError: name 'c' is not defined

Warnings...

pychecker63858xo0:1: NOT PROCESSED UNABLE TO IMPORT

pyflakes
=================
pychecker63858xo0.py:5: undefined name 'b'
pychecker63858xo0.py:7: undefined name 'c'

pep8
=================
pychecker63858xo0.py:5:17: E261 at least two spaces before inline comment

That is pretty helpful, but it gives us line numbers we cannot directly access in our code block. We can open the code block in Python mode, and then navigate to them, but that is likely to make the buffer with this information disappear. It would be better if we could just click on a link and go to the right place. Let us explore what we need for that.

We need to parse the output to get the line numbers, and then we can construct org-links to those places in the src block. pyflakes, pep8 and pylint look like the easiest to get. A way to get to the line would be a lisp function that moves to the beginning of the code block, and then moves forward n lines. We will use a regular expression on each line of the output of pyflakes and pep8 to get the line number. We will construct an org-link to go to the source block at the line.
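
As a rough illustration of the parsing step, here is a Python sketch of regular expressions that pull the line number (and, for pylint, the column) out of output lines. The pattern and function names are my own for this sketch, and the formats are assumed from the sample output in this post; other versions of these tools may print something different.

```python
import re

# pyflakes and pep8 lines look like "name.py:5: message"
FILE_LINE = re.compile(r"^[^:]+:(\d+):(.*)")
# pylint lines look like "E:  5,15: message"
PYLINT_LINE = re.compile(r"^[A-Z]:\s*(\d+),\s*(\d+):(.*)")


def parse_line(line):
    """Return (line_number, column or None, message), or None if no match."""
    m = PYLINT_LINE.match(line)
    if m:
        return int(m.group(1)), int(m.group(2)), m.group(3).strip()
    m = FILE_LINE.match(line)
    if m:
        return int(m.group(1)), None, m.group(2).strip()
    return None


print(parse_line("pychecker63858xo0.py:5: undefined name 'b'"))
print(parse_line("E:  5,15: Undefined variable 'b' (undefined-variable)"))
print(parse_line("Warnings..."))
```

Lines that do not match either pattern return None, so they can be passed through to the output buffer unchanged.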

In this long code block, we create a function that will run pyflakes, pep8 and pylint, and create a new buffer with links to the issues it finds. Finally, we apply this as advice on executing org-babel-execute:python so it only runs when we execute a python block in org-mode. This is a long block, because I have made it pretty feature complete.

(defun org-py-check ()
  "Run python check programs on a source block.
Opens a buffer with links to what is found."
  (interactive)
  (let ((eop (org-element-at-point))
        (temporary-file-directory ".")
        (cb (current-buffer))
        (n) ; for line number
        (content) ; error on line
        (pb "*org pycheck*")
        (pyflakes-status nil)
        (link)
        (tempfile))

    (unless (executable-find "pyflakes")
      (error "pyflakes is not installed."))
    
    (unless (executable-find "pep8")
      (error "pep8 not installed"))

    (unless (executable-find "pylint")
      (error "pylint not installed"))

    ;; rm buffer if it exists
    (when (get-buffer pb) (kill-buffer pb))
    
    ;; only run if in a python code-block
    (when (and (eq 'src-block (car eop))
               (string= "python" (org-element-property :language eop)))

      ;; tempfile for the code
      (setq tempfile (make-temp-file "pychecker" nil ".py"))
      ;; create code file
      (with-temp-file tempfile
        (insert (org-element-property :value eop)))
      
      (let ((status (shell-command
                     (format "pyflakes %s" (file-name-nondirectory tempfile))))
            (output (delete "" (split-string
                                (with-current-buffer "*Shell Command Output*"
                                  (buffer-string)) "\n"))))
        (setq pyflakes-status status)
        (kill-buffer "*Shell Command Output*")
        (when output
          (set-buffer (get-buffer-create pb))
          (insert (format "\n* pyflakes output (status=%s)
pyflakes checks your code for errors. You should probably fix all of these.

" status))
          (dolist (line output)
            ;; get the line number
            (if 
                (string-match (format "^%s:\\([0-9]*\\):\\(.*\\)"
                                      (file-name-nondirectory tempfile))
                              line)
                (progn
                  (setq n (match-string 1 line))
                  (setq content (match-string 2 line))
                  (setq link (format "[[elisp:(progn (switch-to-buffer-other-window \"%s\")(goto-char %s)(forward-line %s))][%s]]\n"
                                     cb
                                     (org-element-property :begin eop)
                                     n
                                     (format "Line %s: %s" n content))))
              ;; no match, just insert line
              (setq link (concat line "\n")))
            (insert link))))

      (let ((status (shell-command
                     (format "pep8 %s" (file-name-nondirectory tempfile))))
            (output (delete "" (split-string
                                (with-current-buffer "*Shell Command Output*"
                                  (buffer-string)) "\n"))))
        (kill-buffer "*Shell Command Output*")
        (when output
          (set-buffer (get-buffer-create pb))
          (insert (format "\n\n* pep8 output (status = %s)\n" status))
          (insert "pep8 is the [[http://legacy.python.org/dev/peps/pep-0008][officially recommended style]] for writing Python code. Fixing these will usually make your code more readable and beautiful. Your code will probably run if you do not fix them, but, it will be ugly.

")
          (dolist (line output)
            ;; get the line number
            (if 
                (string-match (format "^%s:\\([0-9]*\\):\\(.*\\)"
                                      (file-name-nondirectory tempfile))
                              line)
                (progn
                  (setq n (match-string 1 line))
                  (setq content (match-string 2 line))
                  (setq link (format "[[elisp:(progn (switch-to-buffer-other-window \"%s\")(goto-char %s)(forward-line %s))][%s]]\n"
                                     cb
                                     (org-element-property :begin eop)
                                     n
                                     (format "Line %s: %s" n content))))
              ;; no match, just insert line
              (setq link (concat line "\n")))
            (insert link))))

      ;; pylint
      (let ((status (shell-command
                     (format "pylint -r no %s" (file-name-nondirectory tempfile))))
            (output (delete "" (split-string
                                (with-current-buffer "*Shell Command Output*"
                                  (buffer-string)) "\n"))))
        (kill-buffer "*Shell Command Output*")
        (when output
          (set-buffer (get-buffer-create pb))
          (insert (format "\n\n* pylint (status = %s)\n" status))
          (insert "pylint checks your code for errors, style and convention. It is complementary to pyflakes and pep8, and usually more detailed.

")

          (dolist (line output)
            ;; pylint gives a line and column number
            (if 
                (string-match "[A-Z]:\\s-+\\([0-9]*\\),\\s-*\\([0-9]*\\):\\(.*\\)"                            
                              line)
                (let ((line-number (match-string 1 line))
                      (column-number (match-string 2 line))
                      (content (match-string 3 line)))
                     
                  (setq link (format "[[elisp:(progn (switch-to-buffer-other-window \"%s\")(goto-char %s)(forward-line %s)(forward-line 0)(forward-char %s))][%s]]\n"
                                     cb
                                     (org-element-property :begin eop)
                                     line-number
                                     column-number
                                     line)))
              ;; no match, just insert line
              (setq link (concat line "\n")))
            (insert link))))
    
      (when (get-buffer pb)
        (switch-to-buffer-other-window pb)
        (goto-char (point-min))
        (insert "Press q to close the window\n")
        (org-mode)       
        (org-cycle '(64))
        ;; make read-only and press q to quit
        (setq buffer-read-only t)
        (use-local-map (copy-keymap org-mode-map))
        (local-set-key "q" #'(lambda () (interactive) (kill-buffer))))

      ;; delete the tempfile first, so it does not linger if we
      ;; signal an error below
      (delete-file tempfile)
      (unless (= 0 pyflakes-status)
        (forward-line 4)
        (error "pyflakes exited non-zero. please fix errors"))
      (switch-to-buffer-other-window cb))))


(defadvice org-babel-execute:python (before pychecker)
  (org-py-check))

(ad-activate 'org-babel-execute:python)
org-babel-execute:python

Now, when I try to run this code block, which has some errors in it:

a = 5  # a variable we do not use


def f(x, y):  # unused argument
    return x - b # undefined

print 6 * c

I get a new buffer with approximately these contents:

Press q to close the window

* pyflakes output (status=1)
pyflakes checks your code for errors. You should probably fix all of these.

Line 5:  undefined name 'b'
Line 7:  undefined name 'c'


* pep8 output (status = 1)
pep8 is the officially recommended style for writing Python code. Fixing these will usually make your code more readable and beautiful. Your code will probably run if you do not fix them, but, it will be ugly.

Line 5: 17: E261 at least two spaces before inline comment


* pylint (status = 22)
pylint checks your code for errors, style and convention. It is complementary to pyflakes and pep8, and usually more detailed.

No config file found, using default configuration
************* Module pychecker68224dkX
C:  1, 0: Invalid module name "pychecker68224dkX" (invalid-name)
C:  1, 0: Missing module docstring (missing-docstring)
C:  1, 0: Invalid constant name "a" (invalid-name)
C:  4, 0: Invalid function name "f" (invalid-name)
C:  4, 0: Invalid argument name "x" (invalid-name)
C:  4, 0: Invalid argument name "y" (invalid-name)
C:  4, 0: Missing function docstring (missing-docstring)
E:  5,15: Undefined variable 'b' (undefined-variable)
W:  4, 9: Unused argument 'y' (unused-argument)
E:  7,10: Undefined variable 'c' (undefined-variable)

Each of those links takes me to either the line, or the position of the error (in the case of pylint)! I have not tested this on more than a handful of code blocks, but it has worked pretty nicely on them so far!

Of course, you must have pyflakes, pep8 and pylint installed. But those are all easily installed with pip as far as I can tell.


Generating an atomic stoichiometric matrix

| categories: thermodynamics, python | tags:

In computing thermodynamic properties with species, it is sometimes required to get a matrix that specifies the number of each type of atom in each species. For example, we can create this by hand as follows:

     H2O  CO2   H2   CO
H      2    0    2    0
C      0    1    0    1
O      1    2    0    1

Here we aim to generate this table from code. Why? 1. We can readily add species to it if we do it right. 2. We are less likely to make mistakes in generation of the table, and if we do, it will be faster to regenerate the table.

We will start with a list of strings that represent the chemical formula of each species. We will need to parse the strings to find the elements, and the number of each. We will use a fairly naive regular expression to parse a chemical formula. Basically, we match a capital letter + an optional lowercase letter, followed by an optional number. Here is a fictitious example to illustrate. Note, this will not work with formulas that have parentheses, or charges.

import re
# \d* (rather than \d?) so that multi-digit counts such as 56 are kept whole
m = re.findall(r'([A-Z][a-z]?)(\d*)', 'ArC2H6Cu56Pd47Co')
print m
[('Ar', ''), ('C', '2'), ('H', '6'), ('Cu', '56'), ('Pd', '47'), ('Co', '')]

Now, we need to loop over the species, and collect all the elements in them. We will just make a list of all of the elements, and then get the set.

import re

# save for future use
cf = re.compile(r'([A-Z][a-z]?)(\d*)')

species = ['H2O', 'CO2', 'H2', 'CO']

all_elements = []

for s in species:
    for el, count in re.findall(cf, s):
        all_elements += [el]

print set(all_elements)
set(['H', 'C', 'O'])

Finally, we can create the table. We need to loop through each element, and then through each species.

import re

# save for future use
cf = re.compile(r'([A-Z][a-z]?)(\d*)')

species = ['H2O', 'CO2', 'H2', 'CO']

all_elements = []

for s in species:
    for el, count in re.findall(cf, s):
        all_elements += [el]

atoms = set(all_elements)

# we put a placeholder in the first row
counts = [[""] + species]
for e in atoms:
    # store the element in the first column
    count = [e]
    for s in species:
        d = dict(re.findall(cf, s))
        n = d.get(e, 0)
        # an element with no explicit number appears once
        if n == '':
            n = 1
        count += [int(n)]
    counts += [count]

# this directly returns the array to org-mode
return counts
     H2O  CO2   H2   CO
H      2    0    2    0
C      0    1    0    1
O      1    2    0    1

For this simple example it seems like a lot of code. If there were 200 species though, it would be the same code! Only the list of species would be longer. It might be possible to avoid the two sets of looping, if you could represent the stoichiometric matrix as a sparse matrix, i.e. only store non-zero elements. The final comment I have is related to the parsing of the chemical formulas. Here we can only parse simple formulas. To do better than this would require a pretty sophisticated parser, probably built on the grammar of chemical formulas. The example here implements the code above using pyparsing, and could probably be extended to include more complex formulas such as (CH3)3CH.
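
One limitation of the code above is that dict(re.findall(...)) keeps only the last match for an element, so a formula that repeats an element, such as CH3CH2OH, would be miscounted. Here is a sketch (in Python 3, unlike the blocks above) that sums repeated elements and parses each species only once. The function names are hypothetical, and it still does not handle parentheses or charges.

```python
import re
from collections import Counter

# Element symbol followed by an optional multi-digit count.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def composition(formula):
    """Count atoms in a simple formula (no parentheses or charges)."""
    counts = Counter()
    for element, number in TOKEN.findall(formula):
        counts[element] += int(number) if number else 1
    return counts


def atom_matrix(species):
    """Return (elements, rows) with one row of counts per element."""
    parsed = [composition(s) for s in species]
    elements = sorted(set().union(*parsed))
    rows = [[p[e] for p in parsed] for e in elements]
    return elements, rows


elements, rows = atom_matrix(["H2O", "CO2", "H2", "CO"])
for element, row in zip(elements, rows):
    print(element, row)
```

The list of Counter objects built inside atom_matrix stores only the non-zero entries, so it is already a simple sparse form of the matrix.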


Showing what data went into a code block on export

| categories: orgmode | tags:

Sometimes I define variables in the header of a code block and then use the code to analyze the data. In org-mode this is super, and you can read the file and easily see what is going on.

When you export the file, however, the information is lost, and in the exported result you cannot see what data went into a code block, or figure out where it is from.

Today we examine how to get that information into exported code. First, we set up a simple example that will do what we need.

x y
1 1
2 4
3 9

Now we need a code block with a variable defined in its header that uses data from the table above.

1 1
2 4
3 9

During export, org-mode does some interesting things to the document, including removing the headers from the code blocks, which makes it impossible to access them inside the export. The headers are apparently removed during org-babel-exp-process-buffer. It does not appear possible to advise this function because it processes the whole buffer at once, and we need to save data for each code block.

So, we will have to preprocess the buffer to get the parameters on each block, and then put the parameters in the export afterwards. For this, we can use a filter. We will preprocess the buffer to get names of tables, and parameters of src-blocks. (I suppose we could put this preprocessing in the advice function, but I tend to avoid advice when possible).

Here is how we can get a list of the table-names indicating their name or that they are results (results are enclosed in ()).

(org-element-map (org-element-parse-buffer) 'table
  (lambda (element)     
    (or (org-element-property :name element) (org-element-property :results element))))
tbl-data (print-table) () ()

Similarly, here is the list of parameters for each block.

(org-element-map (org-element-parse-buffer) 'src-block
  (lambda (element)     
    (org-element-property :parameters element)))
:var data=tbl-data :results value

Now, we combine them with filters to modify the output. First, we preprocess to get each list, and then in the filter, we will pop off each value and insert the data. We will also get the language for each code block, and add that in the export. We use a filter because we are not modifying the transcoded text, simply adding some new text in front of it.

(defun ox-mrkup-filter-table (text back-end info)
  (let ((tblname (pop tblnames)))
    (message "tblname is \"%s\"" tblname)
    ; pop does not remove nil from the list, so we do it here.
    (when (null tblname) (setq tblnames (cdr tblnames)))
    (cond
     ((listp tblname)  ; from results
      (concat (format "<br>Results: %s" (car tblname)) text))
     ((null tblname)   ; no name
      text)
     (t ; everything else
      (concat (format "<br>Table name: %s" tblname) text)))))

(defun ox-mrkup-filter-src-block (text back-end info)
  (let ((params (pop src-params))
        (lang (pop src-langs)))
    (when (null params) (setq src-params (cdr src-params)))
    (if params  
        (concat (format "<pre>Language = %s\nParameters = %s</pre>" lang params) text)
      text)))

;; preprocess to get table names, src parameters and languages.
(let ((tblnames (org-element-map (org-element-parse-buffer) 'table
                  (lambda (element)     
                    (or (org-element-property :name element)                    
                        (org-element-property :results element)))))

      (src-params (org-element-map (org-element-parse-buffer) 'src-block
                    (lambda (element)     
                      (org-element-property :parameters element))))

      (src-langs (org-element-map (org-element-parse-buffer) 'src-block
                    (lambda (element)     
                      (org-element-property :language element))))

      ;; register the filters
      (org-export-filter-table-functions '(ox-mrkup-filter-table))
      (org-export-filter-src-block-functions '(ox-mrkup-filter-src-block)))

  ;; and export the result
  (browse-url (org-export-to-file 'html "custom-src-table-export-3.html")))
#<process open custom-src-table-export-3.html>
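
The pattern used here is worth spelling out: collect one value per element in document order before the export, then have the filter consume one value each time it is called. A minimal Python sketch of that pattern follows. It is only an analogy and uses no org-mode API; make_filter and table_filter are invented names, and it assumes the pre-collected list has exactly one entry per table, with None for an unnamed one.

```python
from collections import deque


def make_filter(names):
    """Return a filter that consumes one pre-collected name per call."""
    pending = deque(names)

    def table_filter(text):
        # take the next name, in the same order the tables were collected
        name = pending.popleft() if pending else None
        if name is None:
            return text
        return "<br>Table name: %s" % name + text

    return table_filter


# Names gathered in document order before export; None marks an unnamed table.
table_filter = make_filter(["tbl-data", None])
first = table_filter("<table>1</table>")
second = table_filter("<table>2</table>")
print(first)
print(second)
```

The approach only works if the filter is called once per element and in the same order as the preprocessing pass, which is the assumption the elisp code above also relies on.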

Here is the resulting html file: custom-src-table-export-3.html which shows the new export behavior. It might not be too difficult to make links between the parameters and the tables, but it would require parsing the :parameters string. For now, this makes it easy enough to read in HTML where the data is coming from (assuming fluency in org-mode header arguments!).
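
As a first step towards linking parameters to tables, here is a hedged Python sketch of parsing a header string like the one shown above. The function names are hypothetical, and the parser is deliberately naive: it splits on whitespace, so values that contain spaces or quoted strings are not handled.

```python
def parse_header(parameters):
    """Split an org-mode header string into {keyword: value} pairs."""
    result = {}
    key = None
    for token in parameters.split():
        if token.startswith(":"):
            # a new keyword such as :var or :results
            key = token[1:]
            result.setdefault(key, [])
        elif key is not None:
            result[key].append(token)
    return {k: " ".join(v) for k, v in result.items()}


def var_references(parameters):
    """Map each variable in the :var arguments to the name it refers to."""
    assignments = parse_header(parameters).get("var", "").split()
    return dict(a.split("=", 1) for a in assignments if "=" in a)


header = ":var data=tbl-data :results value"
print(parse_header(header))
print(var_references(header))
```

With the variable-to-table mapping in hand, the filter could emit a link to the named table instead of the raw parameter string.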

Special thanks to Aaron Ecay and Charles Berry on the org-mode mailing list for pointing me towards a solution.
