Using org-mode outside of Emacs - sort of

| categories: orgmode, emacs | tags:

I recently posted about using Emacs for scripts (http://kitchingroup.cheme.cmu.edu/blog/2014/08/06/Writing-scripts-in-Emacs-lisp/ ). You might wonder why you would do that when you could use shell, Python, or Perl. A good reason is to write scripts that can access data or code inside an org-file! This lets you leverage the extensive support for org-mode in Emacs without the user necessarily needing to use Emacs. Let us consider some examples.

1 Extracting tables from an org-file

If tables are named in org-mode, it is possible to extract the contents. Here is a table:

#+name: data-1
| x | y |
| 1 | 1 |
| 2 | 4 |
| 3 | 9 |

Another table might look like

#+name: data-2
| a | b  |
| 1 | 1  |
| 2 | 8  |
| 3 | 27 |

It would be convenient to have a command-line utility that could extract the data from that table with a syntax like:

extract-org-table tblname orgfile lisp|csv|tab

Here is one way to do it:

;; extract-org-table tblname orgfile lisp|csv|tab
;; print the contents of a named table in an org-file
(require 'org)
(require 'org-element)
(require 'org-table)

(let ((tblname (pop command-line-args-left))
      (org-file (pop command-line-args-left))
      ;; the format argument is optional; here it defaults to lisp
      (format (or (pop command-line-args-left) "lisp"))
      (table))
  ;; check the format before doing any work
  (unless (member format '("lisp" "csv" "tab"))
    (error "unsupported format: %s" format))
  (find-file org-file)
  ;; find the first table whose name matches tblname
  (setq table
        (org-element-map (org-element-parse-buffer) 'table
          (lambda (element)
            (when (string= tblname (org-element-property :name element))
              element))
          nil ; info
          t)) ; first-match

  (unless table
    (error "no table found for %s" tblname))

  (goto-char (org-element-property :contents-begin table))
  (let ((contents (org-table-to-lisp)))
    (if (string= format "lisp")
        (print contents)
      ;; print one row per line, skipping horizontal rules
      (dolist (row contents)
        (unless (eq row 'hline)
          (princ (mapconcat #'identity row
                            (if (string= format "csv") "," "\t")))
          (princ "\n"))))))

Let us try it out. We first tangle the script out of this org-file with org-babel-tangle, and then run it.

./extract-org-table data-2 org-outside-emacs.org lisp
(("a" "b") ("1" "1") ("2" "8") ("3" "27"))
./extract-org-table data-1 org-outside-emacs.org csv
x,y
1,1
2,4
3,9
./extract-org-table data-2 org-outside-emacs.org tab
a       b
1       1
2       8
3       27

That looks pretty reasonable, and you could even pipe the output to another classic unix command like cut to get a single column. Let us get the second column here.

./extract-org-table data-1 org-outside-emacs.org csv | cut -d , -f 2
y
1
4
9

That is starting to look like using data from an org-file, but outside of org. Of course, we are using org-mode, via Emacs, but the point is that a user might not have to know that, as long as a fairly recent Emacs and org-mode are installed on their system.
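One caveat with the csv format is that cells are joined with commas as they are, so a cell that itself contains a comma or a quote would confuse a tool like cut. Here is a minimal sketch of quoting such cells. The function names my-csv-quote-cell and my-table-to-csv are hypothetical and are not part of the script above; the input is assumed to be the list that org-table-to-lisp returns.

```emacs-lisp
;; Hypothetical helpers (not part of the script above).

(defun my-csv-quote-cell (cell)
  "Return CELL as a CSV field, quoted only when it needs to be."
  (if (string-match-p "[\",\n]" cell)
      ;; double any embedded quotes, then wrap the whole cell in quotes
      (concat "\"" (replace-regexp-in-string "\"" "\"\"" cell t t) "\"")
    cell))

(defun my-table-to-csv (rows)
  "Convert ROWS, as returned by `org-table-to-lisp', to a CSV string.
Rows that are the symbol `hline' are skipped."
  (mapconcat
   (lambda (row) (mapconcat #'my-csv-quote-cell row ","))
   ;; copy first, because delq may modify the list it is given
   (delq 'hline (copy-sequence rows))
   "\n"))

;; example: the second cell of the last row contains a comma
(princ (my-table-to-csv '(("x" "y") hline ("1" "1") ("2" "4, roughly"))))
```

Only cells that need quoting are changed, so the output for the tables in this post would be the same as before.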

2 Running code in an org-file

It may be that there is code in an org-file that you might want to use, but for some reason choose not to cut and paste from the org-file to some script. Here is a simple code block:

import time
with open('results.dat', 'w') as f:
    f.write(time.asctime())

To call this externally we have to find the block and then run it.

;; org-call.el blockname org-file
;; run a named code block in an org file
(require 'org)
(require 'org-element)

(let ((blockname (pop command-line-args-left))
      (org-file (pop command-line-args-left))
      (src))
  (find-file org-file)
  ;; find the first src-block whose name matches blockname
  (setq src
        (org-element-map (org-element-parse-buffer) 'src-block
          (lambda (element)
            (when (string= blockname (org-element-property :name element))
              element))
          nil ; info
          t)) ; first-match
  (unless src
    (error "no src-block found for %s" blockname))
  (goto-char (org-element-property :begin src))
  ;; since we start with a fresh emacs, we have to configure some things.
  (org-babel-do-load-languages
   'org-babel-load-languages
   '((python . t)))
  (let ((org-confirm-babel-evaluate nil))
    (org-babel-execute-src-block)))

Again, we tangle the script with org-babel-tangle, and then run it.

./org-call.el python-block org-outside-emacs.org
cat results.dat
Mon Aug 11 20:17:01 2014

That demonstrates it is possible to call source blocks, but this is pretty limited in capability. You can only call a block; we did not capture any output from the block, only its side effects, e.g. it changed a file that we can examine. We have limited capability to set data into the block, other than through files. It might be possible to hack up something that runs org-babel-execute-src-block with constructed arguments that enables something like a var to be passed in. That is beyond today's post. When I get around to it, here is a reminder of how it might be possible to feed stdin to an emacs script: http://stackoverflow.com/questions/2879746/idomatic-batch-processing-of-text-in-emacs .
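As a starting point for that idea, here is a rough sketch that passes a variable into a block and returns its result. It assumes that the third argument of org-babel-execute-src-block takes extra header arguments that are merged with those on the block, and that a :var entry can be given as a string like "x=21". The name my-org-call-block is hypothetical. I have only thought this through for emacs-lisp blocks; a python block would also need python loaded with org-babel-do-load-languages, as in the script above.

```emacs-lisp
(require 'org)
(require 'ob)

;; Hypothetical helper, not part of org-mode.
(defun my-org-call-block (org-file blockname &optional var-spec)
  "Execute the src block BLOCKNAME in ORG-FILE and return its result.
VAR-SPEC is an optional string such as \"x=21\" that is passed to
the block as a :var header argument."
  (with-temp-buffer
    (insert-file-contents org-file)
    ;; run the block relative to the directory of the org-file
    (setq default-directory
          (file-name-directory (expand-file-name org-file)))
    (org-mode)
    (let ((pos (org-babel-find-named-block blockname))
          (org-confirm-babel-evaluate nil))
      (unless pos
        (error "no block named %s in %s" blockname org-file))
      (goto-char pos)
      ;; assumption: the third argument is a list of extra header
      ;; arguments that override the ones on the block
      (org-babel-execute-src-block
       nil nil (when var-spec (list (cons :var var-spec)))))))
```

In a script like org-call.el, the extra argument could come from command-line-args-left, and the returned value could be printed with princ so that the output is captured on the command line.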

Copyright (C) 2014 by John Kitchin. See the License for information about copying.

Org-mode version = 8.2.6

Writing scripts in Emacs-lisp

| categories: emacs | tags:

I have written lots of script commands, mostly in Python, occasionally in bash. Today I learned you can also write them in emacs-lisp (http://www.emacswiki.org/emacs/EmacsScripts ). There is an interesting wrinkle on the first line which specifies how to run the command, which is explained in the emacswiki page.

Here is an example script that just prints some information about Emacs and the command line args you pass to it. We use some Local variables at the end to make the script open in emacs-lisp mode for editing. $0 in shell language is the name of the script being run, so the header here simply loads the script into emacs, and then runs the main function.

:;exec emacs -batch -l "$0" -f main "$@"

(defun main ()
  (print (version))
  (print (format "I did it. you passed in %s" command-line-args-left)))

;; Local Variables:
;; mode: emacs-lisp
;; End:

We need to tangle this code block to get the script, which we do with org-babel-tangle.

Since we do not have a regular shebang, we manually change the mode to make it executable, and then call the script with some arguments.

chmod +x test.el
./test.el arg1 arg2
"GNU Emacs 22.1.1 (mac-apple-darwin)
 of 2014-06-05 on osx105.apple.com"

"I did it. you passed in (arg1 arg2)"

Hahah! I guess the emacs on my path is an old one! Ironically, the Emacs I am writing in is much more modern (but not on the path).

(version)
GNU Emacs 24.3.1 (x86_64-apple-darwin, NS apple-appkit-1038.36)
 of 2013-03-13 on bob.porkrind.org

And it is evidence I wrote this on a Mac. First Mac post ever.

1 Addition based on Trevor's comment

Also according to http://www.emacswiki.org/emacs/EmacsScripts , there is the following option:

#!emacs --script

as the shebang line. That did not work on my Mac, but a small variation with the absolute path to emacs did. You still define the function in the script file, but now you have to call the function yourself at the end.

#!/usr/bin/emacs --script
(defun main ()
  (print (version))
  (print (format "I did it. you passed in %s" command-line-args-left)))

(main)
;; Local Variables:
;; mode: emacs-lisp
;; End:

./test2.el arg1 arg2 arg3
"GNU Emacs 22.1.1 (mac-apple-darwin)
 of 2014-06-05 on osx105.apple.com"

"Called with (/usr/bin/emacs --no-splash -scriptload ./test2.el arg1 arg2 arg3)"

"I did it. you passed in (arg1 arg2 arg3)"

Now, how do you do this python style so one file is a script and library at once? In python that is done with:

def main():
    # put some module code here
    pass

if __name__ == '__main__':
    main()

We can check command-line-args to see if there is a clue there.

#!/usr/bin/emacs --script
(defun main ()
  (print (version))
  (print (format "Called with %s" command-line-args))
  (print (format "I did it. you passed in %s" command-line-args-left)))

(main)
;; Local Variables:
;; mode: emacs-lisp
;; End:

./test3.el arg1
"GNU Emacs 22.1.1 (mac-apple-darwin)
 of 2014-06-05 on osx105.apple.com"

"Called with (/usr/bin/emacs --no-splash -scriptload ./test3.el arg1)"

"I did it. you passed in (arg1)"

Apparently, when the file is run with --script, we see "-scriptload" as a command line arg. Strange, but workable. We just look for that: if we see it, we run main as a script, and if not, we do nothing.

#!/usr/bin/emacs --script
(defun main ()
  (print (version))
  (print (format "Called with %s" command-line-args))
  (print (format "I did it. you passed in %s" command-line-args-left)))

;; only run main when this file is run as a script
(when (member "-scriptload" command-line-args)
  (main))

Here we run as a script.

./test4.el arg1
"GNU Emacs 22.1.1 (mac-apple-darwin)
 of 2014-06-05 on osx105.apple.com"

"Called with (/usr/bin/emacs --no-splash -scriptload ./test4.el arg1)"

"I did it. you passed in (arg1)"

Now, we try loading the file, and calling our function.

(load-file "test4.el")
(main)
I did it. you passed in nil

Sweet. An emacs script and library in one. Now, I just need to get my modern emacs on the path!
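The check can be wrapped in a small predicate, which makes it easier to try out without actually starting Emacs as a script. This is only a sketch: my-running-as-script-p is a hypothetical name, and it assumes your Emacs rewrites --script to "-scriptload" in command-line-args the way the output above shows.

```emacs-lisp
;; Hypothetical helper: decide whether Emacs was started with --script.
(defun my-running-as-script-p (&optional args)
  "Return t if ARGS contains \"-scriptload\", and nil otherwise.
ARGS defaults to `command-line-args'."
  (and (member "-scriptload" (or args command-line-args)) t))

(defun main ()
  (print (format "I did it. you passed in %s" command-line-args-left)))

;; run main only when this file is run as a script, not when it is loaded
(when (my-running-as-script-p)
  (main))
```

Taking the argument list as an optional parameter means you can call the predicate on a made-up list, for example one copied from the "Called with" output above, to see what it would decide.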

Another parsing of links for citations with pre and post text.

| categories: org-ref, org-mode, emacs | tags:

Some LaTeX citations look like \cite[pretext][post text]{key}. Here I explore parsing a link like (pre text)(post text)key. Note you cannot use [] inside the link, as it breaks the link syntax. Also, these links must be wrapped in [[]] because of the parentheses and the spaces inside them. This is a very different approach from an earlier one, which used the description of the link to define the pre and post text. The disadvantage of that approach is that the key is hidden, whereas in this approach it is not; you can see the key and the pre/post text.

The basic strategy will be to use a regexp to parse the link path. The regexp below is pretty hairy, but basically it looks for optional text in () and uses numbered groups to store what is found. Then, we use what we found to construct the LaTeX syntax. We redefine the function in org-ref that gets the key for clicking, and we redefine the cite format function. The result is that we retain the click functionality that shows us what the key refers to.

(defun org-ref-parse-key (s)
  "Return pretext, posttext and bibtex key from a string like \"(pre text)(post text)bibtexkey\"."
  (string-match "\\(?1:(\\(?2:[^)]*\\))\\)?\\(?3:(\\(?4:[^)]*\\))\\)?\\(?5:.*\\)" s)
  ;; return pretext posttext key
  (list (match-string 2 s) (match-string 4 s) (match-string 5 s)))

(require 'cl-lib)

(defun org-ref-get-bibtex-key-and-file (&optional key)
  "Return the bibtex key and the file that it is in.
If no KEY is provided, get the one under point."
  (interactive)
  (let ((org-ref-bibliography-files (org-ref-find-bibliography))
        (file))
    (unless key
      ;; get the key, dropping any pre/post text
      (setq key (nth 2 (org-ref-parse-key (org-ref-get-bibtex-key-under-cursor)))))
    ;; the first bibliography file that contains the key, or nil
    (setq file (cl-loop for f in org-ref-bibliography-files
                        when (org-ref-key-in-file-p key (file-truename f))
                        return f))
    (cons key file)))

(defun org-ref-format-cite (keyword desc format)
   (cond
    ((eq format 'latex)
     (let* ((results (org-ref-parse-key keyword))
	    (pretext (nth 0 results))
	    (posttext (nth 1 results))
	    (key (nth 2 results)))
       (concat "\\cite" 
	       (when pretext (format "[%s]" pretext))
	       (when posttext (format "[%s]" posttext))
	       (format "{%s}" key))))))

(org-ref-format-cite "(pre text)(post text)key" nil 'latex)
\cite[pre text][post text]{key}
(org-ref-format-cite "(pre text)key" nil 'latex)
\cite[pre text]{key}
(org-ref-format-cite "key" nil 'latex)
\cite{key}

It looks like they all work! Let us test the links: mehta-2014-ident-poten, (pre text)mehta-2014-ident-poten, and (pre text)(post text)biskup-2014-insul-ferrom-films. A multiple citation: mehta-2014-ident-poten,thompson-2014-co2-react,calle-vallejo-2013-number.

This seems to work from an export point of view. You cannot mix multiple citations with this syntax, and I did not define the html export above. Otherwise, it looks like this might be a reasonable addition to org-ref.
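To see what the groups in the regexp capture, here is a standalone sketch of the parsing step that can be tried without org-ref loaded. The name my-parse-cite-path is hypothetical. It uses plain groups with non-capturing wrappers instead of explicitly numbered groups, and it anchors the match to the whole string, which are my choices and not necessarily what org-ref should do.

```emacs-lisp
;; Hypothetical standalone version of the parsing idea.
(defun my-parse-cite-path (path)
  "Return (PRETEXT POSTTEXT KEY) parsed from PATH.
PATH looks like \"(pre text)(post text)key\".  Both parenthesized
parts are optional, and a missing part is returned as nil."
  (when (string-match
         "\\`\\(?:(\\([^)]*\\))\\)?\\(?:(\\([^)]*\\))\\)?\\(.*\\)\\'"
         path)
    (list (match-string 1 path)
          (match-string 2 path)
          (match-string 3 path))))

(my-parse-cite-path "(pre text)(post text)key") ; => ("pre text" "post text" "key")
(my-parse-cite-path "(pre text)key")            ; => ("pre text" nil "key")
(my-parse-cite-path "key")                      ; => (nil nil "key")
```

Note that a comma-separated list of keys comes back as a single key string, which is consistent with the limitation on multiple citations mentioned above.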

Using org-files like el-files

| categories: org-mode, emacs | tags:

I write some of my emacs-lisp code in org-mode files, and load them with org-babel-load-file. I thought it would be nice if there were a load path for org-files, similar to the one for lisp files. Here I document what it might look like.

We need a load path to search for the org-file.

(setq org-load-path '("~/Dropbox/kitchingroup/jmax/"))
~/Dropbox/kitchingroup/jmax/

Next, we need the function to do the loading. We need to find the org-file, and then load it.

(defun org-require (orgfile)
  "Load the org-file named by the symbol ORGFILE from `org-load-path'."
  (let* ((org-file (concat (symbol-name orgfile) ".org"))
         ;; the first matching file in org-load-path, or nil if there is none
         (path (locate-file org-file org-load-path)))
    (unless path
      (error "%s not found in org-load-path" org-file))
    (org-babel-load-file path)))


(org-require 'org-ref)
Loaded ~/Dropbox/kitchingroup/jmax/org-ref.el

That looks pretty simple. You do need write access to the location where the org-file is though. Let us look at a version that copies the file to a temporary directory. For some reason, I am not able to use org-babel-load-file with this. But, it does look like I can tangle the file, and assuming (big assumption) that the file tangles to a regularly named .el file, this seems to work too.

(defun org-require (orgfile)
  "Tangle and load a temporary copy of the org-file named by the symbol ORGFILE."
  (let* ((org-file (concat (symbol-name orgfile) ".org"))
         (el-file (concat (symbol-name orgfile) ".el"))
         ;; the first matching file in org-load-path, or nil if there is none
         (path (locate-file org-file org-load-path))
         (tmp-org (expand-file-name org-file temporary-file-directory)))
    (unless path
      (error "%s not found in org-load-path" org-file))
    ;; copy the org-file to the temporary directory, replacing any old copy
    (copy-file path tmp-org t)
    ;; this assumes the file tangles to a regularly named .el file
    (org-babel-tangle-file tmp-org)
    (load-file (expand-file-name el-file temporary-file-directory))))

(org-require 'org-ref)
t

This actually seems pretty reasonable. I have not thought about complications but for simple cases, e.g. single org-file, it looks ok.

Automatic downloading of a pdf from a journal site

| categories: bibtex, emacs | tags:

Many bibliography software packages can automatically download a pdf for you. In this post, we explore how that can be done from Emacs. The basic idea is that the pdf is obtained from a url, and that you can calculate the url by some method. Then you can download the file.

For example, consider this article in Phys. Rev. Lett. http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.99.016105 . There is a link to get the pdf for this article at http://journals.aps.org/prl/pdf/10.1103/PhysRevLett.99.016105 . It is not difficult to construct that url; you just replace /abstract/ with /pdf/.

The trick is how to get the first url. We have previously seen that we can construct a bibtex entry from a doi. In fact, we can use the doi to get the url above. If you visit https://doi.org/10.1103/PhysRevLett.99.016105 , you will be redirected to the url. It so happens that you can use code to get the redirected url. In emacs-lisp it is a little convoluted; you have to use url-retrieve, and provide a callback that stores the redirect. Here is an example. It appears you need to run this block twice to get the right variable setting. The likely reason is that url-retrieve is asynchronous: it returns immediately and the callback runs later, so the print can run before the variable has been set.

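(defvar *doi-utils-redirect* nil
  "The url that the last doi lookup redirected to.")

(defun callback (&optional status)
  ;; status is a plist of events; it is nil if nothing happened, e.g. no redirect
  (when status
    (setq *doi-utils-redirect* (plist-get status :redirect))))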

(url-retrieve
  "https://doi.org/10.1103/PhysRevLett.99.016105"
  'callback)

(print *doi-utils-redirect*)
"http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.99.016105"

From there, creating the pdf url is as simple as

(replace-regexp-in-string "prl/abstract" "prl/pdf" "http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.99.016105")
http://journals.aps.org/prl/pdf/10.1103/PhysRevLett.99.016105

And finally we download the file with

(url-copy-file "http://journals.aps.org/prl/pdf/10.1103/PhysRevLett.99.016105" "PhysRevLett.99.016105.pdf" nil)
t

So that is the gist of automating pdf downloads. You do these steps:

  1. Get the DOI
  2. Get the url that the DOI redirects to
  3. Calculate the link to the pdf
  4. Download the pdf
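
The steps above can be put together in a rough sketch like the following. All the names starting with my- are hypothetical, the url rule is only the Phys. Rev. Lett. one from this post, and the wait loop is one simple way to deal with url-retrieve returning before its callback has run. It is not a definitive implementation.

```emacs-lisp
(require 'url)

;; All names below are hypothetical.
(defvar my-doi-redirect nil "URL that the last DOI lookup redirected to.")
(defvar my-doi-waiting nil "Non-nil while a DOI lookup is in progress.")

(defun my-doi-redirect-callback (&optional status)
  "Store the :redirect entry of STATUS and mark the lookup as finished."
  (setq my-doi-redirect (plist-get status :redirect)
        my-doi-waiting nil))

(defun my-doi-get-redirect (doi &optional timeout)
  "Return the URL that DOI redirects to.
Return nil if there is no redirect within TIMEOUT seconds (default 10)."
  (setq my-doi-redirect nil
        my-doi-waiting t)
  (url-retrieve (concat "https://doi.org/" doi) #'my-doi-redirect-callback)
  ;; url-retrieve returns immediately, so wait until the callback has run
  (let ((deadline (+ (float-time) (or timeout 10))))
    (while (and my-doi-waiting (< (float-time) deadline))
      (sleep-for 0.1)))
  my-doi-redirect)

(defun my-aps-pdf-url (abstract-url)
  "Turn an APS abstract URL into the corresponding pdf URL."
  (replace-regexp-in-string "/abstract/" "/pdf/" abstract-url t t))

(defun my-doi-download-pdf (doi)
  "Download the pdf for the APS article DOI into the current directory.
An existing file with the same name is overwritten."
  (let ((url (my-doi-get-redirect doi)))
    (unless url
      (error "no redirect found for %s" doi))
    (url-copy-file (my-aps-pdf-url url)
                   (concat (file-name-nondirectory url) ".pdf")
                   t)))
```

Calling (my-doi-download-pdf "10.1103/PhysRevLett.99.016105") would then go through all four steps. Only my-aps-pdf-url is specific to the publisher, so that is the piece you would swap out for other journals.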

Each publisher does something a little bit different, so you have to work this out for each one. I have worked a lot of them out at https://github.com/jkitchin/jmax/blob/master/user/doi-utils.el . That file is a work in progress, but it is a project I intend to use on a regular basis.
