## Exporting numbered citations in html with unsorted numbered bibliography

Categories: org-mode

In an earlier post we illustrated a simple export of org-ref citations to html. That export simply replaced each citation with a hyperlink, as defined in the export function for each type of link. Today we look at formatting in-text citations with superscripted numbers, and producing an unsorted (i.e. in order of citation) numbered bibliography. This takes two passes: one to get the citations and calculate the replacements and the bibliography, and one to replace the citations and insert the bibliography.

This text is just some text with somewhat random citations in it for seeing it work. You might like my two data sharing articles 1,2. We illustrate the use of org-mode in publishing computational work 3,4,5, experimental 6 and mixed computational and experimental work 7,8. This example will correctly number multiple references to a citation, e.g. 1 and 2.

This post is somewhat long, and the way I worked it out is at the end (Long appendix illustrating how we got the code to work). The short version is that we do some preprocessing to get the citations in the document, calculate replacement values for them and the bibliography, replace them in the org-buffer before the export in the backend (html) format we want, and then conclude with the export. This is proof of concept work.
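The numbering step can be sketched on its own, without org-ref. This is only an illustration: `my/number-keys` and `my/superscript-citation` are hypothetical helper names, and the input is a plain list of keys rather than links parsed from a buffer.

```elisp
(require 'cl-lib)

(defun my/number-keys (cited-keys)
  "Return an alist of (KEY . NUMBER) for CITED-KEYS.
Keys are numbered in order of first appearance; repeats keep their number."
  (let ((n 0) result)
    (dolist (key cited-keys (nreverse result))
      (unless (assoc key result)
        (push (cons key (cl-incf n)) result)))))

(defun my/superscript-citation (path numbering)
  "Turn a comma-separated cite PATH into superscripted numbers from NUMBERING."
  (format "<sup>%s</sup>"
          (mapconcat (lambda (key)
                       (number-to-string (cdr (assoc key numbering))))
                     (split-string path ",")
                     ",")))

;; Keys "a" and "b" are cited twice, but each gets a single number.
(let ((numbering (my/number-keys '("a" "b" "c" "a" "b"))))
  (my/superscript-citation "a,c" numbering))
;; => "<sup>1,3</sup>"
```

The first function is the "calculate" pass and the second is what the "replace" pass would write into the buffer for each citation.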

The main issues you can see are:

1. Our formatting code is very rudimentary, and relies on reftex. It is not as good as bibtex, or presumably some citation processor. Major improvements would require abandoning the reftex approach to use something that builds up the bibliography entry, allows modification of author names, and accommodates missing information gracefully.
2. The bibliography contents reflect the contents of my bibtex file, which is LaTeX compatible. We could clean it up more, by either post-processing to remove some things like escaped &, or by breaking compatibility with LaTeX.
3. The in-text citations could use some fine tuning on spaces, e.g. to remove trailing spaces after words, or to move superscripts to the right of punctuation, or to adjust spaces after some citations.
4. Changing the bibliography style for each entry amounts to changing a variable for the bibliography. We have to modify a function to change the in-text citation style, e.g. to brackets, or (author year).
5. I stuck with only cite links here, and only articles and books. It would not get a citenum format correct, e.g. it should not be superscripted in this case, or a citeauthor format correct. That would require some code in the replacement section that knows how to replace different types of citations.

The org-ref-unsrt-html-processor function could be broken up more, and could take some parameters to fine-tune some of these things, and generalize some things like getting the citation elements for the buffer. Overall, I think this shows that citations in org-mode with org-ref are actually pretty flexible. It is not as good as bibtex/LaTeX, and won't be for an unforeseeably long time unless someone really needs high quality citations in a format other than LaTeX. Note for LaTeX export, we don't have to do any preprocessing at all. If you wanted to try Word export, you might make a pandoc processor that replaces everything in pandoc citation syntax, and then use pandoc for the conversion. If you didn't care to use the bibtex database for anything else, you could just use backend specific markup to make it exactly right for your output. I did this in reference 8 where you can see the chemical formulas are properly subscripted.

If you would like to see the bibtex file used for this you can get it here: numbered.bib

# Bibliography

1. Kitchin, Examples of Effective Data Sharing in Scientific Publishing, ACS Catalysis, 5(6), 3894-3899 (2015). link. doi.
2. John Kitchin, Data Sharing in Surface Science, Surface Science , (0), - (2015). link. doi.
3. Zhongnan Xu & John R Kitchin, Tuning Oxide Activity Through Modification of the Crystal and Electronic Structure: From Strain To Potential Polymorphs, Phys. Chem. Chem. Phys., 17(), 28943-28949 (2015). link. doi.
4. Prateek Mehta, Paul Salvador & John Kitchin, Identifying Potential BO2 Oxide Polymorphs for Epitaxial Growth Candidates, ACS Appl. Mater. Interfaces, 6(5), 3630-3639 (2014). link. doi.
5. Curnan & Kitchin, Effects of Concentration, Crystal Structure, Magnetism, and Electronic Structure Method on First-Principles Oxygen Vacancy Formation Energy Trends in Perovskites, The Journal of Physical Chemistry C, 118(49), 28776-28790 (2014). link. doi.
6. Hallenbeck & Kitchin, Effects of O2 and SO2 on the Capture Capacity of a Primary-Amine Based Polymeric CO2 Sorbent, Industrial & Engineering Chemistry Research, 52(31), 10788-10794 (2013). link. doi.
7. Spencer Miller, Vladimir Pushkarev, Andrew, Gellman & John Kitchin, Simulating Temperature Programmed Desorption of Oxygen on Pt(111) Using DFT Derived Coverage Dependent Desorption Barriers, Topics in Catalysis, 57(1-4), 106-117 (2014). link. doi.
8. Jacob Boes, Gamze Gumuslu, James Miller, Andrew, Gellman & John Kitchin, Estimating Bulk-Composition-Dependent H2 Adsorption Energies on CuxPd1-x Alloy (111) Surfaces, ACS Catalysis, 5(), 1020-1026 (2015). link. doi.

## 1 The working code

Here is a function to process the org file prior to parsing during the export process. This function goes into org-export-before-parsing-hook, and takes one argument, the backend. We simply replace all the citation links with formatted HTML snippets or blocks. If the snippets get longer than a line, it will break.

We use org-ref-reftex-format-citation to generate the bibliography, which uses reftex to format a string with escape characters in it.

(setq org-ref-bibliography-entry-format
'(("article" . "<li><a name=\"\%k\"></a>%a, %t, <i>%j</i>, <b>%v(%n)</b>, %p (%y). <a href=\"%U\">link</a>. <a href=\"http://dx.doi.org/%D\">doi</a>.</li>")
("book" . "<li><a name=\"\%k\"></a>%a, %t, %u (%y).</li>")))

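(defun org-ref-unsrt-latex-processor () nil)
(defun org-ref-unsrt-html-processor ()
  "Citation processor function for the unsrt style with html output."
  (let (links
        unique-keys numbered-keys
        replacements
        bibliography-link
        bibliographystyle-link
        bibliography)
    ;; step 1 - get the citation links
    (setq links (loop for link in (org-element-map
                                      (org-element-parse-buffer) 'link 'identity)
                      if (-contains?
                          org-ref-cite-types
                          (org-element-property :type link))
                      collect link))

    ;; list of unique numbered keys. '((key number))
    (setq unique-keys (loop for i from 1
                            for key in (org-ref-get-bibtex-keys)
                            collect (list key (number-to-string i))))

    ;; (start end replacement-text)
    (setq replacements
          (loop for link in links
                collect
                (let ((path (org-element-property :path link)))
                  (loop for (key number) in unique-keys
                        do
                        (setq
                         path
                         (replace-regexp-in-string
                          key (format "<a href=\"#%s\">%s</a>" key number)
                          path)))
                  (list (org-element-property :begin link)
                        (org-element-property :end link)
                        (format "@@html:<sup>%s</sup>@@" path)))))

    ;; construct the bibliography string
    (setq bibliography
          (concat "#+begin_html
<h1>Bibliography</h1><ol>"
                  (mapconcat
                   'identity
                   (loop for (key number) in unique-keys
                         collect
                         (let* ((result (org-ref-get-bibtex-key-and-file key))
                                (bibfile (cdr result))
                                (entry (save-excursion
                                         (with-temp-buffer
                                           (insert-file-contents bibfile)
                                           (bibtex-set-dialect
                                            (parsebib-find-bibtex-dialect) t)
                                           (bibtex-search-entry key)
                                           (bibtex-parse-entry t)))))
                           ;; remove escaped & in the strings
                           (replace-regexp-in-string "\\\\&" "&"
                                                     (org-ref-reftex-format-citation
                                                      entry
                                                      (cdr (assoc (cdr (assoc "=type=" entry))
                                                                  org-ref-bibliography-entry-format))))))
                   "")
                  "</ol>
#+end_html"))

    ;; now, we need to replace each citation. We do that in reverse order so the
    ;; positions do not change.
    (loop for (start end replacement) in (reverse replacements)
          do
          (setf (buffer-substring start end) replacement))

    ;; eliminate the bibliographystyle links
    (setq bibliographystyle-link
          (loop for link in (org-element-map
                                (org-element-parse-buffer) 'link 'identity)
                if (string= "bibliographystyle"
                            (org-element-property :type link))
                collect link))
    (loop for link in (reverse bibliographystyle-link)
          do
          (setf (buffer-substring (org-element-property :begin link)
                                  (org-element-property :end link))
                ""))

    ;; replace the bibliography link with the bibliography text
    (setq bibliography-link
          (loop for link in (org-element-map
                                (org-element-parse-buffer) 'link 'identity)
                if (string= "bibliography"
                            (org-element-property :type link))
                collect link))
    (if (> (length bibliography-link) 1)
        (error "Only one bibliography link allowed"))

    (setq bibliography-link (car bibliography-link))
    (setf (buffer-substring (org-element-property :begin bibliography-link)
                            (org-element-property :end bibliography-link))
          bibliography)))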

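(defun org-ref-citation-processor (backend)
  "Figure out what to call and call it"
  (let (bibliographystyle)
    (setq
     bibliographystyle
     (org-element-property
      :path (car
             (loop for link in
                   (org-element-map
                       (org-element-parse-buffer) 'link 'identity)
                   if (string= "bibliographystyle"
                               (org-element-property :type link))
                   collect link))))
    (funcall (intern (format "org-ref-%s-%s-processor" bibliographystyle backend)))))

;; run the processor on the buffer before it is parsed for export
(add-hook 'org-export-before-parsing-hook 'org-ref-citation-processor)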

(browse-url (org-html-export-to-html))

#<process open ./blog.html>
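The hook mechanism can be seen in isolation with a much smaller function. The sketch below is an assumption-laden toy, not org-ref code: `my/demo-cite-preprocessor` is a hypothetical name, it only recognizes bare `cite:key` text with a simple regexp, and it writes the key itself rather than a number. It does show the two things the real processor relies on: the function receives the backend symbol, and the replacement is wrapped in an `@@html:...@@` export snippet so Org passes it through as raw HTML.

```elisp
(defun my/demo-cite-preprocessor (backend)
  "Replace bare cite:KEY text with a raw HTML superscript when BACKEND is html.
Intended as a toy function for `org-export-before-parsing-hook'."
  (when (eq backend 'html)
    (save-excursion
      (goto-char (point-min))
      ;; \\1 in the replacement refers to the key captured by the regexp.
      (while (re-search-forward "cite:\\([[:alnum:]-]+\\)" nil t)
        (replace-match "@@html:<sup>\\1</sup>@@" t)))))

;; Called directly on a scratch buffer, the way the export machinery
;; would call it on a copy of the org buffer:
(with-temp-buffer
  (insert "See cite:abc-2015.")
  (my/demo-cite-preprocessor 'html)
  (buffer-string))
;; => "See @@html:<sup>abc-2015</sup>@@."
```

To try it during a real export you would add it with `(add-hook 'org-export-before-parsing-hook #'my/demo-cite-preprocessor)`; for any other backend the function does nothing.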


## 2 Long appendix illustrating how we got the code to work

The first thing we need is a list of all the citation links, in the order cited. Here they are.

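(mapcar
 (lambda (link) (org-element-property :path link))
 (loop for link in (org-element-map (org-element-parse-buffer) 'link 'identity)
       if (-contains? org-ref-cite-types (org-element-property :type link))
       collect link))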

| cite link path |
|---|
| kitchin-2015-examp,kitchin-2015-data-surfac-scien |
| xu-2015-tunin-oxide,mehta-2014-ident-poten,curnan-2014-effec-concen |
| hallenbeck-2013-effec-o2 |
| miller-2014-simul-temper,boes-2015-estim-bulk |
| kitchin-2015-examp |
| kitchin-2015-data-surfac-scien |
| boes-2015-estim-bulk |

Now, we need to compute replacements for each citation link, and construct the bibliography. We will make a numbered, unsorted bibliography, and we want to replace each citation with the corresponding numbers, hyperlinked to the entry.

We start with a list of the keys in the order cited, and a number we will use for each one.

(loop for i from 1
for key in (org-ref-get-bibtex-keys)
collect (list key i))

| key | number |
|---|---|
| kitchin-2015-examp | 1 |
| kitchin-2015-data-surfac-scien | 2 |
| xu-2015-tunin-oxide | 3 |
| mehta-2014-ident-poten | 4 |
| curnan-2014-effec-concen | 5 |
| hallenbeck-2013-effec-o2 | 6 |
| miller-2014-simul-temper | 7 |
| boes-2015-estim-bulk | 8 |

Now, we need to compute replacements for each cite link. This will be replacing each key with the number above. We will return a list of (start end "replacement text") that we can use to replace each link. For fun, we make these superscripted html.

(let ((links (loop for link in (org-element-map (org-element-parse-buffer) 'link 'identity)
                   if (-contains? org-ref-cite-types (org-element-property :type link))
                   collect link))
      (replacements (loop for i from 1
                          for key in (org-ref-get-bibtex-keys)
                          collect (list key (number-to-string i)))))
  (loop for link in links
        collect (let ((path (org-element-property :path link)))
                  (dolist (repl replacements)
                    (setq path (replace-regexp-in-string (car repl) (nth 1 repl) path)))
                  (list (org-element-property :begin link)
                        (org-element-property :end link)
                        (format "<sup>%s</sup>" path)))))

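| start | end | replacement |
|---|---|---|
| 950 | 1004 | 1,2 |
| 1073 | 1145 | 3,4,5 |
| 1160 | 1190 | 6 |
| 1236 | 1286 | 7,8 |
| 1364 | 1388 | 1 |
| 1392 | 1427 | 2 |
| 4091 | 4117 | 8 |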
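Triples of start, end and replacement text like these have to be applied from the end of the buffer backwards, otherwise the first replacement shifts every later position. Here is a minimal sketch of that idea on a scratch buffer; `my/apply-replacements` is a hypothetical helper, and it uses `delete-region` and `insert` rather than `setf` on `buffer-substring`.

```elisp
(defun my/apply-replacements (replacements)
  "Apply REPLACEMENTS, a list of (START END TEXT), to the current buffer.
The list is processed from the largest START down, so the positions of
the replacements that are still pending never change."
  (dolist (r (sort (copy-sequence replacements)
                   (lambda (a b) (> (car a) (car b)))))
    (goto-char (nth 0 r))
    (delete-region (nth 0 r) (nth 1 r))
    (insert (nth 2 r))))

(with-temp-buffer
  (insert "x cite:a y cite:b z")
  ;; "cite:a" occupies positions 3-9 and "cite:b" positions 12-18.
  (my/apply-replacements '((3 9 "[1]") (12 18 "[2]")))
  (buffer-string))
;; => "x [1] y [2] z"
```

Applying the same two replacements front to back would put "[2]" in the wrong place, because "[1]" is three characters shorter than the text it replaces.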

We also need to compute the bibliography for each key. We will use org-ref-reftex-format-citation to do this. For that we need the parsed bibtex entries, and a format string. org-ref provides most of this.

(setq org-ref-bibliography-entry-format
'(("article" . "<li>%a, %t, <i>%j</i>, <b>%v(%n)</b>, %p (%y). <a href=\"%U\">link</a>. <a href=\"http://dx.doi.org/%D\">doi</a>.</li>")
("book" . "<li>%a, %t, %u (%y).</li>")))

(concat "<h1>Bibliography</h1><br><ol>"
(mapconcat
'identity
(loop for key in (org-ref-get-bibtex-keys)
collect
(let* ((result (org-ref-get-bibtex-key-and-file key))
(bibfile (cdr result))
(entry (save-excursion
(with-temp-buffer
(insert-file-contents bibfile)
(bibtex-set-dialect (parsebib-find-bibtex-dialect) t)
(bibtex-search-entry key)
(bibtex-parse-entry)))))
(org-ref-reftex-format-citation
entry
(cdr (assoc (cdr (assoc "=type=" entry))
org-ref-bibliography-entry-format)))))
"")
"</ol>")


# Bibliography

1. Kitchin, Examples of Effective Data Sharing in Scientific Publishing, {ACS Catalysis}, 5(6), 3894-3899 (2015). link. doi.
2. "John Kitchin", Data Sharing in Surface Science, "Surface Science ", (0), - (2015). link. doi.
3. Zhongnan Xu \& John R Kitchin, Tuning Oxide Activity Through Modification of the Crystal and Electronic Structure: From Strain To Potential Polymorphs, {Phys. Chem. Chem. Phys.}, 17(), 28943-28949 (2015). link. doi.
4. Prateek Mehta, Paul Salvador \& John Kitchin, Identifying Potential BO2 Oxide Polymorphs for Epitaxial Growth Candidates, {ACS Appl. Mater. Interfaces}, 6(5), 3630-3639 (2014). link. doi.
5. Curnan \& Kitchin, Effects of Concentration, Crystal Structure, Magnetism, and Electronic Structure Method on First-Principles Oxygen Vacancy Formation Energy Trends in Perovskites, {The Journal of Physical Chemistry C}, 118(49), 28776-28790 (2014). link. doi.
6. "Hallenbeck \& Kitchin, Effects of O2 and SO2 on the Capture Capacity of a Primary-Amine Based Polymeric CO2 Sorbent, "Industrial \& Engineering Chemistry Research", 52(31), 10788-10794 (2013). link. doi.
7. Spencer Miller, Vladimir Pushkarev, Andrew, Gellman \& John Kitchin, Simulating Temperature Programmed Desorption of Oxygen on Pt(111) Using DFT Derived Coverage Dependent Desorption Barriers, {Topics in Catalysis}, 57(1-4), 106-117 (2014). link. doi.
8. Jacob Boes, Gamze Gumuslu, James Miller, Andrew, Gellman \& John Kitchin, Estimating Bulk-Composition-Dependent H2 Adsorption Energies on CuxPd1-x Alloy (111) Surfaces, {ACS Catalysis}, 5(), 1020-1026 (2015). link. doi.
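The raw output above still carries the LaTeX escapes from the bibtex file. As a rough illustration of filling a template and cleaning it up, here is a sketch that uses the built-in `format-spec` instead of reftex. It is a swapped-in technique, not what org-ref does: `my/article-format` and `my/format-entry` are hypothetical names, the entry is a hand-written alist rather than a parsed bibtex entry, and only a few fields are handled.

```elisp
(require 'format-spec)

(defvar my/article-format "<li>%a, %t, <i>%j</i>, %p (%y).</li>"
  "Hypothetical entry template in the spirit of `org-ref-bibliography-entry-format'.")

(defun my/format-entry (entry)
  "Fill `my/article-format' from ENTRY, an alist of (FIELD . VALUE) strings.
Missing fields become empty strings, and a LaTeX-escaped ampersand is unescaped."
  (let ((field (lambda (name) (or (cdr (assoc name entry)) ""))))
    (replace-regexp-in-string
     "\\\\&" "&"
     (format-spec my/article-format
                  `((?a . ,(funcall field "author"))
                    (?t . ,(funcall field "title"))
                    (?j . ,(funcall field "journal"))
                    (?p . ,(funcall field "pages"))
                    (?y . ,(funcall field "year"))))
     t t)))

(my/format-entry '(("author" . "Curnan \\& Kitchin")
                   ("title" . "A title")
                   ("journal" . "A journal")
                   ("pages" . "1-2")
                   ("year" . "2014")))
;; => "<li>Curnan & Kitchin, A title, <i>A journal</i>, 1-2 (2014).</li>"
```

Stripping the braces and quotes seen in the raw entries would need further replacements of the same kind.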


Org-mode version = 8.2.10

## Another parsing of links for citations with pre and post text.


Some LaTeX citations look like \cite[pretext][post text]{key}. Here I explore parsing a link like (pre text)(post text)key. Note you cannot use [] inside the link, as it breaks the link syntax. Also, these links must be wrapped in [[]] because of the parentheses and the spaces inside them. This is a very different approach from the one in an earlier post, which used the description of the link to define the pre and post text. The disadvantage of that approach is that the key is hidden, whereas in this approach it is not; you can see the key and the pre/post text.

The basic strategy will be to use a regexp to parse the link path. The regexp below is pretty hairy, but basically it looks for optional text in () and uses numbered groups to store what is found. Then, we use what we found to construct the LaTeX syntax. We redefine the function in org-ref that gets the key for clicking, and we redefine the cite format function. The result is that we retain the click functionality that shows us what the key refers to.

(defun org-ref-parse-key (s)
  "return pretext, posttext and bibtex key from a string like \"(pre text)(post text)bibtexkey\""
  ;; groups 1 and 3 are the optional parenthesized parts, groups 2 and 4 the
  ;; text inside them, and group 5 is whatever is left, i.e. the key
  (string-match "\\(?1:(\\(?2:[^)]*\\))\\)?\\(?3:(\\(?4:[^)]*\\))\\)?\\(?5:.*\\)" s)
  ;; return pretext postext key
  (list (match-string 2 s) (match-string 4 s) (match-string 5 s)))

(defun org-ref-get-bibtex-key-and-file (&optional key)
"returns the bibtex key and file that it is in. If no key is provided, get one under point"
(interactive)
(let ((org-ref-bibliography-files (org-ref-find-bibliography))
(file))
(unless key
;; get the key
(setq key (nth 2 (org-ref-parse-key (org-ref-get-bibtex-key-under-cursor)))))
(setq file     (catch 'result
(loop for file in org-ref-bibliography-files do
(if (org-ref-key-in-file-p key (file-truename file))
(throw 'result file)))))
(cons key file)))

(defun org-ref-format-cite (keyword desc format)
(cond
((eq format 'latex)
(let* ((results (org-ref-parse-key keyword))
(pretext (nth 0 results))
(posttext (nth 1 results))
(key (nth 2 results)))
(concat "\\cite"
(when pretext (format "[%s]" pretext))
(when posttext (format "[%s]" posttext))
(format "{%s}" key))))))

org-ref-format-cite

(org-ref-format-cite "(pre text)(post text)key" nil 'latex)

\cite[pre text][post text]{key}

(org-ref-format-cite "(pre text)key" nil 'latex)

\cite[pre text]{key}

(org-ref-format-cite "key" nil 'latex)

\cite{key}


It looks like they all work! Let us test the links: mehta-2014-ident-poten, (pre text)mehta-2014-ident-poten and (pre text)(post text)biskup-2014-insul-ferrom-films, and a multiple citation mehta-2014-ident-poten,thompson-2014-co2-react,calle-vallejo-2013-number.

This seems to work from an export point of view. You cannot mix multiple citations with this syntax, and I did not define the html export above. Otherwise, it looks like this might be a reasonable addition to org-ref.


Org-mode version = 8.2.6

## Using org-files like el-files


I wrote some emacs-lisp code in org-mode, and I load it with org-babel-load-file. I thought it would be nice if there were a load path for org-files, similar to the one for lisp files. Here I document what it might look like.

We need a load path to search for the org-file.

(setq org-load-path '("~/Dropbox/kitchingroup/jmax/"))

 ~/Dropbox/kitchingroup/jmax/

Next, we need the function to do the loading. We need to find the org-file, and then load it.

(defun org-require (orgfile)
  "orgfile is a symbol to be loaded"
  (let ((org-file (concat (symbol-name orgfile) ".org"))
        (path))

    ;; find the org-file
    (catch 'result
      (loop for dir in org-load-path do
            (when (file-exists-p
                   (setq path
                         (concat
                          (directory-file-name dir)
                          "/"
                          org-file)))
              (throw 'result path))))
    ;; tangle the emacs-lisp blocks in the org-file and load the result
    (org-babel-load-file path)))

(org-require 'org-ref)

Loaded ~/Dropbox/kitchingroup/jmax/org-ref.el
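The directory search can also be written with the built-in `locate-file`, which walks a list of directories and tries a list of suffixes. This is just a sketch of the lookup step; `my/find-org-file` is a hypothetical name and it does no loading.

```elisp
(defun my/find-org-file (feature path)
  "Return the full name of FEATURE.org in the list of directories PATH, or nil.
FEATURE is a symbol, as in `require'."
  (locate-file (symbol-name feature) path '(".org")))

;; With the load path defined above this would be called as
;; (my/find-org-file 'org-ref org-load-path)
```

Returning nil when nothing is found makes it easy to signal an error before trying to load a file that does not exist.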


That looks pretty simple. You do need write access to the location where the org-file is though. Let us look at a version that copies the file to a temporary directory. For some reason, I am not able to use org-babel-load-file with this. But, it does look like I can tangle the file, and assuming (big assumption) that the file tangles to a regularly named .el file, this seems to work too.

(defun org-require (orgfile)
  "orgfile is a symbol to be loaded"
  (let ((org-file (concat (symbol-name orgfile) ".org"))
        (el-file (concat (symbol-name orgfile) ".el"))
        (path))

    ;; find the org-file
    (catch 'result
      (loop for dir in org-load-path do
            (when (file-exists-p
                   (setq path
                         (concat
                          (directory-file-name dir)
                          "/"
                          org-file)))
              (throw 'result path))))
    ;; work on a copy in the temporary directory
    (copy-file path temporary-file-directory t)

    (org-babel-tangle-file (concat temporary-file-directory (file-name-nondirectory path)))
    ;; assumes the org-file tangles to a .el file with the same base name
    (load-file (concat temporary-file-directory el-file))))

(org-require 'org-ref)

t


This actually seems pretty reasonable. I have not thought about complications but for simple cases, e.g. single org-file, it looks ok.


Org-mode version = 8.2.6

## Better integration of org-mode and email


I like to email org-mode headings and content to people. It would be nice to have some records of when a heading was sent, and to whom. We store this information in a heading. It is pretty easy to write a simple function that emails a selected region.

(defun email-region (start end)
"Send region as the body of an email."
(interactive "r")
(let ((content (buffer-substring start end)))
(compose-mail)
(message-goto-body)
(insert content)
(message-goto-to)))


That function is not glamorous, and you still have to fill in the email fields, and unless you use gnus and org-contacts, the only record keeping is through the email provider.

What I would like is to send a whole heading in an email. The headline should be the subject, and if there are TO, CC or BCC properties, those should be used. If there is no TO, then I want to grab the TO from the email after you enter it and store it as a property. You should be able to set OTHER-HEADERS as a property (this is just for fun. There is no practical reason for this yet). After you send the email, it should record in the heading when it was sent.
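The property handling described here can be tried on its own, with no mail involved. The sketch below only reads the TO property and the headline, and stamps the heading with a SENT-ON property; `my/heading-mail-fields` is a hypothetical name and the address is a placeholder.

```elisp
(require 'org)

(defun my/heading-mail-fields ()
  "Return (TO SUBJECT) for the heading at point and record SENT-ON on it.
TO comes from the TO property (inherited if necessary) and SUBJECT is
the headline text."
  (let ((to (org-entry-get (point) "TO" t))
        (subject (nth 4 (org-heading-components))))
    (org-set-property "SENT-ON" (current-time-string))
    (list to subject)))

(with-temp-buffer
  (org-mode)
  (insert "* Review the draft\n:PROPERTIES:\n:TO: someone@example.com\n:END:\nBody text.\n")
  (goto-char (point-min))
  (my/heading-mail-fields))
;; => ("someone@example.com" "Review the draft")
```

The hard part, as the next paragraph explains, is doing the recording only after the message has actually been sent.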

It turned out that is a relatively tall order. While it is easy to setup the email if you have everything in place, it is tricky to get the information on TO and the time sent after the email is sent. Past lispers had a lot of ideas to make this possible, and a day of digging got me to the answer. You can specify some "action" functions that get called at various times, e.g. after sending, and a return action when the compose window is done. Unfortunately, I could not figure out any way to do things except to communicate through some global variables.

So here is the code that lets me send org-headings, with the TO, CC, BCC properties, and that records when I sent the email after it is sent.

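(defvar *email-heading-point* nil
  "global variable to store point in for returning")

(defvar *email-to-addresses* nil
  "global variable to store to address in email")

(defun email-heading-return ()
  "after returning from compose do this"
  (switch-to-buffer (marker-buffer *email-heading-point*))
  (goto-char (marker-position *email-heading-point*))
  (setq *email-heading-point* nil)
  (org-set-property "SENT-ON" (current-time-string))
  ;; reset this incase you added new ones
  (when *email-to-addresses*
    (org-set-property "TO" *email-to-addresses*)))

(defun email-send-action ()
  "send action for compose-mail"
  (setq *email-to-addresses* (mail-fetch-field "To")))

(defun email-heading ()
  "Send the current org-mode heading as the body of an email, with headline as the subject.

use these properties
TO
CC
BCC
OTHER-HEADERS is an alist specifying additional header fields.
Elements look like (HEADER . VALUE) where both HEADER and VALUE are strings.

save when it was sent as a SENT-ON property. this is overwritten on
subsequent sends. could save them all in a logbook?
"
  (interactive)
  ;; store location.
  (setq *email-heading-point* (set-marker (make-marker) (point)))
  (org-mark-subtree)
  (let ((content (buffer-substring (point) (mark)))
        (TO (org-entry-get (point) "TO" t))
        (CC (org-entry-get (point) "CC" t))
        (BCC (org-entry-get (point) "BCC" t))
        (SUBJECT (nth 4 (org-heading-components)))
        ;; the property value is read as a lisp alist
        (OTHER-HEADERS (let ((headers (org-entry-get (point) "OTHER-HEADERS")))
                         (when headers (read headers))))
        (continue nil)
        (switch-function nil)
        (yank-action nil)
        (send-actions '((email-send-action . nil)))
        (return-action '(email-heading-return)))

    (compose-mail TO SUBJECT OTHER-HEADERS continue switch-function yank-action send-actions return-action)
    (message-goto-body)
    (insert content)
    (when CC
      (message-goto-cc)
      (insert CC))
    (when BCC
      (message-goto-bcc)
      (insert BCC))
    ;; leave point where the next input is needed
    (if TO
        (message-goto-body)
      (message-goto-to))))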


This works pretty well for me. Since I normally use this to send tasks to people, it keeps the task organized where I want it, and I can embed an org-id in the email so if the person replies to it telling me the task is done, I can easily navigate to the task to mark it off. Pretty handy.


Org-mode version = 8.2.6

## Finding emails from tags from org-contacts database

Categories: org-mode

Org-mode has a contacts manager called org-contacts. If you set it up, you can use it to insert email addresses using a tag in message-mode. Out of the box though, it only works on one tag. You cannot do something like +group-phd to get entries tagged group but not tagged phd. Here we develop a function to do that for us.

We could use the org-files and map the headings to do this, but org-contacts has already done this and has a database we can use instead. We get the database from org-contacts-filter. Here is the first entry.

(car (org-contacts-filter))


(Chris Jones #<marker at 1 in contacts.org> ((FILE . c:/Users/jkitchin/Dropbox/org-mode/contacts.org) (TAGS . :co2:) (ALLTAGS . :co2:) (BLOCKED . ) (COMPANY . Georgia Tech, Chemical Engineering) (EMAIL . Christopher.Jones@chbe.gatech.edu) (CATEGORY . contacts)))

It looks like we have (name marker (cons cells)) for each entry. We can get the tags associated with an entry with this code:

(let ((entry (car (org-contacts-filter))))
(cdr (assoc "TAGS" (nth 2 entry))))

:co2:


We will use some code for org tags. Notably, from a tags expression, we can automatically generate code that tells us if we have a match. Here we generate the code to test for a match on "+co2-group".

(let ((todo-only nil))
(cdr (org-make-tags-matcher "+co2-group")))


(and (progn (setq org-cached-props nil) (and (not (member group tags-list)) (member co2 tags-list))) t)

Note we will have to bind tags-list before we eval this.

So to use it, we need to split the tags from an org-contacts entry into a list of strings. It appears each entry just has the tag string, so we split the substring (skipping first and last characters) by ":" to get the list. We do that here, and test if a list of tags containing "co2" is matched by the expression "co2-junior".

(let* ((tags-list (split-string (substring ":co2:" 1 -1) ":"))
(todo-only nil))
(eval (cdr (org-make-tags-matcher "co2-junior"))))

t
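To make the matching logic concrete, here is a hand-rolled sketch that does not depend on `org-make-tags-matcher`. It is a simplified stand-in, not the Org implementation: `my/split-tags` and `my/tags-match-p` are hypothetical names, and only `+tag` and `-tag` terms are understood (no `|`, property tests or regexps).

```elisp
(defun my/split-tags (tag-string)
  "Split an Org tag string such as \":co2:group:\" into a list of tag names."
  (split-string (or tag-string "") ":" t))

(defun my/tags-match-p (expression tags)
  "Return non-nil if the list TAGS satisfies EXPRESSION such as \"+group-phd\".
A term with + or no sign must be present in TAGS; a term with - must be absent."
  (let ((pos 0) (ok t))
    (while (and ok (string-match "\\([+-]?\\)\\([^+-]+\\)" expression pos))
      (let ((sign (match-string 1 expression))
            (tag (match-string 2 expression)))
        (setq pos (match-end 0))
        (setq ok (if (string= sign "-")
                     (not (member tag tags))
                   (and (member tag tags) t)))))
    ok))

(my/tags-match-p "co2-junior" (my/split-tags ":co2:"))        ; => t
(my/tags-match-p "+co2-group" (my/split-tags ":co2:group:"))  ; => nil
```

Passing a non-nil third argument to `split-string` drops the empty strings at both ends, which avoids the `substring` call used above.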


It is. So, now we just need to loop through the database, and collect entries that match.

(defun insert-emails-from-tags (tag-expression)
  "insert emails from org-contacts that match the tags expression. For example:
group-phd will match entries tagged with group but not with phd."
  (interactive "sTags: ")
  (insert
   (mapconcat 'identity
              (loop for contact in (org-contacts-filter)
                    for contact-name = (car contact)
                    for email = (org-contacts-strip-link (car (org-contacts-split-property
                                                               (or
                                                                (cdr (assoc-string org-contacts-email-property
                                                                                   (nth 2 contact)))
                                                                ""))))
                    for tags = (cdr (assoc "TAGS" (nth 2 contact)))
                    ;; tags-list is the variable the generated matcher looks at
                    for tags-list = (if tags
                                        (split-string (substring tags 1 -1) ":")
                                      '())
                    if (let ((todo-only nil))
                         (eval (cdr (org-make-tags-matcher tag-expression))))

                    collect (org-contacts-format-email contact-name email))
              ",")))


This is not quite completion in message-mode, but it is good enough. You put your cursor in the To field, and run that command, enter the tag expression, and you will get your emails!