Introduction to a citation processor in org-ref

| categories: emacs, citations, orgmode, orgref | tags:

As a potential solution for citations in org-mode for non-LaTeX export, here we introduce csl (citation syntax lisp). The idea is heavily influenced by the XML-based Citation Style Language (CSL), but uses Lisp s-expressions instead of XML.

Briefly, there is a csl file that contains two variables: citation-style and bibliography-style. The citation-style defines how the in-text citations are represented for different types of citations. The bibliography-style defines how the bibliography is constructed.
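
To make the two-variable layout concrete, here is a minimal sketch of what such a file could contain. The variable names are prefixed with sketch- and every key in the alists is hypothetical, chosen only for illustration; the real files in the org-ref citeproc directory define their own keys.

```elisp
;; Hypothetical style file contents. The keys below are assumptions made
;; for this sketch, not the keys org-ref actually uses.
(defvar sketch-citation-style
  '((delimiter . ",")
    (sort . t)
    (vertical-align . superscript)
    (chomp-leading-space . t)
    (transpose-punctuation . t))
  "In-text citation settings (hypothetical keys).")

(defvar sketch-bibliography-style
  '((sort . nil)
    (label-suffix . ". ")
    (entry-types . (book article misc)))
  "Bibliography settings (hypothetical keys).")

(defun sketch-style-get (style key &optional default)
  "Return the value of KEY in alist STYLE, or DEFAULT when KEY is absent."
  (let ((cell (assq key style)))
    (if cell (cdr cell) default)))
```

Because the styles are plain alists, a setting can be read with (sketch-style-get sketch-citation-style 'delimiter) and changed with ordinary list functions, which is the hackability argued for in this post.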

What do we gain by this?

  1. No external citeproc program is needed, and org-mode experts can hack on the code directly.
  2. Punctuation transposition and space chomping, i.e. superscripts can be moved to the right side of punctuation, and the whitespace before them removed, if you want that.
  3. Total tunability of the citation format to different backends.
  4. Easy to change bibliography format with the bibliographystyle link.
  5. The use of Bibtex databases. These are plain text, and flexible.

The real code for this is too long to blog about. Instead, you should check it out here: https://github.com/jkitchin/org-ref/tree/master/citeproc

1 Reference types

  • A book.1
  • An article2
  • A miscellaneous bibtex type.3

There is work to do in supporting other types of entry types that are common in bibtex files.

2 Citation types

  • Regular citation:2
  • citenum: See Ref. 2
  • citeauthor: Kitchin
  • citeyear: 2015

There is work to do in supporting other types of citations.

3 Multiple citations and sorting within citation

You can specify that the cites within a citation are consistently sorted in the export.

  • a,b:2,4
  • b,a:2,4

There is work to do for range collapsing, e.g. to turn 1,2,3 into 1-3.
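
Since range collapsing is still on the to-do list, the following is only a sketch of one way it might be done, not code from org-ref. The function name sketch-collapse-ranges is hypothetical. It sorts the citation numbers, which also gives the consistent ordering shown above, and collapses runs of three or more consecutive numbers.

```elisp
(defun sketch-collapse-ranges (numbers)
  "Return NUMBERS as a sorted string with runs of three or more collapsed.
For example (3 1 2 5) becomes \"1-3,5\"."
  (let ((sorted (sort (delete-dups (copy-sequence numbers)) #'<))
        (parts '()))
    (while sorted
      (let ((start (car sorted))
            (end (car sorted)))
        (setq sorted (cdr sorted))
        ;; Extend the run while the next number is consecutive.
        (while (and sorted (= (car sorted) (1+ end)))
          (setq end (car sorted)
                sorted (cdr sorted)))
        ;; A single number or a pair is written out; longer runs collapse.
        (push (cond ((= start end) (number-to-string start))
                    ((= end (1+ start)) (format "%d,%d" start end))
                    (t (format "%d-%d" start end)))
              parts)))
    (mapconcat #'identity (nreverse parts) ",")))
```

With this sketch, (sketch-collapse-ranges '(4 2)) and (sketch-collapse-ranges '(2 4)) both give "2,4", and (sketch-collapse-ranges '(1 2 3)) gives "1-3".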

4 Space chomping and punctuation testing

I think citations should always be put in the sentence they logically belong to. LaTeX has a feature, through natbib I think, where for some styles (e.g. superscripts) the citations are moved to the right side of punctuation and the preceding whitespace is chomped, so the superscript sits next to a word rather than being separated from it by a space. We can do that here too.

  • Citation at end of sentence.2
  • Citation in clause,2,4 with a comma.
  • Citation in middle of2,4 a sentence.
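
The three cases above can be sketched as a single string operation. This is an illustration under assumptions, not the org-ref implementation: the function name sketch-place-citation is hypothetical, the text is assumed to be already split around the citation, and the output uses an HTML sup tag.

```elisp
(defun sketch-place-citation (before label after)
  "Join BEFORE, LABEL and AFTER the way a superscript style might.
Trailing whitespace in BEFORE is removed (space chomping), and one
punctuation character at the start of AFTER is moved in front of
LABEL (punctuation transposition)."
  (let ((before (replace-regexp-in-string "[ \t\n]+\\'" "" before))
        (punct ""))
    (when (string-match "\\`[.,;:!?]" after)
      (setq punct (match-string 0 after)
            after (substring after (match-end 0))))
    (concat before punct "<sup>" label "</sup>" after)))
```

For example, (sketch-place-citation "Citation in clause " "2,4" ", with a comma.") returns "Citation in clause,<sup>2,4</sup> with a comma.", and text without leading punctuation after the citation is left as it is apart from the chomped space.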

5 Building

At the moment, you have to add a hook function to put the replacements in the document before parsing.

(add-to-list 'load-path ".")
(require 'org-ref-citeproc)

;; One-off export: bind the hook only for this export, then open the result.
(when (file-exists-p "readme.html") (delete-file "readme.html"))
(let ((org-export-before-parsing-hook '(orcp-citeproc)))
  (browse-url (org-html-export-to-html)))
;; => #<process open ./readme.html>

;; Or install the hook globally so that every export runs the citation processor.
(add-hook 'org-export-before-parsing-hook 'orcp-citeproc)
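
To show what a before-parsing hook function does in general, here is a toy sketch. It is not how orcp-citeproc works; the name sketch-number-cites and the bracketed output format are assumptions. Org calls each function on org-export-before-parsing-hook with the export backend as its only argument, in a temporary copy of the buffer, so the function can edit the text freely.

```elisp
(defun sketch-number-cites (_backend)
  "Replace each cite:KEY in the buffer with a bracketed number.
Keys are numbered in order of first appearance. _BACKEND is ignored."
  (let ((seen '()))
    (goto-char (point-min))
    (while (re-search-forward "cite:\\([[:alnum:]_-]+\\)" nil t)
      (let* ((key (match-string 1))
             (n (or (cdr (assoc key seen))
                    ;; First time this key is seen: give it the next number.
                    (let ((new (1+ (length seen))))
                      (push (cons key new) seen)
                      new))))
        (replace-match (format "[%d]" n) t t)))))

;; Try it on a scratch buffer rather than a real export.
(with-temp-buffer
  (insert "A cite:kittel and cite:kitchin, again cite:kittel.")
  (sketch-number-cites 'html)
  (buffer-string))
```

A real processor also has to look up the entries, apply the style, and build the bibliography, but the contract with the exporter is the same: rewrite the buffer text before Org parses it.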

6 Summary thoughts

This looks promising. There is probably a lot of work to do to make this as robust as, say, citeproc-js or the Zotero handler. I am not sure if we could write this in a way that directly uses CSL. My feeling is that it would not be as flexible as this, and we would have to add to it anyway.

Here are some remaining things that could be worked on if we continue this direction.

  1. Other bibtex entries need to be tested out.
  2. Remaining bibtex fields need to be defined.
  3. Standardization of the styling that can be done. Not all features described in my csl are supported, e.g. et al. and probably others.
  4. The author-year style needs name disambiguation somehow.
  5. Hyperlinking in html.
  6. Make sure export to other backends works.
  7. Can this work for notes-based styles?

7 Bibliography

You use a bibliographystyle link to specify a csl. These are similar to bibtex styles, and in some cases no change is needed for LaTeX export (although you may have to remove the citeproc hook function).

  1. Kittel, Charles, Introduction to Solid State Physics, (2005).
  2. Kitchin, John R., Examples of Effective Data Sharing in Scientific Publishing, ACS Catalysis, 5(6), pp. 3894-3899 (2015). https://doi.org/10.1021/acscatal.5b00538.
  3. Xu, Zhongnan; Rossmeisl, Jan and Kitchin, John R., Supporting data for: A linear response, DFT+U study of trends in the oxygen evolution activity of transition metal rutile dioxides. doi:10.5281/zenodo.12635, https://zenodo.org/record/12635. https://doi.org/10.5281/zenodo.12635.
  4. Kitchin, John R., Data Sharing in Surface Science, Surface Science, N/A, pp. in press (2015). https://doi.org/10.1016/j.susc.2015.05.007.
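
As an illustration of how a bibliography style can turn a BibTeX entry into a line like entry 2 above, here is a minimal template sketch. The names sketch-article-template and sketch-format-entry are hypothetical, and the template format (symbols for fields, strings for literal text) is an assumption for this example rather than the format used in the org-ref csl files. The DOI link is left out to keep it short.

```elisp
(defvar sketch-article-template
  '(author ", " title ", " journal ", " volume "(" number "), pp. " pages
    " (" year ").")
  "Hypothetical template: symbols name BibTeX fields, strings are literal.")

(defun sketch-format-entry (entry template)
  "Format ENTRY, an alist of (FIELD . STRING), according to TEMPLATE.
A field missing from ENTRY is rendered as an empty string."
  (mapconcat (lambda (part)
               (if (symbolp part)
                   (or (cdr (assq part entry)) "")
                 part))
             template
             ""))

;; The article from the bibliography above, written as an alist.
(sketch-format-entry
 '((author . "Kitchin, John R.")
   (title . "Examples of Effective Data Sharing in Scientific Publishing")
   (journal . "ACS Catalysis")
   (volume . "5")
   (number . "6")
   (pages . "3894-3899")
   (year . "2015"))
 sketch-article-template)
```

Switching bibliography styles then amounts to swapping the template, which is the role the bibliographystyle link plays here.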

Copyright (C) 2015 by John Kitchin. See the License for information about copying.

Org-mode version = 8.2.10
