Sentence casing your bibtex entry journal titles

| categories: bib | tags:

I previously talked about title-casing the titles of journal articles in bibtex entries. Here we describe an alternative transformation: sentence-casing. In sentence case the first word is capitalized, and all others (except proper nouns). We also should capitalize the first word of any subtitles, which we take to be the first word after a :. That is usually correct. We should also ignore any LaTeX commands, or protected words in the title.

(defun jmax-sentence-case-article (&optional key start end)
  "Convert a bibtex entry article title to sentence-case. The
arguments are optional, and are only there so you can use this
function with `bibtex-map-entries' to change all the title
entries in articles."
  (interactive)
  (bibtex-beginning-of-entry)

  (let* ((title (bibtex-autokey-get-field "title"))
         (words (split-string title))
         (start 0))
    (when
        (string= "article" (downcase (cdr (assoc "=type=" (bibtex-parse-entry)))))
      (setq words (mapcar
                   (lambda (word)
                     (if
                         ;; match words containing {} or \ which are probably
                         ;; LaTeX or protected words
                         (string-match "\\$\\|{\\|}\\|\\\\" word)
                         word
                       (s-downcase word)))
                   words))
      
      ;; capitalize first word
      (setf (car words) (s-capitalize (car words)))

      ;; join the words
      (setq title (mapconcat 'identity words " "))

      ;; capitalize a word after a :, eg. a subtitle, and protect it
      (while
          (string-match "[a-z]:\\s-+\\([A-Z]\\)" title start)
        (let ((char (substring title (match-beginning 1) (match-end 1))))
          (setf (substring title (match-beginning 1) (match-end 1))
                (format "%s" (upcase char)))
          (setq start (match-end 1))))
            
      ;; this is defined in doi-utils
      (bibtex-set-field
       "title" title)

      ;; clean and refill entry so it looks nice
      (bibtex-clean-entry)
      (bibtex-fill-entry))))
jmax-sentence-case-article

Now, we can easily convert this entry in title-case:

@article{arroyave-2005-ab-ni,
  author =       {R. Arroyave and D. Shin and Z.-K. Liu},
  title =        {Ab Initio Thermodynamic Properties of Stoichiometric
                  Phases in the {Ni-Al} System},
  journal =      {Acta Materialia },
  volume =       53,
  number =       6,
  pages =        {1809 - 1819},
  year =         2005,
  doi =          {10.1016/j.actamat.2004.12.030},
  url =
                  {http://www.sciencedirect.com/science/article/pii/S1359645404007669},
  issn =         {1359-6454},
  keywords =     {Ab initio},
}

To this in sentence case:

@article{arroyave-2005-ab-ni,
  author =       {R. Arroyave and D. Shin and Z.-K. Liu},
  title =        {Ab initio thermodynamic properties of stoichiometric
                  phases in the {Ni-Al} system},
  journal =      {Acta Materialia },
  volume =       53,
  number =       6,
  pages =        {1809 - 1819},
  year =         2005,
  doi =          {10.1016/j.actamat.2004.12.030},
  url =
                  {http://www.sciencedirect.com/science/article/pii/S1359645404007669},
  issn =         {1359-6454},
  keywords =     {Ab initio},
}

The function is written so you can use it with bibtex-map-entries to change all the titles in one shot like this:

% (bibtex-map-entries 'jmax-sentence-case-article)

The function is not perfect. For example in this next entry, the chemical symbols Mn, Fe, Co, Ni, are incorrectly lower-cased.

@article{arroyo-2010-first-princ,
  author =       {Arroyo y de Dompablo, M. E. and Lee, Yueh-Lin and
                  Morgan, D.},
  title =        {First principles investigation of oxygen vacancies
                  in columbite \ce{MNb_2O_6} ({M = Mn, Fe, Co, Ni,
                  Cu})},
  journal =      {Chemistry of Materials},
  volume =       22,
  number =       3,
  pages =        {906-913},
  year =         2010,
  doi =          {10.1021/cm901723j},
  url =          {http://pubs.acs.org/doi/abs/10.1021/cm901723j},
  eprint =       {http://pubs.acs.org/doi/pdf/10.1021/cm901723j},
}

Here is the result of sentence casing:

@article{arroyo-2010-first-princ,
  author =       {Arroyo y de Dompablo, M. E. and Lee, Yueh-Lin and
                  Morgan, D.},
  title =        {First principles investigation of oxygen vacancies
                  in columbite \ce{MNb_2O_6} ({M = mn, fe, co, ni,
                  Cu})},
  journal =      {Chemistry of Materials},
  volume =       22,
  number =       3,
  pages =        {906-913},
  year =         2010,
  doi =          {10.1021/cm901723j},
  url =          {http://pubs.acs.org/doi/abs/10.1021/cm901723j},
  eprint =       {http://pubs.acs.org/doi/pdf/10.1021/cm901723j},
}

The Cu is not lower-cased because it has a } attached to it after the title is split into words. The original entry is not properly formatted, in my opinion. I was lazy in wrapping the whole string in braces, {M = Mn, Fe, Co, Ni, Cu}, to protect the capitalization of the elements in bibtex. The correct way to do this is the more verbose: {M} = {M}n, {F}e, {C}o, {N}i, {C}u, where each letter is individually protected.

Still, the function can save a lot of keystrokes. You should still inspect the final results, to catch any unusual modifications. You do have your bibtex file under version control right?

This function can also be found at https://github.com/jkitchin/jmax/blob/master/jmax-bibtex.el .

Copyright (C) 2014 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.7c

Discuss on Twitter