MS Word comments from org-mode

| categories: docx, orgmode | tags:

TL;DR:

Today I learned you can make a Word document from org-mode with Word comments in it. This could be useful when working with collaborators. The gist is that you write the comment as HTML, export to Markdown or HTML, and then let pandoc convert that to docx. A comment in HTML looks like this:

<span class="comment-start" author="jkitchin">Comment text</span>The text being commented on <span class="comment-end"></span> 

Let's wrap that in a link for convenience. I use a full display so it is easy to see the comment. I only export the comment for Markdown and HTML export; for everything else we just use the path. We somewhat abuse the link syntax here by using the path for the text to comment on, and the description for the comment.

(org-link-set-parameters
 "comment"
 :export (lambda (path desc backend)
           (if (member backend '(md html))
               (format "<span class=\"comment-start\" author=\"%s\">%s</span>%s<span class=\"comment-end\"></span>"
                       (user-full-name)
                       desc
                       path)
             ;; ignore for other backends and just use path
             path))
 :display 'full
 :face '(:foreground "orange"))                  
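To preview the markup that the :export function emits without running a full export, the format call can be pulled into a small helper. This is only a minimal sketch; the name my/comment-span is hypothetical and is not part of the link definition above.

```emacs-lisp
(defun my/comment-span (author comment text)
  "Return the HTML span pair that pandoc converts to a Word comment.
AUTHOR is the comment author, COMMENT is the comment body, and TEXT is
the text being commented on.  Hypothetical helper, for illustration only."
  (format "<span class=\"comment-start\" author=\"%s\">%s</span>%s<span class=\"comment-end\"></span>"
          author comment text))

;; Evaluate this form to see the markup for a single comment.
(my/comment-span "jkitchin" "Comment text" "The text being commented on")
```

Keeping the string building in one function makes it easy to check the output by eye before involving pandoc.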

Now, we use it like this: [[comment:This is the text commented on][This is the comment]].

In org-mode it looks like:

To get the Word doc, we need some code that first exports to Markdown and then calls pandoc to convert that to docx. Here is my solution. Usually you would put this in a subsection tagged with :noexport:, but I show it here so you can see it. Running this block generates the docx file and opens it. Here I also leverage org-ref to get some citations and cross-references.

(require 'org-ref-refproc)
(let* ((org-export-before-parsing-hook '(org-ref-cite-natmove ;; do this first
                                        org-ref-csl-preprocess-buffer
                                        org-ref-refproc))
       (md (org-md-export-to-markdown))
       (docx (concat (file-name-sans-extension md) ".docx")))
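  ;; quote the file names so paths with spaces survive the shell
  (shell-command (format "pandoc -s %s -o %s"
                         (shell-quote-argument md)
                         (shell-quote-argument docx)))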
  (org-open-file docx '(16)))
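If you would rather not build a shell command string at all, the pandoc step can be run with call-process, which passes each argument to pandoc directly. The following is a sketch under that assumption; my/docx-name and my/pandoc-to-docx are hypothetical names, and the *pandoc-output* buffer name is arbitrary.

```emacs-lisp
(defun my/docx-name (file)
  "Return FILE with its extension replaced by .docx."
  (concat (file-name-sans-extension file) ".docx"))

(defun my/pandoc-to-docx (input)
  "Convert INPUT to docx with pandoc and return the docx file name.
Signal an error if pandoc exits with a non-zero status.
Hypothetical helper; assumes pandoc is on the `exec-path'."
  (let* ((docx (my/docx-name input))
         ;; Any pandoc messages are collected in the *pandoc-output* buffer.
         (status (call-process "pandoc" nil "*pandoc-output*" nil
                               "-s" input "-o" docx)))
    (unless (eq status 0)
      (error "pandoc failed; see the *pandoc-output* buffer"))
    docx))
```

It could then replace the shell-command line, for example with (org-open-file (my/pandoc-to-docx md) '(16)).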

The result looks like this in MS Word:

How a comment looks in Word.

That is pretty remarkable. There are some limitations with the Markdown route: I find the tables don't look good, not all equations are converted, and some cross-references are off. Next we add some more org features and try the export with HTML.

1. export features for test

Test cross-references, references, equations, etc…

Aliquam erat volutpat (Fig. fig-2). Nunc eleifend leo vitae magna. In id erat non orci commodo lobortis. Proin neque massa, cursus ut, gravida ut, lobortis eget, lacus. Sed diam. Praesent fermentum tempor tellus. Nullam tempus &yang-2022-evaluat-degree. Mauris ac felis vel velit tristique imperdiet. Donec at pede. Etiam vel neque nec dui dignissim bibendum. Vivamus id enim. Phasellus neque orci, porta a, aliquet quis in Table tab-1, semper a, massa. Phasellus purus (eq-1). Pellentesque tristique imperdiet tortor. Nam euismod tellus id erat &kolluru-2022-open-chall.

Table 1: A table.
| x | y |
|---+---|
| 1 | 3 |
| 3 | 6 |

We have equations:

\begin{equation} \label{org9973acf} y = mx + b \end{equation}

  • bullet1
    • nested bullet
  • bullet2

Some definitions:

emacs
greatest editor

  1. item 1
  2. item 2

One equation: \(e^{i\pi} + 1 = 0\)

A second equation:

\begin{equation} e^{i\pi} + 1 = 0 \end{equation}

3. Alternate build with HTML.

Here we consider alternate build approaches, for example HTML.

Run this to get the docx file. I find this superior to the Markdown route; the result has references, cross-references, equations, tables, figures, and even a title.

(let* ((org-export-before-parsing-hook '(org-ref-csl-preprocess-buffer
                                         org-ref-refproc))
       (org-html-with-latex 'dvipng)
       (f (org-html-export-to-html))
       (docx (concat (file-name-sans-extension f) ".docx")))
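  ;; quote the file names so paths with spaces survive the shell
  (shell-command (format "pandoc -s %s -o %s"
                         (shell-quote-argument f)
                         (shell-quote-argument docx)))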
  (org-open-file docx '(16)))

Copyright (C) 2023 by John Kitchin. See the License for information about copying.

Org-mode version = 9.5.5

ox-pandoc - org-mode + org-ref to docx with bibliographies

| categories: docx, pandoc, orgmode | tags:

There is a new org-mode exporter: ox-pandoc. It seems to make it easy to convert org-mode to other formats, including docx, with the references collected in a bibliography. Let us try it out.

1 The setup

org-ref modifies helm-bibtex to insert citation links. We have to undo that here so that the org-ref key binding for inserting references inserts LaTeX-style citations instead. This is necessary for pandoc to convert the citations into a bibliography in the docx output. If you do not use org-ref, this is probably not necessary.

(setq helm-bibtex-format-citation-functions
      '((org-mode . (lambda (x) (insert (concat
                                         "\\cite{"
                                         (mapconcat 'identity x ",")
                                         "}")) ""))))

We have to add ox-pandoc to the load path and require it.

(add-to-list 'load-path (expand-file-name "ox-pandoc" starter-kit-dir))
(require 'ox-pandoc)

2 The document

Now, for some text. Grindy wrote this nice paper on approaching chemical accuracy with density functional calculations \cite{grindy-2013-approac}. Two other interesting papers include these ones \cite{guldner-1961,guerrini-2008-effec-feo}.

An equation: \(e^x = 4\).

And a figure with a caption:

Figure 1: Make sure this is in your org-file.

3 Summary

This is better than what I have seen in the past. ox-pandoc has some options that might tailor the bibliography to specific formats. You lose some functionality of org-ref cite links by using raw LaTeX, but if that is not a deal breaker this might be a good way to go for some purposes.

Here is the word document that results from this file: test-doc.docx

Copyright (C) 2015 by John Kitchin. See the License for information about copying.

Org-mode version = 8.2.10

Export org-mode to docx with citations via pandoc

| categories: docx, orgmode | tags:

Pandoc continues to develop, and since the last time I wrote about it there is improved support for citations. We will use that to convert org documents to Word documents that actually have citations and a bibliography in them. This post explores using helm-bibtex to insert pandoc-compatible citations, and then using pandoc to convert the org file to a Word document (docx). We can define the format of citations that helm-bibtex inserts in a function, and tell helm-bibtex to use it when in org-mode.

Here is that code. It just gives me a convenient tool for searching my bibtex file and inserting citations. I think you could just as easily use reftex for this, or an ido-completing function on bibtex keys. See the Pandoc User's Guide for directions on the citation format. The key is to format the cite links in the pandoc format.

(defun helm-bibtex-format-pandoc-citation (keys)
  (concat "[" (mapconcat (lambda (key) (concat "@" key)) keys "; ") "]"))

;; inform helm-bibtex how to format the citation in org-mode
(setf (cdr (assoc 'org-mode helm-bibtex-format-citation-functions))
  'helm-bibtex-format-pandoc-citation)

Now, we can cite the org-mode book [@dominik-2010-org-mode], and some interesting papers on using org-mode [@schulte-2011-activ-docum; @schulte-2012-multi-languag]. You could pretty easily add pre and post text manually to these, after selecting and inserting them.
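Adding the pre and post text could also be done in code rather than by hand. Pandoc's citation syntax puts a prefix before the first key and a locator or suffix after a comma, as in [see @key, p. 3]. The following is a minimal sketch of that idea; the function name and its optional arguments are hypothetical and are not part of helm-bibtex.

```emacs-lisp
(defun my/pandoc-citation-with-affixes (keys &optional prefix suffix)
  "Return a pandoc citation string for KEYS, a list of BibTeX keys.
PREFIX, if non-nil, is placed before the first key.  SUFFIX, if non-nil,
is placed after the last key, separated by a comma.
Hypothetical helper, for illustration only."
  (concat "["
          (if prefix (concat prefix " ") "")
          (mapconcat (lambda (key) (concat "@" key)) keys "; ")
          (if suffix (concat ", " suffix) "")
          "]"))

;; One key with a prefix and a page locator.
(my/pandoc-citation-with-affixes '("dominik-2010-org-mode") "see" "p. 3")
```

With no prefix or suffix it returns the same string as helm-bibtex-format-pandoc-citation above.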

We need a bibliography file for pandoc to work. I will use a bibtex file, since I already have it and am using helm-bibtex to select keys. Pandoc could not read my massive bibtex file (perhaps it does not support all the entry types yet), so I made a special small bibtex file for this. Now all we need to do is convert this file to a docx. I use a function like this to do that. It uses an org-ref function to get the bibliography defined in this file, derives some file names, and then runs pandoc.

(defun ox-export-to-docx-and-open ()
 "Convert the current org file to docx with pandoc and open the result."
 (interactive)
 (let* ((bibfile (expand-file-name (car (org-ref-find-bibliography))))
        ;; this is probably a full path
        (current-file (buffer-file-name))
        (basename (file-name-sans-extension current-file))
        (docx-file (concat basename ".docx")))
   (save-buffer)
   (when (file-exists-p docx-file) (delete-file docx-file))
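   ;; quote the file names so paths with spaces survive the shell
   (shell-command (format
                   "pandoc -s -S --bibliography=%s %s -o %s"
                   (shell-quote-argument bibfile)
                   (shell-quote-argument current-file)
                   (shell-quote-argument docx-file)))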
   (org-open-file docx-file '(16))))

And now we run it to get our docx.

(ox-export-to-docx-and-open)

Here is the result: org-to-docx-pandoc.docx

It is not too bad. Not all the equations showed up below, and the figure did not appear for some reason. But, the citations went through fine. A downside of this is the citation links are not clickable (but see Making pandoc links for a way to do this), so they lack all the awesome features that org-ref gives them. Maybe pandoc can convert these to LaTeX links, but we already have such a good framework for that I do not see why you would want to do it. A better option is to figure out how to export the org file to an org file, and transform the org citation links to pandoc citations, then use pandoc on the temporarily transformed buffer. That way, you keep the cite links and their functionality, and ability to export to many formats, and get export to docx via pandoc.
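The transformation suggested here, turning org-ref cite links into pandoc citations before handing the text to pandoc, could look something like the following minimal sketch. It assumes only the plain cite:key1,key2 link form; the function name and the regular expression for keys are illustrative assumptions, and other link types (citep:, bracketed links, and so on) are not handled.

```emacs-lisp
(defun my/cite-links-to-pandoc (text)
  "Return TEXT with cite:key1,key2 links rewritten as [@key1; @key2].
Hypothetical helper; only the plain cite: link form is recognized."
  (replace-regexp-in-string
   ;; \\b keeps this from matching inside longer link types such as autocite:
   "\\bcite:[-a-zA-Z0-9_:]+\\(?:,[-a-zA-Z0-9_:]+\\)*"
   (lambda (match)
     ;; Drop the leading \"cite:\" and split the remainder on commas.
     (let ((keys (split-string (substring match (length "cite:")) "," t)))
       (concat "["
               (mapconcat (lambda (key) (concat "@" key)) keys "; ")
               "]")))
   text t t))

;; Example: rewrite one sentence.
(my/cite-links-to-pandoc "See cite:schulte-2011-activ-docum,schulte-2012-multi-languag for details.")
```

Applied to a copy of the buffer text before running pandoc, something like this would leave the original cite links untouched in the org file.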

There are other options in pandoc to fine tune the reference format (you need a csl file). That can be included in the org-file via file tags pretty easily. These citations are not links in the word document, and it does not look like they can be converted to footnotes, endnotes or interact with Endnote or Zotero at this time, but it is a step forward in getting a passable word document with references out of org-mode!
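As a sketch of how a csl file could be passed along, the pandoc arguments can be built as a list. This assumes a recent pandoc (2.11 or later), where citations are processed with --citeproc and the old -S flag has been replaced by the smart extension on the input format; the function name and the file names in the example are hypothetical.

```emacs-lisp
(defun my/pandoc-docx-args (org-file bibfile &optional csl-file)
  "Return a list of pandoc arguments that convert ORG-FILE to docx.
BIBFILE is the bibliography file and CSL-FILE an optional citation style.
Hypothetical helper; assumes pandoc 2.11 or later."
  (append
   (list "-s" "-f" "org+smart" "--citeproc"
         (concat "--bibliography=" bibfile))
   ;; Only add --csl when a style file was given.
   (when csl-file
     (list (concat "--csl=" csl-file)))
   (list org-file "-o"
         (concat (file-name-sans-extension org-file) ".docx"))))

;; Example argument list; pass it to pandoc with
;; (apply #'call-process "pandoc" nil "*pandoc-output*" nil ARGS).
(my/pandoc-docx-args "post.org" "refs.bib" "style.csl")
```

Building a list instead of a command string avoids shell quoting, and the optional csl argument can simply be left out to get pandoc's default style.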

Since we are testing, let us try some other typical features in an org-file.

1 Numbered list

  1. Item 1
  2. Item 2
  3. Item 3

2 Bulleted list

  • item 1
  • item 2
  • item 3
    • subitem

3 definitions

org-mode
tool for awesomeness

4 Math

One equation: \(e^{i\pi} + 1 = 0\)

A second equation:

\begin{equation}
e^{i\pi} + 1 = 0
\end{equation}

5 An image

Figure 1: A little icon.

6 A table

Table 1: A little table.
| x | y |
|---+---|
| 1 | 2 |
| 3 | 4 |

a plain table

| x | y |
|---+---|
| 1 | 2 |
| 3 | 4 |

7 Making pandoc links

Here I show a way to get clickable text on pandoc links. I found a nice library called button-lock that uses a regular expression to attach text properties to matching text.

Below I repeat the citations so it is easy to see the effect after running the code block. Indeed, you get clickable text, even org-ref like capability. I think you could even add the idle-timer messages, and the org-ref menu.

Now, we can cite the org-mode book [@dominik-2010-org-mode], and some interesting papers on using org-mode [@schulte-2011-activ-docum; @schulte-2012-multi-languag]. You could pretty easily add pre and post text manually to these, after selecting and inserting them.

You would need to make this code run when you open an org-file to get it to work every time.

(require 'button-lock)
(global-button-lock-mode)

(button-lock-set-button
 "@\\([-a-zA-Z0-9_:]*\\)"
 (lambda ()
   (interactive)
   (re-search-backward "@")
   (re-search-forward  "@\\([-a-zA-Z0-9_:]*\\)")
   (let* ((key (match-string-no-properties 1))
          (bibfile (cdr (org-ref-get-bibtex-key-and-file key))))
     (if bibfile
        (save-excursion
          (with-temp-buffer
            (insert-file-contents bibfile)
            (bibtex-search-entry key)
            (message (org-ref-bib-citation))))
       (message "No entry found"))))
 :face (list 'org-link))
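To check which keys the button pattern would pick up, the same kind of regular expression can be run over a string. This is a small sketch, not part of button-lock or org-ref; the function name is hypothetical, and it uses + instead of * so that a bare @ does not produce an empty key.

```emacs-lisp
(defun my/pandoc-citation-keys (text)
  "Return a list of the citation keys written as @key in TEXT, in order.
Hypothetical helper, for illustration only."
  (let ((start 0)
        (keys '()))
    ;; Scan TEXT left to right, collecting group 1 of each match.
    (while (string-match "@\\([-a-zA-Z0-9_:]+\\)" text start)
      (push (match-string 1 text) keys)
      (setq start (match-end 0)))
    (nreverse keys)))

;; Example: the keys in one pandoc citation.
(my/pandoc-citation-keys "[@schulte-2011-activ-docum; @schulte-2012-multi-languag]")
```

Note that a pattern this simple also matches things like email addresses, so it is only a rough check.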

8 References

Copyright (C) 2015 by John Kitchin. See the License for information about copying.

Org-mode version = 8.2.10