A sudo org-link and sh block

| categories: babel, orgmode, emacs | tags:

Shell blocks in org-mode are pretty useful, but they are a little limited in that it is not obvious how to run a sudo command in them.

So for example, this gives me a permission denied error.

ls /var/audit

One way to get around this is to create an org-mode link like this one:

;http://stackoverflow.com/questions/2472273/how-do-i-run-a-sudo-command-in-emacs
(org-add-link-type
 "sudo"
 (lambda (cmd)
   "Run CMD with sudo."
   (shell-command
    (concat "echo " (shell-quote-argument (read-passwd "Password? "))
            " | sudo -S " cmd))))

Now you can create a link like [[sudo:ls /var/audit]], and when you click on it you will be prompted for a password, and then you will see a buffer containing the output.

To get an actual sudo code block, you need a new org babel library. Here is an example of what it might look like. Tangle this file to generate the library. Note: this is a lightly modified version of ob-emacs-lisp.el, and I have not tested it very thoroughly.

;;; ob-sudo.el --- An org-mode source block to run shell commands as sudo

;;; Commentary:
;; Runs the block of code as a shell command with sudo.

;;; Code:

(defun org-babel-execute:sudo (body params)
  "Run BODY as a shell command using sudo."
  (let* ((passwd (shell-quote-argument (read-passwd "Password? ")))
         (result (shell-command-to-string
                  (concat "echo " passwd
                          " | sudo -S " body))))
    ;; this is verbatim from ob-emacs-lisp
    (org-babel-result-cond (cdr (assoc :result-params params))
      (let ((print-level nil)
            (print-length nil))
        (if (or (member "scalar" (cdr (assoc :result-params params)))
                (member "verbatim" (cdr (assoc :result-params params))))
            (format "%S" result)
          (format "%s" result)))
      (org-babel-reassemble-table
       result
       (org-babel-pick-name (cdr (assoc :colname-names params))
                            (cdr (assoc :colnames params)))
       (org-babel-pick-name (cdr (assoc :rowname-names params))
                            (cdr (assoc :rownames params)))))))

(provide 'ob-sudo)
;;; ob-sudo.el ends here
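
Both the link handler and the babel block above build the same shell command string before handing it to the shell. As a minimal sketch (the helper name my-sudo-command-string is hypothetical and not part of the library), that step can be pulled out so the string can be inspected without running sudo:

```elisp
;; Hypothetical helper, for illustration only: it mirrors the `concat'
;; used in the sudo link and in `org-babel-execute:sudo'.
(defun my-sudo-command-string (password cmd)
  "Return the shell command that pipes PASSWORD to sudo running CMD."
  (concat "echo " (shell-quote-argument password)
          " | sudo -S " cmd))

;; Build the string without running anything.
(my-sudo-command-string "secret" "ls /var/audit")
```

Two things are worth noticing in the resulting string. Only the password is quoted, so if CMD contains several commands or a pipeline, only the first command runs under sudo. The password is also part of the command line passed to the shell, which is one more reason to treat this as an experiment rather than a hardened tool.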

Let us add the current directory to our load path so we can load the library. If you use this a lot, you should put the library on your permanent path.

(add-to-list 'load-path (expand-file-name "."))

Now, add the sudo "language" to org-babel-load-languages.

(org-babel-do-load-languages
 'org-babel-load-languages
 '((emacs-lisp . t)
   (python . t)
   (sh . t)
   (matlab . t)
   (sqlite . t)
   (ruby . t)
   (perl . t)
   (org . t)
   (dot . t)
   (plantuml . t)
   (R . t)
   (sudo . t)))
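
If you already configure your babel languages elsewhere, re-listing all of them is not necessary. A minimal sketch, assuming ob-sudo.el is already on your load-path, is to add just the one entry and reload:

```elisp
;; Assumes ob-sudo.el is on `load-path'; otherwise loading the
;; sudo "language" will fail.
(require 'org)
(add-to-list 'org-babel-load-languages '(sudo . t))
(org-babel-do-load-languages 'org-babel-load-languages
                             org-babel-load-languages)
```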

And, here it is in action. Hopefully I am not giving away some important information here!

ls /var/audit
20141106003522.20141110021519
20141110021548.crash_recovery
20141112154126.crash_recovery
20141119201541.20141122145259
20141122145317.20141124214930
20141124215000.crash_recovery
20141126062011.20141202192451
20141202192507.crash_recovery
20141210133306.crash_recovery
20141225181819.20150106015256
20150106015325.20150111010018
20150111010121.crash_recovery
20150115195518.20150115200101
20150115200110.crash_recovery
20150123061227.20150215123411
20150215123454.crash_recovery
20150225004740.20150310201600
20150310201633.20150314214730
20150314214807.crash_recovery
20150323145600.20150329170647
20150329170721.crash_recovery
20150407215846.20150413000423
20150413000438.20150421122044
20150421122104.20150518122545
20150518122616.20150518124432
20150518124432.20150518124513
20150518124513.20150518125437
20150518125437.20150518125935
20150518125935.20150518132111
20150518132111.20150531202621
20150531202719.20150601123612
20150601123612.20150601124932
20150601124932.20150601125151
20150601125151.20150601125555
20150601125555.20150601131947
20150601131947.20150601132421
20150601132421.20150601133735
20150601133735.20150601140740
20150601140740.20150601154012
20150601154012.20150601155125
20150601155125.20150601155215
20150601155215.20150601160937
20150601160937.crash_recovery
20150613061543.20150614054541
20150614054541.20150625165357
20150625165432.20150625200623
20150625200623.20150628042242
20150628042242.20150628103628
20150628103628.20150630052100
20150630052100.20150701232519
20150702005345.20150710203212
20150710203226.not_terminated
current

Summary thoughts: To reiterate, I have not tested this much; I was mostly interested in making a new sh block with sudo support. Let me know if it has issues for you, and make sure you have backups of anything it could mess up!

Copyright (C) 2015 by John Kitchin. See the License for information about copying.

Org-mode version = 8.2.10

Drag images and files onto org-mode and insert a link to them

| categories: video, emacs | tags:

I want to drag and drop an image onto an org mode file and get a link to that file. This would be used for finding images in Finder, and then dragging them to the Emacs buffer. There is org-download.el which looks like it should do something like this too, but it did not work out of the box for me, and I want to add a few wrinkles to it. For a simple drag-n-drop, I just want the link to appear. With ctrl-drag-n-drop I want to add an attr_org line to set the image size, add a caption line, insert the image at the beginning of the line where the mouse cursor is, put the cursor on the caption line and then refresh the inline images in org-mode so the image is immediately visible.

While we are at it, let us also make it possible to drag file links onto org files, instead of having the files open. Again, for a simple drag-n-drop, I want a link inserted. For ctrl-drag-n-drop we open the file, and for Meta (alt) drag-n-drop, we insert an attachfile link. You can also define s-drag-n-drop (Super/command) and C-s and M-s drag-n-drop if you can think of things to do with them.

Here is the code to make those things happen. Or watch the video: https://www.youtube.com/watch?v=ahqKXbBVjpQ

(defun my-dnd-func (event)
  (interactive "e")
  (goto-char (nth 1 (event-start event)))
  (x-focus-frame nil)
  (let* ((payload (car (last event)))
         (type (car payload))
         (fname (cadr payload))
         (img-regexp "\\(png\\|jp[e]?g\\)\\>"))
    (cond
     ;; insert image link
     ((and  (eq 'drag-n-drop (car event))
            (eq 'file type)
            (string-match img-regexp fname))
      (insert (format "[[%s]]" fname))
      (org-display-inline-images t t))
     ;; insert image link with caption
     ((and  (eq 'C-drag-n-drop (car event))
            (eq 'file type)
            (string-match img-regexp fname))
      (insert "#+ATTR_ORG: :width 300\n")
      (insert (concat  "#+CAPTION: " (read-string "Caption: ") "\n"))
      (insert (format "[[%s]]" fname))
      (org-display-inline-images t t))
     ;; C-drag-n-drop to open a file
     ((and  (eq 'C-drag-n-drop (car event))
            (eq 'file type))
      (find-file fname))
     ((and (eq 'M-drag-n-drop (car event))
           (eq 'file type))
      (insert (format "[[attachfile:%s]]" fname)))
     ;; regular drag and drop on file
     ((eq 'file type)
      (insert (format "[[%s]]\n" fname)))
     (t
      (error "I am not equipped for dnd on %s" payload)))))


(define-key org-mode-map (kbd "<drag-n-drop>") 'my-dnd-func)
(define-key org-mode-map (kbd "<C-drag-n-drop>") 'my-dnd-func)
(define-key org-mode-map (kbd "<M-drag-n-drop>") 'my-dnd-func)
my-dnd-func
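
The dispatch logic in my-dnd-func depends on the order of the cond clauses, which is easy to get wrong when adding new modifiers. Here is a small sketch that separates the decision from the side effects so it can be checked on hand-built events. The function name my-dnd-action and the returned symbols are made up for illustration, and the event shape (modifier symbol first, a (file "path") payload last) is assumed from the code above; it may differ on other platforms or Emacs versions. Unlike the original regexp, this one anchors the extension at the end of the file name.

```elisp
;; Hypothetical helper: returns a symbol naming the action instead of
;; performing it. Event shape assumed: (MODIFIER ... (TYPE FNAME)).
(defun my-dnd-action (event)
  "Return a symbol describing what a drop EVENT should do."
  (let* ((modifier (car event))
         (payload (car (last event)))
         (type (car payload))
         (fname (cadr payload))
         (imagep (and (stringp fname)
                      (string-match-p "\\.\\(png\\|jpe?g\\)\\'" fname))))
    (cond
     ((not (eq type 'file)) 'unsupported)
     ((and imagep (eq modifier 'drag-n-drop)) 'image-link)
     ((and imagep (eq modifier 'C-drag-n-drop)) 'captioned-image)
     ((eq modifier 'C-drag-n-drop) 'open-file)
     ((eq modifier 'M-drag-n-drop) 'attachfile-link)
     (t 'file-link))))

;; A ctrl-drop of a non-image file should open it.
(my-dnd-action '(C-drag-n-drop nil (file "/tmp/notes.txt")))
```

As in the original, the image clauses come before the generic file clauses, so a ctrl-dropped image gets a caption rather than being opened.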

jkitchin.json

Acronym minor mode for Emacs

| categories: tooltip, video, emacs | tags:

Three letter acronyms (TLA) are pretty common in technical documents, as are other kinds of acronyms, e.g. ferromagnetic (FM), anti-ferromagnetic (AFM), National Security Agency (NSA), and even Escape-Meta-Alt-Control-Shift (EMACS). As you get farther from the definition, it can get hard to remember what they mean, so here we develop a minor mode that puts a tooltip over each acronym that hopefully shows what it means.

You can see this in action here: https://www.youtube.com/watch?v=2G2isMO6E2c

When we turn the mode on, it will scan the buffer looking for an acronym pattern, deduce its likely meaning, and put tooltips on every subsequent use of the acronym. The pattern we will look for is a sequence of uppercase letters surrounded by parentheses. We will assume that if we find N uppercase letters, that the previous N words contain the definition of the acronym. This is pretty approximate, but it is not likely to be that wrong. Then, we will use button-lock to put the tooltips on all subsequent instances of acronyms. We don't want flyspell interfering with the tooltips, so we remove the overlays if they are there.
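
To make the "previous N words" heuristic concrete, here is a small sketch that applies it to a plain string in a temporary buffer. The function name my-acronym-definitions is made up for this example, and unlike the mode below it leaves the parenthesized acronym out of the definition text.

```elisp
;; Hypothetical helper: guess definitions with the "N uppercase
;; letters => previous N words" heuristic described above.
(defun my-acronym-definitions (text)
  "Return an alist of (ACRONYM . DEFINITION) guessed from TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((case-fold-search nil)
          (result '()))
      (while (re-search-forward "(\\([A-Z]+\\))" nil t)
        (let* ((acronym (match-string 1))
               (open-paren (match-beginning 0))
               (definition
                 (save-excursion
                   ;; Start just before the "(" and drop trailing space.
                   (goto-char open-paren)
                   (skip-chars-backward " \t\n")
                   (let ((end (point)))
                     ;; Step back one word per letter in the acronym.
                     (backward-word (length acronym))
                     (buffer-substring-no-properties (point) end)))))
          (push (cons acronym definition) result)))
      (nreverse result))))

(my-acronym-definitions
 "The National Security Agency (NSA) and ferromagnetic (FM) order.")
;; → (("NSA" . "National Security Agency") ("FM" . "and ferromagnetic"))
```

The second result shows where the heuristic is only approximate: "ferromagnetic" is one word but FM has two letters, so the guess picks up the preceding "and".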

Unlike previous examples where we just use button-lock, here we wrap the feature into a minor mode that you can turn on and off. Note that acronyms you define after turning the mode on do not get tooltips automatically; you have to refresh the buttons with refresh-acronyms.

Here is the minor mode code. We use the interesting rx package to build the regular expression. It is more verbose, but a little easier to read than a straight regexp like (concat "\\<" (match-string 1) "\\>") in my opinion.

(make-variable-buffer-local
  (defvar *acronym-buttons* '() "list of acronym buttons"))

(require 'rx)
(require 'button-lock)

(defun highlight-acronyms ()
  (save-excursion
    (let ((case-fold-search nil))
      (goto-char (point-min))
      (while (re-search-forward "(\\([A-Z]+\\))" nil t)
        (when (bound-and-true-p flyspell-mode)
          (flyspell-delete-region-overlays (match-beginning 1)
                                           (match-end 1)))
        (let* ((acronym (match-string 1))
               (p (point))
               (definition (save-excursion
                             (goto-char (match-beginning 1))
                             (backward-word (length acronym))
                             (buffer-substring (point) p))))
          (add-to-list '*acronym-buttons*
                       (button-lock-set-button
                        (rx word-start (eval (match-string 1)) word-end)
                        nil
                        :help-echo definition)))))))


(defun remove-acronym-buttons ()
  (dolist (button *acronym-buttons*)
      (button-lock-unset-button button))
  (setq *acronym-buttons* '()))


(defun refresh-acronyms ()
  "Refresh acronym tooltips in buffer."
  (interactive)
  (remove-acronym-buttons)
  (highlight-acronyms))


;;;###autoload
(define-minor-mode acronym-mode
  "Put definitions on acronyms."
  :lighter " AM"
  (if acronym-mode
      (highlight-acronyms)
    (remove-acronym-buttons)))


(provide 'acronym-mode)
acronym-mode

There it is. Now any time we have an acronym like EMACS we can mouse over it, or type C-h . on the acronym to see how it was previously defined. If you don't like it, you can turn it off!
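
A side note on the rx form used above: rx is a macro, so when the acronym is only known at run time, the function rx-to-string is another way to build the same whole-word regexp. This is a sketch rather than a change to the mode, and my-acronym-regexp is a made-up name.

```elisp
(require 'rx)

;; Hypothetical helper: build the whole-word regexp for a string that
;; is only known at run time.
(defun my-acronym-regexp (acronym)
  "Return a regexp matching ACRONYM only as a whole word."
  (rx-to-string `(seq word-start ,acronym word-end) t))

(my-acronym-regexp "EMACS")
;; → "\\<EMACS\\>"
```

The word boundaries matter here: the regexp for FM matches "the FM phase" but not the FM inside AFM.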

Indexing headlines in org files with swish-e with laser-sharp results

| categories: orgmode, swishe, emacs | tags:

So far, it looks like swish-e is able to do some pretty focused searches on specific content types. However, the returned results are not actually that sharp; in the way we have been using swish-e, it can only tell us the path of the document that matches, not where in the document the match is. To fix that, we need a new approach to what a "document" is, and a new approach to indexing. We will finally use the "-S prog" option in swish-e, which runs an external program that prints content to stdout for swish-e to index. We will treat each headline in an org file as a "document", but rather than storing the path to the file, we will put an org-mode link there that takes us right to the point of interest.

You can see this in action here: https://www.youtube.com/watch?v=bTwXtEb5Ng8

Basically, we need a program to output chunks like this for each headline in an org-file:

Path-Name: [[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/ase-db.org") (goto-char 1))]]
Content-Length: 247
Document-Type: XML*

<headline><title>Using the ase database module</title><properties><FILE>/Users/jkitchin/blogofile-jkitchin.github.com/_blog/ase-db.org</FILE><BLOCKED></BLOCKED><categories>python, ase</categories><CATEGORY>ase-db</CATEGORY></properties></headline>
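
Each chunk is just a three-line header, a blank line, and the XML body. As a sketch of that layout (my-swish-document is a hypothetical helper, not part of the script below), the record can be assembled like this. It uses string-bytes for Content-Length on the assumption that swish-e counts bytes; for ASCII-only text that is the same number as the character count.

```elisp
;; Hypothetical helper: assemble one record in the header-plus-body
;; layout shown above.
(defun my-swish-document (path-name xml)
  "Return a record for swish-e with XML filed under PATH-NAME."
  (format "Path-Name: %s\nContent-Length: %d\nDocument-Type: XML*\n\n%s"
          path-name
          (string-bytes xml)
          xml))

(princ (my-swish-document "doc1" "<a>hi</a>"))
```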

Then we need to tell swish-e to run the program and index its output. Here is the program to do that.

:;exec emacs -batch -l $0 "$@"
(require 'org)
(require 'xml)
(require 'cl)

(add-to-list 'load-path "~/Dropbox/kitchingroup/jmax/elpa/f-20140828.716")
(add-to-list 'load-path "~/Dropbox/kitchingroup/jmax/elpa/s-20140910.334")
(add-to-list 'load-path "~/Dropbox/kitchingroup/jmax/elpa/dash-20141201.2206")
(require 'f)

(defun print-tag (name attrs &optional closingp)
  "Print an xml tag with symbol NAME and ATTRS (a cons list of (attribute . value)).
if CLOSINGP print the closing tag instead."
  (format
   "<%s%s%s>"
   (if closingp "/" "")
   name
   (if (and attrs (not closingp))
       (concat
        " "
        (mapconcat
         (lambda (x)
           (format "%s=\"%s\""
                   (car x)
                   (xml-escape-string (cdr x))))
         attrs
         " "))
     "")))

(defmacro tag (name attributes &rest body)
  "macro to create an xml tag with NAME, ATTRIBUTES. BODY is executed in the tag."
  `(format "%s%s%s"
           (print-tag ,name ,attributes nil)
           (concat
            ,@body)
           (print-tag ,name nil t)))

(defun headline-xml (headline)
  "Return xml representation of an element HEADLINE."
  (let ((title (org-element-property :title headline))
        (properties (save-excursion
                      (goto-char
                       (org-element-property :begin headline))
                      (org-entry-properties))))
    (tag 'headline ()
         (tag 'title () (xml-escape-string (mapconcat 'identity title " ")))
         (when properties
           (tag 'properties ()
                (mapconcat
                 'identity
                 (loop for (p . v) in properties
                       collect (tag p () (xml-escape-string v)))
                 ""))))))

(defun headline-document (headline)
  "Return the headline \"document\" for swish-e to index."
  (let ((xml (replace-regexp-in-string
              "[^[:ascii:]]" ""
              (headline-xml headline))))
    (format "Path-Name: [[elisp:(progn (find-file \"%s\") (goto-char %s) (show-children))][link]]
Content-Length: %s
Document-Type: XML*

%s" (buffer-file-name)
(org-element-property :begin headline)
(length xml)
xml)))

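(defun process-file (fname)
  "Return the concatenated `headline-document' of each headline in FNAME."
  (with-current-buffer (find-file-noselect fname)
    (mapconcat 'identity
               (org-element-map (org-element-parse-buffer)
                   'headline
                 ;; Return each record instead of printing it here. The
                 ;; main loop below already calls `princ' on the result,
                 ;; so printing in both places emits every headline twice.
                 (lambda (headline)
                   (headline-document headline)))
               "")))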

;; Here is the main work in the script.
(loop for dir in '("/Users/jkitchin/blogofile-jkitchin.github.com/_blog")
      do
      (loop for fname in (f-entries
                          dir
                          (lambda (x)
                            (string=  "org"  (file-name-extension x)))
                          t)
            do (ignore-errors
                 (princ (process-file fname)))))

Now we need a configuration file:

# Example configuration file

# where to save the index
IndexFile /Users/jkitchin/blogofile-jkitchin.github.com/_blog/index-org-headlines.swish-e

# index all tags for searching
UndefinedMetaTags auto
UndefinedXMLAttributes auto

And we run the indexer. I did this in an actual shell; for some reason, it was not possible to run it here. The output is pretty useful though, as it tells you what MetaNames are searchable.

swish-e -c swish-org-headlines.conf -S prog -i ./swish-org-headlines.el
10:17 $ swish-e -c swish-org-headlines.conf -S prog -i ./swish-org-headlines.el
Indexing Data Source: "External-Program"
Indexing "./swish-org-headlines.el"
External Program found: ./swish-org-headlines.el
**Adding automatic MetaName 'headline' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/writing-exams-in-orgmode.org") (goto-char 18) (show-children))][link]]'
**Adding automatic MetaName 'title' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/writing-exams-in-orgmode.org") (goto-char 18) (show-children))][link]]'
**Adding automatic MetaName 'properties' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/writing-exams-in-orgmode.org") (goto-char 18) (show-children))][link]]'
**Adding automatic MetaName 'file' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/writing-exams-in-orgmode.org") (goto-char 18) (show-children))][link]]'
**Adding automatic MetaName 'blocked' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/writing-exams-in-orgmode.org") (goto-char 18) (show-children))][link]]'
**Adding automatic MetaName 'categories' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/writing-exams-in-orgmode.org") (goto-char 18) (show-children))][link]]'
**Adding automatic MetaName 'date' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/writing-exams-in-orgmode.org") (goto-char 18) (show-children))][link]]'
**Adding automatic MetaName 'updated' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/writing-exams-in-orgmode.org") (goto-char 18) (show-children))][link]]'
**Adding automatic MetaName 'category' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/writing-exams-in-orgmode.org") (goto-char 18) (show-children))][link]]'
**Adding automatic MetaName 'points' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/writing-exams-in-orgmode.org") (goto-char 1391) (show-children))][link]]'
**Adding automatic MetaName 'tags' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/why-org-mode.org") (goto-char 25) (show-children))][link]]'
**Adding automatic MetaName 'alltags' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/why-org-mode.org") (goto-char 25) (show-children))][link]]'
**Adding automatic MetaName 'todo' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/why-org-mode.org") (goto-char 1733) (show-children))][link]]'
**Adding automatic MetaName 'closed' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/why-org-mode.org") (goto-char 1733) (show-children))][link]]'
**Adding automatic MetaName 'timestamp_ia' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/pdfsync.org") (goto-char 28) (show-children))][link]]'
**Adding automatic MetaName 'id' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/org-to-docx-pandoc.org") (goto-char 5056) (show-children))][link]]'
**Adding automatic MetaName 'custom_id' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/org-db.org") (goto-char 1311) (show-children))][link]]'
**Adding automatic MetaName 'calculation' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/org-db.org") (goto-char 1311) (show-children))][link]]'
**Adding automatic MetaName 'volume' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/org-db.org") (goto-char 1311) (show-children))][link]]'
**Adding automatic MetaName 'total_energy' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/org-db.org") (goto-char 1311) (show-children))][link]]'
**Adding automatic MetaName 'stress' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/org-db.org") (goto-char 1311) (show-children))][link]]'
**Adding automatic MetaName 'priority' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog.org") (goto-char 15327) (show-children))][link]]'
**Adding automatic MetaName 'export_title' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog.org") (goto-char 506769) (show-children))][link]]'
**Adding automatic MetaName 'export_author' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog.org") (goto-char 506769) (show-children))][link]]'
**Adding automatic MetaName 'export_file_name' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog.org") (goto-char 506769) (show-children))][link]]'
**Adding automatic MetaName 'export_date' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog.org") (goto-char 506769) (show-children))][link]]'
**Adding automatic MetaName 'scheduled' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog.org") (goto-char 516502) (show-children))][link]]'
**Adding automatic MetaName 'deadline' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog.org") (goto-char 516502) (show-children))][link]]'
**Adding automatic MetaName 'votes' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog.org") (goto-char 532031) (show-children))][link]]'
**Adding automatic MetaName 'timestamp' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog.org") (goto-char 571125) (show-children))][link]]'
**Adding automatic MetaName 'clock' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog-2014.org") (goto-char 21059) (show-children))][link]]'
**Adding automatic MetaName 'level' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog-2014.org") (goto-char 46582) (show-children))][link]]'
**Adding automatic MetaName 'correct' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog-2014.org") (goto-char 46582) (show-children))][link]]'
**Adding automatic MetaName 'permalink' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog-2014.org") (goto-char 61814) (show-children))][link]]'
**Adding automatic MetaName 'hint' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog-2014.org") (goto-char 340534) (show-children))][link]]'
**Adding automatic MetaName 'answer' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog-2014.org") (goto-char 355206) (show-children))][link]]'
**Adding automatic MetaName 'correct-answer' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog-2014.org") (goto-char 377210) (show-children))][link]]'
**Adding automatic MetaName 'post_filename' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog-2014.org") (goto-char 415454) (show-children))][link]]'
**Adding automatic MetaName 'ordered' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/blog-2014.org") (goto-char 423900) (show-children))][link]]'
**Adding automatic MetaName 'grade' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/add-subheadings-to-headings.org") (goto-char 2822) (show-children))][link]]'
**Adding automatic MetaName ':export_file_name:' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/add-properties-to-headings.org") (goto-char 2) (show-children))][link]]'
**Adding automatic MetaName 'firstname' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/org-contacts/referee-contacts.org") (goto-char 155) (show-children))][link]]'
**Adding automatic MetaName 'lastname' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/org-contacts/referee-contacts.org") (goto-char 155) (show-children))][link]]'
**Adding automatic MetaName 'email' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/org-contacts/referee-contacts.org") (goto-char 155) (show-children))][link]]'
**Adding automatic MetaName 'affiliation' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/org-contacts/referee-contacts.org") (goto-char 155) (show-children))][link]]'
**Adding automatic MetaName 'lettergrade' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/org-report/Slim-Shady-HW1.org") (goto-char 29) (show-children))][link]]'
**Adding automatic MetaName 'difficulty' found in file '[[elisp:(progn (find-file "/Users/jkitchin/blogofile-jkitchin.github.com/_blog/problem-selection/problem-selection.org") (goto-char 1) (show-children))][link]]'
Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 6,044 words alphabetically
Writing header ...
Writing index entries ...
  Writing word text: Complete
  Writing word hash: Complete
  Writing word data: Complete
6,044 unique words indexed.
4 properties sorted.
5,084 files indexed.  1,760,249 total bytes.  368,569 total words.
Elapsed time: 00:00:37 CPU time: 00:00:01
Indexing done!

Ok, now for the proof in the approach!

swish-e -f index-org-headlines.swish-e -w headline=generating

1000 link "separate-bib.org") (goto-char 1) (show-children))][link]]" 393
1000 link "blog-2014.org") (goto-char 158456) (show-children))][link]]" 229
1000 link "blog-2014.org") (goto-char 272383) (show-children))][link]]" 400
1000 link "blog-2014.org") (goto-char 158456) (show-children))][link]]" 229
1000 link "blog.org") (goto-char 448965) (show-children))][link]]" 389
1000 link "org-db.org") (goto-char 575) (show-children))][link]]" 204
1000 link "org-db.org") (goto-char 575) (show-children))][link]]" 204
1000 link "separate-bib.org") (goto-char 1) (show-children))][link]]" 393
1000 link "blog-2014.org") (goto-char 272383) (show-children))][link]]" 400
.

swish-e -f index-org-headlines.swish-e -w todo=TODO

1000 link "blog.org") (goto-char 16933) (show-children))][link]]" 342
1000 link "blog-2014.org") (goto-char 61231) (show-children))][link]]" 207
1000 link "blog-2014.org") (goto-char 60802) (show-children))][link]]" 274
1000 link "blog-2014.org") (goto-char 60289) (show-children))][link]]" 207
1000 link "blog-2014.org") (goto-char 61568) (show-children))][link]]" 246
1000 link "blog-2014.org") (goto-char 61231) (show-children))][link]]" 207
1000 link "blog-2014.org") (goto-char 60802) (show-children))][link]]" 274
1000 link "blog-2014.org") (goto-char 60289) (show-children))][link]]" 207
1000 link "blog.org") (goto-char 632875) (show-children))][link]]" 266
1000 link "blog.org") (goto-char 529123) (show-children))][link]]" 202
1000 link "blog.org") (goto-char 529087) (show-children))][link]]" 206
1000 link "blog.org") (goto-char 518108) (show-children))][link]]" 280
1000 link "blog.org") (goto-char 30559) (show-children))][link]]" 337
1000 link "blog-2014.org") (goto-char 61568) (show-children))][link]]" 246
.

1 Summary thoughts

This could be super useful for a lot of different elements: headlines, src-blocks, links, tables, paragraphs are the main ones that come to mind. You could have pretty focused searches that go straight to the matches!

An xml representation of an org document for indexing with swish-e

| categories: search, emacs | tags:

Swish-e can index xml data, and enable searching by tag. Here we push our org-mode indexing idea a little further. Initially we indexed org files as text. Then, we exported them to html, and indexed the html. That enabled some richer searching. Now, we will create an xml representation of the org file for indexing. This will enable us to use a custom tag system and search for specific text in tables, or src-blocks, or in headlines, or for headlines with certain tags, todo state or properties.

Incidentally, this is a general strategy for indexing arbitrary files. You just make an xml representation of the file containing the data to be indexed, and use swish-e to index that xml.

Let us start with code to generate xml. I adapted this from some code in Land Of Lisp. First, a function that simply prints a tag with attributes.

(defun print-tag (name attrs &optional closingp)
  "Print an xml tag with symbol NAME and ATTRS (a cons list of (attribute . value)).
if CLOSINGP print the closing tag instead."
  (format
   "<%s%s%s>"
   (if closingp "/" "")
   name
   (if (and attrs (not closingp))
       (concat
        " "
        (mapconcat
         (lambda (x)
           (format "%s=\"%s\""
                   (car x)
                   (xml-escape-string (cdr x))))
         attrs
         " "))
     "")))

(print-tag 'html '((color . "blue") (label . "test")))
<html color="blue" label="test">

XML tags almost always come in pairs. We define a macro to make this happen here. The macro prints the opening tag, evaluates the body, and prints the closing tag. Note that the body may contain other tags, or a string. The string should be escaped to avoid illegal xml characters.

(defmacro tag (name attributes &rest body)
  `(format "%s%s%s"
           (print-tag ,name ,attributes nil)
           (concat
           ,@body)
           (print-tag ,name nil t)))

;; example usage
(tag "xml" '((test . "id"))
     (tag "body" nil
          (tag "p" nil (xml-escape-string "paragraph & < 1"))
          (tag "p" nil "paragraph 2")))
<xml test="id"><body><p>paragraph &amp; &lt; 1</p><p>paragraph 2</p></body></xml>
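
As an aside, the same pairing can be written as an ordinary function, which is convenient when you want to pass it to apply or mapconcat. This is only a sketch of an alternative: my-xml-element is a made-up name, and unlike print-tag it formats attribute values with "%s" first so that numbers are accepted. The rest of this post keeps using the tag macro.

```elisp
(require 'xml)

;; Hypothetical function variant of the `tag' macro above.
(defun my-xml-element (name attrs &rest children)
  "Return an XML element string NAME with ATTRS wrapping CHILDREN.
ATTRS is an alist of (attribute . value).  CHILDREN are strings that
are assumed to be escaped already."
  (concat
   (format "<%s%s>"
           name
           (mapconcat (lambda (attr)
                        (format " %s=\"%s\""
                                (car attr)
                                (xml-escape-string
                                 (format "%s" (cdr attr)))))
                      attrs ""))
   (apply #'concat children)
   (format "</%s>" name)))

(my-xml-element 'heading '((level . 2))
                (my-xml-element 'title nil (xml-escape-string "a & b")))
;; → "<heading level=\"2\"><title>a &amp; b</title></heading>"
```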

Now, we can use this to get an xml representation of the source blocks, e.g.

(mapconcat 'identity
           (org-element-map
               (org-element-parse-buffer)
               'src-block
             (lambda (element)
               (tag
                'src-block
                `((language . ,(org-element-property :language element)))
                (tag 'contents ()
                     (xml-escape-string
                      (org-element-property :value element))))))
           "")
<src-block language="emacs-lisp"><contents>(defun print-tag (name attrs &amp;optional closingp)
  &quot;Print an xml tag with symbol NAME and ATTRS (a cons list of (attribute . value)).
if CLOSINGP print the closing tag instead.&quot;
  (format
   &quot;&lt;%s%s%s&gt;&quot;
   (if closingp &quot;/&quot; &quot;&quot;)
   name
   (if (and attrs (not closingp))
       (concat
	&quot; &quot;
	(mapconcat
	 (lambda (x)
	   (format &quot;%s=\&quot;%s\&quot;&quot;
		   (car x)
		   (xml-escape-string (cdr x))))
	 attrs
	 &quot; &quot;))
     &quot;&quot;)))

(print-tag &apos;html &apos;((color . &quot;blue&quot;) (label . &quot;test&quot;)))
</contents></src-block><src-block language="emacs-lisp"><contents>(defmacro tag (name attributes &amp;rest body)
  `(format &quot;%s%s%s&quot;
	   (print-tag ,name ,attributes nil)
           (concat
	   ,@body)
	   (print-tag ,name nil t)))

(tag &quot;xml&quot; &apos;((test . &quot;id&quot;))
     (tag &quot;body&quot; nil
	  (tag &quot;p&quot; nil (xml-escape-string &quot;paragraph &amp; &lt; 1&quot;))
	  (tag &quot;p&quot; nil &quot;paragraph 2&quot;)))
</contents></src-block><src-block language="emacs-lisp"><contents>(mapconcat &apos;identity
	   (org-element-map
	       (org-element-parse-buffer)
	       &apos;src-block
	     (lambda (element)
	       (tag
		&apos;src-block
		`((language . ,(org-element-property :language element)))
		(tag &apos;contents ()
		     (xml-escape-string
		      (org-element-property :value element))))))
	   &quot;&quot;)
</contents></src-block><src-block language="emacs-lisp"><contents>(let ((xml (tag &apos;root `((filename . ,(buffer-file-name))
			(indexed-on . ,(current-time-string)))
		;; map the headlines
		(mapconcat
		 &apos;identity
		 (org-map-entries
		  (lambda ()
		    (let* ((tags (org-get-tags))
			   (heading-components (org-heading-components))
			   (title (nth 4 heading-components))
			   (level (nth 0 heading-components))
			   (properties (org-entry-properties))
			   (elem (org-element-at-point))
			   (bp (org-element-property :contents-begin elem))
			   (ep (org-element-property :contents-end elem))
			   (content (buffer-substring bp ep)))
		      (tag &apos;heading `((level . ,level))
			   (tag &apos;title () (xml-escape-string title))
			   (tag &apos;tags () (mapconcat &apos;identity tags &quot; &quot;))
			   (tag &apos;properties ()
				(mapconcat
				 (lambda (x)
				   (tag &apos;property `((label . (car ,x))) (cdr x)))
				 properties
				 &quot;&quot;))
			   (tag &apos;content ()
				(format &quot;%s&quot; (xml-escape-string content)))))))
		 &quot;&quot;)

		;; map specific element types
		(tag &apos;source-blocks ()
		     (mapconcat
		      &apos;identity
		      (org-element-map
			  (org-element-parse-buffer)
			  &apos;src-block
			(lambda (element)
			  (tag &apos;src-block
			       `((language .
					   ,(org-element-property
					     :language element)))
			       (tag &apos;contents ()
				    (xml-escape-string
				     (org-element-property :value element)))))) &quot;&quot;))

		(tag &apos;tables ()
		     (mapconcat
		      &apos;identity
		      (org-element-map
			  (org-element-parse-buffer)
			  &apos;table
			(lambda (element)
			  (tag &apos;table ()
			       (when (org-element-property :caption element)
				 (tag &apos;caption ()
				(caaar (org-element-property :caption element))))
			       (xml-escape-string
				(buffer-substring
				 (org-element-property :contents-begin element)
				 (org-element-property :contents-end element))))))
		      &quot;&quot;))

		(tag &apos;paragraphs ()
		     (mapconcat
		      &apos;identity
		      (org-element-map
			  (org-element-parse-buffer)
			  &apos;paragraph
			(lambda (element)
			  (tag &apos;paragraph ()
			       (xml-escape-string
				(buffer-substring
				 (org-element-property :contents-begin element)
				 (org-element-property :contents-end element))))))
		      &quot;&quot;
		      ))
		)))
  (with-temp-file &quot;org2xml.xml&quot;
    (insert xml)))
</contents></src-block><src-block language="emacs-lisp"><contents>(xml-parse-file &quot;org2xml.xml&quot;)
</contents></src-block>

So, finally, we can map over the entries to collect some information about them, e.g. the tags, properties, todo state, etc. We then create xml representing all of that information so we can have a more precise search: instead of just looking for a word, we can specify that the word be in a property, for example. We also make xml representations of the tables, src-blocks and paragraphs.

I am going to follow the example that we worked out before for html, and create a filter script that takes an org-file and prints xml at the command line.

:;exec emacs -batch -l $0 -f main "$@"
(require 'org)
(require 'xml)

(defun print-tag (name attrs &optional closingp)
  "Print an xml tag with symbol NAME and ATTRS (a cons list of (attribute . value)).
if CLOSINGP print the closing tag instead.
You should use `xml-escape-string' on text going into the attributes to avoid errors."
  (format
   "<%s%s%s>"
   (if closingp "/" "")
   name
   (if (and attrs (not closingp))
       (concat
        " "
        (mapconcat
         (lambda (x)
           (format "%s=\"%s\"" (car x) (cdr x)))
           attrs
           " "))
     "")))

(defmacro tag (name attributes &rest body)
  `(format "%s%s%s"
           (print-tag ,name ,attributes nil)
           (concat
           ,@body)
           (print-tag ,name nil t)))

(defun main ()
  (find-file (car command-line-args-left))
  (princ (tag 'root `((filename . ,(xml-escape-string (buffer-file-name)))
                      (indexed-on . ,(current-time-string)))
              ;; map the headlines
              (mapconcat
               'identity
               (org-map-entries
                (lambda ()
                  (let* ((tags (org-get-tags))
                         (heading-components (org-heading-components))
                         (todo (nth 2 heading-components))
                         (headline (nth 4 heading-components))
                         (thislevel (nth 0 heading-components))
                         (properties (org-entry-properties)))
                    (tag 'heading `((level . ,thislevel))
                         (tag 'title () (xml-escape-string headline))
                         (tag 'tags () (mapconcat 'identity tags " "))
                         (when todo
                           (tag 'todo () todo))
                         (tag 'properties ()
                              (mapconcat
                               (lambda (x)
                                 (tag 'property `((label . ,(xml-escape-string (car x))))
                                      (xml-escape-string (cdr x))))
                               properties
                               ""))))))
               "")

              ;; get file keywords, TITLE, authors, etc...
              (tag 'file-keywords ()
                   (mapconcat 'identity
                              (org-element-map (org-element-parse-buffer 'element) 'keyword
                                (lambda (keyword)
                                  (tag (xml-escape-string (org-element-property :key keyword)) ()
                                       (xml-escape-string (org-element-property :value keyword)))))
                              ""))

              ;; map specific element types
              (tag 'source-blocks ()
                   (mapconcat
                    'identity
                    (org-element-map
                        (org-element-parse-buffer)
                        'src-block
                      (lambda (element)
                        (tag 'src-block
                             `((language .
                                         ,(org-element-property
                                           :language element)))
                             (tag 'contents ()
                                  (xml-escape-string
                                   (org-element-property :value element)))))) ""))

              (tag 'tables ()
                   (mapconcat
                    'identity
                    (org-element-map
                        (org-element-parse-buffer)
                        'table
                      (lambda (element)
                        (tag 'table ()
                             (when (org-element-property :caption element)
                               (tag 'caption ()
                                    (xml-escape-string
                                     (format
                                      "%s"
                                      (org-element-property
                                       :caption element)))))
                             (xml-escape-string
                              (buffer-substring
                               (org-element-property :contents-begin element)
                               (org-element-property :contents-end element))))))
                    ""))

              (tag 'paragraphs ()
                   (mapconcat
                    'identity
                    (org-element-map
                        (org-element-parse-buffer)
                        'paragraph
                      (lambda (element)
                        (tag 'paragraph ()
                             (xml-escape-string
                              (buffer-substring
                               (org-element-property :contents-begin element)
                               (org-element-property :contents-end element))))))
                    ""
                    )))))
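Before indexing, it may be worth checking that the filter output parses as xml. The following is a minimal sketch using `xml-parse-region` from xml.el; the function name `org2xml-heading-tags` and the sample string are hypothetical, and the sample is hand-written in the shape the script above produces rather than real output.

```emacs-lisp
(require 'xml)

(defun org2xml-heading-tags (xml-string)
  "Return a list of (LEVEL . TAGS) pairs for the headings in XML-STRING.
`xml-parse-region' signals an error for problems such as mismatched tags,
so a normal return is a rough well-formedness check."
  (with-temp-buffer
    (insert xml-string)
    (let ((root (car (xml-parse-region (point-min) (point-max)))))
      (mapcar
       (lambda (heading)
         (cons (xml-get-attribute heading 'level)
               ;; text content of the first <tags> child, if any
               (car (xml-node-children
                     (car (xml-get-children heading 'tags))))))
       (xml-get-children root 'heading)))))

;; Hand-written sample input (an assumption, not output from a real file).
(org2xml-heading-tags
 (concat "<root filename=\"test.org\">"
         "<heading level=\"1\"><tags>slide demo</tags></heading>"
         "<heading level=\"2\"><tags>draft</tags></heading>"
         "</root>"))
;; => (("1" . "slide demo") ("2" . "draft"))
```

In practice the string would come from running the script on an org-file and capturing what it prints.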

We could do more, e.g. links or images, but this is pretty good for now. Now, let's configure a swish indexer. We instruct swish-e to treat all the tags and attributes it finds as metanames, so we can search on them later.

# Example configuration file

# Tell Swish-e which directories to index
IndexDir /Users/jkitchin/blogofile-jkitchin.github.com/_site

# where to save the index
IndexFile /Users/jkitchin/blogofile-jkitchin.github.com/_blog/index-org2xml.swish-e

# What to index
IndexOnly .org

# Tell Swish-e that .org files are to use the XML parser.
IndexContents XML* .org

FileFilter .org /Users/jkitchin/blogofile-jkitchin.github.com/_blog/org2xml.el

# index all tags for searching
UndefinedMetaTags auto
UndefinedXMLAttributes auto

And now, run the index command. I did this at the command line. There might be some problems with the script, as there were some warnings about non-zero exits, but there were only a few so we ignore them for now.

swish-e -c swish-org2xml.conf

1 Examples of searching for org-files

1.1 Files with words in the filename

Here we look for filenames with the word "Extracting" in them.

swish-e -f index-org2xml.swish-e -w root.filename=Extracting
# SWISH format: 2.4.7
# Search words: root.filename=Extracting
# Removed stopwords:
# Number of hits: 2
# Search time: 0.000 seconds
# Run time: 0.007 seconds
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2014/02/19/Extracting-bibtex-file-from-an-org-buffer.org "Extracting-bibtex-file-from-an-org-buffer.org" 6094
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/media/2014-02-19-Extracting-bibtex-file-from-an-org-buffer/notes.org "notes.org" 195515
.

Or, thanks to the date being in the path, we can search by year and month. How about July of 2012?

swish-e -f index-org2xml.swish-e -w root.filename="(2012/07)"
# SWISH format: 2.4.7
# Search words: root.filename=(2012/07)
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.007 seconds
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2012/07/15/Professor-Kitchin-was-awarded-the-Presidential-Early-Career-Award-for-Scientists-and-Engineers-(PECASE).org "Professor-Kitchin-was-awarded-the-Presidential-Early-Career-Award-for-Scientists-and-Engineers-(PECASE).org" 311
.

Interestingly, we have to use the parentheses here.

1.2 Files with headlines containing a word

Now, let's find documents with "Compiled" in a heading title with level=2.

swish-e -f index-org2xml.swish-e -w heading.level=2 title=Compiled -m5
# SWISH format: 2.4.7
# Search words: heading.level=2 title=Compiled
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.007 seconds
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/media/2014-07-12-Org-mode-is-awesome/why-org-mode.org "why-org-mode.org" 13522
.

1.3 Headlines marked TODO

We can find documents with headlines marked TODO:

swish-e -f index-org2xml.swish-e  -w "todo=TODO" -m 5
# SWISH format: 2.4.7
# Search words: todo=TODO
# Removed stopwords:
# Number of hits: 12
# Search time: 0.000 seconds
# Run time: 0.008 seconds
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/media/2014-01-27-Clocking-your-time-in-org-mode/blog.org "blog.org" 134160
624 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2014/02/16/A-dynamic-snippet-for-a-task-due-7-days-from-now.org "A-dynamic-snippet-for-a-task-due-7-days-from-now.org" 2587
425 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2014/02/16/END.org "END.org" 1531
269 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2015/02/01/Handling-multiple-selections-in-helm.org "Handling-multiple-selections-in-helm.org" 3290
269 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2015/01/30/More-adventures-in-helm---more-than-one-action.org "More-adventures-in-helm---more-than-one-action.org" 3236
.

1.4 For a table

Here we look for documents that have a table containing the word "energy".

swish-e -f index-org2xml.swish-e -w table="energy"
# SWISH format: 2.4.7
# Search words: table=energy
# Removed stopwords:
# Number of hits: 2
# Search time: 0.000 seconds
# Run time: 0.007 seconds
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2014/08/21/Using-org-entries-like-a-database.org "Using-org-entries-like-a-database.org" 53035
633 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2013/07/04/Estimating-uncertainties-in-equations-of-state.org "Estimating-uncertainties-in-equations-of-state.org" 3117
.

1.5 Tagged headlines

Find entries with a "slide" tag.

swish-e -f index-org2xml.swish-e -w "tags=slide"
# SWISH format: 2.4.7
# Search words: tags=slide
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.009 seconds
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/media/2014-07-12-Org-mode-is-awesome/why-org-mode.org "why-org-mode.org" 13522
.

Evidently there is one file where I talk about slides in org-show.

1.6 Headlines with a property

Here I find documents with headlines that have thermodynamics in the property "categories".

swish-e -f index-org2xml.swish-e -w "property.label=categories property=thermodynamics"
# SWISH format: 2.4.7
# Search words: property.label=categories property=thermodynamics
# Removed stopwords:
# Number of hits: 10
# Search time: 0.000 seconds
# Run time: 0.009 seconds
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2013/02/01/Water-gas-shift-equilibria-via-the-NIST-Webbook.org "Water-gas-shift-equilibria-via-the-NIST-Webbook.org" 10789
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2013/03/01/Gibbs-energy-minimization-and-the-NIST-webbook.org "Gibbs-energy-minimization-and-the-NIST-webbook.org" 5441
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2013/03/01/Finding-equilibrium-composition-by-direct-minimization-of-Gibbs-free-energy-on-mole-numbers.org "Finding-equilibrium-composition-by-direct-minimization-of-Gibbs-free-energy-on-mole-numbers.org" 6155
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2013/02/27/Reading-parameter-database-text-files-in-python.org "Reading-parameter-database-text-files-in-python.org" 3947
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2013/02/18/The-Gibbs-free-energy-of-a-reacting-mixture-and-the-equilibrium-composition.org "The-Gibbs-free-energy-of-a-reacting-mixture-and-the-equilibrium-composition.org" 8230
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2013/02/18/Calculating-a-bubble-point-pressure-of-a-mixture.org "Calculating-a-bubble-point-pressure-of-a-mixture.org" 3203
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2013/02/15/The-equal-area-method-for-the-van-der-Waals-equation.org "The-equal-area-method-for-the-van-der-Waals-equation.org" 5737
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2013/02/12/Using-constrained-optimization-to-find-the-amount-of-each-phase-present.org "Using-constrained-optimization-to-find-the-amount-of-each-phase-present.org" 5210
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2013/02/05/Constrained-minimization-to-find-equilibrium-compositions.org "Constrained-minimization-to-find-equilibrium-compositions.org" 5666
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2014/09/23/Generating-an-atomic-stoichiometric-matrix.org "Generating-an-atomic-stoichiometric-matrix.org" 3487
.

That seems about right. According to http://kitchingroup.cheme.cmu.edu/categories.html there are 9 documents, while the search finds 10. I am not sure why they don't totally agree, but I can live with it.

Here are documents containing headlines with the property "TOTAL_ENERGY"

swish-e -f index-org2xml.swish-e -w property.label=TOTAL_ENERGY
# SWISH format: 2.4.7
# Search words: property.label=TOTAL_ENERGY
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.008 seconds
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2014/08/21/Using-org-entries-like-a-database.org "Using-org-entries-like-a-database.org" 53035
.

1.7 Documents with a Python source block containing a word

Find org files with diffusion in a python source block.

swish-e -f index-org2xml.swish-e -w src-block.language=python -w src-block=diffusion
# SWISH format: 2.4.7
# Search words: src-block.language=python src-block=diffusion
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.011 seconds
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2013/04/02/Transient-diffusion---partial-differential-equations.org "Transient-diffusion---partial-differential-equations.org" 3660
.

1.8 An org-file with a UUID

swish-e -f index-org2xml.swish-e -w  property="(38FCCF3D-7FC5-49BF-BB77-486BBAA17CD9)"
# SWISH format: 2.4.7
# Search words: property=(38FCCF3D-7FC5-49BF-BB77-486BBAA17CD9)
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.007 seconds
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/org/2014/11/23/Machine-gradable-quizzes-in-emacs+org-modex.org "Machine-gradable-quizzes-in-emacs+org-modex.org" 5743
.

Interestingly, the parentheses are again necessary to find a match, I think because of the dashes. The next example is similar, but finds an entry with a bibtex key in a CUSTOM_ID property.

swish-e -f index-org2xml.swish-e -w  property="(mantina-2008-first-princ)"
# SWISH format: 2.4.7
# Search words: property=(mantina-2008-first-princ)
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.010 seconds
1000 /Users/jkitchin/blogofile-jkitchin.github.com/_site/media/2014-02-19-Extracting-bibtex-file-from-an-org-buffer/notes.org "notes.org" 195515
.
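Since values with dashes seem to need the parentheses, a small helper can add them when building a query from lisp. This is only a sketch: `swish-e-paren-term` and `swish-e-search-args` are hypothetical names, and the only swish-e options used are the `-f` and `-w` ones shown above.

```emacs-lisp
(defun swish-e-paren-term (metaname value)
  "Return a swish-e search term of the form METANAME=(VALUE)."
  (format "%s=(%s)" metaname value))

(defun swish-e-search-args (index &rest terms)
  "Return the swish-e argument list to search INDEX for all of TERMS.
The terms are joined with spaces into a single -w argument."
  (list "-f" index "-w" (mapconcat #'identity terms " ")))

(swish-e-search-args
 "index-org2xml.swish-e"
 (swish-e-paren-term "property" "mantina-2008-first-princ"))
;; => ("-f" "index-org2xml.swish-e" "-w" "property=(mantina-2008-first-princ)")

;; With swish-e installed, the search could then be run without going
;; through a shell, so no extra quoting of the parentheses is needed:
;; (apply #'process-lines "swish-e"
;;        (swish-e-search-args "index-org2xml.swish-e"
;;                             (swish-e-paren-term "property" "mantina-2008-first-princ")))
```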

2 Summary

This is pretty cool. I think there are still some bugs to work out in the indexing filter, but this demonstrates that you can index org-files and run pretty refined searches to find your files. There is still some thinking to do on how to schedule incremental indexing, and whether we need more or better metanames. The indexing is not fast, but that is probably because I am running this through a FileFilter, rather than the -S prog option in swish-e. This is super promising to me though. Imagine building an agenda from files found with TODO headlines in them; a global todo list! Or, grabbing contacts from wherever they are. No more losing files you have not used in a while. Find all documents containing a citation. With some extra work, you could index links, citations, chemical formulas, or other types of identifiable content.
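The global todo list idea could be sketched roughly as follows. The function names are hypothetical, and the parsing assumes result lines of the form shown in the searches above (rank, path, quoted title, size); it has only been checked against that sample format.

```emacs-lisp
(defun swish-e-hit-files (output)
  "Return the file names listed in swish-e search OUTPUT, a string."
  (let (files)
    (dolist (line (split-string output "\n" t))
      ;; Result lines look like: RANK PATH "TITLE" SIZE
      ;; Header lines start with # and the last line is a single dot,
      ;; so neither of those matches.
      (when (string-match "\\`[0-9]+ \\(.+\\) \"[^\"]*\" [0-9]+\\'" line)
        (push (match-string 1 line) files)))
    (nreverse files)))

(defun swish-e-todo-files (index)
  "Return the files in the swish-e INDEX that have a headline marked TODO."
  (swish-e-hit-files
   (shell-command-to-string
    (format "swish-e -f %s -w todo=TODO" (shell-quote-argument index)))))

;; Hypothetical usage, assuming the index built above is in the
;; current directory:
;; (let ((org-agenda-files (swish-e-todo-files "index-org2xml.swish-e")))
;;   (org-todo-list))
```

Keeping the parsing in its own function means it can be tried on saved search output without having swish-e installed.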

The logical conclusion of this work might be an ox-swish-e-xml export engine to render the org-file into xml, rather than the script I used here. It would be really great to get some refined output, e.g. rather than just getting matching documents, get location information so you could open the document to the matching element. That might be out of reach for swish-e, but could be in reach for other programs like Sphinx that are more integrated with a database. There is a very interesting project here: https://github.com/wvxvw/sphinx-mode to integrate org-mode with the Sphinx search engine (http://sphinxsearch.com).

Copyright (C) 2015 by John Kitchin. See the License for information about copying.

Org-mode version = 8.2.10
