* DONE Another approach to embedding org-source in html
  CLOSED: [2015-05-09 Sat 19:19]
  :PROPERTIES:
  :date:     2015/05/09 19:19:10
  :updated:  2015/05/10 09:34:55
  :categories: orgmode, data
  :END:

In this [[http://kitchingroup.cheme.cmu.edu/blog/2015/05/09/An-alternative-approach-to-including-org-source-in-blog-posts/][post]] I examined a way to embed the org-source in a comment in the html of the post, and developed a reasonably convenient way to extract the source in Emacs. One downside of that approach was the need to escape at least the dashes, and then unescape them on extraction. I came across another idea, which is to put the org-source in base64-encoded form in a [[http://en.wikipedia.org/wiki/Data_URI_scheme][data uri]]. First, let us see what the encoding does:

#+BEGIN_SRC emacs-lisp
(base64-encode-string "<!-- test-->")
#+END_SRC

#+RESULTS:
: PCEtLSB0ZXN0LS0+

And decoding:

#+BEGIN_SRC emacs-lisp
(base64-decode-string "PCEtLSB0ZXN0LS0+")
#+END_SRC

#+RESULTS:
: <!-- test-->

The encoding looks random, but it is reversible. More importantly, base64 output consists only of letters, digits, +, / and =, so it contains no html-like characters that need to be escaped.

The idea of a data uri is that the data it serves is embedded in the URL itself, here in the href attribute of a link. This is basically how to make a data uri. We give the link a class so we can find it later; the class name org-source used here is arbitrary.

#+BEGIN_EXAMPLE
<a class="org-source" href="data:text/plain;charset=US-ASCII;base64,PCEtLSB0ZXN0LS0+">source</a>
#+END_EXAMPLE

Here is the actual html for the browser. If you click on it, your browser automatically decodes it for you!

#+BEGIN_HTML
<a class="org-source" href="data:text/plain;charset=US-ASCII;base64,PCEtLSB0ZXN0LS0+">source</a>
#+END_HTML

So, during the blog publish step, we just need to add this little step to the html generation, and the source will be included as a data uri. Here is the function that generates the data uri for us, and an example of using it. The encoded source is not at all attractive to look at, but you almost never need to look at it; it is invisible in the browser. Interestingly, if you click on the link, you will see the org source right in your browser!

#+BEGIN_SRC emacs-lisp :results html
(defun source-data-uri (source)
  "Encode the string in SOURCE to a data uri."
  ;; The anchor markup and the class name "org-source" are an assumed
  ;; reconstruction; any class works as long as `grab-org-source'
  ;; searches for the same one.  The `t' keeps the base64 on one line.
  (format "<a class=\"org-source\" href=\"data:text/plain;charset=US-ASCII;base64,%s\">source</a>"
          (base64-encode-string source t)))

(source-data-uri (buffer-string))
#+END_SRC

#+RESULTS:
#+BEGIN_HTML
source
#+END_HTML

Now, we integrate it into the blogofile function:

#+BEGIN_SRC emacs-lisp
(defun bf-get-post-html ()
  "Return a string containing the YAML header, the post html, my
copyright line, and a link to the org-source code."
  (interactive)
  (let ((org-source (buffer-string))
        (url-to-org (bf-get-url-to-org-source))
        (yaml (bf-get-YAML-heading))
        (body (bf-get-HTML)))
    (with-temp-buffer
      (insert yaml)
      (insert body)
      (insert
       (format "\nCopyright (C) %s by John Kitchin. See the License for information about copying.\n"
               (format-time-string "%Y")))
      ;; Link to the org file; the link text here is an assumption.
      (insert (format "\n<a href=\"%s\">org-mode source</a>\n" url-to-org))
      (insert (format "\nOrg-mode version = %s\n" (org-version)))
      ;; this is the only new code we need to add.
      (insert (source-data-uri org-source))
      ;; return value
      (buffer-string))))
#+END_SRC

Now we need a new adaptation of the grab-org-source function. We still need a regexp search to get the source, and we still need to decode it.

#+BEGIN_SRC emacs-lisp
(defun grab-org-source (url)
  "Extract org-source from URL to a buffer named *grab-org-source*."
  (interactive "sURL: ")
  (switch-to-buffer (get-buffer-create "*grab-org-source*"))
  (erase-buffer)
  (org-mode)
  (insert
   (with-current-buffer (url-retrieve-synchronously url)
     (goto-char (point-min))
     ;; This regexp must match the markup written by `source-data-uri'.
     (if (re-search-forward
          "<a class=\"org-source\" href=\"data:text/plain;charset=US-ASCII;base64,\\([^\"]+\\)\">source</a>"
          nil t)
         (base64-decode-string (match-string 1))
       (error "No org-source data uri found at %s" url)))))
#+END_SRC

What else could we do with this? One idea would be to generate data uris for each code block that you could open in your browser. For example, here we generate a list of data uris, one for each code block in the buffer. We don't take care to label them or make it easy to see what they are, but if you click on one, you should see a plain text version of the block. If this is done a lot, it might even make sense to change the mime type so that the code downloads and opens in some native app.

#+BEGIN_SRC emacs-lisp :results html
(org-element-map (org-element-parse-buffer) 'src-block
  (lambda (src-block)
    (source-data-uri (org-element-property :value src-block))))
#+END_SRC

#+RESULTS:
#+BEGIN_HTML
(source source source source source source)
#+END_HTML

I am not sure if this is better or worse than the other approach. I have not tested it very thoroughly, but it seems like it should work pretty generally. I imagine you could also embed other kinds of files in the html, if for some reason you did not want to put the files on your server. Overall this seems to lack some elegance in searching for data, e.g. the kind that [[http://en.wikipedia.org/wiki/Embedded_RDF][RDF]] or [[http://en.wikipedia.org/wiki/RDFa][RDFa]] is supposed to enable, but it might be a step in that direction, using org-mode and Emacs as the editor.