Converting a DOI to other scientific identifiers in Pubmed
Posted June 09, 2015 at 07:29 AM | categories: orgmode, ref | tags:
Sometimes it is useful to convert a DOI to another type of identifier. For example, in this post we converted a DOI to a Scopus EID, and in this one we got the WOS accession number from a DOI. Today, we consider how to get Pubmed identifiers. Pubmed provides an API for this purpose:
http://www.ncbi.nlm.nih.gov/pmc/tools/id-converter-api/
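We will use the DOI tool. According to the documentation, we need to form a URL like this:
http://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/?tool=my_tool&email=my_email@example.com&ids=10.1093/nar/gks1195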
We will call our tool "org-ref" and use the value of user-mail-address for the email. The URL above returns XML, so we can parse it and then extract the identifiers. This is a simple HTTP GET request, which we can make with url-retrieve-synchronously. Here is what we get.
(let* ((url-request-method "GET")
       (doi "10.1093/nar/gks1195")
       (my-tool "org-ref")
       ;; Fill in the tool name, contact email and DOI.
       (url (format "http://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/?tool=%s&email=%s&ids=%s"
                    my-tool user-mail-address doi))
       ;; Parse the response body, which starts after the HTTP headers.
       (xml (with-current-buffer (url-retrieve-synchronously url)
              (xml-parse-region url-http-end-of-headers (point-max)))))
  xml)
((pmcids ((status . "ok"))
         "\n"
         (request ((idtype . "doi") (dois . "") (versions . "yes") (showaiid . "no"))
                  "\n"
                  (echo nil "tool=org-ref;email=jkitchin%40andrew.cmu.edu;ids=10.1093%2Fnar%2Fgks1195")
                  "\n")
         "\n"
         (record ((requested-id . "10.1093/NAR/GKS1195")
                  (pmcid . "PMC3531190")
                  (pmid . "23193287")
                  (doi . "10.1093/nar/gks1195"))
                 (versions nil (version ((pmcid . "PMC3531190.1") (current . "true")))))
         "\n"))
The parsed xml is now just an emacs-lisp data structure. We need to get the record, and then get the attributes of it to extract the identifiers. Next, we create a plist of the identifiers. For fun, we add the Scopus EID and WOS accession number from the previous posts too.
(let* ((url-request-method "GET")
       (doi "10.1093/nar/gks1195")
       (my-tool "org-ref")
       (url (format "http://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/?tool=%s&email=%s&ids=%s"
                    my-tool user-mail-address doi))
       ;; xml-parse-region returns a list of nodes; the car is the root pmcids node.
       (xml (car (with-current-buffer (url-retrieve-synchronously url)
                   (xml-parse-region url-http-end-of-headers (point-max)))))
       ;; Use car rather than first, so the cl library is not needed.
       (record (car (xml-get-children xml 'record)))
       (doi (xml-get-attribute record 'doi))
       (pmcid (xml-get-attribute record 'pmcid))
       (pmid (xml-get-attribute record 'pmid)))
  ;; scopus-doi-to-eid and wos-doi-to-accession-number come from the previous posts.
  (list :doi doi
        :pmid pmid
        :pmcid pmcid
        :eid (scopus-doi-to-eid doi)
        :wos (wos-doi-to-accession-number doi)))
(:doi "10.1093/nar/gks1195" :pmid "23193287" :pmcid "PMC3531190" :eid "2-s2.0-80053651587" :wos "000312893300006")
Well, there you have it, four new scientific document ids from one DOI. Of course we have defined org-mode links for each one of these:
I have not tested this on too many DOIs yet. Not all of them are indexed by Pubmed.
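Since some DOIs will not be indexed, it may be worth wrapping the lookup so it returns nil instead of a plist of empty strings. Here is a minimal sketch of that idea. The function names my/idconv-record-to-plist and my/doi-to-pubmed-ids are made up for this example, and the check for status="error" on the record is my assumption about how the service flags a failed lookup, so verify it against the API documentation. The sketch also hexifies the email and DOI before putting them in the URL, which the snippets above do not do.

```emacs-lisp
(require 'url)
(require 'xml)

(defun my/idconv-record-to-plist (xml)
  "Return a plist of identifiers from XML, the parsed root `pmcids' node.
Return nil when there is no record or the record is flagged as an error.
This is a hypothetical helper, not part of org-ref."
  (let ((record (car (xml-get-children xml 'record))))
    (when (and record
               ;; Assumption: a failed lookup carries status=\"error\".
               (not (equal "error" (xml-get-attribute-or-nil record 'status))))
      ;; xml-get-attribute-or-nil gives nil for a missing attribute.
      (list :doi (xml-get-attribute-or-nil record 'doi)
            :pmid (xml-get-attribute-or-nil record 'pmid)
            :pmcid (xml-get-attribute-or-nil record 'pmcid)))))

(defun my/doi-to-pubmed-ids (doi)
  "Look up DOI with the PMC ID converter and return a plist, or nil.
Hypothetical wrapper around the request shown earlier in this post."
  (let* ((url-request-method "GET")
         (url (format "http://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/?tool=%s&email=%s&ids=%s"
                      "org-ref"
                      (url-hexify-string (or user-mail-address ""))
                      (url-hexify-string doi)))
         (buffer (url-retrieve-synchronously url)))
    (when buffer
      (unwind-protect
          (with-current-buffer buffer
            (my/idconv-record-to-plist
             (car (xml-parse-region url-http-end-of-headers (point-max)))))
        ;; Clean up the response buffer even if parsing fails.
        (kill-buffer buffer)))))
```

With this, (my/doi-to-pubmed-ids "10.1093/nar/gks1195") should give back the same :doi, :pmid and :pmcid values as above. Keeping the parsing in its own function means it can be tried on a saved response without hitting the network.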
Copyright (C) 2015 by John Kitchin. See the License for information about copying.
Org-mode version = 8.2.10