Getting a Scopus EID from a DOI

| categories: orgmode, ref | tags:

Scopus is a scientific literature indexing and search engine service run by Elsevier. I have been integrating Scopus workflows into Emacs and org-ref. Scopus seems to work with their own digital identifiers, known as an EID. I usually have a DOI to work with. Here, we develop a way to get an EID from a DOI using the Scopus API. You need to get your own Scopus API key here: http://dev.elsevier.com/myapikey.html and set scopus-api-key in Emacs to use this code.

Once we have an EID, here are a few interesting things we can do with them. This is an EID: 2-s2.0-84881394200, for this reference:

Hallenbeck, Alexander P. and Kitchin, John R., "Effects of \ce{O_2} and \ce{SO_2} on the capture capacity of a primary-amine based polymeric \ce{CO_2} sorbent", Industrial & Engineering Chemistry Research, 52:10788-10794 (2013)

With the EID, we can construct a URL to the Scopus document page:

(let ((eid "2-s2.0-84881394200"))
  (format "http://www.scopus.com/record/display.url?eid=%s&origin=resultslist" eid))
http://www.scopus.com/record/display.url?eid=2-s2.0-84881394200&origin=resultslist

We can construct a URL to citing documents:

(let ((eid "2-s2.0-84881394200"))
  (format "http://www.scopus.com/results/citedbyresults.url?sort=plf-f&cite=%s&src=s&imp=t&sot=cite&sdt=a&sl=0&origin=recordpage" eid))
http://www.scopus.com/results/citedbyresults.url?sort=plf-f&cite=2-s2.0-84881394200&src=s&imp=t&sot=cite&sdt=a&sl=0&origin=recordpage

And there are three types of related document urls we can create: by author, keyword or references.

By authors:

(let ((eid "2-s2.0-84881394200"))
  (format (concat "http://www.scopus.com/search/submit/mlt.url"
                  "?eid=%s&src=s&all=true&origin=recordpage"
                  "&method=aut&zone=relatedDocuments")
            eid))

http://www.scopus.com/search/submit/mlt.url?eid=2-s2.0-84881394200&src=s&all=true&origin=recordpage&method=aut&zone=relatedDocuments

By keywords:

(let ((eid "2-s2.0-84881394200"))
  (format (concat "http://www.scopus.com/search/submit/mlt.url"
                  "?eid=%s&src=s&all=true&origin=recordpage"
                  "&method=key&zone=relatedDocuments")
          eid))

http://www.scopus.com/search/submit/mlt.url?eid=2-s2.0-84881394200&src=s&all=true&origin=recordpage&method=key&zone=relatedDocuments

And by references:

(let ((eid "2-s2.0-84881394200"))
  (format (concat  "http://www.scopus.com/search/submit/mlt.url?"
                   "eid=%s&src=s&all=true&origin=recordpage"
                   "&method=ref&zone=relatedDocuments")
           eid))

http://www.scopus.com/search/submit/mlt.url?eid=2-s2.0-84881394200&src=s&all=true&origin=recordpage&method=ref&zone=relatedDocuments

We can generate all those on the fly if we have an EID. The problem is that we usually have the DOI, not the EID. So, here we use the Scopus API to retrieve that. Basically, we just do a search on the DOI, assume one and only one is found, and get the EID from the results. The DOI we have for the reference considered here is doi:10.1021/ie400582a.

The gist of what we will do is send an http request to Scopus with our API key, and data specifying what to get. Scopus will return data to us in either json or xml, depending on what we ask for.

I find json easiest to deal with, so we first work it out in json. We use the Scopus search API and query on the doi here. We get back json data which we read as an emacs-lisp plist, and extract the eid from it.

(let* ((doi "10.1021/ie400582a")
       (url-request-method "GET")
       (url-mime-accept-string "application/json")
       (url-request-extra-headers  (list (cons "X-ELS-APIKey" *scopus-api-key*)
                                         '("field" . "eid")))
       (url (format  "http://api.elsevier.com/content/search/scopus?query=doi(%s)" doi))
       (json-object-type 'plist)
       (json-data (with-current-buffer  (url-retrieve-synchronously url)
                    (json-read-from-string
                     (buffer-substring url-http-end-of-headers (point-max))))))
 (plist-get (elt (plist-get (plist-get json-data :search-results) :entry) 0) :eid))
2-s2.0-84881394200

That is the EID we were looking for. Here, we just wrap that code in a function so it is easier to reuse.

(defun scopus-doi-to-eid-json (doi)
  "Return a parsed xml from the Scopus article retrieval api for DOI.
This does not always seem to work for the most recent DOIs."
  (let* ((url-request-method "GET")
         (url-mime-accept-string "application/json")
         (url-request-extra-headers  (list (cons "X-ELS-APIKey" *scopus-api-key*)
                                           '("field" . "eid")))
         (url (format  "http://api.elsevier.com/content/search/scopus?query=doi(%s)" doi))
         (json-object-type 'plist)
         (json-data (with-current-buffer  (url-retrieve-synchronously url)
                      (json-read-from-string
                       (buffer-substring url-http-end-of-headers (point-max))))))
    (plist-get (elt (plist-get (plist-get json-data :search-results) :entry) 0) :eid)))

(scopus-doi-to-eid "10.1021/ie400582a")

XML is the native format in the Scopus API. They say that json works most of the time, but some XML cannot be rendered as json. Here we use the XML returned to get the EID. It is less intuitive to me, but mostly because I have used it less. I don't think you can specify and XPATH like you can in Python.

(let* ((doi "10.1021/ie400582a")
       (url-request-method "GET")
       (url-mime-accept-string "application/xml")
       (url-request-extra-headers  (list (cons "X-ELS-APIKey" *scopus-api-key*)
                                         '("field" . "eid")))
       (url (format  "http://api.elsevier.com/content/search/scopus?query=doi(%s)" doi))
       (xml (with-current-buffer  (url-retrieve-synchronously url)
              (xml-parse-region url-http-end-of-headers (point-max))))
       (results (car xml))
       (entry (car (xml-get-children results 'entry))))
  (car (xml-node-children (car (xml-get-children entry 'eid)))))
2-s2.0-84881394200

Now we wrap this in a function for reusability.

(defun scopus-doi-to-eid (doi)
  "Get a Scopus eid from a DOI."
  (let* ((url-request-method "GET")
         (url-mime-accept-string "application/xml")
         (url-request-extra-headers  (list (cons "X-ELS-APIKey" *scopus-api-key*)
                                           '("field" . "eid")))
         (url (format  "http://api.elsevier.com/content/search/scopus?query=doi(%s)" doi))
         (xml (with-current-buffer  (url-retrieve-synchronously url)
                (xml-parse-region url-http-end-of-headers (point-max))))
         (results (car xml))
         (entry (car (xml-get-children results 'entry))))
    (car (xml-node-children (car (xml-get-children entry 'eid))))))

(scopus-doi-to-eid "10.1021/ie400582a")
2-s2.0-84881394200

This code is wrapped up in org-ref/scopus.el . It provides a new org-mode eid link, e.g. eid:2-s2.0-84881394200 which is functional and provides access to the citing and related article Scopus pages for that eid.

There are also new links and functions for a alloy Au segregation and auth(kitchin) and title(segregation).

Let's not forget the scopusid:7004212771 link to Scopus Author pages.

Now you can use org-mode for reproducible scientific literature searching in Scopus!

Copyright (C) 2015 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.10

Discuss on Twitter