Generating an alphabetized list of collaborators from the past five years

| categories: python, scopus | tags:

Almost every proposal I write requires some list of my coauthors from the past several years. Some want the list alphabetized, and some want affiliations too. It has always bothered me to make this list, mostly because it is tedious, and it seems like something that should not be hard to generate. It turns out it is not too hard. I have been developing a Python interface ((https://github.com/jkitchin/scopus )) to Scopus that more or less enables me to script this.

Scopus is not free. You need either a license, or institutional access to use it. Here is the strategy to generate my list of coauthors. First, we need to get the articles for the past 5 years that are mine, and for each paper we get the coauthors. I use my Scopus author id in the query, and then sort the names alphabetically into a table. Then, I use that table as input to a second code block that does an author query in Scopus to get the current affiliations. Here is the code.

from scopus.scopus_api import ScopusAbstract
from scopus.scopus_search import ScopusSearch

s = ScopusSearch('AU-ID(7004212771) AND PUBYEAR > 2010')

coauthors = {}
for eid in s.EIDS:
    ab = ScopusAbstract(eid)
    for au in ab.authors:
        if au.auid not in coauthors and au.auid != '7004212771':
            coauthors[au.auid] = au.indexed_name

return sorted([[auid, name] for auid,name in coauthors.items()], key=lambda x:x[1])
52463103500 Akhade S.A.
6506329719 Albenze E.
36472906200 Alesi W.R.
56963752500 Anna S.L.
56522803500 Boes J.R.
26433085700 Calle-Vallejo F.
54973276000 Chao R.
7201800897 Collins T.J.
54883867200 Curnan M.T.
7003584159 Damodaran K.
55328415000 Demeter E.L.
37005464900 Dsilva C.
18037364800 Egbebi A.
35603120700 Eslick J.C.
56673468200 Fan Q.
24404182600 Frenkel A.I.
35514271900 Gellman A.J.
12803603300 Gerdes K.
54585146800 Gumuslu G.
55569145100 Hallenbeck A.P.
24316829300 Hansen H.A.
56009239000 Hilburg S.L.
55676869000 Hopkinson D.
56674328100 Illes S.M.
23479647900 Inoglu N.G.
6603398169 Jaramillo T.F.
8054222900 Joshi Y.V.
47962378000 Keturakis C.
57056061900 Kondratyuk P.
55391991800 Kondratyuk P.
7006205398 Koper M.T.M.
23004637900 Kusuma V.A.
35787409400 Landon J.
55005205100 Lee A.S.
6701399651 Luebke D.R.
35491189200 Man I.C.
27467500000 Mantripragada H.
55373026900 Mao J.X.
55210428500 Marks A.
27667815700 Martinez J.I.
56071079300 Mehta P.
56673592900 Michael J.D.
55772901000 Miller D.C.
7501599910 Miller J.B.
26032231600 Miller S.D.
35576929100 Morreale B.
55308251800 Munprom R.
14036290400 Myers C.R.
7007042214 Norskov J.K.
24081524800 Nulwala H.B.
56347288000 Petrova R.
7006208748 Pushkarev V.V.
56591664500 Raman S.
7004217247 Resnik K.P.
47962694800 Richard Alesi Jr. W.
9742604300 Rossmeisl J.
7201763336 Rubin E.S.
6602471339 Sabolsky E.M.
7004541416 Salvador P.A.
22981503200 Shi W.
55885836600 Siefert N.S.
25224517700 Su H.-Y.
57016792200 Thirumalai H.
8724572500 Thompson R.L.
8238710700 Vasic R.
37081979100 Versteeg P.
7006804734 Wachs I.E.
6701692232 Washburn N.R.
56542538800 Watkins J.D.
55569461200 Xu Z.
56424861600 Yin C.
56969809500 Zhou X.

It is worth inspecting this list for duplicates. I see at least two duplicates. That is a limitation of almost every indexing service I have seen. Names are hard to disambiguate. I will live with it. Now, we will use another query to get affiliations, and the names. Since we use a sorted list from above, these names are in alphabetical order. We exclude co-authors from Carnegie Mellon University since these are often my students, or colleagues, and they are obvious conflicts of interest for proposal reviewing anyway. I split the current affiliation on a comma, since it appears the institution comes first, followed by the department. We only need an institution here.

from scopus.scopus_author import ScopusAuthor

coauthors = [ScopusAuthor(auid) for auid, name in data]

print(', '.join(['{0} ({1})'.format(au.name, au.current_affiliation.split(',')[0])
                 for au in coauthors
                 if au.current_affiliation.split(',')[0] != 'Carnegie Mellon University']))
Sneha A. Akhade (Pennsylvania State University), Erik J. Albenze (National Energy Technology Laboratory), Federico Calle-Vallejo (Leiden Institute of Chemistry), Robin Chao (National Energy Technology Laboratory), Krishnan V. Damodaran (University of Pittsburgh), Carmeline J. Dsilva (Princeton University), Adefemi A. Egbebi (URS), John C. Eslick (National Energy Technology Laboratory), Anatoly I. Frenkel (Yeshiva University), Kirk R. Gerdes (National Energy Technology Laboratory), Heine Anton Hansen (Danmarks Tekniske Universitet), David P. Hopkinson (National Energy Technology Laboratory), Thomas Francisco Jaramillo (Fermi National Accelerator Laboratory), Yogesh V. Joshi (Exxon Mobil Research and Engineering), Christopher J. Keturakis (Lehigh University), Marc T M Koper (Leiden Institute of Chemistry), Victor A. Kusuma (National Energy Technology Laboratory), James Landon (University of Kentucky), David R. Luebke (Liquid Ion Solutions), Isabelacostinela Man (Universitatea din Bucuresti), James X. Mao (University of Pittsburgh), José Ignacio Martínez (CSIC - Instituto de Ciencia de Materiales de Madrid (ICMM)), David C M Miller (National Energy Technology Laboratory), Bryan D. Morreale (National Energy Technology Laboratory), Christina R. Myers (National Energy Technology Laboratory), Jens Kehlet Nørskov (Stanford Linear Accelerator Center), Rumyana V. Petrova (International Iberian Nanotechnology Laboratory), Vladimir V. Pushkarev (Dow Corning Corporation), Sumathy Raman (Exxon Mobil Research and Engineering), Kevin P. Resnik (URS), Walter Richard Alesi (National Energy Technology Laboratory), Jan Rossmeisl (Kobenhavns Universitet), Edward M. Sabolsky (West Virginia University), Wei Shi (University of Pittsburgh), Nicholas S. Siefert (National Energy Technology Laboratory), Haiyan Su (Dalian Institute of Chemical Physics Chinese Academy of Sciences), Robert Lee Thompson (University of Pittsburgh Medical Center), Relja Vasić (SUNY College of Nanoscale Science and Engineering), Israel E. Wachs (Lehigh University), John D. Watkins (National Energy Technology Laboratory), Chunrong Yin (United States Department of Energy), Xu Zhou (Liquid Ion Solutions)

This is pretty sweet. I could pretty easily create a query that had all the PIs on a proposal, and alphabetize everyone's coauthors, or print them to a CSV file for import to Excel, or whatever format is required for conflict of interest reporting. The list is not perfect, but it is easy to manually fix it here.

That little bit of code is wrapped in a command-line utility in the scopus Python package. You use it like this. Just run it every time you need an updated list of coauthors! It isn't super flexible for now, e.g. excluding multiple affiliations, including multiple authors, etc… isn't fully supported.

./scopus_coauthors 7004212771 2010 --exclude-affiliation="Carnegie Mellon University"
Sneha A. Akhade (Pennsylvania State University), Erik J. Albenze (National Energy Technology Laboratory), Federico Calle-Vallejo (Leiden Institute of Chemistry), Robin Chao (National Energy Technology Laboratory), Krishnan V. Damodaran (University of Pittsburgh), Carmeline J. Dsilva (Princeton University), Adefemi A. Egbebi (URS), John C. Eslick (National Energy Technology Laboratory), Anatoly I. Frenkel (Yeshiva University), Kirk R. Gerdes (National Energy Technology Laboratory), Heine Anton Hansen (Danmarks Tekniske Universitet), David P. Hopkinson (National Energy Technology Laboratory), Thomas Francisco Jaramillo (Fermi National Accelerator Laboratory), Yogesh V. Joshi (Exxon Mobil Research and Engineering), Christopher J. Keturakis (Lehigh University), Marc T M Koper (Leiden Institute of Chemistry), Victor A. Kusuma (National Energy Technology Laboratory), James Landon (University of Kentucky), David R. Luebke (Liquid Ion Solutions), Isabelacostinela Man (Universitatea din Bucuresti), James X. Mao (University of Pittsburgh), José Ignacio Martínez (CSIC - Instituto de Ciencia de Materiales de Madrid (ICMM)), David C M Miller (National Energy Technology Laboratory), Bryan D. Morreale (National Energy Technology Laboratory), Christina R. Myers (National Energy Technology Laboratory), Jens Kehlet Nørskov (Stanford Linear Accelerator Center), Rumyana V. Petrova (International Iberian Nanotechnology Laboratory), Vladimir V. Pushkarev (Dow Corning Corporation), Sumathy Raman (Exxon Mobil Research and Engineering), Kevin P. Resnik (URS), Walter Richard Alesi (National Energy Technology Laboratory), Jan Rossmeisl (Kobenhavns Universitet), Edward M. Sabolsky (West Virginia University), Wei Shi (University of Pittsburgh), Nicholas S. Siefert (National Energy Technology Laboratory), Haiyan Su (Dalian Institute of Chemical Physics Chinese Academy of Sciences), Robert Lee Thompson (University of Pittsburgh Medical Center), Relja Vasić (SUNY College of Nanoscale Science and Engineering), Israel E. Wachs (Lehigh University), John D. Watkins (National Energy Technology Laboratory), Chunrong Yin (United States Department of Energy), Xu Zhou (Liquid Ion Solutions)

Copyright (C) 2016 by John Kitchin. See the License for information about copying.

org-mode source

Org-mode version = 8.2.10

Discuss on Twitter