Generating an alphabetized list of collaborators from the past five years
Posted February 20, 2016 at 05:03 PM | categories: scopus, python | tags:
Updated February 20, 2016 at 05:24 PM
Almost every proposal I write requires some list of my coauthors from the past several years. Some want the list alphabetized, and some want affiliations too. It has always bothered me to make this list, mostly because it is tedious, and it seems like something that should not be hard to generate. It turns out it is not too hard. I have been developing a Python interface ((https://github.com/jkitchin/scopus )) to Scopus that more or less enables me to script this.
Scopus is not free. You need either a license, or institutional access to use it. Here is the strategy to generate my list of coauthors. First, we need to get the articles for the past 5 years that are mine, and for each paper we get the coauthors. I use my Scopus author id in the query, and then sort the names alphabetically into a table. Then, I use that table as input to a second code block that does an author query in Scopus to get the current affiliations. Here is the code.
from scopus.scopus_api import ScopusAbstract from scopus.scopus_search import ScopusSearch s = ScopusSearch('AU-ID(7004212771) AND PUBYEAR > 2010') coauthors = {} for eid in s.EIDS: ab = ScopusAbstract(eid) for au in ab.authors: if au.auid not in coauthors and au.auid != '7004212771': coauthors[au.auid] = au.indexed_name return sorted([[auid, name] for auid,name in coauthors.items()], key=lambda x:x[1])
52463103500 | Akhade S.A. |
6506329719 | Albenze E. |
36472906200 | Alesi W.R. |
56963752500 | Anna S.L. |
56522803500 | Boes J.R. |
26433085700 | Calle-Vallejo F. |
54973276000 | Chao R. |
7201800897 | Collins T.J. |
54883867200 | Curnan M.T. |
7003584159 | Damodaran K. |
55328415000 | Demeter E.L. |
37005464900 | Dsilva C. |
18037364800 | Egbebi A. |
35603120700 | Eslick J.C. |
56673468200 | Fan Q. |
24404182600 | Frenkel A.I. |
35514271900 | Gellman A.J. |
12803603300 | Gerdes K. |
54585146800 | Gumuslu G. |
55569145100 | Hallenbeck A.P. |
24316829300 | Hansen H.A. |
56009239000 | Hilburg S.L. |
55676869000 | Hopkinson D. |
56674328100 | Illes S.M. |
23479647900 | Inoglu N.G. |
6603398169 | Jaramillo T.F. |
8054222900 | Joshi Y.V. |
47962378000 | Keturakis C. |
57056061900 | Kondratyuk P. |
55391991800 | Kondratyuk P. |
7006205398 | Koper M.T.M. |
23004637900 | Kusuma V.A. |
35787409400 | Landon J. |
55005205100 | Lee A.S. |
6701399651 | Luebke D.R. |
35491189200 | Man I.C. |
27467500000 | Mantripragada H. |
55373026900 | Mao J.X. |
55210428500 | Marks A. |
27667815700 | Martinez J.I. |
56071079300 | Mehta P. |
56673592900 | Michael J.D. |
55772901000 | Miller D.C. |
7501599910 | Miller J.B. |
26032231600 | Miller S.D. |
35576929100 | Morreale B. |
55308251800 | Munprom R. |
14036290400 | Myers C.R. |
7007042214 | Norskov J.K. |
24081524800 | Nulwala H.B. |
56347288000 | Petrova R. |
7006208748 | Pushkarev V.V. |
56591664500 | Raman S. |
7004217247 | Resnik K.P. |
47962694800 | Richard Alesi Jr. W. |
9742604300 | Rossmeisl J. |
7201763336 | Rubin E.S. |
6602471339 | Sabolsky E.M. |
7004541416 | Salvador P.A. |
22981503200 | Shi W. |
55885836600 | Siefert N.S. |
25224517700 | Su H.-Y. |
57016792200 | Thirumalai H. |
8724572500 | Thompson R.L. |
8238710700 | Vasic R. |
37081979100 | Versteeg P. |
7006804734 | Wachs I.E. |
6701692232 | Washburn N.R. |
56542538800 | Watkins J.D. |
55569461200 | Xu Z. |
56424861600 | Yin C. |
56969809500 | Zhou X. |
It is worth inspecting this list for duplicates. I see at least two duplicates. That is a limitation of almost every indexing service I have seen. Names are hard to disambiguate. I will live with it. Now, we will use another query to get affiliations, and the names. Since we use a sorted list from above, these names are in alphabetical order. We exclude co-authors from Carnegie Mellon University since these are often my students, or colleagues, and they are obvious conflicts of interest for proposal reviewing anyway. I split the current affiliation on a comma, since it appears the institution comes first, followed by the department. We only need an institution here.
from scopus.scopus_author import ScopusAuthor coauthors = [ScopusAuthor(auid) for auid, name in data] print(', '.join(['{0} ({1})'.format(au.name, au.current_affiliation.split(',')[0]) for au in coauthors if au.current_affiliation.split(',')[0] != 'Carnegie Mellon University']))
Sneha A. Akhade (Pennsylvania State University), Erik J. Albenze (National Energy Technology Laboratory), Federico Calle-Vallejo (Leiden Institute of Chemistry), Robin Chao (National Energy Technology Laboratory), Krishnan V. Damodaran (University of Pittsburgh), Carmeline J. Dsilva (Princeton University), Adefemi A. Egbebi (URS), John C. Eslick (National Energy Technology Laboratory), Anatoly I. Frenkel (Yeshiva University), Kirk R. Gerdes (National Energy Technology Laboratory), Heine Anton Hansen (Danmarks Tekniske Universitet), David P. Hopkinson (National Energy Technology Laboratory), Thomas Francisco Jaramillo (Fermi National Accelerator Laboratory), Yogesh V. Joshi (Exxon Mobil Research and Engineering), Christopher J. Keturakis (Lehigh University), Marc T M Koper (Leiden Institute of Chemistry), Victor A. Kusuma (National Energy Technology Laboratory), James Landon (University of Kentucky), David R. Luebke (Liquid Ion Solutions), Isabelacostinela Man (Universitatea din Bucuresti), James X. Mao (University of Pittsburgh), José Ignacio Martínez (CSIC - Instituto de Ciencia de Materiales de Madrid (ICMM)), David C M Miller (National Energy Technology Laboratory), Bryan D. Morreale (National Energy Technology Laboratory), Christina R. Myers (National Energy Technology Laboratory), Jens Kehlet Nørskov (Stanford Linear Accelerator Center), Rumyana V. Petrova (International Iberian Nanotechnology Laboratory), Vladimir V. Pushkarev (Dow Corning Corporation), Sumathy Raman (Exxon Mobil Research and Engineering), Kevin P. Resnik (URS), Walter Richard Alesi (National Energy Technology Laboratory), Jan Rossmeisl (Kobenhavns Universitet), Edward M. Sabolsky (West Virginia University), Wei Shi (University of Pittsburgh), Nicholas S. Siefert (National Energy Technology Laboratory), Haiyan Su (Dalian Institute of Chemical Physics Chinese Academy of Sciences), Robert Lee Thompson (University of Pittsburgh Medical Center), Relja Vasić (SUNY College of Nanoscale Science and Engineering), Israel E. Wachs (Lehigh University), John D. Watkins (National Energy Technology Laboratory), Chunrong Yin (United States Department of Energy), Xu Zhou (Liquid Ion Solutions)
This is pretty sweet. I could pretty easily create a query that had all the PIs on a proposal, and alphabetize everyone's coauthors, or print them to a CSV file for import to Excel, or whatever format is required for conflict of interest reporting. The list is not perfect, but it is easy to manually fix it here.
That little bit of code is wrapped in a command-line utility in the scopus Python package. You use it like this. Just run it every time you need an updated list of coauthors! It isn't super flexible for now, e.g. excluding multiple affiliations, including multiple authors, etc… isn't fully supported.
./scopus_coauthors 7004212771 2010 --exclude-affiliation="Carnegie Mellon University"
Sneha A. Akhade (Pennsylvania State University), Erik J. Albenze (National Energy Technology Laboratory), Federico Calle-Vallejo (Leiden Institute of Chemistry), Robin Chao (National Energy Technology Laboratory), Krishnan V. Damodaran (University of Pittsburgh), Carmeline J. Dsilva (Princeton University), Adefemi A. Egbebi (URS), John C. Eslick (National Energy Technology Laboratory), Anatoly I. Frenkel (Yeshiva University), Kirk R. Gerdes (National Energy Technology Laboratory), Heine Anton Hansen (Danmarks Tekniske Universitet), David P. Hopkinson (National Energy Technology Laboratory), Thomas Francisco Jaramillo (Fermi National Accelerator Laboratory), Yogesh V. Joshi (Exxon Mobil Research and Engineering), Christopher J. Keturakis (Lehigh University), Marc T M Koper (Leiden Institute of Chemistry), Victor A. Kusuma (National Energy Technology Laboratory), James Landon (University of Kentucky), David R. Luebke (Liquid Ion Solutions), Isabelacostinela Man (Universitatea din Bucuresti), James X. Mao (University of Pittsburgh), José Ignacio Martínez (CSIC - Instituto de Ciencia de Materiales de Madrid (ICMM)), David C M Miller (National Energy Technology Laboratory), Bryan D. Morreale (National Energy Technology Laboratory), Christina R. Myers (National Energy Technology Laboratory), Jens Kehlet Nørskov (Stanford Linear Accelerator Center), Rumyana V. Petrova (International Iberian Nanotechnology Laboratory), Vladimir V. Pushkarev (Dow Corning Corporation), Sumathy Raman (Exxon Mobil Research and Engineering), Kevin P. Resnik (URS), Walter Richard Alesi (National Energy Technology Laboratory), Jan Rossmeisl (Kobenhavns Universitet), Edward M. Sabolsky (West Virginia University), Wei Shi (University of Pittsburgh), Nicholas S. Siefert (National Energy Technology Laboratory), Haiyan Su (Dalian Institute of Chemical Physics Chinese Academy of Sciences), Robert Lee Thompson (University of Pittsburgh Medical Center), Relja Vasić (SUNY College of Nanoscale Science and Engineering), Israel E. Wachs (Lehigh University), John D. Watkins (National Energy Technology Laboratory), Chunrong Yin (United States Department of Energy), Xu Zhou (Liquid Ion Solutions)
Copyright (C) 2016 by John Kitchin. See the License for information about copying.
Org-mode version = 8.2.10