What Comes After MAG?

Innovation Information Initiative
Samuel J. Klein
Published on: May 27, 2021
License: Creative Commons Attribution 4.0 International License (CC-BY 4.0)

Overview of citation graphs and tools

This is an interlay for (scholarly) citation graphs:
1. What are these for / what is their scope;
2. What things like this exist now for various contexts;
3. How are these updated, by which curators + what processes;
4. What are the upstream and downstream sources + derivatives; and
5. What do we want the above to become, in the fullness of time?

Focus and challenge

Compiling a global citation graph (or a subset relevant to your current research context), in a format that’s convenient for [re]calculating metrics and training models.

What do we want this to become?

Here are 5 things that everyone building an open academic-graph commons can contribute to, and a 0th thing (standards that can help align our efforts :)

Elements of the commons we want:
0. simple standards for being part of the commons
   - open, forkable code + data, transparent processes
   - commitment to register IDs, scripts, vocabularies, schemas, processes w/ a shared registry (WD or equivalent)
1. a federated data pipeline -> what can others build to speed this up?
   - a source catalog + associated scripts
   - a script library for processing/cleaning and disambiguation
   - a federated event feed -> what exists, what more is needed?
   - named processes for reproducing dataset outputs from the above
2. a vocabulary of core entities, and a set of PIDs others can build against for each (not an internal PK for each project; most projects don't need to generate a new PID for most entities)
3. a set of datasets released on a time series, w/ explicit + consistent (MAG used to provide one; whatever OR builds will be another; incremental updates are a bonus)
4. a set of services available online, for free / at cost / at burden
5. internal documentation + interlayer description
   - An overview: What is the future of the OAG? extending 'outside' reflections like this, w/ contributions from everyone providing part of the above
   - A maintenance + dependency checklist: what upstreams + downstreams does the OAG depend on? How can someone rebuild it from scratch; or support its maintainers?

What exists now

Concordance of citation graphs

Other lists + aggregators
   - Do other concordances exist?
   - Lists of resources: (github-awesome lists) (wp list of graphs)
   - List of academic databases: includes Internet Archive Scholar, fatcat

Citation graphs themselves
   - Microsoft Academic / Open Academic Graph
   - Internal graphs @ metrics-providers
   - Web of Science
   - Lens.org
   - Publish or Perish
   - Depsy (deprecated): (citations for software)

Search engines
   - GettheResearch
   - Semantic Scholar
     Derivatives: citation-intent, paper-ID, author-ID
   - Dimensions

Metrics
   - ImpactStory: https://profiles.impactstory.org/ (alt metrics)
   - Clarivate
   - Dimensions

How are these updated?

Most internal/commercial pipelines are opaque. Dimensions updates some things continuously, other things (GRID) twice a year.

Topic maps -> Citation existence -> Disambiguating article + author ID -> Citation affect
Crossref

Drafts specs:
   - Event feed -> what is needed?
   - Data pipeline -> what can others build to speed this up?
   - ID set -> (OurR spec) -> coming out soon :) mainly want people to actually be open!

Process writeup: What comes after MAG?
Data sources: Limiting what else
Open requests: How do people currently use the MAG API? What's missing so far? (conf proceedings, non-DOIs, open list for requesters, ML classification)
IDs -> What new ones exist? what’s being maintained?
   : MAG ID -> Attendees ->
   : IDs: GRID / ROR / SS / IA [new primary key]
OAIR : [SS / Meta / BN ? / Crossref / MAG / Lens] -> clarify degree of open code + data -> publisher agreements
API access : read-only GETs (as per MAG?)
===
Patent feeds as well
COAR/BASE compared to UPW

What are related up + downstreams?

   - 170 dataset-papers drawing on MAG
   - Reliance on Science

Where do we want to be? + related research

What comes after MAG?

Microsoft Academic Graph changed the landscape of possibility for uses of citation graphs. It was mostly-complete and mostly-free to reuse, at launch 7 years ago. It was updated by a talented team at MS, which did extensive document-processing on a wide range of source formats. It quickly became a staple of any aggregator of such data, and people started to rely on its identifiers, author-identification, and topic-mapping.

Related research

“Zenodo in the Spotlight of Traditional and New Metrics”
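
As a rough illustration of the "focus and challenge" above (a citation graph in a format convenient for recalculating metrics) and of the call for shared PIDs rather than per-project keys, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `CONCORDANCE` table, the identifier strings, and the function names are hypothetical and do not come from MAG, Crossref, or any other source named in these notes.

```python
from collections import defaultdict

# Hypothetical concordance: source-specific IDs mapped to one shared PID.
# All identifiers are invented for this example.
CONCORDANCE = {
    "mag:1001": "doi:10.1000/a",
    "s2:abc": "doi:10.1000/a",
    "mag:1002": "doi:10.1000/b",
    "lens:77": "doi:10.1000/c",
}


def to_pid(local_id):
    """Resolve a source-specific ID to the shared PID; unknown IDs pass through."""
    return CONCORDANCE.get(local_id, local_id)


def build_graph(edges):
    """Merge (citing, cited) pairs from several sources into one deduplicated graph."""
    graph = defaultdict(set)
    for citing, cited in edges:
        graph[to_pid(citing)].add(to_pid(cited))
    return graph


def citation_counts(graph):
    """Recalculate a simple metric: distinct citing works per cited work."""
    counts = defaultdict(int)
    for cited_set in graph.values():
        for cited in cited_set:
            counts[cited] += 1
    return dict(counts)


# The same citation reported by two sources collapses to a single edge.
edges = [
    ("mag:1002", "mag:1001"),  # b cites a, as seen in one source
    ("mag:1002", "s2:abc"),    # the same citation, seen in another source
    ("lens:77", "s2:abc"),     # c cites a
    ("lens:77", "mag:1002"),   # c cites b
]
graph = build_graph(edges)
counts = citation_counts(graph)
```

Resolving every edge to a shared PID before counting is what lets edge lists from different providers be merged without double-counting; a real pipeline would also need disambiguation for works that have no PID yet.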