
Innovation Information Initiative

What comes after MAG?

Samuel J. Klein
Published on: May 27, 2021
License: Creative Commons Attribution 4.0 International License (CC-BY 4.0)

Overview of citation graphs and tools

This is an interlay for (scholarly) citation graphs:
1. What are these for / what is their scope;
2. What things like this exist now for various contexts;
3. How are these updated, by which curators + what processes;
4. What are the upstream and downstream sources + derivatives; and
5. What do we want the above to become, in the fullness of time?

Focus and challenge

Compiling a global citation graph (or a subset relevant to your current research context), in a format that’s convenient for [re]calculating metrics and training models.

What do we want this to become?

Here are 5 things that everyone building an open academic-graph commons can contribute to, and a 0th thing (standards that can help align our efforts :)

Elements of the commons we want:
0. simple standards for being part of the commons
    open, forkable code + data, transparent processes
    commitment to register IDs, scripts, vocabularies, schemas, processes w/ a shared registry (WD or equivalent)
1. a federated data pipeline -> what can others build to speed this up?
    a source catalog + associated scripts
    a script library for processing/cleaning and disambiguation
    a federated event feed -> what exists, what more is needed?
    named processes for reproducing dataset outputs from the above
2. a vocabulary of core entities, and a set of PIDs others can build against for each (not an internal PK for each project; most projects don't need to generate a new PID for most entities)
3. a set of datasets released on a time series, w/ explicit + consistent (MAG used to provide one; whatever OR builds will be another; incremental updates are a bonus)
4. a set of services available online, for free / at cost / at burden
5. internal documentation + interlayer description
    An overview: What is the future of the OAG? extending 'outside' reflections like this, w/ contributions from everyone providing part of the above
    A maintenance + dependency checklist: what upstreams + downstreams does the OAG depend on? How can someone rebuild it from scratch, or support its maintainers?

What exists now

Concordance of citation graphs

Other lists + aggregators
    Do other concordances exist?
    Lists of resources: (github-awesome lists) (wp list of graphs)
    List of academic databases: includes Internet Archive Scholar, fatcat

Citation graphs themselves
    Microsoft Academic / Open Academic Graph
    Internal graphs @ metrics-providers
    Web of Science
    Lens.org
    Publish or Perish
    Depsy (deprecated): (citations for software)

Search engines
    GettheResearch
    Semantic Scholar
        Derivatives: citation-intent, paper-ID, author-ID
    Dimensions

Metrics
    ImpactStory: https://profiles.impactstory.org/ (alt metrics)
    Clarivate
    Dimensions

How are these updated?

Most internal/commercial pipelines are opaque.
Dimensions updates some things continuously, other things (GRID) twice a year.
    Topic maps ->
    Citation existence ->
    Disambiguating article + author ID ->
    Citation affect
    Crossref

Draft specs:
    Event feed -> what is needed?
    Data pipeline -> what can others build to speed this up?
    ID set -> (OurR spec) -> coming out soon :) mainly want people to actually be open!

Process writeup: What comes after MAG?
    Data sources: Limiting what else
    Open requests: How do people currently use the MAG API? What's missing so far? (conf proceedings, non-DOIs, open list for requesters, ML classification)
    IDs -> What new ones exist? what’s being maintained?
    MAG ID ->

Attendees ->
    IDs: GRID / ROR / SS / IA [new primary key]
    OAIR: [SS / Meta / BN? / Crossref / MAG / Lens] -> clarify degree of open code + data -> publisher agreements
    API access: read-only GETs (as per MAG?)

Patent feeds as well
COAR/BASE compared to UPW

What are related up + downstreams?

170 dataset-papers drawing on MAG
Reliance on Science

Where do we want to be? + related research

What comes after MAG?

Microsoft Academic Graph changed the landscape of possibility for uses of citation graphs. It was mostly-complete and mostly-free to reuse, at launch 7 years ago. It was updated by a talented team at MS, which did extensive document-processing on a wide range of source formats. It quickly became a staple of any aggregator of such data, and people started to rely on its identifiers, author-identification, and topic-mapping.

Related research

“Zenodo in the Spotlight of Traditional and New Metrics”