Semantic Scholar Document Analysis at Scale

Total Page:16

File Type:pdf, Size:1020Kb

Semantic Scholar Document Analysis at Scale Semantic Scholar Document Analysis at Scale Miles Crawford, Director of Engineering Outline ● Introduction to the Allen Institute for Artificial Intelligence and Semantic Scholar ● Research at Semantic Scholar ● Creating www.semanticscholar.org ● Other resources for researchers Introduction to AI2 and S2 “AI for the common good” Allen Institute for Artificial Intelligence Mosaic Aristo Euclid Common Sense Knowledge Machine Reading and Math and Geometry and Reasoning Question Answering Comprehension AllenNLP PRIOR Semantic Scholar Deep Semantic NLP Visual Reasoning AI-Based Academic Platform Knowledge Semantic Scholar: Vision & Strategy Semantic Scholar makes the world's scholarly knowledge easy to survey and consume. Semantic Scholar: Vision & Strategy Differentiation: S2 is dramatically better at surveying, extracting, and helping researchers consume the most relevant information from the world's research Scale: Attract and retain a significant and sustainable share of academic search traffic Impact on research with AI: Work towards a “Wright Brothers” moment for research through research on novel AI techniques that are prototyped with millions of active users semanticscholar.org Research at Semantic Scholar Research at S2: Three Levels of Analysis Paper Relationships Macro Paper: Extract meaningful structures Figures Tables Topics Relations Neural Networks Omniglot Backpropagation Results C-peptide [contraindicated with] Diabetes Mellitus The addition of MbPA reaches a test perplexity of 29.2 which is, to the authors’ knowledge, state-of-the-art at time of writing. Peters et al. ACL 2017 -- Semi-supervised Sequence Tagging with Bidirectional Langua… Ammar et al. SemEval 2017 -- Semi-supervised End-to-end Entity and Relation Extrac… Siegel et al. JCDL 2018 -- Extracting Scientific Figures with Distantly Supervised Neural… Paper: Extract meaningful structures Relationships: Establishing Connections Ontology Matching uses method should cite UMLS Discovered KB Ammar et al. NAACL 2018 -- Construction of the Literature Graph in Semantic Scholar Wang et al. BioNLP 2018 -- Ontology Alignment in the Biomedical Domain Using Entit… Bhagavatula et al. NAACL 2018 -- Content-Based Citation Recommendation Relationships: Establishing Connections Relationships: Establishing Connections Macro: Trends in research pre-publishing vs. citation rates understanding peer reviews quantify demographic bias in clinical trials Kang et al. NAACL 2018 -- A Dataset of Peer Reviews (PeerRead): Collection, Insights... Feldman et al. arXiv 2018 -- Citation Count Analysis for Papers with Preprints 2018 Publications A Dataset of Peer Reviews (PeerRead): Collection, Insights and Extracting Scientific Figures with Distantly Supervised Neural NLP Applications Networks Kang, Ammar, Dalvi, van Zuylen, Kohlmeier, Hovy, Schwartz Siegel, Lourie, Power, Ammar NAACL 2018 JCDL 2018 Content-Based Citation Recommendation Neural Relation Extraction using a Combination of Full Bhagavatula, Feldman, Power, Ammar. Supervision and Distant Supervision NAACL 2018 Beltagy, Lo, Ammar arXiv preprint 2018 Construction of the Literature Graph in Semantic Scholar Ammar, Groeneveld, Bhagavatula, …, Etzioni Citation Count Analysis for Papers with Preprints NAACL 2018 Feldman, Lo, Ammar arXiv preprint 2018 Ontology Alignment in the Biomedical Domain Using Entity Definitions and Context Wang, Bhagavatula, Neumann, Lo, Wilhelm, Ammar BioNLP 2018 Bringing it All to Production semanticscholar.org: Scale and Reach ● Comprehensive literature graph to explore papers, authors, topics. ● View and search across 40MM+ Computer Science & BioMedical academic papers. ● Expansion to all of science in 2019. ● Topic extraction covering 350k+ topics, 230MM+ mentions, and 6.7MM+ relationships. ● Used by over 2.2MM unique users per month. ● Global adoption. Architecture Overview Data Acquisition Pipeline Machine Learning Web Application Models Figures Node Server Web Crawler Crawler Junk Filter React Entities Application DBLP De-duplicate User’s & Cluster Metadata Browser PubMed React Etc... Metadata API Application Springer Create Citation Metadata The Graph Internet Etc... Etc...Etc... Join Extracted Data Create Search Index Elasticsearch Normalizing Paper Records Data Acquisition Sourced Paper Title: N/A Crawler Authors: N/A PDF: a2ho-4.pdf Sourced Paper etc... Title: Analyzing... PubMed Authors: A, B, C Sourced Paper PDF: N/A Amazon etc... S3 Title: Findings of... Springer Authors: A, B, C PDF: findings.pdf etc... Etc... Etc...Etc... Pipeline Distributed Paper Resolution Junk Filter Title Block: analyzscientificdocument Title: Analyzing Scientific Documents De-duplicate Author: John Doe & Cluster Title: Analyzing Scientific Documents Title: Analyzing scientific documents Author: John Doe Author: J. Doe Create Citation Graph Title: Analyze Scientific Documents Title: Analyze Scientific Documents Author: Jane Smith Author: Jane Smith Join Extracted Data Title Block: deeplearningfornlp Create Search Index Title: Deep Learning for NLP Title: Deep Learning for NLP Author: Jane Smith Author: Jane Smith Pipeline Distributed Citation/Reference Matching Junk Filter Title: Structured Content Extraction Author: John Doe Title Block: analyzscientificdocument Title: Analyzing Scientific Documents De-duplicate & Cluster Author: John Doe References: 1. Analyzing Scientific Title: Analyze Scientific Documents Create Author: Jane Smith Citation Documents, Doe et al., Graph SCIDOCA 2016 2. Deep Learning with NLP, Join Smith, NAACL 2017 Extracted Title Block: deeplearningfornlp Data 3. Structural Heuristics in Academic Documents, Oscar, SCIDOCA 2016 Title: Deep Learning for NLP Create Author: Jane Smith Search Index ? Pipeline Scalable ML Models Notifications for new or changed Junk Filter papers Model Task System De-duplicate & Cluster Create Citation Table of Paper 1: abc Paper 1: Paper 1: abc Paper 1: abc extraction inputs Paper 2: def Paper 2: Paper 2: Paper 2: def Graph Etc... and results per Paper 3: Paper 3: Paper 3: Paper 3: hij model: Paper 4: Paper 4: Paper 4: Paper 4: Join Extracted Data Docker containers on autoscaling host groups: Figure Figure Key Topic Create ExtractionFigure ExtractionFigure ResultsFigure LocationFigure Etc...Figure Search Index Extractionv1.0 Extractionv1.1 Extractionv1.0 Extractionv1.3 Extraction v1.0 v1.0 v1.0 v1.0 v1.0 Other Resources Open Research Corpus ● Full Corpus of Papers: ○ Some data restricted due to some licensing/copyright conditions. ○ Record-per-line JSON archive, 36GB. ○ http://labs.semanticscholar.org/corpus/ ● API: ○ Lightweight single paper or author access. ○ JSON over REST API. ○ Link resolver for easy linking to Semantic Scholar. ○ http://api.semanticscholar.org/ Use of Open Corpus as Training Set An independent researcher created a system for embedding papers as a vector to support similarity measurements and recommendation generation https://github.com/Santosh-Gupta/Research2Vec ArXiv Added a Bibliography via API ArXiv used our API to fetch and re-display citation and reference lists on their pages. Open Source Repositories Cite-o-matic: Find relevant citations for drafts of papers. https://github.com/allenai/citeomatic Science Parse v1: CRF-based extraction of metadata from PDFs. https://github.com/allenai/science-parse Science Parse v2: LSTM-based extraction of metadata from PDFs. https://github.com/allenai/spv2 Deep Figures: Extract figures and tables from papers. https://github.com/allenai/deepfigures-open What would you like to see? Thank you! [email protected] http://allenai.org https://semanticscholar.org Appendix 1st dataset of peer reviews Dongyeop Kang, Waleed Ammar, Bhavana Dalvi Mishra, Madeleine van Zuylen, Sebastian Kohlmeier, Eduard Hovy, Roy Schwartz. NAACL 2018 Prepublishing before paper gets accepted is associated with 65% more citations! Sergey Feldman, Kyle Lo, Waleed Ammar. arXiv 2018 Gender bias in clinical trials Cite-o-matic Chandra Bhagavatula, Sergey Feldman, Russell Power, Waleed Ammar. NAACL 2018 Literature graph Synthesize knowledge Waleed Ammar, Dirk Groeneveld, Chandra Bhagavatula, Iz Beltagy, Miles Crawford, Doug Downey, Jason Dunkelberger, Ahmed Elgohary, Sergey Feldman, Vu Ha, Rodney Kinney, Sebastian Kohlmeier, Kyle Lo, Tyler Murray, Hsu-Han Ooi, Matthew Peters, Joanna Power, Sam Skjonsberg, Lucy Lu Wang, Chris Wilhelm, Zheng Yuan, Madeleine van Zuylen, and Oren Etzioni. NAACL 2018 (industry track) Ontology alignment Lucy Wang, Chandra Bhagavatula, Mark Neumann, Kyle Lo, Chris Wilhelm and Waleed Ammar. BioNLP 2018 Context-sensitive word representations Matthew E. Peters, Waleed Ammar, Chandra Bhagavatula, Russell Power. ACL 2017 Waleed Ammar, Matthew E. Peters, Chandra Bhagavatula, Russell Power. SemEval 2017 Figure extraction Noah Siegel, Nicholas Lourie, Russell Power, Waleed Ammar. JCDL 2018. Next steps 1. Extract Meaningful Structures 2018 2019 Next steps steps 1. Extract Meaningful Structures 2018 2019 Address more paper FAQs (e.g., PICO elements). Next steps steps 1. Extract Meaningful Structures 2018 2019 Model salience of extracted structures at the document level. Next steps steps 1. Extract Meaningful Structures 2018 2019 User’s implicit/explicit feedback. Next steps steps 1. Extract Meaningful Structures 2018 2019 User’s implicit/explicit feedback. Next steps steps 1. Extract Meaningful Structures 2018 2019 User’s implicit/explicit feedback. Next steps steps 1. Extract Meaningful Structures 2018 2019 Adapt existing models to more domains. Ganin and Lempitsky. ICML’15 Next steps steps 2. Establish Connections 2018
Recommended publications
  • “Altmetrics” Using Google Scholar, Twitter, Mendeley, Facebook
    Pre-Print Version Altmetrics of “altmetrics” using Google Scholar, Twitter, Mendeley, Facebook, Google-plus, CiteULike, Blogs and Wiki Saeed-Ul Hassan, Uzair Ahmed Gillani [email protected] Information Technology University, 346-B Ferozepur Road, Lahore (Pakistan) Abstract: We measure the impact of “altmetrics” field by deploying altmetrics indicators using the data from Google Scholar, Twitter, Mendeley, Facebook, Google- plus, CiteULike, Blogs and Wiki during 2010- 2014. To capture the social impact of scientific publications, we propose an index called alt-index, analogues to h-index. Across the deployed indices, our results have shown high correlation among the indicators that capture social impact. While we observe medium Pearson’s correlation (ρ= .247) among the alt-index and h-index, a relatively high correlation is observed between social citations and scholarly citations (ρ= .646). Interestingly, we find high turnover of social citations in the field compared with the traditional scholarly citations, i.e. social citations are 42.2% more than traditional citations. The social mediums such as Twitter and Mendeley appear to be the most effective channels of social impact followed by Facebook and Google-plus. Overall, altmetrics appears to be working well in the field of “altmetrics”. Keywords: Altmetrics, Social Media, Usage Indicators, Alt-index Pre-Print Version Introduction In scholarly world, altmetrics are getting popularity as to support and/or alternative to traditional citation-based evaluation metrics such as impact factor, h-index etc. (Priem et. al., 2010). The concept of altmetrics was initially proposed in 2010 as a generalization of article level metrics and has its roots in the #altmetrics hashtag (McIntyre et al, 2011).
    [Show full text]
  • Publish Or Perish: a Dilemma for Academic Librarians? W
    Georgia Southern University Digital Commons@Georgia Southern Library Faculty Publications Faculty Research and Publications 5-1999 Publish or Perish: A Dilemma for Academic Librarians? W. Bede Mitchell Georgia Southern University, [email protected] Mary Reichel Appalachian State University Follow this and additional works at: https://digitalcommons.georgiasouthern.edu/lib-facpubs Part of the Higher Education Commons, and the Library and Information Science Commons Recommended Citation Mitchell, W. Bede, Mary Reichel. 1999. "Publish or Perish: A Dilemma for Academic Librarians?." College and Research Libraries, 60 (3): 232-243. doi: 10.5860/crl.60.3.232 https://digitalcommons.georgiasouthern.edu/lib-facpubs/7 This article is brought to you for free and open access by the Faculty Research and Publications at Digital Commons@Georgia Southern. It has been accepted for inclusion in Library Faculty Publications by an authorized administrator of Digital Commons@Georgia Southern. For more information, please contact [email protected]. 232 College & Research Libraries May 1999 Publish or Perish: A Dilemma For Academic Librarians? W. Bede Mitchell and Mary Reichel This study examines the influence of scholarly requirements on librar­ ians’ ability to earn tenure or continuous employment. After a literature review, the authors present the results of a survey of research, doctoral, and master’s-level institutions. Of the 690 responding institutions, 54.3 percent employ tenure-track librarians. Of these, more than 60 percent require some scholarship and 34.6 percent encourage it. At these 374 institutions, 92.2 percent of librarians who underwent tenure review dur­ ing a three-year period were approved. The authors summarize survey information on librarians not granted tenure as well as those believed by directors to have resigned to avoid tenure review.
    [Show full text]
  • The Opencitations Data Model
    The OpenCitations Data Model Marilena Daquino1;2[0000−0002−1113−7550], Silvio Peroni1;2[0000−0003−0530−4305], David Shotton2;3[0000−0001−5506−523X], Giovanni Colavizza4[0000−0002−9806−084X], Behnam Ghavimi5[0000−0002−4627−5371], Anne Lauscher6[0000−0001−8590−9827], Philipp Mayr5[0000−0002−6656−1658], Matteo Romanello7[0000−0002−7406−6286], and Philipp Zumstein8[0000−0002−6485−9434]? 1 Digital Humanities Advanced research Centre (/DH.arc), Department of Classical Philology and Italian Studies, University of Bologna fmarilena.daquino2,[email protected] 2 Research Centre for Open Scholarly Metadata, Department of Classical Philology and Italian Studies, University of Bologna 3 Oxford e-Research Centre, University of Oxford [email protected] 4 Institute for Logic, Language and Computation (ILLC), University of Amsterdam [email protected] 5 Department of Knowledge Technologies for the Social Sciences, GESIS - Leibniz-Institute for the Social Sciences [email protected], [email protected] 6 Data and Web Science Group, University of Mannheim [email protected] 7 cole Polytechnique Fdrale de Lausanne [email protected] 8 Mannheim University Library, University of Mannheim [email protected] Abstract. A variety of schemas and ontologies are currently used for the machine-readable description of bibliographic entities and citations. This diversity, and the reuse of the same ontology terms with differ- ent nuances, generates inconsistencies in data. Adoption of a single data model would facilitate data integration tasks regardless of the data sup- plier or context application. In this paper we present the OpenCitations Data Model (OCDM), a generic data model for describing bibliographic entities and citations, developed using Semantic Web technologies.
    [Show full text]
  • Abstract and Index and Web Discovery Services IEEE Partners
    ____________________________________________________________________________________________________________ Abstract and Index and Web Discovery Services IEEE Partners Introduction This document is intended to provide a general overview of the abstract and indexing services and web discovery services that take in IEEE content. While this report is intended to provide readers with information on IEEE content indexed and in what service, there are several things to keep in mind: Services identified in this report do not cover everything in IEEE’s Xplore Digital Library either because IEEE does not provide all content to these services, or because only certain content was selected by a partner. Some services add and delete titles regularly, or include only select articles, in order to maintain a database that is relevant to their audience. While IEEE may provide a data feed for a particular subscription package (noted in the tables below), partners are not required to index all content. Most partners update their products at varying intervals and many Abstract and Indexing Services do not include corrected or updated article information. As a result, and given that these services and our agreements with these partners can and do change, readers of this report are encouraged to contact Krista Thom, Publishing Relations Program Specialist at [email protected] with specific questions. Abstract & Indexing Services Abstract and indexing services maintain databases, often subject-specific, which users can search to find relevant content. The data included in these services may be peer-reviewed journals, books, reports, and other types of content. Unlike web scale discovery services, these services collect metadata (including abstracts) from publishers and other organizations into large repositories or indexes.
    [Show full text]
  • 445 Publishing Trends of Journals with Manuscripts in Pubmed Central
    445 ORIGINAL INVESTIGATION DOI: dx.doi.org/10.5195/jmla.2018.457 Publishing trends of journals with manuscripts in PubMed Central: changes from 2008–2009 to 2015–2016 Lauren Topper, PhD; Diane Boehr, MLS See end of article for authors’ affiliations. Objective: The National Institutes of Health (NIH) public access policy mandates that all articles containing NIH-funded research must be deposited into PubMed Central (PMC). The aim of this study was to assess publishing trends of journals that were not selected for the National Library of Medicine (NLM) collection but contain NIH-funded articles submitted to PMC in compliance with the public access policy. In addition, the authors investigated the degree to which NIH-funded research is published in journals that NLM does not collect due to concerns with the publishers. Methods: We analyzed bibliographic data from the NIH Manuscript Submission system for journals that were not selected for the NLM collection from August 2015 to August 2016. Publications (n=738) were analyzed by language, publishing country, publishing format, and subject, and the results were compared to a similar study of 2008–2009 data. In addition, publications were analyzed by whether their publishers are collected by NLM, as determined by transparency and adherence to publishing best practices. Results: Only a few differences were found between the studies. Most notably, while both studies revealed that most journals were not selected for the NLM collection because they were out of scope (i.e., not biomedical), we noted an increase in 2015–2016 in biomedical journals containing NIH-funded articles that were not added to the collection due to concerns with the publishers.
    [Show full text]
  • Moral Disagreements and Pernicious Pragmatism: Pluralism, Value Argumentation, and the U.S
    University of Windsor Scholarship at UWindsor OSSA Conference Archive OSSA 9 May 18th, 9:00 AM - May 21st, 5:00 PM Moral disagreements and pernicious pragmatism: Pluralism, value argumentation, and the U.S. health care debate John Rief University of Pittsburgh Matthew Paul Brigham James Madison University Bill Balthrop Follow this and additional works at: https://scholar.uwindsor.ca/ossaarchive Part of the Philosophy Commons Rief, John; Brigham, Matthew Paul; and Balthrop, Bill, "Moral disagreements and pernicious pragmatism: Pluralism, value argumentation, and the U.S. health care debate" (2011). OSSA Conference Archive. 53. https://scholar.uwindsor.ca/ossaarchive/OSSA9/papersandcommentaries/53 This Paper is brought to you for free and open access by the Conferences and Conference Proceedings at Scholarship at UWindsor. It has been accepted for inclusion in OSSA Conference Archive by an authorized conference organizer of Scholarship at UWindsor. For more information, please contact [email protected]. Moral disagreements and pernicious pragmatism: Pluralism, value argumentation, and the U.S. health care debate JOHN JOSEPH RIEF Department of Communication, Institute for Clinical Research Education University of Pittsburgh 4200 5th Avenue, Cathedral of Learning 1117, Pittsburgh, PA 15260 United States of America [email protected] MATTHEW PAUL BRIGHAM Department of Communication Studies James Madison University 54 Bluestone Drive, MSC 2106, Harrisonburg, VA 22807-001 United States of America [email protected] ABSTRACT: This paper suggests some important revisions to Perelman’s approach to resolving major value-based disagreements in pluralist political cultures by analyzing a critical exemplar—the national health care debate in the United States—in which political and economic expediency have justified with- holding rights from citizens despite their recognition by government officials.
    [Show full text]
  • Tipping Points: Cancelling Journals When Arxiv Access Is Good Enough
    Tipping points: cancelling journals when arXiv access is good enough Tony Aponte Sciences Collection Coordinator UCLA Library ASEE ELD Lightning Talk June 17, 2019 Preprint explosion! Brian Resnick and Julia Belluz. (2019). The war to free science. Vox https://www.vox.com/the-highlight/2019/6/3/18271538/open- access-elsevier-california-sci-hub-academic-paywalls Preprint explosion! arXiv. (2019). arXiv submission rate statistics https://arxiv.org/help/stats/2018_by_area/index 2018 Case Study: two physics journals and arXiv ● UCLA: heavy users of arXiv. Not so heavy users of version of record ● Decent UC authorship ● No UC editorial board members 2017 Usage Annual cost Cost per use 2017 Impact Factor Journal A 103 $8,315 ~$80 1.291 Journal B 72 $6,344 ~$88 0.769 Just how many of these articles are OA? OAISSN.py - Enter a Journal ISSN and a year and this python program will tell you how many DOIs from that year have an open access version2 Ryan Regier. (2018). OAISSN.py https://github.com/ryregier/OAcounts. Just how many of these articles are OA? Ryan Regier. (2018). OAISSN.py https://github.com/ryregier/OAcounts. Just how many of these articles are OA? % OA articles from 2017 % OA articles from 2018 Journal A 68% 64% Journal B 11% 8% Ryan Regier. (2018). OAISSN.py https://github.com/ryregier/OAcounts. arXiv e-prints becoming closer to publisher versions of record according to UCLA similarity study of arXiv articles vs versions of record Martin Klein, Peter Broadwell, Sharon E. Farb, Todd Grappone. 2018. Comparing Published Scientific Journal Articles to Their Pre-Print Versions -- Extended Version.
    [Show full text]
  • Scientometrics As an Important Tool for the Growth of the Field of Learning Analytics
    (2014). Scientometrics as an Important Tool for the Growth of the Field of Learning Analytics. Journal of Learning Analytics, 1(2), 1‐4. Scientometrics as an Important Tool for the Growth of the Field of Learning Analytics Negin Mirriahi, Dragan Gasevic, Phil Long and Shane Dawson Editors, Journal of Learning Analytics ABSTRACT: This article introduces the special issue from SoLAR’s Learning Analytics and Knowledge conference. Learning analytics is an emerging field incorporating theory and practice from numerous disciplines to investigate how learner interactions with digital artefacts can provide actionable data to the learner herself, and educators about the learning process. As the field continues to expand there is a timely opportunity to evaluate its ongoing maturation. This evaluation could be in part informed by regular scientometric analyses from both the Journal and Conference publications. These analyses can collectively provide insight into the development of learning analytics more broadly and assist with the allocation of resources to under‐represented areas for example. KEYWORDS: Special issue, learning analytics, research, practice, Society for Learning Analytics Research, SoLAR, LAK’13 1 EDITORIAL We would like to dedicate this issue to our dear colleague and friend Erik Duval from Katholieke Universiteit Leuven, Belgium. Erik has been key figure in the development of the Society for Learning Analytics Research as both executive member and prominent researcher advancing the field. At LAK’14 Indianapolis, USA, Erik informed us that he was unable to attend the conference as he had been diagnosed with non-Hodgkin lymphoma1. Despite undergoing a difficult medical treatment, Erik has maintained his unfailing level of energy, sense of humour and passion – traits which capture the hearts of those he interacts with2.
    [Show full text]
  • Nonprofit Policy Forum
    NONPROFIT POLICY FORUM EDITOR-IN-CHIEF Dennis R. Young, Atlanta, GA MANAGING EDITOR Linda Serra SENIOR EDITORIAL BOARD Alan J. Abramson, Fairfax, VA Helmut Anheier, Berlin Andrea Bassi, Bologna Elizabeth T. Boris, Washington, D.C. Gemma Donnelly-Cox, Dublin Philippe Eynaud, Paris James Ferris, Los Angeles, CA Robert Fischer, Cleveland, OH Benjamin Gidron, Rishon LeZion Virginia A. Hodgkinson, Washington, D.C. Kevin Kearns, Pittsburgh, PA Michael Layton, Grand Rapids, MI Eliza Lee, Hong Kong Michael Meyer, Vienna Myles McGregor-Lowndes, Brisbane Mariko Nishimura, Tokyo Marta Reuter, Stockholm Mark Rosenman, Cincinnati, OH James Allen Smith, New York, NY Melissa Stone, Minneapolis, MN Simon Teasdale, Glasgow Isabel Vidal, Barcelona Filip Wijkström, Stockholm Naoto Yamauchi, Suita Dennis R. Young, Atlanta, GA ABSTRACTED/INDEXED IN Baidu Scholar · Cabells Journalytics · Chronos Hub · CNKI Scholar (China National Knowledge Infrastructure) · CNPIEC: cnpLINKer · Dimensions · DOAJ (Directory of Open Access Journals) · EBSCO (relevant databases) · EBSCO Discovery Service · EconBiz · ERIH PLUS (European Reference Index for the Humanities and Social Sciences) · Genamics JournalSeek · Google Scholar · IBR (International Bibliography of Reviews of Scholarly Literature in the Humani- ties and Social Sciences) · IBZ (International Bibliography of Periodical Literature in the Humanities and Social Sciences) · J-Gate · JournalTOCs · KESLI-NDSL (Korean National Discovery for Science Leaders) · Microsoft Academic · MyScienceWork · Naver Academic · Naviga (Softweco) · Norwegian Register for Scientific Journals, Series and Publishers · Primo Central (ExLibris) · ProQuest (relevant databases) · Publons · QOAM (Quality Open Access Market) · ReadCube · Research Papers in Economics (RePEc) · SCImago (SJR) · SCOPUS · Semantic Scholar · Sherpa/RoMEO · Summon (ProQuest) · TDNet · Ulrich‘s Periodicals Directory/ulrichsweb · WanFang Data · Web of Science: Emerging Sources Citation Index · WorldCat (OCLC) · Yewno Discover.
    [Show full text]
  • Arxiv:1803.10713V2 [Cs.DL] 6 Apr 2021 the Impact of Those Authors That Cite It
    CERN-TH-2018-066 Biblioranking fundamental physics (updated to 2021/1/1) Alessandro Strumiaa, Riccardo Torreb;c a Dipartimento di Fisica dell’Universit`adi Pisa, Italy b CERN, Theory Division, Geneva, Switzerland c INFN, sezione di Genova, Italy Abstract Counting of number of papers, of citations and the h-index are the simplest bibliometric indices of the impact of research. We discuss some improvements. First, we replace citations with individual ci- tations, fractionally shared among co-authors, to take into account that different papers and different fields have largely different average number of co-authors and of references. Next, we improve on citation counting applying the PageRank algorithm to citations among papers. Being time-ordered, this reduces to a weighted counting of citation de- scendants that we call PaperRank. We compute a related AuthorRank applying the PageRank algorithm to citations among authors. These metrics quantify the impact of an author or paper taking into account arXiv:1803.10713v2 [cs.DL] 6 Apr 2021 the impact of those authors that cite it. Finally, we show how self- and circular- citations can be eliminated by defining a closed market of Citation-coins. We apply these metrics to the InSpire database that covers fundamental physics, presenting results for papers, authors, journals, institutes, towns, countries, and continents, for all-time and in recent time periods. Contents 1 Introduction2 2 Ranking papers7 2.1 PaperRank . .7 2.2 PaperRank of papers: results . .8 2.3 PaperRank as the number of citations-of-citations . 10 2.4 Top-referred (recent) papers . 11 2.5 Paper metrics: correlations .
    [Show full text]
  • Visualizing Citation and Authorship Relations for Finding Reviewers
    1 ReviewerNet: Visualizing Citation and Authorship Relations for Finding Reviewers Mario Salinas, Daniela Giorgi, and Paolo Cignoni Abstract—We propose ReviewerNet, an online, interactive visualization system aimed to improve the reviewer selection process in the academic domain. Given a paper submitted for publication, we assume that good candidate reviewers can be chosen among the authors of a small set of relevant and pertinent papers; ReviewerNet supports the construction of such set of papers, by visualizing and exploring a literature citation network. Then, the system helps to select reviewers that are both well distributed in the scientific community and that do not have any conflict-of-interest, by visualising the careers and co-authorship relations of candidate reviewers. The system is publicly available, and it has been evaluated by a set of experienced researchers in the field of Computer Graphics. Index Terms—Scholarly data visualization, bibliometric networks, expert finding. F 1 INTRODUCTION HE number of digital academic documents, either newly laborations. Every member of the community has its own T published papers or documents resulting from digitiza- approach to reviewer finding, which usually involves bibli- tion efforts, grows at a very fast pace: the Scopus digital ographic research, and frequent visits to public repositories repository counts more than 70 million documents and like DBLP [8] and researchers’ home pages. In any case, one 16 million author profiles [1]; the Web of Science plat- has to confront possibly large collections of data to make form has more than 155 million records from over 34,000 decisions, and a user may easily get lost after following a journals [2]; Microsoft Academic collects about 210 million few links.
    [Show full text]
  • The Top 21 Free Online Journal and Research Databases
    The Top 21 Free Online Journal and Research Databases 1. CORE https://core.ac.uk/ is a multidisciplinary aggregator of open access research. It allows users to search more than 66 million open access articles. While most of these link to the full-text article on the original publisher's site, five million records are hosted directly on CORE. In addition to a straightforward keyword search, CORE offers advanced search options to filter results by publication type, year, language, journal, repository, and author. 2. ScienceOpen Functioning as a research and publishing network, https://www.scienceopen.com/ offers open access to more than 28 million articles in all areas of science. Although you do need to register to view the full text of the articles, registration is free. The advanced search function is highly detailed, allowing you to find precisely the research you're looking for. 3. Directory of Open Access Journals A multidisciplinary, community-curated directory, https://doaj.org/ (DOAJ) gives researchers access to high-quality, peer- reviewed journals. It has archived more than two million articles from 9,519 journals, allowing you to either browse by subject or search by keyword. The site was launched in 2003 with the aim of increasing the visibility of open access scholarly journals. Content on the site covers subjects from science to law to fine arts and everything in between. 4. Education Resources Information Center The https://eric.ed.gov/ (ERIC), of the Institution of Education Sciences, allows you to search by topic for material related to the field of education. Links lead to other sites, where you may have to purchase the information, but you can search for full-text articles only.
    [Show full text]