Using Test Data to Evaluate Rankings of Entities in Large Scholarly Citation Networks
Using test data to evaluate rankings of entities in large scholarly citation networks

by

Marcel Dunaiski

Dissertation approved for the degree of Doctor of Philosophy in the Faculty of Science at Stellenbosch University

Computer Science Division, Department of Mathematical Sciences, University of Stellenbosch, Private Bag X1, Matieland 7602, South Africa.

Promoters: Prof. W. Visser and Prof. J. Geldenhuys

April 2019

Declaration

By submitting this dissertation electronically, I declare that the entirety of the work contained therein is my own, original work, that I am the sole author thereof (save to the extent explicitly otherwise stated), that reproduction and publication thereof by Stellenbosch University will not infringe any third party rights, and that I have not previously in its entirety or in part submitted it for obtaining any qualification.

This dissertation includes five original papers published in peer-reviewed journals or books. The development and writing of the papers (published and unpublished) were my principal responsibility and, for each of the cases where this is not the case, a declaration is included in the dissertation indicating the nature and extent of the contributions of co-authors.

Date: April 2019

Copyright © 2019 Stellenbosch University
All rights reserved.

Declaration by the candidate:

With regard to the papers (published and unpublished) appended to this dissertation, the nature and scope of my contributions were as follows:

Nature of contribution                 Extent of contribution (%)
All aspects of producing the papers    100

The following co-authors have contributed to the papers (published and unpublished) appended to this dissertation:

Name               e-mail address       Nature of contribution       Extent of contribution (%)
Willem Visser      [email protected]    Supervisory and editorial    50
Jaco Geldenhuys    [email protected]    Supervisory and editorial    50

Signature of candidate: "Declaration with signature in possession of candidate and supervisors"
Date: 2018/12/03

Declaration by the co-authors:

The undersigned hereby confirm that

1. the declaration above accurately reflects the nature and extent of the contributions of the candidate and the co-authors to the papers (published and unpublished) appended to this dissertation,
2. no other authors contributed to the papers (published and unpublished) appended to this dissertation, and
3. potential conflicts of interest have been revealed to all interested parties and that the necessary arrangements have been made to use the material in the papers (published and unpublished) appended to this dissertation.

Signature                                                                   Institutional affiliation    Date
"Declaration with signature in possession of candidate and supervisors"    Stellenbosch University      2018/12/03
"Declaration with signature in possession of candidate and supervisors"    Stellenbosch University      2018/12/03

Abstract

Using test data to evaluate rankings of entities in large scholarly citation networks

M.P. Dunaiski

Computer Science Division, Department of Mathematical Sciences, University of Stellenbosch, Private Bag X1, Matieland 7602, South Africa.

Dissertation: PhD (Computer Science)

March 2019

A core aspect of the field of bibliometrics is the formulation, refinement, and verification of metrics that rate entities in the science domain based on the information contained within the scientific literature corpus.
Since these metrics play an increasingly important role in research evaluation, continued scrutiny of current methods is crucial. For example, metrics that are intended to rate the quality of papers should be assessed by correlating them with peer assessments. I approach the problem of assessing metrics with test data based on other objective ratings, provided by domain experts, which I use as proxies for peer-based quality assessments. This dissertation is an attempt to fill some of the gaps in the literature concerning the evaluation of metrics through test data. Specifically, I investigate two main research questions: (1) what are the best practices when evaluating rankings of academic entities based on test data, and (2) what can we learn about ranking algorithms and impact metrics when they are evaluated using test data? Besides the use of test data to evaluate metrics, the second continual theme of this dissertation is the application and evaluation of indirect ranking algorithms as an alternative to metrics based on direct citations. Through five published journal articles, I present the results of this investigation.

Opsomming

Die evaluering van rangordes van entiteite in groot wetenskaplike sitasienetwerke deur die gebruik van toetsdata

M.P. Dunaiski

Rekenaarwetenskap Afdeling, Departement van Wiskundige Wetenskappe, Universiteit van Stellenbosch, Privaatsak X1, Matieland 7602, Suid-Afrika.

Proefskrif: PhD (Rekenaarwetenskap)

Maart 2019

Kern werksaamhede in die veld van bibliometrika is die formulasie, verfyning en verifikasie van maatstawwe wat rangordes vir wetenskaplike entiteite bepaal op grond van die inligting bevat in die literatuur korpus van die wetenskap. Aangesien hierdie maatstawwe ’n al belangriker rol speel in die evaluasie van navorsing, is dit krities dat hulle voortdurend en noukeurig ondersoek word. Byvoorbeeld, maatstawwe wat veronderstel is om die gehalte van artikels te beraam, moet gekorreleer word met eweknie-assesserings. Ek takel die evaluasie van maatstawwe met behulp van toetsdata gebaseer op ’n ander tipe objektiewe rangorde (verskaf deur kenners in ’n veld), en gebruik dít om in te staan vir eweknie-assesserings van gehalte. Hierdie proefskrif poog om van die gapings te vul as dit kom by die evaluasie van maatstawwe met behulp van toetsdata. Meer spesifiek ondersoek ek twee vrae: (1) wat is die beste praktyke vir die evaluasie van rangordes vir akademiese entiteite gebaseer op toetsdata, en (2) wat kan ons leer oor die rangorde algoritmes en oor impak-maatstawwe wanneer ons hulle met die toetsdata evalueer? Buiten die gebruik van toetsdata, is daar ’n tweede deurlopende tema in hierdie proefskrif: die toepassing en evaluering van indirekte rangorde algoritmes as ’n alternatief tot maatstawwe wat direkte sitasies gebruik. Die resultate van my ondersoek word beskryf in vyf reeds-gepubliseerde joernaal artikels.

Preface

This is a dissertation by publication. It starts with a brief introduction to the topic of citation impact metrics, followed by five chapters that very briefly summarise the five papers, and concludes with a synopsis chapter. The five papers themselves are appended, for the convenience of the reader, to the end of the dissertation. The bibliography of each paper is self-contained; only the references cited in the introduction and the discussion sections of the dissertation are included in the dissertation's bibliography.
Contents

Declaration
Abstract
Opsomming
Contents
1 Introduction
   1.1 Bibliographic databases
   1.2 A primer on citation metrics
   1.3 Indirect citation metrics
   1.4 Metric normalisation
   1.5 Test data
   1.6 Evaluation measures
2 Evaluating indirect impact metrics using test data
3 How to evaluate rankings of academic entities using test data
4 Ranking papers
5 Ranking authors (part 1)
6 Ranking authors (part 2)
7 Discussion and conclusion
List of references
Appendix

Chapter 1

Introduction

Citation metrics constitute an essential tool in scientometrics and play an increasingly important role in research evaluation (Bornmann, 2017). A large number of metrics for rating papers and authors are proposed every year; these are often generalised and applied at aggregate levels to evaluate departments, institutions, or even countries. The paper by Mingers and Leydesdorff (2015) provides an excellent overview of the field of scientometrics, a review of citation impact metrics, and a discussion of how metrics are used for research evaluation and policy decisions.

An important responsibility of the research community is to continuously scrutinise current metrics and to validate whether they fulfil their intended purpose. According to Bornmann and Mutz (2015), empirical research should create or find situations in which a metric can fail to achieve its purpose; a metric should be regarded as only provisionally valid as long as such situations cannot be found or realised. For example, metrics that are intended to rate the quality of papers should be assessed by correlating them with peer assessments (Bornmann and Marx, 2015). However, collecting direct peer-assessed test data is time-consuming and expensive. We therefore use a proxy for this assessment: test data based on other ground truth provided by domain experts. More specifically, we collected five test data sets comprising researchers or papers that have won awards or similar accolades of achievement, which indicate their impact or influence in their respective academic fields. This forms the central theme of this dissertation, which focuses on paper and author ranking metrics and their evaluation with the use of appropriate test data.

In this chapter, and with the future chapters in mind, we give the reader a brief overview of citation metrics and their evaluation. At the same
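To make this evaluation setup concrete, the following minimal Python sketch illustrates the basic pattern; it is not taken from the dissertation, and the paper identifiers, the award set, and the average_precision helper are hypothetical. It ranks papers in a toy citation network by raw citation counts (a direct-citation metric) and scores the ranking against a set of award-winning papers using average precision, one possible evaluation measure for such test data.

```python
# Minimal sketch (illustrative only): evaluating a citation-count ranking
# against a test set of award-winning papers using average precision.

# Toy citation network: paper -> list of papers it cites (hypothetical IDs).
citations = {
    "p1": ["p2", "p3"],
    "p2": ["p3"],
    "p3": [],
    "p4": ["p2", "p3"],
    "p5": ["p1", "p2"],
}

# Direct-citation metric: count how often each paper is cited.
cited_counts = {p: 0 for p in citations}
for refs in citations.values():
    for target in refs:
        cited_counts[target] += 1

# Rank papers by descending citation count.
ranking = sorted(cited_counts, key=cited_counts.get, reverse=True)

# Hypothetical test data: papers that won a best-paper or similar award.
award_winners = {"p2", "p5"}

def average_precision(ranked, relevant):
    """Average precision of a ranked list against a set of relevant items."""
    hits, precision_sum = 0, 0.0
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / i
    return precision_sum / len(relevant) if relevant else 0.0

print("Ranking:", ranking)
print("Average precision:", round(average_precision(ranking, award_winners), 3))
```

The same pattern, with richer metrics, larger networks, and different evaluation measures, underlies the comparison of metric-induced rankings against expert-provided ground truth discussed in the following chapters.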