Over-Optimization of Academic Publishing Metrics Draft Article
Total Page:16
File Type:pdf, Size:1020Kb
Over-Optimization of Academic Publishing Metrics Draft Article Over-Optimization of Academic Publishing Metrics: Observing Goodhart’s Law in Action Michael Fire1,2* and Carlos Guestrin1 Abstract The academic publishing world is changing significantly, with ever-growing numbers of publications each year and shifting publishing patterns. However, the metrics used to measure academic success, such as the number of publications, citation number, and impact factor, have not changed for decades. Moreover, recent studies indicate that these metrics have become targets and follow Goodhart’s Law, according to which “when a measure becomes a target, it ceases to be a good measure.” In this study, we analyzed over 120 million papers to examine how the academic publishing world has evolved over the last century. Our study shows that the validity of citation-based measures is being compromised and their usefulness is lessening. In particular, the number of publications has ceased to be a good metric as a result of longer author lists, shorter papers, and surging publication numbers. Citation-based metrics, such citation number and h-index, are likewise affected by the flood of papers, self-citations, and lengthy reference lists. Measures such as a journal’s impact factor have also ceased to be good metrics due to the soaring numbers of papers that are published in top journals, particularly from the same pool of authors. Moreover, by analyzing properties of over 2600 research fields, we observed that citation-based metrics are not beneficial for comparing researchers in different fields, or even in the same de- partment. Academic publishing has changed considerably; now we need to reconsider how we measure success. Multimedia Links I Interactive Data Visualization I Code Tutorials I Fields-of-Study Features Table Keywords Science of Science; Scientometrics; Goodhart’s Law; Data Science; Big Data; Academic Publishing Metrics 1Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 2The eScience Institute, University of Washington, Seattle, WA *Corresponding author: fi[email protected] Contents 5 Discussion 13 6 Conclusions 17 Introduction1 7 Data and Code Availability 17 2 Related Work4 Acknowledgments 18 2.1 Changes in Publication Trends.............4 References 18 2.2 Success Metrics and Citation Trends.........4 A Supplementary Materials - Additional Results 20 3 Methods and Experiments5 A.1 Additional Results..................... 20 3.1 Datasets.............................5 The Microsoft Academic Graph (MAG) Dataset • The AMiner Dataset • The SCImago Journal Rank Dataset • The Join Dataset Introduction 3.2 Code Framework.......................5 In the last century, the academic publishing world has changed 3.3 Analysis of Publication Trends.............6 drastically in volume and velocity. The volume of papers has Paper Trends • Author Trends • Journal Trends • Field-of-Research increased sharply from about 174,000 papers published in Trends 1950 to over 7 million papers published 2014 (see Figure2). 4 Results7 Furthermore, the speed in which researchers can share and publish their studies has increased significantly. Today’s re- 4.1 Results of Paper Trends..................7 searchers can publish not only in an ever-growing number of 4.2 Results of Author Trends................ 11 traditional venues, such as conferences and journals, but also 4.3 Results of Journal Trends............... 12 in electronic preprint repositories and in mega-journals that 4.4 Results of Fields-of-Research Trends....... 13 provide rapid publication times [1]. Over-Optimization of Academic Publishing Metrics: Observing Goodhart’s Law in Action — 2/47 Over-Optimization of Academic Publishing Metrics: Observing Goodhart’s Law in Action — 3/47 Along with the exponential increase in the quantity of changes in academia at different levels (see Figure 1): published papers, the number of ranked scientific journals has increased to over 20,000 journals (see Figures 15 and S17), • Papers – We observed that on average papers became and the number of published researchers has soared (see Fig- shorter, yet other features, such as titles, abstracts, and ure S14). As part of this escalation, metrics such as the number author lists, became longer (see Figures4,9, S2, S5, S6, of papers, number of citations, impact factor, h-index, and and S7). Furthermore, both the number of references altmetrics are being used to compare the impact of papers, and the number of self-citations considerably increased researchers, journals, and universities [2, 3, 4, 5]. Using quan- (see Figures S9 and8). Moreover, the total number titative metrics to rank researchers contributes to a hypercom- of papers without any citations at all increased sharply petitive research environment, which is changing academic over time (see Figure 10). culture - and not in a positive direction [6]. • Authors – We noticed a sharp increase in the number of Studies suggests that publication patterns have changed as new authors. Moreover, we observed a negative relation a result of Goodhart’s Law, according to which, “When a mea- between researchers’ career ages and the number of sure becomes a target, it ceases to be a good measure” [6, 7]. publications, i.e., early career researchers tended to Goodhart’s Law, and its closely related Campbell’s Law [8], publish more quickly than those later in their careers influence many systems in our everyday life, including eco- (see Figure 12). Additionally, the average number of nomical [9], educational [8], and other decision-making sys- coauthors per author considerably increased over time tems [10]. As an example, Goodhart’s Law can be found in (see Figure 13). the NYPD’s manipulation of crime reports (the “measure”) in order to improve crime statistics (the “target”) [11]. Another • Journals – We observed a drastic increase in the number example is found in the educational system, revealing that of ranked journals, with several hundreds of new ranked when “test scores become the goal of the teaching process, journals published each year (see Figures 15, S17, and S18). they both lose their value as indicators of educational status In addition, we observed that journal ranking changed and distort the educational process in undesirable ways” [8]. significantly as average citations per document and Recent studies indicate that when measures become tar- SCImago Journal Rank (SJR) measures increased, while gets in academic publishing, the effectiveness of the measures the h-index measure decreased over time. Moreover, the can be compromised, and unwelcome and unethical behaviors percentage of papers with returning authors increased may develop, such as salami publications [12], ghost author- sharply in recent years (see Figure 19). For example, in ships [13], p-hacking [14], metrics manipulation [15], and Cell journal about 80% of all papers published in 2016 even faking of peer reviews [16]. included at least one author who had published in the If the influence of Goodhart’s Law on academia is indeed journal before, while in 1980 this rate stood at less than significant, then it should be possible to observe that academic 40%. entities, such as researchers and journals, will over-optimize their own measures to achieve a desired target. Similar to the • Fields of Research – We analyzed the properties of consequences of making test scores a target, chasing after cer- 19 major research domains, such as art, biology, and tain measures in the academic publishing world to desperately computer science, as well as the properties of 2600 sub- win the battle of “impact or perish”[7] can have undesirable domains, such photography, genetic diversity, and com- effects. puter animation. Our analysis revealed that different In this study, our main goal was to utilize new advances in domains had widely ranging properties (see Figures 21 data science tools to perform an in-depth and precise bottom- and S24), even within subdomains. For example, dif- up analysis of academic publishing over the decades. Our ferent subdomains of biology, and even subdomains of comprehensive analysis ranged from micro to macro levels as genetics, had surprisingly different average numbers of we studied individual researchers’ behaviors as well as behav- citations (see Figures S33, and S34). ioral changes within large research domains. Additionally, we These observations support the hypothesis that commonly wanted to uncover how and if Goodhart’s Law has changed used measures, such as the author’s number of papers, h-index, academic publishing. and citation number, have become targets. Researchers can To achieve our research goals, we developed an open- increase their average number of papers by writing shorter source code framework to analyze data from several large- papers with more coauthors. The h-index can be boosted by scale datasets containing over 120 million publications, with increasing the number of papers and self-citations. A higher 528 million references and 35 million authors,1 since the citation number is attained with longer titles and abstracts, as beginning of the 19th century. This provided a precise and full well as more references (see Section 4.1 and Figure S13). picture of how the academic publishing world has evolved. It is time to consider how we judge academic papers. In our analysis, we uncovered a wide variety of underlying Citation-based measures have been the standard for decades, 1The number of authors was estimated according to the unique full names but these measures are far from perfect. In fact, our study in the Microsoft Academic Graph dataset (see Section 4.2) shows that the validity of citation-based measures is being Over-Optimization of Academic Publishing Metrics: Observing Goodhart’s Law in Action — 4/47 compromised and their usefulness is lessening. Goodhart’s Additionally, many studies have focused on how pub- Law is in action in the academic publishing world. lication trends have changed over time, often focusing on The remainder of the paper is organized as follows: In specific geographical areas, various demographic character- Section2, we provide an overview of related studies.