Accepted for publication in the Journal of Informetrics

Validity of altmetrics data for measuring societal impact: A study using data from Altmetric and F1000Prime

Lutz Bornmann
Division for Science and Innovation Studies
Administrative Headquarters of the Max Planck Society
Hofgartenstr. 8, 80539 Munich, Germany.
E-mail: [email protected]

Abstract

Can altmetric data be validly used for the measurement of societal impact? The current study seeks to answer this question with a comprehensive dataset (about 100,000 records) from very disparate sources (F1000, Altmetric, and an in-house database based on Web of Science). In the F1000 peer review system, experts attach particular tags to scientific papers which indicate whether a paper could be of interest for science or rather for other segments of society. The results show that papers with the tag “good for teaching” do achieve higher altmetric counts than papers without this tag when the quality of the papers is controlled for. At the same time, higher citation counts are found especially for papers with a specifically science-oriented tag (“new finding”). The findings indicate that papers tailored for a readership outside the area of research can be expected to generate societal impact. If altmetric data are to be used for the measurement of societal impact, the question of their normalization arises. In bibliometrics, citations are normalized for the papers’ subject area and publication year. In a second analytic step, this study therefore explores a possible normalization of altmetric data. As the results show, there are particular scientific topics which are of special interest to a wide audience. Since these more or less interesting topics are not fully reflected in Thomson Reuters’ journal sets, a normalization of altmetric data should be based not on the level of subject categories, but on the level of topics.

Key words

altmetrics; bibliometrics; F1000; Twitter; societal impact

1 Introduction

In science policy it was assumed until the 1990s that society benefits most from a science which pursues research at a high level. Correspondingly, indicators were (and are) used in scientometrics, such as citation counts, which measure the impact of research on science itself. Since the 1990s, however, a trend can be observed in science policy no longer to simply assume that society benefits from a science pursued at a high level (Bornmann, 2012, 2013); it is now expected that the benefit for society be demonstrated. Organizations which fund research (such as the US National Science Foundation) now expect that supported projects lead to outcomes which are of interest not solely to science. For these organizations, the consequence for the peer review procedure is that not only the potential scientific yield of a project has to be assessed, but also its returns for other sections of society.

These days, scientific work is not assessed solely on the basis of peer review, but also with indicators. A good example of these quantitative assessments is university rankings (Hazelkorn, 2011). The most important indicators in this connection (not only in university rankings) are bibliometric indicators based on publications and their citations (Vinkler, 2010). The impact of research is generally measured with citations. Since the impact of one publication on another publication is measured here, citations measure the impact of research on research itself.
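As noted in the abstract, such citation counts are not compared raw in bibliometrics: they are normalized for the papers’ subject area and publication year. The following minimal sketch illustrates this kind of normalization; the record layout is a hypothetical illustration of the general idea, not the study’s actual data or pipeline.

```python
# Field- and year-normalization of citation counts: a paper's count is
# divided by the average count of all papers from the same subject area
# and publication year (hypothetical records, for illustration only).
from collections import defaultdict
from statistics import mean

papers = [
    {"id": "p1", "field": "Cell Biology", "year": 2010, "citations": 25},
    {"id": "p2", "field": "Cell Biology", "year": 2010, "citations": 5},
    {"id": "p3", "field": "Surgery", "year": 2010, "citations": 3},
    {"id": "p4", "field": "Surgery", "year": 2010, "citations": 9},
]

# Expected citation rate: the mean citation count of the paper's
# field/year cohort (its reference set).
cohorts = defaultdict(list)
for p in papers:
    cohorts[(p["field"], p["year"])].append(p["citations"])
expected = {key: mean(counts) for key, counts in cohorts.items()}

# Normalized score: a value above 1 means the paper is cited more
# often than the average paper of its field and publication year.
for p in papers:
    print(p["id"], round(p["citations"] / expected[(p["field"], p["year"])], 2))
```

Normalized in this way, citation scores become comparable across fields and publication years.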
Citations allow a determination as to whether research (for example in institutions or countries) is being pursued at the highest level on average or not. But citations cannot be used to measure the impact of research on other sections of society. This is why scientometrics has taken up the wish in science policy to measure the impact of research beyond the confines of science, and is seeking new possibilities for impact measurement (Bornmann, 2014). Societal impact assessments measure the (1) social, (2) cultural, (3) environmental, and (4) economic returns (impact and effects) from the results (research output) or products (research outcome) of publicly funded research (Bornmann, 2013). Currently the most favored procedure for measuring societal impact involves case studies, which, however, are seen as too time-consuming and therefore of limited practicability.

An attractive possibility for measuring societal impact is seen in altmetrics (short for alternative metrics) (Mohammadi & Thelwall, 2014). “Altmetrics refers to data sources, tools, and metrics (other than citations) that provide potentially relevant information on the impact of scientific outputs (e.g., the number of times a publication has been tweeted, shared on Facebook, or read in Mendeley). Altmetrics opens the door to a broader interpretation of the concept of impact and to more diverse forms of impact analysis” (Waltman & Costas, 2014, p. 433). An overview of various altmetrics may be obtained from Priem and Hemminger (2010). Twitter (www.twitter.com), for example, is the best-known microblogging application. This application allows the user to post short messages (tweets) of up to 140 characters. “These tweets can be categorized, shared, sent directly to other users and linked to websites or scientific papers … Currently there are more than 200 million active Twitter users who post over 400 million tweets per day” (Darling, Shiffman, Côté, & Drew, 2013). Priem and Costello (2010) define tweets as Twitter citations if they contain a direct or indirect link to a peer-reviewed scholarly article. These Twitter citations can be counted and assessed as an alternative metric for papers.

There are already a number of studies concerning altmetrics. An overview of these studies can be found in Bar-Ilan, Shema, and Thelwall (2014), Haustein (2014), and Priem (2014). Many of these studies have measured the correlation between citations and altmetrics. Since the correlations were often at a moderate level, the results are difficult to interpret: both metrics seem to measure something similar but not identical. The studies published so far cannot yet provide a satisfactory answer to the question of whether altmetrics are appropriate for the measurement of societal impact. That question is the reason for the present investigation.

In January 2002, a new type of peer review system was launched, in which about 5,000 Faculty members are asked “to identify, evaluate and comment on the most interesting papers they read for themselves each month – regardless of the journal in which they appear” (Wets, Weedon, & Velterop, 2003, p. 251). What is known as the Faculty of 1000 (F1000) peer review system is accordingly not an ex-ante assessment of manuscripts submitted for publication in a journal, but an ex-post assessment of papers which have already been published in journals. The Faculty members also attach tags to the papers; these indicate their relevance for science (e.g. “new finding”), but can also serve other purposes.
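The core comparison of the present study can be sketched in a few lines: given paper records carrying tags, F1000 recommendation scores, and tweet counts, the altmetric counts of tagged and untagged papers are compared within the same quality level. The record layout and the crude stratification below are illustrative assumptions, not the study’s actual statistical models.

```python
# Do papers carrying a given F1000 tag attract more tweets than papers
# without it, once paper quality (the F1000 recommendation score) is
# held constant? Hypothetical records and a simple stratified comparison;
# each stratum is assumed to contain both tagged and untagged papers.
from statistics import mean

papers = [
    {"tags": {"good for teaching"}, "f1000_score": 1, "tweets": 12},
    {"tags": {"new finding"}, "f1000_score": 1, "tweets": 3},
    {"tags": {"good for teaching"}, "f1000_score": 2, "tweets": 30},
    {"tags": {"new finding"}, "f1000_score": 2, "tweets": 8},
]

TAG = "good for teaching"

# Stratify on the quality score so that tagged and untagged papers
# are only compared within the same quality level.
for score in sorted({p["f1000_score"] for p in papers}):
    stratum = [p for p in papers if p["f1000_score"] == score]
    with_tag = [p["tweets"] for p in stratum if TAG in p["tags"]]
    without_tag = [p["tweets"] for p in stratum if TAG not in p["tags"]]
    print(score, mean(with_tag), mean(without_tag))
```

The study controls for paper quality in a more sophisticated way than this simple stratification; the sketch only fixes the idea of holding quality constant while comparing papers with and without a given tag.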
One such tag is “good for teaching”. Papers can be marked in this way if they represent a key paper in a field, are well written, provide a good overview of a topic, and/or are well suited as literature for students. Papers marked with this tag can be expected to have an impact beyond science itself (that is, societal impact), unlike papers without this tag. If altmetrics indicate a greater impact for papers with this tag than for those without, this would suggest that altmetrics measure societal impact. This study is essentially based on a dataset of papers (and their evaluations and tags from Faculty members) extracted from F1000 (see also Mohammadi & Thelwall, 2013). This dataset was extended with further bibliometric (e.g. citation counts) and altmetric (e.g. Twitter counts) data. The following sections compare altmetric counts with citation counts in order to investigate the differences between the two metrics in relation to tags and recommendations.

2 Methods

2.1 Peer ratings provided by F1000

F1000 is a post-publication peer review system for the biomedical literature (papers from medical and biological journals). This service is part of the Science Navigation Group, a group of independent companies that publish and develop information services for the professional biomedical community and the consumer market. F1000 Biology was launched in 2002 and F1000 Medicine in 2006. The two services were merged in 2009 and today constitute the F1000 database. Papers for F1000 are selected by a peer-nominated global “Faculty” of leading scientists and clinicians who then rate them and explain their importance (F1000, 2012). This means that only a restricted set of papers from the medical and biological journals covered is reviewed, while most papers are not (Kreiman & Maunsell, 2011; Wouters & Costas, 2012). The Faculty nowadays numbers more than 5,000 experts worldwide, assisted by 5,000 associates, who are organized into more than 40 subjects (further subdivided into over 300 sections). On average, 1,500 new recommendations are contributed by the Faculty each month (F1000, 2012). Faculty members can choose and evaluate any paper that interests them; however, “the great majority pick papers published within the past month, including advance online papers, meaning that users can be made aware of important papers rapidly” (Wets et al., 2003, p. 254). Although many papers published in popular and high-profile journals (e.g. Nature, New England Journal of