Feburary 2020 Global Research Report The value of bibliometric databases: Data-intensive studies beyond search and discovery

Jonathan Adams, David Pendlebury and Martin Szomszor Author biographies

Jonathan Adams is Chief Scientist at David Pendlebury is Head of Martin Szomszor is Director at the the Institute for Scientific Information Research Analysis at ISI. Since 1983 Institute for Scientific Information (ISI). He is also a Visiting Professor he has used data to and has also held the role of Head at King’s College London, Policy study the structure and dynamics of of Research Analytics at ISI. He was Institute. In 2017 he was awarded research. He worked for many years named a 2015 top-50 UK Information an Honorary D.Sc. by the University with ISI founder . Age data leader for his work in creating of Exeter, for his work in higher With Henry Small, David developed the REF2014 impact case studies education and research policy. ISI’s Essential Science Indicators. database for the Higher Education Funding Council for England (HEFCE).

Foundational past, visionary future

The Institute for Today, as the ‘university’ of the • Carries out research to sustain, Scientific Information Web of Science Group, ISI both: extend and improve the knowledge base and • Maintains the foundational disseminates that knowledge ISI builds on the work of Dr. Eugene knowledge and editorial rigor upon to our colleagues, partners and Garfield – the original founder and which the Web of Science index and all those who deal with research a pioneer of information science. its related products and services in academia, corporations, Named after the company he are built. Our robust evaluation and funders, publishers and founded – the forerunner of the curation have been informed by governments via our reports Web of Science Group – ISI was research use and objective analysis and publications and at re-established in 2018 and serves as a for almost half a century. Selective, events and conferences. home for analytic expertise, guided structured and complete data in by his legacy and adapted to respond the Web of Science provide rich to technological advancements. insights into the contribution and value of the world's most impactful Our global team of industry- scientific and research journals. recognized experts focus on the These expert insights enable development of existing and researchers, publishers, editors, new bibliometric and analytical librarians and funders to explore approaches, whilst fostering the key drivers of a journal's value collaborations with partners and for diverse audiences, making academic colleagues across the better use of the wide body of global research community. data and metrics available.

ISBN: 978-1-9160868-7-6

2 Executive summary

In 2019, around 145,000 researchers The topical structure of such from, on average, 139 countries publications demonstrates the working across a diversity of research critical relevance of Web of Science disciplines interrogated the Web of literature to review health policy Science each day to explore research targets like cancer, women’s health information and discover key and cardiovascular disease, the literature to inform current research. management of medical outcomes, and the development of innovative In 1981 the Web of Science indexed methods and treatments. approximately 500,000 papers (substantive academic articles and Tracking such literature across time reviews) from 6,800 journals; this reveals the emergence of new expanded substantially to 2.5 million fields and helps inform the direction papers sourced from 21,300 journals of funding. Identifying frequent in 2019. This is a deep data resource authors in a select area reveals the for a wide range of analytic uses. distribution of expertise and supports comparative evaluation of activity There are, however, few studies of and outcomes for national policy and how the Web of Science is used as a institutional research management. bibliographic database other than for the purposes of search and discovery. Our analysis shows that Web of Science is the primary source of publication and citation data for the Our analysis majority of systematic research reviews across a broad range of disciplines and shows that Web about twice as many research management and evaluation studies as of Science is the any other source. Web of Science is the primary data source for such work primary source of in the USA, China, and most of western Europe. Countries where the Scopus publication and database was more frequently acknowledged include Iran and Italy. citation data for A key beneficiary of structured use of the majority of Web of Science bibliographic records are the biomedical researchers systematic who have an established and structured approach to accessing raw research reviews material for reviews that inform the development and current state of research topics that are critical to human health and disease control.

3 Introduction

The Web of Science was The numbers of papers in the Web of The focus of research management interrogated every day in 2019 Science, and the spread of research it interest is, however, rarely pitched at by around 145,000 researchers covers, has expanded hugely over the this categorical level. It is much more across an average of 139 countries, last few decades. In 1981 it indexed likely to be ‘topical’ at a finer grain, representing the full spectrum about 500,000 papers (substantive within or between established journal of research disciplines in the academic articles and reviews) every categories. Research is increasingly natural and social sciences and, year from around 6,800 journals. interdisciplinary, as the focus of increasingly, in the humanities. Today, it indexes about 2.5 million research topics often spans traditional papers from about 21,300 journals disciplinary structures and cutting- Researchers use the Web of annually. The index houses around 20 edge solutions to major challenges Science to initiate new research million papers from researchers based draw on multiple fields. This means plans and help frame the questions in the USA or about 27 million papers that the information needed by both they want to answer; and then to sourced from researchers in the researchers and those tasked with search for and discover the key European Union: a deep data resource selecting, funding and evaluating literature that helps to support and for a wide range of analytic uses. the outcomes of research projects inform their current research. is likely to come from targeted searches, analyses and reports, to Perhaps surprisingly, there are few be of wide interest across a field. studies of the ways in which the bibliographic database is used The numbers of The concentrated mass of the by the research community for database has many tales to other purposes than for search and papers in the Web tell. Drawing together related discovery. Pringle (2008) discussed publications in a structured way the rising use of bibliographic of Science, and the provides the raw material for reviews metadata in research evaluation that reveal the development and and Schnell (2018) documented spread of research current state of broader research the historical place of the Web topics. Tracking literature across of Science as the first citation it covers, has time reveals the emergence of index for data analytics and new fields and may help to direct . Most recently, Li, expanded hugely funding where it can be most usefully Rollins and Yan (2018) confirmed spent. Identifying the most frequent that the quantitative impact of over the last few authors in a select area reveals the Web of Science had not been the distribution of expertise and rigorously examined by scientific decades. supports comparative evaluation studies. They investigated the of activity and outcomes. ways in which this source was mentioned in a sample of 19,478 papers published between 1997- 2017 and analysed the distribution Publication records are structured across countries, institutions and in journal-based categories domains. This appears to have (originally designed to make been the first study to empirically searching easier) but now, are more investigate the documentation frequently employed as the basis on the use of the database. for comparative analytics. There are 254 detailed Web of Science journal In this report, we extend the studies categories (as of January 2020) and of Li et al (2018) and Schnell (2018) 21 broader categories in the Essential by focusing on the ways researchers Science Indicators. Citation analysis is are using Web of Science data to often carried out using the framework learn about the findings reported of Web of Science categories, where across the entire scientific and the close disciplinary relationship scholarly communication system between journals assigned to any and how this information can category means that the average support specific studies, especially citation count for all articles in that systematic reviews, and support category in a particular year is a better research management. meaningful global benchmark.

4 When, what and where?

Our first step in this report is to look there has been more usage than the at the numbers of papers (i.e. both ‘normal’ level of academic search articles and reviews) published and discovery that would be required The data shows how each year and indexed in the Web to produce a research paper. of Science that also reference, in rapid the growth their titles, abstracts and key-words, The data shows how rapid the growth the analysis of data from the Web of such studies has been – compared of such studies has of Science or other widely-used, to the overall growth of the underlying global publication databases. database. Whereas acknowledgments been, compared to of database source were rare in Over the period 1999 through to studies using bibliographic data in the the overall growth mid-2019, when we extracted these 1990s, numbers rose to around 1,000 data, we found 51,120 papers that papers per year by 2010 and growth of the underlying explicitly referenced the use of since then has sky-rocketed. There are one or more major bibliographic now over 10,000 papers per year that database. data sources. We assume that such use bibliographic data in some form an acknowledgment implies that for reviews and analytics (Figure 1).

Figure 1. Annual numbers of papers (articles and reviews) indexed every year in the Web of Science since 1981 and the count of papers that report that their content has drawn on bibliographic data sources for analyses. The data for 2019 cover only a part year (to September).

2,500,000 12,500

2,000,000 10,000 Count of papers analysing Web Science data

1,500,000 7,500

1,000,000 5,000

500,000 2,500 Count of papers annually indexed in the Web Science

0 0

1981 1983 1985 1987 1989 1991 1993 1995 1997 1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 2019

Web of Science Acknowledging 10,000 bibliographic data

5 The Web of Science is not the From a different perspective, Among articles, however, about only source for such analyses, and some 20,511 studies used Scopus twice as many report using Web indeed is predated by the Index data of which about 35% also used of Science data compared to the Medicus (1879) and Biosis (1926), Web of Science (Figure 2). number reporting Scopus data. This but the Science (the is likely linked to the nature of the forerunner of the Web of Science This is an interesting outcome. publication content and analyst’s index) is unquestionably the oldest Google Scholar poses problems for objectives, which we discuss citation database in the sciences. bibliographic use, including the need later in this report (Figure 3). Other important sources exist. for analysts and other research users These include the Scopus database, to de-duplicate and disambiguate the created in 2004 and managed by information. There is no information the publishing company Elsevier; on the scope of coverage. The benefit Where do the and the Google Scholar database, of Google Scholar is free access. publications which is not curated or indexed. Web of Science and Scopus are In order to provide comparability commercial and curated sources. come from? across disciplines, this analysis excludes subject-specific databases The long-established research such as PubMed, JSTOR and economies in the USA and Western DBLP and does not consider pre- Which types of Europe make up a substantial print indexes such as arXiv. publications use the part of the activity, joined by the Anglophone diaspora in Australia and Many prior studies report that they Web of Science? Canada, but by far the biggest single have drawn on more than one of these contributor is China. In the great sources. The overlapping circles of The majority are reviews, but it is also majority of these cases the primary the diagram in Figure 2 shows us popular with those researching more source of information is from the Web that of the 29,079 publications that typical academic articles. Data use varies of Science. Google Scholar is cited Web of Science publication between these two types of publications. frequently acknowledged in data, there were 7,212 (25%) that also Among reviews, around 20% more countries such as India and South drew on Scopus data and 1,488 (5%) chose Web of Science over Scopus as Africa. Countries where the of these also used Google Scholar. a source for their analysis and content. Scopus database was more frequently acknowledged include Iran and Italy. (Figure 4).

Figure 2. The numbers of studies published in academic journals indexed in Web of Science (1999-2019) that reported their use of one or more of the three principal bibliographic indexing databases.data sources for analyses. The data for 2019 cover only a part year (to September). 5,724 20,096 11,221

Web of Science 29,079

 Scopus 20,511 1,488 Google Scholar 14,079 2,078 1,771

8,742

6 Figure 3. Types of publications in academic journals indexed in the Web of Science (1999-2019) that reported their use of one or more of the three principal bibliographic indexing databases.

20,000

15,000

10,000

Count of published studies (1999-2018) 5,000

0 Articles Reviews Proceedings

WoS Scopus Scholar

Figure 4. Regional distribution of publications in academic journals indexed in the Web of Science (1999-2019) that reported their use of one or more of the three principal bibliographic indexing databases.

9,000

6,000

3,000 Count of published studies (1999-2018)

0

UK USA Italy Iran Brazil Spain India China France Japan Turkey Canada Australia Belgium Sweden Portugal Germany Denmark Netherlands Switzerland South Korea South Africa

WoS Scopus Scholar

7 What topics are included?

The rise in papers acknowledging Reviews In practice, it is difficult to see Web of Science data and other how an effective review could be bibliographic databases as a There are many approaches to implemented without an appropriately critical source tells us that the reviews of existing literature, comprehensive database. The utility of bibliographic data for typically for a select topic within reviewer has a range of search tools analysis, as well as for search and a discipline. Reviews have long available and should be confident that discovery, has become evident been identified as one of the the publications discovered come and valuable to many researchers. most important parts of the entire from authoritative sources. Those research corpus and some review items come from journals that have to However, we need to recognise that journals are amongst the most pass stringent editorial standards. It is the rise in acknowledgments may frequently cited serials in their therefore no surprise that the Web of also reflect a change in researcher field. Eugene Garfield, the founder Science appears as the most frequent behaviour. People turned to the of ISI, recognized the important source in review publications. Web of Science as the primary and role of reviews and reviewers natural source for any analysis of the in the scientific literature. He literature for many years but, since the noted that statements in reviews appearance of Google Scholar (beta were valuable for indexing and Research evaluation release in 2004) and then Scopus that nearly every statement was (2004), they have recognized a need referenced so the citations were, Garfield noted in Science in 1955 to include a variety of sources - often by extension, also indexing that citation counts might reflect the to assure others of the provenance ‘statements’ (Garfield 1976, 1982). impact a publication had on other and quality of their information. researchers (Garfield, 1955). Since Some reviews comprise of an then, the field of scientometrics has What are the characteristics of annotated list and precis of the expanded significantly and many bibliographic records that deliver recent literature - but others are academic groups in North America, those benefits? Two primary use truly systematic. A good review Europe and – more recently – cases stand out: systematic review summarises the results of studies Australasia and Asia now contribute and research evaluation. There are already refereed, accepted and to its development through studies variant approaches to both. published in the corpus and provides of the data and the development a high level of evidence on the of sophisticated descriptive and outcomes of prior work and the state comparative indicators. These have of existing knowledge. It will include been taken up by research managers judgements about the evidence and and policy analysts in national reports As in any study, inform recommendations for further and funding programme evaluation, work. Such a complex review may for which they are well suited. from any field, the pool data through meta-analyses to arrive at a better overview as As in any study, from any field, the identification and the authors assess the supportive identification and description of the and contradictory evidence. data sources used is paramount. There description of the will consequently be many papers To enable such a significant in disciplinarily specific journals and data sources used contribution, there is no better a spread across journals relevant to starting point than a curated, other disciplines where the costs is paramount. deeply structured, bibliographic and benefits of research assessment database that not only contains is a topic of interest. Because of the much of the material required for a increase in research investment review but also offers indexing and globally, the policy attention paid data enhancement - enabling the to methods and to comparative reviewer to rapidly filter, sift and outcomes means that the field has prioritize the material at hand. become regionally pervasive.

8 Methodology We also avoided seeking for balance across years within each topic. In To determine the topics that are the practice, each topic is likely to be targets for analysis, we examined strongly influenced by the content of the text in the titles, abstracts and the much more abundant data records keywords of 51,120 publications of the last few years (see Figure 1). extracted, from the Web of Science over the period 1999 to mid-2019. A We found that a target set of about 30 standard unigram topic-modelling topics produced a relatively balanced pipeline was used, filtering out terms set of clusters. A topic is an arbitrary that appear in more than 50% of the partition in the data records using corpus and those that appeared in thresholds of similarity. In this case we fewer than three publications. All used the frequency with which terms terms are converted to lowercase, are shared. The topics themselves resulting in a dictionary containing share these terms so we can build 36,095 words. The weighting for up a family tree that relates the each word (i.e. the number of times topics to one another. This is called it appeared in the title, abstract, and a dendrogram and it successively keywords) was used as input to the clusters the topics in pairs, groups Non-negative matrix factorization and families. Topics that cluster (NMF) algorithm to produce topic together are more similar in their use models for a specified number of of terminology than topics that are far topics. The process was executed apart in the dendrogram. in Python using standard packages from Scikit learn, a free software machine learning library for the Python programming language (Pedregosa et al., 2011). We found that We generated a range of models, a target set of each producing a different number of topics (ranging from 10 to 50) about 30 topics and evaluated them qualitatively, checking for coherence of the output produced a (i.e., did the model contain topics with words that are clearly related?) relatively balanced and granularity (i.e., how specific in terminology are they?). Although it set of clusters. is possible to produce topic models with very few, or a very large number of topics, the nature of the corpus typically dictates what are sensible choices for a given analysis. We Figures 5 and 6 display and show the found that a target set of topics that relationship between the topics we produced a useful result, in the sense identified. Each topic has a label that that the clusters were of a relatively is determined from an analysis of the similar size, contained clearly most frequent terms used by the set related terms and were sufficiently of papers in that topic. Labelling is specific to partition the publication indicative and proper interpretation dataset into identifiable clusters. requires informed review of the actual papers that form the topic. Labelling At the development stage we did can never be absolute: expert views not seek to test the relative diversity differ as to the precise nature of the of topic content, nor the degree to material in any group created in this which they drew on a few or many or any way; such views also tend to Web of Science journal categories. evolve as analysis proceeds.

9 Topics of interest The upper part of the tree covers A topic and global medical priorities (diabetes, document map Some topics clearly relate to current obesity, children & pregnancy), policy priorities in health; others are disease control and diagnosis about methodology; and others call out (Topic 19 indicates the rising We can also visualise the topics on a underpinning research. Some are tricky significance of plant-related map of the publication landscape, rather to interpret: Topics 2 and 5 are both pharmacology strategies), and a set than as the family tree of Figure 5. The evidently about genetics, particularly of topics related to the increasingly map includes each publication in our polymorphism and its relationship important areas of mental health dataset as an individual point, groups to patient-specific remedies, but an and management. Linked to this are them by their textual similarity, and expert is needed to understand why three Topics (3, 21 and 22) which pictures the relationship between topics the algorithm recognises two topics are about aspects of healthcare and the clusters, which the dendrogram rather than one. Topic 26 refers to social promotion and management. showed us diagrammatically. It gives media, but it should be interpreted Health research, data and analysis us further insights as to the relationship in the context of media as a source form a separate cluster (Topics 0, between these groups of documents. of information and misinformation, 7, 24 and 11). Standing next to the particularly about health. main cluster of medical and health Once we have located the topics topics is a separate cluster on earlier among the publication points in the The picture is dominated by papers stages of research in genetics map, we can add other information to (which are mainly reviews) related to and molecular biology associated reveal further patterns that help with clinical and health, and to with future health solutions. interpretation. In this instance we have areas of health research including both added a second version of the map healthcare, treatment and innovative The lower part of the diagram shows where the publications are colored research. This is unsurprising. The a quite distinct cluster of topics differently to identify those that are concept of the ‘systematic review’ is related to the education and research informetric (colored red, where titles well established in biomedical research. ecosystem. Topic 26 relates to social or abstracts contain keywords such First, because such research has life- media and scholarly communication as ‘scientometric’, ‘bibliometric’, critical outcomes so best practice is while Topic 18 is about learning and ‘informetric’), or reviews (colored a key requirement in enabling rapid skills in the context of healthcare green, when titles or abstracts and efficient progress and this state of training. Finally, publications that contain phrases such as ‘systematic knowledge is constantly under review. describe the use of Web of Science review’, ‘meta-analysis’, and ‘literature Secondly, because a very large share of data for research and innovation search’). This immediately reveals the public and commercial research funds management (Topic 1) and research differentiation between scientometric are invested in these areas, both the evaluation (Topics 12 and 27) are topics and the rest of the literature. management of current research and of the largest individual categories, A total of 4,852 papers were classified future investment depends on regular each with well over 2,000 associated as informetric, and a total of 36,957 and effective monitoring of outcomes. papers. These three topics also were classified as reviews. differ from the rest of the dataset The Cochrane database of systematic because the papers are mostly It is of interest to note that 435 were reviews (https://www.cochrane.org/ articles, not reviews. Overall, they classified as both informetric and about-us) is particularly important in this account for only a minority of the a review. These point to a growing regard and has been globally influential data use cases. This may be a surprise trend to apply bibliometric techniques in setting standards and demonstrating to researchers who believe that to enhance the systematic review value and utility. The use of multiple they are increasingly accountable methodology, for example, to identify databases to inform such reviews for and monitored in their research emerging research areas within has been studied: Bramer et al (2017) activity. Of course, these papers disciplines, to perform topic analysis of concluded: “Optimal searches in are from academics involved in the literature using the citation graphs systematic reviews should search scientometric research, describing or keyword analysis, and to inspect at least Embase, MEDLINE, Web of case studies that develop or illustrate trends in funding. These enhance Science, and Google Scholar as a analytical methodology, rather than understanding of how the research minimum requirement to guarantee from those carrying out evaluations, works, providing useful quantitative adequate and efficient coverage.” which are less often published. indicators to inform research strategy.

10 Figure 5. Dendrogram showing the similarity between the topics identified among 51,120 papers in the Web of Science that use a bibliographic database (publications from 1999-2019). The numbers of papers in each topic is shown to the right of the topic label. Colors indicate topics that cluster together with relatively high degrees of similarity.

17. Women's health

16: Diabetes

20: Nutrition

25. Pediatrics

13. Dietetics & obesity

15. Disease control strategies

14. Diagnostic medicine

19. Plant pharmacology

23. Dementia

9. Mental health

6. Sports science

10. Pain management

28. Sleep disorders

21. Healthcare practice

3. Mental health & social welfare

22. Health intervention

7. Clinical trials

0. Statistical analysis

24. Cardiovascular diseases

11. Surgical outcome analysis

5. Genetics 2

2. Genetics 1

29. Cytokines & inflammatory disorders

8. Morbidity & mortality

4. Soft tissue cancer

26. Social media

18. Learning, skills & healthcare

27. Scientometrics

12. Bibliometrics

1. Research & innovation policy

11 Figure 6. A topic map of 51,120 papers (1999-2019) that acknowledge the particular use of a bibliographic data source.

5. Genetics 2

2. Genetics 1

18. Learning, skills & healthcare

26. Social media 27. Scientometrics 28. Sleep disorders 21. Healthcare practice

3. Mental health & social welfare 1. Research & 22. Health innovation policy intervention 12. Bibliometrics 23. Dementia

15. Disease 19. Plant control strategies 9. Mental health pharmacology 17. Women's health 25. Pediatrics 29. Cytokines & 13. Dietetics inflammatory disorders & obesity 16: Diabetes

20: Nutrition 6. Sports science 7. Clinical trials 14. Diagnostic medicine 24. Cardiovascular diseases 10. Pain 0. Statistical analysis management 4. Soft tissue cancer

11. Surgical 8. Morbidity & mortality outcome analysis

Commentary: The map reveals the structure and makeup of the various research topics that make use of bibliographic databases. The individual topics are clustered using colors corresponding to those shown in the dendrogram (see Figure 5). The crimson cluster contains topics 18, 26, 1, 27 and 12: those that feature numerical approaches and statistical analysis. The green area (topics 21, 3, and 22) are those that relate to social welfare, often practitioner oriented, and are close to specific disease focused topics that form that larger teal cluster. The split purple cluster (topics 2,5 4 and 8) are spread over difference regions, with those in genetics appearing closer to the crimson cluster due to the analytical methodologies used.

12 Topic growth and category spread

The growth of specific topics subject-matter expert, but also from management and it is no surprise that over time broadly reflects the other areas related to supporting 572 (28%) of the 2,008 papers linked general pattern shown in Figure research and technologies. to this topic can be found in journals 1. It rises markedly in the years in the ‘Cardiac & cardiovascular after 2008 and much more There are 254 Web of Science journal systems’ Web of Science category, steeply in the last few years. categories - but more than half of accounting for close to half of all the the 51,120 papers discussed in this Cardiac journal papers (see Table The only topics that stand out as report were published in just 27 of 1), so the remainder are relevant to being different, (and then only by them (i.e. 11% of journal categories). and spread across other topics. degree rather than fundamental profile) are the scientometric topics Information Science accounts for The other three-quarters of the papers (12 and 27) where researchers began 1,378 (2.7%) of the total papers that linked to Topic 24 were drawn from to more frequently acknowledge acknowledge their use of bibliographic a huge diversity of journals. Though their use of specific databases earlier information. Computer Science most were grouped in expected than other topics. The subsequent (889) and Education Research areas of medical and related growth trajectory was then linear (476) are also among the more research, the spread of other journal rather than tracking the exponential frequently interrogated categories. categories including chemistry profile seen in biomedical topics. The rest are biomedical and health and engineering is an indicator of orientated with Oncology (2,104) the increasingly cross-disciplinary The topics created from the text and Public Health (2,020) standing nature of the cutting-edge of in the dataset of 51,120 papers’ out at the top of the table (Table 1). relevant research captured in an titles, abstracts, and keywords are a effective review. The completeness high-level view of the activity. These There is an expected pattern of and accuracy of the network of papers were published in a wide inclusion between specific Web of citation links in a properly curated diversity of journals and any one topic Science categories and the indicative bibliographic database becomes may be made up of components labelling of topics. For example, an essential part of the art for a not only from the discipline areas Topic 24 was identified as being topical review author. Tacit, expert that would be expected by a primarily about cardiovascular risk knowledge is no longer enough.

Table 1.

Before 2009- 2019 Web of Science journal category 2009 2013 2014 2015 2016 2017 2018 (part) Total

Grand total 1,166 7552 3,860 5,047 6,302 7,943 9,654 9,596 51,120

Oncology 28 330 283 205 263 266 370 359 2,104

Public, environmental & occupational health 40 286 160 193 237 331 390 383 2,020

Information science & library science 93 323 111 139 133 197 240 142 1,378

Dentistry, oral surgery & medicine 41 167 88 116 168 203 285 297 1,365

Pharmacology & pharmacy 45 256 92 110 171 188 246 243 1,351

Surgery 19 162 74 135 186 204 277 281 1,338

Nursing 24 162 80 110 159 197 226 252 1,210

Gastroenterology & hepatology 30 176 115 138 154 167 181 185 1,146

Endocrinology & metabolism 26 162 89 93 118 169 219 254 1,130

Cardiac & cardiovascular systems 27 164 91 112 132 168 209 161 1,064

13 Future use

The clusters of topics in Figure This outcome reflects two things: 5 are, as noted, dominated by first, the relative maturity of literature As evidence medicine/health (13 topics review as a tool underpinning related to disease, three to research progress in biomedicine; grows, so disease management, four second, the relative frequency and to medical data and five to pervasiveness of reviews as a part of researchers can basic biology research). The biomedical research management. remaining five cover education This has been driven by both the scale link back into the and research management. of investment and the policy priority underpinning that investment. citation network The clustered documents in all but the two topics in research In biomedicine, the review not of the Web of management could be largely only considers the literature but classified as reviews, whereas most also considers the detail of the Science to collate of the documents that acknowledge research outcomes. The data source Web of Science and other becomes an essential part of the the most impactful bibliographic databases and that audit trail where demand for best would be classified as articles are practices is also an effort to optimize material. in the research management area. results in regard to money spent. Research evaluation and policy, as Although reviews are also well well as management, is a key part established and of recognised of the support of scientometric significance and value in other subjects research by Web of Science (e.g. Annual Review of Condensed publication records and associated Matter Physics, Annual Review of data. In this area the Web of Science Materials Research) they are milestones data are the primary data source rather than a part of research and used far more often than any management. The raw material other. Eugene Garfield was the key captured in the Web of Science is figure in origin, early evolution and also digested and communicated underpinning through other routes, often in the ideas of scientometrics, and his world of ‘gray literature’ which includes SCI database, the fore-runner government and agency reports; of Web of Science, gave birth to and professional publications. An the field. His legacy continues to innovative model is the web-based inform the field globally today. rolling review, employed by climate change scientists in the ScienceBrief The reviews in the biomedical topics website sciencebrief.org/about where draw on a diversity of the Web of analyses of individual papers are Science categories (Table 1). This is written by scientists as a, “transparent, a reminder of the contribution made continuous, and rapid system for by a multiplicity of disciplines, but reviewing current knowledge”. As what is missing in the topic map is evidence grows, so researchers can any substantive component related link back into the citation network to the physical sciences, technology of the Web of Science to collate and the social sciences (Figure 6). the most impactful material.

14 References and background reading

Bramer, W. M., Rethlefsen, M. L., Kleijnen, J. Gurney, K. A. and Pendlebury, D. (2012). Funding and Franco, O.H. (2017). Optimal database acknowledgements in the journal literature: combinations for literature searches in systematic uses and benefits for funders, recipients, and reviews: a prospective exploratory study. analysts. Philadelphia, PA: . Systematic Reviews, 6, article number 245. Li, K., Rollins, J. and Yan, E. (2018). Web of Science Garfield, E. (1955). Citation indexes for science: use in published research and review papers a new dimension in documentation through 1997–2017: a selective, dynamic, cross-domain, association of ideas. Science, 122, 108-111. content-based analysis. Scientometrics, 115, 1–20.

Garfield, E. (1976). Significant journals Pedregosa, G. Varoquaux, A. Gramfort, V. of science. Nature, 264, 609-615. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, and V. Dubourg. (2011). Garfield, E. (1982). ISI’s “new” Index to Scikit-learn: machine learning in Python. Journal Scientific Reviews (ISR): applying research of Machine Learning Research, 12, 2825–2830. front specialty searching to the retrieval of the review literature. Current Contents, 39, 5-12. Pringle, J. (2008). Trends in the use of ISI citation databases for evaluation. Learned Publishing, 21, 85-9. Garfield, E. (1987a). Reviewing review literature: Part 1 - definitions and uses of Schnell, J. D. (2018). Web of Science: the first citation review. Current Contents, 18, 5-8. index for data analytics and scientometrics. In Research Analytics: Boosting University Productivity Garfield, E. (1987b). Reviewing review literature: and Competitiveness through Scientometrics. Part 2 – the place of reviews in the scientific Edited by Francisco J. Cantú-Ortiz. Boca Raton, FL: literature. Current Contents, 19, 5-10 CRC Press, pp. 15-29. ISBN-13: 978-1498785426

15 About the Global Research Report series from the Institute for Scientific Information (ISI)

Our Global Research Reports draw on our unique industry insights to offer insights, analysis, ideas and commentary to enlighten and stimulate debate. Each one in the series demonstrates the huge potential of research data to inform management issues in research assessment and research policy and to accelerate development of the global research base.

Download here: www.webofsciencegroup.com/isi

About the Web of Science Group

The Web of Science Group, a and analytical content and services Clarivate Analytics company, are built; it disseminates that organizes the world’s research knowledge externally through events, information to enable academia, conferences and publications and corporations, publishers and it carries out research to sustain, governments to accelerate the pace extend and improve the knowledge of research. It is powered by the base. ISI constantly reviews existing Web of Science – the world’s largest standards and works with public and publisher-neutral citation index and private sector partners to explore research intelligence platform. Its publication and citation data to create many well-known brands also include innovative indicators of value to Converis, EndNote, Kopernio, researchers and research managers. Publons, ScholarOne and the Institute for Scientific Information (ISI). For more information, please visit webofsciencegroup.com The ‘university’ of the Web of Science Group, ISI maintains the e: [email protected] knowledge corpus upon which the index and related information @webofscience

webofsciencegroup.com/isi

© 2019 Clarivate Analytics. ISI, Web of Science Group and its logo, as well as all other trademarks used herein are trademarks of their respective owners and used under license.

WS445086582 / 01