
Scientometrics as Big Data:

On integration of data sources and the problem of different types of classification systems

Henk F. Moed (The Netherlands)

With: Marc Luwel (Hercules Foundation, Brussels) and Cinzia Daraio (Univ La Sapienza, Rome)

OECD Workshop, Paris, 25 March 2014

Short CV Henk F. Moed

Years                  Position
1981-2009              Staff member at Centre for Science and Technology Studies (CWTS), Leiden Univ.
2009                   Full Professor of Assessment at Leiden University
2010 – Sept 2012       Elsevier, SciVal Dept. Senior Scientific Advisor
As from Sept 2012      Elsevier, AGRM Dept. Head of Informetric Research Group
As from July 2014      Elsevier (2 days/week) and visiting professor at academic institutions

Contents

1 Sciento/Informetrics as big data science

2 General trends and important topics

3 Need for concordance of different classification systems and for consistency

4 The need for standards in scientometrics

1 Sciento/Informetrics as big data science

[Figure: data sources that can be combined around the unit of assessment: journal articles; journal full text data; journal usage data; books; conference proceedings; trade journals; academic library catalogs; social media; newspapers; OECD research input data; scientific personnel information systems]

2 General trends and important topics

Trends

• Biblio/sciento/informetrics as big data science
• More (large) datasets electronically available
• Combination of large datasets
• More interest in research assessment, metrics
• Multi-dimensional approaches, integral views
• Linking components within a system
• Comparing, benchmarking
• Emphasis on accountability, productivity, societal impact

Important topics in research assessment

1. Institutional performance and rankings

2. Societal and technological impact

3. National/regional scientific development

4. From data to models

A bibliometric model for capturing the state of scientific development

(UNESCO Report on HE in SE Asia)

PRE-DEVELOPMENT: Low research activity without clear policy or structural funding of research.

BUILDING-UP: Collaborations with developed countries are established. National researchers enter international scientific networks.

CONSOLIDATION AND EXPANSION: The country develops its own scientific infrastructure. The amount of funds available for research increases.

INTERNATIONALISATION: Research institutions in the country start functioning as fully fledged partners, and increasingly take the lead in international collaborations.

Publications & Doctoral Enrolment

The number of publications generated within a country increases in almost linear fashion with doctoral enrolment. This suggests that doctoral students play a key role in the production of a country's publication output.

[Figure: # Publications vs. # Doctoral enrolments]

3 Need for concordance of different classification systems and for consistency (with Marc Luwel)

Need for concordance tables (Science related)

Type of analysis           Concordances needed
1  Output – Input          Journal (WoS, Scopus) – input stats (OECD); funding data (NSF)
2  Research – Teaching     Research – Teaching subject concordance tables
3  Science – Technology    Journal – classification
4  Science – Industry      Journal – industrial sector

OECD data on # FTE Research vs. Scopus # Authors

Country   OECD # FTE Res   OECD # FTE Res   Scopus # Authors   Ratio # authors /   Ratio # authors /
          All (2007)       Gov+HE (2007)    (2007)             # FTE Res All       # FTE Res Gov+HE
DEU       290,800          116,600          150,400            0.52                1.29
UK        254,600          159,100          154,600            0.61                0.97

Inconsistencies in data on # FTE Res?

Country   OECD # FTE Res   OECD # FTE Res   Scopus # Authors   Ratio # authors /   Ratio # authors /
          All (2007)       Gov+HE (2007)    (2007)             # FTE Res All       # FTE Res Gov+HE
DEU       290,800          116,600          150,400            0.52                1.29
UK        254,600          159,100          154,600            0.61                0.97
ITA       93,000           56,200           113,100            1.22                2.01
NLD       49,700           23,800           46,300             0.93                1.95

Differences in ratios Scopus / OECD between {ITA, NLD} and {DEU, UK}: almost a factor 2
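The ratios on this slide can be recomputed directly from the counts shown. The following is a minimal sketch, not the authors' own procedure: the function and variable names are illustrative, and the group comparison uses a simple unweighted mean of the country ratios, which is an assumption about how the "factor 2" remark was derived.

```python
# 2007 figures as listed on the slide:
# (country, OECD FTE researchers all sectors, OECD FTE researchers Gov+HE, Scopus authors)
ROWS = [
    ("DEU", 290_800, 116_600, 150_400),
    ("UK", 254_600, 159_100, 154_600),
    ("ITA", 93_000, 56_200, 113_100),
    ("NLD", 49_700, 23_800, 46_300),
]


def author_ratios(fte_all, fte_gov_he, authors):
    """Return (# authors / # FTE all sectors, # authors / # FTE Gov+HE)."""
    return authors / fte_all, authors / fte_gov_he


# Unrounded ratios per country.
raw = {c: author_ratios(a, g, s) for c, a, g, s in ROWS}

# Ratios rounded to two decimals, as displayed on the slide.
table = {c: (round(r_all, 2), round(r_gov, 2)) for c, (r_all, r_gov) in raw.items()}


def group_mean(countries, column):
    """Unweighted mean of one ratio column over a group of countries (assumed method)."""
    return sum(raw[c][column] for c in countries) / len(countries)


# How much higher are the ratios for {ITA, NLD} than for {DEU, UK}?
gap_all = group_mean(("ITA", "NLD"), 0) / group_mean(("DEU", "UK"), 0)
gap_gov_he = group_mean(("ITA", "NLD"), 1) / group_mean(("DEU", "UK"), 1)

for country, (r_all, r_gov) in table.items():
    print(f"{country}: {r_all:.2f} (all sectors), {r_gov:.2f} (Gov+HE)")
print(f"Gap between groups: {gap_all:.2f} (all sectors), {gap_gov_he:.2f} (Gov+HE)")
```

A gap this large between country groups is hard to attribute to publication behaviour alone, which is why the slide asks whether the underlying FTE counts are defined consistently across countries.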

4 The need for standards in scientometrics

Standards?

1 Mapping techniques: towards a winner like the Mercator projection? (with Felix Moya)

2 Indicators: depend upon object, aspect and objective of assessment

3 Need for user-oriented state-of-the-art reports on practices and methods, their pros and cons

4 Need for concordance tables and consistency of classification systems (C. Daraio project)
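What a concordance table does in practice can be sketched as follows. This is an illustration only: the category names, field names and publication counts are hypothetical placeholders, not taken from WoS, Scopus or OECD classifications, and `aggregate_by_field` is an invented helper.

```python
from collections import defaultdict

# Hypothetical concordance: journal subject category -> field used in input statistics.
CONCORDANCE = {
    "Journal category A": "Input field X",
    "Journal category B": "Input field X",
    "Journal category C": "Input field Y",
}

# Hypothetical publication counts per journal category (placeholder numbers).
publications = {
    "Journal category A": 120,
    "Journal category B": 80,
    "Journal category C": 50,
    "Journal category D": 10,  # deliberately missing from the concordance
}


def aggregate_by_field(counts, concordance):
    """Re-aggregate output counts to the input classification.

    Returns the totals per input field and the list of categories
    for which the concordance has no entry.
    """
    totals = defaultdict(int)
    unmapped = []
    for category, n in counts.items():
        field = concordance.get(category)
        if field is None:
            unmapped.append(category)
        else:
            totals[field] += n
    return dict(totals), unmapped


totals, unmapped = aggregate_by_field(publications, CONCORDANCE)
print(totals)
print("No concordance entry for:", unmapped)
```

Reporting the unmapped categories explicitly, rather than dropping them silently, makes gaps in the concordance visible, and such gaps are one source of the inconsistencies discussed in section 3.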