Bibliometric Impact Assessment With R and the CITAN Package

Marek Gagolewski (a,b)
E-mail: [email protected]

(a) Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland
(b) Faculty of Mathematics and Information Science, Warsaw University of Technology, pl. Politechniki 1, 00-661 Warsaw, Poland

Abstract

This paper introduces CITAN, the CITation ANalysis package for the R statistical computing environment. The main aim of the software is to provide bibliometricians with a tool for preprocessing and cleaning bibliographic data retrieved from SciVerse Scopus and for calculating the most popular indices of scientific impact.

To show the practical usability of the package, an example assessment of authors publishing in the fields of scientometrics and webometrics is performed.

Keywords: data analysis software, quality control in science, citation analysis, bibliometrics, Hirsch's h index, Egghe's g index, SciVerse Scopus.

This is a revised version of the paper: Gagolewski M., Bibliometric impact assessment with R and the CITAN package, Journal of Informetrics 5(4), 2011, pp. 678–692.

1. Introduction

The introduction of the h-index by J.E. Hirsch (2005) sparked an intensive research trend in the field of scientometrics. Numerous bibliometric impact indices, such as the g-index (Egghe, 2006b), the w-index (Woeginger, 2008b), or the R-index (Jin et al., 2007), are particular instances of a wide class of functions called aggregation operators (cf. Gagolewski and Grzegorzewski, 2010, 2011a,b). Such operators merge several numerical values into a single, representative one. They may be applied in many areas, such as engineering, statistics, economics, or the social sciences.
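To make the idea of an aggregation operator concrete, the following minimal R sketch computes the h-index and the g-index from a vector of per-paper citation counts. The function names `h_index` and `g_index` and the sample data are illustrative assumptions, not part of the CITAN interface. The g-index is assumed here to be bounded by the number of papers; some variants of the definition relax this.

```r
# Illustrative sketch only: h_index() and g_index() are hypothetical helper
# names chosen for this example, not functions exported by CITAN.

# h-index: the largest h such that at least h papers have >= h citations each.
h_index <- function(citations) {
  x <- sort(citations, decreasing = TRUE)
  # For a non-increasing sequence, x[i] >= i holds exactly for i = 1..h.
  sum(x >= seq_along(x))
}

# g-index: the largest g such that the g most cited papers have together
# received at least g^2 citations (assumption: g cannot exceed the number
# of papers).
g_index <- function(citations) {
  x <- sort(citations, decreasing = TRUE)
  k <- seq_along(x)
  ok <- cumsum(x) >= k^2
  if (any(ok)) max(which(ok)) else 0L
}

# Made-up citation counts for six papers by one author.
cites <- c(10, 8, 5, 4, 3, 0)
h_index(cites)  # 4
g_index(cites)  # 5
```

Both functions map a whole citation record to a single number, which is exactly the sense in which such indices aggregate several numerical values into one representative value.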