Popular Scientometric Analysis, Mapping... 10th International CALIBER 2015

Popular Scientometric Analysis, Mapping and Visualisation Softwares: An Overview

Ashok Kumar J Shivarama Puttaraj A Choukimath

Abstract: Measurement of scientific productivity is regarded as a main indicator for ascertaining the impact of research on the scientific community. Science mapping and visualisation analysis has been developed over time to showcase this impact. These methods help researchers to understand the structural, temporal and dynamic development of a discipline. The present paper provides a comprehensive overview of widely used softwares for scientometric analysis, mapping and visualisation.

Keywords: Scientometrics, Data Analysis, Data Mapping, Visualisation Softwares, Citation Databases

10th International CALIBER-2015, HP University and IIAS, Shimla, Himachal Pradesh, India, March 12-14, 2015. © INFLIBNET Centre, Gandhinagar, Gujarat, India

1. Introduction

Scholarly literature is now published at such a rate that it resembles a flood, and this volume exposes the weaknesses of current scientometric methods for evaluating it. A novel and promising approach is to examine and evaluate this large body of literature with scientometric analysis softwares such as Bibexcel, Pajek, CiteSpace, SAINT, Publish or Perish, Network Workbench, SITKIS, Vantage Point or Excel. This paper compiles a comprehensive list of the softwares available to date and assesses the potential value of the data analysis offered by each one. It also gives an overview of building and validating metrics drawn from citation data for Social Network Analysis (SNA), mapping and visualisation. These softwares are developed by experts to help manage highly specialised databases, to organise large-scale collected data in a way that can be updated frequently, and to support network analysis for mapping and visualisation.

2. Scientometric Analysis, Mapping and Visualisation Softwares

Scientometric softwares can be defined in the light of the definition of "Software" provided in the Online Dictionary of Library and Information Science (ODLIS) as follows: "A set of computer based programs, designed and developed to analyse citation based bibliographic data as input to perform specific tasks, i.e. structural analysis of scholarly communication, mapping of scientific research, creation of metrics based social maps, information representation and organisation, visualisation of research, micro level analysis (co-word, co-author, cited references, bibliographic coupling, co-citation) etc., as output."

A scientometric analysis, mapping and visualisation software enables its users to draw maps that visually represent scientific research based on citation data, and to study the structure, temporal development and dynamics of a subject discipline. Most of these softwares are based on modern algorithms, mathematical and statistical methods, graph theory, etc.
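Two of the micro level analyses named in the definition above, co-citation and bibliographic coupling, can be illustrated with a small sketch. It is not taken from any of the softwares reviewed here; the paper identifiers and reference lists are hypothetical toy data, and the function names are chosen only for this illustration. Co-citation counts how often two cited documents appear together in the same reference list, while bibliographic coupling counts the references that two citing documents share.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper maps to the set of references it cites.
references = {
    "P1": {"A", "B", "C"},
    "P2": {"A", "B"},
    "P3": {"B", "C", "D"},
}


def co_citation_counts(refs):
    """Count how often each pair of cited documents shares a reference list."""
    counts = Counter()
    for cited in refs.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


def bibliographic_coupling(refs):
    """Count the references shared by each pair of citing documents."""
    strengths = {}
    for p, q in combinations(sorted(refs), 2):
        shared = len(refs[p] & refs[q])
        if shared:
            strengths[(p, q)] = shared
    return strengths


cocit = co_citation_counts(references)
coupling = bibliographic_coupling(references)
```

With this toy data, documents A and B are co-cited twice (by P1 and P2), and papers P1 and P3 have a coupling strength of 2 because they share references B and C. Pair counts of this kind are the raw input that mapping tools then normalise and visualise.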

Figure 1: Scientometric Analysis, Mapping and Visualisation Softwares

2.1 Common Features of these Softwares

• Facilitate structural, temporal and dynamic analysis of a subject discipline.
• Facilitate and support mapping and visualisation of a discipline.
• Import input data from the data sources, with editing and cleaning of the acquired data.
• Help to execute metrics based evaluation of the data and to create maps and networks for visualisation.

2.2 Purposes

Scientometric analysis, mapping and visualisation softwares can be used for the following purposes:

1. Structural and temporal analysis of information and of the dynamics of scholarly communication.
2. Mapping of scientific research/subjects and building of metrics based social maps.
3. Application of modern science analysis, mapping and visualisation techniques and methods.
4. Information representation, organisation and network visualisation.
5. Micro level analysis such as:
i. Co-word analysis – keyword based approach of analysis
ii. Co-author analysis – authorship based approach of analysis
iii. Cited references – document based approach of analysis (bibliographic coupling, co-citation analysis, author bibliographic coupling, author co-citation, journal co-citation and journal bibliographic coupling).

3. Databases for Scientometric Analysis

Many bibliographic databases are widely used for scientometric analysis, among them Google Scholar, Web of Science, Scopus, Microsoft Academic Research and PubMed. Web of Science, Scopus, PubMed and Google Scholar are the major and most popular sources of bibliometric data for such analysis. Each of these databases has its own advantages and limitations.

3.1 Google Scholar (https://scholar.google.co.in)

Google Scholar is a freely accessible bibliographic database offered by Google. It allows researchers to create their own Google Scholar page using a Gmail account, listing an affiliation such as an academic institution, fields of interest and citations. Through its "cited by" feature, it provides access to abstracts of articles that have cited the article being viewed. Previously this feature was only found in Scopus and Web of Knowledge.

Coverage: Google Scholar Metrics currently covers articles published between 2009 and 2013 (both inclusive). The metrics are based on citations from all the articles that were indexed in Google Scholar in June 2014. This also includes citations from
articles that are not themselves covered by Scholar Metrics. Google Scholar includes the following items in order to avoid misidentification of the articles indexed in it:

a) Journal articles from websites that follow the Google Scholar guidelines.
b) Articles published in selected conferences in Computer Science and Electrical Engineering.
c) Preprints submitted to digital repositories, i.e. arXiv, SSRN, NBER and RePEc.

Google Scholar does not cover the following items:

a) Court opinions, patents, books and dissertations.
b) Publications with fewer than 100 articles published between 2009 and 2013.
c) Publications that did not receive any citations to the articles published between 2009 and 2013.

Features of Google Scholar:

• It allows users to search for literature available in digital or physical format, online or in their respective libraries.
• It indexes scholarly literature available in the form of full-text articles, technical reports, preprints, theses, books and selected scholarly web pages.
• For resources that require a prior subscription, it provides access only to the abstract and citation details.
• The most relevant results for the searched keywords are listed first, ranked by the author's ranking, the number of references linked to the item and their relevance to other scholarly literature, and the ranking of the publication in which the article appears.
• It keeps up with recent developments in subject areas.
• It allows an author to check who is citing his or her publications and to create an author profile/individual page.
• Google Scholar indexes peer-reviewed online journals, scholarly books and other non-peer-reviewed journals.

3.2 Web of Science (http://portal.isiknowledge.com)

The development of Web of Science (WoS) as a world-leading citation database is the result of the efforts of Eugene Garfield of ISI (the father of citation indexing), who launched the Science Citation Index (SCI). Web of Science (also known as Web of Knowledge) is an online subscription-based bibliographic citation database maintained by Thomson Reuters. It is a multidisciplinary database (science, social sciences, arts, humanities) with extensive coverage of research data.

Coverage: Data indexed in WoS cover the period from 1900 to the present. It is one of the largest discovery platforms, with complete records in every subject selected on the basis of their impact: more than 100 years of abstracts, over 90 million records covering 5,300 social science publications in 55 disciplines, more than 800 million cited references, and 8.2 million records across 160,000 conference proceedings. Web of Science comprises the following seven online databases:

i) Conference Proceedings Citation Index (CPCI): covers more than 160,000 conference titles in the sciences from 1990 onwards.
ii) Science Citation Index Expanded (SCI Expanded): covers more than 8,500 science journals from 1900 onwards.
iii) Social Sciences Citation Index (SSCI): covers more than 3,000 social science journals from 1900 onwards.
iv) Arts & Humanities Citation Index (AHCI): covers more than 1,700 arts and humanities journals from 1975. In addition, 250 major scientific and social sciences journals are covered.
v) Index Chemicus (IC): covers more than 2.6 million records from 1993 onwards.
vi) Current Chemical Reactions (CCR): indexes more than one million records from 1986 onwards. It also covers INPI archives from 1840 to 1985.
vii) Book Citation Index (BCI): covers more than 60,000 selected books from 2005.

3.3 SCOPUS (http://www.scopus.com)

Scopus is an English-language, subscription-based bibliographic citation database offered by Elsevier. It covers data from 1995 to the present and links to 55 million records. It is regarded as one of the largest abstract and citation databases of peer-reviewed literature in the form of journals, books and conference proceedings. It covers the world's research output in disciplines such as science, technology, medicine, social sciences, arts and humanities. Scopus incorporates various tools to track, analyse and visualise research. To maintain transparency in the selection of journals for inclusion in the database, Scopus has established an independent international Scopus Content Selection and Advisory Board consisting of subject librarians and scientists. Scopus covers a wider range of journals, but it is currently limited to recently published articles (published after 1995) as compared to the Web of Science and Google Scholar. Scopus also offers author profiles (containing details such as affiliations, publications and bibliographic data, references, and details of the citations received by the publications). The alerting features of Scopus allow registered users to track changes to a profile and facilitate calculation of an author's productivity index (h-index).

3.4 PubMed (www.ncbi.nlm.nih.gov/pubmed)

PubMed is a popular, freely accessible bibliographic database offered by the United States National Library of Medicine (NLM). PubMed comprises more than 24 million citations for life science and biomedical literature from the MEDLINE database. It includes science journals in the biomedical and life sciences, and online books. Citations may also include links to full text available from PubMed Central and publishers.

Content: PubMed covers only selected journals that comply with PubMed's scientific standards. It provides access to MEDLINE, selected records from Index Medicus, selected records of journals (such as Science, published by the American Association for the Advancement of Science; BMJ, published by the British Medical Journal Group; and Annals of Surgery, published by Lippincott Williams & Wilkins), and a collection of full-text books and other subsets of the NLM records. Medical Subject Headings (MeSH) are assigned to items before they are added to the MEDLINE database. Most PubMed records also contain links to the full-text articles; some of these are freely accessible from PubMed Central and its local mirror sites such as UK PubMed Central.

Information about the journals indexed in PubMed can be retrieved from the NLM Catalog. As on 28 July 2014, PubMed contained more than 24 million records, and about 500,000 new records are added each year. As on the same date, 13.1 million PubMed records were listed with abstracts and 14.2 million articles had full-text links (of which 3.8 million full-text articles are freely accessible). PubMed offers simple and advanced search windows.

4. Popular Scientometric Analysis, Mapping and Visualisation Softwares: A Brief Profile

4.1 BibExcel (https://bibliometrie.univie.ac.at/bibexcel/)

BibExcel is free software designed by Olle Persson, Department of Sociology, Umeå University, Umeå, Sweden, to assist the user in analysing bibliographic data, or any textual data formatted in a similar manner. The idea is to generate data files that can be imported into Excel, or any program that takes tabbed data records, for further processing. The toolbox includes various tools; some are visible in the window and some are hidden in the menus. Many of the tools can be used in combination to achieve the desired result.

Bibexcel features:

i. Able to do most types of bibliometric analysis (i.e. co-citation, bibliographic coupling, mapping and clustering analysis).
ii. Allows easy interaction/interoperability (crosswalk) with other software, e.g. Pajek, Excel, SPSS, etc.
iii. BibExcel's strength is its high degree of flexibility in managing data and analysis.
iv. Can use data sources other than Web of Science, and can in fact deal with data other than bibliographic records.
v. Able to import many different types of data.

4.2 CiteSpaceII (http://cluster.cis.drexel.edu/~cchen/citespace)

CiteSpace was developed by Dr. Chaomei Chen, Professor of Informatics at the College of Computing and Informatics, Drexel University, Philadelphia, USA, for progressive knowledge domain visualisation and for analysing trends and patterns in scientific literature. It is free Java based software (it runs on the Java Runtime) used for structural and temporal analyses of networks derived from publication data, such as collaboration networks, author co-citation networks and document co-citation networks.

Data Source: The major data source for CiteSpace input is the ISI WoS database; it also supports data from PubMed, arXiv, ADS (http://adswww.harvard.edu/) and NSF Award Abstracts.

Features: It performs various functions that facilitate the understanding and interpretation of visualisations, network patterns and historical patterns, i.e. identification of the fastest growing topics, finding citation hotspots in a particular subject domain, fragmentation of a publication network into clusters, automatic labelling of clusters with terms from citing articles, and geospatial patterns of collaboration based on Google Earth.

4.3 Gephi (http://gephi.github.io)

Gephi is an interactive open graph visualisation and exploration platform for the analysis of all types of networks, complex systems, and dynamic and hierarchical graphs. It is developed by a dedicated team of young engineers and researchers in computer science, web mining, network science/complex networks and information visualisation, presently led by Mathieu Bastian, CTO.

Features: Gephi is available under the GNU General Public License and runs on Windows, Linux and Mac OS X. Gephi is used to reveal publication patterns and trends. It uses a 3D render engine to display large graphs in real time and to speed up exploration. Gephi is built on combined functionalities and a flexible architecture to explore, analyse, spatialise, filter, cluster, manipulate and export all types of networks. Gephi can be used for exploratory data analysis, social network analysis, biological network analysis, poster creation, etc.

4.4 HistCite (http://interest.science.thomsonreuters.com/forms/HistCite)

The HistCite software package was developed by Eugene Garfield, the father of the citation index. It is used for bibliometric analysis and information visualisation. HistCite runs on Windows computers with Internet Explorer, and a trial version is available free of cost. HistCite is designed to identify the significant (most cited) papers retrieved in topical searches of the Web of Science.

Features: As bibliometric analysis and information visualisation software, it performs the following core functions:

i. Converts bibliographies into diagrams (historiographs).
ii. Uses the bibliographic information (titles, authors, dates, author addresses, references, etc.) that describes published items to perform bibliometric analysis.
iii. Measures and studies various aspects of a specific scholarly knowledge domain.
iv. Analyses the literature published in a particular knowledge domain.
v. Identifies the time and geographic region of publication.
vi. Identifies prolific countries/major contributors to a particular subject.
vii. Identifies the languages frequently used for publications.
viii. Identifies the important journals in a particular subject.
ix. Identifies the key authors, institutions and articles in a particular subject area.
x. Measures authors' impact on each other.

4.5 Network Workbench (http://nwb.cns.iu.edu/)

Network Workbench (NWB), developed at Indiana University, USA, is another free Java based network analysis, modelling and visualisation toolkit. It is designed for handling large-scale data in biomedical, social science and physics research. It includes specific features for the execution of bibliometric studies.

Features

i. NWB supports the design, evaluation and operation of a unique distributed, shared-resources environment for managing large-scale data for network analysis, modelling and visualisation.
ii. NWB supports network science research.
iii. NWB users can access or upload major network datasets.
iv. Effective algorithms make NWB able to perform large-scale network analysis.
v. NWB lets its users generate, run and validate network models to study the structure and dynamics of a network efficiently.
vi. NWB provides advanced visualisation and analysis features for specific network analyses.
vii. Researchers have access to validated algorithms that were developed in the past through time-consuming individual efforts.

4.6 NodeXL (http://nodexl.codeplex.com/)

Created by Marc Smith and his team at the Social Media Research Foundation, NodeXL is free open-source software designed for Microsoft Excel 2007 and 2010 that makes it easy to explore network graphs. The NodeXL graph gallery gives access to community-generated network graphs. Different fonts can be set for edge, vertex and group labels. NodeXL has a built-in auto-update feature.

Features

i. Flexible import and export of graphs in GraphML format; it also supports the formats of softwares such as Pajek and UCINET, and matrix formats.
ii. NodeXL connects directly to personal social networks, i.e. YouTube, Flickr and email. It uses several plug-ins to get information on personal networks from Facebook, Exchange, wikis and WWW hyperlinks, etc.
iii. A zoom and scale option is available to reduce clutter in the graph and focus on areas of interest.
iv. Flexible layout is possible, using force-directed algorithms to lay out the graphs or dragging vertices with the mouse.
v. Network appearance can easily be adjusted by filling in worksheet cells that set the colour, shape, size, label and opacity of individual vertices.
vi. Dynamic filtering with a set of sliders instantly hides vertices and edges. NodeXL can also automatically group the vertices of a graph by analysing their common attributes.
vii. NodeXL has advanced graph metrics; it easily calculates degree, betweenness centrality, closeness centrality, eigenvector centrality, PageRank, graph density, etc.
viii. NodeXL can perform a set of repeated tasks with a single click.

4.7 Pajek (http://vlado.fmf.uni-lj.si/pub/networks/pajek/ and http://pajek.imfm.si/doku.php)

Pajek was developed by Vladimir Batagelj and Andrej Mrvar (some procedures were contributed by Matjaž Zaveršnik). Pajek is a freely available, non-commercial software program that runs on Windows. It can execute various kinds of network analysis and visualisation. Bibexcel may be used to format input data for Pajek.

4.8 Publish or Perish (http://www.harzing.com/pop.htm)

Publish or Perish (PoP) is a free software program that retrieves and analyses academic citations from Google Scholar to measure research impact. PoP can calculate various citation based metrics and indexes. It retrieves citations of publications via Google Scholar and Microsoft Academic Search for metrics based analysis of citations. PoP was developed by Anne-Wil Harzing, Research Professor and Research Development Advisor at the ESCP Europe Business School. It is designed to evaluate the impact of research work not covered in the ISI.

Features

i. Publish or Perish is able to perform the following metrics based analyses:
ii. Total number of papers and total number of citations in a particular subject.
iii. Average citations per paper, citations per author, papers per author, and citations per year.
iv. Calculation of Hirsch's h-index and related parameters.
v. Calculation of Egghe's g-index.
vi. Calculation of the contemporary h-index.
vii. Calculation of three variations of individual h-indices.
viii. Calculation of the average annual increase in the individual h-index.
ix. Calculation of the age-weighted citation rate.
x. Calculation and analysis of the number of authors per paper.

4.9 R-Project (http://www.r-project.org/)

'R' was initially written by Robert Gentleman and Ross Ihaka of the Department of Statistics, University of Auckland. R is a language and environment for statistical computing and graphics. It is a GNU project that provides a wide variety of statistical techniques (linear and non-linear modelling, classical statistical tests, time-series analysis, classification, clustering, etc.) and graphical techniques, and it is highly extensible.

Features

i. One of R's greatest strengths is the ease with which well-designed, publication-quality plots can be produced, including mathematical symbols and formulae where needed.
ii. R is available as open source software and supports a wide variety of UNIX based platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.
iii. R is an integrated suite of software facilities for data manipulation, calculation and graphical representation. It includes:
• A data handling and storage facility.
• A set of operators for calculations on arrays, in particular matrices.
• A large, coherent, integrated collection of intermediate tools for data analysis and for graphical display, whether on-screen or on hardcopy.
• A programming language that incorporates conditionals, loops, user-defined functions and input and output facilities.

4.10 Science Assessment Integrated Network Toolkit (SAINT) (http://www.rathenau.nl/en/themes/theme/project/bibliometric-software-tools/saint.html)

SAINT is a fully integrated software developed by the Rathenau Institute, The Hague, Netherlands, for handling large-scale data for bibliometric and patentometric research. It is one of the few packages that can also convert ISI data into a relational database (dbm, accdb or sql files). SAINT is available for download from the official website. The source code is also available under an open source licence for testing and improving SAINT. The toolkit is easy to use and makes research easier and more efficient. Issues related to SAINT can be discussed with community members via the discussion forum.

The ready-to-use version 1.0 includes:

i. A parser program for downloaded Web of Knowledge bibliographic data.
ii. A program for splitting sentences into a database of separate words (useful in title or abstract level analysis).
iii. A tool for transforming a database table or query into a matrix format that can be processed by, e.g., Pajek for visualisation.

4.11 Sitkis (https://sites.google.com/site/sitkisbibliometricanalysis/)

Sitkis is a free Java and MS Access based bibliometric software tool that assists researchers in the computation involved in analysing and evaluating scientific information. Sitkis was developed exclusively for bibliometric analysis. It provides tools for highly streamlined analysis of bibliometric networks. With Sitkis, one can process in a few minutes a large amount of data that could otherwise take days to analyse.

4.12 VOSviewer (http://www.vosviewer.com/Home)

VOSviewer is a free Java based program, primarily developed by Nees Jan van Eck and Ludo Waltman at the Centre for Science and Technology Studies (CWTS), Leiden University, used for analysing data and for constructing and visualising bibliometric networks. VOSviewer creates network maps of individual publications, co-authorship, co-citation networks, keyword co-occurrence networks, etc. For constructing and visualising co-occurrence networks of important terms extracted from a body of scientific literature, VOSviewer uses its inbuilt text mining function.

4.13 CitNetExplorer (http://www.citnetexplorer.nl/Home)

Like VOSviewer, CitNetExplorer is a free Java based software tool developed by Nees Jan van Eck and Ludo Waltman at the Centre for Science and Technology Studies (CWTS), Leiden University. It is used for analysing and visualising citation networks of scientific publications, for an improved understanding of the structure and dynamics of science communication. It supports direct data import from Web of Science for creating citation networks. Citation networks can be explored interactively.

Features of CitNetExplorer include:

i. Analysing the development of a research field over time.
ii. Identifying the literature on a research topic.
iii. Exploring the publication oeuvre of a researcher.
iv. Supporting literature reviewing.

Data Management

(a) Web of Science data import: data can be imported directly from the Web of Science database for network analysis.
(b) Pajek export: networks can be exported in the popular Pajek file format.
(c) Large networks: very large networks, including millions of publications and tens of millions of citation relations, are supported.
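Several of the tools above exchange networks through the Pajek file format. As a rough illustration of what such an export involves, the sketch below writes a small directed citation network in a simplified form of Pajek's .net layout (a "*Vertices" block of numbered, quoted labels followed by an "*Arcs" block of source, target and weight). The function name and the toy citation pairs are hypothetical, and this is not the export routine of CitNetExplorer, BibExcel or SAINT; real exports may carry additional attributes.

```python
def to_pajek_net(citations):
    """Serialise (citing, cited) pairs as a simplified Pajek .net string."""
    # Pajek numbers vertices from 1; labels are sorted for a stable numbering.
    labels = sorted({paper for pair in citations for paper in pair})
    index = {label: i for i, label in enumerate(labels, start=1)}

    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{index[label]} "{label}"' for label in labels]

    # Directed links go in the *Arcs block; each arc here has weight 1.
    lines.append("*Arcs")
    lines += [f"{index[src]} {index[dst]} 1" for src, dst in citations]
    return "\n".join(lines) + "\n"


# Hypothetical toy network: P3 cites P1 and P2, and P2 cites P1.
toy_citations = [("P3", "P1"), ("P3", "P2"), ("P2", "P1")]
net_text = to_pajek_net(toy_citations)
```

Writing `net_text` to a file with a .net extension gives a file that Pajek-compatible tools are expected to open, which is the kind of crosswalk between packages described in this section.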

4.14 Loet Leydesdorff (http://www.leydesdorff.net/)

Loet Leydesdorff provides a set of free DOS based software to analyse and evaluate bibliometric data obtained from sources such as Scopus, ISI and Google Scholar, for metrics based analyses such as co-authorship (international and institutional collaboration, collaboration networks), co-word analysis, co-citation and bibliographic analysis, and much more. These softwares are also helpful for preparing data for relational databases and for information visualisation by other visualisation tools such as Pajek.

4.15 SciMAT (http://sci2s.ugr.es/scimat/description.html)

SciMAT, which stands for Science Mapping Analysis software Tool, is a Java based open source science mapping software tool developed by the Sci2s research group at the University of Granada, Spain. SciMAT has been supported by a project of the Spanish Ministry of Education and Science. It was developed by M.J. Cobo, A.G. López-Herrera, E. Herrera-Viedma and F. Herrera, and incorporates methods, algorithms and measures for science mapping. It is free to use for science mapping and analysis. SciMAT stores bibliographic data in SQLite 3 format (amendable at any time) to carry out bibliometric analysis based studies. SciMAT uses several algorithms to edit data. Strategic diagrams, cluster networks and evolution areas are three techniques jointly used by SciMAT for visualisation. SciMAT has the following modules:

(a) a dedicated module for management of the knowledge base,
(b) science mapping and analysis modules, and
(c) a results and maps generation module for visualisation.

Features of SciMAT are:

i. Loaders for the ISI Web of Knowledge format and RIS format.
ii. Bibliometric networks: networks are based on co-word, co-citation and bibliographic coupling.
iii. Pre-processing: includes de-duplicating, time-slicing, data reduction and network reduction.
iv. Normalisation: association strength, equivalence index, inclusion index, Jaccard's index and Salton's cosine are used for normalisation.
v. Mapping: the simple centers algorithm and the single-linkage, complete-linkage, average-linkage and sum-linkage clustering algorithms are used for mapping.
vi. Analysis: network analysis, performance and quality analysis, and temporal analysis can be executed using SciMAT.
vii. Visualisation: strategic diagrams, cluster networks, overlapping maps and evolution maps can be created to visualise results.
viii. Report: outcomes are generated in HTML and LaTeX format.

4.16 VantagePoint (https://www.thevantagepoint.com/)

VantagePoint is a powerful commercial text-mining tool for discovering knowledge in search results from patent and literature databases. It has visualisation capabilities. VantagePoint is a 32-bit program compatible with Windows Vista, Windows 7 and Windows 8 platforms. It helps users to understand and work with search results retrieved from text databases.

Features of VantagePoint fall into the following five broad categories:

i. Importing – importing the data into VantagePoint and mining the raw data to get more information from it.
ii. Cleaning – transforming the data into a consistent set, combining the things that need to be analysed as a group, and merging and normalising data received from diverse sources. VantagePoint uses fuzzy matching techniques for data cleaning.
iii. Analysing – analysing the gathered data in a variety of ways.
iv. Reporting – preparing the end results for display.
v. Automating – encoding the entire process to make it consistently repeatable.

4.17 CoPalRed (http://ec3.ugr.es/copalred/)

CoPalRed is a specialised computer program developed by Rafael Bailón-Moreno, Department of Chemical Engineering, University of Granada, Spain. CoPalRed depends on databases (i.e. Web of Science, Scopus, Medline, etc.) for gathering information. This citation data is used for analysis, and CoPalRed transforms it into new knowledge that was not explicitly available before. The common errors that occur in bibliographic databases are corrected or cleaned by CoPalRed's self-cleaning system prior to any analysis. During analysis CoPalRed performs preliminary filtering of the data received and executes three types of fully automated analysis:

i. Structural analysis: shows the structural network formation of a particular knowledge domain.
ii. Strategic analysis: defines the relative position of an actor within the network by its intensity, external relations (centrality) and internal cohesion (density).
iii. Dynamic analysis: analyses the transformations of actors over time in a network. It identifies approaches, forks, appearances and disappearances of actors in a network.

CoPalRed's main modules are:

• Information capture module
• Information debugging module
• Knowledge base generation module
• Knowledge management module

4.18 IN-SPIRE™ (http://in-spire.pnnl.gov/)

IN-SPIRE™ Visual Document Analysis was developed by Pacific Northwest National Laboratory, US. It is licensed software, and the latest release is version 5.9. It has various tools for exploring textual data, tools for handling Boolean and topical queries, and tools for time and trend analysis. The suite of tools allows the user to rapidly discover hidden information relationships by reading only pertinent documents. IN-SPIRE has been used to explore technical and patent literature, marketing and business documents, web data, accident and safety reports, newswire feeds and message traffic, and more. It has applications in many areas, including information analysis, strategic planning and medical research. The core features of IN-SPIRE are (a) quick creation of meaningful visualisations and (b) the ability to explore and understand a large textual database without reading every record. IN-SPIRE™ supports multiple text file formats: it supports ASCII, UTF-8 and UTF-16 encodings and is also compatible with PDF, MS-Word, MS-Excel and RTF files, email, XML, HTML, RSS and spreadsheets.

4.19 UCINET (https://sites.google.com/site/ucinetsoftware/home)

UCINET 6 for Windows is a software package for social network analysis, developed by Lin Freeman, Martin Everett and Steve Borgatti, and published by Analytic Technologies, USA (http://www.analytictech.com/). It incorporates the NetDraw network visualisation tool. UCINET 6 can be downloaded free of cost for 90 days of use or can be purchased; students can purchase it at a discounted price. In addition, a number of free tools, including Anthropac, NetDraw and KeyPlayer, are also available. UCINET 6 is compatible with Windows (i.e. NT, 98, XP, etc.). It can be run on Mac or Linux via BootCamp, VMware Fusion, Parallels or Wine. It supports analysis of large datasets. UCINET 6 can handle a maximum network size of about 2 million nodes, but in practice most UCINET procedures are very slow on networks larger than about 5,000 nodes. However, the situation varies from network to network.

5. Conclusion

Scientometric researchers need to know about the various popular data analysis, mapping and visualisation softwares. Each of these softwares has specific characteristics that distinguish it from the others. Many of them run in the Java Runtime Environment (JRE). Most are available either free of cost or under open source licensing. Some are able to do scientometric analysis, some create maps and networks, while some specialise in information visualisation. It is very hard to choose any one software as the best. As many of these softwares are designed to capture input data from popular data sources such as Scopus, Web of Science, Google Scholar, PubMed and many others, large-scale data can now be analysed in less time. Many of the softwares have data convergence and compatibility features that support interoperability and crosswalk of input and output data. These scientometric analysis, mapping and visualisation softwares are now widely used by researchers across the globe.

References

1. Priem, Jason and Hemminger, Bradley M. (2010). Scientometrics 2.0: Toward new metrics of scholarly impact on the social Web. First Monday, Vol. 15 (7). Available at http://firstmonday.org/ojs/index.php/fm/article/viewArticle/2874/2570 (Accessed on 31/12/2014).

2. Online Dictionary for Library and Information Science. Software. Available at http://www.abc-clio.com/ODLIS/odlis_s.aspx (Accessed on 21/01/2015).

3. Santhanakarthikeyan, S. [et al.]. Scientometrics Study on Web: Tools and Techniques. International Journal of Education Research and Technology, Vol. 4 (1), pp. 40-45. Available at http://soeagra.com/ijert/ijertmarch2013/7.pdf (Accessed on 19/12/2014).

4. Google Scholar. Available at http://en.wikipedia.org/wiki/Google_Scholar. (Accessed on 05/01/2015).

5. GOOGLE SCHOLAR. Google Scholar. Available at https://scholar.google.co.in/intl/en/scholar/metrics.html#coverage. (Accessed on 05/01/2015).

6. WIKIPEDIA. Web of science. Available at http://en.wikipedia.org/wiki/Web_of_Science. (Accessed on 16/01/2015).

7. WEB OF SCIENCE. Web of Science. Available at http://wokinfo.com/citationconnection. (Accessed on 19/01/2015).

8. WIKIPEDIA. Scopus. Available at http://en.wikipedia.org/wiki/Scopus. (Accessed on 21/01/2015).

9. ELSEVIER. Scopus. Available at http://www.elsevier.com/online-tools/scopus. (Accessed on 16/01/2015).

10. NCBI. Pubmed. Available at http://www.ncbi.nlm.nih.gov/pubmed/. (Accessed on 21/01/2015).

11. WIKIPEDIA. Pubmed. Available at http://en.wikipedia.org/wiki/PubMed. (Accessed on 16/01/2015).

12. UNIVERSITY OF VIENNA. Bibexcel. Available at http://homepage.univie.ac.at/juan.gorraiz/bibexcel/index.html. (Accessed on 29/12/2014).

13. ASTROM, Fredrik. (2009). Celebrating Scholarly Communication Studies: Festschrift for Olle Persson at his 60th Birthday. E-Newsletter of the International Society for Scientometrics and Informetrics. Vol. 5, pp. 1-94. Available at http://www.soc.umu.se/digitalAssets/65/65934_ollepersson60.pdf. (Accessed on 29/12/2014).

14. CHEN, Chaomei. CiteSpace: visualizing patterns and trends in scientific literature. Available at http://cluster.cis.drexel.edu/~cchen/citespace/. (Accessed on 29/12/2014).

15. GEPHI. Gephi. Available at http://gephi.github.io/about/. (Accessed on 29/12/2014).

16. GARFIELD, Eugene. Historiograph Compilation HistCite Guide. Available at http://garfield.library.upenn.edu/histcomp/guide.html. (Accessed on 16/01/2015).

17. WIKIPEDIA. Histcite. Available at http://en.wikipedia.org/wiki/Histcite. (Accessed on 21/01/2015).

18. NWB Team. (2006). Network Workbench Tool. Indiana University, , and University of Michigan. Available at http://nwb.slis.indiana.edu. (Accessed on 19/12/2014).

19. SMITH, M. [et.al.]. (2010). NodeXL: a free and open network overview, discovery and exploration add-in for Excel. Available at http://nodexl.codeplex.com. (Accessed on 15/01/2015).

20. BATAGELJ, Vladimir [et.al.]. Pajek: Program for Large Network Analysis. Available at http://pajek.imfm.si/doku.php?id=pajek. (Accessed on 29/12/2014).

21. HARZING, A.W. (2007). Publish or Perish. Available at http://www.harzing.com/pop.htm. (Accessed on 14/01/2015).

22. R-PROJECT. The R Project for Statistical Computing. Available at http://www.r-project.org. (Accessed on 14/12/2014).

23. Rathenau Institute. The infrastructure of knowledge: Bibliometric software tools. Available at http://www.rathenau.nl/en/themes/theme/project/bibliometric-software-tools/saint.html. (Accessed on 17/12/2014).

24. SURULINATHI, M. Software. Available at http://smartlis.webnode.com/software. (Accessed on 14/01/2015).

25. VOSVIEWER. Vosviewer: visualising scientific landscape. Available at http://www.vosviewer.com/Home. (Accessed on 11/01/2015).

26. JAMALI, Hamid R. Scientometric portal. Available at https://sites.google.com/site/hjamali/scientometric-portal. (Accessed on 22/12/2014).

27. CITNETEXPLORER. Analysing citation pattern in scientific literature. Available at http://www.citnetexplorer.nl/Home. (Accessed on 08/01/2015).

28. COBO, M.J. [et.al.]. (2012). SciMAT: a new Science Mapping Analysis Software Tool. Journal of the American Society for Information Science and Technology, Vol. 63 (8), pp. 1609-1630.

29. VANTAGEPOINT. VantagePoint: Serious Software for Serious Professionals. Available at https://www.thevantagepoint.com. (Accessed on 14/01/2015).

30. EC3. CopalRed: Spanish version 1.0. Available at http://ec3.ugr.es/copalred. (Accessed on 16/12/2015).

31. WIKIPEDIA. Copalred. Available at http://es.wikipedia.org/wiki/Copalred. (Accessed on 15/01/2015).

32. PNNL. IN-SPIRE™ Visual Document Analysis. Available at http://in-spire.pnnl.gov. (Accessed on 12/01/2015).

33. BORGATTI, S.P. [et.al.]. (2002). Ucinet for Windows: Software for Social Network Analysis. Harvard, MA: Analytic Technologies. Available at https://sites.google.com/site/ucinetsoftware/home. (Accessed on 15/01/2015).

About Authors

Mr. Ashok Kumar, Library and Information Officer, School of Planning and Architecture, New Delhi.
Email: [email protected]

Dr. J Shivarama, Assistant Professor, Centre for Library and Information Management Studies at Tata Institute of Social Sciences, Mumbai.
Email: [email protected]

Mr. Puttaraj A Choukimath, Asst. Librarian (SS), with Sir Dorabji Tata Memorial Library of ‘Tata Institute of Social Sciences’, Mumbai.
Email: [email protected]
