Pre-print submitted to Scientometrics *Corresponding author’s email:
[email protected] A Decade of In-text Citation Analysis based on Natural Language Processing and Machine Learning Techniques: An overview of empirical studies Sehrish Iqbal,a Saeed-Ul Hassan,a* Naif Radi Aljohani,b Salem Alelyani,c, d Raheel Nawaz,e Lutz Bornmannf a Department of Computer Science, Information Technology University, 346-B, Ferozepur Road, Lahore, Pakistan;
[email protected],
[email protected] b Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia;
[email protected] c Center for Artificial Intelligence (CAI), King Khalid University, PO Box 9004, Abha 61413, Saudi Arabia;
[email protected] d College of Computer Science, King Khalid University, PO Box 9004, Abha 61413, Saudi Arabia;
[email protected] e Department of Operations, Technology, Events and Hospitality Management, Manchester Metropolitan University, United Kingdom;
[email protected] f Administrative Headquarters of the Max Planck Society, Division for Science and Innovation Studies, Hofgartenstraße, 8, 80539 Munich, Germany;
[email protected] *Corresponding author’s email:
[email protected] Abstract Citation analysis is one of the most frequently used methods in research evaluation. We are seeing significant growth in citation analysis through bibliometric metadata, primarily due to the availability of citation databases such as the Web of Science, Scopus, Google Scholar, Microsoft Academic, and Dimensions. Due to better access to full-text publication corpora in recent years, information scientists have gone far beyond traditional bibliometrics by tapping into advancements in full-text data processing techniques to measure the impact of scientific publications in contextual terms.