Finding Rising Stars in Co-Author Networks Via Weighted Mutual Influence
Total Page:16
File Type:pdf, Size:1020Kb
Finding Rising Stars in Co-Author Networks via Weighted Mutual Influence Ali Daud a,b Naif Radi Aljohani a Rabeeh Ayaz Abbasi a,c a Faculty of Computing and a Faculty of Computing and c Department of Computer Science, Information Technology, King Information Technology, King Quaid-i-Azam University, Islamabad, Abdulaziz University, Jeddah, Saudi Abdulaziz University, Jeddah, Saudi Pakistan Arabia Arabia [email protected] [email protected] [email protected] b b d Zahid Rafique Tehmina Amjad Hussain Dawood b Department of Computer Science b Department of Computer Science d Faculty of Computing and and Software Engineering, and Software Engineering, Information Technology, University International Islamic University, International Islamic University, of Jeddah, Jeddah, Saudi Arabia Islamabad, Pakistan Islamabad, Pakistan [email protected] [email protected] [email protected] a Khaled H. Alyoubi a Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia [email protected] ABSTRACT KEYWORDS Finding rising stars is a challenging and interesting task which is Bibliographic databases; Citations; Author Order; Rising Stars; Mutual being investigated recently in co-author networks. Rising stars are Influence; Co-Author Networks authors who have a low research profile in the start of their career but may become experts in the future. This paper introduces a new method Weighted Mutual Influence Rank (WMIRank) for finding 1. INTRODUCTION rising stars. WMIRank exploits influence of co-authors’ citations, The rapid evolution of scientific research has created large order of appearance and publication venues. Comprehensive volumes of publications every year, and is expected to continue in experiments are performed to analyze the performance of near future [21]. Online databases such as (DBLP or AMiner) WMIRank in comparison to baseline methods, which have provide large number of publications containing information 1 ignored weighted mutual influence. AMiner data for years 1995- about publication’s title, abstract, author, published year and 2000 is used for experiments. List of top 30 authors as per venues [17]. proposed and baseline methods are compared for their average number of papers, average number of citations and achievements. Informally, co-author relationships and citation links can be called Experimental results provide convincing evidence of the Academic Social Networks (ASNs). Formally, an academic social effectiveness of the investigated weighted mutual influence. network consists of people in research community where they share ideas, research publications, comments and ask questions. CCS CONCEPTS Several areas are well investigated for web databases, such as, expert finding / author ranking [2,5,6,19], author interest finding • Information systems~Link and co-citation analysis [4], research collaboration [9], citation content analysis [23], • Information systems~Retrieval models and ranking author name disambiguation [10,15,16,20], identifying influential • Information systems~Social networks entities [1,3,11,21], etc. However, finding rising-stars problem [7,8, 12,18] still needs to be explored. The rising stars are the scholars with expertise and capabilities to 1 https://aminer.org/ achieve high reputation in their respective fields in near future. Rising star finding in a specific domain is an emerging research © 2017 International World Wide Web Conference Committee (IW3C2), direction that may enable research communities to highlight published under Creative Commons CC BY 4.0 License. WWW 2017 Companion, April 3-7, 2017, Perth, Australia. potential researchers before time. ACM 978-1-4503-4914-717/04. Rising star finding emerges as a new research area and there is DOI: http://dx.doi.org/10.1145/3041021.3054137 little work done in this regard. Initially, this idea was first formulated by PubRank [12], PubRank only incorporates authors and papers mutual influence and static ranking of publication 33 venues. Dynamics of authors’ research profiles are modeled in 3. METHODS [18]. In this method, researchers are classified into four groups using unsupervised learning techniques. Later, PubRank is Before describing WMIRank, a brief introduction of existing enhanced by StarRank [7] which incorporates two new features: methods related to rising stars finding is presented. These methods author’s contribution based mutual influence and dynamic ranking are PubRank [12] and StarRank [7]. These methods are derived lists of publications. Afterwards, a prediction method is proposed from the famous Page Rank algorithm. The details of these [8], that applies machine learning techniques for finding rising methods are presented in Section 3.1 and 3.2. stars using publication, co-author and venue based features combination. However, author’s contribution based mutual 3.1 PubRank influence is not deeply explored in terms of co-author’s citations, PubRank is the first method derived for finding rising stars in order of appearance and publication venue’s citations. bibliographic networks [12]. It is based on two features, the mutual influence among researchers and track record of author’s In this paper, we propose WMIRank for finding outstanding publications in form of publishing in different levels of venues. researchers. It incorporates weighted mutual influence of co- Consider an example of two authors 퐴 and 퐴 with 4 and 3 author’s citations, order of appearance and publication venue’s 푘 푙 publications respectively. Both authors are co-authors of two citations. This weighted mutual influence enables us to effectively papers. The mutual influence weights between authors are find the outstanding researchers. calculated as follows: The main contribution of our work is summarized as follows. 1. It is the first attempt to consider co-author’s citations, co- W (Al , Ak) = (Al , Ak) ⁄PAk = 2⁄ 4 = 0.5 (2) author’s order of appearance and citations of co-author’s W (Ak , Al) = (Ak , Al) ⁄PAl = 2⁄ 3 = 0.66 publication venue. 2. Mathematical formulations for the computation of weighted Where, weight W (Al, Ak) describes influence of author Al on mutual influence. author Ak and PAl and PAk are the total number of publications by 3. Performance evaluation of proposed and baseline methods in authors 퐴푙 and 퐴푘. The weight W (Al, Ak) value is smaller than the terms of average number of papers and average number of weight W (Ak, Al) value, because author 퐴푘 has more publications citations. than author A . So, author 퐴 has more influence on author A . 4. Qualitative analysis of top ranked 30 authors in terms of their l 푘 l achievements. Then worth of scholar’s publications is computed on the basis of 5. The effect of parameters for the computation of ranks of reputation/prestige of publications’ venues. For the computation authors is examined and comparison of methods for ranking of publication quality score, venue of publication is considered i.e. position relocation is provided. publishing in high-level venue in the start of career shows scholar has bright chances to become future rising star. Long et al. This paper is organized as follows. The problem definition is suggests static ranking listings with following rank information. presented in section 2 followed by the brief review of two existing i.e. Rank 1 (premium), Rank 2 (leading), Rank 3 (reputable), methods and detailed description of proposed method in section 3. Rank 4 (unranked) [13]. Finally, the publication quality score for In section 4, dataset, performance evaluation, experiments to an author with a set P of publications is computed as follows. examine the accuracy and efficiency of the proposed method and results comparison with two baseline methods is provided. 1 |P| 1 λ (Ai) = ∑ (3) Finally, section 5 concludes this work. |P| i = 1 α (r(푝푢푏푖)-1) th Where, pubi is the i publication, r(pubi) is the rank of paper and 2. PROBLEM DEFINITION value of α is (0 < α < 1). The value of α is low for a low rank Formal problem definition of calculating rising star scores is paper. The larger λ (Ai) is, the higher the average quality of papers provided here. The ranking problem structure is quite similar to published by the researcher. Finally, PubRank is formulated as the famous page rank problem. follows. Given a set {퐴 ,…, 퐴 } of n authors, we have to calculate the 1 – d 1 푛 PubRank (A ) = + d . X (4) ranking score (rising star score) for each author by calculating i n |v| W (A , A ) . λ (A ) . PubRank (A ) different mutual influence values. Here, W = {푊1,…, 푊푛} be an n i j i j 푋 = ∑ |v| × n matrix, where Wij represents the mutual influence of author 퐴푖 j = 1 ∑k = 1 W (Ak , Aj) . λ (Ak) on author 퐴푗. Let Y be a vector representing ranking scores of n authors. Then ranking function is defined as follows. Where, n is the total number of authors, W (Aj, Ai) and λ (Ai) are influential weight and publication quality score respectively. 1- d |v| (Ai, A푗) y(Ai)= + d . ∑ |v| . y(Aj) (1) n j=1 ∑ (A , A ) k=1 k 푗 3.2 StarRank In StarRank, the order in which authors appear in papers is also Where, d is damping factor, v is the set of co-authors for author Ai considered with first author as maximum contributor [7]. Author and W(Ai, Aj) is the influential weight. The y(Ai) score is calculated for each author to rank the authors. with less contribution gets low score and with more contribution gets higher score. Based on this intuition, author contribution We propose three features i.e. co-author’s citations based mutual based mutual influence is calculated. It also recommends a influence, co-author order based mutual influence and co-author dynamic way of computing the publication venue score instead of venues’ citations based mutual influence for the computation of static ranking. Consider two authors Ak and Al with 4 and 3 rising star score of an author Ai.