An H-Index Weighted by Citation Impact

Information Processing and Management 44 (2008) 770–780

Leo Egghe a,b, Ronald Rousseau a,b,c,*

a Hasselt University, Agoralaan, B-3590 Diepenbeek, Belgium
b Antwerp University, IBW, Universiteitsplein 1, B-2610 Wilrijk, Belgium
c KHBO (Association K.U.Leuven), Industrial Sciences and Technology, Zeedijk 101, B-8400 Oostende, Belgium

Received 27 March 2007; received in revised form 10 May 2007; accepted 11 May 2007
Available online 26 June 2007

Abstract

An h-type index is proposed which depends on the obtained citations of articles belonging to the h-core. This weighted h-index, denoted as $h_w$, is presented in a continuous setting and in a discrete one. It is shown that in a continuous setting the new index enjoys many good properties. In the discrete setting some small deviations from the ideal may occur.

© 2007 Elsevier Ltd. All rights reserved.

Keywords: h-Index; h-Type indices; Weighted h-index; Discrete and continuous approach; Power law model

1. Introduction

The h-index, also known as the Hirsch index, was introduced by Hirsch (2005) as an indicator for lifetime achievement. Considering a scientist's list of publications, ranked according to the number of citations received, the h-index is defined as the highest rank h such that the first h publications received each at least h citations. Although this idea can be applied to many source–item relations, we will mainly use the terminology of publications and citations as in Hirsch's original article.

For advantages and disadvantages of the h-index we refer to Hirsch (2005), Glänzel (2006) and Jin et al. (2007). In order to overcome some of these disadvantages scientists have proposed several 'Hirsch-type' indices with the intention of either replacing or complementing the original h-index.
Among these we mention Egghe's g-index (2006a, 2006b), Kosmulski's H(2)-index, Jin's A- and AR-indices (Jin, 2006, 2007) and the R-index (Jin, Liang, Rousseau, & Egghe, 2007). We recall the definitions of the most interesting among these proposals. For the g-index as well as for the H(2)-index one draws the same list as for the h-index. The g-index, on the one hand, is defined as the highest rank such that the cumulative sum of the number of citations received is larger than or equal to the square of this rank. Clearly $h \le g$. The H(2)-index, on the other hand, is k if k is the highest rank such that the first k publications received each at least $k^2$ citations. The H(2)-index will not be discussed further, as we do not think this is an interesting proposal.

* Corresponding author. Address: KHBO (Association K.U.Leuven), Industrial Sciences and Technology, Zeedijk 101, B-8400 Oostende, Belgium.
0306-4573/$ - see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.ipm.2007.05.003

Jin's A-index achieves the same goal as the g-index, namely correcting for the fact that the original h-index does not take the exact number of citations of articles included in the h-core into account. This index is simply defined as the average number of citations received by the publications included in the Hirsch core. Mathematically, this is:

$$A = \frac{1}{h} \sum_{j=1}^{h} y_j \tag{1}$$

In formula (1) the numbers of citations ($y_j$) are ranked in decreasing order. Clearly $h \le A$. The R-index, a correction on the A-index (Jin et al., 2007), is defined as:

$$R = \sqrt{\sum_{j=1}^{h} y_j} \tag{2}$$

The h-, A- and R-indices are related through the relation $R = \sqrt{A \cdot h}$.
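The discrete definitions recalled so far can be illustrated with a minimal sketch. The function names and the citation list are hypothetical and chosen only for illustration; the g-index is restricted here to ranks not exceeding the number of publications, which is an assumption of this sketch and not a statement taken from the article.

```python
import math

def h_index(citations):
    """Highest rank h such that the first h publications each have >= h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, y in enumerate(ranked, start=1):
        if y >= rank:
            h = rank
        else:
            break
    return h

def g_index(citations):
    """Highest rank g such that the top g publications together have >= g^2 citations.
    Assumption: g is not allowed to exceed the number of publications."""
    ranked = sorted(citations, reverse=True)
    g, cumulative = 0, 0
    for rank, y in enumerate(ranked, start=1):
        cumulative += y
        if cumulative >= rank * rank:
            g = rank
    return g

def a_index(citations):
    """Eq. (1): average number of citations in the h-core (0.0 if the core is empty)."""
    h = h_index(citations)
    core = sorted(citations, reverse=True)[:h]
    return sum(core) / h if h > 0 else 0.0

def r_index(citations):
    """Eq. (2): square root of the total number of citations in the h-core."""
    h = h_index(citations)
    core = sorted(citations, reverse=True)[:h]
    return math.sqrt(sum(core))

# Hypothetical citation counts for seven publications.
example = [10, 8, 5, 4, 3, 2, 1]
h = h_index(example)
g = g_index(example)
A = a_index(example)
R = r_index(example)
print(h, g, A, R)
```

For this hypothetical list the h-core consists of the four most cited publications, so the sketch also makes the relations $h \le g$, $h \le A$ and $R = \sqrt{A \cdot h}$ easy to check by hand.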
The AR-index, a refinement of the R-index which takes the age of the publications (denoted as $a_j$) into account, has been proposed by Jin (2007):

$$AR = \sqrt{\sum_{j=1}^{h} \frac{y_j}{a_j}} \tag{3}$$

The formulae shown in Eqs. (1)–(3) are those as defined for the discrete, practical case. They can also be defined in a general, continuous model. In this approach c(r) denotes the continuous rank-frequency function:

$$c : [0, T] \to [1, +\infty] : r \mapsto c(r) \tag{4}$$

The function c(r) is by definition a decreasing, but not necessarily strictly decreasing, positive function, with $c(T) = 1$ and $T > 1$. Assuming, as we do from now on, that $\int_0^T c(s)\,ds < +\infty$, the four previously mentioned Hirsch-type indices are defined in the continuous case as follows:

h is the unique solution of

$$r = c(r) \tag{5}$$

g is the unique solution of

$$r^2 = \int_0^r c(s)\,ds \tag{6}$$

(assuming that $\int_0^T c(s)\,ds \le T^2$); the g-index can also be characterized as the largest rank r such that $r^2 \le \int_0^r c(s)\,ds$;

$$A = \frac{1}{h} \int_0^h c(r)\,dr \tag{7}$$

and

$$R = \sqrt{\int_0^h c(r)\,dr} \tag{8}$$

In this article we focus on the fact that the h-index lacks sensitivity to performance changes and propose a citation-weighted h-index. As the construction of this new index is more elegant in the continuous case than in the discrete one, we first introduce it for the continuous model.

2. Construction of a citation-weighted h-index: continuous case

Theorem 1. Let h be the h-index of c(r). Then the equation

$$\frac{\int_0^t c(r)\,dr}{h} = c(t)$$

always has a unique solution. This unique solution, denoted as $r_0$, is called the w-rank of the given rank-frequency distribution.

Proof. Since c(r) is a rank-frequency function it is strictly positive. Define now

$$k(t) = c(t) - \frac{\int_0^t c(r)\,dr}{h}$$

The function k(t) is continuous and strictly decreasing on [0, T]. Indeed:

$$k'(t) = c'(t) - \frac{c(t)}{h} < 0.$$
We further see that $k(0) > 0$ and

$$k(T) = c(T) - \frac{\int_0^T c(r)\,dr}{h} \le 1 - \frac{T}{h} < 0,$$

as $h < T$. By the intermediate value theorem the function k takes all values on the interval $[k(T), k(0)]$. Hence there exists at least one value $r_0 \in [0, T]$ such that $k(r_0) = 0$. Consequently,

$$\frac{\int_0^{r_0} c(r)\,dr}{h} = c(r_0).$$

Uniqueness follows from the facts that c(t) is non-increasing and that the function $\int_0^t c(r)\,dr$ is strictly increasing.

Clearly, $r_0 \le h$, as $\frac{1}{h}\int_0^t c(r)\,dr$ is increasing in t and takes a value which is at least equal to h in h, while c(t) is non-increasing and $c(h) = h$ by definition. □

Definition (the continuous citation-weighted h-index). Let h be the h-index of c(r) and let $r_0$ be the unique solution of the equation

$$\frac{\int_0^t c(r)\,dr}{h} = c(t) \tag{9}$$

Then the weighted h-index, denoted as $h_w$, is defined as:

$$h_w = \sqrt{\int_0^{r_0} c(r)\,dr} \tag{10}$$

or, equivalently:

$$h_w = \sqrt{h \cdot c(r_0)} \tag{11}$$

3. Properties of the weighted h-index $h_w$

Theorem 2. If the h-index of the continuous rank-frequency function c(r) is h and if the restriction of c(r) to [0, h] is a constant function, then $h_w = h$ and $c(r)|_{[0,h]} = h$.

Proof. We know already that $r_0 \le h$. Let now $c(r) = C$ on [0, h]. Then the defining equation $\frac{1}{h}\int_0^{r_0} c(r)\,dr = c(r_0)$ becomes

$$\frac{\int_0^{r_0} C\,dr}{h} = C,$$

with $r_0 = h$ as its unique solution. Then

$$h_w = \sqrt{\int_0^h C\,dr} = \sqrt{C \cdot h}.$$

By definition we know that $h = c(h) = C$, hence $h_w = \sqrt{C \cdot h} = \sqrt{h \cdot h} = h$.

When the number of citations is the same for each r, articles at each (continuous) rank are weighted equally, hence it is natural that $h_w = h$. This simple relation is not true if the definition of the $h_w$-index does not contain a square root. Hence, this theorem explains why we introduced the extra mathematical operation of taking a square root. □

Theorem 3.
If the restriction of the continuous rank-frequency function c(r) to [0, h] is not the constant function c(r) = h, then the w-rank is strictly smaller than the h-index: $r_0 < h$ and $h \le h_w$.

Proof. Assume that $r_0 = h$. Then

$$\frac{\int_0^{r_0} c(r)\,dr}{h} = \frac{\int_0^{h} c(r)\,dr}{h} > c(h) = c(r_0),$$

as c is not the constant function c(r) = h on [0, h]. The inequality $\frac{1}{h}\int_0^{r_0} c(r)\,dr > c(r_0)$ contradicts Eq. (9), hence $r_0 < h$. From Eq. (9) we see that

$$h_w^2 = h \cdot c(r_0) \ge h \cdot c(h) = h^2.$$

Hence: $h \le h_w$.

Corollary 1. If c(r) is strictly decreasing on [0, h] then $h < h_w$.

Corollary 2. The w-rank of a continuous rank-frequency function c(r) with h-index equal to h is equal to h if and only if the restriction of c(r) to [0, h] is the constant function h.

We already know that if the restriction of c(r) to [0, h] is the constant function h then $r_0 = h$ and $h = h_w$. If now $r_0 = h$ then, by definition,

$$\int_0^{r_0} c(r)\,dr = \int_0^{h} c(r)\,dr = h \cdot c(h) = h^2.$$
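The continuous construction of Eqs. (5) and (9)–(11) can be checked numerically. The sketch below assumes a hypothetical linear rank-frequency function c(r) = 11 − r on [0, 10], which is not taken from the article; it is chosen only because it is strictly decreasing with c(T) = 1 and has a closed-form integral. Both h and the w-rank $r_0$ are found by plain bisection, which relies on the sign change established in the proof of Theorem 1.

```python
import math

def bisect_root(f, lo, hi, tol=1e-12):
    """Root of a decreasing function f on [lo, hi], assuming f(lo) > 0 > f(hi)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical rank-frequency function (an assumption of this sketch).
T = 10.0

def c(r):
    """Decreasing rank-frequency function with c(T) = 1."""
    return 11.0 - r

def C(t):
    """Closed-form integral of c from 0 to t."""
    return 11.0 * t - 0.5 * t * t

# Eq. (5): h is the solution of r = c(r).
h = bisect_root(lambda r: c(r) - r, 0.0, T)

# Eq. (9): the w-rank r0 is the root of k(t) = c(t) - C(t) / h.
r0 = bisect_root(lambda t: c(t) - C(t) / h, 0.0, T)

# Eq. (10): weighted h-index; Eq. (11) gives the same value as sqrt(h * c(r0)).
hw = math.sqrt(C(r0))

print(h, r0, hw)  # h = 5.5, r0 is roughly 4.20, hw is roughly 6.11
```

Because this c(r) is strictly decreasing on [0, h], the computed values agree with Theorem 3 and Corollary 1: the w-rank lies strictly below h, and $h_w$ lies strictly above h.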
