arXiv:1310.6528v3 [math.PR] 30 Jun 2014 Keywords a enfudt nunesvrlpoete fantok I network. a of properties several influence an to found theory been information has biology, molecular and neuroscience, [9] intr as in was networks networks complex directed directed of end for analysis the definition for at adopted similar degrees A the of network. coefficient New disassorta correlation by be networks Pearson’s to undirected to said for is given network first the was h assortativity degree degree prefe large low a with with nodes have nodes When degree for degree. high large similar with of nodes nodes when feature assortative, assortativi a degree called networks called complex also of dependency, topology degree-degree the of analysis the In Introduction 1 correlations. rank laws, power ∗ † nvriyo wne [email protected] Twente, of University [email protected] Twente, of University ereasraiiyi ewrshsbe nlzdi vari a in analyzed been has networks in assortativity Degree eredge eednisi ietdntok with networks directed in dependencies Degree-degree o esrn ereasraiiyi ietdgah ihheavy- with graphs f measures directed graphs correlation in rank assortativity Wikipedia degree these the measuring why on for show calculations we and S languages, on examples different based Using networks alternat tau. directed propose We Kendall’s in limit. dependencies size degree-degree sequence con network for infinite out-degree coefficients the and correlation in in- Pearson’s number negative parameters, the de realistic where heavy-tailed with graphs with law for networks directed that th prove of investigate We setting We the network. in a of coefficients assortativity degree the measure nntokter,Pasnscreaincecet r otco most are coefficients correlation Pearson’s theory, network In ereasraiiy eredge orltos scale correlations, degree-degree assortativity, degree i a e Hoorn der van Pim ev-alddegrees heavy-tailed aur 1 2018 31, January Abstract 1 el Litvak Nelly , ∗ yo h ewr.Antokis network A network. the of ty oilntoksine and sciences network social d a 1] hc corresponds which [16], man 1]ad[2 degree-degree [12] and [10] n v oncinpreference connection a ave ie esr o degree for measure A tive. farno dei the in edge random a of s [19]. hti fe tde sthe is studied often is that † ec ob once to connected be to rence t fsinicfilssuch fields scientific of ety dcdi 1]adlater and [17] in oduced era’ h and rho pearman’s eairo these of behavior e redrce networks, directed free alddegrees. tailed aif power a satisfy s mnyue to used mmonly r oesuited more are eg oanon- a to verge resequences. gree v measures ive rnine or correlations are used to investigate the structure of collaboration networks of a social news sharing website and Wikipedia discussion pages, respectively. Neural networks with high assortativity seem to behave more efficiently under the influence of noise [8] and information content has been shown to depend on the absolute value of the degree assortativity [20]. The effects of degree-degree dependencies on epidemic spreading have been studied in percolation theory [2, 26] and it has been shown, for instance, that the epidemic threshold depends on these correlations. Degree assortativity is used in the analysis of networks under attack, e.g. P2P networks [23, 24]. Networks with high degree assortativity seem to be less stable under attack, [5]. In the case of directed networks, recent research [15] has shown that degree-degree dependencies can influence the rate of consensus in directed social networks like Twitter. Recently it has been shown [13, 14] that for undirected networks of which the degree sequence satisfies a power law distribution with exponent γ (1, 3), Pearson’s correlation ∈ coefficient scales with the network size, converging to a non-negative number in the infinite network size limit. Because most real world networks have been reported to be scale free with exponent in (1, 3), c.f. [1, 18, Table II], this could then explain why large networks are rarely classified as disassortative. In [13, 14] a new measure, corresponding to Spearman’s rho [22], has been proposed as an alternative. In this paper we will extend the analysis in [13] to the setting of directed networks. Here we have to consider four types of degree-degree dependencies, depending on the choice for in- or out-degree on either side of an edge. Our message is, similar to that of [13], that Pearson’s correlation coefficients are size biased and produce undesirable results, hence we should look for other means to measure degree-degree dependencies. We consider networks where the in- and out-degree sequences have a power law distri- bution. We will give conditions on the exponents of the in- and out-degree sequences for which the assortativity measures defined in [9] and [19] converge to a non-negative num- ber in the infinite network size limit. This result is a strong argument against the use of Pearson’s correlation coefficients for measuring degree-degree dependencies in such directed networks. To strengthen this argument we also give examples which clearly show that the values given by Pearson’s correlation coefficients do not represent the true dependency between the degrees, which it is supposed to measure. As an alternative we propose correlation measures based on Spearman’s rho [22] and Kendall’s tau [11]. These measures are based on the ranking of the degrees rather than their value and hence do not exhibit the size bias observed in Pearson’s correlation coefficients. We will give several examples where the difference between these three measures is shown. We also include an example for which one of the four Pearson’s correlation coefficients converges to a random variable in the infinite network size limit and therefore will obviously pro- duce uninformative results. Finally we calculate all four degree-degree correlations on the Wikipedia network for nine different languages using all the assortativity measures proposed in this paper. This paper is structured as follows. In Section 2 we introduce notations. Pearson’s correlation coefficients are introduced in Section 3 and a convergence theorem is given for these measures. We introduce the rank correlations Spearman’s rho and Kendall’s
2 tau for degree-degree dependencies in Section 4. Example graphs that illustrate the difference between the three measures are presented in Section 5 and the degree-degree correlations for the Wikipedia graphs are presented in Section 6. Finally, in Section 7 we briefly discuss the results and there interpretation.
2 Definitions and notations
We start with the formal definition of the problem and introduce the notations that will be used throughout the paper.
2.1 Graphs, vertices and degrees We will denote by G = (V, E) a directed graph with vertex set V and edge set E V V . ⊆ × For an edge e E, we denote its source by e and its target by e∗. With each directed ∈ ∗ graph we associate two functions D+, D− : V N where D+(v) := e E e = v → |{ ∈ | ∗ }| is the out-degree of the vertex v and D−(v) := e E e∗ = v the in-degree. When |{ ∈ | }| considering sequences of graphs, we denote by Gn = (Vn, En) an element of the sequence (Gn)n∈N. We will further use subscripts to distinguish between the different graphs in + − the sequence. For instance, Dn and Dn will denote the out- and in-degree functions of the graph Gn, respectively.
2.2 Four types of degree-degree dependencies In this paper we are interested in measuring dependencies between the degrees at both sides of an edge. That is, we measure the relation between two vectors X and Y as a function of the edges e E corresponding to the degrees of e and e∗, respectively. ∈ ∗ In the undirected case this is called the degree assortativity. In the directed setting however, we can consider any combination of the two degree types resulting in four types of degree-degree dependencies, illustrated in Figure 1. From Figure 1 one can already observe some interesting features of these depen- dencies. For instance, in the Out/In case the edge that we consider contributes to the degrees on both sides. We will later see that for this reason the Out/In dependency in fact generalizes the undirected case. More precisely, our result for the Out/In depen- dencies generalizes the result from [14] when we transform from the undirected to the directed case by making every edge bi-directional. For the other three dependency types we observe that there is always at least one side where the considered edge does not contribute towards the degree on that side. We will later see that for these dependency types the dependency of the in- and out-degree of a vertex will play a role.
3 Out/In In/Out
Out/Out In/In
Figure 1: Four degree-degree dependency types
3 Pearson’s correlation coefficient
Among degree-degree dependency measures, the measure proposed by Newman [16, 17] has been widely used. This measure is the statistical estimator for the Pearson correlation coefficient of the degrees on both sides of a random edge. However, for undirected networks with heavy tailed degrees with exponent γ (1, 3) it was proved [14] ∈ that this measure converges, in the infinite size network limit, to a non-negative number. Therefore, in these cases, Pearson’s correlation coefficient is not able to correctly measure negative degree-degree dependencies. In this section we will extend this result to directed networks proving that also here Pearson’s correlation coefficients are not the right tool to measure degree-degree dependencies. Let us consider Pearson’s correlation coefficients as in [16, 17], adjusted to the setting of directed graphs as in [9, 19]. This will constitute four formulas which we combine into one. Take α, β +, , that is, we let α and β index the type of degree (out- ∈ { −} or in-degree). Then we get the following expression for the four Pearson’s correlation coefficients:
β 1 1 α β ∗ 1 α β ∗ rα(G)= β D (e∗)D (e ) 2 D (e∗) D (e ) , (1) σα(G)σ (G) E − E ! | | eX∈E | | eX∈E eX∈E
4 where
2 1 1 σ (G)= Dα(e )2 Dα(e ) and (2) α v ∗ 2 ∗ u E − E ! u| | eX∈E | | eX∈E t 2 1 1 σβ(G)= Dβ(e∗)2 Dβ(e∗) . (3) v 2 u E − E ! u| | eX∈E | | eX∈E t Here we utilize the notations for the source and target of an edge by letting the super- script index denote the specific degree type of the target e∗ and the subscript index the − degree type of the source e∗. For instance r+ denotes the Pearson correlation coefficient for the Out/In relation. It is convenient to rewrite the summations over edges to summations over vertices by observing that α k + α k D (e∗) = D D (v) eX∈E vX∈V and similarly Dα(e∗)k = D−Dα(v)k eX∈E vX∈V for all k> 0. Plugging this into (1)-(3) we arrive at the following definition. Definition 3.1. Let G = (V, E) be a directed graph and let α, β +, . Then the ∈ { −} Pearson’s α -β correlation coefficient is defined by
β 1 1 α β ∗ β rα(G)= β D (e∗)D (e ) rˆα(G), (4) σα(G)σ (G) E − | | eX∈E where
β 1 1 + α − β rˆα(G)= β 2 D (v)D (v) D (v)D (v), (5) σα(G)σ (G) E | | vX∈V vX∈V 2 1 1 σ (G)= D+(v)Dα(v)2 D+(v)Dα(v) , (6) α v 2 u E − E ! u| | vX∈V | | vX∈V t 2 1 1 σβ(G)= D−(v)Dβ(v)2 D−(v)Dβ(v) . (7) v 2 u E − E ! u| | vX∈V | | vX∈V t Just as in the undirected case, c.f. [13, 14], the wiring of the network only contributes to the positive part of (4). All other terms are completely determined by the in- and out- β degree sequences. This fact enables us to analyze the behavior of rα(G), see Section 3.1. Observe also that in contrast to undirected graphs, in the directed case the correlation between the in- and out-degrees of a vertex can play a role, take for instance α = and − β = +.
5 β β Note that in general rα(G) might not be well defined, for either σα(G) or σ (G) might be zero, for example, when G is a directed cyclic graph of arbitrary size. From β equations (2) and (3) it follows that σα(G) and σ (G) are the variances of X and Y , where X = Dα(e ) and Y = Dβ(e∗), e E, with probability 1/ E . Thus, σ (G) =0 is ∗ ∈ | | α 6 only possible if Dα(v) = Dα(w) for some v, w V . Moreover, v and w must have non- 6 ∈ zero out-degree for at least one such pair v, w, so that Dα(v) and Dα(w) are counted when we traverse over edges. This argument is formalized in the next lemma, which provides necessary and sufficient conditions so that σ (G), σβ(G) = 0. α 6 Lemma 3.2. Let G = (V, E) be a graph and take α, β +, . Then the following ∈ { −} holds: 2 1 Dα(v)Dβ(v) Dα(v)Dβ(v)2 (8) E ! ≤ | | vX∈V vX∈V and strict inequality holds if and only if there exits distinct v, w V such that Dα(v), ∈ Dα(w) > 0 and Dβ(v) = Dβ(w). 6 Proof. Recall that E = Dα(v) for any α +, . Then we have: | | v∈V ∈ { −} P 2 E Dα(v)Dβ(v)2 Dα(v)Dβ(v) | | − ! vX∈V vX∈V = Dα(w)Dα(v)Dβ(v)2 Dα(w)Dβ(w)Dα(v)Dβ(v) − wX∈V v∈XV \w 1 = Dα(w)Dα(v) Dβ(w)2 2Dβ(w)Dβ(v)+ Dβ(v)2 2 − wX∈V v∈XV \w 1 2 = Dα(w)Dα(v) Dβ(w) Dβ(v) 0, 2 − ≥ wX∈V v∈XV \w which proves (8). From the last line one easily sees that strict inequality holds if and only if there exits distinct v, w V such that Dα(v), Dα(w) > 0 and Dβ(v) = Dβ(w). ∈ 6 3.1 Convergence of Pearson’s correlation coefficients In this section we will prove that Pearson’s correlation coefficients (4), calculated on sequences of growing graphs satisfying rather general conditions, converge to a non- negative value. We start by recalling the definition of big theta.
Definition 3.3. Let f, g : N R be positive functions. Then f = Θ(g) if there exist → >0 k , k R and an N N such that for all n N 1 2 ∈ >0 ∈ ≥ k g(n) f(n) k g(n). 1 ≤ ≤ 2
When we have two sequences (an)n∈N and (bn)n∈N we write an = Θ(bn) for (an)n∈N = Θ((bn)n∈N).
6 Next, we will provide the conditions that our sequence of graphs needs to satisfy and prove the result. These conditions are based on properties of i.i.d. sequences of regularly varying random variables, which are often used to model scale-free distributions. We will provide a more thorough motivation of the chosen conditions in Section 3.2. From here on we denote by x y and x y the maximum and minimum of x and y, respectively. ∨ ∧ Definition 3.4. For γ , γ R we denote by G the space of all sequences of − + ∈ >0 γ−γ+ graphs (Gn)n∈N with the following properties: G1 V = n. | n| G2 There exists a N N such that for all n N there exist v, w V with Dα(v), ∈ ≥ ∈ n n Dα(w) > 0 and Dα(v) = Dα(w), for all α +, . n n 6 n ∈ { −} G3 For all p,q R , ∈ >0 + p − q p/γ+∨q/γ−∨1 Dn (v) Dn (v) = Θ(n ). vX∈Vn G4 For all p,q R , if p < γ and q < γ then ∈ >0 + − 1 lim D+(v)pD−(v)q := d(p,q) (0, ). n→∞ n n n ∈ ∞ vX∈Vn Where the limits are such that for all a, b N, k,m > 1 with 1/k + 1/m = 1, ∈ a + p < γ+ and b + q < γ− we have,
1 1 a p b q d(a, b) m d(p,q) k > d( + , + ). m k m k Now we are ready to give the convergence theorem for Pearson’s correlation coeffi- cients, Definition 3.1. β 2 Theorem 3.5. Let α, β +, . Then there exists an area Aα R such that for β ∈ { −} ⊆ (γ , γ ) Aα and (G ) N G , + − ∈ n n∈ ∈ γ−γ+ β lim rˆ (Gn) = 0 n→∞ α β and hence any limit point of rα(Gn) is non-negative. β Proof. Let (Gn)n∈N be an arbitrary sequence of graphs. It is clear that ifr ˆα(Gn) 0 β → then any limit point of rα(Gn) is non-negative. Therefore we only need to prove the first statement. To this end we define the following sequences, 2 2 1 + α 1 − β an = Dn (v)Dn (v) , bn = Dn (v)Dn(v) , En ! En ! | | vX∈Vn | | vX∈Vn + α 2 − β 2 cn = Dn (v)Dn (v) , dn = Dn (v)Dn(v) , vX∈Vn vX∈Vn
7 β 2 and observe thatr ˆα(G ) = a b /(c a )(d b ). Now if (G ) N G then n n n n − n n − n n n∈ ∈ γ−γ+ because of G2 and Lemma 3.2 there exists an N N such that for all n N we have β ∈ ≥ cn > an and dn > bn, sor ˆα(Gn) is well-defined for all n N. Next, using G3, we get a b c d≥ that an = Θ(n ), bn = Θ(n ), cn = Θ(n ) and dn = Θ(n ) for certain constants a, b, c and d, which depend on γ−, γ+ and the degree-degree correlation type chosen. Because β β 2 rˆα(G ) 0 if and only ifr ˆα(G ) 0, we need to find sufficient conditions for which n → n → a b /(c a )(d b ) 0. It is clear that either a < c and b /(d b ) is bounded n n n − n n − n → n n − n or b < d and a /(c a ) is bounded are sufficient. It turns out that this is exactly the n n − n case when either a < c and b d or a c and b < d. We will do the analysis for the ≤ ≤ In/Out degree-degree correlation. The analysis for the other three correlation types is β similar. Figure 2 shows all four areas Aα. When α = and β = + we get the following constants − 1 1 a, b = 2 1 1 γ ∨ γ ∨ − + − 1 2 c = 1 γ ∨ γ ∨ + − 2 1 d = 1 γ ∨ γ ∨ + − β It is clear that when 1 < γ , γ < 2 then a < c and b < d and hencer ˆα 0. − + → Now if 1 < γ < 2 and γ 2 then a = b = d = 1 < c. Using G4 we get that − + ≥ limn→∞ dn/n = d(2, 1) and
2 b D−(v)D+(v) n lim n = lim v∈Vn n n n→∞ n n→∞ n2 E P | n| D−(v)D+(v) 2 D−(v) −1 = lim v∈Vn n n v∈Vn n n→∞ n n P P d(1, 1)2 d = < d(2, 1) = lim n , d(0, 1) n→∞ n where, for the last part, we again used G4. From this it follows that bn/(dn bn) is β − bounded and sor ˆα 0. A similar argument applies to the case γ 2 and 1 < γ < 2, → − ≥ + where the only difference is that a = b = c = 1 < d, hence
A+ = (x,y) R 1
A− = (x,y) R2 1
8 γ− A γ− A+ +− −
3
2
1 1
γ+ γ+ 1 3 1 2 γ− A+ γ− A + − −
3
1 1
γ+ γ+ 1 3 1
β β Figure 2: Four areas Aα, where rα converges to a non-negative number.
β Let us now provide an intuitive explanation for the areas Aα, as depicted in Figure 2. The key observation is that due to G3 the terms with the highest power of either + − β Dn or Dn will dominate inr ˆα(Gn). Therefore, if these moments do not exist, then the β denominator will grow at a larger rate then the numerator, hencer ˆα 0. → Taking α =+= β, we see that D− only has terms of order one while D+ has terms + R − up to order three. This explains why A+ = (x,y) 1 < x 3,y > 1 . Area A− is { ∈ | − ≤ } + then easily explained by observing that the expression for r−(G) is obtained from r+(G) by interchanging D+ and D−. For the Out/In correlation, i.e. α = + and β = , we see from equations (5)-(7) − − thatr ˆ+(G) splits into a product of two terms, each completely determined by either in- or out-degrees, 1 α 2 |E| v∈V D (v) , 1 α 3 1 α 2 2 D (vP) 2 D (v) |E| v∈V − |E| v∈V with α +, . Theseq termsP are of the exact sameP form as the expression in [13] for ∈ { −}
9 the undirected degree-degree correlation. Because both D+ and D− have terms of order three, one sees that A− = (x,y) R2 1
3.2 Motivation for Gγ−γ+ In this section we will motivate Definition 3.4. G1 is easily motivated, for we want to consider infinite network size limits. G2 combined with Lemma 3.2 ensures that from a β certain grah size N, rα(Gn) is always well-defined. Conditions G3 and G4 are related to heavy-tailed degree sequences that are modeled using regularly varying random variables. A random variable X is called regularly varying with exponent γ if for all t > 0, −γ P(X>t)= L(t)t for some slowly varying function L, that is limt→∞ L(tx)/L(t) = 1 for all x > 0. We write for the class of all such distribution functions and write R−γ X to denote a regularly varying random variable with exponent γ. For such a ∈ R−γ random variable X we have that E [Xp] < for all 0 < p < γ. ∞ Through experiments it has been shown that many real world networks, both directed and undirected, have degree sequences whose distribution closely resembles a power law distribution, c.f. Table II of [1] and [18]. Suppose we take two random variables + − ± , and consider, for each n, the degree sequences (D (v)) n as D ∈ Rγ+ D ∈ Rγ− n v∈V i.i.d. copies of these random variables. Then for all 0 < p < γ+ and 0 < q < γ− 1 lim D+(v)pD−(v)q = E ( +)p( −)q . n→∞ n n n D D v∈Vn X Moreover, since ± is non-degenerate, we have E ( ±)k > E [ ±]k, and thus by tak- D D D ing d(p,q) = E [( +)p( −)q], we get G4 where theh secondi part follows from H¨older’s D D inequality. Although i.i.d. sequences generated by sampling from in- and out-degree dis- tributions do not in general constitute a graphical sequence, it is often the case that one can modify this sequence into a graphical sequence preserving i.i.d. properties asymp- totically. Consider for example [6], where a directed version of the configuration model is introduced and it is proven (Theorem 2.4) that the degree sequences are asymptotically independent.
10 + p − q The property G3 is associated with the scaling of the sums v∈Vn Dn (v) Dn (v) and is related to the central limit theorem for regularly varying random variables. When P we model the degrees as i.i.d. copies of independent regularly varying random variables + , − and take p γ or q γ then D+(v)pD−(v)q is in D ∈ R−γ+ D ∈ R−γ− ≥ + ≥ − v∈Vn n n the domain of attraction of a γ-stable random variable S(γ), where γ = (γ /p γ /q), P + ∧ − c.f. [7]. This means that
1 + p − q d Dn (v) Dn (v) S(γ+/p γ−/q), as n (9) an → ∧ →∞ vX∈Vn
q/γ−∨p/γ+ d for some sequence an = Θ(n ), where denotes convergence in distribu- + →p − q q/γ−∨p/γ+ tion. Informally, one could say that v∈Vn Dn (v) Dn (v) scales as n when either the p or q moment does not exist and as n when both moments exist, hence, + p − q q/γ−∨Pp/γ+∨1 v∈Vn Dn (v) Dn (v) scales as n , which is what G3 states. For complete- ness we include the next lemma, which shows that (9) implies that G3 holds with high P probability. We remark that although the motivation for G3 is based on results where the regu- larly varying random variables are assumed to be independent the dependent case can be included. For this, one needs to adjust the scaling parameters in G3 for the specified dependence. In our numerical experiments the in- and out- degrees in Wikipedia graphs show strong independence, hence G3 holds for networks such as Wikipedia.
Lemma 3.6. Let (Xn)n∈N be a sequence of positive random variables such that X n d X, as n , an → →∞ for some sequence (an)n∈N and positive random variable X. Then for each 0 <ε< 1, there exists an N N and κ ℓ > 0 such that for all n N ε ∈ ε ≥ ε ≥ ε P(ℓ a X κ a ) 1 ε. ε n ≤ n ≤ ε n ≥ − Proof. Let 0 <ε< 1 and take δ > 0, 0 <ℓ κ such that ≤ P(ℓ X κ) 1 ε + δ. ≤ ≤ ≥ − d Then, because X /a X as n , there exists an N N such that for all n N, n n → →∞ ∈ ≥ P(ℓ X κ) P(ℓa X κa ) < δ. | ≤ ≤ − n ≤ n ≤ n | Now we get for all n N, ≥ 1 ε + δ P(ℓa X κa ) P(ℓ X κ) P(ℓa X κa ) δ, − − n ≤ n ≤ n ≤ ≤ ≤ − n ≤ n ≤ n ≤ hence P(ℓa X κa ) 1 ε. n ≤ n ≤ n ≥ −
11 4 Rank correlations
In this section we consider two other measures for degree-degree dependencies, Spear- man’s rho [22] and Kendall’s tau [11], which are based on the rankings of the degrees rather than their actual value. We will define these dependency measures and argue that they do not have unwanted behavior as we observed for Pearson’s correlation coefficients. We will later use examples to enforce this argument and show that Spearman’s rho and Kendall’s tau are better candidates for measuring degree-degree dependencies.
4.1 Spearman’s rho Spearman’s rho [22] is defined as the Pearson correlation coefficient of the vector of ranks. Let G = (V, E) be a directed graph and α, β +, . In order to adjust the ∈ { −} definition of Spearman’s rho to the setting of directed graphs we need to rank the vectors α β ∗ (D (e∗))e∈E and (D (e ))e∈E. These will, however, in general have many tied values. α For instance, suppose that D (v) = m for some v V , then edges e E with e∗ = v α α ∈ α ∈ satisfy D (e∗)= D (v). Therefore, we will encounter the value D (v) at least m times α in the vector (D (e∗))e∈E. We will consider two strategies for resolving ties: uniformly at random (Section 4.1.1), and using an average ranking scheme (Section 4.1.2).
4.1.1 Resolving ties uniformly at random Given a sequence x of distinct elements in R we denote by R(x ) the rank of x , { i}1≤i≤n j j i.e. R(x )= i x x , 1 j n. The definition of Spearman’s rho in the setting of j |{ | i ≥ j}| ≤ ≤ directed graphs is then as follows.
Definition 4.1. Let G = (V, E) be a directed graph, α, β +, and let (U ) , ∈ { −} e e∈E (We)e∈E be i.i.d. copies of independent uniform random variables U and W on (0, 1), respectively. Then we define the α-β Spearman’s rho of the graph G as
α β ∗ 2 12 R (e∗)R (e ) 3 E ( E + 1) ρβ (G)= e∈E − | | | | , (10) α E 3 E P | | − | | α α β ∗ β ∗ where R (e∗)= R(D (e∗)+ Ue) and R (e )= R(D (e )+ We).
β From (10) we see that the negative part of ρα(G) depends only on the number of edges 3( E + 1)2 6 E + 4 | | =3+ | | , ( E 2 1) E 2 1 | | − | | − β while for rα(G) it depended on the values of the degrees, see Definition 3.1. When (G ) N G , with γ , γ > 1 then it follows that E = θ(n) hence 3 + (6 E + n n∈ ∈ γ+,γ− + − | n| | | 4)/( E 2 1) 3, as n . Therefore we see that the negative contribution will always | | − → β →∞ be at least 3 and so ρα(Gn) does not in general converge to a non-negative number while β rα(Gn) does.
12 β When calculating ρα(G) on a graph G one has to be careful, for each instance will give different ranks of the tied values. This could potentially give rise to very different results among several instances, see Section 5.1.2 for an example. Therefore, in experiments, β we will take an average of ρα(G) over several instances of the uniform ranking.
4.1.2 Resolving ties with average ranking A different approach for resolving ties is to assign the same average rank to all tied values. Consider, for example, the sequence (1, 2, 1, 3, 3). Here the two values of 3 have ranks 1 and 2, but instead we assign the rank 3/2 to both of them. With this scheme the sequence of ranks becomes (9/2, 3, 9/2, 3/2, 3/2). This procedure can be formalized as follows.
Definition 4.2. Let (xi)1≤i≤n be a sequence in R then we define the average rank of an element xi as j x = x + 1 R(x )= j x >x + |{ | j i}| . i |{ | j i}| 2 n Observe that in the above definition the total average rank is preserved: i=1 R(xi)= n(n + 1)/2. The difference with resolving ties uniformly at random is that we in general n 2 P do not know i=1 R(xi) , for this depends on how many ties we have for each value. We now define the corresponding version of Spearman’s rho of graphs as follows. P Definition 4.3. let G = (V, E) be a directed graph, α, β +, and denote by α β ∗ α α ∈ { −} β ∗ R (e∗) and R (e ) the average ranks of D (e∗) among (D (e∗))e∈E and D (e ) among β ∗ (D (e ))e∈E, respectively. Then we define the α-β Spearman’s rho with average resolu- tion of ties by α β ∗ 2 β 4 e∈E R (e∗)R (e ) E ( E + 1) ρα(G)= β − | | | | , (11) P σα(G)σ (G) where
α σ (G)= 4 R (e )2 E ( E + 1)2 α ∗ − | | | | s eX∈E and
β σβ(G)= 4 R (e∗)2 E ( E + 1)2. − | | | | s eX∈E β Note that ρα(G) does not suffer from any randomness in the ranking of the degrees. Hence, in contrast to (10), here we do not need to take an average over multiple instances. The next lemma shows that taking the expectation over the uniform ranking is actually equal to applying the average ranking scheme.
Lemma 4.4. Let G = (V, E) be a graph, e E and α, β +, . Then ∈ ∈ { −}
13 α α β ∗ β ∗ i) E [R (e∗)] = R (e∗), E R (e ) = R (e ), and
α β ∗ α β ∗ ii) E R (e∗)R (e ) = R (e∗)R (e ) Proof. i) We will only prove the first statement. The proof for the second one is similar. α α Since R (e∗)= R(D (e∗)) + Ue and (Ue)e∈E are i.i.d. copies of an uniform random variable U on (0, 1) we have that
I Dα(f )= Dα(e ) E [I U U ] { ∗ ∗ } { f ≥ e} fX∈E 1 = I Dα(f )= Dα(e ) I f = e + I f = e { ∗ ∗ } { } 2 { 6 } fX∈E 1 1 = I Dα(f )= Dα(e ) + . 2 { ∗ ∗ } 2 fX∈E It follows that
E [Rα(e )] = E I Dα(f )+ U Dα(e )+ U ∗ { ∗ f ≥ ∗ e} fX∈E = I Dα(f ) > Dα(e ) + I Dα(f)= Dα(e ) E [I U U ] { ∗ ∗ } { ∗ ∗ } { f ≥ e} fX∈E fX∈E 1 1 = I Dα(f ) > Dα(e ) + I Dα(f )= Dα(e ) + { ∗ ∗ } 2 { ∗ ∗ } 2 f∈E f∈E Xα X = R (e∗).
ii) By definition we have that
α β ∗ α α β ∗ β ∗ R (e∗)R (e )= I D (f∗) > D (e∗) I D (g ) > D (e ) f,gX∈E n o n o + I Dα(f ) > Dα(e ) I Dβ(g∗)= Dβ(e∗) I W W ∗ ∗ g ≥ e f,gX∈E n o n o n o + I Dα(f )= Dα(e ) I U U I Dβ(g∗) > Dβ(e∗) ∗ ∗ f ≥ e f,gX∈E n o n o n o + I Dα(f )= Dα(e ) I Dβ(g∗)= Dβ(e∗) I U U I W W . ∗ ∗ { f ≥ e} g ≥ e f,gX∈E n o n o n o
Therefore, since (Uf )f∈E and (Wg)g∈E are i.i.d. copies of independent uniform random variables U and W on (0, 1), respectively, the result follows by applying i).
14 β From Lemma 4.4 we conclude that instead of calculating ρα several times and then taking the average we can immediately apply the average ranking which limits the total calculations to just one. Moreover, we have that
3σ σβ E ρβ (G) = α ρβ (G), (12) α E 3 E α h i | | − | | which emphasizes that the difference between the uniform at random and average ranking scheme is determined by the number of ties in the degrees.
4.2 Kendall’s Tau Another common rank correlation is Kendall’s tau [11], which measures the weighted difference between the number of concordant and discordant pairs of the joint obser- vations (xi,yi)1≤i≤n. More precisely, a pair (xi,yi) and (xj,yj) of joint observations is concordant if xi