Spectral Graph Analysis of Modularity and Assortativity

PHYSICAL REVIEW E 82, 056113 (2010)


P. Van Mieghem,1,* X. Ge,2 P. Schumm,3 S. Trajanovski,1 and H. Wang1
1 Faculty of Electrical Engineering, Mathematics, and Computer Science, Delft University of Technology, P.O. Box 5031, 2600 GA Delft, The Netherlands
2 College of Information Science and Engineering, Northeastern University, 110004 Shenyang, China
3 Department of Electrical and Computer Engineering, Kansas State University, Manhattan, Kansas 66506, USA
(Received 31 August 2010; revised manuscript received 15 October 2010; published 16 November 2010)

Expressions and bounds for Newman's modularity are presented. These results reveal conditions for or properties of the maximum modularity of a network. The influence of the spectrum of the modularity matrix on the maximum modularity is discussed. The second part of the paper investigates how the maximum modularity, the number of clusters, and the hop count of the shortest paths vary when the assortativity of the graph is changed via degree-preserving rewiring. Via simulations, we show that the maximum modularity increases, the number of clusters decreases, and the average hop count and the effective graph resistance increase with increasing assortativity.

DOI: 10.1103/PhysRevE.82.056113    PACS number(s): 89.75.Fb, 02.10.-v

I. INTRODUCTION

Graph communities reveal important structural features of the topology. The communities can be observed as sections of the graph topology that exhibit relatively higher levels of connections within the regions and lower connectivity between the regions. Such a structure plays a significant role in both the sorting of nodes and the evolution of processes on graphs, including slowing spreading epidemics and containing cascading failures [1–3]. Although there is no definition of community that is accepted for all graph-based systems, a metric known as modularity has led to a surge of research in community structure discovery and analysis [4]. Fortunato [5] provides a sound summary of this collection of works. A significant fraction of this research focuses on ways to assign nodes to communities in efforts to maximize the modularity metric [6]. Other pursuits have begun characterizing the modularity metric, investigating the counterintuitive nontrivial expected values for random graph models and lattices [7] and the upper bound and partitioning resolution [8–13]. Characterizing modularity is also done by examining how it relates to other significant graph metrics.

Introduced only shortly before modularity, assortativity is a correlation of the similarities of nodes sharing a link [14,15]. Newman suggested that one driving factor in the formation of communities was the preference of nodes to connect to other nodes that possessed similar characteristics to their own. This has been observed in some social networks where the similarities are race related [16]. Within topological analysis, assortativity is most commonly used with the node degrees or node strengths. Previous work has shown relations between the assortativity of a graph and the characteristic path length, the fraction of nodes in the giant component, the clustering coefficient, the robustness, the spectra of the adjacency matrix, and the modularity [17–20].

This paper presents an analysis of the relationships among the modularity, the assortativity, the largest eigenvalues of the adjacency and modularity matrices, the number of clusters, the hop count or characteristic path length, and the effective graph resistance. Section II summarizes mathematical formulations of the modularity and it provides bounds. Section III attempts to understand the effect of increasing and decreasing the assortativity of a graph via degree-preserving rewiring. We show that, in most complex networks, the maximum modularity (and the largest eigenvalue of the modularity matrix) increases with assortativity.

II. MODULARITY

The modularity $m$, proposed by Newman [4], is a measure of the quality of a particular division of the network, which is defined in [21] as$^{1}$

$$ m = \frac{1}{2L} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( a_{ij} - \frac{d_i d_j}{2L} \right) 1_{\{i \text{ and } j \text{ belong to the same community}\}}, \tag{1} $$

where $a_{ij}$ is the matrix element of the adjacency matrix $A$ of the graph with $N$ nodes and $L$ links. The modularity is proportional to the number of links falling within clusters or groups minus the expected number in an equivalent network with links placed at random. Thus, if the number of links within a group is not better than random, the modularity is zero. A modularity approaching 1 reflects networks with strong community structure: a dense intragroup and a sparse intergroup connection pattern. If links are placed at random, then the expected number of links between node $i$ and node $j$ equals $d_i d_j / 2L$, where $d_j$ is the degree of node $j$.

A. Expressions and bounds for the modularity

The general definition (1) is first rewritten as follows. We transform the nodal representation to a counting over links $l = i \sim j$ such that

*[email protected]
$^{1}$ We follow the notation of our book [22].

$$ \sum_{i=1}^{N} \sum_{j=1}^{N} a_{ij} 1_{\{i \text{ and } j \text{ belong to the same cluster}\}} = 2 \sum_{k=1}^{c} L_k, $$

where $L_k$ is the number of links of cluster $C_k$, and the factor 2 arises from the fact that all links are counted twice due to the symmetry $A = A^T$ of the adjacency matrix. If we denote by $L_{\text{inter}}$ the number of intercommunity links, i.e., the number of links that are cut by partitioning the network into $c$ communities or clusters, then

$$ L = \sum_{k=1}^{c} L_k + L_{\text{inter}}. $$

Similarly,

$$ \sum_{i=1}^{N} \sum_{j=1}^{N} d_i d_j 1_{\{i \text{ and } j \text{ belong to the same cluster}\}} = \sum_{k=1}^{c} \left( \sum_{i \in C_k} d_i \right) \left( \sum_{j \in C_k} d_j \right) = \sum_{k=1}^{c} D_{C_k}^2, $$

where

$$ D_{C_k} = \sum_{i \in C_k} d_i $$

is the sum of the degrees of all nodes that belong to cluster $C_k$. Clearly, $D_{C_k} \geq 2 L_k$, because some nodes in cluster $C_k$ may possess links connected to nodes in other clusters. The basic law of the degree then shows that $\sum_{k=1}^{c} D_{C_k} = 2L$. Substituting these expressions in definition (1) leads to an alternative expression$^{2}$ for the modularity:

$$ m = \sum_{k=1}^{c} \left[ \frac{L_k}{L} - \left( \frac{D_{C_k}}{2L} \right)^2 \right]. \tag{2} $$

The last fraction can also be written as

$$ \frac{D_{C_k}}{2L} = \frac{n_k}{N} \frac{E[D_k]}{E[D]}, $$

where $n_k$ and $D_k$ are the number of nodes and the degree of a random node in cluster $C_k$, respectively. Thus, the modularity is the sum over all clusters of the fraction of links per cluster minus the square of the fraction of nodes per cluster multiplied by the ratio of the average degree in a cluster over the average degree in the network.

$^{2}$ Newman [33] presented still another expression for the modularity.

Subject to the basic law of the degree, $\sum_{k=1}^{c} D_{C_k} = 2L$, the sum $\sum_{k=1}^{c} D_{C_k}^2$ is maximized when $D_{C_k} = \frac{2L}{c}$ for all $1 \leq k \leq c$. Indeed, the corresponding Lagrangian,

$$ \mathcal{L} = \sum_{k=1}^{c} D_{C_k}^2 + \xi \left( \sum_{k=1}^{c} D_{C_k} - 2L \right), $$

where $\xi$ is a Lagrange multiplier, supplies the set of equations for the optimal solution, $\partial \mathcal{L} / \partial D_{C_j} = 2 D_{C_j} + \xi = 0$ for $1 \leq j \leq c$ and $\partial \mathcal{L} / \partial \xi = \sum_{k=1}^{c} D_{C_k} - 2L = 0$, which is satisfied for $\xi = -\frac{4L}{c}$ and $D_{C_j} = \frac{2L}{c}$ for all $1 \leq j \leq c$. Hence, $\sum_{k=1}^{c} D_{C_k}^2 \leq (2L)^2 / c$. The modularity in Eq. (2) is minimized, for $c > 1$, if $L_k = 0$ for $1 \leq k \leq c$, and $\sum_{k=1}^{c} D_{C_k}^2$ is maximized such that $m \geq -\frac{1}{c}$. In conclusion, the modularity of any graph is never smaller than $-\frac{1}{2}$, and this minimum is obtained for the complete bipartite graph. This result was earlier stated in [23], but with a different proof.

Invoking the Cauchy identity (see [22]) and $\sum_{k=1}^{c} D_{C_k} = 2L$,

$$ \sum_{k=1}^{c} D_{C_k}^2 = \frac{(2L)^2}{c} + \frac{1}{c} \sum_{j=2}^{c} \sum_{k=1}^{j-1} \left( D_{C_j} - D_{C_k} \right)^2, $$

results in yet another expression for the modularity:

$$ m = 1 - \frac{L_{\text{inter}}}{L} - \frac{1}{c} - \frac{1}{c} \sum_{j=2}^{c} \sum_{k=1}^{j-1} \left( \frac{D_{C_j} - D_{C_k}}{2L} \right)^2. \tag{3} $$

Since the double sum is always positive, Eq. (3) provides us with an upper bound for the modularity,

$$ m \leq 1 - \frac{1}{c} - \frac{L_{\text{inter}}}{L}. \tag{4} $$

The upper bound (4) is only attained if the degree sum of all clusters is the same. In passing, we mention that Eq. (3) rigorously proves the upper bound derived by Fortunato and Barthélemy [10] based on a cyclic chain of same subgraphs, for which, indeed, $D_{C_j} = D_{C_k}$ for each pair $(j,k)$. In addition, the upper bound (4) shows that $m \leq 1$ and that a modularity of 1 is only reached asymptotically, when the number of clusters $c \to \infty$ and $L_{\text{inter}} = o(L)$, implying that the fraction of intercommunity links over the total number of links $L$ is vanishingly small for large graphs ($N \to \infty$ and $L \to \infty$).

Let $D_{\Delta C} = \max_{\{C_j, C_k\}} | D_{C_j} - D_{C_k} |$, then a lower bound of the modularity, deduced from Eq. (3), is

$$ m \geq 1 - \frac{L_{\text{inter}}}{L} - \frac{1}{c} - \frac{(c-1)}{2} \left( \frac{D_{\Delta C}}{2L} \right)^2. \tag{5} $$

Only if $D_{\Delta C} = 0$ the lower bound (5) equals the upper bound (4) and the equality sign can occur. Excluding the case that $D_{\Delta C} = 0$, then not all $D_{C_j}$'s are equal, and we may assume an ordering $D_{C_1} \geq D_{C_2} \geq \cdots \geq D_{C_c}$, with at least one strict inequality. We demonstrate that, for $c > 2$, not all differences $D_{C_j} - D_{C_k} = D_{\Delta C} > 0$ for any pair $(j,k)$. For, assume the contrary, so that $D_{C_1} - D_{C_2} = D_{C_2} - D_{C_3} = D_{C_1} - D_{C_3} = D_{\Delta C} > 0$, then $D_{\Delta C} = D_{C_1} - D_{C_3} = (D_{C_1} - D_{C_2}) + (D_{C_2} - D_{C_3}) = 2 D_{\Delta C}$, which cannot hold for $D_{\Delta C} > 0$. Hence, if $D_{\Delta C} > 0$, the inequality in Eq. (5) is strict; alternatively, the lower bound (5) is not attainable in that case.

In order for a network to have a modular structure, the modularity must be positive. The requirement that the lower bound (5) is non-negative supplies us with an upper bound for the maximum difference $D_{\Delta C}$ in the nodal degree sum between two clusters in a "modular" graph,


$$ D_{\Delta C} \leq 2L \sqrt{ \frac{2}{c-1} \left( 1 - \frac{L_{\text{inter}}}{L} - \frac{1}{c} \right) }. \tag{6} $$

For $c > 1$, Eq. (6) demonstrates that $D_{\Delta C} < 2L$. Ignoring the integer nature of $c$, the lower bound (5) is maximized with respect to the number of clusters $c$ when

$$ c^* = \frac{2 \sqrt{2} L}{D_{\Delta C}} > \sqrt{2}, \tag{7} $$

resulting in

$$ m \geq 1 - \frac{L_{\text{inter}}}{L} - \sqrt{2} \left( \frac{D_{\Delta C}}{2L} \right) + \frac{1}{2} \left( \frac{D_{\Delta C}}{2L} \right)^2. $$

The right-hand side in this lower bound is positive provided that $1 > D_{\Delta C} / 2L > \sqrt{2} \left( 1 - \sqrt{L_{\text{inter}} / L} \right)$. When this lower bound for $D_{\Delta C} / 2L$ is satisfied, the modularity $m$ is certainly positive, implying that the graph exhibits a modular structure.

Another presentation for the modularity applies the identity

$$ \left( \sum_{j=1}^{n} x_j \right)^2 = \sum_{j=1}^{n} \sum_{k=1}^{n} x_j x_k = \sum_{j=1}^{n} x_j^2 + 2 \sum_{j=2}^{n} \sum_{k=1}^{j-1} x_j x_k \tag{8} $$

to $x_j = D_{C_j}$,

$$ \sum_{k=1}^{c} D_{C_k}^2 = (2L)^2 - 2 \sum_{j=2}^{c} \sum_{k=1}^{j-1} D_{C_j} D_{C_k}, $$

such that Eq. (2) is rewritten as

$$ m = \frac{2}{(2L)^2} \sum_{j=2}^{c} \sum_{k=1}^{j-1} D_{C_j} D_{C_k} - \frac{L_{\text{inter}}}{L}. \tag{9} $$

Using the basic law of the degree, $\sum_{k=1}^{c} D_{C_k} = 2L$, the first term in Eq. (9) is maximized, as follows from a similar Lagrangian argument as before, when the degree is distributed uniformly across the communities as $D_{C_k} = 2L / c$, resulting in

$$ m \leq \frac{1}{2L^2} \sum_{j=2}^{c} (j-1) \left( \frac{2L}{c} \right)^2 - \frac{L_{\text{inter}}}{L} \leq \frac{1}{2L^2} \left( \frac{2L}{c} \right)^2 \left( \frac{c^2 - c}{2} \right) - \frac{L_{\text{inter}}}{L} \leq 1 - \frac{1}{c} - \frac{L_{\text{inter}}}{L}, $$

which is again the upper bound (4) and, hence, agrees with the degree balancing of Eq. (3). Equations (3) and (9) present the maximization of the modularity from dual perspectives, yet both point to a common solution of degree balancing and minimizing $L_{\text{inter}}$.

Finally, we present a probabilistic setting for the modularity $m$ by defining the random variable $D_C$ as the sum of the degrees in an arbitrary cluster. The average is $E[D_C] = \frac{1}{c} \sum_{k=1}^{c} D_{C_k} = \frac{2L}{c}$ and, comparing with the estimate $c^*$ in Eq. (7), it always holds that $c = 2L / E[D_C]$. However, estimate (7) suggests that the extreme difference $D_{\Delta C}$ is not that far away from the mean, roughly by a factor of $\sqrt{2}$. Further, with

$$ \frac{1}{c} \sum_{k=1}^{c} D_{C_k}^2 = \text{Var}[D_C] + (E[D_C])^2, $$

the expression (2) for the modularity becomes

$$ m = 1 - \frac{L_{\text{inter}}}{L} - \frac{1}{c} - \frac{c}{(2L)^2} \text{Var}[D_C], \tag{10} $$

which, again, leads to the upper bound (4) when the variance is zero, i.e., when all clusters have an equal degree sum. Incidentally, comparing Eqs. (3) and (10), we find that

$$ \text{Var}[D_C] = \sum_{j=2}^{c} \sum_{k=1}^{j-1} \left( \frac{D_{C_j} - D_{C_k}}{c} \right)^2, $$

and this is a general result that holds for any random variable in a specific graph [22].

B. Spectral form for the modularity

The $N \times c$ community matrix $S$, defined as

$$ S_{ik} = \begin{cases} 1 & \text{if node } i \text{ belongs to community } k \\ 0 & \text{otherwise,} \end{cases} $$

can be used to rephrase the condition in (1) as

$$ 1_{\{i \text{ and } j \text{ belong to the same community}\}} = \sum_{k=1}^{c} S_{ik} S_{jk}, $$

leading to the matrix representation of the modularity:

$$ m = \frac{1}{2L} \sum_{k=1}^{c} \sum_{i=1}^{N} \sum_{j=1}^{N} S_{ik} m_{ij} S_{jk} = \frac{\text{tr}(S^T M S)}{2L}, \tag{11} $$

where

$$ M = A - \frac{1}{2L} d \cdot d^T \tag{12} $$

is the modularity matrix and $d$ is the degree vector. We define the community vector $s_k$, which equals the $k$th column of the community matrix $S$ and which specifies the $k$th cluster: all components of $s_k$, corresponding to nodes belonging to cluster $C_k$, are equal to 1; otherwise, they are zero.

Using the eigenvalue decomposition of the symmetric modularity matrix $M = W \, \text{diag}[\lambda_j(M)] \, W^T$, where $W$ is the orthogonal $N \times N$ matrix with the $j$th eigenvector $w_j$ belonging to $\lambda_j(M)$ in column $j$, the general spectral expression for the modularity $m$ follows from Eq. (11) as

$$ m = \frac{\text{tr}\{ (W^T S)^T \text{diag}[\lambda_j(M)] W^T S \}}{2L} = \frac{1}{2L} \sum_{j=1}^{N} \left( \sum_{k=1}^{c} (w_j^T s_k)^2 \right) \lambda_j(M), \tag{13} $$

because $(W^T S)_{jk} = \sum_{l=1}^{N} W_{lj} S_{lk} = w_j^T s_k$. In particular, the scalar product $w_j^T s_k = \sum_{q \in C_k} (w_j)_q$ is the sum of those eigenvector components of $w_j$ that belong to cluster $C_k$. If we write the community vector as a linear combination of the eigenvectors of $M$, $s_k = \sum_{j=1}^{N} \beta_{kj} w_j$, then the orthogonality of eigenvectors indicates that the coefficients equal $\beta_{kj} = w_j^T s_k$. Moreover, the vectors $s_1, s_2, \ldots, s_c$ are orthogonal vectors; and, by definition, $\sum_{k=1}^{c} s_k = u$. Since $u$ is an eigenvector of $M$ belonging to the zero eigenvalue as follows from the definition of the modularity matrix (12) because

$$ M \cdot u = A u - \frac{1}{2L} d \cdot d^T u = d - \frac{2L}{2L} d = 0, $$

and $A u = d$ and $d^T u = 2L$, we observe that

$$ \sum_{k=1}^{c} w_j^T s_k = 0, $$

provided the eigenvector $w_j \neq u$. Using the Cauchy identity

$$ c \sum_{k=1}^{c} (w_j^T s_k)^2 - \left( \sum_{k=1}^{c} w_j^T s_k \right)^2 = \frac{1}{2} \sum_{m=1}^{c} \sum_{k=1}^{c} [w_j^T (s_m - s_k)]^2 = \sum_{m=2}^{c} \sum_{k=1}^{m-1} [w_j^T (s_m - s_k)]^2, $$

we find that

$$ m = \frac{1}{2Lc} \sum_{j=1}^{N} \left( \sum_{m=2}^{c} \sum_{k=1}^{m-1} [w_j^T (s_m - s_k)]^2 \right) \lambda_j(M). \tag{14} $$

For $c = 2$ and $y = s_1 - s_2$, which is a vector with component $y_j = 1$ if node $j$ belongs to cluster $C_1$ and $y_j = -1$ if node $j$ belongs to cluster $C_2$, the general relation (14) reduces to

$$ m = \frac{1}{4L} \sum_{j=1}^{N} \beta_j^2 \lambda_j(M), \tag{15} $$

where $y = \sum_{j=1}^{N} \beta_j w_j$, with $\beta_j = y^T w_j$. Expression (15) was Newman's starting point in [21] for his iterated bisection method.

Since $W W^T = I$, we have that $\text{tr}[(W^T S)^T W^T S] = \text{tr}(S^T S) = N$ (see [22]), such that

$$ \sum_{j=1}^{N} \sum_{k=1}^{c} (w_j^T s_k)^2 = N. \tag{16} $$

In the bicluster case where $c = 2$, we see that $y^T y = N$ such that $\sum_{j=1}^{N} \beta_j^2 = N$. Let $w_q = \frac{u}{\sqrt{N}}$ denote the eigenvector of $M$ belonging to the eigenvalue $\lambda_q(M) = 0$, then

$$ \sum_{k=1}^{c} (w_q^T s_k)^2 = \frac{1}{N} \sum_{k=1}^{c} (u^T s_k)^2 = \frac{1}{N} \sum_{k=1}^{c} n_k^2, $$

where $n_k$ is the number of nodes in cluster $C_k$. By applying the inequality

$$ \min_{1 \leq k \leq n} \frac{a_k}{q_k} \leq \frac{a_1 + a_2 + \cdots + a_n}{q_1 + q_2 + \cdots + q_n} \leq \max_{1 \leq k \leq n} \frac{a_k}{q_k}, \tag{17} $$

where $q_1, q_2, \ldots, q_n$ are positive real numbers and $a_1, a_2, \ldots, a_n$ are real numbers, we obtain

$$ \frac{ \sum_{j=1; j \neq q}^{N} \left( \sum_{k=1}^{c} (w_j^T s_k)^2 \right) \lambda_j(M) }{ \sum_{j=1; j \neq q}^{N} \sum_{k=1}^{c} (w_j^T s_k)^2 } \leq \max_{1 \leq j \leq N} \frac{ \left( \sum_{k=1}^{c} (w_j^T s_k)^2 \right) \lambda_j(M) }{ \sum_{k=1}^{c} (w_j^T s_k)^2 } = \lambda_1(M), $$

from which we find, with $E[D] = \frac{2L}{N}$, a spectral upper bound for the modularity:

$$ m \leq \frac{\lambda_1(M)}{E[D]} \left( 1 - \frac{1}{N^2} \sum_{k=1}^{c} n_k^2 \right). $$

This bound can also be written as

$$ m \leq \frac{\lambda_1(M)}{E[D]} \left( 1 - \frac{1}{c} - \frac{c}{N^2} \text{Var}[n_C] \right), $$

where $n_C$ is the number of nodes in an arbitrary cluster, because $E[n_C] = \frac{1}{c} \sum_{k=1}^{c} n_k = \frac{N}{c}$. Since $\text{Var}[n_C] \geq 0$, we arrive at the upper bound

$$ m \leq \frac{\lambda_1(M)}{E[D]} \left( 1 - \frac{1}{c} \right). \tag{18} $$

We observe that Eq. (18) may lead to a sharper upper bound than Eq. (4) if $\lambda_1(M) < E[D]$ (see, e.g., Fig. 3 below).

We have shown in [22] that the eigenvalues of the modularity matrix $M = A - \frac{1}{2L} d \cdot d^T$ are interlaced with those of $A$,

$$ \lambda_1(A) > \lambda_1(M) \geq \lambda_2(A) \geq \lambda_2(M) \geq \cdots \geq \lambda_N(A) \geq \lambda_N(M). $$

Hence, increasing $\lambda_2(A)$ implies increasing $\lambda_1(M)$. For regular graphs, all eigenvalues of the modularity matrix $M$ are the same as those of the adjacency matrix $A$, except that $\lambda_1(A)$ is replaced with a zero eigenvalue.

C. Maximizing the modularity

Maximizing the modularity for $c > 2$ is an NP-complete problem [21,23]. However, Stoer and Wagner [24] presented a highly efficient algorithm to solve the minimum-cut problem when $c = 2$, such that the modularity clustering for $c = 2$ is not NP hard. In this section, we consider the spectral form (13) to deduce further insight along the lines of Newman in [25]. We define the non-negative weights

$$ v_j = \sum_{k=1}^{c} (w_j^T s_k)^2, $$

and the modularity in Eq. (13) becomes

$$ m = \frac{1}{2L} \sum_{j=1}^{N} v_j \lambda_j(M). \tag{19} $$

First, assume that $v_1 = \sum_{k=1}^{c} (w_1^T s_k)^2 = N$, then Eq. (16) implies that $w_j^T s_k = 0$ for all $1 \leq k \leq c$ and all $j > 1$. This means that the vectors $s_1, s_2, \ldots, s_c$ (or any linear combination of them, apart from $\sum_{k=1}^{c} s_k = u$) are orthogonal to the eigenvectors $w_2, w_3, \ldots, w_N$. Since eigenvectors span the


$N$-dimensional space, it means that all $s_k$'s must be parallel or proportional to $w_1$. However, the vectors $s_1, s_2, \ldots, s_c$ are orthogonal; hence, this is not possible. If there are $c$ clusters, it seems that we must require in Eq. (16) that

$$ \sum_{j=1}^{r} v_j = N, $$

for $r \geq c - 1$, and that necessarily at least $c - 1$ eigenvalues in Eq. (13) play a role, because then $w_j^T s_k = 0$ for all $1 \leq k \leq c$ and all $j \geq c$. This means that the vectors $s_1, s_2, \ldots, s_c$ (or any linear combination of them, apart from $\sum_{k=1}^{c} s_k = u$ that is proportional to $w_q = \frac{u}{\sqrt{N}}$) are orthogonal to the eigenvectors $w_c, w_{c+1}, \ldots, w_N$ or, equivalently, that each community vector $s_k$ (with 0 or 1 components) is a linear combination of the first $c - 1$ eigenvectors of $M$, $s_k = \sum_{j=1}^{c-1} \beta_{kj} w_j$, where $\beta_{kj} = w_j^T s_k = \sum_{q \in C_k} (w_j)_q$. Together with the equation $\sum_{k=1}^{c} s_k = u$, this set of equations is sufficient to determine all $c$ community vectors. How to choose these coefficients $\beta_{kj}$ to optimize Eq. (13) remains a difficult problem. For example, we may adopt the strategy to choose the weights $v_j$, so that $v_j \geq v_{j+1}$ for each $1 \leq j < c$. However, by incorporating an additional eigenvector $w_c$, it can be possible to increase the weight $v_1$ corresponding to $\lambda_1(M)$ more [even though a lower $\lambda_c(M)$ is included in sum (13) and an extra $v_c$ in $\sum_{j=1}^{c} v_j = N$, such that the average weight $E[v] = \frac{N}{c}$ decreases]. Numerical computations (see Fig. 4 below) show that all vector components in Eq. (13) seem to play a role, but that the first eigenvector $w_1$ is by far the most important.

For a regular graph with degree $r$, it is known [22] that $\lambda_1(M) = \lambda_2(A) < \lambda_1(A) = r = E[D]$ such that bound (18) equals

$$ m_{\text{regular graph}} \leq \frac{\lambda_2(A)}{\lambda_1(A)} \left( 1 - \frac{1}{c} \right) < 1 - \frac{1}{c}. $$

Since in general $\lambda_1(A) > \lambda_1(M) \geq \lambda_2(A)$, we observe that the lowest upper bound in Eq. (18) is reached for regular graphs. Another consequence of the interlacing is that in graphs with large spectral gap $\lambda_1(A) - \lambda_2(A)$, the largest eigenvalue $\lambda_1(M)$ can be much smaller than $\lambda_1(A)$. For example, the complete graph that possesses the largest possible spectral gap equal to $N$ has $\lambda_1(M) = 0$, the lowest possible largest eigenvalue of any modularity matrix $M$. Intuitively, graphs with large spectral gap are difficult to tear apart, which means that they already form a quite tight community or cluster and that further dividing such a graph is hardly possible, resulting in a low modularity $m$.

III. ASSORTATIVITY

Networks where high-degree nodes preferentially connect to other high-degree nodes are called assortative, whereas networks where high-degree nodes connect to low-degree nodes are called disassortative. Assortativity is measured by the linear degree correlation coefficient $\rho_D$, but we use here assortativity and $\rho_D$ interchangeably.

In [26], it has been shown that increasing the assortativity also increases the lower bound for $\lambda_1(A)$, but not necessarily $\lambda_2(A)$. However, by degree-preserving rewiring the matrix $\frac{1}{2L} d \cdot d^T$ is not changed, only $A$ is. This implies that increasing $\rho_D$ via degree-preserving rewiring does not change the sum of the eigenvalues of $M$, but it may increase the upper bound of $\lambda_1(M)$ and, hence, via Eq. (18) also the modularity. In any case, it will not decrease the upper bound for $\lambda_1(M)$, as follows from the interlacing property above.

A. Assortativity, maximum modularity, and the spectrum of A and M

Figure 1 shows the 20 largest (in absolute value) eigenvalues of both the adjacency matrix $A$ and the modularity matrix $M$ of a realization of the Barabási-Albert scale-free graph with $N = 500$ nodes, $L = 1960$ links, $E[D] \approx 7.85$, and $\rho_D \simeq -0.05$. Each elementary degree-preserving rewiring step,$^{3}$ specified by the lemma in [26] that changes the assortativity, results in a connected different graph (with the same degree vector). From one rewiring step to another, the large majority of eigenvalues of $A$ (and, similarly, of $M$) are interlaced as shown in [26], while always, the eigenvalues of $M$ are interlaced by those of $A$. The white band of eigenvalues around zero in Fig. 1 thus contains 480 smaller eigenvalues (which are not shown because the picture would color completely). Also the relationship between assortativity (via $\rho_D$) and the percentage of degree-preserving rewired links is shown, together with the hop count distribution of the original graph (no rewiring), and that at $\rho_{D\max}$ and $\rho_{D\min}$.

$^{3}$ A degree-preserved rewired Barabási-Albert scale-free graph (and similarly an Erdős-Rényi random graph), where $\rho_D$ is significantly changed, is not a Barabási-Albert scale-free graph anymore, which is characterized by $\rho_D \to 0$ asymptotically (and $\rho_D = 0$ for Erdős-Rényi random graphs as shown in [26]).

FIG. 1. (Color online) The 20 largest (in absolute value) eigenvalues of the adjacency matrix $A$ and of the modularity matrix $M$ as function of the percentage degree-preserving rewired links in an instance of the Barabási-Albert scale-free graph with $N = 500$ and $L = 1960$ links. The relation between $\rho_D$ and the percentage of rewired links as well as three hop count distributions is also plotted.

Figures 2 and 3 show, for a realization of the rewired Barabási-Albert scale-free graph and of the rewired Erdős-Rényi random graph with equal $N$ and almost equal number of links $L$, respectively, how the first few eigenvalues of the adjacency and modularity matrix vary with the linear degree correlation coefficient $\rho_D$. We observe for the two realizations of different classes of graphs that, in the disassortativity region ($\rho_D < 0$), $\lambda_1(M)$ follows $\lambda_2(A)$ reasonably well, while in the assortativity region $\lambda_1(M)$ starts increasing toward $\lambda_1(A)$. However, $\lambda_1(M)$ never reaches $\lambda_1(A)$, because $\lambda_1(M)$ is always strictly smaller than $\lambda_1(A)$, as proved in [22]. As a consequence and assuming that there is room to increase or decrease $\rho_D$, the larger the spectral gap, $\lambda_1(A) - \lambda_2(A)$, the larger the potential increase in modularity that can be achieved via degree-preserving rewiring.

Figures 2 and 3 also illustrate that the maximum modularity is roughly proportional to $\lambda_1(M)$ as long as $\lambda_1(M)$ is close to $\lambda_2(A)$. The maximum modularity has been computed by the approximate algorithm described in [27]. For increasing assortativity, the maximum modularity seems to increase faster than $\lambda_1(M)$. Apart from the extent in assortativity range, the rewired Barabási-Albert scale-free graph (Fig. 2) and the rewired Erdős-Rényi random graph (Fig. 3), both with the same number of nodes and almost the same number of links, behave surprisingly similarly, in spite of their different degree vectors.

FIG. 2. (Color online) The largest and second largest eigenvalues of the adjacency matrix $A$ and the largest eigenvalues of the modularity matrix $M$ versus the linear degree correlation coefficient $\rho_D$ for the Barabási-Albert graph with $N = 500$ nodes and $L = 1960$ links. The right-hand side axis shows the corresponding maximum modularity.

FIG. 3. (Color online) The largest and second largest eigenvalues of the adjacency matrix $A$ and the largest eigenvalues of the modularity matrix $M$ versus the linear degree correlation coefficient $\rho_D$ for the Erdős-Rényi random graph $G_p(N)$ with $N = 500$ nodes, $L = 1955$ links, and $\rho_D \simeq -0.01$. Thus, the link density $p = L / \binom{N}{2}$ equals $p \simeq 1.25 p_c$, where $p_c \sim \frac{\log N}{N}$ is the critical disconnectivity threshold. The right-hand side axis shows the corresponding maximum modularity.

FIG. 4. (Color online) The product $v_j \lambda_j(M)$ for each component $j$ for three instances: the original BA graph (red), the maximum assortativity rewired version (black), and the maximum disassortative rewired graph (green). The sum over all $j$ equals $2Lm$ according to Eq. (13). The inset shows the weights $v_j$ (at the left on logarithmic scale) and the eigenvalues $\lambda_j(M)$ (at the right in bold dashed lines).

For three instances of the rewired Barabási-Albert scale-free graph, Fig. 4 draws each term $v_j \lambda_j(M)$ in the spectral form (13) of the modularity, as well as each weight $v_j$ and eigenvalue $\lambda_j(M)$ for $1 \leq j \leq 500$. The inset in Fig. 4 shows that the weights $v_j$ vary irregularly, as a noisy signal around the mean 1, and that a very high peak (on a logarithmic scale) is observed corresponding to $\lambda_q(M) = 0$, which inspired the general bound (18). Apart from that peak corresponding to the eigenvector $w_q = \frac{u}{\sqrt{N}}$, the weights roughly decrease with the component or eigenvector $j$. Apart from a few values, the resulting product $v_j \lambda_j(M)$ is decreasing in $j$. Although only shown for the Barabási-Albert scale-free graph, these observations hold generally: the first term $v_1 \lambda_1(M)$ contributes dominantly to the modularity (19) and illustrates that bound (18) can be sharp. The other terms

$j > 1$ are initially positive, but then negative (because the eigenvalues become negative), and the whole sum is needed to compute the modularity. Remarkably, a huge cancellation in the sum occurs because we found that sum (13) is close to its first term.
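The spectral form can be checked numerically on a toy graph. The sketch below is only an illustration under stated assumptions, not the procedure used for the figures: the graph (two triangles joined by one link), the partition, and all function names are hypothetical, and a plain cyclic Jacobi iteration stands in for whichever eigensolver was actually used. It computes the weights $v_j$ of Eq. (19), checks condition (16), and compares the spectral sum (13) with definition (1).

```python
import math

def modularity_matrix(adj):
    """Return M = A - d d^T / (2L) as in Eq. (12), together with 2L."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_l = sum(deg)
    m_mat = [[adj[i][j] - deg[i] * deg[j] / two_l for j in range(n)]
             for i in range(n)]
    return m_mat, two_l

def jacobi_eigh(sym, sweeps=50, tol=1e-22):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric matrix,
    obtained with cyclic Jacobi rotations (an assumed stand-in solver)."""
    n = len(sym)
    a = [row[:] for row in sym]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that annihilates the entry a[p][q].
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.sqrt(1.0 + tau * tau))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = t * c
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J accumulates the eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

def spectral_weights(adj, clusters):
    """Eigenvalues of M and weights v_j = sum_k (w_j^T s_k)^2 of Eq. (19)."""
    m_mat, two_l = modularity_matrix(adj)
    eigvals, w = jacobi_eigh(m_mat)
    # w_j^T s_k is the sum of the components of w_j over the nodes of cluster k.
    weights = [sum(sum(w[q][j] for q in cluster) ** 2 for cluster in clusters)
               for j in range(len(adj))]
    return eigvals, weights, two_l

def modularity_definition(adj, clusters):
    """Modularity from definition (1): sum of M entries inside clusters over 2L."""
    m_mat, two_l = modularity_matrix(adj)
    return sum(m_mat[i][j] for cl in clusters for i in cl for j in cl) / two_l

# Hypothetical toy graph: two triangles joined by the single link 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
adj = [[0] * n for _ in range(n)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1
clusters = [[0, 1, 2], [3, 4, 5]]

eigvals, weights, two_l = spectral_weights(adj, clusters)
m_spec = sum(vj * lam for vj, lam in zip(weights, eigvals)) / two_l  # Eq. (19)
m_def = modularity_definition(adj, clusters)                        # Eq. (1)
print("sum of weights:", sum(weights))
print("spectral modularity:", m_spec, "definition:", m_def)
```

For this partition both clusters have the same degree sum ($D_{C_1} = D_{C_2} = 7$), so the modularity coincides with the upper bound (4), $1 - \frac{1}{2} - \frac{1}{7} = \frac{5}{14}$. Sorting the products $v_j \lambda_j(M)$ shows how positive and negative terms cancel in the sum.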

"Shifting-the-weights" principle

Since the eigenvalues of $M$ are ordered as usual, $\lambda_1(M) \geq \lambda_2(M) \geq \cdots \geq \lambda_N(M)$, the maximum modularity is achieved by shifting in Eq. (13) as much weight as possible to the larger eigenvalues, which we call the shifting-the-weights principle. Figure 4 supports this principle: fewer eigenvalues in Eq. (13) imply that the individual weights $v_j$ are higher on average due to condition (16). Furthermore, Fig. 4 illustrates that, especially in the high assortativity regime, the first $c$ eigenvalues are clearly dominant, as argued in Sec. II C.

When the largest eigenvalues are close to each other, incorporating additional eigenvalues may increase the weights on the largest eigenvalues, which leads to a larger modularity. As the assortativity increases, the largest eigenvalues of matrix $M$ seem to be dispelled from each other (see Fig. 1). In other words, the spacings $\lambda_1(M) - \lambda_2(M), \lambda_2(M) - \lambda_3(M), \ldots$ between the largest eigenvalues of the matrix $M$ seem to grow as the assortativity increases. For large $\rho_D$, the maximal modularity includes a minimum amount of eigenvalues in Eq. (13).$^{4}$ The chance to increase the modularity $m$ by incorporating more eigenvalues is small because (a) the average weight is reduced due to condition (16) and (b) the extra eigenvalues included are far smaller when the spacings of the leading eigenvalues are large. As a result, the modularity increases with increasing assortativity $\rho_D$ faster than $\lambda_1(M)$ as shown in Figs. 2 and 3, because $v_1$ also increases with $\rho_D$. Since the smallest number of eigenvalues that play a role in the maximum modularity is $c - 1$, the "shifting-the-weights" principle also implies that the number of clusters $c$ decreases with increasing assortativity, which is also observed in Fig. 5.

$^{4}$ This observation is consistent with what we can deduce from the modularity bound (A1) expressed in terms of the eigenvalue spacings: when the spacing is generally large, including an extra eigenvalue [i.e., $c \to c + 1$ in Eq. (A1) in Appendix A] will cause a decrease in the last term $N \lambda_{c+1}(M)$ of Eq. (A1), which is hardly exceeded by the increase in the first sum, due to the large weight $N$ in that last term and condition (16).

B. Number and size of clusters when maximizing the modularity

Figure 5 illustrates that the number of clusters as a function of $\rho_D$ in the rewired Erdős-Rényi random graph is roughly a stretch-out version of that in the rewired Barabási-Albert scale-free graph. We also observe that the number of clusters "inversely" correlates with the maximum modularity in Figs. 2 and 3: a high maximum modularity seems to correspond to a low number of clusters (and conversely). Finally, Fig. 5 also shows the cluster estimate $c^* = 2 \sqrt{2} L / D_{\Delta C}$ in Eq. (7).

FIG. 5. (Color online) The number of clusters for both the Barabási-Albert scale-free graph (corresponding to Fig. 2) and the Erdős-Rényi random graph (corresponding to Fig. 3) as a function of the linear degree correlation coefficient $\rho_D$. For both graphs, also the estimate in Eq. (7) is drawn in dotted line.

Figures 6 and 7 show details per cluster of the rewired Barabási-Albert scale-free graph and Erdős-Rényi random graph for the extreme assortativity and disassortativity and two intermediate values of $\rho_D$. The cluster sizes are ranked from largest to smallest. Apart from the extreme assortative graphs, the cluster sizes are about the same order of magnitude, as well as the average degree of nodes in each cluster, which is drawn in the inset. The larger the assortativity, the fewer the number of clusters (Fig. 5) and the larger the differences in cluster size and average degree per cluster.

FIG. 6. (Color online) The cluster size of the $j$th largest cluster for four values of $\rho_D$ of the Barabási-Albert scale-free graph (corresponding to Fig. 2). The inset shows the average degree for each cluster $j$.

We mention that community sizes in real-world networks have been reported to follow a power law [5]. Our graph instances are far too small ($N = 500$) to see a power-law behavior. However, only in highly assortative graphs we expect a power law assuming that our results can be scaled to larger sizes of $N$. In disassortative graphs, the sizes of the clusters


of the maximized modularity are roughly the same, as well as the number of links (or average degree). For these graphs where $\mathrm{Var}[n_C] \simeq 0$, the spectral upper bound (18) seems promising.

FIG. 7. (Color online) The cluster size of the $j$th largest cluster for four values of $\rho_D$ of the Erdős-Rényi random graph (corresponding to Fig. 3). The inset shows the average degree for each cluster $j$.

C. Hop count and the assortativity

The hop count $H$ is the number of links in an arbitrary shortest path in the graph. The average hop count $E[H]$ is a measure for the distance between nodes in the graph and reflects the efficiency of transport in the graph: a short hop count means that transport only requires a few links and that the end-to-end delay is likely small (provided it is a sum over the number of hops in a path). Figure 8 shows how the average hop count $E[H]$ plus or minus one standard deviation $\sigma_H = \sqrt{\mathrm{Var}[H]}$ changes with the linear degree correlation coefficient $\rho_D$. In both realizations, we observe from Fig. 8 that the standard deviation $\sigma_H$ is smaller than the average, except for the extreme assortativity in the Barabási-Albert scale-free graph that has a broad bell-shaped hop count distribution (see Fig. 1). In addition, we observe that $E[H] + \sigma_H$ correlates well with the maximum modularity.

FIG. 8. (Color online) The average hop count $E[H]$ plus or minus one standard deviation $\sigma_H = \sqrt{\mathrm{Var}[H]}$ for both the Barabási-Albert scale-free graph (corresponding to Fig. 2) and the Erdős-Rényi random graph (corresponding to Fig. 3) as a function of the linear degree correlation coefficient $\rho_D$.

An increase in average hop count $E[H]$ with increasing assortativity agrees with the increase in the largest eigenvalue $\lambda_1(A)$ of the adjacency matrix as a function of $\rho_D$. For example, virus or information spread (see, e.g., [28]) is mainly possible when the effective spreading strength exceeds a threshold $\tau_c = 1/\lambda_1(A)$. Thus, when $\lambda_1(A)$ is large (corresponding to a large $\rho_D$), virus spread is easy, although the hop count is large, while the opposite holds for the disassortative region. The increase in the largest eigenvalue $\lambda_1(A)$ suggests that virus or information spread over a part of the network becomes easier. Actually, it is easy to spread over the high-degree nodes, which are well interconnected in an assortative network. However, the spread over the entire network, especially over the low-degree nodes, can be difficult, as implied by the increased average hop count.

For sparse large graphs, the average hop count is approximately (see [29], pp. 343–345)

$$E[H] \simeq \frac{\log N}{\log \mu},$$

where $\mu$ is the average degree minus 1. Assuming that the number of nodes of each cluster is more or less the same (see Figs. 6 and 7), the average hop count in a modular or hierarchical graph approximately consists of the average number of intercommunity hops multiplied by the average number of intracommunity hops,

$$E[H] \simeq \frac{\log c}{\log \mu_{\mathrm{inter}}} \, \frac{\log (N/c)}{\log \mu_{\mathrm{intra}}},$$

from which we deduce that the minimum average hop count occurs, ignoring the integer nature of $c$, when $c = \sqrt{N}$. This asymptotic and approximate estimate agrees reasonably well with Figs. 5 and 8, where $\sqrt{N} \approx 22$. This square-root law $c = O(\sqrt{N})$ is remarkable, because earlier Fortunato [10] has constructed an example that maximizes the modularity, from which he found that $c = \sqrt{L}$. We can indeed show, more generally, that a maximization of the modularity with respect to the number of clusters $c$ results in $c = \sqrt{L} + r$, where the smaller terms are $r = O(1)$. Hence, when the modularity is maximized for sufficiently large graphs, it seems likely that the size of the clusters scales as $O(\sqrt{N})$, whereas the number of links in a cluster scales as $O(\sqrt{L})$, and that this holds for sparse graphs where $L = O(N)$, as follows from $c = O(\sqrt{N}) = \sqrt{L}$.

For both graphs, $E[H]$ increases as the assortative modules are forming in the graphs. Within the modules, the hop counts are lowered, but between the modules, the hop counts are increased, resulting in the observed increase in variance. For the rewired Barabási-Albert scale-free graph, the grouping of the highest-degree nodes causes several hop counts to increase as the efficient transport (originally, $E[H] \approx 3$) that is associated with scale-free graphs is lost. Thus, although the grouping of nodes with similar degrees does produce several short paths within the groups, the loss of the hubs as transport facilitators reduces the number of short paths. In spite of the large difference in the degree distributions (albeit the average degree $E[D] = \frac{2L}{N}$ is the same), Fig. 8 shows that the average hop count $E[H]$ plus or minus one standard deviation $\sigma_H = \sqrt{\mathrm{Var}[H]}$ is surprisingly similar in both graphs.

FIG. 9. (Color online) The normalized effective graph resistance $R_G / \binom{N}{2}$ for both the Barabási-Albert scale-free graph (corresponding to Fig. 2) and the Erdős-Rényi random graph (corresponding to Fig. 3) versus the linear degree correlation coefficient $\rho_D$.

D. Effective graph resistance

The effective graph resistance $R_G$ is defined (see, e.g., [22]) by

$$R_G = N \sum_{k=1}^{N-1} \frac{1}{\mu_k},$$

where $\mu_k$ is the $k$th largest eigenvalue of the Laplacian of the graph. The effective graph resistance $R_G$ measures the ease of communication in a graph. It can be shown [22] that

$$E[H] \ge \frac{R_G}{\binom{N}{2}}.$$

Figure 9 shows the normalized effective graph resistance $R_G / \binom{N}{2}$ as a function of the assortativity. While the average hop count of the rewired Barabási-Albert scale-free graph can be smaller than that of the rewired Erdős-Rényi random graph (Fig. 8), its corresponding effective graph resistance is always larger. Just as the average hop count $E[H]$, the normalized effective graph resistance $R_G / \binom{N}{2}$ of a graph seems to correlate highly with the modularity as a function of $\rho_D$. In addition, the sharp increase toward the maximum value of $\rho_D$ indicates that disconnectivity (characterized by $R_G \to \infty$) is likely to occur when $\rho_D$ would be further increased.

IV. REAL-WORLD GRAPHS

Figures 10–13 illustrate that similar phenomena, deduced above from instances of the Erdős-Rényi random graph and the Barabási-Albert scale-free graph, also occur in real-world networks. The details of these networks are found for the air transportation network of the USA in [26], for the peer-to-peer network Gnutella in [30], for the coappearance network of characters in the novel Les Miserables of Victor Hugo in [31], and for the protein residue network of the immunoglobulin 1A4J in [32].

We observe the general trend that (a) the largest eigenvalue $\lambda_1(M)$ of the modularity matrix is close to $\lambda_2(A)$, except for extreme values of $\rho_D$, and (b) the maximum modularity increases with sufficiently large $\rho_D$ and faster than $\lambda_1(M)$. That the trend is not universal is demonstrated in Fig. 13, where part (b) of the trend is not followed. This observed trend restricts the hypothesis in [7] that modularity seems to increase in evolutionary processes; it mainly does if these processes are assortative in nature. As explained in [26], degree-preserving rewiring may be regarded as a (simplified) evolutionary process and most complex networks are not assortative (which is in agreement with the data in [15]), because highly assortative networks are likely disconnected and lack synergy or diversity in their connection pattern.

FIG. 10. (Color online) The largest and second largest eigenvalues of the adjacency matrix $A$ and the largest eigenvalues of the modularity matrix $M$ versus the linear degree correlation coefficient $\rho_D$ for the air transportation network with $N = 2179$ nodes, $L = 31\,326$ links, and $E[D] \approx 28.8$. The right-hand side axis shows the corresponding maximum modularity.

FIG. 11. (Color online) The largest and second largest eigenvalues of the adjacency matrix $A$ and the largest eigenvalues of the modularity matrix $M$ versus the linear degree correlation coefficient $\rho_D$ for the Gnutella network with $N = 77$ nodes, $L = 254$ links, and $E[D] \simeq 6.5$. The right-hand side axis shows the corresponding maximum modularity.

FIG. 12. (Color online) The largest and second largest eigenvalues of the adjacency matrix $A$ and the largest eigenvalues of the modularity matrix $M$ versus the linear degree correlation coefficient $\rho_D$ for the network Les Miserables with $N = 737$ nodes, $L = 803$ links, and $E[D] \approx 2.2$. The right-hand side axis shows the corresponding maximum modularity.

FIG. 13. (Color online) The largest and second largest eigenvalues of the adjacency matrix $A$ and the largest eigenvalues of the modularity matrix $M$ versus the linear degree correlation coefficient $\rho_D$ for the protein network of immunoglobulin 1A4J with $N = 95$ nodes, $L = 213$ links, and $E[D] \approx 4.5$. The right-hand side axis shows the corresponding maximum modularity.

V. CONCLUSION

Several expressions (3), (9), and (10) and bounds (4), (5), and (18) for Newman's modularity are derived. These results reveal conditions for or properties of the maximum modularity of a network and provide proofs of earlier claims. The influence of the spectrum of the modularity matrix on the maximum modularity is discussed, from which we deduce a spectral bound (18). The second part of the paper investigates how the maximum modularity, the number of clusters, and the hop count of the shortest paths vary when the assortativity of the graph is changed via degree-preserving rewiring. Simulations suggest that, apart from the heavy assortative regime, the maximum modularity behaves as $\lambda_1(M)$, which, in turn, closely follows the second largest eigenvalue $\lambda_2(A)$ of the adjacency matrix $A$. Extensive simulations on several real-world complex networks show that the maximum modularity increases, the number of clusters decreases, and the average hop count and the effective graph resistance increase with increasing assortativity. The major driver to study the influence of degree-preserved rewiring on modularity is to shed light on evolutionary processes in nature. If degree-preserved rewiring can be regarded as a simplified model for an evolutionary process, then we show that networks, close to their assortative bounds, generally possess a clear modular structure.

ACKNOWLEDGMENT

The work of P.S. was supported by the National Agricultural Biosecurity Center (NABC) at Kansas State University.

APPENDIX A: ALTERNATIVE FORM OF THE MODULARITY IN TERMS OF THE EIGENVALUE SPACING

The partial (or Abel) summation,

$$\sum_{k=1}^{n} a_k b_k = \sum_{k=1}^{n-1} \left( \sum_{l=1}^{k} a_l \right) (b_k - b_{k+1}) + b_n \left( \sum_{l=1}^{n} a_l \right),$$

of $m^* = (2L)\,m = \sum_{j=1}^{N} v_j \lambda_j(M)$ equals

$$m^* = \sum_{j=1}^{N-1} \left( \sum_{k=1}^{j} v_k \right) [\lambda_j(M) - \lambda_{j+1}(M)] + \lambda_N(M) \sum_{k=1}^{N} v_k = \sum_{j=1}^{N-1} \left( \sum_{k=1}^{j} v_k \right) [\lambda_j(M) - \lambda_{j+1}(M)] + N \lambda_N(M),$$

and only the last term is negative, while all others are positive. Further, using condition (16) such that $\sum_{k=1}^{j} v_k + \sum_{k=j+1}^{N} v_k = N$, we can write

$$m^* = \sum_{j=1}^{c} \left( \sum_{k=1}^{j} v_k \right) [\lambda_j(M) - \lambda_{j+1}(M)] + \sum_{j=c+1}^{N-1} \left( N - \sum_{k=j+1}^{N} v_k \right) [\lambda_j(M) - \lambda_{j+1}(M)] + N \lambda_N(M)$$

$$\phantom{m^*} = \sum_{j=1}^{c} \left( \sum_{k=1}^{j} v_k \right) [\lambda_j(M) - \lambda_{j+1}(M)] + N \lambda_{c+1}(M) - \sum_{j=c+1}^{N-1} \left( \sum_{k=j+1}^{N} v_k \right) [\lambda_j(M) - \lambda_{j+1}(M)].$$
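The partial (Abel) summation used in Appendix A can be checked numerically. The following is a minimal sketch, not part of the original derivation. The helper name `abel_sum` is hypothetical, and the weights and "eigenvalues" are random stand-ins that only mimic the assumed properties: nonnegative $v_k$ with $\sum_k v_k = N$, and $\lambda_1 \ge \dots \ge \lambda_N$.

```python
import random


def abel_sum(a, b):
    """Evaluate sum_k a_k*b_k through the partial (Abel) summation form."""
    n = len(a)
    partial = 0.0  # running partial sum a_1 + ... + a_k
    total = 0.0
    for k in range(n - 1):
        partial += a[k]
        total += partial * (b[k] - b[k + 1])
    partial += a[n - 1]  # partial is now the full sum of a
    return total + b[n - 1] * partial


random.seed(1)
N = 8
# Hypothetical stand-ins (assumptions, not data from the paper):
# weights v_k >= 0 rescaled so that sum_k v_k = N,
# and "eigenvalues" sorted so that lambda_1 >= ... >= lambda_N.
raw = [random.random() for _ in range(N)]
v = [N * x / sum(raw) for x in raw]
lam = sorted((random.uniform(-2.0, 3.0) for _ in range(N)), reverse=True)

direct = sum(vj * lj for vj, lj in zip(v, lam))  # sum_j v_j * lambda_j
via_abel = abel_sum(v, lam)                      # spacing form of the same sum
print(abs(direct - via_abel) < 1e-9)
```

Because the weights sum to $N$, the boundary term returned by `abel_sum` is $N\lambda_N$, matching the last term of the first expression for $m^*$.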


Again, by shifting all the weight to the first $c$ clusters such that $\lambda_{c+1}(M) \ge 0$ and $v_k = 0$ for $k \ge c+1$, an upper bound of the modularity can be written in terms of the spacings $\lambda_j(M) - \lambda_{j+1}(M)$ as

$$m^* \le \sum_{j=1}^{c} \left( \sum_{k=1}^{j} v_k \right) [\lambda_j(M) - \lambda_{j+1}(M)] + N \lambda_{c+1}(M). \qquad \text{(A1)}$$

This expression shows that the larger indices of $j$ are more heavily weighted. Recall that the spacing $\Delta\lambda_j = \lambda_j(M) - \lambda_{j+1}(M) \ge 0$ is not necessarily decreasing with $j$.

The number of positive eigenvalues of $M$ is, due to interlacing, less (by 1 or 0) than the number of positive eigenvalues of $A$. The latter is larger than the independence number (see [22]), which equals the size of the largest coclique. Thus, when maximizing the modularity [such that $\lambda_{c+1}(M) \ge 0$], the number of clusters $c$ should be smaller than the independence number of the graph.

APPENDIX B: ALTERNATIVE MATRIX FORM FOR THE MODULARITY

Identities (8) and (9) also offer a matrix representation of the modularity as

$$m = \frac{1}{(2L)^2}\, u^T \left[ D_C D_C^T - \mathrm{diag}\!\left( D_{C_i}^2 \right) \right] u - \frac{L_{\mathrm{inter}}}{L}, \qquad \text{(B1)}$$

where $D_C$ is the $c \times 1$ vector of community degree sums, which equals

$$D_C = S^T d.$$

Using that relation gives

$$m = \frac{1}{(2L)^2}\, u^T D_C D_C^T u - \frac{1}{(2L)^2}\, D_C^T D_C - \frac{L_{\mathrm{inter}}}{L} = \frac{1}{(2L)^2}\, u^T S^T d\, d^T S u - \frac{1}{(2L)^2}\, d^T S S^T d - \frac{L_{\mathrm{inter}}}{L}.$$

With $S u_{c \times 1} = u_{N \times 1}$ and $u^T d = 2L$, we arrive at

$$m = 1 - \frac{L_{\mathrm{inter}}}{L} - \frac{1}{(2L)^2}\, d^T S S^T d,$$

which equals Eq. (2).
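The final expression of Appendix B can be compared numerically with definition (1). The sketch below is an illustration under stated assumptions, not the authors' code. The function names are hypothetical, the toy graph (two triangles joined by one link) is an assumed example, and $S$ is taken to be the $N \times c$ community membership matrix, so that $S^T d$ collects the degree sum of each community.

```python
def modularity_newman(adj, labels):
    """Eq. (1): m = (1/2L) * sum_ij (a_ij - d_i*d_j/2L) * 1{same community}."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_l = sum(deg)  # sum of degrees equals 2L
    total = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                total += adj[i][j] - deg[i] * deg[j] / two_l
    return total / two_l


def modularity_matrix_form(adj, labels):
    """Appendix B form: m = 1 - L_inter/L - d^T S S^T d / (2L)^2."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_l = sum(deg)
    communities = sorted(set(labels))
    # D_C = S^T d: degree sum of each community (assumed meaning of S)
    d_c = [sum(deg[i] for i in range(n) if labels[i] == c) for c in communities]
    # L_inter: number of links whose end nodes lie in different communities
    l_inter = sum(adj[i][j] for i in range(n) for j in range(i + 1, n)
                  if labels[i] != labels[j])
    return 1 - l_inter / (two_l / 2) - sum(x * x for x in d_c) / two_l ** 2


# Assumed toy graph: two triangles {0,1,2} and {3,4,5} joined by link 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n_nodes = 6
adj = [[0] * n_nodes for _ in range(n_nodes)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1
labels = [0, 0, 0, 1, 1, 1]

m_def = modularity_newman(adj, labels)
m_mat = modularity_matrix_form(adj, labels)
print(m_def, m_mat)
```

For this toy partition, $L = 7$, $L_{\mathrm{inter}} = 1$, and both community degree sums equal 7, so both functions should give $1 - 1/7 - 98/196 = 5/14$.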

[1] X. Chu, J. Guan, Z. Zhang, and S. Zhou, J. Stat. Mech.: Theory Exp. (2009), P07043.
[2] J. P. Gleeson, Phys. Rev. E 77, 046117 (2008).
[3] J. Omic, J. Martin Hernandez, and P. Van Mieghem, 22nd International Teletraffic Congress (ITC 22), Amsterdam, 2010.
[4] M. E. J. Newman and M. Girvan, Phys. Rev. E 69, 026113 (2004).
[5] S. Fortunato, Phys. Rep. 486, 75 (2010).
[6] I. A. Kovács, R. Palotai, M. S. Szalay, and P. Csermely, PLoS One 5, e12528 (2010).
[7] R. Guimerà, M. Sales-Pardo, and L. A. Nunes Amaral, Phys. Rev. E 70, 025101(R) (2004).
[8] L. Danon, A. Díaz-Guilera, J. Duch, and A. Arenas, J. Stat. Mech.: Theory Exp. (2005), P09008.
[9] J. Reichardt and S. Bornholdt, Phys. Rev. E 74, 016110 (2006).
[10] S. Fortunato and M. Barthélemy, Proc. Natl. Acad. Sci. U.S.A. 104, 36 (2007).
[11] J. Kumpula, J. Saramäki, K. Kaski, and J. Kertész, Eur. Phys. J. B 56, 41 (2007).
[12] A. Arenas, A. Fernández, and S. Gómez, New J. Phys. 10, 053039 (2008).
[13] B. H. Good, Y. de Montjoye, and A. Clauset, Phys. Rev. E 81, 046106 (2010).
[14] M. E. J. Newman, Phys. Rev. Lett. 89, 208701 (2002).
[15] M. E. J. Newman, Phys. Rev. E 67, 026126 (2003).
[16] M. E. J. Newman and M. Girvan, Mixing Patterns and Community Structure in Networks, Statistical Mechanics of Complex Networks (Springer, Berlin, 2003), pp. 66–87.
[17] R. Xulvi-Brunet and I. Sokolov, Phys. Rev. E 70, 066102 (2004).
[18] M. Brede and S. Sinha, e-print arXiv:cond-mat/0507709.
[19] P. Holme and J. Zhao, Phys. Rev. E 75, 046111 (2007).
[20] Z. Jing, T. Lin, Y. Hong, L. Jian-Hua, C. Zhi-Wei, and L. Yi-Xue, Chin. Phys. 16, 3571 (2007).
[21] M. E. J. Newman, Proc. Natl. Acad. Sci. U.S.A. 103, 8577 (2006).
[22] P. Van Mieghem, Graph Spectra for Complex Networks (Cambridge University Press, Cambridge, UK, 2010).
[23] U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer, Z. Nikoloski, and D. Wagner, IEEE Trans. Knowl. Data Eng. 20, 172 (2008).
[24] M. Stoer and F. Wagner, J. ACM 44, 585 (1997).
[25] M. E. J. Newman, Phys. Rev. E 74, 036104 (2006).
[26] P. Van Mieghem, H. Wang, X. Ge, S. Tang, and F. A. Kuipers, Eur. Phys. J. B 76, 643 (2010).
[27] A. Clauset, M. E. J. Newman, and C. Moore, Phys. Rev. E 70, 066111 (2004).
[28] P. Van Mieghem, J. Omic, and R. E. Kooij, IEEE/ACM Trans. Netw. 17, 1 (2009).
[29] P. Van Mieghem, Performance Analysis of Communications Systems and Networks (Cambridge University Press, Cambridge, UK, 2006).
[30] Gnutella: Internet Peer-to-Peer Networks, http://snap.stanford.edu/data/
[31] D. E. Knuth, The Stanford GraphBase: A Platform for Combinatorial Computing (Addison-Wesley, Reading, MA, 1993).
[32] Protein Data Bank (PDB) Codes, http://www.pdb.org/pdb/explore/explore.do?structureId=1A4J
[33] M. E. J. Newman, Networks: An Introduction (Oxford University Press, New York, 2010).
