
Assortativity and Mixing
Complex Networks, CSYS/MATH 303, Spring, 2011

Prof. Peter Dodds
Department of Mathematics & Statistics
Center for Complex Systems
Vermont Advanced Computing Center
University of Vermont

Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.

Outline

- Definition
- General mixing
- Assortativity by degree
- Contagion
- References

Basic idea:

- Random networks with arbitrary degree distributions cover much territory but do not represent all networks.
- Moving away from pure random networks was a key first step.
- We can extend in many other directions, and a natural one is to introduce correlations between different kinds of nodes.
- Node attributes may be anything, e.g.:
  1. degree
  2. demographics (age, gender, etc.)
  3. group affiliation
- We speak of mixing patterns, correlations, biases...
- Networks are still random at base but now have more global structure.
- Build on work by Newman [4, 5], and Boguñá and Serrano [1].

General mixing between node categories

- Assume types of nodes are countable, and are assigned numbers 1, 2, 3, ...
- Consider networks with directed edges.

  $$e_{\mu\nu} = \Pr(\text{an edge connects a node of type } \mu \text{ to a node of type } \nu)$$
  $$a_\mu = \Pr(\text{an edge comes from a node of type } \mu)$$
  $$b_\nu = \Pr(\text{an edge leads to a node of type } \nu)$$

- Write $E = [e_{\mu\nu}]$, $\vec{a} = [a_\mu]$, and $\vec{b} = [b_\nu]$.
- Requirements:

  $$\sum_\mu \sum_\nu e_{\mu\nu} = 1, \quad \sum_\nu e_{\mu\nu} = a_\mu, \quad \text{and} \quad \sum_\mu e_{\mu\nu} = b_\nu.$$

Notes:

- Varying $e_{\mu\nu}$ allows us to move between the following:
  1. Perfectly assortative networks where nodes only connect to like nodes, and the network breaks into subnetworks. Requires $e_{\mu\nu} = 0$ if $\mu \neq \nu$ and $\sum_\mu e_{\mu\mu} = 1$.
  2. Uncorrelated networks (as we have studied so far). For these we must have independence: $e_{\mu\nu} = a_\mu b_\nu$.
  3. Disassortative networks where nodes connect to nodes distinct from themselves.
- Disassortative networks can be hard to build and may require constraints on the $e_{\mu\nu}$.
- Basic story: level of assortativity reflects the degree to which nodes are connected to nodes within their group.

Correlation coefficient:

- Quantify the level of assortativity with the following assortativity coefficient [5]:

  $$r = \frac{\sum_\mu e_{\mu\mu} - \sum_\mu a_\mu b_\mu}{1 - \sum_\mu a_\mu b_\mu} = \frac{\operatorname{Tr} E - \|E^2\|_1}{1 - \|E^2\|_1}$$

  where $\|\cdot\|_1$ is the 1-norm = sum of a matrix's entries.
- $\operatorname{Tr} E$ is the fraction of edges that are within groups.
- $\|E^2\|_1$ is the fraction of edges that would be within groups if connections were random.
- $1 - \|E^2\|_1$ is a normalization factor so $r_{\max} = 1$.
- When $\operatorname{Tr} E = 1$, we have $r = 1$.
- When $e_{\mu\mu} = a_\mu b_\mu$, we have $r = 0$.
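As a concrete check of the definitions of $E$, $\vec{a}$, $\vec{b}$, and $r$, here is a minimal Python sketch. It builds the mixing matrix from a directed edge list, recovers the marginals as row and column sums, and evaluates the assortativity coefficient. The function names, the toy edge list, and the node-to-type mapping are illustrative assumptions and are not taken from the lecture.

```python
def mixing_matrix(edges, node_type, n_types):
    """E[mu][nu] = fraction of directed edges from a type-mu node to a type-nu node."""
    E = [[0.0] * n_types for _ in range(n_types)]
    for source, target in edges:
        E[node_type[source]][node_type[target]] += 1.0 / len(edges)
    return E


def marginals(E):
    """a[mu] = sum over nu of E[mu][nu]; b[nu] = sum over mu of E[mu][nu]."""
    a = [sum(row) for row in E]
    b = [sum(col) for col in zip(*E)]
    return a, b


def assortativity(E):
    """r = (Tr E - sum_mu a_mu b_mu) / (1 - sum_mu a_mu b_mu)."""
    a, b = marginals(E)
    trace = sum(E[m][m] for m in range(len(E)))
    random_within = sum(x * y for x, y in zip(a, b))  # equals ||E^2||_1
    return (trace - random_within) / (1.0 - random_within)


# Hypothetical toy network: four nodes, two types (0 and 1), six directed edges.
node_type = {0: 0, 1: 0, 2: 1, 3: 1}
edges = [(0, 1), (1, 0), (2, 3), (3, 2), (0, 2), (3, 1)]
E = mixing_matrix(edges, node_type, n_types=2)
r = assortativity(E)
```

For this toy network, four of the six edges stay within a type, so $\operatorname{Tr} E = 2/3$, $\sum_\mu a_\mu b_\mu = 1/2$, and $r$ is approximately $1/3$. A purely diagonal $E$ gives $r = 1$, and any $E$ with $e_{\mu\nu} = a_\mu b_\nu$ gives $r = 0$.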

Notes on the correlation coefficient:

- $r = -1$ is inaccessible if three or more types are present.
- Disassortative networks simply have nodes connected to unlike nodes; there is no measure of how unlike the nodes are.
- Minimum value of $r$ occurs when all links are between non-like nodes: $\operatorname{Tr} E = 0$.

  $$r_{\min} = \frac{-\|E^2\|_1}{1 - \|E^2\|_1}$$

  where $-1 \le r_{\min} < 0$.

Scalar quantities

- Now consider nodes defined by a scalar integer quantity.
- Examples: age in years, height in inches, number of friends, ...
- $e_{jk} = \Pr(\text{a randomly chosen edge connects a node with value } j \text{ to a node with value } k)$.
- $a_j$ and $b_k$ are defined as before.
- Can now measure correlations between nodes based on this scalar quantity using the standard Pearson correlation coefficient:

  $$r = \frac{\sum_j \sum_k jk\,(e_{jk} - a_j b_k)}{\sigma_a \sigma_b} = \frac{\langle jk \rangle - \langle j \rangle_a \langle k \rangle_b}{\sqrt{\langle j^2 \rangle_a - \langle j \rangle_a^2}\,\sqrt{\langle k^2 \rangle_b - \langle k \rangle_b^2}}$$

- This is the observed normalized deviation from randomness in the product $jk$.

Degree-degree correlations

- Natural correlation is between the degrees of connected nodes.
- Now define $e_{jk}$ with a slight twist:

  $$e_{jk} = \Pr(\text{an edge connects a degree } j+1 \text{ node to a degree } k+1 \text{ node})$$
  $$\phantom{e_{jk}} = \Pr(\text{an edge runs between a node of in-degree } j \text{ and a node of out-degree } k)$$

- Useful for calculations (as per $R_k$).
- Important: Must separately define $P_0$ as the $\{e_{jk}\}$ contain no information about isolated nodes.
- Directed networks still fine but we will assume from here on that $e_{jk} = e_{kj}$.
- Notation reconciliation for undirected networks:

  $$r = \frac{\sum_j \sum_k jk\,(e_{jk} - R_j R_k)}{\sigma_R^2}$$

  where, as before, $R_k$ is the probability that a randomly chosen edge leads to a node of degree $k + 1$, and

  $$\sigma_R^2 = \sum_j j^2 R_j - \Big[\sum_j j R_j\Big]^2.$$

Error estimate for r:

- Remove edge $i$ and recompute $r$ to obtain $r_i$.
- Repeat for all edges and compute using the jackknife method [2]:

  $$\sigma_r^2 = \sum_i (r_i - r)^2.$$

- Mildly sneaky as variables need to be independent for us to be truly happy and edges are correlated...

Measurements of degree-degree correlations

Table II of Newman [5] (Phys. Rev. E 67, 026126, 2003): size n, degree assortativity coefficient r, and expected error σ_r on the assortativity, for a number of social, technological, and biological networks, both directed and undirected. Social networks: coauthorship networks of (a) physicists and biologists and (b) mathematicians, in which authors are connected if they have coauthored one or more articles in learned journals; (c) collaborations of film actors in which actors are connected if they have appeared together in one or more movies; (d) directors of Fortune 1000 companies for 1999, in which two directors are connected if they sit on the board of directors of the same company; (e) romantic (not necessarily sexual) relationships between students at a U.S. high school; (f) network of email address books of computer users on a large computer system, in which an edge from user A to user B indicates that B appears in A's address book. Technological networks: (g) network of high voltage transmission lines in the Western States Power Grid of the United States; (h) network of direct peering relationships between autonomous systems on the Internet, April 2001; (i) network of hyperlinks between pages in the World Wide Web domain nd.edu, circa 1999; (j) network of dependencies between software packages in the GNU/Linux operating system, in which an edge from package A to package B indicates that A relies on components of B for its operation. Biological networks: (k) protein-protein interaction network in the yeast S. cerevisiae; (l) metabolic network of the bacterium E. coli; (m) neural network of the nematode worm C. elegans; trophic interactions between species in the food webs of (n) Ythan Estuary, Scotland and (o) Little Rock Lake, Wisconsin.

Group          Network                        Type        Size n      Assortativity r   Error σ_r
Social         (a) Physics coauthorship       undirected     52 909        0.363          0.002
               (a) Biology coauthorship       undirected  1 520 251        0.127          0.0004
               (b) Mathematics coauthorship   undirected    253 339        0.120          0.002
               (c) Film actor collaborations  undirected    449 913        0.208          0.0002
               (d) Company directors          undirected      7 673        0.276          0.004
               (e) Student relationships      undirected        573       -0.029          0.037
               (f) Email address books        directed       16 881        0.092          0.004
Technological  (g) Power grid                 undirected      4 941       -0.003          0.013
               (h) Internet                   undirected     10 697       -0.189          0.002
               (i) World Wide Web             directed      269 504       -0.067          0.0002
               (j) Software dependencies      directed        3 162       -0.016          0.020
Biological     (k) Protein interactions       undirected      2 115       -0.156          0.010
               (l) Metabolic network          undirected        765       -0.240          0.007
               (m) Neural network             directed          307       -0.226          0.016
               (n) Marine food web            directed          134       -0.263          0.037
               (o) Freshwater food web        directed           92       -0.326          0.031

- Social networks tend to be assortative.
- Technological and biological networks tend to be disassortative.
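The degree assortativity coefficient and its jackknife error estimate can be sketched in a few lines of Python. This is a minimal illustration for small undirected networks, assuming the edge list is a Python list of node pairs. The function names and the toy edge list are hypothetical, and the leave-one-out loop recomputes $r$ from scratch for each edge, which is only practical for small graphs.

```python
from collections import Counter
from math import sqrt


def degree_assortativity(edges):
    """Pearson correlation of the remaining degrees at the two ends of an edge."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Count every undirected edge in both directions so that e_jk = e_kj.
    pairs = [(degree[u] - 1, degree[v] - 1) for u, v in edges]
    pairs += [(k, j) for j, k in pairs]
    n = len(pairs)
    mean_j = sum(j for j, _ in pairs) / n             # sum_j j R_j
    mean_jk = sum(j * k for j, k in pairs) / n        # sum_jk j k e_jk
    variance = sum(j * j for j, _ in pairs) / n - mean_j ** 2  # sigma_R^2
    if variance == 0.0:
        raise ValueError("r is undefined when all edge ends have the same degree")
    return (mean_jk - mean_j ** 2) / variance


def jackknife_error(edges):
    """sigma_r = sqrt(sum_i (r_i - r)^2), with r_i computed after removing edge i."""
    r_all = degree_assortativity(edges)
    total = 0.0
    for i in range(len(edges)):
        r_i = degree_assortativity(edges[:i] + edges[i + 1:])
        total += (r_i - r_all) ** 2
    return sqrt(total)


# Hypothetical toy network: a four-leaf star with one extra edge between two leaves.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
r = degree_assortativity(edges)
sigma_r = jackknife_error(edges)
```

A pure star is perfectly disassortative ($r = -1$), since every edge joins the hub to a leaf. Because the Pearson coefficient is unchanged by shifting both variables, using remaining degree $j$ or full degree $j + 1$ gives the same $r$.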

Spreading on degree-correlated networks

- Next: Generalize our work for random networks to degree-correlated networks.
- As before, by allowing that a node of degree $k$ is activated by one neighbor with probability $B_{k1}$, we can handle various problems:
  1. find the giant component size.
  2. find the probability and extent of spread for simple disease models.
  3. find the probability of spreading for simple threshold models.

- Goal: Find $f_{n,j} = \Pr(\text{an edge emanating from a degree } j+1 \text{ node leads to a finite active subcomponent of size } n)$.
- Repeat: a node of degree $k$ is in the game with probability $B_{k1}$.
- Define $\vec{B}_1 = [B_{k1}]$.
- Plan: Find the generating function

  $$F_j(x; \vec{B}_1) = \sum_{n=0}^{\infty} f_{n,j}\, x^n.$$

- Recursive relationship:

  $$F_j(x; \vec{B}_1) = x^0 \sum_{k=0}^{\infty} \frac{e_{jk}}{R_j}\,(1 - B_{k+1,1}) + x \sum_{k=0}^{\infty} \frac{e_{jk}}{R_j}\, B_{k+1,1} \left[F_k(x; \vec{B}_1)\right]^k.$$

- First term = Pr that the first node we reach is not in the game.
- Second term involves Pr we hit an active node which has $k$ outgoing edges.
- Next: find average size of active components reached by following a link from a degree $j + 1$ node $= F_j'(1; \vec{B}_1)$.

- Differentiate $F_j(x; \vec{B}_1)$, set $x = 1$, and rearrange.
- We use $F_k(1; \vec{B}_1) = 1$, which is true when no giant component exists. We find:

  $$R_j F_j'(1; \vec{B}_1) = \sum_{k=0}^{\infty} e_{jk} B_{k+1,1} + \sum_{k=0}^{\infty} k\, e_{jk} B_{k+1,1} F_k'(1; \vec{B}_1).$$

- Rearranging and introducing a sneaky $\delta_{jk}$:

  $$\sum_{k=0}^{\infty} \left(\delta_{jk} R_k - k B_{k+1,1} e_{jk}\right) F_k'(1; \vec{B}_1) = \sum_{k=0}^{\infty} e_{jk} B_{k+1,1}.$$

- In matrix form, we have

  $$A_{E,\vec{B}_1}\, \vec{F}'(1; \vec{B}_1) = E \vec{B}_1$$

  where

  $$\left[A_{E,\vec{B}_1}\right]_{j+1,k+1} = \delta_{jk} R_k - k B_{k+1,1} e_{jk},$$
  $$\left[\vec{F}'(1; \vec{B}_1)\right]_{k+1} = F_k'(1; \vec{B}_1),$$
  $$[E]_{j+1,k+1} = e_{jk}, \quad \text{and} \quad \left[\vec{B}_1\right]_{k+1} = B_{k+1,1}.$$

- So, in principle at least:

  $$\vec{F}'(1; \vec{B}_1) = A_{E,\vec{B}_1}^{-1} E \vec{B}_1.$$

- Now: as $\vec{F}'(1; \vec{B}_1)$, the average size of an active component reached along an edge, increases, we move towards a transition to a giant component.
- Right at the transition, the average component size explodes.
- Exploding inverses of matrices occur when their determinants are 0.
- The condition is therefore:

  $$\det A_{E,\vec{B}_1} = 0.$$
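To make the determinant condition concrete, the following Python sketch assembles $A_{E,\vec{B}_1}$ for a small uncorrelated example with $e_{jk} = R_j R_k$ and evaluates its determinant. The function names and the three-value distribution `R` are assumptions made for illustration. For this rank-one case the determinant should equal $\prod_k R_k \,(1 - \sum_k k B_{k+1,1} R_k)$, which follows from the matrix determinant lemma, so it vanishes exactly when $\sum_k k B_{k+1,1} R_k = 1$.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        if abs(m[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n):
                m[row][c] -= factor * m[col][c]
    return det


def spreading_matrix(e, B1):
    """A[j][k] = delta_jk R_k - k B_{k+1,1} e_jk for remaining degrees j, k = 0..K.

    B1[k] stores B_{k+1,1}: the probability that a node with k outgoing edges
    (degree k + 1) is activated by one active neighbour.
    """
    size = len(e)
    R = [sum(row) for row in e]  # R_j = sum_k e_jk
    return [[(R[k] if j == k else 0.0) - k * B1[k] * e[j][k] for k in range(size)]
            for j in range(size)]


# Hypothetical uncorrelated example: remaining degrees 0, 1, 2 with e_jk = R_j R_k.
R = [0.2, 0.5, 0.3]
e = [[rj * rk for rk in R] for rj in R]

det_giant = determinant(spreading_matrix(e, [1.0, 1.0, 1.0]))    # B_1 = 1
det_disease = determinant(spreading_matrix(e, [0.5, 0.5, 0.5]))  # B_1 = B * 1, B = 0.5
```

Here $\sum_k k R_k = 1.1$, so the determinant is $0.03\,(1 - 1.1 B)$. It is positive for $B = 0.5$ (no spreading) and negative for $B = 1$ (a giant component exists), crossing zero at $B = 1/1.1$. The sign pattern is specific to this simple example; in general only the vanishing of the determinant marks the transition.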

- General condition details:

  $$\det A_{E,\vec{B}_1} = \det\left[\delta_{jk} R_{k-1} - (k - 1) B_{k,1}\, e_{j-1,k-1}\right] = 0.$$

- The above collapses to our standard contagion condition when $e_{jk} = R_j R_k$.
- When $\vec{B}_1 = B\vec{1}$, we have the condition for a simple disease model's successful spread:

  $$\det\left[\delta_{jk} R_{k-1} - B (k - 1)\, e_{j-1,k-1}\right] = 0.$$

- When $\vec{B}_1 = \vec{1}$, we have the condition for the existence of a giant component:

  $$\det\left[\delta_{jk} R_{k-1} - (k - 1)\, e_{j-1,k-1}\right] = 0.$$

- Bonusville: We'll find a much better version of this set of conditions later...

We'll next find two more pieces:

1. $P_{\text{trig}}$, the probability of starting a cascade
2. $S$, the expected extent of activation given a small seed.

Triggering probability:

- Generating function:

  $$H(x; \vec{B}_1) = x \sum_{k=0}^{\infty} P_k \left[F_{k-1}(x; \vec{B}_1)\right]^k.$$

- Generating function for vulnerable component size is more complicated.
- Want probability of not reaching a finite component.

  $$P_{\text{trig}} = S_{\text{trig}} = 1 - H(1; \vec{B}_1) = 1 - \sum_{k=0}^{\infty} P_k \left[F_{k-1}(1; \vec{B}_1)\right]^k.$$

- Last piece: we have to compute $F_{k-1}(1; \vec{B}_1)$.
- Nastier (nonlinear): we have to solve the recursive expression we started with when $x = 1$:

  $$F_j(1; \vec{B}_1) = \sum_{k=0}^{\infty} \frac{e_{jk}}{R_j}\,(1 - B_{k+1,1}) + \sum_{k=0}^{\infty} \frac{e_{jk}}{R_j}\, B_{k+1,1} \left[F_k(1; \vec{B}_1)\right]^k.$$

- Iterative methods should work here.

- Truly final piece: Find final size using approach of Gleeson [3], a generalization of that used for uncorrelated random networks.
- Need to compute $\theta_{j,t}$, the probability that an edge leading to a degree $j$ node is infected at time $t$.
- Evolution of edge activity probability:

  $$\theta_{j,t+1} = G_j(\vec{\theta}_t) = \phi_0 + (1 - \phi_0) \sum_{k=1}^{\infty} \frac{e_{j-1,k-1}}{R_{j-1}} \sum_{i=0}^{k-1} \binom{k-1}{i} \theta_{k,t}^{\,i} (1 - \theta_{k,t})^{k-1-i} B_{ki}.$$

- Overall active fraction's evolution:

  $$\phi_{t+1} = \phi_0 + (1 - \phi_0) \sum_{k=0}^{\infty} P_k \sum_{i=0}^{k} \binom{k}{i} \theta_{k,t}^{\,i} (1 - \theta_{k,t})^{k-i} B_{ki}.$$

- As before, these equations give the actual evolution of $\phi_t$ for synchronous updates.
- Contagion condition follows from $\vec{\theta}_{t+1} = \vec{G}(\vec{\theta}_t)$.
- Expand $\vec{G}$ around $\vec{\theta}_0 = \vec{0}$:

  $$\theta_{j,t+1} = G_j(\vec{0}) + \sum_{k=1}^{\infty} \frac{\partial G_j(\vec{0})}{\partial \theta_{k,t}}\, \theta_{k,t} + \frac{1}{2!} \sum_{k=1}^{\infty} \frac{\partial^2 G_j(\vec{0})}{\partial \theta_{k,t}^2}\, \theta_{k,t}^2 + \ldots$$

- If $G_j(\vec{0}) \neq 0$ for at least one $j$, always have some infection.
- If $G_j(\vec{0}) = 0\ \forall\, j$, want largest eigenvalue of $\left[\frac{\partial G_j(\vec{0})}{\partial \theta_{k,t}}\right] > 1$.
- Condition for spreading is therefore dependent on eigenvalues of this matrix:

  $$\frac{\partial G_j(\vec{0})}{\partial \theta_{k,t}} = \frac{e_{j-1,k-1}}{R_{j-1}}\,(k - 1) B_{k1}$$

  Insert question from assignment 9

How the giant component changes with assortativity:

[Figure: Fig. 1 of Newman, 2002 [4]. Size of the giant component S as a fraction of graph size versus the exponential parameter κ, for assortative (p = 0.5), neutral (p = p_0 = 0.146...), and disassortative (p = 0.05) networks. Points are simulation results for graphs of N = 100 000 vertices; solid lines are numerical solutions.]

- More assortative networks percolate for lower average degrees.
- But disassortative networks end up with higher extents of spreading.
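The synchronous update equations for $\theta_{j,t}$ and $\phi_t$ translate directly into a short iteration. The Python sketch below is a minimal illustration under stated assumptions: the maximum degree is finite, the response function $B_{ki}$ is passed as a callable `B(k, i)`, and every $\theta_{k,0}$ is initialised to $\phi_0$. The function names, the two response functions, and the 3-regular example are hypothetical choices made for illustration.

```python
from math import comb


def step_theta(theta, e, B, phi0):
    """One synchronous update theta_{j,t+1} = G_j(theta_t).

    theta[j] refers to degree j (index 0 is unused), e[j-1][k-1] is the joint
    remaining-degree distribution, and B(k, i) is the response function B_ki.
    """
    K = len(e)                    # largest degree considered
    R = [sum(row) for row in e]   # R_{j-1} = sum_k e_{j-1,k-1}
    new_theta = [0.0] * (K + 1)
    for j in range(1, K + 1):
        if R[j - 1] == 0.0:
            continue              # no edges lead to degree-j nodes
        total = 0.0
        for k in range(1, K + 1):
            inner = sum(
                comb(k - 1, i) * theta[k] ** i * (1.0 - theta[k]) ** (k - 1 - i) * B(k, i)
                for i in range(k)
            )
            total += e[j - 1][k - 1] / R[j - 1] * inner
        new_theta[j] = phi0 + (1.0 - phi0) * total
    return new_theta


def active_fraction(theta, P, B, phi0):
    """phi = phi0 + (1 - phi0) sum_k P_k sum_i C(k, i) theta_k^i (1 - theta_k)^(k-i) B_ki."""
    total = 0.0
    for k, p_k in enumerate(P):
        inner = sum(
            comb(k, i) * theta[k] ** i * (1.0 - theta[k]) ** (k - i) * B(k, i)
            for i in range(k + 1)
        )
        total += p_k * inner
    return phi0 + (1.0 - phi0) * total


def run(e, P, B, phi0, steps=50):
    """Iterate theta for a fixed number of steps and return the active fraction."""
    theta = [phi0] * (len(e) + 1)  # assumed initial condition theta_{k,0} = phi0
    for _ in range(steps):
        theta = step_theta(theta, e, B, phi0)
    return active_fraction(theta, P, B, phi0)


def any_neighbour(k, i):
    """Node activates as soon as one neighbour is active."""
    return 1.0 if i >= 1 else 0.0


def majority(k, i):
    """Node activates only if at least half of its neighbours are active."""
    return 1.0 if k > 0 and 2 * i >= k else 0.0


# Hypothetical example: a 3-regular random network, so only e_{2,2} and P_3 are nonzero.
e = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
P = [0.0, 0.0, 0.0, 1.0]
phi_any = run(e, P, any_neighbour, phi0=0.01)
phi_threshold = run(e, P, majority, phi0=0.01)
```

With the `any_neighbour` response, $B_{k1} = 1$ and the 1% seed spreads to essentially the whole network. With the `majority` response, $B_{k1} = 0$ for degree-3 nodes, the linear term of the expansion vanishes, and the active fraction stays close to the seed.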

References

[1] M. Boguñá and M. Ángeles Serrano. Generalized percolation in random directed networks. Phys. Rev. E, 72:016106, 2005.

[2] B. Efron and C. Stein. The jackknife estimate of variance. The Annals of Statistics, 9:586–596, 1981.

[3] J. P. Gleeson. Cascades on correlated and modular random networks. Phys. Rev. E, 77:046117, 2008.

[4] M. Newman. Assortative mixing in networks. Phys. Rev. Lett., 89:208701, 2002.

[5] M. E. J. Newman. Mixing patterns in networks. Phys. Rev. E, 67:026126, 2003.