Assortativity and Mixing
Complex Networks, Course 303A, Spring, 2009

Prof. Peter Dodds
Department of Mathematics & Statistics
University of Vermont

Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.

Outline
- Definition
- General mixing
- Assortativity by degree
- Contagion
- References

Basic idea:
- Random networks with arbitrary degree distributions cover much territory but do not represent all networks.
- Moving away from pure random networks was a key first step.
- We can extend in many other directions, and a natural one is to introduce correlations between different kinds of nodes.
- Node attributes may be anything, e.g.:
  1. degree
  2. demographics (age, gender, etc.)
  3. group affiliation
- We speak of mixing patterns, correlations, biases...
- Networks are still random at base but now have more global structure.
- Build on work by Newman [3, 4].

General mixing between node categories
- Assume types of nodes are countable, and are assigned numbers 1, 2, 3, ....
- Consider networks with directed edges.
  \[ e_{\mu\nu} = \Pr(\text{an edge connects a node of type } \mu \text{ to a node of type } \nu) \]
  \[ a_\mu = \Pr(\text{an edge comes from a node of type } \mu) \]
  \[ b_\nu = \Pr(\text{an edge leads to a node of type } \nu) \]
- Write $\mathbf{E} = [e_{\mu\nu}]$, $\vec{a} = [a_\mu]$, and $\vec{b} = [b_\nu]$.
- Requirements:
  \[ \sum_{\mu\nu} e_{\mu\nu} = 1, \quad \sum_{\nu} e_{\mu\nu} = a_\mu, \quad \text{and} \quad \sum_{\mu} e_{\mu\nu} = b_\nu. \]

Notes:
- Varying $e_{\mu\nu}$ allows us to move between the following:
  1. Perfectly assortative networks where nodes only connect to like nodes, and the network breaks into subnetworks. Requires $e_{\mu\nu} = 0$ if $\mu \neq \nu$ and $\sum_\mu e_{\mu\mu} = 1$.
  2. Uncorrelated networks (as we have studied so far). For these we must have independence: $e_{\mu\nu} = a_\mu b_\nu$.
  3. Disassortative networks where nodes connect to nodes distinct from themselves.
- Disassortative networks can be hard to build and may require constraints on the $e_{\mu\nu}$.
- Basic story: level of assortativity reflects the degree to which nodes are connected to nodes within their group.

Correlation coefficient:
- Quantify the level of assortativity with the following assortativity coefficient [4]:
  \[ r = \frac{\sum_\mu e_{\mu\mu} - \sum_\mu a_\mu b_\mu}{1 - \sum_\mu a_\mu b_\mu} = \frac{\operatorname{Tr} \mathbf{E} - \|\mathbf{E}^2\|_1}{1 - \|\mathbf{E}^2\|_1} \]
  where $\|\cdot\|_1$ is the 1-norm = sum of a matrix's entries.
- $\operatorname{Tr} \mathbf{E}$ is the fraction of edges that are within groups.
- $\|\mathbf{E}^2\|_1$ is the fraction of edges that would be within groups if connections were random.
- $1 - \|\mathbf{E}^2\|_1$ is a normalization factor so $r_{\max} = 1$.
- When $\operatorname{Tr} \mathbf{E} = 1$, we have $r = 1$.
- When $e_{\mu\mu} = a_\mu b_\mu$, we have $r = 0$.
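
The assortativity coefficient above can be sketched in a few lines. This is a minimal illustration, assuming the mixing matrix is supplied as a nested Python list `E` whose entries sum to 1; the function name `assortativity_coefficient` is hypothetical and not part of the lecture.

```python
def assortativity_coefficient(E):
    """Return r = (Tr E - sum_mu a_mu b_mu) / (1 - sum_mu a_mu b_mu).

    E is a square nested list with E[mu][nu] = e_{mu nu}, the fraction of
    edges running from a type-mu node to a type-nu node (an assumed input
    format, chosen for illustration).
    """
    n = len(E)
    total = sum(sum(row) for row in E)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("entries of E must sum to 1")

    # a_mu: row sums (edge comes from type mu); b_nu: column sums (edge leads to type nu).
    a = [sum(E[mu][nu] for nu in range(n)) for mu in range(n)]
    b = [sum(E[mu][nu] for mu in range(n)) for nu in range(n)]

    trace = sum(E[mu][mu] for mu in range(n))               # edges within groups
    random_within = sum(a[mu] * b[mu] for mu in range(n))   # equals ||E^2||_1

    if abs(1.0 - random_within) < 1e-12:
        raise ValueError("r is undefined when all edges join a single type")
    return (trace - random_within) / (1.0 - random_within)


if __name__ == "__main__":
    print(assortativity_coefficient([[0.5, 0.0], [0.0, 0.5]]))      # perfectly assortative
    print(assortativity_coefficient([[0.18, 0.12], [0.42, 0.28]]))  # e = a_mu * b_nu
    print(assortativity_coefficient([[0.0, 0.5], [0.5, 0.0]]))      # two types, no like-like edges
```

The three calls correspond to the three regimes in the notes: a diagonal matrix gives r = 1, a product matrix gives r close to 0 (up to floating-point rounding), and a two-type matrix with an empty diagonal gives r = -1.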

Correlation coefficient:
Notes:
- $r = -1$ is inaccessible if three or more types are present.
- Disassortative networks simply have nodes connected to unlike nodes; there is no measure of how unlike the nodes are.
- The minimum value of $r$ occurs when all links are between non-like nodes: $\operatorname{Tr} \mathbf{E} = 0$.
  \[ r_{\min} = \frac{-\|\mathbf{E}^2\|_1}{1 - \|\mathbf{E}^2\|_1} \]
  where $-1 \le r_{\min} < 0$.

Scalar quantities
- Now consider nodes defined by a scalar integer quantity.
- Examples: age in years, height in inches, number of friends, ...
- $e_{jk} = \Pr(\text{a randomly chosen edge connects a node with value } j \text{ to a node with value } k)$.
- $a_j$ and $b_k$ are defined as before.
- Can now measure correlations between nodes based on this scalar quantity using the standard Pearson correlation coefficient:
  \[ r = \frac{\sum_{jk} jk\,(e_{jk} - a_j b_k)}{\sigma_a \sigma_b} = \frac{\langle jk \rangle - \langle j \rangle_a \langle k \rangle_b}{\sqrt{\langle j^2 \rangle_a - \langle j \rangle_a^2}\,\sqrt{\langle k^2 \rangle_b - \langle k \rangle_b^2}} \]
- This is the observed normalized deviation from randomness in the product $jk$.
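
The Pearson form of $r$ for scalar node values can be sketched as follows. This is an illustration only, assuming the edge distribution is given as a dictionary mapping `(j, k)` pairs to probabilities that sum to 1; the name `scalar_assortativity` and the input format are assumptions, not part of the lecture.

```python
from math import sqrt


def scalar_assortativity(e):
    """Pearson correlation r for an edge distribution e[(j, k)] = e_jk.

    e maps (j, k) to the probability that a randomly chosen edge runs
    from a node with value j to a node with value k (assumed format).
    """
    # Marginals: a_j (value at the start of the edge), b_k (value at the end).
    a, b = {}, {}
    for (j, k), p in e.items():
        a[j] = a.get(j, 0.0) + p
        b[k] = b.get(k, 0.0) + p

    mean_jk = sum(j * k * p for (j, k), p in e.items())   # <jk>
    mean_j = sum(j * p for j, p in a.items())             # <j>_a
    mean_k = sum(k * p for k, p in b.items())             # <k>_b
    var_a = sum(j * j * p for j, p in a.items()) - mean_j ** 2
    var_b = sum(k * k * p for k, p in b.items()) - mean_k ** 2

    if var_a < 1e-12 or var_b < 1e-12:
        raise ValueError("r is undefined when all nodes share one value")
    # sum_jk jk (e_jk - a_j b_k) equals <jk> - <j>_a <k>_b.
    return (mean_jk - mean_j * mean_k) / sqrt(var_a * var_b)


if __name__ == "__main__":
    print(scalar_assortativity({(1, 1): 0.5, (2, 2): 0.5}))   # like connects to like
    print(scalar_assortativity({(1, 2): 0.5, (2, 1): 0.5}))   # like connects to unlike
```

Unlike the categorical coefficient, this version does use the numerical distance between values, so it distinguishes "slightly unlike" from "very unlike" neighbors.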

Degree-degree correlations
- Natural correlation is between the degrees of connected nodes.
- Now define $e_{jk}$ with a slight twist:
  \[ e_{jk} = \Pr(\text{an edge connects a degree } j+1 \text{ node to a degree } k+1 \text{ node}) \]
  \[ \phantom{e_{jk}} = \Pr(\text{an edge runs between a node of in-degree } j \text{ and a node of out-degree } k) \]
- Useful for calculations (as per $R_k$).
- Important: Must separately define $P_0$ as the $\{e_{jk}\}$ contain no information about isolated nodes.
- Directed networks still fine but we will assume from here on that $e_{jk} = e_{kj}$.

- Notation reconciliation for undirected networks:
  \[ r = \frac{\sum_{jk} jk\,(e_{jk} - R_j R_k)}{\sigma_R^2} \]
  where, as before, $R_k$ is the probability that a randomly chosen edge leads to a node of degree $k+1$, and
  \[ \sigma_R^2 = \sum_j j^2 R_j - \Big[ \sum_j j R_j \Big]^2. \]

Error estimate for r:
- Remove edge $i$ and recompute $r$ to obtain $r_i$.
- Repeat for all edges and compute using the jackknife method [1]:
  \[ \sigma_r^2 = \sum_i (r_i - r)^2. \]
- Mildly sneaky as variables need to be independent for us to be truly happy and edges are correlated...

Measurements of degree-degree correlations

Table II from "Mixing patterns in networks", Physical Review E 67, 026126 (2003). Size n, degree assortativity coefficient r, and expected error $\sigma_r$ on the assortativity, for a number of social, technological, and biological networks, both directed and undirected. Bracketed numbers in this caption are citations within that paper. Social networks: coauthorship networks of (a) physicists and biologists [46] and (b) mathematicians [47], in which authors are connected if they have coauthored one or more articles in learned journals; (c) collaborations of film actors in which actors are connected if they have appeared together in one or more movies [5,7]; (d) directors of Fortune 1000 companies for 1999, in which two directors are connected if they sit on the board of directors of the same company [48]; (e) romantic (not necessarily sexual) relationships between students at a U.S. high school [49]; (f) network of email address books of computer users on a large computer system, in which an edge from user A to user B indicates that B appears in A's address book [50]. Technological networks: (g) network of high voltage transmission lines in the Western States Power Grid of the United States [5]; (h) network of direct peering relationships between autonomous systems on the Internet, April 2001 [51]; (i) network of hyperlinks between pages in the World Wide Web domain nd.edu, circa 1999 [52]; (j) network of dependencies between software packages in the GNU/Linux operating system, in which an edge from package A to package B indicates that A relies on components of B for its operation. Biological networks: (k) protein-protein interaction network in the yeast S. cerevisiae [53]; (l) metabolic network of the bacterium E. coli [54]; (m) neural network of the nematode worm C. elegans [5,55]; trophic interactions between species in the food webs of (n) Ythan Estuary, Scotland [56] and (o) Little Rock Lake, Wisconsin [57].

Group          Network                          Type         Size n      Assortativity r   Error sigma_r
Social         (a) Physics coauthorship         undirected      52 909        0.363           0.002
               (a) Biology coauthorship         undirected   1 520 251        0.127           0.0004
               (b) Mathematics coauthorship     undirected     253 339        0.120           0.002
               (c) Film actor collaborations    undirected     449 913        0.208           0.0002
               (d) Company directors            undirected       7 673        0.276           0.004
               (e) Student relationships        undirected         573       -0.029           0.037
               (f) Email address books          directed        16 881        0.092           0.004
Technological  (g) Power grid                   undirected       4 941       -0.003           0.013
               (h) Internet                     undirected      10 697       -0.189           0.002
               (i) World Wide Web               directed       269 504       -0.067           0.0002
               (j) Software dependencies        directed         3 162       -0.016           0.020
Biological     (k) Protein interactions         undirected       2 115       -0.156           0.010
               (l) Metabolic network            undirected         765       -0.240           0.007
               (m) Neural network               directed           307       -0.226           0.016
               (n) Marine food web              directed           134       -0.263           0.037
               (o) Freshwater food web          directed            92       -0.326           0.031

- Social networks tend to be assortative (homophily).
- Technological and biological networks tend to be disassortative.
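
The jackknife error estimate described above (remove edge $i$, recompute $r$ to get $r_i$, then sum the squared deviations) can be sketched for a small undirected network. This is a rough illustration, assuming the network is given as a list of `(u, v)` node pairs with no self-loops or repeated edges; the function names are hypothetical, and the loop recomputes $r$ from scratch for every edge, which is only practical for small examples.

```python
from math import sqrt


def degree_assortativity(edges):
    """Degree assortativity r of an undirected edge list [(u, v), ...]."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Count every edge in both directions so that e_jk = e_kj, and use the
    # "remaining" degree (degree - 1) at each end, matching the j+1, k+1 twist.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u] - 1, degree[v] - 1]
        ys += [degree[v] - 1, degree[u] - 1]

    n = len(xs)
    mean = sum(xs) / n                                     # sum_j j R_j
    var = sum(x * x for x in xs) / n - mean ** 2           # sigma_R^2
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean ** 2
    if var < 1e-12:
        raise ValueError("r is undefined when all edge ends have equal degree")
    return cov / var


def jackknife_error(edges):
    """Return (r, sigma_r) with sigma_r^2 = sum_i (r_i - r)^2."""
    r = degree_assortativity(edges)
    squared = 0.0
    for i in range(len(edges)):
        r_i = degree_assortativity(edges[:i] + edges[i + 1:])  # drop edge i
        squared += (r_i - r) ** 2
    return r, sqrt(squared)


if __name__ == "__main__":
    example = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5)]
    print(jackknife_error(example))
```

Degrees are recomputed after each removal, since deleting an edge lowers the degree of both endpoints. A leave-one-out network in which every edge end has the same degree would raise `ValueError`; the example edge list was chosen to avoid that case.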

Spreading on degree-correlated networks
- Next: Generalize our work for random networks to degree-correlated networks.
- As before, by allowing that a node of degree $k$ is activated by one neighbor with probability $\beta_{k1}$, we can handle various problems:
  1. find the giant component size.
  2. find the probability and extent of spread for simple disease models.
  3. find the probability of spreading for simple threshold models.

- Goal: Find $f_{n,j} = \Pr(\text{an edge emanating from a degree } j+1 \text{ node leads to a finite active subcomponent of size } n)$.
- Repeat: a node of degree $k$ is in the game with probability $\beta_{k1}$.
- Define $\vec{\beta}_1 = [\beta_{k1}]$.
- Plan: Find the generating function
  \[ F_j(x; \vec{\beta}_1) = \sum_{n=0}^{\infty} f_{n,j}\, x^n. \]

Spreading on degree-correlated networks
- Recursive relationship:
  \[ F_j(x; \vec{\beta}_1) = x^0 \sum_{k=0}^{\infty} \frac{e_{jk}}{R_j} (1 - \beta_{k+1,1}) + x \sum_{k=0}^{\infty} \frac{e_{jk}}{R_j} \beta_{k+1,1} \left[ F_k(x; \vec{\beta}_1) \right]^k. \]
- First term = Pr that the first node we reach is not in the game.
- Second term involves Pr we hit an active node which has $k$ outgoing edges.
- Next: find the average size of active components reached by following a link from a degree $j+1$ node $= F_j'(1; \vec{\beta}_1)$.

- Differentiate $F_j(x; \vec{\beta}_1)$, set $x = 1$, and rearrange.
- We use $F_k(1; \vec{\beta}_1) = 1$, which is true when no giant component exists. We find:
  \[ R_j F_j'(1; \vec{\beta}_1) = \sum_{k=0}^{\infty} e_{jk} \beta_{k+1,1} + \sum_{k=0}^{\infty} k\, e_{jk} \beta_{k+1,1} F_k'(1; \vec{\beta}_1). \]
- Rearranging and introducing a sneaky $\delta_{jk}$:
  \[ \sum_{k=0}^{\infty} \left( \delta_{jk} R_k - k \beta_{k+1,1} e_{jk} \right) F_k'(1; \vec{\beta}_1) = \sum_{k=0}^{\infty} e_{jk} \beta_{k+1,1}. \]
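
For readers who want the intermediate steps, the differentiation can be written out as below. This is a worked sketch that uses only the recursive relationship and the assumption $F_k(1; \vec{\beta}_1) = 1$ stated on the slides; the argument $\vec{\beta}_1$ is suppressed for brevity.

```latex
\begin{align*}
F_j'(x)
  &= \sum_{k=0}^{\infty} \frac{e_{jk}}{R_j}\,\beta_{k+1,1}\,\bigl[F_k(x)\bigr]^{k}
   + x \sum_{k=0}^{\infty} \frac{e_{jk}}{R_j}\,\beta_{k+1,1}\,
       k\,\bigl[F_k(x)\bigr]^{k-1} F_k'(x)
  && \text{(product and chain rules; the $x^0$ term is constant)} \\
F_j'(1)
  &= \sum_{k=0}^{\infty} \frac{e_{jk}}{R_j}\,\beta_{k+1,1}
   + \sum_{k=0}^{\infty} \frac{e_{jk}}{R_j}\,k\,\beta_{k+1,1}\,F_k'(1)
  && \text{(set $x = 1$, use $F_k(1) = 1$)} \\
R_j F_j'(1)
  &= \sum_{k=0}^{\infty} e_{jk}\,\beta_{k+1,1}
   + \sum_{k=0}^{\infty} k\,e_{jk}\,\beta_{k+1,1}\,F_k'(1)
  && \text{(multiply by $R_j$)} \\
\sum_{k=0}^{\infty} \delta_{jk} R_k F_k'(1)
   - \sum_{k=0}^{\infty} k\,\beta_{k+1,1}\,e_{jk}\,F_k'(1)
  &= \sum_{k=0}^{\infty} e_{jk}\,\beta_{k+1,1}
  && \text{(write $R_j F_j'(1) = \textstyle\sum_k \delta_{jk} R_k F_k'(1)$)}
\end{align*}
```

The last line is the rearranged form with the Kronecker delta, one linear equation for each value of $j$.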

Spreading on degree-correlated networks
- In matrix form, we have
  \[ \mathbf{A}_{\mathbf{E},\vec{\beta}_1} \vec{F}'(1; \vec{\beta}_1) = \mathbf{E} \vec{\beta}_1 \]
  where
  \[ \left[ \mathbf{A}_{\mathbf{E},\vec{\beta}_1} \right]_{j+1,k+1} = \delta_{jk} R_k - k \beta_{k+1,1} e_{jk}, \]
  \[ \left[ \vec{F}'(1; \vec{\beta}_1) \right]_{k+1} = F_k'(1; \vec{\beta}_1), \]
  \[ \left[ \mathbf{E} \right]_{j+1,k+1} = e_{jk}, \quad \text{and} \quad \left[ \vec{\beta}_1 \right]_{k+1} = \beta_{k+1,1}. \]

- So, in principle at least:
  \[ \vec{F}'(1; \vec{\beta}_1) = \mathbf{A}_{\mathbf{E},\vec{\beta}_1}^{-1} \mathbf{E} \vec{\beta}_1. \]
- Now: as $\vec{F}'(1; \vec{\beta}_1)$, the average size of an active component reached along an edge, increases, we move towards a transition to a giant component.
- Right at the transition, the average component size explodes.
- Exploding inverses of matrices occur when their determinants are 0.
- The condition is therefore:
  \[ \det \mathbf{A}_{\mathbf{E},\vec{\beta}_1} = 0. \]
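
The matrix form can be explored numerically on a truncated system. The sketch below is an illustration under stated assumptions: `e` is a symmetric nested list with `e[j][k]` = $e_{jk}$ for $j, k = 0, \dots, K-1$ and entries summing to 1, `beta[k]` stands for $\beta_{k+1,1}$, and $R_j$ is taken as the row sum of `e`. The helper names `build_A` and `solve_and_det` are hypothetical, and the plain Gaussian elimination is used only to keep the example free of dependencies.

```python
def build_A(e, beta):
    """Return (A, rhs) with A[j][k] = delta_jk R_k - k beta_{k+1,1} e_jk
    and rhs[j] = sum_k e_jk beta_{k+1,1}; list index k stands for k itself."""
    K = len(e)
    R = [sum(e[j]) for j in range(K)]  # assumed: R_j = sum_k e_jk
    A = [[(R[k] if j == k else 0.0) - k * beta[k] * e[j][k] for k in range(K)]
         for j in range(K)]
    rhs = [sum(e[j][k] * beta[k] for k in range(K)) for j in range(K)]
    return A, rhs


def solve_and_det(A, rhs, tol=1e-12):
    """Gaussian elimination with partial pivoting.

    Returns (det A, solution), or (0.0, None) if A is numerically singular.
    """
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < tol:
            return 0.0, None
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / M[i][i]
    return det, x


if __name__ == "__main__":
    R = [0.2, 0.5, 0.3]
    e = [[rj * rk for rk in R] for rj in R]   # uncorrelated test case
    A, rhs = build_A(e, [0.5, 0.5, 0.5])
    print(solve_and_det(A, rhs))               # det A and the F'_k(1) values
```

For this uncorrelated test case the matrix factorizes as $(\mathbf{I} - \vec{R}\,\vec{w}^{\,T})\,\mathrm{diag}(\vec{R})$ with $w_k = k\beta_{k+1,1}$, so the determinant is $\prod_k R_k \,(1 - \sum_k k \beta_{k+1,1} R_k)$ and vanishes exactly when $\sum_k k \beta_{k+1,1} R_k = 1$. Raising `beta` toward that point makes the computed $F_k'(1)$ values grow without bound, which is the exploding component size described above.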

Spreading on degree-correlated networks
- General condition details:
  \[ \det \mathbf{A}_{\mathbf{E},\vec{\beta}_1} = \det \left[ \delta_{jk} R_{k-1} - (k-1) \beta_{k,1} e_{j-1,k-1} \right] = 0. \]
- The above collapses to our standard contagion condition when $e_{jk} = R_j R_k$.
- When $\vec{\beta}_1 = \beta \vec{1}$, we have the condition for a simple disease model's successful spread.

We'll next find two more pieces:
  1. $P_{\mathrm{trig}}$, the probability of starting a cascade
  2. $S$, the expected extent of activation given a small seed.

Triggering probability: