Assortativity and Mixing
Complex Networks
CSYS/MATH 303, Spring, 2011

Prof. Peter Dodds
Department of Mathematics & Statistics
Center for Complex Systems
Vermont Advanced Computing Center
University of Vermont

Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.

Outline
- Definition
- General mixing
- Assortativity by degree
- Contagion
- References

General mixing between node categories
- Assume types of nodes are countable, and are assigned numbers 1, 2, 3, ...
- Consider networks with directed edges.

$$e_{\mu\nu} = \Pr(\text{an edge connects a node of type } \mu \text{ to a node of type } \nu)$$
$$a_\mu = \Pr(\text{an edge comes from a node of type } \mu)$$
$$b_\nu = \Pr(\text{an edge leads to a node of type } \nu)$$

- Write $E = [e_{\mu\nu}]$, $\vec{a} = [a_\mu]$, and $\vec{b} = [b_\nu]$.
- Requirements:

$$\sum_\mu \sum_\nu e_{\mu\nu} = 1, \quad \sum_\nu e_{\mu\nu} = a_\mu, \quad \text{and} \quad \sum_\mu e_{\mu\nu} = b_\nu.$$

Notes:
- Varying $e_{\mu\nu}$ allows us to move between the following:
  1. Perfectly assortative networks where nodes only connect to like nodes, and the network breaks into subnetworks. Requires $e_{\mu\nu} = 0$ if $\mu \neq \nu$ and $\sum_\mu e_{\mu\mu} = 1$.
  2. Uncorrelated networks (as we have studied so far). For these we must have independence: $e_{\mu\nu} = a_\mu b_\nu$.
  3. Disassortative networks where nodes connect to nodes distinct from themselves.
- Disassortative networks can be hard to build and may require constraints on the $e_{\mu\nu}$.
- Basic story: level of assortativity reflects the degree to which nodes are connected to nodes within their group.
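The requirements on $E$, $\vec{a}$, and $\vec{b}$, and the two extreme cases in the notes, can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `marginals` and `is_valid_mixing_matrix` and the three-type marginals are hypothetical choices made for illustration, not part of the lecture.

```python
import numpy as np

def marginals(E):
    """Return (a, b): a_mu = sum over nu of e_mu_nu, b_nu = sum over mu of e_mu_nu."""
    E = np.asarray(E, dtype=float)
    return E.sum(axis=1), E.sum(axis=0)

def is_valid_mixing_matrix(E, tol=1e-12):
    """Check that all entries are non-negative and that they sum to 1."""
    E = np.asarray(E, dtype=float)
    return bool(np.all(E >= 0) and abs(E.sum() - 1.0) < tol)

# Hypothetical marginals for three node types (illustrative values only).
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.2, 0.3, 0.5])

# Uncorrelated network: independence, e_mu_nu = a_mu * b_nu.
E_uncorrelated = np.outer(a, b)

# Perfectly assortative network: e_mu_nu = 0 for mu != nu, diagonal sums to 1.
# Here the row and column marginals necessarily coincide (both equal a).
E_assortative = np.diag(a)

for name, E in [("uncorrelated", E_uncorrelated), ("assortative", E_assortative)]:
    a_out, b_out = marginals(E)
    print(name, is_valid_mixing_matrix(E), a_out, b_out)
```

In the perfectly assortative case every edge stays within one type, so the probability of leaving a type equals the probability of arriving at it.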
Basic idea:
- Random networks with arbitrary degree distributions cover much territory but do not represent all networks.
- Moving away from pure random networks was a key first step.
- We can extend in many other directions and a natural one is to introduce correlations between different kinds of nodes.
- Node attributes may be anything, e.g.:
  1. degree
  2. demographics (age, gender, etc.)
  3. group affiliation
- We speak of mixing patterns, correlations, biases...
- Networks are still random at base but now have more global structure.
- Build on work by Newman [4, 5], and Boguñá and Serrano [1].

Correlation coefficient:
- Quantify the level of assortativity with the following assortativity coefficient [5]:

$$r = \frac{\sum_\mu e_{\mu\mu} - \sum_\mu a_\mu b_\mu}{1 - \sum_\mu a_\mu b_\mu} = \frac{\mathrm{Tr}\, E - \|E^2\|_1}{1 - \|E^2\|_1}$$

  where $\|\cdot\|_1$ is the 1-norm = sum of a matrix's entries.
- $\mathrm{Tr}\, E$ is the fraction of edges that are within groups.
- $\|E^2\|_1$ is the fraction of edges that would be within groups if connections were random.
- $1 - \|E^2\|_1$ is a normalization factor so $r_{\max} = 1$.
- When $\mathrm{Tr}\, E = \sum_\mu e_{\mu\mu} = 1$, we have $r = 1$.
- When $e_{\mu\mu} = a_\mu b_\mu$, we have $r = 0$.

Correlation coefficient: Notes:
- $r = -1$ is inaccessible if three or more types are present.
- Disassortative networks simply have nodes connected to unlike nodes; there is no measure of how unlike the nodes are.
- Minimum value of $r$ occurs when all links are between non-like nodes: $\mathrm{Tr}\, E = 0$.

$$r_{\min} = \frac{-\|E^2\|_1}{1 - \|E^2\|_1}$$

  where $-1 \le r_{\min} < 0$.

Scalar quantities
- Now consider nodes defined by a scalar integer quantity.
- Examples: age in years, height in inches, number of friends, ...
- $e_{jk} = \Pr(\text{a randomly chosen edge connects a node with value } j \text{ to a node with value } k)$.
- $a_j$ and $b_k$ are defined as before.
- Can now measure correlations between nodes based on this scalar quantity using the standard Pearson correlation coefficient:

$$r = \frac{\sum_j \sum_k jk\,(e_{jk} - a_j b_k)}{\sigma_a \sigma_b} = \frac{\langle jk \rangle - \langle j \rangle_a \langle k \rangle_b}{\sqrt{\langle j^2 \rangle_a - \langle j \rangle_a^2}\,\sqrt{\langle k^2 \rangle_b - \langle k \rangle_b^2}}$$

- This is the observed normalized deviation from randomness in the product $jk$.

Degree-degree correlations
- Natural correlation is between the degrees of connected nodes.
- Now define $e_{jk}$ with a slight twist:

$$e_{jk} = \Pr(\text{an edge connects a degree } j+1 \text{ node to a degree } k+1 \text{ node})$$
$$\phantom{e_{jk}} = \Pr(\text{an edge runs between a node of in-degree } j \text{ and a node of out-degree } k)$$

- Useful for calculations (as per $R_k$).
- Important: Must separately define $P_0$ because the $\{e_{jk}\}$ contain no information about isolated nodes.
- Directed networks still fine but we will assume from here on that $e_{jk} = e_{kj}$.
- Notation reconciliation for undirected networks:

$$r = \frac{\sum_j \sum_k jk\,(e_{jk} - R_j R_k)}{\sigma_R^2}$$

  where, as before, $R_k$ is the probability that a randomly chosen edge leads to a node of degree $k+1$, and

$$\sigma_R^2 = \sum_j j^2 R_j - \Big[\sum_j j R_j\Big]^2.$$

Error estimate for r:
- Remove edge $i$ and recompute $r$ to obtain $r_i$.
- Repeat for all edges and compute using the jackknife method [2]:

$$\sigma_r^2 = \sum_i (r_i - r)^2.$$

- Mildly sneaky as variables need to be independent for us to be truly happy and edges are correlated...

Measurements of degree-degree correlations

Source: Newman, "Mixing patterns in networks," Physical Review E 67, 026126 (2003).

TABLE II. Size n, degree assortativity coefficient r, and expected error σ_r on the assortativity, for a number of social, technological, and biological networks, both directed and undirected. Social networks: coauthorship networks of (a) physicists and biologists [46] and (b) mathematicians [47], in which authors are connected if they have coauthored one or more articles in learned journals; (c) collaborations of film actors in which actors are connected if they have appeared together in one or more movies [5,7]; (d) directors of Fortune 1000 companies for 1999, in which two directors are connected if they sit on the board of directors of the same company [48]; (e) romantic (not necessarily sexual) relationships between students at a U.S. high school [49]; (f) network of email address books of computer users on a large computer system, in which an edge from user A to user B indicates that B appears in A's address book [50]. Technological networks: (g) network of high voltage transmission lines in the Western States Power Grid of the United States [5]; (h) network of direct peering relationships between autonomous systems on the Internet, April 2001 [51]; (i) network of hyperlinks between pages in the World Wide Web domain nd.edu, circa 1999 [52]; (j) network of dependencies between software packages in the GNU/Linux operating system, in which an edge from package A to package B indicates that A relies on components of B for its operation. Biological networks: (k) protein-protein interaction network in the yeast S. Cerevisiae [53]; (l) metabolic network of the bacterium E. Coli [54]; (m) neural network of the nematode worm C. Elegans [5,55]; trophic interactions between species in the food webs of (n) Ythan Estuary, Scotland [56] and (o) Little Rock Lake, Wisconsin [57].

Group          Network                        Type        Size n      Assortativity r   Error σ_r
Social         a  Physics coauthorship        undirected     52,909        0.363         0.002
               a  Biology coauthorship        undirected  1,520,251        0.127         0.0004
               b  Mathematics coauthorship    undirected    253,339        0.120         0.002
               c  Film actor collaborations   undirected    449,913        0.208         0.0002
               d  Company directors           undirected      7,673        0.276         0.004
               e  Student relationships       undirected        573       -0.029         0.037
               f  Email address books         directed       16,881        0.092         0.004
Technological  g  Power grid                  undirected      4,941       -0.003         0.013
               h  Internet                    undirected     10,697       -0.189         0.002
               i  World Wide Web              directed      269,504       -0.067         0.0002
               j  Software dependencies       directed        3,162       -0.016         0.020
Biological     k  Protein interactions        undirected      2,115       -0.156         0.010
               l  Metabolic network           undirected        765       -0.240         0.007
               m  Neural network              directed          307       -0.226         0.016
               n  Marine food web             directed          134       -0.263         0.037
               o  Freshwater food web         directed           92       -0.326         0.031

Text from the same page of the paper, captured with the table (truncated in the source):

... $-m$) almost never is. In this paper, therefore, we take an alternative approach, making use of computer simulation. We would like to generate on a computer a random network having, for instance, a particular value of the matrix $e_{jk}$. [This also fixes the degree distribution, via Eq. (23).] In Ref. [22] we discussed one possible way of doing this using an algorithm similar to that of Sec. II C. One would draw edges from the desired distribution $e_{jk}$ and then join the degree $k$ ends randomly in groups of $k$ to create the network.

... [58,59]. The algorithm is as follows. (1) Given the desired edge distribution $e_{jk}$, we first calculate the corresponding distribution of excess degrees $q_k$ from Eq. (23), and then invert Eq. (22) to find the degree distribution:

$$p_k = \frac{q_{k-1}/k}{\sum_j q_{j-1}/j}. \qquad (27)$$

- Social networks tend to be assortative (homophily).
- Technological and biological networks tend to be disassortative.
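The assortativity coefficient for a mixing matrix, the Pearson form of r for degrees, and the jackknife error estimate can be sketched in a few lines. This is an illustrative sketch only, assuming NumPy and a simple undirected network given as a list of node pairs; all function names and the toy edge list are hypothetical. It correlates full degrees rather than the degree-minus-one values used in the definition of $e_{jk}$, which gives the same r because the Pearson coefficient is unchanged by a constant shift.

```python
import numpy as np

def assortativity_coefficient(E):
    """r = (Tr E - ||E^2||_1) / (1 - ||E^2||_1) for a mixing matrix E."""
    E = np.asarray(E, dtype=float)
    within = np.trace(E)        # fraction of edges within groups
    expected = (E @ E).sum()    # same fraction if connections were random
    return (within - expected) / (1.0 - expected)

def edge_end_degrees(edges):
    """Degrees (j, k) of the two end nodes of every undirected edge."""
    edges = np.asarray(edges)
    _, inv = np.unique(edges, return_inverse=True)
    inv = inv.reshape(edges.shape)   # node index of each edge end
    deg = np.bincount(inv.ravel())   # degree of each node
    return deg[inv[:, 0]], deg[inv[:, 1]]

def degree_assortativity(edges):
    """Pearson correlation between the degrees at the two ends of an edge.

    Each edge is counted in both directions so that e_jk = e_kj.
    The result is nan when every edge end has the same degree.
    """
    j, k = edge_end_degrees(edges)
    x = np.concatenate([j, k]).astype(float)
    y = np.concatenate([k, j]).astype(float)
    return np.corrcoef(x, y)[0, 1]

def jackknife_error(edges):
    """sigma_r from sigma_r^2 = sum_i (r_i - r)^2, removing one edge at a time."""
    edges = np.asarray(edges)
    r = degree_assortativity(edges)
    r_i = np.array([degree_assortativity(np.delete(edges, i, axis=0))
                    for i in range(len(edges))])
    return np.sqrt(np.sum((r_i - r) ** 2))

# Hypothetical toy network: a triangle (0, 1, 2) with a two-edge tail 0-3-4.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
r = degree_assortativity(toy_edges)
sigma_r = jackknife_error(toy_edges)
print(f"r = {r:.3f}, sigma_r = {sigma_r:.3f}")
```

Recomputing r from scratch for every removed edge costs time quadratic in the number of edges, which is fine for a toy example but would need a smarter update for networks the size of those in the table.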