Overview of Network Theory

Overview of Network Theory

Overview of Network Theory MAE 298, Spring 2009, Lecture 1 Prof. Raissa D’Souza University of California, Davis Example social networks (Immunology; viral marketing; aliances/policy) M. E. J. Newman The Internet (Robustness to failure; optimizing future growth; testing protocols on sample topologies) H. Burch and B. Cheswick A typical web domain (Web search/organization and growth centralized vs. decentralized protocols) M. E. J. Newman The airline network (Optimization; dynamic external demands) Continental Airlines The power grid (Mitigating failure; Distributed sources) M. E. J. Newman Biology: Networks at many levels Control mechanisms / drug design/ gene therapy / biomarkers of disease Cellular networks: GENOME • Genome, Proteome: protein-gene Dandekar Lab interactions PROTEOME • Metabolome: protein-protein Fiehn Lab interactions METABOLISM • Data intergration Bio-chemical BIOshare reactions Lin, Genome Center Citrate Cycle • Network structure / search for biomarkers: D’Souza Software systems (Highly evolveable, modular, robust to mutation, exhibit punctuated eqm) Open-source software as a “systems” paradigm. Networks: • Function calls • Email communication • Socio-Technical congruence Bird, Devanbu, D’Souza, Filkov, Saul, Wen Networks: Physical, Biological, Social • Geometric versus virtual (Internet versus WWW). • Natural /spontaneously arising versus engineered /built. • Each network optimizes something unique. • Identifying similarities and fundamental differences can guide future design/understanding. • Interplay of topology and function ? • Unifying features: – Broad heterogeneity in node degree. – Small Worlds (Diameter ∼ log(N)). Explosion of work and tools • R, Graphviz, Pajek, igraph, Network Workbench, NetworkX, Netdraw, UCInet, Bioconductor, Ubigraph.... Natn Acam Sciences/Natn Research Council Study (2005) “all our modern critical infrastructure relies on networks... too much emphasis on specific applications/jargon/disciplinary stovepipes... need a cross-cutting science of networks... Research for the 21st century” How do we represent a network as a mathematical object? NETWORK TOPOLOGY Connectivity matrix, M: ( 1 if edge exists between i and j Mij = 0 otherwise: 0 1 1 1 1 1 0 B 1 1 0 1 0 C B C B C B 1 0 1 0 0 C = M B C @ 1 1 0 1 1 A 0 0 0 1 1 Node degree is number of links. Typical measures of network topology • Degree distribution (fraction of nodes with degree k, for all k) • Clustering coefficient (fraction of triangles in the graph/transitivity: Are my friends friends with each other?) Also a local measure, for each node ci is number of connections existing between neighbors/total number of possible connections. Typical measures of network topology, cont • Diameter (Greatest distance between any two connected nodes) “Small world” if d ∼ log N and strong clustering. (Watts Stogatz, Nature 393, 1998.) • Betweenness centrality (Fraction of shortest paths passing through a node, i.e., is a node a bottleneck for flow?) Typical measures of network topology, cont • Assortative/dissortative mixing (Are nodes with similar attributes more or less likely to link to each other? Mixing by node degree common. Also, in social networks mixing by gender and race.) (Example of assortative mixing by race. Friendship network of HS students: White, African American and Other.) Network Activity: FLOWS on NETWORKS (Spread of disease, routing data, materials transport/flow) Random walk on the network has state transition matrix, P : 0 1 1=4 1=3 1=2 1=4 0 B 1=4 1=3 0 1=4 0 C B C B C B 1=4 0 1=2 0 0 C = P B C @ 1=4 1=3 0 1=4 1=2 A 0 0 0 1=4 1=2 The eigenvalues and eigenvectors convey much information. Markov Chains, Spectral Gap. Random walk on the WWW is the “Page Rank” Page Rank of a node is the steady-state random walk occupancy probabilty. Example Eigen-technique: Community structure (Political Books 2004) M. Girvan and M. E. J. Newman The “classic” random graph, G(N; p) (The Null Model) • P. Erdos¨ and A. Renyi,´ “On random graphs”, Publ. Math. Debrecen. 1959. • P. Erdos¨ and A. Renyi,´ “On the evolution of random graphs”, Publ. Math. Inst. Hungar. Acad. Sci. 1960. • E. N. Gilbert, “Random graphs”, Annals of Mathematical Statistics, 1959. ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● • N ● ● ● ● ● Start with isolated vertices. ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● • Add random edges one-at-a-time. ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● N(N − 1)=2 total edges possible. ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● • After E edges, probability p of any ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● edge is p = 2E=N(N − 1) ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● What does the resulting graph look like? (Typical member of the ensemble) N=300 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●●● ● ●●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● p = 1=400 = 0:0025 p = 1=200 = 0:005 Emergence of a “giant component” • pc = 1=N. • p < pc, Cmax ∼ log(N) • p > pc, Cmax ∼ A · N (Ave node degree t = pN so tc = 1.) Branching process (Galton-Watson); “tree”-like at tc = 1. Is connectivity a good thing? • Communication, transportation networks • Spreading of a virus (human or computer) Random graphs as real-world networks? • What about degree distribution, clustering, assortativity....? – Shown later, Erdos-Renyi yields a Poisson degree distribution, but “configuration” models work around this. – Still need null models to match other properties. • “Network Analysis in the Social Sciences”, S. P. Borgatti, A. Mehra, D. J. Brass, G. Labianca, Science 323, 892-895, 2009. – Why would a real network look like a random one? – Local properties of nodes and edges, not statistics of the network. • Developing the correct null models? Degree distribution of “real-world” networks Extremely broad range of node degree observed: from biological, to technological, to social. Typical distribution in node degree The “Internet” “Who-is-Who” network Faloutsos3, SIGCOMM 1999 Szendroi¨ and Csanyi´ p(k) ∼ k−2:16 p(k) = ck−γe−αk 10000 10000 "971108.out" "980410.out" exp(7.68585) * x ** ( -2.15632 ) exp(7.89793) * x ** ( -2.16356 ) 1000 1000 100 100 10 10 1 1 1 10 100 1 10 100 • Small data sets, power laws vs other similar distributions? 10000 • What is the “Internet”/ what10000 level? (e.g., router vs AS) "981205.out" "routes.out" exp(8.11393) * x ** ( -2.20288 ) exp(8.52124) * x ** ( -2.48626 ) 1000 1000 100 100 10 10 1 1 1 10 100 1 10 100 Power law with exponential tail Ubiquitous empirical measurements: System with: p(x) ∼ x−B exp(−x=C) B C Full protein-interaction map of Drosophila 1:20 0:038 High-confidence protein-interaction map of Drosophila 1:26 0:27 Gene-flow/hydridization network of plants as function of spatial distance 0.75 105 m Earthquake magnitude 1.35 - 1.7 ∼ 1021 Nm Avalanche size of ferromagnetic materials 1.2 - 1.4 L1:4 ArXiv co-author network 1:3 53 MEDLINE co-author network 2:1 ∼ 5800 PNAS paper citation network 0.49 4.21 What is a power law? (Also called a “Pareto Distribution” in statistics). −γ pk ∼ k ln pk ∼ −γ ln k 1e−01 1e−04 p(k) 1e−07 1e−10 1 100 10000 k Power Laws versus Bell Curves: “Heavy tails” −γ • Power law distribution: pk ∼ k . 2 2 • Gaussian distribution: pk ∼ exp(−k =2σ ). 1.0 1e−08 0.8 0.6 1e−20 p(k) p(k) 0.4 1e−32 0.2 1e−44 0.0 1e−56 0 100 200 300 400 500 1 2 5 10 20 50 100 500 k k If 1 < γ < 2, mean and variance ! 1. If 2 < γ < 3 mean is finite, but variance ! 1. Degree distribution and Network Growth Models • Heterogeneity in real networks. • Concentrated, Poisson Distribution in Erdos-R¨ enyi´ : – Probability to connect to k nodes is pk. – Probability to be disconnected from remaining (n − k) is (1 − p)(n−k). – Probability for a vertex to have degree k follows a binomial distribution: n k n−k pk = k p (1 − p) : • Seek alternate mechanisms... Known Mechanisms for Power Laws • Phase transitions (singularities) • Random multiplicative processes (fragmentation) • Combination of exponentials (e.g. word frequencies) • Preferential attachment / Proportional attachment (Polya 1923, Yule 1925, Zipf 1949, Simon 1955, Price 1976, Barabasi´ and Albert 1999) Attractiveness is proportional to size: ds dt / s • Add in saturation [Amaral 2000, Borner¨

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    38 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us