Random Walks and Networks

Workshop on Complex Systems, USP São Paulo, December 5th-9th 2011 Tim Evans Random Walks and Networks Page 1 © Imperial College London Outline 1. Notation 2. Random Walks for Model Building 3. Random Walks and Community Detection • Modularity and Vertex Communities • Overlapping Communities 4. Random Walks for Everything 1. Biased Random Walks 2. Random Walks and the Map Equation 5. From Random Walks to Entropy Page 2 © Imperial College London x a z a NOTATION Page 3 © Imperial College London Notation I will focus on Simple Graphs (no values or directions on edges, no multiple edges, no values for vertices) • N = number of vertices in graph • i,j,.. = indices of vertices • E = number of edges in graph • a, b, … = for edge indices N=6 E=8 Page 4 © Imperial College London Notation – degree of a vertex Number of edges connected to a vertex is called the degree of a vertex • k = degree of a vertex • <k> = average degree = (2E / N) Degree Distribution Degree k=2 • n(k) = number of vertices with degree k • p(k) = n(k)/N = probability a random vertex has degree k Page 5 © Imperial College London Notation - Adjacency Matrix The Adjacency Matrix Aij is • 1 if vertices i and j are attached • 0 if vertices i and j are not attached V6 vertices V1 V2 V3 V4 V5 V6 V4 V1 0 1 1 0 0 0 V5 V2 1 0 1 1 0 0 V2 V3 1 1 0 1 1 0 V4 0 1 1 0 1 1 V3 V5 0 0 1 1 0 0 V1 V6 0 0 0 1 0 0 Page 6 © Imperial College London RANDOM WALKS FOR MODEL BUILDING Page 7 © Imperial College London Generalised Random Graphs – The Molloy-Reed Construction [1995,1998] i. Fix N vertices ii. Attach k stubs to each vertex, where k is drawn from given distribution p(k) iii. Connect pairs of stubs chosen at random Page 8 © Imperial College London No Vertex-Vertex Correlations Generalised Random Graphs have given p(k) but otherwise completely random in particular - Properties of all vertices are the same For any given source vertex, the properties of neighbouring vertices independent of properties of the source vertex Page 9 © Imperial College London Random Walks on Random Graphs The degree distribution of a neighbour is not simply p(k) You are more likely to arrive at a high degree vertex than a low degree one k 1 2 p(k | k ) n p(k ) 3 n i k n kn Degree of neighbour kn independent of degree of starting point ki ki Page 10 © Imperial College London Summary of Generalised Random Graphs • These can be reasonable approximations for many theoretical models • Probably not for real world so then use these as a null model. • Calculations with random graphs work because – lack of correlations between vertices – few loops for large sparse graphs, graphs are basically trees • Accessible analytically so can suggest typical behaviour even if very weak e.g. diameter vs N Page 11 © Imperial College London Random Walks for Natural Scale Free Networks Page 12 © Imperial College London Imperial Medical School (URL, 4 weeks, < 3rd Jan 2005) Long PeriodTails 2 (excluding Haldane), degree distribution in Real Data N=14805 <k>=22.95 slope=-2.4 0 0 0.5 1 1.5 2 2.5 3 1.E+00 -1 1 10 100 1000 10000 100000 1.E-01 -2 1.E-02 Imperial Medical -3 log(p(k)) -4 1.E-03 School Web Site -5 1.E-04 Imperial Library Loans (Hook, TSE) p(k) Poission -6 1.E-05 (Hook, Sooman, Warren, TSE) -7 Distribution p(k) 1.E-06 log(k) 1.E-07 Degree distribution, eBay Crawl (max 1000) 1.E-08 Equiv. 1.E-09 Binomial 0 Logged Degree k 0 1 2 3 4 user -1 in -2 degree -3 -4 log p(k) -5 gallery·future-i·com -6 Text -7 [Yule 1925,1944; frequency -8 (Sooman, Warren, TSE) Simon 1955; in TSE -9 Price 1965,1976; Contemporary log k Barabasi, Albert 1999] Physics review Page 13 © Imperial College London All log(k) vs. log(p(k)) except text log(rank) vs. log(freq.) P(k) Growth with Preferential Attachment [Yule 1925, 1944; Simon 1955; Price 1965,1976; Barabasi,Albert 1999 ] 2/(2E) 1. Add new vertex attached to one end of m=½<k> new 5/(2E) edges 2. Attach other ends to existing 4/(2E) vertices chosen with by picking random end of an existing edge chosen 2/(2E) randomly, so probability is Result: P(k) = k / (2E) Scale-Free Preferential Attachment -g “Rich get Richer” n(k) ~ k Page 14 © Imperial College London g=3 P(k) Growth with Preferential Attachment [Yule 1925, 1944; Simon 1955; Price 1965,1976; Barabasi,Albert 1999 ] 2/(2E) P(k) = k / (2E) Preferential Attachment 5/(2E) “Rich get Richer” 4/(2E) Result: Scale-Free Network n(k) ~ k-g g=3 2/(2E) Page 15 © Imperial College London N=200, <k>~4.0, vertex size k Classical Random Scale-Free = Power-Law p(k)~ 1/k3 Diffuse, small degree Tight core of large hubs vertices k =O(ln(N)) 1/2 max kmax=O(N ) Page 16 © Imperial College London Scale-Free in the Real World Attachment probability used was k 1 P(k) pp pr , 2E N a BUT if limk→∞ P(k) k for any a 1 then a power law degree distribution is not produced! Page 17 © Imperial College London Preferential Attachment for Real Networks [Saramäki, Kaski 2004; TSE, Saramäki 2004] 1. Add a new vertex with ½<k> new edges 2. Attach to existing vertices, found by executing a random walk on the network of L steps Start Walk Probability of arriving at a vertex Here number of ways of arriving at vertex = k, the degree Preferential Attachment =3 Page 18 © Imperial College London g (Can also mix in random attachment with probability pr) Preferential Attachment for Real Networks Probability of arriving at a vertex number of ways of arriving at vertex = k, the degree Preferential Attachment g=3 Start Walk Can also mix in random attachment Here with probability pr Page 19 © Imperial College London Naturalness of the Random Walk algorithm • Gives preferential attachment from any network and hence a scale-free network • Uses only LOCAL information at each vertex – Simon/Barabasi-Albert models use global information in their normalisation • Uses structure of Network to produce the networks – a self-organising mechanism e.g. informal requests for work on the film actor’s social network e.g. finding links to other web pages when writing a new one Page 20 © Imperial College London Is the Walk Algorithm Robust? I varied: <d>~5 •Length of walks L=0 •<k> Pure Random Attachment L=1 •Starting point exponential of walks graph L=7 •Length distribution of walks • ….. YES - Good Power Laws but NOT Universal values - 10% or 20% variation Page 21 © Imperial College London RANDOM WALKS AND COMMUNITY DETECTION Vertex Communities Page 22 © Imperial College London Modularity Q [Girvan & Newman 2002] 1. Assign each vertex i to community ci and for each community:- 2. Add the fraction of edges IN community ci 3. Subtract the number of edges in null model • generalised random graph with same degree distribution A k k Qc ij i j k ci ,c j i, j W W W 1 2 3 where W 2E A Page 23 © Imperial College London ij i, j Random Walk Transition Matrix The transition matrix for a simple unbiased random walk on a network is T where the probability of moving from vertex j i j to vertex i is Aij T j ij k j 0.25 i with strength k j Aij Probability of following an edge i from j to any vertex i is 0.25 Page 24 © Imperial College London Random walk as linear algebra Let wi(t) be the number of random walkers at vertex i at time t (or the probability of finding one walker at i ) then the evolution is simply w(t 1) T.w(t) wi (t 1) Tij wj (t) j Page 25 © Imperial College London Equilibrium Equilibrium reached is eigenvector P1 with largest eigenvalue as 1 n 1 1 n w(t ) P1 For simple networks only we have trivial solution k w(t ) P i i i W Page 26 © Imperial College London Modularity as Random Walk [Delvenne et al, 2008,2010] 1. Assign each vertex i to community ci and for each community:- 2. Add fraction of equilibrium walkers remaining in community after one step 3. Subtract fraction of equilibrium walkers in community after infinite number of steps A k k k Qc ij j i j k ci ,c j i, j k j W W W 1 2 3 Page 27 © Imperial College London Modularity for any network 1. Assign each vertex i to community ci and for each community:- 2. Add fraction of equilibrium walkers remaining in community after one step 3. Subtract fraction of equilibrium walkers in community after infinite number of steps A Qc ij P P P k out j i j ci ,c j i, j k j 3 1 2 Page 28 © Imperial College London RANDOM WALKS AND COMMUNITY DETECTION Overlapping Communities Page 29 © Imperial College London Vertex Centric Viewpoint A network is 1. a set of vertices AND 2. a set of edges We tend to have a very VERTEX centred viewpoint Page 30 © Imperial College London Word Frequencies in Network Review Contemporary Physics 2004] [TSE, Word Rank Count Word Rank Count network 1 254 distribut 21 34 vertic 2 107 scale 21 34 edg 3 86 problem 24 33 random 3 86 simpl 24 33 graph 5 81 idea 26 30 degre 6 78 physic 26 30 power 7 68 size 26 30 lattic 8 67 find 29 29 law 9 65 real 29 29 vertex 10 61 type 31 27 number 11 58 case 32 26 distanc 12 48 hub 33 25 model 13 47 show 33 25 connect 14 46 area 35 24 data 15 40 neighbour 35 24 link 16 38 studi 35 24 world 16 38 point 38 23 larg 18 37 term 38 23 small 19 36 figur 40 22 averag 20 35 form 40 22 comput 21 34 site 40 22 Stop words removed, stemmed, from T.S.Evans “Complex Networks” Contemporary Physics, 2004, 45, 455-474, cond-mat/0405123 Vertex Centric Communities – Vertex Partitions Community = partition of detection via VERTEX set modularity Vertex partition of Karate club graph with optimal modularity [Agarwal & Kempe 2007] Page 32 © Imperial College London Random walk on edges Consider how random walkers pass through edges 1 1 1 ki 2 k j i a j b k Using a, b, … for edge indices Page 33 © Imperial College London Random walk on edges Edge to Edge transition matrix Tab is just aj j b 11 1 2 1 Tab k 2 k j j a j b Page 34 © Imperial College London Random walk on edges Edge-Edge transition matrix Tab defines an adjacency matrix of a Weighted Line Graph WL(G) 1 1 Aab jI (a ,b ) 2 k j a b Page 35 © Imperial College London Vertices of a Line Graph 1.

Random Walks and Networks

Superprocesses and Mckean-Vlasov Equations with Creation of Mass

Lecture 19: Polymer Models: Persistence and Self-Avoidance

8 Convergence of Loop-Erased Random Walk to SLE2 8.1 Comments on the Paper

Random Walk in Random Scenery (RWRS)

Particle Filtering Within Adaptive Metropolis Hastings Sampling

A Course in Interacting Particle Systems

ONE-DIMENSIONAL RANDOM WALKS Definition 1. a Random Walk on the Integers with Step Distribution F and Initial State X ∈ Is A

Stochastic Processes: Theory and Methods. Handbook of Statistics 19 (Ed

Random Walk Simulation of Pn Junctions

Simple Random Walks: Improbability of Profitable Stopping

CONDITIONAL EXPECTATION and MARTINGALES 1.1. Definition of A

Random Walk Methods for Modeling Hydrodynamic Transport in Porous and Fractured Media from Pore to Reservoir Scale