Kinetic Theory of Random Graphs: from Paths to Cycles

E. Ben-Naim^{1,*} and P. L. Krapivsky^{2,†}

^1 Theoretical Division and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545
^2 Center for Polymer Studies and Department of Physics, Boston University, Boston, Massachusetts 02215

* Electronic address: [email protected]
† Electronic address: [email protected]

Structural properties of evolving random graphs are investigated. Treating linking as a dynamic aggregation process, rate equations for the distribution of node-to-node distances (paths) and of cycles are formulated and solved analytically. At the gelation point, the typical length of paths and cycles, l, scales with the component size k as l ~ k^{1/2}. Dynamic and finite-size scaling laws for the behavior at and near the gelation point are obtained. Finite-size scaling laws are verified using numerical simulations.

PACS numbers: 05.20.Dd, 02.10.Ox, 64.60.-i, 89.75.Hc

I. INTRODUCTION

A random graph is a set of nodes that are randomly joined by links. When there are sufficiently many links, a connected component containing a finite fraction of all nodes, the so-called giant component, emerges. Random graphs, with varying flavors, arise naturally in statistical physics, chemical physics, combinatorics, probability theory, and computer science [1-5].

Several physical processes and algorithmic problems are essentially equivalent to random graphs. In gelation, monomers form polymers via chemical bonds until a giant polymer network, a "gel", emerges. Identifying monomers with nodes and chemical bonds with links shows that gelation is equivalent to the emergence of a giant component [6-8]. A random graph is also the most natural mean-field model of percolation [9, 10]. In computer science, satisfiability, in its simplest form, maps onto a random graph [11]. Additionally, random graphs are used to model social networks [12, 13].

Random graphs have been analyzed largely using combinatorial and probabilistic methods [3-5]. An alternative statistical physics methodology is kinetic theory or, equivalently, the rate equation approach. The formation of connected components from disconnected nodes can be treated as a dynamic aggregation process [14-17]. This kinetic approach was used to derive primarily the size distribution of components [18-20].

Recently, we have shown that structural characteristics of random graphs can be analyzed using the rate equation approach [21]. In this study, we present a comprehensive treatment of paths and cycles in evolving random graphs. The rate equation approach is formulated by treating linking as a dynamic aggregation process. This approach allows an analytic calculation of the path length distribution. Since a cycle is formed when two connected nodes are linked, the path length distribution yields the cycle length distribution. More subtle statistical properties of cycles in random graphs can be calculated as well. In particular, the probability that the system contains no cycles and the size distributions of the first, second, etc. cycles are obtained analytically.

We focus on the behavior near and at the phase transition point, namely, when the gel forms. We show that the path and the cycle length distributions approach self-similar distributions near the gelation transition. At the gelation point, these distributions develop algebraic tails. The exact results obtained for an infinite system allow us to deduce scaling laws for finite systems. Using heuristic and extreme statistics arguments, the size of the giant component at the gelation point is obtained. This size scale characterizes the size distribution of components and it leads to a number of scaling laws for the typical path size and cycle size. Extensive numerical simulations validate these scaling laws for finite systems.

The rest of the paper is organized as follows. First, the evolving random graph process is introduced (Sec. II), and then the size distribution of all components is analyzed in Sec. III. Statistical properties of paths are derived in Sec. IV and then used to obtain statistical properties of all cycles (Sec. V) and of the first cycle (Sec. VI). We conclude in Sec. VII. Finally, in an appendix, some details of the contour integration used in the body of the paper are presented.

II. EVOLVING RANDOM GRAPHS

A graph is a collection of nodes joined by links. In a random graph, links are placed randomly. Random graphs may be realized in a number of ways. The links may be generated instantaneously (static graph) or sequentially (evolving graph); additionally, a given pair of nodes may be connected by at most a single link (simple graph) or by multiple links (multi-graph).

We consider the following version of the random graph model. Initially, there are N disconnected nodes. Then, a pair of nodes is selected at random and a link is placed between them (Fig. 1). This linking process continues ad infinitum and it creates an evolving random graph.

FIG. 1: An evolving random graph. Links are indicated by solid lines and the newly added link by a dashed line.

The process is realized dynamically. Links are generated with a constant rate in time, set equal to (2N)^{-1} without loss of generality. There are no restrictions associated with the identity of the two nodes. A pair of nodes may be selected multiple times, i.e., a multi-graph is created. Additionally, the two nodes need not be different, so self-connections are allowed.

At time t, the total number of links is on average Nt/2, the average number of links per node (the degree) is t, and the average number of self-connections per node is N^{-1} t/2. Therefore, whether or not self-connections are allowed is a secondary issue. Since the linking process is completely random, the degree distribution is Poissonian with a mean equal to t.

III. COMPONENTS

The evolving random graph model has several virtues that simplify the analysis. First, the linking process is completely random as there is no memory of previous links. Second, having at hand a continuous variable (time) allows us to use continuum methods, particularly the rate equation approach. This is best demonstrated by the determination of the size distribution of connected components.

As linking proceeds, connected components form. When a link is placed between two distinct components, the two components join. For example, the latest link in Fig. 1 joins two components of size i = 2 and j = 4 into a component of size k = i + j = 6. Generally, there are i × j ways to join disconnected components. Hence, components undergo the following aggregation process
(i, j) \xrightarrow{\ ij/2N\ } i + j.   (1)
Two components aggregate with a rate proportional to the product of their sizes.

A. Infinite Random Graph

Let c_k(t) be the density of components containing k nodes at time t. In terms of N_k(t), the total number of components with k nodes, c_k(t) = N_k(t)/N. For finite random graphs, both N_k(t) and c_k(t) are random variables, but in the N → ∞ limit the density c_k(t) becomes a deterministic quantity. It evolves according to the nonlinear rate equation (the explicit time dependence is dropped for simplicity)
\frac{dc_k}{dt} = \frac{1}{2}\sum_{i+j=k} (i c_i)(j c_j) - k c_k.   (2)
The initial condition is c_k(0) = δ_{k,1}. The gain term accounts for components generated by joining two smaller components whose sizes sum up to k. The second term on the right-hand side of Eq. (2) represents loss due to linking of components of size k to other components. The corresponding gain and loss rates follow from the aggregation rule (1).

The rate equations can be solved using a number of techniques. Throughout this investigation, we use a convenient method in which the time dependence is eliminated first. Solving the rate equations recursively yields c_1 = e^{-t}, c_2 = (1/2) t e^{-2t}, c_3 = (1/2) t^2 e^{-3t}, etc. These explicit results suggest that c_k(t) = C_k t^{k-1} e^{-kt}. Substituting this form into (2), we find that the coefficients C_k satisfy the recursion relation
(k-1)\, C_k = \frac{1}{2}\sum_{i+j=k} (i C_i)(j C_j)   (3)
subject to C_1 = 1. This recursion is solved using the generating function approach. The form of the right-hand side of Eq. (3) suggests utilizing the generating function of the sequence k C_k rather than C_k, i.e., G(z) = \sum_k k C_k e^{kz}. Multiplying Eq. (3) by k e^{kz} and summing over all k, we find that the generating function satisfies the nonlinear ordinary differential equation
(1 - G)\,\frac{dG}{dz} = G.   (4)
Integrating this equation, z = \ln G - G + A, and using the asymptotics G → e^z as z → -∞ fixes the constant A = 0. Thus, we arrive at an implicit solution for the generating function
G\, e^{-G} = e^z.   (5)
The coefficients C_k can be extracted from (5) via the Lagrange inversion formula, or using contour integration as detailed in Appendix A. Substituting r = 1 in Eq. (A1) yields C_k = k^{k-2}/k!, reproducing the well-known result for the size distribution [18, 19]
c_k(t) = \frac{k^{k-2}}{k!}\, t^{k-1} e^{-kt}.   (6)
In the following, we shall often use the generating function for the size distribution c(z, t) = \sum_k k c_k(t) e^{kz}. This generating function is readily expressed via the auxiliary generating function G(z) = \sum_k k C_k e^{kz}:
c(z, t) = t^{-1}\, G(z + \ln t - t).   (7)
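As a concrete illustration of the hierarchy (2), the following minimal Python sketch (ours, not part of the original analysis) integrates a truncated set of rate equations with a forward Euler step and compares the result against the exact solution (6) in the pre-gel regime; the cutoff kmax, the time step, and the comparison time are arbitrary choices.

import math

kmax = 60        # truncate the hierarchy at components of size kmax
dt = 1e-4        # Euler time step
t_final = 0.5    # compare in the pre-gel regime, t < 1

c = [0.0] * (kmax + 1)
c[1] = 1.0       # initial condition c_k(0) = delta_{k,1}

for _ in range(int(t_final / dt)):
    dc = [0.0] * (kmax + 1)
    for k in range(1, kmax + 1):
        gain = 0.5 * sum(i * c[i] * (k - i) * c[k - i] for i in range(1, k))
        dc[k] = gain - k * c[k]          # Eq. (2): gain by merging, loss by linking
    for k in range(1, kmax + 1):
        c[k] += dt * dc[k]

# exact solution, Eq. (6): c_k(t) = k^{k-2}/k! t^{k-1} e^{-kt}
for k in (1, 2, 3, 5, 10):
    exact = k ** (k - 2) / math.factorial(k) * t_final ** (k - 1) * math.exp(-k * t_final)
    print(k, c[k], exact)

Since the gain term for size k involves only smaller sizes, the truncated system is closed from below, so the truncation itself does not affect c_k for k ≤ kmax; the residual discrepancy comes only from the finite time step.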

Let us consider the fraction of nodes in finite components, M_1 = \sum_k k c_k(t). This quantity is merely the first moment of the size distribution (hence the notation). Equivalently, M_1 = c(z = 0, t). From (7) we find M_1 = τ/t with τ = G(\ln t - t). Using (5), we express τ through t:
τ e^{-τ} = t e^{-t}.   (8)
For t < 1, there is a single root τ = t, and all nodes reside in finite components, M_1 = 1. For t > 1, the physical root satisfies τ < t and only a fraction of the nodes resides in finite components, M_1 < 1. Thus, at time t = 1, the system undergoes a gelation transition with a finite fraction of the nodes contained in infinite components. We term this time the gelation time, t_g = 1. In the late stages of the evolution, t ≫ 1, one has τ ≃ t e^{-t} and M_1 ≃ c_1 = e^{-t}, so the system consists of a single giant component and a small number of isolated nodes.

The behavior at and near the transition point is of special interest. The critical behavior of the component size distribution is echoed by other quantities, as will be shown below. Size distributions become algebraic near the critical point. Moreover, there is a self-similar behavior as a function of time (dynamical scaling) and as a function of the system size (finite-size scaling).

At the gelation point, the component size distribution has an algebraic large-size tail, obtained using the Stirling formula,
c_k ≃ C k^{-5/2},   (9)
with C = (2π)^{-1/2}. [Throughout this paper, bold letters are used for critical distributions, so c_k ≡ c_k(t = 1).] In the vicinity of the gelation time, the size distribution is self-similar, c_k(t) → (1-t)^5 Φ_c(k(1-t)^2), with the scaling function
Φ_c(ξ) = (2π)^{-1/2}\, ξ^{-5/2} \exp(-ξ/2).   (10)
Thus, the characteristic component size diverges near the gelation point, k ~ (1-t)^{-2}.

B. Finite Random Graphs

In the previous subsection, we applied kinetic theory to an infinite system. This approach can be extended to finite systems. Unfortunately, such treatments are very cumbersome [22, 23]. Since the number of components is finite, the fluctuations are no longer negligible, and instead of a deterministic rate equation approach, a stochastic approach is needed. Here we follow an alternative path, employing the exact infinite system results in conjunction with scaling and extreme statistics arguments.

The characteristic size of components at the gelation point exhibits a nontrivial dependence on the system size. This is conveniently seen via the cumulative size distribution. The size of the largest component in the system, k_g, is estimated from the extreme statistics criterion, N \sum_{k ≥ k_g} c_k ~ 1, to be
k_g ~ N^{2/3}.   (11)
The largest component in the system grows sub-linearly with the system size [3]. The time by which this component emerges approaches unity for large enough systems, as follows from the diverging characteristic size scale k_g ~ (1 - t_g)^{-2},
1 - t_g ~ N^{-1/3}.   (12)

The maximal component size (11) underlies the entire size distribution. Let c_k(N, t) be the size distribution in a system of size N at time t. At the gelation point, the size distribution c_k(N) ≡ c_k(N, t = 1) obeys the finite-size scaling form (Figs. 2 and 3)
c_k(N) ~ N^{-5/3}\, Ψ_c(k N^{-2/3}).   (13)
The scaling function has the following extremal behaviors
Ψ_c(ξ) ≃ (2π)^{-1/2} ξ^{-5/2}  for ξ ≪ 1;   Ψ_c(ξ) ≃ \exp(-ξ^γ)  for ξ ≫ 1.   (14)
The small-ξ behavior corresponds to sizes well below the characteristic size and thus reflects the infinite system behavior (9). The large-ξ behavior was obtained numerically, with γ ≅ 3. To appreciate the large-ξ asymptotic, let us estimate the probability that the system managed to generate the largest possible component, of size N/2, at time t = 1. A lower bound for this probability can be established via a "greedy" evolution which assumes that after k linking events the graph is composed of a tree of size k+1 and N-k-1 disconnected nodes. Such an evolution occurs with probability
\frac{2}{N}\cdot\frac{N-2}{N} \times \frac{3}{N}\cdot\frac{N-3}{N} \times \cdots \times \frac{N/2}{N}\cdot\frac{N/2}{N} \sim \frac{N!}{N^N},
that scales as e^{-N}. While this lower bound is not necessarily optimal, it suggests that the actual probability is exponentially small. The scaling variable ξ = k N^{-2/3} becomes ξ ~ N^{1/3} for k = N/2, so \exp(-N^{γ/3}) matches the probability \exp(-N) when γ = 3.

To check the critical behavior in finite systems, we performed numerical simulations. In the simulations, N/2 links are placed randomly and sequentially among the N nodes as follows. A node is drawn randomly, and then another node is drawn randomly. Last, these two nodes are linked. Self-connections are therefore allowed. The simulations differ slightly from the above random graph model in that the number of links is not a stochastic variable. For large N, this simulation is faithful to the evolving random graph model because the number of links is self-averaging.
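A minimal Monte Carlo sketch of the procedure just described (our illustration, not the authors' code): N/2 links are placed between randomly drawn nodes, components are tracked with a union-find structure, and the resulting component size distribution at the gelation point can be compared with the infinite-system result (6) at t = 1 and with the scaling form (13). The system size and the number of runs are arbitrary.

import math
import random
from collections import Counter

def component_counts(N, rng):
    """Place N/2 random links (self-connections allowed) and return {k: N_k}."""
    parent = list(range(N))
    size = [1] * N

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for _ in range(N // 2):
        a, b = find(rng.randrange(N)), find(rng.randrange(N))
        if a != b:                          # a self-connection leaves components unchanged
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a
            size[a] += size[b]

    node_weighted = Counter(size[find(x)] for x in range(N))   # k -> k * N_k
    return {k: nodes // k for k, nodes in node_weighted.items()}

rng = random.Random(1)
N, runs = 10 ** 5, 10
hist = Counter()
for _ in range(runs):
    for k, Nk in component_counts(N, rng).items():
        hist[k] += Nk

# c_k(N) = <N_k>/N versus the infinite-system value c_k(1) = k^{k-2}/k! e^{-k}, Eq. (6)
for k in (1, 2, 4, 8, 16, 32):
    exact = k ** (k - 2) / math.factorial(k) * math.exp(-k)
    print(k, hist[k] / (runs * N), exact)

Deviations from the infinite-system curve set in around the characteristic size k ~ N^{2/3} of Eq. (11), where the finite-size scaling function (13) departs from its small-argument form.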

FIG. 2: The size distribution for a finite system at the gelation point. Shown is c_k(N) versus k for various N. The infinite system behavior, Eq. (9), is shown for reference. The data represents an average over 10^6 independent realizations.

FIG. 3: Finite-size scaling of the size distribution. Shown is (2πξ^5)^{1/2} Ψ_c(ξ) versus ξ, obtained from simulations with various N.

The simulation results are consistent with the postulated finite-size scaling form (13). We note that the scaling function Ψ_c(ξ) converges slowly as a function of N. The simulations reveal an interesting behavior of the finite-size scaling function. The function c_k(N) has a "shoulder", a non-monotonic behavior compared with the pure algebraic behavior (9) characterizing infinite systems (Fig. 2). The properly normalized scaling function (2πξ^5)^{1/2} Ψ_c(ξ) is a non-monotonic function of ξ (Fig. 3). Obtaining the full functional form of the scaling function Ψ_c(ξ) remains a challenge. A very similar shoulder has been observed for the degree distribution of finite random networks generated by preferential attachment [24-27].

IV. PATHS

Structural characteristics of components can be investigated in a similar fashion. By definition, every two nodes in a component are connected. In other words, there is a path consisting of adjacent links between two such nodes. We investigate statistical properties of paths in components. Characterization of paths yields useful information regarding the connectivity of components as well as internal structures such as cycles.

For every node in the graph, there are (generally) multiple paths that connect it with all other nodes in the respective component. With new links, new paths are formed. For every pair of paths of lengths n and m originating at two separate nodes, a new path is formed as follows
(n, m) → n + m + 1.   (15)
In Fig. 1, linking two paths of respective lengths n = 1 and m = 2 generates a path of length n + m + 1 = 4. Thus, paths also undergo an aggregation process. However, this aggregation process is simpler than (1) because the aggregation rate is independent of the path length.

Let q_l(t) be the density of distinct paths containing l links at time t. By distinct we mean that the two paths connecting two nodes are counted separately. By definition, q_0(t) = 1. The rest of the densities grow according to the rate equation
\frac{dq_l}{dt} = \sum_{n+m=l-1} q_n q_m   (16)
for l > 0. The initial condition is q_l(0) = δ_{l,0}. This rate equation reflects the uniform aggregation rate. Another notable feature is the lack of a loss term: once a path is created, it remains forever. Solving recursively gives q_1 = t, q_2 = t^2, etc. By induction, the path length density is
q_l(t) = t^l.   (17)
Indeed, this expression satisfies both the rate equation and the initial condition. The first quantity, q_1 = t, is consistent with the facts that the link density is equal to t/2 and that every link corresponds to two distinct paths of length one.

The above path density represents an aggregate over all nodes and all components. Characterization of path statistics in a component of a given size is achieved via p_{l,k}, the density of paths of length l in components of size k. Note the obvious length bounds 0 ≤ l ≤ k-1 and the sum rule \sum_l p_{l,k} = k^2 c_k, reflecting that there are k^2 distinct paths in a component of size k (every pair of nodes is connected). The density of the linkless paths is p_{0,k} = k c_k, because k c_k is the probability that a node belongs to a component of size k.
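The prediction (17) is straightforward to test in the pre-gel regime: generate a graph at some t < 1 and, for every node, count by breadth-first search the nodes at each distance l; since almost all components are trees, this count is essentially the number of distinct paths of length l. The sketch below is our illustration; the values of N and t are arbitrary.

import random
from collections import defaultdict, deque

N, t = 10 ** 5, 0.5
rng = random.Random(2)

adj = defaultdict(list)
for _ in range(int(N * t / 2)):            # Nt/2 links corresponds to "time" t
    a, b = rng.randrange(N), rng.randrange(N)
    adj[a].append(b)
    adj[b].append(a)

paths = defaultdict(int)                    # paths[l]: ordered node pairs at distance l
for source in range(N):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    for d in dist.values():
        if d > 0:
            paths[d] += 1

# density of distinct paths of length l: q_l = paths[l]/N; prediction q_l = t^l, Eq. (17)
for l in range(1, 6):
    print(l, paths[l] / N, t ** l)

The rare components that contain a cycle (of order one of them in the whole system) make the true path count differ slightly from the breadth-first count, a correction that vanishes in the N → ∞ limit.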

We have seen that components and paths form via the aggregation processes (1) and (15), respectively. The joint distribution p_{l,k} therefore undergoes a bi-aggregation process [28]. In the present case,
(n, i) + (m, j) → (n + m + 1, i + j),   (18)
where the first index corresponds to the path length and the second to the component size. The joint distribution evolves according to the rate equation
\frac{dp_{l,k}}{dt} = \sum_{i+j=k}\ \sum_{n+m=l-1} p_{n,i}\, p_{m,j} + \sum_{i+j=k} (i p_{l,i})(j c_j) - k p_{l,k}.   (19)
The initial conditions are p_{l,k}(0) = δ_{k,1} δ_{l,0}. The first term on the right-hand side of Eq. (19) describes newly formed paths due to linking. The last two terms correspond to paths that do not contain the newly placed link.

We now repeat the steps used to determine the size distribution. The time dependence is eliminated using the ansatz p_{l,k} = P_{l,k} t^{k-1} e^{-kt}. The corresponding coefficients P_{l,k} satisfy the recursion
(k-1)\, P_{l,k} = \sum_{i+j=k}\ \sum_{n+m=l-1} P_{n,i}\, P_{m,j} + \sum_{i+j=k} (i P_{l,i})(j C_j).   (20)
The generating function P_l(z) = \sum_k P_{l,k} e^{kz} satisfies the recursion relation (1 - G)\, dP_l/dz = \sum_{n+m=l-1} P_n P_m + P_l for l > 0. Dividing this equation by (4) yields
G\,\frac{dP_l}{dG} = \sum_{n+m=l-1} P_n P_m + P_l   (21)
for l > 0. As noted above, P_{0,k} = k C_k, so P_0(z) = G(z). Solving Eq. (21) recursively gives P_1 = G^2, P_2 = G^3, etc. In general,
P_l(z) = G^{l+1}(z).   (22)
This solution can be validated directly. The time-dependent generating function p_l(z) = \sum_k p_{l,k} e^{kz} is therefore p_l(z) = t^{-1} G^{l+1}(z + \ln t - t). The total density of paths of length l, p_l(z = 0) = t^l, coincides with (17) prior to the gelation transition (t < 1) because all components are finite. However, the total number of paths is reduced, p_l(z = 0) = t^{-1} τ^{l+1}, past the gelation time (t > 1).

One may also obtain the bivariate generating function p(z, w) = \sum_{l,k} p_{l,k} w^l e^{kz}. Using (22) one gets
p(z, w) = t^{-1}\,\frac{G(z + \ln t - t)}{1 - w\, G(z + \ln t - t)}.   (23)
The total density of paths in finite components is of course g = \sum_{l,k} p_{l,k}, so g ≡ p(z = 0, w = 1). Generally, g = τ/[t(1 - τ)]; for t < 1 the total density of paths is g(t) = (1 - t)^{-1}.

The coefficients are found via the contour integration P_{l,k} = (2πi)^{-1} ∮ dy\, P_l\, y^{-k-1} (see Appendix A). Substituting r = l + 1 in Eq. (A1) yields P_{l,k} = (l + 1)\, k^{k-l-2}/(k - l - 1)!. As a result, the density of paths of length l in components of size k is
p_{l,k} = (l + 1)\,\frac{k^{k-l-2}}{(k - l - 1)!}\, t^{k-1} e^{-kt}.   (24)
Comparing (24) and (6), we notice that the densities of the two shortest paths satisfy p_{0,k} = k c_k and p_{1,k} = 2(k - 1) c_k. The latter reflects that there are k - 1 links in a tree of size k and that with unit probability all components are trees (as discussed in the next section). Note also that the longest possible path, l = k - 1, corresponds to linear (chain-like) components. According to Eq. (24), the density of such paths is p_{k-1,k} = t^{k-1} e^{-kt}. This density decays exponentially with length, so these components are typically small, their length being of the order one.

The path length density can be simplified in the large-k limit by considering the properly normalized ratio of factorials
\frac{k!}{(k-l)!\, k^l} = \prod_{j=1}^{l-1}\left(1 - \frac{j}{k}\right) = \exp\left(-\sum_{j=1}^{l-1}\left[\frac{j}{k} + \frac{j^2}{2k^2} + \ldots\right]\right) \simeq \exp(-l^2/2k).
Using the Stirling formula, in the limits k ≫ 1 and l ≫ 1, the path density becomes
p_{l,k} ≃ l\,(2πk^3)^{-1/2}\, t^{k-1} e^{k(1-t)}\, e^{-l^2/2k}.   (25)
As was the case for the component size distribution, the path length density is self-similar in the vicinity of the gelation point, p_{l,k} → (1-t)^2 Φ_p(k(1-t)^2, l(1-t)), with the scaling function
Φ_p(ξ, η) = η\,(2πξ^3)^{-1/2} \exp(-η^2/2ξ).   (26)
Thus, the characteristic path length diverges near the gelation point, l ~ (1-t)^{-1}.

At the critical point, the path length density becomes
p_{l,k} ≃ l\,(2πk^3)^{-1/2} \exp(-l^2/2k).   (27)
It is evident that the typical path length scales as the square root of the component size,
l ~ k^{1/2}.   (28)
For finite systems, the scaling law for the typical path length (28), combined with the characteristic component size (11), leads to the following characteristic path length
l ~ N^{1/3}.   (29)
One can deduce several other scaling laws and finite-size scaling functions underlying the path density. For example, substituting the gelation time 1 - t_g ~ N^{-1/3} into the total number of paths g = (1 - t)^{-1} yields g ~ N^{1/3}.
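The closed form quoted before Eq. (24) can be cross-checked without any contour integration by iterating the recursion (20) in exact rational arithmetic, with C_k = k^{k-2}/k! and P_{0,k} = k C_k as input. The sketch below is ours; the cutoff kmax and the sampled (l, k) pairs are arbitrary.

from fractions import Fraction
from math import factorial

kmax = 12
C = {k: Fraction(k ** (k - 2), factorial(k)) for k in range(1, kmax + 1)}

P = {(0, k): k * C[k] for k in range(1, kmax + 1)}     # P_{0,k} = k C_k
zero = Fraction(0)
for l in range(1, kmax):
    P[(l, 1)] = zero                                   # a single node carries no path of length l >= 1
    for k in range(2, kmax + 1):
        gain = sum(P.get((n, i), zero) * P.get((l - 1 - n, k - i), zero)
                   for n in range(l) for i in range(1, k))
        transfer = sum(i * P.get((l, i), zero) * (k - i) * C[k - i]
                       for i in range(1, k))
        P[(l, k)] = (gain + transfer) / (k - 1)        # Eq. (20) solved for P_{l,k}

# closed form: P_{l,k} = (l+1) k^{k-l-2} / (k-l-1)!
for l, k in [(1, 3), (2, 5), (3, 8), (5, 10)]:
    exact = Fraction((l + 1) * k ** (k - l - 2), factorial(k - l - 1))
    print(l, k, P[(l, k)] == exact)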

V. CYCLES

Each component has a certain number of nodes and links. The complexity of a component is defined as the number of links minus the number of nodes. Components with complexity -1 are trees; components with complexity 0 and 1 are termed unicyclic and bicyclic, correspondingly. Finite components are predominantly trees. We have seen that the overall number of links is proportional to N and that the overall number of self-links is of the order unity. The overall numbers of trees and of unicyclic components mirror this behavior. Generally, the number of components of complexity R is proportional to N^{-R} (this result is well known, see e.g. [5, 21] and especially [29]). Therefore, it suffices to characterize trees and unicyclic components only.

Each unicyclic component contains a single cycle. Cycles are an important characteristic of a graph [30, 31]. In this section, we analyze cycles and unicyclic components using the rate equation approach. We first note that cycles in random graphs were also studied using various other approaches: Janson [32, 33] employs probabilistic and combinatorial techniques; Marinari and Monasson [31] assign an Ising spin to each node and deduce certain properties of loops from the partition function of the Ising model; Burda et al. [34] modify a random graph model to favor the creation of short cycles, and examine the model using a diagrammatic technique. A number of authors also studied cycles on information networks like the Internet (see [35] and references therein).

A. Infinite System

There is a significant difference between the distribution of trees and that of unicyclic components. In the thermodynamic limit, the number of trees is extensive and, as a result, it is a deterministic, or self-averaging, quantity. The number of unicyclic components is not extensive, but rather of the order unity; as a result, it is a random quantity with a nontrivial distribution even for infinite random graphs. In what follows, we study the average number of unicyclic components of a given size or cycle length.

The average number of cycles follows directly from the path length density. Quite simply, when the two extremal nodes in a path are linked, a cycle is born. Let the number of cycles of size l at time t be w_l(t). It grows according to the rate equation
\frac{dw_l}{dt} = \frac{1}{2}\, q_{l-1}.   (30)
The right-hand side equals the link creation rate 1/(2N) times the total number of paths N q_{l-1}; indeed, the total number of cycles of a given length is of the order one. The cycle length distribution is
w_l = \frac{t^l}{2l}.   (31)
In particular, at the gelation point, the cycle length distribution is inversely proportional to the cycle length [5],
w_l = (2l)^{-1}.   (32)
This result can alternatively be obtained using combinatorics.

To characterize cycles in a given component size, we consider the joint distribution u_{l,k}, the average number of unicyclic components of size k containing a cycle of length l, with 1 ≤ l ≤ k. This joint distribution evolves according to the linear rate equation
\frac{du_{l,k}}{dt} = \frac{1}{2}\, p_{l-1,k} + \sum_{i+j=k} (i u_{l,i})(j c_j) - k u_{l,k}   (33)
for l ≥ 1. Initially there are no cycles, and therefore u_{l,k}(0) = 0. Eliminating the time dependence via the substitution u_{l,k} = U_{l,k} t^k e^{-kt}, the coefficients satisfy the recursion
k\, U_{l,k} = \frac{1}{2}\, P_{l-1,k} + \sum_{i+j=k} (i U_{l,i})(j C_j).   (34)
Using the generating function U_l(z) = \sum_k e^{kz} U_{l,k}, this recursion is recast into the differential equation (1 - G)\, dU_l/dz = \frac{1}{2} P_{l-1}. Dividing by (4), we obtain
\frac{dU_l}{dG} = \frac{1}{2}\, G^{l-1}.   (35)
Integrating this equation yields the generating function
U_l(z) = \frac{1}{2l}\, G^l(z).   (36)
Consequently, the cycle length distribution restricted to finite components is τ^l/(2l), in agreement with (31) prior to the gelation time (t < 1).

Additionally, the joint generating function defined as u(z, w) = \sum_{l,k} e^{kz} w^l u_{l,k} is given by
u(z, w) = \frac{1}{2} \ln \frac{1}{1 - w\, G(z + \ln t - t)}.   (37)
As for paths, statistics of cycles are directly coupled to statistics of components via the generating function G(z). The total number of unicyclic components of finite size, h = \sum_{l,k} u_{l,k}, is therefore
h(t) = \frac{1}{2} \ln \frac{1}{1 - τ}.   (38)
Below the gelation point, h(t) = \frac{1}{2} \ln \frac{1}{1-t} for t < 1. The total number of unicyclic components can alternatively be obtained by noting that (i) it satisfies the rate equation dh/dt = \frac{1}{2}\sum_k k^2 c_k = \frac{1}{2} M_2, and (ii) the second moment of the size distribution is M_2 = (1 - t)^{-1} for t < 1, as follows from (7).
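The moment argument for h(t) is easy to check numerically: integrating dh/dt = M_2/2, with M_2(t) computed directly from the explicit size distribution (6), should reproduce h(t) = (1/2) ln[1/(1-t)] of Eq. (38) for t < 1. A small sketch (ours; the series cutoff and the quadrature step are arbitrary):

import math

def M2(t, kmax=3000):
    """Second moment M_2 = sum_k k^2 c_k(t) with c_k(t) taken from Eq. (6)."""
    total = 0.0
    for k in range(1, kmax + 1):
        # k^2 c_k(t) = (k^k / k!) t^{k-1} e^{-kt}; evaluate in logs to avoid overflow
        log_term = k * math.log(k) - math.lgamma(k + 1) + (k - 1) * math.log(t) - k * t
        total += math.exp(log_term)
    return total

def h_numeric(T, steps=300):
    """Integrate dh/dt = M_2/2 from 0 to T with the midpoint rule."""
    dt = T / steps
    return sum(0.5 * M2((i + 0.5) * dt) * dt for i in range(steps))

for T in (0.3, 0.6, 0.9):
    print(T, h_numeric(T), 0.5 * math.log(1.0 / (1.0 - T)))   # compare with Eq. (38)

Near the gelation point the series for M_2 converges slowly, so the cutoff kmax matters more as t → 1.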

The coefficients underlying the cycle distribution are found using contour integration. Writing U_{l,k} = (2πi)^{-1} ∮ U_l\, y^{-k-1} dy and substituting r = l in (A1) gives U_{l,k} = \frac{1}{2}\, k^{k-l-1}/(k-l)! [4]. The cycle length-size distribution is therefore
u_{l,k}(t) = \frac{1}{2}\,\frac{k^{k-l-1}}{(k-l)!}\, t^k e^{-kt}.   (39)
The smallest cycle, l = 1, is a self-connection, and the average number of such cycles is u_{1,k} = \frac{t}{2}\, k c_k. The largest cycles are rings, l = k, and their total number is on average u_{k,k} = \frac{1}{2k}\, t^k e^{-kt}.

The large-k behavior of the cycle length distribution is found following the same steps leading to (25),
u_{l,k}(t) ≃ (8πk^3)^{-1/2}\, t^k e^{k(1-t)}\, e^{-l^2/2k}.   (40)
This distribution is self-similar in the vicinity of the gelation transition, u_{l,k}(t) → (1-t)^3 Φ_u(k(1-t)^2, l(1-t)), with the scaling function
Φ_u(ξ, η) = (8πξ^3)^{-1/2} \exp(-η^2/2ξ).   (41)
We see that the cycle length is characterized by the same scale as the path length, l ~ (1-t)^{-1}. At the gelation point, the distribution is
u_{l,k} ≃ (8πk^3)^{-1/2} \exp(-l^2/2k).   (42)
Fixing the component size, the typical cycle length behaves as the typical path length, l ~ k^{1/2}.

The size distribution of unicyclic components is found from the joint distribution, v_k = \sum_l u_{l,k}. Using (39) we get [21]
v_k(t) = \frac{1}{2}\left(\sum_{n=0}^{k-1} \frac{k^{n-1}}{n!}\right) t^k e^{-kt}.   (43)
This distribution can alternatively be derived from the linear rate equation
\frac{dv_k}{dt} = \frac{1}{2}\, k^2 c_k + \sum_{i+j=k} (i v_i)(j c_j) - k v_k.   (44)
This equation is obtained from (33) using the equality k^2 c_k = \sum_l p_{l,k}. It reflects that linking a pair of nodes in a component generates a unicyclic component. Integrating (42) over the cycle length, the critical size distribution of unicyclic components has an algebraic tail
v_k ≃ (4k)^{-1}.   (45)

B. Finite Systems

We turn now to finite systems, restricting our attention to the gelation point. The total number of unicyclic components is obtained by estimating h(N, t_g). Substituting (12) into (38) shows that the average number of unicyclic components (and hence, cycles) grows logarithmically with the system size (Fig. 4),
h(N) ≃ \frac{1}{6} \ln N.   (46)

FIG. 4: The total number of unicyclic components versus the system size at the gelation point. Shown is h versus N. Each data point represents an average over 10^6 independent realizations.

Comparing the path length distribution (27) and the cycle length distribution (42), we conclude that the characteristic cycle length and the characteristic path length obey the same scaling law, l ~ N^{1/3}. This implies that the cycle length distribution in a finite system of size N, w_l(N), obeys the finite-size scaling law
w_l(N) ~ N^{-1/3}\, Ψ_w(l N^{-1/3}).   (47)
Numerical simulations confirm this behavior (Fig. 5). In the simulations, analysis of cycle statistics requires us to keep track of all links. Cycles are conveniently identified using the standard "shaving" algorithm. Dangling links, i.e., links involving a single-link node, are removed from the system sequentially. The link removal procedure is carried out until no dangling links remain. At this stage, the system contains no trees. Simple cycles are those components with an equal number of links and nodes.

The extremal behaviors of the finite-size scaling function are as follows
Ψ_w(η) ≃ (2η)^{-1}  for η → 0;   Ψ_w(η) ≃ \exp(-C η^{3/2})  for η → ∞.   (48)
The small-η behavior follows from (32). Statistics of extremely large cycles can be understood by considering the largest possible cycles. When there are n = N/2 links, the largest possible cycle has length l = N/2. Its likelihood, w(n, 2n), is obtained using combinatorics
w(n, 2n) = \binom{2n}{n} \times \frac{n!}{2n} \times (2n)^{-n}.   (49)

There are \binom{2n}{n} ways to choose the nodes participating in the cycle, and the next term is the number of ways to arrange them in a cycle. The corrective factor 2n accounts for rotation and reflection symmetries. The last term is the probability that each pair of consecutive nodes are linked. The large-n asymptotic behavior is
w(n, 2n) ≃ \frac{1}{\sqrt{2n}}\left(\frac{2}{e}\right)^n.   (50)
Therefore, w(n, 2n) ~ \exp(-C N). Substituting l ~ N into the scaling form (47) leads to the super-exponential behavior Ψ_w(η) ~ \exp(-C η^{3/2}), see Fig. 6.

FIG. 5: Finite-size scaling of the cycle-length distribution. Shown is 2η Ψ_w(η) versus η, obtained using systems with size N = 10^4, 10^5, and 10^6. The data represents an average over 10^6 independent realizations.

FIG. 6: The tail of the scaling function. Shown is 2η Ψ_w(η) versus η^{3/2}.

FIG. 7: The average cycle size at the gelation point. Shown is ⟨l(N)⟩ h(N) versus N. Each data point represents an average over 10^6 independent realizations.

Typically, cycles are of size N^{1/3}. The average moments ⟨l^n(N)⟩ = \sum_l l^n w_l(N)/\sum_l w_l(N) reflect this law. However, the algebraic divergence, w_l ~ l^{-1}, leads to a logarithmic correction, as follows from (46)-(48):
⟨l^n(N)⟩ ~ N^{n/3} [\ln N]^{-1}.   (51)
The behavior of the average cycle length is verified numerically (Fig. 7).

Finite-size scaling of other cycle statistics, such as the joint distribution, can be constructed following the same procedure. For example, the size distribution of unicyclic components should follow the scaling form
v_k(N) ~ N^{-2/3}\, Ψ_v(k N^{-2/3}).   (52)
The scaling function diverges, Ψ_v(ξ) ≃ (4ξ)^{-1} for ξ → 0.

VI. THE FIRST CYCLE

The above statistical analysis of cycles characterizes the average behavior but not necessarily the typical one, because the number of cycles is a fluctuating quantity. There are numerous interesting features concerning cycles that are not captured by the average number of cycles. For instance, what is the probability that the system does not contain a cycle up to time t? It suffices to answer this question in the pre-gel regime, as the giant component certainly contains cycles.

Let s_0(t) be the (survival) probability that the system does not contain a cycle at time t. The cycle production rate is J = dh/dt = \frac{1}{2(1-t)}. The number of cycles is finite in the pre-gel regime, since cycles are independent of each other in the N → ∞ limit. This assertion (supported by numerical simulations, see Fig. 8) implies that the cycle production process is completely random.
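Both the Poisson picture and the logarithmic growth (46) can be probed with the shaving algorithm described in Sec. V B. The sketch below (our illustration) places N/2 links, strips dangling links iteratively, and counts the independent cycles of what remains; at the gelation point nearly all cyclic components are unicyclic, so this cycle rank is essentially the number of cycles discussed here. The system size and the number of realizations are arbitrary.

import math
import random
from collections import Counter, deque

def count_cycles(N, rng):
    """Place N/2 random links, shave degree-1 nodes, and return the cycle rank
    (number of independent cycles) of the remaining graph."""
    edges, adj, deg = [], [[] for _ in range(N)], [0] * N
    for eid in range(N // 2):
        a, b = rng.randrange(N), rng.randrange(N)
        edges.append((a, b))
        adj[a].append((b, eid))
        adj[b].append((a, eid))
        deg[a] += 1
        deg[b] += 1                        # a self-connection contributes 2 to deg[a]
    alive = [True] * len(edges)

    queue = deque(v for v in range(N) if deg[v] == 1)
    while queue:                           # shaving: remove dangling links sequentially
        v = queue.popleft()
        if deg[v] != 1:
            continue
        for w, eid in adj[v]:
            if alive[eid]:
                alive[eid] = False
                deg[v] -= 1
                deg[w] -= 1
                if deg[w] == 1:
                    queue.append(w)
                break

    live = [e for e, ok in zip(edges, alive) if ok]
    nodes = {v for e in live for v in e}
    parent = {v: v for v in nodes}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b in live:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    comps = len({find(v) for v in nodes})
    return len(live) - len(nodes) + comps  # cycle rank = E - V + C

rng = random.Random(3)
N, runs = 10 ** 5, 100
counts = Counter(count_cycles(N, rng) for _ in range(runs))
mean = sum(n * c for n, c in counts.items()) / runs
print(sorted(counts.items()), mean, math.log(N) / 6)   # compare the mean with Eq. (46)

The histogram of counts can be compared with a Poisson distribution of the same mean, as in Fig. 8; agreement with (1/6) ln N is only logarithmic in N and therefore improves slowly with the system size.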

The cycle production rate characterizes the survival probability s_0 as follows:
\frac{ds_0}{dt} = -J s_0.   (53)
The initial condition is s_0(0) = 1. As a result, the survival probability is
s_0(t) = (1 - t)^{1/2}   (54)
for t ≤ 1. The survival probability vanishes beyond the gelation point, s_0(t) = 0 for t > 1. This reiterates that in the thermodynamic limit, a cycle is certain to form prior to the gelation transition [5].

Since the number of cycles produced is of the order of one in the pre-gel regime, one may expect that statistical properties of cycles strongly depend on their generation number or, alternatively, on their creation time. This is manifested by the first cycle. The quantity dt\, s_0\, \frac{dw_l}{dt} is the probability that (i) the system contains no cycles at time t, (ii) a cycle is produced during the time interval (t, t + dt), and (iii) its length is l. Summing these probabilities gives the probability that the first cycle, produced sometime during the pre-gel regime, has length l:
f_l = \int_0^1 dt\, s_0\, \frac{dw_l}{dt} = \frac{1}{2}\int_0^1 dt\, (1-t)^{1/2}\, t^{l-1}.   (55)
Summing these quantities, we verify the normalization \sum_{l≥1} f_l = \frac{1}{2}\int_0^1 dt\,(1-t)^{-1/2} = 1. The length distribution of the first cycle can be expressed in terms of the beta function, f_l = \frac{1}{2} B(3/2, l), or alternatively
f_l = \frac{\sqrt{π}}{4}\,\frac{Γ(l)}{Γ(l + 3/2)}.   (56)
The probability distribution f_l has an algebraic tail,
f_l ≃ C l^{-3/2},   (57)
with C = \frac{\sqrt{π}}{4} for l ≫ 1. The tail exponent characterizing the distribution of the first cycle is larger compared with the exponent characterizing all cycles, reflecting the fact that the first cycle is created earlier.

Similarly, one can obtain additional properties of the first cycle. We mention the probability F_k that the first unicyclic component has size k,
F_k = \int_0^1 dt\, s_0\, \frac{1}{2}\, k^2 c_k = \frac{1}{2}\,\frac{k^k}{k!}\, I_k   (58)
with the integral I_k = \int_0^1 dt\,(1-t)^{1/2}\, t^{k-1} e^{-kt}. This integral can be expressed in terms of the confluent hypergeometric function. Its asymptotic behavior can be readily found by noting that the integrand has a sharp maximum in the region 1 - t ~ k^{-1/2}, leading to I_k ≃ 2^{-1/4} Γ(3/4)\, k^{-3/4} e^{-k}. Using this in conjunction with the Stirling formula, the size distribution has the algebraic tail
F_k ≃ C k^{-5/4}   (59)
with C = 2^{-7/4} π^{-1/2} Γ(3/4) for k ≫ 1.

Under the assumption that cycle production is completely random, the number of cycles obeys Poisson statistics. The probability that there are n cycles, s_n, then satisfies the straightforward generalization of Eq. (53), viz. \frac{ds_n}{dt} = J[s_{n-1} - s_n], with the initial condition s_n(0) = δ_{n,0}. The solution is the Poisson distribution s_n = \frac{h^n}{n!} e^{-h}, see Fig. 8. Explicitly, the distribution reads
s_n = \frac{(1-t)^{1/2}}{n!}\left[\frac{1}{2}\ln\frac{1}{1-t}\right]^n.   (60)
The cumulative distribution S_n(t) = s_0(t) + ... + s_n(t) is plotted in Fig. 9.

FIG. 8: The distribution of the number of cycles. Shown is s_n versus n at the gelation point. The system size is N = 10^5 and an average over 10^5 realizations has been performed. A Poissonian distribution with an identical average is also shown for reference.

The Poisson distribution (60) can also be used to calculate f_{n,l}, the size distribution of the nth cycle. We merely quote the large-l tail behavior
f_{n,l} ~ l^{-3/2}\,\frac{1}{(n-1)!}\left[\frac{1}{2}\ln l\right]^{n-1}.   (61)
Indeed, summation over the cycle generation reproduces the overall cycle distribution (32).

In finite systems, it is possible that no cycle is created by the gelation time. This probability decreases algebraically with the system size, as seen by substituting (12) into (54),
s_0 ~ N^{-1/6}.   (62)
This prediction agrees with simulations, see Fig. 10. In practice, this slow decay indicates that a relatively large system may contain no cycles after N/2 links are placed.
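The first-cycle length distribution (56) and its tail (57) are simple enough to evaluate directly; the following sketch (ours) computes f_l from the gamma function, checks the slow convergence of the normalization, and compares the tail with (√π/4) l^{-3/2}.

import math

def f(l):
    """f_l = (1/2) B(3/2, l) = (sqrt(pi)/4) Gamma(l)/Gamma(l + 3/2), Eq. (56)."""
    return 0.5 * math.exp(math.lgamma(1.5) + math.lgamma(l) - math.lgamma(l + 1.5))

partial = sum(f(l) for l in range(1, 200001))
print(partial)                     # approaches 1 slowly because of the l^{-3/2} tail

for l in (10, 100, 1000):
    print(l, f(l), math.sqrt(math.pi) / 4 * l ** -1.5)   # tail, Eq. (57)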

FIG. 9: The cumulative distribution S_n(t) = \sum_{0≤j≤n} s_j(t) versus t for n = 0, 1, 2, 3.

FIG. 10: The survival probability versus the system size. Shown is s_0(N) versus N at the gelation point, i.e., when N/2 links are placed. Each data point represents an average over 10^6 realizations.

Generally, the probability that there is a finite number of cycles increases with the number of cycles,
s_n ~ \frac{1}{n!}\, N^{-1/6}\left[\frac{1}{6}\ln N\right]^n.   (63)
The length distribution of the first cycle is characterized by the same l ~ N^{1/3} size scale as is the overall cycle distribution. We focus on the behavior of the moments
⟨l^n⟩ ~ N^{n/3 - 1/6}.   (64)
This behavior is obtained from the distribution (57), which should be integrated up to the appropriate cutoff, i.e., ⟨l^n⟩ ~ \int_1^{N^{1/3}} dl\, l^n\, l^{-3/2}. As a result, the average size of the first cycle is much smaller than the characteristic cycle size, ⟨l⟩ ~ N^{1/6}. Moments corresponding to the size of the first unicyclic component grow as follows,
⟨k^n⟩ ~ N^{2n/3 - 1/6},   (65)
as obtained from (59). Consequently, the average size of the first unicyclic component is smaller than the characteristic component size, ⟨k⟩ ~ N^{1/2}.

VII. CONCLUSIONS

In summary, we have extended the kinetic theory description of random graphs to structures such as paths and cycles. Modeling the linking process dynamically leads to an aggregation process for both components and paths. The density of paths in finite components is coupled to the component size distribution via nonlinear rate equations, while the average number of cycles is coupled to the path density via linear rate equations. Both path and cycle length distributions are coupled to the component size distribution.

Generally, size distributions decay exponentially away from the gelation point, but at the gelation time, algebraic tails emerge. As the system approaches this critical point, the size distributions follow a self-similar behavior characterized by diverging size scales.

The kinetic theory approach is well suited for treating infinite systems. The complementary behavior for finite systems can be obtained from heuristic scaling arguments. This approach yields scaling laws for the typical component size, path length, and cycle length at the gelation point. These scaling laws can be formalized using finite-size scaling forms, i.e., self-similarity as a function of the system size rather than time. Obtaining the exact form of these scaling functions is a nice challenge, in particular for the most fundamental quantity, the component size distribution, which is characterized by a non-monotonic scaling function.

The kinetic theory approach seems artificial at first sight. Indeed, graphs are discrete in nature and therefore combinatorial approaches appear more natural. Yet, once the rate equations are formulated, the analysis is straightforward. Utilizing the continuous time variable allows us to employ powerful analysis tools. Moreover, some of the kinetic theory results are less cumbersome compared with the combinatorial results.

The same methodology can be expanded to analyze other features of random graphs. For example, correlations between the node degree and the cluster size can be analyzed using bi-aggregation rate equations [36]. It is quite possible that structural properties in other aggregation processes, for example polymerization with a sum kernel [17], and in other variants of random graphs, such as small-world networks [37], can be analyzed using the kinetic theory.

One could try to utilize kinetic theory to probe the distribution of various families of subgraphs. We have limited ourselves to cycles since they, alongside trees, do appear in random graphs, while more interconnected families of subgraphs are very rare [29]. Yet, in biological and technological networks, certain interconnected families of subgraphs do appear. Such populated families of subgraphs, motifs, are believed to carry information processing functions [38, 39]. It will be interesting to use kinetic theory to analyze motifs in special random graphs.

This research was supported by the DOE (W-7405-ENG-36).

APPENDIX A: CONTOUR INTEGRATION

Let A(z) = \sum_k A_k e^{kz} be the generating function of the coefficients A_k. For the family of generating functions A(z) = G^r(z), with G(z) satisfying G e^{-G} = e^z, the coefficients A_k can be obtained via contour integration in the complex y plane, where y = e^z, as follows:
A_k = \frac{1}{2πi}\oint dy\,\frac{G^r}{y^{k+1}}
    = \frac{1}{2πi}\oint dG\, G^r\,\frac{e^{(k+1)G}}{G^{k+1}}\,\frac{dy}{dG}
    = \frac{1}{2πi}\oint dG\, G^{r-k-1}(1 - G)\, e^{kG}
    = \frac{1}{2πi}\oint dG\,\sum_n \frac{k^n}{n!}\left(G^{n+r-k-1} - G^{n+r-k}\right)
    = r\,\frac{k^{k-r-1}}{(k-r)!}.   (A1)
Since G e^{-G} = e^z, it is convenient to perform the integration in the complex G plane. In writing the third line, we used dy/dG = (1 - G) e^{-G}.
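The result (A1) can be verified numerically without performing the contour integral: for any 0 < G_0 < 1, setting y_0 = G_0 e^{-G_0} and summing A_k y_0^k with A_k = r k^{k-r-1}/(k-r)! should reproduce G_0^r. A short sketch (ours; the sample point G_0 and the cutoff are arbitrary):

import math

def series_sum(r, G0, kmax=400):
    """Sum_{k >= r} [r k^{k-r-1}/(k-r)!] y0^k with y0 = G0 exp(-G0)."""
    y0 = G0 * math.exp(-G0)
    total = 0.0
    for k in range(r, kmax + 1):
        log_term = (math.log(r) + (k - r - 1) * math.log(k)
                    - math.lgamma(k - r + 1) + k * math.log(y0))
        total += math.exp(log_term)
    return total

G0 = 0.3
for r in (1, 2, 3):
    print(r, series_sum(r, G0), G0 ** r)   # the two columns should agree

For r = 1 this is the familiar tree-function series G(y) = \sum_k (k^{k-1}/k!) y^k, consistent with C_k = k^{k-2}/k! of Eq. (6).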

[1] R. Solomonoff and A. Rapaport, Bull. Math. Biophys. 13, 107 (1959).
[2] P. Erdős and A. Rényi, Publ. Math. Inst. Hungar. Acad. Sci. 5, 17 (1960).
[3] B. Bollobás, Random Graphs (Academic Press, London, 1985).
[4] S. Janson, T. Łuczak, and A. Rucinski, Random Graphs (John Wiley & Sons, New York, 2000).
[5] S. Janson, D. E. Knuth, T. Łuczak, and B. Pittel, Rand. Struct. Alg. 3, 233 (1993).
[6] P. J. Flory, J. Amer. Chem. Soc. 63, 3083 (1941).
[7] W. H. Stockmayer, J. Chem. Phys. 11, 45 (1943).
[8] P. J. Flory, Principles of Polymer Chemistry (Cornell University Press, Ithaca, 1953).
[9] D. Stauffer, Introduction to Percolation Theory (Taylor & Francis, London, 1985).
[10] T. Kalisky, R. Cohen, D. ben-Avraham, and S. Havlin, Lect. Notes Phys. 650, 3 (2004).
[11] B. Bollobás, C. Borgs, J. T. Chayes, J. H. Kim, and D. B. Wilson, Rand. Struct. Alg. 18, 201 (2001).
[12] M. E. J. Newman, S. Strogatz, and D. J. Watts, Phys. Rev. E 64, 026118 (2001).
[13] M. Girvan and M. E. J. Newman, Proc. Natl. Acad. Sci. 99, 7821 (2002).
[14] M. V. Smoluchowski, Physik. Zeits. 17, 557 (1916); Zeits. Phys. Chem. 92, 129 (1917).
[15] S. Chandrasekhar, Rev. Mod. Phys. 15, 1–89 (1943).
[16] D. J. Aldous, Bernoulli 5, 3 (1999).
[17] F. Leyvraz, Phys. Rep. 383, 95 (2003).
[18] J. B. McLeod, Quart. J. Math. Oxford 13, 119 (1962); ibid. 13, 193 (1962); ibid. 13, 283 (1962).
[19] E. M. Hendriks, M. H. Ernst, and R. M. Ziff, J. Stat. Phys. 31, 519 (1983).
[20] E. Ben-Naim and P. L. Krapivsky, Europhys. Lett. 65, 151 (2004).
[21] E. Ben-Naim and P. L. Krapivsky, J. Phys. A 37, L189 (2004).
[22] A. A. Lushnikov, J. Colloid Inter. Sci. 65, 276 (1977).
[23] P. G. J. van Dongen and M. H. Ernst, J. Stat. Phys. 49, 879 (1987).
[24] D. H. Zanette and S. C. Manrubia, Physica A 295, 1 (2001).
[25] S. N. Dorogovtsev, J. F. F. Mendes, and A. N. Samukhin, Phys. Rev. E 63, 062101 (2001).
[26] Z. Burda, J. D. Correia, and A. Krzywicki, Phys. Rev. E 64, 046118 (2001).
[27] P. L. Krapivsky and S. Redner, J. Phys. A 35, 9517 (2003).
[28] P. L. Krapivsky and E. Ben-Naim, Phys. Rev. E 53, 291 (1996).
[29] S. Itzkovitz, R. Milo, N. Kashtan, G. Ziv, and U. Alon, Phys. Rev. E 68, 026127 (2003).
[30] H. D. Rozenfeld, J. E. Kirk, E. M. Bollt, and D. ben-Avraham, cond-mat/0403536.
[31] E. Marinari and R. Monasson, cond-mat/0407253.
[32] S. Janson, Rand. Struct. Alg. 17, 343 (2000).
[33] S. Janson, Combin. Probab. Comput. 12, 27 (2003).
[34] Z. Burda, J. Jurkiewicz, and A. Krzywicki, Phys. Rev. E 69, 026106 (2004); ibid. 70, 026106 (2004).
[35] G. Bianconi and A. Capocci, Phys. Rev. Lett. 90, 078701 (2003); G. Bianconi, G. Caldarelli, and A. Capocci, cond-mat/0408349.
[36] E. Ben-Naim and P. L. Krapivsky, unpublished.
[37] D. J. Watts and S. H. Strogatz, Nature 393, 440 (1998).
[38] R. Milo et al., Science 298, 824–827 (2002); ibid. 303, 1538–1542 (2004).
[39] V. Spirin and L. A. Mirny, Proc. Natl. Acad. Sci. 100, 12123–12128 (2003).