The Giant Component: the Golden Anniversary Joel Spencer

The Giant Component: The Golden Anniversary Joel Spencer

Paul Erd˝os and Alfréd Rényi probability distribution over all such graphs. Erd˝os Oftentimes the beginnings of a mathematical area and Rényi studied the typical behavior of G[n, e] are obscure or disputed. The subject of random n as e “evolved” from 0 to 2. When e approaches graphs had, however, a clear beginning, and it n and passes 2 the random graph undergoes a phase occurred fifty years ago. Alfréd Rényi (1921–1970) transition. In a typical computer run on a million was head of the Hungarian Mathematical Institute vertices, the size of the largest component is only which today bears his name. Rényi had a great love 168 at e = 400000 edges. By e = 600000 it has of literature and philosophy, a true Renaissance exploded to size 309433. man. His life, in the words of Paul Turán, was one of intense and creative involvement in the In modern language we let G(n, p) be the exchange of ideas and in public affairs. His mathe- probability space over graphs on n vertices where matics centered on probability theory. Paul Erd˝os each pair is adjacent with independent probability (1913–1996) was a central figure in twentieth- p. Such graphs have very close to a proportion p century mathematics. This author recalls well the of the edges. The behaviors of G[n, e] and G(n, p) = n memorial conference organized by his long-time where e p2 are asymptotically the same for all friend and collaborator Vera Sós in 1999, at which the topics we discuss here, and we shall use the fifteen plenary lecturers discussed his contribu- modern language. tions. What was so surprising to us all was the = c We shall parameterize p n . The graph with sheer breadth of his work, which spanned so many n edges then corresponds to c = 1. In this range vital areas of mathematics. To this very prejudiced 2 the graph will split into components. We let C1, C2 author, it is his work in discrete mathematics that denote the largest and second largest components is having the greatest lasting impact. In this area, in the graph, with |C | denoting their number of Erd˝os tended toward asymptotic questions, a style i vertices. We define the complexity of a component very much relevant to today’s world of o, O, , and with V vertices and E edges as E − V + 1. Trees . For both Erd˝os and Rényi, mathematicsΘ was a and unicyclic graphs have complexity 0 and 1, Ωcollaborative enterprise. They both had numerous coauthors, and they wrote thirty-two joint papers. respectively, and are called simple. In 1960 they produced their masterwork, On Theorem 1 (Erd˝os-Rényi). The behavior of G(n, p) the Evolution of Random Graphs [7]. Begin with n with p = c can be broken into three parts. vertices and no edges and add edges randomly n Subcritical c < 1: All components are simple (that is, uniformly from among the potential edges) | | = oneby one.Let G[n, e] be the state when there are e and very small, C1 O(ln n). 2/3 edges. Of course, G[n, e] could be any graph with n Critical c = 1: |C1| = (n ) A delicate situa- vertices and e edges and technically is the uniform tion! Θ Supercritical c > 1: |C1| ∼ yn where y = y(c) Joel Spencer is professor of computer science and mathe- is the positive solution to the equation matics at the Courant Institute, New York University. His email address is [email protected]. (1) e−cy = 1 − y

720 Notices of the AMS Volume 57, Number 6 C1 has high complexity. All other components are Erd˝os Meets Galton simple and very small, |C2| = O(ln n). = c Fix a vertex v in G(n, p) with p n and perform Remark. The full statements here and below are a breadth first search (BFS) to find its compo- rather unwieldy. Thus |C1| ∼ yn really means that nents. It has binomial distribution B[n − 1, p] for all c > 1 and all ǫ > 0 the limit as n → ∞ of the neighbors, asymptotically Poisson with mean c. probability that (y − ǫ)n < |C1| < (y + ǫ)n is 1. We Its neighbors then have Poisson c new neighbors, allow ourselves the more informal description. and so on. The component of v is approximated by the Galton-Watson process. For c < 1 this Remark. The average degree of a vertex is p(n − 1) ∼ c. The critical behavior takes place approximation works well. But for c > 1 the when the average degree reaches 1. Galton-Watson process may go on forever while the component of v can have at most n vertices. An When c > 1, Erd˝os and Rényi called C1 the ecological limitation causes the processes to con- giant component. There are two salient features verge. BFS requires new vertices. After δn vertices of the giant component: its existence and its have been found, the new distribution is binomial uniqueness. Their system does not have a Jupiter B[n(1−δ), p], which is Poissonwith mean c(1−δ). and an also huge Saturn; it is more like Jupiter The success of BFS causes δ to rise, which makes and Ceres. Several books ([3], [4], [13]) give very it harder to find new vertices, leading the process full discussions. to eventually die. The effect of the ecological limitation is only felt after a positive proportion δn Francis Galton and Henry Watson of vertices have been found. Consider BFS from Let’s switch gears. A Galton-Watson process begins each vertex v. With probability 1 − y the process with a single node “Eve”.1 Eve has Z children, where will die early, giving a small component. But for Z has a Poisson distribution with mean c. (That is, ∼ yn the process will not die early. All of these Eve has k children with probability e−c ck/k!. The vertices have their components merge into the full study of Galton-Watson processes considers giant component. other distributions as well.) These children then have children independently with the same dis- Jupiter Without Saturn tribution, and the process continues through the Why can we not have Jupiter and Saturn, two generations. Let T = Tc be the total population components both of size bigger than δn? This generated. There is a precise formula would be highly unstable. Each additional edge −ck k−1 e (ck) would have probability at least (δn)2/ n ∼ 2δ2 of (2) Pr[Tc = k] = . 2 k! merging them. High instability and nonexistence However, it is also possible the T is infinite. are not the same. Indeed, while there are many Theorem 2. The Galton-Watson process has three proofs of the uniqueness of the giant component, regions. we do not know one that is both simple and Subcritical c < 1: T is finite with probability 1, rigorous. and E[T ] = ∞ ck = 1 . Pk=0 1−c Critical c = 1: T is finite with probability 1 but The Critical Window has infinite expectation.2 Erd˝os and Rényi normally repressed their enthu- Supercritical c > 1. T is infinite with probability siasm in their formal writings. But not now! y = y(c) given by (1). This double “jump” in the size = − of the largest component when With c > 1 let z 1 y be the probability Eve e 1 generates a finite tree. If Eve has k children the full n passes the value 2 is one of tree will be finite if and only if all of the children the most striking facts concerning generate finite trees, which has probability zk. random graphs. [7] Thus In the 1980s, spearheaded by the work of Béla ∞ = ck Bollobás and Tomasz Łuczak, the value c 1 was (3) z = e−c zk, stretched out, and a critical window was found. X k! k=0 The stretching was done by adding a second-order and some manipulation gives that y = 1 − z term. The correct parameterization is satisfies (1). One needs further argument to show 1 λ = = (4) p = + . that z 1, y 0 is a spurious solution. n n4/3 1In the original work the nodes were all male! Now there are three regions: ∼ → −∞ 2This author recalls in undergraduate days first seeing Barely Subcritical pn 1 and λ : All a finite random variable with infinite expectation and components are simple. |C1|∼|C2|, their sizes thinking it was a very funny and totally anomalous cre- increasing with λ. ation. Wrong! Such variables occur frequently at critical The Critical Window pn ∼ 1 and λ a constant: points in percolation processes. Here (and only here!) we have chaotic behavior,

June/July 2010 Notices of the AMS 721 distributions instead of almost sure behavior. Pa- track of component sizes. We parameterize the 2/3 n − − rameterizing |Ci | = Xi n , the Xi (i = 1, 2 and = 1 + 4/3 number of edges as e 2(n λn ) and take beyond) have a nontrivial distribution and a non- “snapshots” at λ = −4, −3,..., +4. The ten largest trivial joint distribution. The complexity Yi of Ci component sizes (listed 0,..., 9 here, and divided also has a nontrivial distribution. by n2/3) are given for each λ. At λ = +2 there is a Barely Supercritical pn ∼ 1 and λ → +∞: 1.16 Jupiter and 0.86 Saturn. The next digit, under 2/3 |C1| ≫ n ≫ |C2|. C1 is the dominant compo- N, gives the new ranking (− if not in the top ten) nent, much bigger than C2 but still small. C1 has for that component for the next λ. Components high complexity, but all other components are 0, 1, 2, 3, 4 have N = 0, meaning they have all simple. merged by λ = +3. At λ = 3 Jupiter has blown Stirling’s formula applied to (2) with c = 1 up to 4.21. (Smaller components have also joined −1/2 −3/2 gives Pr[T1 = k] ∼ (2π) k and Pr[T1 ≥ k] ∼ Jupiter, explaining the discrepancy in the sum.) −1/2 −1/2 2(2π) k . Now consider G(n, p) with pn = 1 The size of the second largest component has and let C(v) be the component containing v. Call decreased (it is the component formerly ranked 5) 2/3 C(v) large if its size is at least Kn and let Z to a 0.22 Ceres. be the number of v with C(v) large. Estimating |C(v)| by T1 would give that C(v) is large with Inside the Critical Window probability 2(2π)−1/2K−1/2n−1/3, and so Z would At λ = −4 there is a “jostling for position” among have expectation 2(2π)−1/2K−1/2n2/3. The actual the top components, while by λ = +4 a dominant value of E[Z] is somewhat smaller due to the component has emerged. The last time the largest ecological limitation, but let us assume it as a component loses that distinction occurs during the heuristic. If any large component exists, every critical window [8]. At λ = −4 all components are vertex of it would be in a large component, so simple, while by λ = +4 the dominant component that Z would be at least Kn2/3. When K is large, has high complexity. Complexity at least 4 is E[Z] is much lower than Kn2/3, so that with high necessary for nonplanarity. Planarity is lost in the probability there would be no large component. critical window [16]. In a masterful work [12], the Conversely, when ǫ is a small positive constant, development of complex components is studied. we expect many components of size bigger than One exceptionally striking result: the probability ǫn2/3. that the evolution ever simultaneously has two complex components is, asymptotically, 5 π. A Strange Physics 18 2/3 Let ci n be the size of the ith largest component Classical Bond Percolation = −1 + −4/3 at p n λn . Let λ be an “infinitesimal” d 2/3 2/3 Mathematical physicists examine Z as a lattice; and increase λ by λ. There∆ are (ci n )(cj n ) the pairs v,~ w~ that are one apart are called bonds. potential edges that∆ would merge Ci , Cj , and each is added with probability ( λ)n−4/3. The n factors They imagine that each bond is occupied with independent probability p. (For graph theorists, cancel: Ci , Cj merge to form∆ a component of size the occupied bonds form a random subgraph of (c + c )n2/3 with probability c c ( λ). This grav- i j i j Zd.) The occupied components then form clusters, itational attraction merges the large∆ components or components. There is a critical probability, and forms the dominant component. We can in- denoted p (dependent on d), so that: clude the complexity in this model. When C , C c i j Subcritical p < p : All components are finite. with complexities r ,r merge, the new compo- c i j Critical p = p : A delicate situation! nent has complexity r + r . Further, each C has c i j i Supercritical p > p : There is precisely one ∼ 1 2 4/3 c 2 ci n potential internal edges. In the infini- 2 infinite component. tesimal time λ with probability ci ( λ)/2, such There are natural analogies between this infinite an edge is added,∆ and the complexity∆ of Ci is model and the asymptotic Erd˝os-Rényi model. incremented by 1. Over time, the complexities get Infinite size corresponds to (n), while finite larger and larger. The limiting process, called the size corresponds to O(ln n). ThereΩ is particular multiplicative coalescent process, has interesting interest in p being very close to pc . Let f (p) connections to Brownian motion [2]. be the probability that 0~ (or, by symmetry, any particular v~) lies in an infinite component. The A Computer Exercise 4 critical exponent β is that real number such that β+o(1) + Computer experimentationvividly shows the rapid f (pc +x) ∼ x as x → 0 . The probability pc +x 3 development in the critical window. In the run on corresponds to pn = 1 + x and f to the probability the following page, we begin with n = 105 vertices that a given vertex v lies in the giant component or, and no edges. At each step a random edge is equivalently, the proportion y(1 + x) of vertices in added, and a Union-Find algorithm is used to keep 4β might not exist, but all mathematical physicists as- 3Thanks to Juliana Freire. sume it does.

722 Notices of the AMS Volume 57, Number 6 · -4 N -3 N -2 N -1 N 0 N +1 N +2 N +3 N +4 0 0.14 1 0.18 0 0.24 1 0.28 0 0.37 0 0.82 0 1.16 0 4.21 0 5.88 1 0.10 2 0.16 1 0.19 2 0.26 1 0.36 1 0.39 1 0.86 0 0.22 0 0.24 2 0.10 3 0.13 3 0.13 4 0.19 3 0.26 0 0.28 3 0.49 0 0.13 0 0.12 3 0.09 4 0.12 4 0.13 0 0.16 2 0.21 2 0.23 4 0.46 0 0.12 0 0.10 4 0.07 0 0.09 5 0.13 3 0.14 5 0.16 0 0.20 2 0.32 0 0.11 2 0.10 5 0.07 0 0.08 7 0.09 5 0.12 4 0.15 4 0.14 5 0.16 1 0.10 1 0.10 6 0.07 5 0.06 6 0.09 7 0.10 8 0.12 5 0.12 2 0.12 3 0.09 1 0.10 7 0.06 8 0.06 2 0.08 6 0.09 - 0.12 3 0.10 1 0.11 4 0.09 4 0.10 8 0.05 - 0.06 8 0.06 9 0.09 - 0.10 3 0.10 7 0.09 2 0.09 0 0.08 9 0.05 9 0.05 - 0.06 0 0.07 - 0.10 9 0.10 0 0.08 5 0.08 6 0.07 the giant component. As y(1+x) ∼ 2x = x1+o(1) the A First-Order Phase Transition β value for the Erd˝os-Rényi model is considered 1. Modify the Erd˝os-Rényi evolution as follows. In d This is known to be the β value in Z for all each round an edge is added to G, initially empty. sufficiently high dimensions d. Grimmett[10] gives Two random pairs {u,v}, {w, x} are given. Add many other critical exponents, and in all cases that pair for which the product of the component the analogous value for the Erd˝os-Rényi model sizes of the two vertices is smaller. This provides a matches the known value in high-dimensional powerful antigravity that deters large components space. Mathematical physicists loosely use the = n from joining. Parameterizing e t 2 edges chosen term mean field behavior to describe percolation (so that t = 1 is the critical value in the Erd˝os- phenomena in high dimensions, and the Erd˝os- Rényi evolution) the giant component occurs at Rényi model has this mean field behavior. t ∼ 1.77. More interesting, extensive computer simulation (but no mathematical proof!) indicates Recent Results strongly that when the critical value is reached, Today it is recognized that percolation and the there is a first-order phase transition. That is, let critical window appear in many guises. Here is a t = tc be the critical probability and let f (t) be the highly subjective description of recent work. We proportion of vertices in the largest component generally give simplified versions. at “time” t. Then the limit of f (t) as t approaches tc from above appears not to be zero but rather Random 2-SAT something like 0.6 [1]. We generate m random clauses C ,...,C on 1 m General Critical Points Boolean variables x1,...,xn. That is, each clause ∗ = C = y ∨ z with y,z drawn randomly from Let G be a graph on n vertices. Set d 2 ∗ ( v dv )( v dv ), noting dv denotes the average {x1, x1,...,xn, xn}. We ask if all Ci can be simulta- P P neously satisfied. The answer changes from yes degree of a vertex if you first select an edge uniformly and then one of its vertices uniformly. Let to no in the critical window m = n + λn2/3 [5]. Gp denote the random subgraph of G, accepting each edge with independent probability p. Then, d-regular Graph = 1 under certain mild conditions on G, p d∗ is Let Gn be a sequence of transitive d-regular graphs. 1 the critical point in the evolution of Gp. When 1−ǫ Under reasonable conditions pc = − acts as the d 1 p = ∗ , Gp contains no giant component, while critical probability for a random subgraph of G . d n when p = 1+ǫ , G does contain a unique giant For p < p the components are small, while for d∗ p c component [6]. p > pc there is a giant component. More delicately, at p = p the largest component has size (n2/3). c Degree Sequence The scaling pd = 1 + λn−1/3 acts as theΘ critical window [18]. For given d1,...,dn we consider the graph on n vertices chosen uniformly among all with that An Improving Walk degree sequence, that is, v having precisely dv neighbors. Suppose for each n we have a degree Consider an infinite walk starting at W0 = 1 with sequence, λi(n) ∼ λi n vertices having degree i. Wt = Wt−1 + Xt − 1 where Xt is Poisson with mean ∗ ∗ 2 t Then (with d from above), d ∼ [ i λi]/[ iλi]. . When Wt = 0 (crashes)it is resetto Wt = 1. When P P n Set Q := i(i − 2)λi so that Q > 0 if and only − i = 1 ǫ ∗ P t n the walk has negative drift and crashes if d > 2. In analyzing BFS one thinks of an edge = 1+ǫ repeatedly. When t n the walk has positive going to a vertex with the above distribution, drift and goes to infinity. The walk will crash for which then is on an expected number d∗ − 1 of the last time in the critical window t = n+λn2/3 [9]. new edges. With d∗ < 2 the process will die, while

June/July 2010 Notices of the AMS 723 for d∗ > 2 it might continue. When Q < 0 the these critical moments. The Erd˝os-Rényi process random graph with this degree sequence has no provides a bedrock to which all other processes giant component, while when Q > 0 it does [17]. may be compared.

Set Qn := i i(i − 2)λi (n) and assume Qn → P −1/3 0. Under moderate assumptions, Qn = λn References provides a criticalwindow. For λ → −∞ the random [1] D. Achlioptas, R. M. D’Souza and J. Spencer, Ex- graph is subcritical, and all components have plosive percolation in random networks, Science size o(n2/3). When λ → +∞ the random graph 323 (5920), 1453–1455, 2009. is supercritical, there is a dominant component [2] David Aldous, Brownian excursions, critical ran- of size ≫ n2/3 and all other components have dom graphs, and the multiplicative coalescent, size o(n2/3). In the power law random graphs, Annals of Probability 25 (1997), 812–854. [3] N. Alon and J. Spencer, The Probabilistic Method, thought by many to model the Web graph and 3rd ed., John Wiley, 2009. ∼ −γ other phenomena, it is assumed that λi i for a [4] Béla Bollobás, Random Graphs, 2nd ed., Cam- constant γ. For certain γ the above critical window bridge University Press. 2001. does not work, and work in progress indicates that [5] Béla Bollobás, Christian Borgs, Jennifer there is a critical window whose exponent depends Chayes, Jeong Han Kim, and David B. Wil- on γ [14] [11]. son, The scaling window of the 2-SAT transition, Random Structures and Algorithms 21 (2002), A Potts Model 182–195. [6] Fan Chung, Paul Horn, and Lincoln Lu, The gi- In the Potts model, the distribution of graphs is ant component in a random subgraph of a given biased toward having more components. There graph, Proceedings of the 6th Workshop on Algo- are three parameters, p ∈ [0, 1], q ≥ 1, and the rithms and Models for the Web-Graph (WAW2009), number of vertices n. A graph G with e edges, Lecture Notes in Computer Science 5427, 38–49. = n − [7] Paul Erd˝os and Alfréd Rényi, On the evolution of s : 2 e nonedges, and c components has probability pe(1 − p)s qc /Z, where Z is a normal- random graphs, Magyar Tud. Akad. Mat. Kutató Int. Közl. 5 (1960), 17–61. izing constant chosen so that the sum of the [8] Paul Erd˝os and Tomasz Łuczak, Changes of probabilities is 1. For q = 2 this is called the leadership in a random graph process, Random Ising model, for q ≥ 3 and integral, this is the Structures and Algorithms 5 (1994), 243–252. Potts model. For 2 < q < 3 the critical value is [9] Juliana Freire, Analysis techniques for nudged = = q−1 − = + random graphs, Ph.D. Thesis, Courant Institute, pn cq : 2 q−2 ln(q 1). At pn cq ǫ there is a 2009. giant component, while at pn = cq − ǫ the largest component has logarithmic size. The critical win- [10] G. Grimmett, Percolation, 2nd ed., Springer-Verlag, 1999. dow has parameterization pn = c + λ . There the q n [11] S. Janson and M. Luczak, A new approach to the graph has two different personalities. Either it has giant component problem, Random Structures and a giant component or the largest component has Algorithms 34 (2008), 197–216. logarithmic size. Both occur with positive limiting [12] Svante Janson, Tomasz Łuczak, Donald Knuth, probability, and these limiting probabilities sum and Boris Pittel, The birth of the giant component, to 1. At no p is there a middle ground with the Random Structures and Algorithms 3 (1993), 233– size of the largest component being bigger than 358. logarithmic but sublinear [15]. [13] Svante Janson, Tomasz Łuczak, and Andrzej Rucinski, Random Graphs, John Wiley, 2000. [14] Milyun Kang and T. Seierstad, The critical phase In Conclusion for random graphs with a given degree sequence, The mathematical landscape at the time of Paul Combin. Probab. Comput. 17 (2008), 67–86. Erd˝os’s birth, nearly one hundred years ago, was [15] Malwina Luczak and Tomasz Łuczak, The phase very different from what it is today. Discrete transition in the cluster-scaled model of a ran- mathematics was disparaged as “the slums of dom graph, Random Structures and Algorithms 28 topology”. Probability was useful for gambling, (2006), 215–246. but not proper work for a serious mathematician. [16] Tomasz Łuczak and John Wierman, The chro- Today both areas are thriving. It is the fecund matic number of random graphs at the double jump threshold, Combinatorica 9 (1989), 39–49. intersection of discrete mathematics and proba- [17] M. Molloy and B. Reed, The size of the largest bility that has seen the most spectacular growth. component of a random graph on a fixed de- A wide variety of random processes on large dis- gree sequence, Combin. Probab. Comput. 7 (1998), crete structures are studied. These processes, to 295–306. use Erd˝os and Rényi’s well-chosen word, undergo [18] Asaf Nachmias, Mean-field conditions for percola- an evolution. At a critical moment they undergo a tion on finite graphs, GAFA (to appear). phase transition, from water to ice, from satisfiable to not satisfiable, from freeflow to gridlock, from small components to a giant component. To understand a process we need to understand

724 Notices of the AMS Volume 57, Number 6