D.P. van Velzen

Erdős-Rényi Random Graphs

Bachelor thesis

Thesis advisors: prof. dr. W.Th.F. den Hollander, dr. O. Jovanovski

July 4, 2017

Mathematisch Instituut, Universiteit Leiden

Contents

1 Introduction

2 Phase change behaviour
2.1 Comparison to branching processes
2.2 Poisson branching processes
2.3 The subcritical regime
2.4 The supercritical regime

3 Marginal laws for event counts
3.1 Relative entropy
3.2 Conditioning on edge count
3.3 Conditioning on degree sequence

4 Conclusion

References

1 Introduction

Networks such as social media, the world wide web and transportation systems are ubiquitous in everyday life. The mathematical properties of such real-world networks are studied extensively, and progress is rapid because network data are easy to analyse. These networks often share interesting properties, such as power-law degree distributions and small-world connectedness. From this perspective it is natural to model real-world networks as graphs.

Definition 1 (Graph). A simple undirected graph is an ordered pair G = (V, E) consisting of a set V of vertices and a set E of edges, where an element e ∈ E denotes a connection between two vertices. The elements of E are unordered pairs of distinct vertices. «

Vertices may denote profiles in a social-media context or servers in a web context. Graphs that model such networks have properties comparable to those of theoretical graphs. These networks were not engineered to be similar, but have properties that emerge naturally. In many applications it is useful to look at the vertices and edges of a graph as if their presence were ruled by chance. One specific model of a random graph is the Erdős-Rényi model.

Definition 2 (Erdős-Rényi). Let Kn = (V, E′) be the complete graph on n vertices. We construct a random set E ⊂ E′ in the following manner. Let p ∈ [0, 1], and for each element e ∈ E′, take e ∈ E with probability p and e ∉ E with probability 1 − p, independently of all other elements. The graph ERn(p) = (V, E) obtained in this way is called an Erdős-Rényi random graph with vertices V = [n] = {1, 2, . . . , n} and edges E. For most of Section 2 we will look at graphs where p = λ/n. The law for these graphs will be denoted by Pλ, with λ > 0 [6]. «
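As an illustration of Definition 2, the following minimal Python sketch (my own, not part of the thesis; the function name sample_er is a hypothetical choice) samples ERn(p) by flipping an independent coin for each of the C(n, 2) possible edges.

```python
import random

def sample_er(n, p, seed=None):
    """Sample an Erdos-Renyi graph ER_n(p): each of the C(n, 2)
    possible edges is kept independently with probability p."""
    rng = random.Random(seed)
    vertices = list(range(1, n + 1))              # V = [n] = {1, ..., n}
    edges = {frozenset((u, v))
             for i, u in enumerate(vertices)
             for v in vertices[i + 1:]
             if rng.random() < p}                 # e in E with probability p
    return vertices, edges

# The regime studied in Section 2: p = lambda/n.
n, lam = 7, 3.5
V, E = sample_er(n, lam / n, seed=1)
print(len(E), "edges out of", n * (n - 1) // 2, "possible")
```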

Figure 1: Erdős-Rényi random graph for p = 1/2, n = 7. Left: Kn. Right: ERn(p).

This specific model arises as a very simple example of a random graph. It does not have all properties that real-world networks usually have, but it is interesting for other reasons. The model has phase transitions in the limit n → ∞ depending on the value of λ, which will be discussed in the next section.

2 Phase change behaviour

In this section we will give an outline of two proofs concerning connectedness for graphs that are subcritical (λ < 1), given in Section 2.3, and supercritical (λ > 1), given in Section 2.4. We will first give definitions that are used in Section 2.1 to compare the Erdős-Rényi random graph to branching processes, which are defined there as well. In Section 2.2 we cite results needed for the proofs in Sections 2.3 and 2.4. This section is based on Chapter 4 of van der Hofstad's monograph [6], with verbatim definitions.

Notation: call a, b ∈ V connected when ab ∈ E. Write a ↔ b when there is a path of connected vertices from a to b. This means that there are vertices m1, . . . , mk such that am1, m1m2, . . . , mkb ∈ E. By convention a ↔ a.

Definition 3 (Cluster). Let v be in the vertex set V of ERn(p). Define the cluster C(v) of v by C(v) = {x ∈ V : v ↔ x}. «

Let |C(v)| denote the number of vertices in C(v). The largest cluster is any cluster for which |C(v)| is maximal, so that

|Cmax| = max_{v∈V} |C(v)|.    (1)

We will now define a procedure called an exploration process that is used to find the cluster of a given vertex v in any given graph G = (V, E). During the course of the exploration, each vertex in the graph has one of three statuses: active, neutral or inactive. At t = 0 only vertex v is active, all others are neutral. At time t ≥ 1 we take the active vertex w with the smallest assigned number and run over the edges ww′, where w′ runs over all neutral vertices. If ww′ is in the edge set E, then we let w′ become active. After all neutral vertices have been searched, we set w to be inactive. When the process terminates at time t, the number of inactive vertices is the cluster size |C(v)| = t.

Exploration of Fig. 1:

t = 0:  active: 1           inactive: none                 neutral: 2, 3, 4, 5, 6, 7
t = 1:  active: 2, 5, 7     inactive: 1                    neutral: 3, 4, 6
t = 2:  active: 3, 5, 6, 7  inactive: 1, 2                 neutral: 4
t = 3:  active: 4, 5, 6, 7  inactive: 1, 2, 3              neutral: none
t = 4:  active: 5, 6, 7     inactive: 1, 2, 3, 4           neutral: none
t = 5:  active: 6, 7        inactive: 1, 2, 3, 4, 5        neutral: none
t = 6:  active: 7           inactive: 1, 2, 3, 4, 5, 6     neutral: none
t = 7:  active: none        inactive: 1, 2, 3, 4, 5, 6, 7  neutral: none
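The exploration process is straightforward to implement. Below is a minimal Python sketch (my own illustration, not code from the thesis; explore_cluster is a hypothetical name) that runs the process on an arbitrary edge set and returns the cluster size |C(v)|. The example edge set is the one that can be read off from the exploration table above.

```python
def explore_cluster(n, edges, v):
    """Run the exploration process starting from v.
    `edges` is a set of frozensets {u, w}; returns |C(v)|."""
    active, neutral, inactive = {v}, set(range(1, n + 1)) - {v}, set()
    t = 0
    while active:
        t += 1
        w = min(active)                          # active vertex with smallest number
        found = {u for u in neutral if frozenset((w, u)) in edges}
        neutral -= found                         # found vertices become active
        active |= found
        active.remove(w)                         # w is fully searched: inactive
        inactive.add(w)
    return t                                     # |C(v)| = number of inactive vertices

# Edges consistent with the exploration of Fig. 1: 12, 15, 17, 23, 26, 34.
edges = {frozenset(e) for e in [(1, 2), (1, 5), (1, 7), (2, 3), (2, 6), (3, 4)]}
print(explore_cluster(7, edges, 1))              # prints 7
```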

We can relate the exploration process to a so-called branching process, as will be discussed in Section 2.1, by setting St to be the number of active vertices at time t. This means that S0 = 1. Let Xt denote the number of vertices that become active at time t. Then

S0 = 1,  St = St−1 + Xt − 1,    (2)

i.e., the number of active vertices equals the number of active vertices at time t − 1, plus the number of vertices that become active, minus the one vertex that becomes inactive. This is true for any graph we explore; however, the distribution of Xt is specific to the Erdős-Rényi random graph. We see that

Xt ∼ Bin(n − (t − 1) − St−1, p),

which depends on t and St−1. This means that the Xi for i ∈ Z>0 are not i.i.d. If the Xi were i.i.d., then the process described by eq. (2) would be called a branching process, as will be explained in Definition 4. The properties of branching processes will prove useful.

Let T denote the first t for which St = 0,

T = min{t : St = 0}.

2.1 Comparison to branching processes

We now take another look at eq. (2) and compare the exploration of a random graph to the exploration of a so-called branching process, where Xi ∼ Bin(n, p) i.i.d. for all i ∈ Z>0.

Definition 4 (Branching process). A branching process is a stochastic process that gives rise to a layered graph where each layer represents a generation of individuals. Each individual in a generation has a stochastic number of offspring, giving rise to the next generation. In mathematical terms: let Zn be the number of individuals in the n-th generation, and (ζn,i)n≥0,i>0 a two-dimensional array of i.i.d. stochastic variables representing the offspring of the nodes in each generation. The recursive relation

Z0 = 1,  Zn = ∑_{i=1}^{Z_{n−1}} ζn,i

defines a branching process. «
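A branching process as in Definition 4 is easy to simulate. The sketch below (my own; the thesis contains no code) tracks the generation sizes Zn for Poisson offspring and returns the total progeny, truncated at a cap because the total progeny may be infinite in the supercritical case.

```python
import math, random

def poisson(lam, rng):
    """Sample Poisson(lam) by Knuth's product-of-uniforms method."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p < threshold:
            return k
        k += 1

def total_progeny(lam, cap=10**6, rng=random.Random(3)):
    """Total progeny T* = sum_n Z_n of a Poisson(lam) branching process,
    truncated at `cap` since T* may be infinite when lam > 1."""
    z = total = 1                                # Z_0 = 1
    while z > 0 and total < cap:
        z = sum(poisson(lam, rng) for _ in range(z))   # Z_n = offspring of gen n-1
        total += z
    return total

# Subcritical example: the total progeny is finite with probability 1.
print([total_progeny(0.5) for _ in range(5)])
```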

We visualize a branching process as a tree (graph) where each Zn denotes the number of vertices that are at distance n from the root. Exploring such a graph with the exploration process and starting in the root, we get

S0 = 1, St = St−1 + Xt − 1, but for the exploration of a branching process the Xt would be i.i.d.

Z0 = 1 (vertex 1), Z1 = 3 (vertices 2, 5, 7), Z2 = 2 (vertices 3, 6), Z3 = 1 (vertex 4).

Figure 2: This tree has the exact same exploration as the graph in Fig. 1.

We will establish stochastic bounds on the exploration of an Erdős-Rényi random graph in terms of branching processes. These bounds will allow us to use powerful results for branching processes.

Theorem 5 (Stochastic domination of the cluster size). For every k ≥ 1 and v ∈ [n],

Pn,p(|C(v)| ≥ k) ≤ Pn,p(T≥ ≥ k),  i.e., |C(v)| ≼ T≥,

where T≥ is the total progeny of a binomial branching process as described in Section 2.1 with parameters n and p. «

Proof. Let ni = n − i − Si be the number of neutral vertices at t = i in the exploration. Then the exploration as described in eq. (2) has Xi ∼ Bin(ni−1, p). Let Yi ∼ Bin(n − ni−1, p) be independent of Xi and define

X≥i := Xi + Yi.

By Vandermonde's identity we see that X≥i ∼ Bin(n, p) is independent of ni−1 for all i. Also Xi ≤ X≥i because Yi ≥ 0. Now let

S≥i := X≥1 + · · · + X≥i − (i − 1).

Then

Pn,p(|C(v)| ≥ k) = P(∀t ≤ k − 1 : St > 0) ≤ P(∀t ≤ k − 1 : S≥t > 0) = P(T≥ ≥ k),

where T≥ := min{t : S≥t = 0} is the total progeny of a binomial branching process with parameters n and p.
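Theorem 5 can be checked by simulation. The following self-contained sketch (my own, with hypothetical helper names) estimates both sides of the inequality by Monte Carlo; by the principle of deferred decisions, the edges incident to the currently explored vertex may be sampled on the fly.

```python
import random

def cluster_size(n, p, rng):
    """Size of C(1) in ER_n(p), sampling edges during the exploration."""
    active, neutral, size = [1], set(range(2, n + 1)), 1
    while active:
        active.pop()                                       # explore one active vertex
        found = {u for u in neutral if rng.random() < p}   # edges to neutral vertices
        neutral -= found
        active.extend(found)
        size += len(found)
    return size

def progeny(n, p, rng, cap=10**4):
    """Total progeny of the Bin(n, p) branching process, truncated at cap."""
    z = total = 1
    while z > 0 and total < cap:
        z = sum(1 for _ in range(z * n) if rng.random() < p)
        total += z
    return total

rng, n, lam, k, m = random.Random(4), 200, 0.8, 10, 2000
lhs = sum(cluster_size(n, lam / n, rng) >= k for _ in range(m)) / m
rhs = sum(progeny(n, lam / n, rng) >= k for _ in range(m)) / m
print(lhs, "<=", rhs)    # Theorem 5 predicts lhs <= rhs (up to Monte Carlo error)
```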

Theorem 6 (Lower bound on cluster tail). For every k ∈ [n],

Pn,p(|C(v)| ≥ k) ≥ Pn−k,p(T≤ ≥ k),

where T≤ is the total progeny of a binomial branching process with parameters n − k and p. «

Note: Since n − k on the R.H.S. depends on k, Theorem 6 does not imply a stochastic lower bound on |C(v)|.

Proof. Using the same definition for ni as in the proof of Theorem 5, we define Tk to be the stopping time

Tk := min{t : nt ≤ n − k},

denoting the first time t at which a cluster of size at least k has been found. Note that Tk ≤ k − 1: at time t = k the number of inactive vertices is k, so at time t = k − 1 we either have no active vertices left, in which case Tk is not defined, or we still have active vertices and Tk ≤ k − 1. We can trivially rewrite

Pn,p(|C(v)| ≥ k) = Pn,p(∀t ≤ Tk : St > 0).

As in the proof of Theorem 5, we use Vandermonde's identity to relate the exploration process with Xi ∼ Bin(ni−1, p) for all i to a branching process with i.i.d. sequence X≤i ∼ Bin(n − k, p). We do this by defining Yi ∼ Bin(ni−1 − (n − k), p) and noting that

Xi = X≤i + Yi,  ∀i ≤ Tk.

Since Yi ≥ 0, we see that Xi ≥ X≤i. Define

S≤i = X≤1 + · · · + X≤i − (i − 1).

Then S≤t ≤ St for all t ≤ Tk. We can now bound

Pn,p(|C(v)| ≥ k) = Pn,p(∀t ≤ Tk : St > 0)
               ≥ Pn,p(∀t ≤ Tk : S≤t > 0)
               ≥ Pn,p(∀t ≤ k − 1 : S≤t > 0)
               = Pn−k,p(T≤ ≥ k).

Theorems 5 and 6 give us upper and lower bounds on the exploration process in terms of branching processes.

2.2 Poisson branching processes

For a branching process where the offspring variables ζn,i are i.i.d. Poisson distributed with parameter λ, we write T∗ for the total progeny, i.e., T∗ = ∑_{n=0}^{∞} Zn. We use P∗λ to denote the law of this branching process. We state the following result from Chapter 3.6 of [6]:

P∗λ(k ≤ T∗ < ∞) ≤ e^{−Iλk},    (3)

with

Iλ = λ − 1 − log(λ).
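For concreteness, the rate Iλ and the bound (3) are easy to evaluate; the short sketch below (my own illustration) tabulates e^{−Iλk} for a few values of λ at k = 10.

```python
import math

def rate(lam):
    """Rate function I_lambda = lambda - 1 - log(lambda) from eq. (3)."""
    return lam - 1 - math.log(lam)

for lam in (0.5, 0.9, 1.1, 2.0):
    # e^(-I_lambda * k) bounds P(k <= T* < infinity), here for k = 10
    print(lam, rate(lam), math.exp(-rate(lam) * 10))
```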

Theorem 7 (Poisson and binomial branching processes). For the branching process with binomial offspring distribution with parameters n and p, and the branching process with Poisson offspring distribution with parameter λ = np, for each k ≥ 1,

Pn,p(T ≥ k) = P∗λ(T∗ ≥ k) + cn(k),

where T and T∗ are the total progenies of the binomial and Poisson branching processes, respectively, and

|cn(k)| ≤ (λ²/n) ∑_{s=1}^{k−1} P∗λ(T∗ ≥ s).

In particular, |cn(k)| ≤ kλ²/n. «

2.3 The subcritical regime

We will now establish bounds on the size of the largest cluster in the subcritical regime, i.e., when λ = np < 1. We will use some of the theory summarized in Section 2.2.

Theorem 8 (Upper bound on the largest subcritical component). Fix λ < 1. Then for every a > 1/Iλ, there exists a δ = δ(a, λ) > 0, such that

Pλ(|Cmax| ≥ a log(n)) = O(n^{−δ}).

«

Proof. By Theorem 5 we know that for fixed v ∈ V,

Pn,p(|C(v)| ≥ k) ≤ Pn,p(T≥ ≥ k),

where T≥ is the total progeny of the binomial branching process with parameters n and p = λ/n. As in the proof of Theorem 5, we define

S≥t := X≥1 + · · · + X≥t − (t − 1).

By Theorem 5 we have

Pn,p(|C(v)| ≥ t) ≤ Pn,p(S≥t > 0) = Pn,p(X≥1 + · · · + X≥t ≥ t).

Using the Poisson limit theorem, we can bound this by

Pλ(|C(v)| ≥ t) ≤ Pn,p(X≥1 + · · · + X≥t ≥ t) ≤ e^{−tIλ}.

By using the union bound P(∪i Ai) ≤ ∑i P(Ai), we see that the probability that the largest cluster is larger than t = a log(n) is at most the sum over all vertices of the probability that the cluster of that vertex is larger than t. Therefore

Pλ(|Cmax| > a log(n)) ≤ nPλ(|C(v)| > a log(n)) ≤ ne^{−a log(n)Iλ} = n^{1−aIλ} = n^{−δ},

with δ = aIλ − 1 > 0 for a > 1/Iλ.

Theorem 9 (Lower bound on the largest subcritical component). Fix λ < 1. Then for every a < 1/Iλ, there exists a δ = δ(a, λ) > 0, such that

Pλ(|Cmax| ≤ a log(n)) = O(n^{−δ}). «

Before starting the proof of Theorem 9, we will have to define some new stochastic variables and prove a proposition. Denote the number of vertices contained in clusters of size at least k by

Z≥k = ∑_{i∈[n]} 1{|C(i)|≥k}.

We can relate Z≥k to |Cmax| by noting that whenever Z≥k ≥ k, at least one vertex is contained in a cluster of size at least k, and therefore |Cmax| ≥ k. Conversely, when |Cmax| ≥ k there is at least one cluster of size at least k, and therefore there are at least k vertices in clusters of size at least k. We write

{|Cmax| ≥ k} = {Z≥k ≥ k}.

Let

χ≥k(λ) = Eλ[|C(v)| 1{|C(v)|≥k}].

Proposition 10 (Variance estimate for Z≥k). For every n and k ∈ [n] and λ > 0,

Var(Z≥k) ≤ nχ≥k(λ). «

Proof. Using the definition of variance we see that

Var(Z≥k) = ∑_{i,j∈[n]} [ Pλ(|C(i)| ≥ k, |C(j)| ≥ k) − Pλ(|C(i)| ≥ k)Pλ(|C(j)| ≥ k) ].

To examine Pλ(|C(i)| ≥ k, |C(j)| ≥ k) we distinguish two cases: either i ↔ j or not. Looking at the case where i and j are not connected, we make the following observation: given that |C(i)| = l, the cluster of j lives on the n − l vertices outside C(i), so that

Pn,λ(|C(j)| ≥ k, i ↮ j | |C(i)| = l) = Pn−l,λ(|C(j)| ≥ k).

Since the probability of finding a large cluster increases with n, we see that for every l ≥ 1,

Pn,λ(|C(i)| = l, |C(j)| ≥ k, i ↮ j) − Pn,λ(|C(i)| = l)Pn,λ(|C(j)| ≥ k) ≤ 0.

This in turn means that

Var(Z≥k) ≤ ∑_{i,j∈[n]} Pλ(|C(i)| ≥ k, i ↔ j).

Hence

Var(Z≥k) ≤ ∑_{i∈[n]} ∑_{j∈[n]} Eλ[1{|C(i)|≥k} 1{j∈C(i)}]
         = ∑_{i∈[n]} Eλ[|C(i)| 1{|C(i)|≥k}]
         = nEλ[|C(1)| 1{|C(1)|≥k}]
         = nχ≥k(λ).

We can now give the proof of Theorem 9.

Proof. In terms of Z≥kn we can restate Theorem 9 as Pλ(Z≥kn = 0) = O(n^{−δ}), with kn = a log(n) and a < 1/Iλ. The proof uses the Paley-Zygmund inequality: when X is integer-valued and E(X) ≥ m, then

P(X = 0) ≤ Var(X)/m².

(This follows from Chebyshev's inequality, since {X = 0} ⊆ {|X − E(X)| ≥ m}.)

We will use the inequality for Z≥kn. This means we must establish a lower bound on E(Z≥k) and an upper bound on Var(Z≥k). We use that, for any v ∈ V,

Eλ[Z≥k] = nP≥k(λ),  where P≥k(λ) = Pλ(|C(v)| ≥ k).

Take kn = a log(n). We use the lower bound on the cluster size as established in Theorem 6. Again T is the total progeny of the binomial branching process with parameters n − kn, p:

P≥kn (λ) ≥ Pn−kn,p(T ≥ kn).

By Theorem 7 we can approximate the binomial distribution by a Poisson distribution with mean λn = λ(n − kn)/n:

Pn−kn,p(T ≥ a log(n)) = P∗λn(T∗ ≥ a log(n)) + O( aλ² log(n)/n ).

Using Stirling’s approximation we can simplify the Poisson distribution to show that with kn = a log n and any α such that 0 < α ≤ 1 − Iλa,

Eλ[Z≥kn] ≥ n^α.

Using Proposition 10 and analyzing χ≥kn , we get

Var(Z≥kn) ≤ nχ≥kn(λ) ≤ O(kn n^{1−aIλ}).

We now have the required bounds to use the Paley-Zygmund inequality and estimate

Pλ(Z≥kn = 0) ≤ Var(Z≥kn)/Eλ[Z≥kn]² ≤ O(kn n^{1−aIλ−2α}) = O(n^{−δ}),

for n large enough and any δ < 2α − (1 − Iλa), where 2α − (1 − Iλa) > 0 when (1 − Iλa)/2 < α ≤ 1 − Iλa. Hence

Pλ(|Cmax| ≤ a log(n)) = Pλ(Z≥kn = 0) = O(n^{−δ}).

Thus we have established strong bounds on |Cmax| in the subcritical regime. In particular, we find that |Cmax| is of order log(n), concentrated around log(n)/Iλ, for large n.

2.4 The supercritical regime

For the supercritical regime (λ > 1) we will show that all vertices in clusters of sufficient size are connected in one giant cluster. The cluster sizes are closely related to Poisson branching processes: for λ > 1 a Poisson branching process survives with positive probability, i.e., ∀t : Zt > 0. This tells us that there are large clusters; in fact, |Cmax| will have a size of order n. We will make these claims more precise by summarizing four propositions that lead to the following theorem. Write ζλ = 1 − ηλ for the survival probability of a Poisson branching process with mean offspring λ.

Theorem 11 (Law of large numbers for the giant component). Fix λ > 1. Then, for every ν ∈ (1/2, 1), there exists a δ = δ(ν, λ) > 0 such that

Pλ(||Cmax| − ζλn| ≥ n^ν) = O(n^{−δ}).

«
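The constant ζλ can be computed numerically: the extinction probability ηλ of a Poisson(λ) branching process is the smallest fixed point of η = e^{λ(η−1)}, so ζλ = 1 − ηλ solves ζ = 1 − e^{−λζ}. A small sketch (my own, not from the thesis):

```python
import math

def survival_probability(lam, tol=1e-12):
    """Solve zeta = 1 - exp(-lam * zeta) by fixed-point iteration;
    for lam > 1 the nontrivial solution is the survival probability."""
    zeta = 1.0                       # iterate downward from 1 to the fixed point
    while True:
        new = 1.0 - math.exp(-lam * zeta)
        if abs(new - zeta) < tol:
            return new
        zeta = new

for lam in (1.5, 2.0, 3.0):
    print(lam, survival_probability(lam))   # e.g. zeta_2 is roughly 0.797
```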

We will first show, with the help of the Poisson branching process, that there is a positive probability of finding clusters that are much larger than the clusters found in the subcritical regime.

Proposition 12 (Cluster tail is branching process survival probability). Fix λ > 1. Then, for kn ≥ a log n with a > 1/Iλ, where Iλ = λ − 1 − log(λ), and for n sufficiently large,

Pλ(|C(v)| ≥ kn) = ζλ + O(kn/n). «

This proposition can be proven by bounding Pλ(|C(v)| ≥ kn) with the help of the binomial branching process as in Theorems 5 and 6. These can in turn be compared to Poisson branching processes via Theorem 7. Note that for the lower bound we will have to compare, using λn = λ(1 − kn/n),

Pλ(|C(v)| ≥ kn) ≥ Pn−kn,λ/n(T ≥ kn) ≥ P∗λn(T∗ ≥ kn) + O(kn/n).

In other words, we bound by the probability of such a cluster appearing in the remaining graph. This gives us the bound stated in Proposition 12. We will next use a method similar to the one used for the variance bound in the subcritical regime. Let

χ<k(λ) = Eλ[|C(v)| 1{|C(v)|<k}].

Proposition 13 (Variance estimate for Z≥k). For every n and k ∈ [n],

Varλ(Z≥k) ≤ (λk + 1)nχ<k(λ). «

Again, by splitting the variance according to whether the vertices are connected or not, we can derive this bound. Using this bound, we can again use the Paley-Zygmund inequality to show that

Pλ(|Z≥kn − ζλn| ≥ n^ν) = O(n^{−δ}).

This means that a positive fraction of vertices is in clusters of size at least kn with high probability.

In a more technical proof, a bound can be derived on the number of vertices in clusters of sizes between the subcritical order log(n) and the supercritical order n. First a rate function is defined as

J(α, λ) = I_{g(α,λ)},  with g(α, λ) = (1 − e^{−λα})/α.

Using this rate function and Section 2.2, we obtain the following proposition.

Proposition 14 (Exponential bound for supercritical clusters smaller than ζλn). Fix λ > 1 and let kn be such that kn → ∞ as n → ∞. Then, for any α < ζλ,

Pλ(kn ≤ |C(v)| ≤ αn) ≤ e^{−knJ(α,λ)}/[1 − e^{−J(α,λ)}]. «

This means that the region of cluster sizes between the subcritical and supercritical sizes is rather empty. Finally, it can be shown that

Pλ(||Cmax| − ζλn| ≤ n^ν) ≥ 1 − O(n^{−δ}),

which means that the number of vertices in the largest cluster is approximately equal to n times the survival probability of the Poisson branching process.

3 Marginal laws for event counts

In this section we will consider probability distributions on sets of Erdős-Rényi random graphs that satisfy a constraint on the total number of edges or on the degree sequence. In particular, we will use the definition of relative entropy to compare the probability distribution of the constrained graphs to that of the unconstrained graphs. The ideas for this section are from [2, 5].

3.1 Relative entropy

To compare the marginal laws on edge counts we will be using the following pseudo-distance.

Definition 15 (Relative entropy). For discrete probability distributions P and Q, we define the relative entropy h(P|Q) of P relative to Q as

h(P|Q) = ∑_i P(i) log( P(i)/Q(i) ).

We say, by convention, that for P(i) = 0, P(i) log(P(i)/Q(i)) = 0. «

Note that the relative entropy is a pseudo-distance, since it is non-symmetric and does not satisfy the triangle inequality. The relative entropy is non-negative, and zero if and only if P = Q.
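For finitely supported distributions, Definition 15 translates directly into code; a minimal sketch (mine, with hypothetical names):

```python
import math

def relative_entropy(p, q):
    """h(P|Q) = sum_i P(i) log(P(i)/Q(i)), with the convention 0 log(0/q) = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p, q = [0.5, 0.5], [0.9, 0.1]
print(relative_entropy(p, q), relative_entropy(q, p))   # non-symmetric
print(relative_entropy(p, p))                           # zero iff P = Q
```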

3.2 Conditioning on edge count

We will now make explicit what is meant by conditioning on edge count. First, for n ∈ Z>0, we call Gn the set of all graphs with n vertices. For p ∈ (0, 1), let Pp be the law of Erdős-Rényi random graphs of size n with parameter p. We define the following distributions for p and G ∈ Gn, by conditioning on the total number of edges |E|:

P(G) = Pp(G | |E| = L),
Q(G) = Pp(G),

where L ∈ Z≥0. This means that P is the law of Erdős-Rényi graphs with parameter p that have exactly L = pC(n, 2) edges, while Q is the law of unconstrained Erdős-Rényi graphs with parameter p.

For ease of notation, we will use the shorthand M = C(n, 2). We note that the number of graphs in Gn with exactly L edges is C(M, L). For Erdős-Rényi random graphs, conditioned on {|E| = L} all graphs with L edges are equally likely. This means that

P(G) = Pp(G | |E| = L) = C(M, L)^{−1} 1{|E|=L}.

We now examine the relative entropy

h(P|Q) = ∑_{G∈Gn} C(M, L)^{−1} 1{|E|=L} log( C(M, L)^{−1} 1{|E|=L} / Q(G) ).

Since 1{|E|=L} = 0 when |E| ≠ L, we have that

h(P|Q) = ∑_{G∈Gn : |E|=L} C(M, L)^{−1} log( C(M, L)^{−1} / Q(G) ).

Here we only sum over graphs G ∈ Gn with L = pM, which means that Q(G) is the same for all G, namely Q(G) = p^L(1 − p)^{M−L}. Since our summands no longer depend on G, we sum over C(M, L) graphs and find

h(P|Q) = − log( C(M, L) p^L(1 − p)^{M−L} )
       = − log C(M, L) − L log(p) − (M − L) log(1 − p).

We want to know more about the behaviour of h(P|Q). Stirling's approximation says that, for n → ∞,

n! = n^n e^{−n} √(2πn) (1 + o(1)).

We first take a look at

log C(M, L) = log( M!/(L!(M − L)!) ).

Note that as n → ∞, each of M, L, M − L tends to infinity. Therefore

log C(M, L) = log[ ( M^M e^{−M} √(2πM) ) / ( L^L e^{−L} √(2πL) (M − L)^{M−L} e^{−(M−L)} √(2π(M − L)) ) (1 + o(1)) ],

where we note that the exponentials cancel out. We write M^M = M^L M^{M−L} and use this to write

log C(M, L) = log[ (L/M)^{−L} (1 − L/M)^{−(M−L)} √( M/(2πL(M − L)) ) (1 + o(1)) ]
            = log[ p^{−L} (1 − p)^{−(M−L)} √( M/(2πL(M − L)) ) (1 + o(1)) ]
            = −L log(p) − (M − L) log(1 − p) − (1/2) log( 2πL(M − L)/M ) + o(1).

Returning to the relative entropy, we find

h(P|Q) = (1/2) log( 2πL(M − L)/M ) + o(1)
       = (1/2) log( 2πp(1 − p)M ) + o(1).

Note that, since M = C(n, 2) = n(n − 1)/2, we find that lim_{n→∞} n^{−1} h(P|Q) = 0. This means that, for n large enough, P and Q are indiscernible when viewed from single nodes. We call P and Q measure equivalent. What happens when we take
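This asymptotic can be checked numerically. The sketch below (my own; h_exact and h_asympt are hypothetical names) compares the exact expression h(P|Q) = −log C(M, L) − L log(p) − (M − L) log(1 − p) with (1/2) log(2πp(1 − p)M); the parameters are chosen so that L = pM is an integer.

```python
import math

def h_exact(n, p):
    """h(P|Q) = -log C(M, L) - L log p - (M - L) log(1 - p), with L = pM."""
    m = n * (n - 1) // 2
    l = round(p * m)
    log_binom = math.lgamma(m + 1) - math.lgamma(l + 1) - math.lgamma(m - l + 1)
    return -log_binom - l * math.log(p) - (m - l) * math.log(1 - p)

def h_asympt(n, p):
    """Asymptotic form (1/2) log(2 pi p (1 - p) M)."""
    m = n * (n - 1) // 2
    return 0.5 * math.log(2 * math.pi * p * (1 - p) * m)

for n in (10, 100, 1000):
    print(n, h_exact(n, 0.2), h_asympt(n, 0.2))   # both grow like (1/2) log M
```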

Q(G) = Pp′(G),

for p′ ∈ (0, 1), with p′ and p not necessarily equal? We then find a different relative entropy, namely

h(P|Q) = (1/2) log( 2πp(1 − p)M ) + M[ p log(p/p′) + (1 − p) log((1 − p)/(1 − p′)) ] + o(1).

Therefore we find that the second term only vanishes when p = p′, and so P and Q are asymptotically equivalent if and only if p = p′.

We next want to know if these calculations are still valid for the sparse case, where p = λ/n and p′ = λ′/n. Remember that for Stirling's approximation we required that each of M, L, M − L goes to infinity. Since L = pM = (λ/n)C(n, 2), we see that we are indeed allowed to use Stirling's approximation. In fact, we can take Stirling's approximation of the factorial for any p = λ/n^α as long as α ∈ (0, 2). We again look at the order of growth of the relative entropy. First consider the first term. For n → ∞,

(1/2) log( 2π(λ/n)(1 − λ/n)M ) + O(1) = (1/2) log(2πλn) + O(1).

The error made by setting M = n² is still O(1). For the other terms we make use of the Taylor expansion. Writing

− log(1 − x) = ∑_{k=1}^{∞} x^k/k,  |x| < 1,

we get

M[ p log(p/p′) + (1 − p) log((1 − p)/(1 − p′)) ] + O(1)
  = M[ (λ/n) log(λ/λ′) + (1 − λ/n) log( (1 − λ/n)/(1 − λ′/n) ) ] + O(1)
  = ((n² − n)/2)[ (λ/n) log(λ/λ′) + (1 − λ/n)( (λ′ − λ)/n + (λ′² − λ²)/(2n²) ) ] + O(1)
  = (n/2)[ λ log(λ/λ′) + (1 − λ/n)( (λ′ − λ) + (λ′² − λ²)/(2n) ) ] + O(1).

Hence

h(P|Q) = (1/2) log(2πλn) + (n/2)[ λ log(λ/λ′) + (1 − λ/n)( (λ′ − λ) + (λ′² − λ²)/(2n) ) ] + O(1).

For p = λ/n^α with α ∈ (0, 2), the factor n in this formula is replaced by n^{2−α}. This means that, for 1 < α < 2, P and Q are measure equivalent in the sparse case. For 0 < α ≤ 1, on the other hand, P and Q are only measure equivalent when λ = λ′.

3.3 Conditioning on degree sequence

Next we take a look at the distribution of degrees in a graph.

Definition 16 (Degree). The degree deg(a) of the vertex a is defined as

deg(a) = #{x ∈ V\{a} : ax ∈ E}.

After labeling the vertices i ∈ V = [n], we can use the notation k∗i = deg(i). This gives us the degree sequence C(G) = (k∗1, . . . , k∗n) for the graph G, where C(G) is a vector that gives us an ordered list of the degrees. «

The degree sequence can be viewed as a vector-valued function C on the space of graphs Gn. We want to see if the so-called microcanonical ensemble is measure equivalent to the so-called canonical ensemble of graphs.

Definition 17 (Microcanonical ensemble). The microcanonical ensemble of graphs is the uniform distribution Pmic on Gn satisfying the hard constraint

C(G) = k∗,  with k∗ = (k∗1, . . . , k∗n).

«

Definition 18 (Canonical ensemble). The canonical ensemble of graphs is the distribution Pcan on Gn given by

Pcan(G) = e^{−θ·C(G)}/Z(θ),

where Z(θ) is a normalizing factor and the vector θ = (θ1, . . . , θn) is to be chosen such that

∑_{G∈Gn} C(G)Pcan(G) = k∗. «

The latter ensures that the expectation of the degree of vertex i with respect to Pcan is equal to k∗i for every i ∈ [n]. The specific forms of the microcanonical and the canonical ensemble ensure that the Shannon entropy is maximal.

Definition 19 (Shannon entropy). For P a probability distribution on Gn, the Shannon entropy S(P) is defined as

S(P) = − ∑_{G∈Gn} P(G) log(P(G)). «

Shannon entropy is a quantity introduced by Shannon in [4]. It satisfies three properties that make it useful in terms of information theory. In a system that has a set of possible events with probabilities (p1, p2, . . .), the Shannon entropy S(p1, p2, . . .) is the only measure that satisfies:

1. S is continuous in each of the pi.
2. S(1/n, 1/n, . . . , 1/n) increases monotonically with n, i.e., the entropy of the roll of a die is higher than that of the flip of a coin.
3. If a choice can be broken down into successive choices, then S is the corresponding weighted sum of the entropies of those choices (see [4] for more details).

The Shannon entropy is the only measure (modulo multiplication by constants) that satisfies these constraints, which makes it a very fundamental quantity.

In an article by Jaynes [3] it is argued that the values of (p1, p2, . . .), for events of which we only know the value of macroscopic constraints such as the degree sequence, should always maximize the Shannon entropy. This provides a link between information theory and statistical physics, by making sure that we add no unconscious arbitrary assumptions about the distribution, even when it satisfies the constraint. We can now introduce the optimization parameters α and θ and solve the following equation, as in the paper by Fronczak [1]:

∂/∂Pcan(G) [ S(Pcan) − α( 1 − ∑_{G∈Gn} Pcan(G) ) − ∑_{i=1}^{n} θi( k∗i − ∑_{G∈Gn} Ci(G)Pcan(G) ) ] = 0.

This leads to the relation

− log(Pcan(G)) − 1 + α + ∑_{i=1}^{n} θiCi(G) = 0,

and by eliminating the parameter α, including it in the partition function, we obtain Pcan as defined in Definition 18.

Definition 20 (Adjacency matrix). The adjacency matrix of a graph G is the n × n matrix given by

gij = 1 if ij ∈ E,  gij = 0 if ij ∉ E. «

This means that

Z(θ) = ∑_{G∈Gn} e^{−θ·C(G)} = ∑_{G∈Gn} ∏_{1≤i<j≤n} e^{−(θi+θj)gij(G)} = ∏_{1≤i<j≤n} ( 1 + e^{−(θi+θj)} ),

where we use that Ci(G) = ∑_{j≠i} gij(G), so that θ·C(G) = ∑_{1≤i<j≤n} (θi + θj)gij(G), and that each gij ranges over {0, 1} independently.

This allows us to factorize Pcan into the probability of occurrence of each edge. Namely, put

pij = e^{−(θi+θj)}/( 1 + e^{−(θi+θj)} ).

Since 1 − pij = 1/( 1 + e^{−(θi+θj)} ), it is possible to rewrite Pcan as

Pcan(G) = ∏_{1≤i<j≤n} pij^{gij} (1 − pij)^{1−gij}.
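Given θ, each edge ij is thus present independently with probability pij, so sampling from Pcan is straightforward. A small sketch (my own illustration; the θ values are arbitrary):

```python
import math, random

def edge_probability(theta_i, theta_j):
    """p_ij = exp(-(theta_i + theta_j)) / (1 + exp(-(theta_i + theta_j)))."""
    w = math.exp(-(theta_i + theta_j))
    return w / (1 + w)

def sample_canonical(theta, seed=7):
    """Sample G from P_can: edges are independent with probabilities p_ij."""
    rng = random.Random(seed)
    n = len(theta)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < edge_probability(theta[i], theta[j])}

theta = [1.0, 0.5, 0.5, 2.0]          # hypothetical Lagrange multipliers
print(sample_canonical(theta))
```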

We limit ourselves to the sparse regime, where max_{1≤i≤n} k∗i = o(√n). We compare Pcan to the distribution

P̂can(G) = ∏_{1≤i<j≤n} p̂ij^{gij} (1 − p̂ij)^{1−gij},

where we put p̂ij = e^{−(θi+θj)} = kikj/(2L) [2]. We first prove measure equivalence of P̂can(G) and Pcan(G). Note that pij = p̂ij/(1 + p̂ij), to write

(1/n) log( P̂can(G)/Pcan(G) )
  = (1/n) ∑_{1≤i<j≤n} gij log(1 + p̂ij) + (1/n) ∑_{1≤i<j≤n} (1 − gij) log(1 − p̂ij²)
  = −(1/n) ∑_{1≤i<j≤n} gij log(1 − p̂ij) + (1/n) ∑_{1≤i<j≤n} log(1 − p̂ij²).    (4)

We want to show that the expression above goes to zero as n goes to infinity. To that end we estimate

− ∑_{1≤i<j≤n} gij log(1 − p̂ij) ≤ ∑_{1≤i<j≤n} gij( p̂ij + O(p̂ij²) )
  ≤ ∑_{1≤i<j≤n} gij( (max_i ki)²/(2L) + O(p̂ij²) ).

This implies that −∑_{1≤i<j≤n} gij log(1 − p̂ij) = o(n), since ∑_{1≤i<j≤n} gij = L and max_i ki = o(√n); the second sum in (4) is o(n) for the same reason. Hence P̂can and Pcan are measure equivalent.

We next compute

log P̂can(G) = ∑_{1≤i<j≤n} log(1 − p̂ij) + ∑_{1≤i<j≤n} gij log( p̂ij/(1 − p̂ij) ).    (5)

Substitute p̂ij = kikj/(2L). We simplify the first sum in (5) by working it out as a Taylor polynomial:

∑_{1≤i<j≤n} log(1 − p̂ij) = − ∑_{1≤i<j≤n} kikj/(2L) + o(n) = −L + o(n),

while the second sum in (5) satisfies

∑_{1≤i<j≤n} gij log( p̂ij/(1 − p̂ij) ) = ∑_{i=1}^{n} ki log(ki) − L log(2L) + o(n).

Via Equation (4), noting that P̂can and Pcan have the same asymptotic behaviour, we find that

log Pcan(G) = ∑_{i=1}^{n} ki log(ki) − L log(2L) − L + o(n).

For the microcanonical ensemble we calculate the number of configurations of graphs that satisfy the hard constraint. We take an approximation from [5] that gives us the number of graphs Ωk that satisfy the hard constraint,

Ωk = ( √2 (2L/e)^L / ∏_{i=1}^{n} ki! ) e^{−(⟨k²⟩/(2⟨k⟩))² + 1/4 + o(n^{−1}k³)},

where ⟨k⟩ = 2L/n and ⟨k²⟩ = n^{−1} ∑_{i=1}^{n} ki² denote the empirical first and second moments of the degrees. We have Pmic(G) = 1/Ωk. The above approximation is only valid when max_i ki = o(√n). Hence

log(Pmic(G)) = − log(Ωk)
  = − log( ( √2 (2L/e)^L / ∏_{i=1}^{n} ki! ) e^{−(⟨k²⟩/(2⟨k⟩))² + 1/4 + o(n^{−1}k³)} )
  = ∑_{i=1}^{n} log(ki!) − L log(2L/e) + (⟨k²⟩/(2⟨k⟩))² + o(n)
  = ∑_{i=1}^{n} ( ki log(ki) − ki ) − L log(2L) + L + (⟨k²⟩/(2⟨k⟩))² + o(n)
  = ∑_{i=1}^{n} ki log(ki) − L log(2L) − L + (⟨k²⟩/(2⟨k⟩))² + o(n).

It is found that

h(Pmic|Pcan) = ∑_{i=1}^{n} g(ki) + o(n),

where g(k) = log( k!/(k^k e^{−k}) ). We define the empirical degree distribution, using the Dirac measure

δk(x) = 1 if x = k,  δk(x) = 0 if x ≠ k,

as

fn = n^{−1} ∑_{i=1}^{n} δ_{k∗i}.

Let f be a degree distribution such that

lim_{n→∞} ‖fn − f‖_{ℓ¹(g)} = 0,

where

‖h‖_{ℓ¹(g)} = ∑_{k=0}^{∞} |h(k)| g(k).

(Note that g(k) is strictly increasing and g(0) = 0.) This means that

lim_{n→∞} n^{−1} h(Pmic|Pcan) = ‖f‖_{ℓ¹(g)} > 0

for any f ≠ δ0. This shows that a graph with an extensive number of constraints has different asymptotic behaviour in the microcanonical ensemble than in the canonical ensemble.
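The limit can be made concrete with a short computation (my own sketch): g(k) = log(k!/(k^k e^{−k})), and n^{−1} h(Pmic|Pcan) is approximately the g-weighted mass of the empirical degree distribution.

```python
import math
from collections import Counter

def g(k):
    """g(k) = log(k! / (k^k e^(-k))); g(0) = 0 and g is strictly increasing."""
    if k == 0:
        return 0.0
    return math.lgamma(k + 1) - k * math.log(k) + k

def entropy_density(degrees):
    """n^{-1} h(P_mic|P_can), approximately sum_k f_n(k) g(k)."""
    n = len(degrees)
    return sum(c / n * g(k) for k, c in Counter(degrees).items())

degrees = [3, 3, 2, 1, 1, 1, 1]       # degree sequence of the graph in Fig. 1
print(entropy_density(degrees))
```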

4 Conclusion

The Erdős-Rényi random graph is a very basic model for complex networks. Despite being very simple, it has a phase transition in the density parameter, as shown in the first part. This makes it a good introductory model for the study of complex networks. The ensemble nonequivalence that was described in the second part shows that the model can be used as an explanation for why ensemble nonequivalence occurs in physical systems. It has this property without the need to introduce so-called long-range interactions, which were hypothesized in the physics literature to be necessary. This shows that reduction of complex systems to basic examples helps to understand complex systems without making assumptions about their inner workings.

References

[1] Agata Fronczak, Exponential random graph models, available at arXiv:1210.7828.
[2] Diego Garlaschelli, Frank den Hollander, and Andrea Roccaverde, Ensemble nonequivalence in random graphs with modular structure, Journal of Physics A: Mathematical and Theoretical 50 (2017), no. 1, 015001.
[3] E. T. Jaynes, Information theory and statistical mechanics, Phys. Rev. 106 (1957), 620–630.
[4] Claude E. Shannon, A mathematical theory of communication, The Bell System Technical Journal 27 (1948), 379–423.
[5] Tiziano Squartini, Joey de Mol, Frank den Hollander, and Diego Garlaschelli, Breaking of ensemble equivalence in networks, Phys. Rev. Lett. 115 (2015), 268701.
[6] Remco van der Hofstad, Random Graphs and Complex Networks, Part I (2017), available at http://www.win.tue.nl/rhofstad/NotesRGCN.pdf.
