The Giant in Random Graphs Is Almost Local

Remco van der Hofstad
Department of Mathematics and Computer Science, Eindhoven University of Technology, e-mail: [email protected]

Abstract: Local convergence techniques have become a key methodology to study random graphs in sparse settings where the average degree remains bounded. However, many random graph properties do not directly converge when the random graph converges locally. A notable, and important, random graph property that does not follow from local convergence is the size and uniqueness of the giant component. We provide a simple criterion that guarantees that local convergence of a random graph implies the convergence of the proportion of vertices in the maximal connected component. We further show that, when this condition holds, the local properties of the giant are also described by the local limit. We apply this novel method to the configuration model as a proof of concept, reproving a result that is well-established. As a side result, we show that the proof also implies the small-world nature of the configuration model.

MSC2020 subject classifications: 60K37, 05C81, 82C27, 37A25.
Keywords and phrases: Random graphs, Local convergence, Giant component, Configuration model.

1. Introduction

Local convergence techniques, as introduced by Aldous and Steele [3] and Benjamini and Schramm [13], have become the main methodology to study random graphs in sparse settings where the average degree remains bounded. Local convergence roughly means that the proportion of vertices whose neighborhoods have a certain shape converges to some limiting value, which is to be considered as a measure on rooted graphs. We refer to [39, Chapter 2] for background on local convergence.

The giant component problem has received enormous attention ever since the very first and seminal results by Erdős and Rényi on the Erdős-Rényi random graph [33]; see also [48, 18] for detailed further results for this model. The simplest form of this question is whether there exists a linear-size connected component or not. In many cases, a sharp transition occurs depending on a certain graph parameter. See [6, 32, 50, 38, 39], as well as the references therein, for several books on the topic.

This paper combines these two threads by investigating the size of the giant component when the random graph converges locally. Consider a sequence of random graphs $(G_n)_{n\geq 1}$, where we simplify the notation by assuming that $G_n = (V(G_n), E(G_n))$ is such that $|V(G_n)| = n$. We let
\[ |\mathcal{C}_{\max}| = \max_{v\in V(G_n)} |\mathcal{C}(v)| \tag{1.1} \]
denote the maximal cluster size. When the random graph converges locally to some limiting rooted measure, one would expect that also $|\mathcal{C}_{\max}|/n \overset{\mathbb{P}}{\longrightarrow} \zeta$, where $\zeta$ is the survival probability of the local limit. However, while the number of connected components is well behaved in the local topology, the proportion of vertices in the giant is not so nicely behaved. Indeed, since local convergence is all about proportions of vertices whose finite (but arbitrarily large) neighborhoods converge to a limit, there is an enormous gap between surviving locally and being in the giant component. Thus, local convergence alone is not enough to imply the convergence of the giant. In this paper, we identify what extra condition is needed to obtain this natural implication.

arXiv:2103.11733v1 [math.PR] 22 Mar 2021

2. Asymptotics and properties of the giant

In this section, we investigate the behavior of the giant component $|\mathcal{C}_{\max}|$ for random graphs that converge locally. In Section 2.1, we introduce the notion of local convergence in probability.
In Section 2.2, we study the asymptotics of $|\mathcal{C}_{\max}|$ for random graphs that converge locally in probability, and in Section 2.3 we investigate local properties of the giant.

2.1. Local convergence in probability

Local convergence was introduced independently by Aldous and Steele in [3] and Benjamini and Schramm in [13]. The purpose of Aldous and Steele in [3] was to describe the local structure of the so-called 'stochastic mean-field model of distance', meaning the complete graph with i.i.d. exponential edge weights. This local description allowed Aldous to prove the celebrated $\zeta(2)$ limit of the random assignment problem [1]. Benjamini and Schramm in [13] instead used local weak convergence to show that limits of planar graphs are recurrent with probability one. Since its conception, local convergence has proved a key ingredient in random graph theory. In this section, we provide some basics of local convergence. For more detailed discussions, we refer the reader to [22] or [39, Chapter 2]. Let us start with some definitions.

A rooted graph is a pair $(G, o)$, where $G = (V(G), E(G))$ is a graph with vertex set $V(G)$ and edge set $E(G)$, and $o \in V(G)$ is a vertex. Further, a rooted or non-rooted graph is called locally finite when each of its vertices has finite degree (though not necessarily uniformly bounded). Two (finite or infinite) graphs $G_1 = (V(G_1), E(G_1))$ and $G_2 = (V(G_2), E(G_2))$ are called isomorphic, which we write as $G_1 \simeq G_2$, when there exists a bijection $\phi\colon V(G_1) \to V(G_2)$ such that $\{u, v\} \in E(G_1)$ precisely when $\{\phi(u), \phi(v)\} \in E(G_2)$. Similarly, two rooted (finite or infinite) graphs $(G_1, o_1)$ and $(G_2, o_2)$, with $G_i = (V(G_i), E(G_i))$ for $i \in \{1, 2\}$, are called isomorphic, abbreviated as $(G_1, o_1) \simeq (G_2, o_2)$, when there exists a bijection $\phi\colon V(G_1) \to V(G_2)$ such that $\phi(o_1) = o_2$ and $\{u, v\} \in E(G_1)$ precisely when $\{\phi(u), \phi(v)\} \in E(G_2)$.

These notions can be easily adapted to multi-graphs (which we will need to rely on), for which $G = (V(G), (x_{i,j})_{i,j\in V(G)})$, where $x_{i,j}$ denotes the number of edges between $i$ and $j$, and $x_{i,i}$ the number of self-loops at $i$. There, instead, the isomorphism $\phi\colon V(G) \to V(G')$ is required to satisfy $x_{i,j} = x'_{\phi(i),\phi(j)}$, where $(x_{i,j})_{i,j\in V(G)}$ and $(x'_{i,j})_{i,j\in V(G')}$ are the edge multiplicities of $G$ and $G'$ respectively.

We let $\mathcal{G}_\star$ be the space of (possibly infinite) connected rooted graphs, where we consider two graphs to be equal when they are isomorphic. Thus, we consider $\mathcal{G}_\star$ as the set of equivalence classes of rooted graphs modulo isomorphisms. The space $\mathcal{G}_\star$ of rooted graphs is a nice metric space, with an explicit metric; see [39, Appendix A] for details.

For a rooted graph $(G, o)$, we let $B_r^{(G)}(o)$ denote the (rooted) subgraph of $(G, o)$ of all vertices at graph distance at most $r$ away from $o$. Formally, this means that $B_r^{(G)}(o) = \big(V(B_r^{(G)}(o)), E(B_r^{(G)}(o)), o\big)$, where
\[ V(B_r^{(G)}(o)) = \{u \colon d_G(o, u) \leq r\}, \qquad E(B_r^{(G)}(o)) = \{\{u, v\} \in E(G) \colon d_G(o, u), d_G(o, v) \leq r\}. \tag{2.1} \]

Let $o \in V(G_n)$ be chosen uniformly at random (u.a.r.) in $V(G_n)$. We say that the graph sequence $(G_n)_{n\geq 1}$ converges locally in probability to a limit $(G, o) \sim \mu$, when, for every $r \geq 0$ and $H_\star \in \mathcal{G}_\star$,
\[ \frac{1}{|V(G_n)|} \sum_{v\in V(G_n)} \mathbb{1}_{\{B_r^{(G_n)}(v) \simeq H_\star\}} \overset{\mathbb{P}}{\longrightarrow} \mu\big(B_r^{(G)}(o) \simeq H_\star\big). \tag{2.2} \]
This means that the subgraph proportions in the random graph $G_n$ are close, in probability, to those given by $\mu$. There are related notions of local convergence, such as local weak convergence, where (2.2) is replaced by the convergence of the expectation of the left-hand side to the right-hand side, and local almost sure convergence, where the convergence in probability in (2.2) is replaced by convergence almost surely.
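
To make the left-hand side of (2.2) concrete, the following Python sketch computes the ball of (2.1) by breadth-first search and then the proportion of vertices whose radius-1 neighborhood is a rooted star. It is only an illustration under stated assumptions and is not taken from the paper: the adjacency-list representation, the helper names (`erdos_renyi`, `ball`, `is_rooted_star`, `star_proportion`), and the choice of a star as the test graph are all hypothetical. A star is used because rooted isomorphism to it can be checked by counting vertices and edges, which avoids a general isomorphism test.

```python
import random
from collections import deque


def erdos_renyi(n, c, rng):
    """Sample G(n, c/n) as an adjacency list (dict: vertex -> set of neighbors)."""
    adj = {v: set() for v in range(n)}
    p = c / n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def ball(adj, root, r):
    """Vertex set and edge set of the ball of radius r around root, as in (2.1)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if dist[u] == r:
            continue  # do not explore beyond distance r
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    vertices = set(dist)
    # Keep every edge of the graph whose two endpoints are both within distance r.
    edges = {frozenset((u, w)) for u in vertices for w in adj[u] if w in vertices}
    return vertices, edges


def is_rooted_star(adj, root, k):
    """True when the radius-1 ball around root is a star with k leaves, rooted at its centre."""
    vertices, edges = ball(adj, root, 1)
    # The root has k neighbors, and exactly k edges means no edge joins two neighbors.
    return len(vertices) == k + 1 and len(edges) == k


def star_proportion(adj, k):
    """Left-hand side of (2.2) for r = 1 and a rooted star with k leaves."""
    return sum(is_rooted_star(adj, v, k) for v in adj) / len(adj)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, so that the run is reproducible
    c, k = 2.0, 2
    for n in (250, 1000):
        print(n, star_proportion(erdos_renyi(n, c, rng), k))
```

For the Erdős-Rényi graph with edge probability c/n, the standard local limit is a branching process tree with Poisson(c) offspring, so one would expect the printed proportions to approach the Poisson probability e^{-c} c^k / k! as n grows. The sketch does not verify this; it only evaluates the empirical proportion.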

For our purposes, local convergence in probability is the most convenient, for example since it implies that the neighborhoods of two uniformly chosen vertices are close to being independent (see e.g., [39, Corollary 2.18]), which is central in our proof.

2.2. Asymptotics of the giant

Given a random graph sequence $G_n$ that converges locally in probability to $(G, o) \sim \mu$, one would expect that $|\mathcal{C}_{\max}|/n \overset{\mathbb{P}}{\longrightarrow} \zeta := \mu(|\mathcal{C}(o)| = \infty)$. However, the proportion of vertices in the largest connected component $|\mathcal{C}_{\max}|/n$ is not continuous in the local convergence topology, as it is a global object. In fact, also $|\mathcal{C}(o_n)|/n$, with $o_n$ a uniformly chosen vertex of $G_n$, does not converge in distribution when $G_n$ converges locally in probability to $(G, o) \sim \mu$. However, local convergence still tells us a useful story about the existence of a giant, as well as its size. Indeed, Corollary 2.1 shows that the upper bound is always valid, while Theorem 2.2 shows that a relatively simple condition suffices to yield the lower bound as well:

Corollary 2.1 (Upper bound on the giant). Let $(G_n)_{n\geq 1}$ be a sequence of graphs whose sizes $|V(G_n)| = n$ tend to infinity. Assume that $G_n$ converges locally in probability to $(G, o) \sim \mu$. Write $\zeta = \mu(|\mathcal{C}(o)| = \infty)$ for the survival probability of the limiting graph $(G, o)$. Then, for every $\varepsilon > 0$ fixed,
\[ \mathbb{P}\big(|\mathcal{C}_{\max}| \leq n(\zeta + \varepsilon)\big) \to 1. \tag{2.3} \]

In particular, Corollary 2.1 implies that $|\mathcal{C}_{\max}|/n \overset{\mathbb{P}}{\longrightarrow} 0$ when $\zeta = 0$, so that there can only be a giant when the local limit has a positive survival probability.

Proof. Define
\[ Z_{\geq k} = \sum_{v\in V(G_n)} \mathbb{1}_{\{|\mathcal{C}(v)| \geq k\}}. \tag{2.4} \]
Assume that $G_n$ converges locally in probability to $(G, o) \sim \mu$ as defined in (2.2).
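
The quantities in (1.1) and (2.4) can be computed directly on any finite graph. The Python sketch below is an illustration only, with hypothetical helper names (`component_sizes`, `largest_component`, `z_at_least`) and an assumed adjacency-list representation (dict: vertex -> set of neighbors). It also makes one deterministic observation visible: whenever the largest component has at least k vertices, all of its vertices are counted in Z_{>=k}, so the maximal cluster size is then at most Z_{>=k}. Whether the proof in the source proceeds along exactly these lines is not shown in the text above, so this should be read as a plausible ingredient and not as the author's argument.

```python
from collections import deque


def component_sizes(adj):
    """Map every vertex v to the size of its connected component C(v)."""
    size = {}
    for start in adj:
        if start in size:
            continue  # this component has already been explored
        component = [start]
        seen = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    component.append(w)
                    queue.append(w)
        for v in component:
            size[v] = len(component)
    return size


def largest_component(adj):
    """Maximal cluster size, as in (1.1)."""
    return max(component_sizes(adj).values())


def z_at_least(adj, k):
    """Z_{>=k} as in (2.4): the number of vertices in components of size at least k."""
    return sum(1 for s in component_sizes(adj).values() if s >= k)


def check_upper_bound(adj):
    """Check that the maximal cluster size is at most Z_{>=k} for every k it reaches."""
    c_max = largest_component(adj)
    return all(c_max <= z_at_least(adj, k) for k in range(1, c_max + 1))
```

Since Z_{>=k} is a sum of indicators that depend only on a finite exploration of each vertex's component, it is the kind of quantity that local convergence can control, which is why a bound through Z_{>=k} is a natural route to (2.3).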