Statistical Mechanics of Convergent Evolution in Spatial Patterning
Total Page:16
File Type:pdf, Size:1020Kb
Statistical mechanics of convergent evolution in spatial patterning Bhavin S. Khatria,b,1, Tom C. B. McLeishc,d, and Richard P. Seara aDepartment of Physics, University of Surrey, Guildford GU2 7XH, United Kingdom; cSchool of Physics, University of Edinburgh, Edinburgh EH9 3JZ, United Kingdom; bSchool of Physics and Astronomy, University of Leeds, Leeds LS2 9JT, United Kingdom; and dDepartments of Physics and Chemistry, Durham University, Durham DH1 3HP, United Kingdom Edited by Jose´N. Onuchic, University of California at San Diego, La Jolla, CA, and approved April 24, 2009 (received for review December 3, 2008) We explore how the genotype–phenotype map determines con- RNA sequences (8), which suggest that a single RNA shape vergent evolution in a simple model of spatial gene regulation (phenotype) maps to large neutral networks in sequence (genotype) during development. Evolution is simulated via a Monte Carlo space. Such redundancy is also seen in the context of the evolution scheme that incorporates mutation, selection, and genetic drift, by of transcription factor binding (9–11), where a single binding using a bottom-up model of gene regulation with a fitness function constant of a protein to a regulatory region can be realized by a that is optimized by a switch-like response to a morphogen number of different pairs of sequences, distinguished only by the gradient. We find that even for very simple regulation, the geno- number of mismatches between them. The many-to-one nature of type–phenotype map gives rise to an emergent fitness landscape the genotype–phenotype map naturally gives rise to the concept of of remarkable complexity. This leads to a richness of evolutionary mutational entropy (9, 12, 13), which importantly, can bias the behavior as population size is increased that parallels the thermo- course of evolution in a manner not predictable from studies based dynamics of physical systems as temperature decreases. Conver- on phenotypic landscapes (14). gence is controlled by the existence of sufficiently dominant global In this article, we study the genotype–phenotype map for the optima in ‘‘free fitness,’’ which is a quantity that is the balance of evolution of spatial patterning in development and examine the role mutational entropy and fitness. In independent simulations at low that mutational entropy and finite population size play in deter- population sizes, we find convergence to a phenotype of subop- mining the independent convergence of evolutionary solutions. In timal fitness due to the multiplicity or entropy of solutions. This the regime of small population size, the most appropriate model of contrasts with convergence to the optimal fitness phenotype at evolution is described by a Wright–Fisher process (2, 3), where high population size. However, at sufficiently large population populations are monomorphic at loci, aside from short periods sizes, we find convergence in only the phenotypes with greatest when a rare and ultimately successful mutation vies for fixation in effect on fitness, whereas noncritical phenotypes exhibit diver- the population. Here, we introduce a Monte Carlo (MC) scheme gence due to quenched disorder on a locally rough landscape. Our based on the probability of fixation of a mutant of fitness ␦F results predict that for large populations, the evolution of even compared with the wild type, which we use to explore the fitness simple gene regulatory circuits may be glassy-like, such that, landscape. As demonstrated by Sella and Hirsh (12) in the general counter to the commonly accepted view that conservation implies case, and Berg and La¨ssig(9), for the case of transcription factor function, many conserved phenotypes are simply frozen accidents binding, there is an exact mathematical correspondence between of little consequence to the fitness of the organism. the Wright–Fisher process and the canonical ensemble of statistical mechanics in steady state; the ratio of forward and reverse fixation fitness landscapes ͉ gene regulation ͉ genotype–phenotype map ͉ rates (␦F)/(Ϫ␦F) ϭ exp(␦F), implying that the probability mutational entropy ͉ population genetics distribution of the most recent common ancestor is a Boltzmann distribution with an effective temperature Ϫ1 ϭ (2(N Ϫ 1))Ϫ1 for haploid asexual populations, as is assumed in this article. As shown ver the past 150 years, much work has established natural by Sella and Hirsh (12), this means that in steady-state evolutionary selection, mutation, and genetic drift as the basic processes of O processes, there is a balance between mutational entropy S and evolution (1–5). However, there still remains much controversy mean fitness ͗F͘, where a quantity analogous to the Helmholtz free concerning how the complexity and diversity of biological form has 1 energy, the free fitness ⌽ϭ͗F͘ϩ S is at maximum. We use this arisen from essentially random processes; does evolution play out in analogy to show that from a simple model of the genotype– an arena of weak constraints, or are there hidden limits to the phenotype map for gene regulation, an emergent and unexpected possibilities of biological form. This is epitomized in the debate complexity arises, which leads to a richness of behavior as popu- regarding the role of historical contingency in evolution; were we to lation size is varied that parallels that found in condensed-matter replay the tape of life, would, as Stephen Jay Gould (6) suggested, physics as a function of temperature. This, in turn, gives rise to a historical accidents compound and amplify in time such that number of unexpected consequences for molecular evolution, biological organisms on Earth today would be unrecognizable, or is which we explore within the context of the question of convergence. evolution constrained to the extent that life forms would corre- spond with those on earth today, as argued by Simon Conway Model Morris (7). There are many examples in the natural world of There are 3 main elements to the model: (i) a model of spatial independent convergence that support the latter stance, a famous patterning of an embryo, (ii) a model for the genotype–phenotype example is the near-identical structure of the mammalian and map, and (iii) a means of determining fitness of a phenotype and octopus camera eye (7), whereas empirical support for divergence in evolution is rarer, because by its very nature, the demonstration of different solutions under the same selective pressure is difficult. Author contributions: B.S.K., T.C.B.M., and R.P.S. designed research; B.S.K. performed At the heart of this debate is that, despite an understanding of the research; and B.S.K., T.C.B.M., and R.P.S. wrote the paper. basic mechanisms of evolution, questions of the genotype to The authors declare no conflict of interest. phenotype map and the underlying structure of evolutionary land- This article is a PNAS Direct Submission. scapes have attracted little attention to date. 1To whom correspondence should be addressed. E-mail: [email protected]. However, there are exceptions, for example, the body of work This article contains supporting information online at www.pnas.org/cgi/content/full/ that has studied the genotype–phenotype map of the folding of 0812260106/DCSupplemental. 9564–9569 ͉ PNAS ͉ June 16, 2009 ͉ vol. 106 ͉ no. 24 www.pnas.org͞cgi͞doi͞10.1073͞pnas.0812260106 Downloaded by guest on October 1, 2021 Fig. 1. An overview of the genotype–phenotype map. The gene regulatory module has input a morphogen gradient M(x) across a 1-dimensional embryo of length L and outputs a transcription factor T(x). The specific equation shown for the morphogen profile arises from solving the reaction-diffusion equation with reflecting boundary conditions and is valid for all ␣ (SI). Gene regulation of T by using a morphogen and RNAP (R) is controlled in a bottom-up manner by binding to its regulatory region consisting of a promoter P and adjacent binding site 1. Binding (E) and glue energies (␦E) are determined by comparison of the relevant sequences contained in G. subsequent evolution via Monte Carlo simulation. An overview of adds stability for each favorable hydrophobic interaction between the first 2 parts is given in Fig. 1. amino acids (21). The regulatory region that controls transcription of the TF consists of adjacent nonoverlapping binding sites, each Spatial Patterning. Spatial patterning is of profound importance in with its own binary sequence. More complicated regulatory developmental biology because it allows the organization of tran- schemes are possible, such as allowing overlapping binding sites scription factors (TFs) into different spatial regions, which then (22); however, we choose a nonoverlapping approximation to keep ultimately leads to differentiation of the embryo into a variety of cell our model of the genotype phenotype map simple. In this article, SCIENCES ϭ types and higher morphological forms (15). However, different we assume the length of binding sequences are lb 10 and glue APPLIED PHYSICAL regions or cells must receive information on their relative position sequences lg ϭ 5, giving a total genome size, G ϭ 50. The binding within a developing embryo. A well-studied mechanism to achieve and cooperative energies are calculated by using the Hamming this is the formation of a concentration gradient of a morphogen distance or number of mismatches between a relevant pair of binary that pervades the embryo (16); concentration then becomes a sequences. To allow switch-like behavior with changing concentra- function of position, which can then be ‘‘read-off’’ by downstream tion (or chemical potential) for the occupation of binding sites targets. Many developmental processes have been shown to rely on requires the background free energy (binding of other nontarget morphogens, such as bicoid (17) or decapentaplegic (Dpp) (18) and sites in the genome) to be within kBT of the maximum TF binding EVOLUTION wingless (Wg) (19), which pattern the anterior–posterior (A–P) axis energy (20).