Reconstruction of DNA Sequences Using Genetic Algorithms and Cellular Automata: Towards Mutation Prediction? Ch

Total Page:16

File Type:pdf, Size:1020Kb

Reconstruction of DNA Sequences Using Genetic Algorithms and Cellular Automata: Towards Mutation Prediction? Ch Available online at www.sciencedirect.com BioSystems 92 (2008) 61–68 Reconstruction of DNA sequences using genetic algorithms and cellular automata: Towards mutation prediction? Ch. Mizas a, G.Ch. Sirakoulis a, V. Mardiris a, I. Karafyllidis a, N. Glykos b, R. Sandaltzopoulos b,∗ a Democritus University of Thrace, Department of Electrical and Computer Engineering, 67100 Xanthi, Greece b Democritus University of Thrace, Department of Molecular Biology and Genetics, Laboratory of Gene Expression, Molecular Diagnostics and Modern Therapeutics, 68100 Alexandroupolis, Greece Received 21 December 2006; received in revised form 28 November 2007; accepted 10 December 2007 Abstract Change of DNA sequence that fuels evolution is, to a certain extent, a deterministic process because mutagenesis does not occur in an absolutely random manner. So far, it has not been possible to decipher the rules that govern DNA sequence evolution due to the extreme complexity of the entire process. In our attempt to approach this issue we focus solely on the mechanisms of mutagenesis and deliberately disregard the role of natural selection. Hence, in this analysis, evolution refers to the accumulation of genetic alterations that originate from mutations and are transmitted through generations without being subjected to natural selection. We have developed a software tool that allows modelling of a DNA sequence as a one-dimensional cellular automaton (CA) with four states per cell which correspond to the four DNA bases, i.e. A, C, T and G. The four states are represented by numbers of the quaternary number system. Moreover, we have developed genetic algorithms (GAs) in order to determine the rules of CA evolution that simulate the DNA evolution process. Linear evolution rules were considered and square matrices were used to represent them. If DNA sequences of different evolution steps are available, our approach allows the determination of the underlying evolution rule(s). Conversely, once the evolution rules are deciphered, our tool may reconstruct the DNA sequence in any previous evolution step for which the exact sequence information was unknown. The developed tool may be used to test various parameters that could influence evolution. We describe a paradigm relying on the assumption that mutagenesis is governed by a near-neighbour-dependent mechanism. Based on the satisfactory performance of our system in the deliberately simplified example, we propose that our approach could offer a starting point for future attempts to understand the mechanisms that govern evolution. The developed software is open-source and has a user-friendly graphical input interface. © 2008 Elsevier Ireland Ltd. All rights reserved. Keywords: Genetic algorithms; Cellular automata; DNA sequence modelling; Evolution; Simulation; Bioinformatics; Nanotechnology 1. Introduction 1993, 1994; Sirakoulis et al., 2004). In general, computational intelligence methods may be employed provided that they are In the recent years, advances of biochemical and biophysi- wisely combined with bioinformatics, as illustrated in Baxevanis cal instrumentation methodologies allowed collection of whole and Ouellette (1998). In order to contribute to this effort we mod- genome and proteome sequences (Venter et al., 2001). The elled DNA sequence as a one-dimensional cellular automaton complexity of the available information calls for the devel- (CA) thus allowing the use of powerful and sophisticated com- opment of ever more powerful bioinformatics tools to attack putation techniques for the development of useful bioinformatics complex problems associated with proper interpretation of the tools (Sirakoulis et al., 2003). obtained information. Faster and more accurate computational CAs, originally developed by von Neumann (1966) as mod- intelligence algorithms and data processing technologies are els of self-reproducing systems, have been extensively used required (Baxevanis and Ouellette, 1998; Chou and Zhang, to model and simulate environmental and biological systems 1992; Cios et al., 2005; Durbin et al., 2000; Zhang and Chou, (Bernadres and dos Santos, 1997; Chou et al., 2006; Gao et al., 2006; Gaylord and Nishidate, 1996; Kansal et al., 2000; Karafyllidis, 1997, 1998; Karafyllidis and Thanailakis, 1997; ∗ Corresponding author. Tel.: +30 25510 30622; fax: +30 25510 30624. Patel et al., 2001; Salzberg et al., 2004; Sirakoulis et al., 2000; E-mail address: [email protected] (R. Sandaltzopoulos). Wang et al., 2005; Xiao et al., 2005a,b, 2006a,b). CAs were 0303-2647/$ – see front matter © 2008 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.biosystems.2007.12.002 62 Ch. Mizas et al. / BioSystems 92 (2008) 61–68 showed to be a promising model for DNA sequence (Sirakoulis et al., 2003), because certain aspects of the DNA structure, func- tion and evolution can be simulated using several mathematical tools (such as linear algebra and operators), introduced through the use of CAs. In our model the sugar-phosphate backbone of a DNA molecule corresponds to the CA lattice and the organic bases to the CA cells. At each position of the lattice one of the four bases A (adenine), C (cytosine), T (thymine) and G (gua- nine) of the DNA molecule may be allocated. These four bases correspond to the four possible states of the CA cell. In elementary CAs, the CA evolution rule can be extracted from a given number of CA evolution patterns. This method can also be applied to the CAs that model DNA sequences. Thus, we developed a simulator, named DNA EVO, which allows mod- elling DNA evolution by extracting CA rule(s) using genetic Fig. 1. (a) An example of a typical one-dimensional CA; (b) a sequence of algorithms (GAs) (Holland, 1975; Goldberg, 1989). The evolu- evolution steps of a one-dimensional CA. tion rule(s) can be determined by providing the global state of the DNA sequence in various evolution steps. Conversely, if the neighbouring cells. All the cells that may possibly affect the evolution rule(s) and DNA sequences are given for several evo- change of the state of the ith cell are the neighbourhood of this lution steps, it may be possible to determine the DNA sequence cell. The neighbourhood is defined as follows: in previous evolution steps. DNA EVO is an interactive simula- N i, r ={C ,...,C ,C ,C ,C,C ,C , tion tool that includes a graphical user interface [GUI] which has ( ) i−r i−3 i−2 i−1 i i+1 i+2 been implemented using Tcl/Tk facilities (Welch et al., 2003). Ci+3, ..., Ci+r},r= 0, 1, 2, 3,...,m (2) where r is the size of the neighbourhood. If r = 1, which is the 2. Modelling DNA in Terms of Cellular Automata most usual case then the neighbourhood of the ith cell consists of the same cell and its left and right immediate neighbours: In this work we match certain properties of DNA with the features of a proposed CA thus modelling DNA in terms of N(i, 1) ={Ci−1,Ci,Ci+1} (3) CAs. Sketching certain directions that DNA modelling might take, we hope to stimulate further research in this direction. The state of the ith cell at time step t + 1 is affected by the Cellular automata (CAs) were introduced by von Neumann states of its neighbours at the previous time step t, i.e. the state (1966) and Ulam (1974) as a possible idealization of biologi- of the ith cell at a time step is a function of the states of its cal systems, in order to model biological self-reproduction. In neighbours at the previous time step: this work, we model DNA as a one-dimensional CA and, there- Ct+1 = F Ct ,...,Ct ,Ct ,Ct ,Ct,Ct ,Ct , i ( i−r i−3 i−2 i−1 i i+1 i+2 fore, only one-dimensional CAs will be presented in this section Ct ,...,Ct (Sirakoulis et al., 2003). A one-dimensional CA consists of a reg- i+3 i+r) (4) ular uniform lattice, which may be infinite in size and expands in a one-dimensional space. Each site of this lattice is called a This function is the CA evolution rule. The upper index in cell. At each cell a variable takes values from a discrete set. The t+1 the state symbol denotes the time step. Ci is the state of the value of this variable is the state of the cell. Fig. 1(a) shows a ith cell at time step t +1.Ifr = 1, Eq. (4) becomes one-dimensional CA. The CA lattice consists of identical cells, Ct+1 = F Ct ,Ct,Ct ..., i − 3, i − 2, i − 1, i, i +1,i +2,i +3,..., and the correspond- i ( i−1 i i−1) (5) ing states of these cells are C − , C − , C − , C , C , C and i 3 i 2 i 1 i i+1 i+2 Fig. 1(b) shows the evolution of a one-dimensional CA. The C . i+3 horizontal axis is space and the vertical axis is time. Each row The state of the ith cell takes values from a predefined discrete represents the CA at each time step and each column represents set: the state of the same cell at various time steps. DNA may be modelled as a one-dimensional CA; the sugar- Ci ∈{c1,c2,c3,...,cn} (1) phosphate backbone corresponds to the CA lattice where the four where c1, c2, c3, ..., cn are the elements of the set. This set bases A, C, T and G may be attached, each one corresponding may be a set of integers, a set of real numbers, a set of atoms, a to the four possible states of a CA cell. The state of the ith cell set of molecules, or even a set of properties. If this set contains of this CA takes values from the discrete set that comprises the only the two binary numbers, i.e.
Recommended publications
  • Phylogenetic Analysis of Anostracans (Branchiopoda: Anostraca) Inferred from Nuclear 18S Ribosomal DNA (18S Rdna) Sequences
    MOLECULAR PHYLOGENETICS AND EVOLUTION Molecular Phylogenetics and Evolution 25 (2002) 535–544 www.academicpress.com Phylogenetic analysis of anostracans (Branchiopoda: Anostraca) inferred from nuclear 18S ribosomal DNA (18S rDNA) sequences Peter H.H. Weekers,a,* Gopal Murugan,a,1 Jacques R. Vanfleteren,a Denton Belk,b and Henri J. Dumonta a Department of Biology, Ghent University, Ledeganckstraat 35, B-9000 Ghent, Belgium b Biology Department, Our Lady of the Lake University of San Antonio, San Antonio, TX 78207, USA Received 20 February 2001; received in revised form 18 June 2002 Abstract The nuclear small subunit ribosomal DNA (18S rDNA) of 27 anostracans (Branchiopoda: Anostraca) belonging to 14 genera and eight out of nine traditionally recognized families has been sequenced and used for phylogenetic analysis. The 18S rDNA phylogeny shows that the anostracans are monophyletic. The taxa under examination form two clades of subordinal level and eight clades of family level. Two families the Polyartemiidae and Linderiellidae are suppressed and merged with the Chirocephalidae, of which together they form a subfamily. In contrast, the Parartemiinae are removed from the Branchipodidae, raised to family level (Parartemiidae) and cluster as a sister group to the Artemiidae in a clade defined here as the Artemiina (new suborder). A number of morphological traits support this new suborder. The Branchipodidae are separated into two families, the Branchipodidae and Ta- nymastigidae (new family). The relationship between Dendrocephalus and Thamnocephalus requires further study and needs the addition of Branchinella sequences to decide whether the Thamnocephalidae are monophyletic. Surprisingly, Polyartemiella hazeni and Polyartemia forcipata (‘‘Family’’ Polyartemiidae), with 17 and 19 thoracic segments and pairs of trunk limb as opposed to all other anostracans with only 11 pairs, do not cluster but are separated by Linderiella santarosae (‘‘Family’’ Linderiellidae), which has 11 pairs of trunk limbs.
    [Show full text]
  • Investgating Determinants of Phylogeneic Accuracy
    IMPACT OF MOLECULAR EVOLUTIONARY FOOTPRINTS ON PHYLOGENETIC ACCURACY – A SIMULATION STUDY Dissertation Submitted to The College of Arts and Sciences of the UNIVERSITY OF DAYTON In Partial Fulfillment of the Requirements for The Degree Doctor of Philosophy in Biology by Bhakti Dwivedi UNIVERSITY OF DAYTON August, 2009 i APPROVED BY: _________________________ Gadagkar, R. Sudhindra Ph.D. Major Advisor _________________________ Robinson, Jayne Ph.D. Committee Member Chair Department of Biology _________________________ Nielsen, R. Mark Ph.D. Committee Member _________________________ Rowe, J. John Ph.D. Committee Member _________________________ Goldman, Dan Ph.D. Committee Member ii ABSTRACT IMPACT OF MOLECULAR EVOLUTIONARY FOOTPRINTS ON PHYLOGENETIC ACCURACY – A SIMULATION STUDY Dwivedi Bhakti University of Dayton Advisor: Dr. Sudhindra R. Gadagkar An accurately inferred phylogeny is important to the study of molecular evolution. Factors impacting the accuracy of a phylogenetic tree can be traced to several consecutive steps leading to the inference of the phylogeny. In this simulation-based study our focus is on the impact of the certain evolutionary features of the nucleotide sequences themselves in the alignment rather than any source of error during the process of sequence alignment or due to the choice of the method of phylogenetic inference. Nucleotide sequences can be characterized by summary statistics such as sequence length and base composition. When two or more such sequences need to be compared to each other (as in an alignment prior to phylogenetic analysis) additional evolutionary features come into play, such as the overall rate of nucleotide substitution, the ratio of two specific instantaneous, rates of substitution (rate at which transitions and transversions occur), and the shape parameter, of the gamma distribution (that quantifies the extent of iii heterogeneity in substitution rate among sites in an alignment).
    [Show full text]
  • Eidesstattliche Erklärung
    ZENTRUM FÜR BIODIVERSITÄT UND NACHHALTIGE LANDNUTZUNG SEKTION BIODIVERSITÄT, ÖKOLOGIE UND NATURSCHUTZ CENTRE OF BIODIVERSITY AND SUSTAINABLE LAND USE SECTION: BIODIVERSITY, ECOLOGY AND NATURE CONSERVATION Mitochondrial genomes and the complex evolutionary history of the cercopithecine tribe Papionini Dissertation zur Erlangung des Doktorgrades der Mathematisch-Naturwissenschaftlichen Fakultäten der Georg-August-Universität zu Göttingen vorgelegt von Dipl. Biol. Rasmus Liedigk aus Westerstede Göttingen, September 2014 Referent: PD Dr. Christian Roos Korreferent: Prof. Dr. Eckhard Heymann Tag der mündlichen Prüfung: 19.9.2014 Table of content 1 General introduction .............................................................................................. 1 1.1 An introduction to phylogenetics ....................................................................... 1 1.2 Tribe Papionini – subfamily Cercopithecinae ..................................................... 3 1.2.1 Subtribe Papionina.................................................................................... 4 1.2.2 Subtribe Macacina, genus Macaca ........................................................... 5 1.3 Papionin fossils in Europe and Asia................................................................... 7 1.3.1 Fossils of Macaca ..................................................................................... 8 1.3.2 Fossils of Theropithecus ........................................................................... 9 1.4 The mitochondrial genome and its
    [Show full text]
  • Genetics 540 Winter, 2001 Models of DNA Evolution – Part 2 Joe Felsenstein Department of Genetics University of Washington
    Genetics 540 Winter, 2001 Models of DNA evolution { part 2 Joe Felsenstein Department of Genetics University of Washington joe@genetics A model of variation in evolutionary rates among sites The basic idea is that the rate at each site is drawn independently from a distribution of rates. The most widely used choice is the Gamma distribution, which has density function (if its mean is 1): αα rα 1 e α r f(r) = − − Γ(α) Unrealistic aspects of the model: • There is no reason, aside from mathematical convenience, to assume that the Gamma is the right distribution. A common variation is to assume there is a separate probability f0 of having rate 0. • Rates at different sites appear to be correlated, which this model does not allow. • Rates are not constant throughout evolution { they change with time. α = 0.25 cv = 2 α = 11.1111 α = 1 cv = 0.3 frequency cv = 1 0 0.5 1 1.5 2 rate Gamma distributions with mean 1 and different coefficients of variation (standard deviation / mean). α = 1=CV 2 is the \shape parameter" of the Gamma distribution Hidden Markov Models These are the most widely used models allowing rate variation to be correlated along the sequence. We assume: • There are a finite number of rates, m. Rate i is ri. • There are probabilities pi of a site having rate i. • A process not visible to us (\hidden") assigns rates to sites. It is a Markov process working along the sequence. For example it might have transition probability Prob (jji) of changing to rate j in the next site, given that it is at rate i in this site.
    [Show full text]
  • Convergent Evolution of Behavior in an Adaptive Radiation of Hawaiian Web-Building Spiders
    Convergent evolution of behavior in an adaptive radiation of Hawaiian web-building spiders Todd A. Blackledge† and Rosemary G. Gillespie Department of Environmental Science, Policy, and Management, Division of Insect Biology, 201 Wellman Hall, University of California, Berkeley, CA 94720 Communicated by Thomas W. Schoener, University of California, Davis, CA, October 6, 2004 (received for review March 22, 2004) Species in ecologically similar habitats often display patterns of such that differences in web shape can indicate differences in divergence that are strikingly comparable, suggesting that natural how spiders are using resources (17, 19–21). We assess the selection can lead to predictable evolutionary change in commu- relative predictability of behavioral evolution within the adap- nities. However, the relative importance of selection as an agent tive radiation of Hawaiian orb-weaving Tetragnatha by compar- mediating in situ diversification, versus dispersal between habi- ing the web architectures of species, both within islands and tats, cannot be addressed without knowledge of phylogenetic among communities on different islands, and by examining the history. We used an adaptive radiation of spiders within the historical diversification of those behaviors. Hawaiian Islands to test the prediction that species of spiders on different islands would independently evolve webs with similar Materials and Methods architectures. Tetragnatha spiders are the only nocturnal orb- Focal Localities. We studied spiders in focal localities on three weaving spiders endemic to the Hawaiian archipelago, and mul- different islands, each of which consisted of mature mesic to wet tiple species of orb-weaving Tetragnatha co-occur within mesic forest vegetation but varied in age according to island (22): and wet forest habitats on each of the main islands.
    [Show full text]
  • Phylogenetics and Bioinformatics for Evolution [30Pt] Maximum Likelihood
    c N. Salamin Sept 2007 Lecture outline Maximum likelihood in phylogenetics Definition Maximum likelihood Phylogenetics and bioinformatics for evolution and models Likelihood of a tree Computational complexity Statistical properties Maximum parsimony Maximum Likelihood Maximum likelihood Experimental design Hypothesis testing Tree support Tests of topology Tests of models September, 2007 c N. Salamin Sept 2007 Lecture outline Lecture outline Maximum likelihood in phylogenetics 1 Definition Maximum likelihood in phylogenetics Maximum likelihood and models Definition Likelihood of a tree Maximum likelihood and models Computational complexity Likelihood of a tree Statistical properties Computational complexity Maximum parsimony Maximum likelihood Experimental design 2 Statistical properties Hypothesis testing Maximum parsimony Tree support Maximum likelihood Tests of topology Tests of models Experimental design 3 Hypothesis testing Tree support Tests of topology Tests of models c N. Salamin Sept 2007 Lecture outline Lecture outline Maximum likelihood in phylogenetics 1 Definition Maximum likelihood in phylogenetics Maximum likelihood and models Definition Likelihood of a tree Maximum likelihood and models Computational complexity Likelihood of a tree Statistical properties Computational complexity Maximum parsimony Maximum likelihood Experimental design 2 Statistical properties Hypothesis testing Maximum parsimony Tree support Maximum likelihood Tests of topology Tests of models Experimental design 3 Hypothesis testing Tree support Tests of topology Tests of models c N. Salamin Sept 2007 Description Lecture outline Maximum likelihood in phylogenetics Definition Maximum likelihood Given an hypothesis H and some data D, the likelihood of H is and models Likelihood of a tree Computational L(H) = Prob(D H) = Prob(D1 H)Prob(D2 H) Prob(Dn H) complexity | | | · · · | Statistical properties if the D can be split in n independent parts.
    [Show full text]
  • PHYLOGENOMIC CONFLICT in HYLARANA 1 Exons, Introns, And
    bioRxiv preprint doi: https://doi.org/10.1101/765610; this version posted September 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. PHYLOGENOMIC CONFLICT IN HYLARANA 1 Exons, Introns, and UCEs Reveal Conflicting Phylogenomic Signals in a Rapid 2 Radiation of Frogs (Ranidae: Hylarana) 3 4 Kin Onn Chan1,2,*, Carl R. Hutter2, Perry L. Wood, Jr.3, L. Lee Grismer4, Rafe M. 5 Brown2 6 7 1 Lee Kong Chian National History Museum, Faculty of Science, National University of 8 Singapore, 2 Conservatory Drive, Singapore 117377. Email: [email protected] 9 10 2 Biodiversity Institute and Department of Ecology and Evolutionary Biology, University 11 of Kansas, Lawrence, KS 66045, USA. Email: [email protected]; [email protected] 12 13 3 Department of Biological Sciences & Museum of Natural History, Auburn University, 14 Auburn, Alabama 36849, USA. Email: [email protected] 15 16 4 Herpetology Laboratory, Department of Biology, La Sierra University, 4500 Riverwalk 17 Parkway, Riverside, California 92505, USA. Email: [email protected] 18 19 *Corresponding author 20 1 bioRxiv preprint doi: https://doi.org/10.1101/765610; this version posted September 11, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.
    [Show full text]
  • Hierarchical Phylogeny Construction
    Iowa State University Capstones, Theses and Graduate Theses and Dissertations Dissertations 2019 Hierarchical phylogeny construction Anindya Das Iowa State University Follow this and additional works at: https://lib.dr.iastate.edu/etd Part of the Bioinformatics Commons, and the Computer Sciences Commons Recommended Citation Das, Anindya, "Hierarchical phylogeny construction" (2019). Graduate Theses and Dissertations. 17433. https://lib.dr.iastate.edu/etd/17433 This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. Hierarchical phylogeny construction by Anindya Das A dissertation submitted to the graduate faculty in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Major: Computer Science Program of Study Committee: Xiaoqiu Huang, Major Professor David Fernandez-Baca Oliver Eulenstein Peng Liu Dennis V Lavrov The student author, whose presentation of the scholarship herein was approved by the program of study committee, is solely responsible for the content of this dissertation. The Graduate College will ensure this dissertation is globally accessible and will not permit alterations after a degree is conferred. Iowa State University Ames, Iowa 2019 Copyright c Anindya Das, 2019. All rights reserved. ii DEDICATION I would like to dedicate this dissertation to my wife Soma and to my parents without whose support I would not have been able to complete this work. I would also like to thank my friends and family for their loving guidance and continuous encouragement and assistance during the writing of this work.
    [Show full text]
  • Molecular Phylogeny and Historical Biogeography of the Land Snail Genus Solatopupa (Pulmonata) in the Peri-Tyrrhenian Area
    Molecular Phylogenetics and Evolution 39 (2006) 439–451 www.elsevier.com/locate/ympev Molecular phylogeny and historical biogeography of the land snail genus Solatopupa (Pulmonata) in the peri-Tyrrhenian area Valerio Ketmaier a,b,c,¤, Folco Giusti d, Adalgisa Caccone b,e a Unit of Evolutionary Biology/Systematic Zoology, Institute of Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, D-14476 Potsdam, Germany b Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520-8106, USA c Dipartimento di Biologia Animale e dell’Uomo, Università di Roma “La Sapienza,” V.le dell’Università 32, I-00185 Rome, Italy d Dipartimento di Scienze Ambientali “G.Sarfatti,” Università di Siena, V.Mattioli 4, I-53100 Siena, Italy e Yale Institute of Biospheric Studies, Yale University, New Haven, CT 06520-8106, USA Received 8 August 2005; revised 6 December 2005; accepted 8 December 2005 Available online 25 January 2006 Abstract The land snail genus Solatopupa consists of six species and has a peri-Tyrrhenian distribution; most of the species have a very narrow range and all of them except one (Solatopupa cianensis, which inhabits porphyritic rocks) are strictly bound to calcareous substrates. One species (Solatopupa guidoni) is limited to Sardinia, Corsica, and Elba Island. Because the potential for dispersal of these snails is low, the insular range of this species has been traditionally related to the Oligocenic detachment of the Sardinia–Corsica microplate from the Ibe- rian plate and its subsequent rotation towards the Italian peninsula. In this study, we used sequences of three mitochondrial and one nuclear gene to reconstruct the evolutionary history of the genus.
    [Show full text]
  • Deep Diversification and Long-Term Persistence in the South American ‘Dry Diagonal’: Integrating Continent-Wide Phylogeography and Distribution Modeling of Geckos
    ORIGINAL ARTICLE doi:10.1111/j.1558-5646.2012.01682.x DEEP DIVERSIFICATION AND LONG-TERM PERSISTENCE IN THE SOUTH AMERICAN ‘DRY DIAGONAL’: INTEGRATING CONTINENT-WIDE PHYLOGEOGRAPHY AND DISTRIBUTION MODELING OF GECKOS Fernanda P. Werneck,1,2 Tony Gamble,3,4 Guarino R. Colli,5 Miguel T. Rodrigues,6 and Jack W. Sites, Jr1,7 1Department of Biology, Brigham Young University, Provo, Utah 84602 2E-mail: [email protected] 3Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, Minnesota 55445 4Bell Museum of Natural History, University of Minnesota, St. Paul, Minnesota 55108 5Departamento de Zoologia, Universidade de Brasılia,´ 70910–900 Brasılia,´ DF, Brazil 6Departamento de Zoologia, Instituto de Biociencias,ˆ Universidade de Sao˜ Paulo, 05508–090 Sao˜ Paulo, SP, Brazil 7Bean Life Science Museum, Brigham Young University, Provo, Utah 84602 Received January 7, 2012 Accepted April 19, 2012 The relative influence of Neogene geomorphological events and Quaternary climatic changes as causal mechanisms on Neotropical diversification remains largely speculative, as most divergence timing inferences are based on a single locus and have limited taxonomic or geographic sampling. To investigate these influences, we use a multilocus (two mitochondrial and 11 nuclear genes) range-wide sampling of Phyllopezus pollicaris, a gecko complex widely distributed across the poorly studied South American ‘dry diagonal’ biomes. Our approach couples traditional and model-based phylogeography with geospatial methods, and demonstrates
    [Show full text]
  • Mutation Patterns of Mitochondrial H- and L-Strand DNA in Closely Related Cyprinid Fishes
    Copyright 2002 by the Genetics Society of America Mutation Patterns of Mitochondrial H- and L-Strand DNA in Closely Related Cyprinid Fishes Joseph P. Bielawski*,†,1 and John R. Gold* *Center for Biosystematics and Biodiversity, Texas A&M University, College Station, Texas 77843-2258 and †Department of Biology, University College London, London WC1E 6BT, United Kingdom Manuscript received January 14, 2002 Accepted for publication April 19, 2002 ABSTRACT Mitochondrial genome replication is asymmetric. Replication starts from the origin of heavy (H)-strand replication, displacing the parental H-strand as it proceeds along the molecule. The H-strand remains single stranded until light (L)-strand replication is initiated from a second origin of replication. It has been suggested that single-stranded H-strand DNA is more sensitive to mutational damage, giving rise to substitutional rate differences between the two strands and among genes in mammalian mitochondrial DNA. In this study, we analyzed sequences of the cytochrome b, ND4, ND4L, and COI genes of cyprinid fishes to investigate rates and patterns of nucleotide substitution in the mitochondrial genome. To test for strand-asymmetric mutation pressure, a likelihood-ratio test was developed and applied to the cyprinid sequences. Patterns of substitution and levels of strand-asymmetric mutation pressure were largely consistent with a mutation gradient between the H- and L-strand origins of replication. Significant strand bias was observed among rates of transitional substitution. However, biological interpretation of the direction and strength of strand asymmetry for specific classes of substitutions is problematic. The problem occurs because the rate of any single class of substitution inferred from one strand is actually a sum of rates on two strands.
    [Show full text]
  • Fast Distance-Based Phylogenetic Placement
    bioRxiv preprint doi: https://doi.org/10.1101/475566; this version posted November 23, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. APPLES: Fast Distance-based Phylogenetic Placement Metin Balaban,1 Shahab Sarmashghi,2 and Siavash Mirarab2* 1Bioinformatics and Systems Biology Graduate Program, UC San Diego, CA 92093, USA 2Department of Electrical and Computer Engineering, UC San Diego, CA 92093, USA *Corresponding author: [email protected] Abstract Phylogenetic placement consists of adding a query species onto an existing phylogeny and has increasing relevance as sequence datasets continue to grow in size and diversity. Placement is useful for updating existing phylogenies and for identifying samples taxonomically using (meta-)barcoding or metagenomics. Maximum likelihood (ML) methods of phylogenetic placement exist, but these methods are not scalable to trees with many thousands of leaves. They also rely on assembled and aligned sequences for the reference tree and the query and thus cannot analyze unassembled reads used recently in applications such as genome skimming. Here, we introduce APPLES, a distance-based method of phylogenetic placement that improves on ML by more than an order of magnitude in speed and memory and comes very close to ML in accuracy. APPLES has better accuracy than ML for placing on trees with thousands of species and can place on trees with a hundred thousands species where ML cannot run. Finally, APPLES can accurately identify samples without assembled sequences for the reference or the query using k-mer-based distances, a scenario that ML cannot handle.
    [Show full text]