Phylogeny Reconstruction

Total Page:16

File Type:pdf, Size:1020Kb

Phylogeny Reconstruction Phylogeny reconstruction Eva Stukenbrock & Julien Dutheil [email protected] [email protected] Max Planck Institute for Terrestrial Microbiology – Marburg 19 Janvier 2011 Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 1 / 30 Infer the history of taxa Taxonomy Phylogeography Necessary to comparative analysis, as species are not “independent” Why studying phylogenetics? Phylogenetics The study of the evolution of (taxonomic) groups of organisms (e.g. species, populations). Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 2 / 30 Why studying phylogenetics? Phylogenetics The study of the evolution of (taxonomic) groups of organisms (e.g. species, populations). Infer the history of taxa Taxonomy Phylogeography Necessary to comparative analysis, as species are not “independent” Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 2 / 30 Some applications: Resolving orthology and paralogy Datation: estimate divergence times Reconstruct ancestral sequences Exhibit sites under positive selection Predict the structure of a molecule. Phylogenetics applications Phylogenetic tree A graph depicting the ancestor-descendant relationships between organisms or gene sequences. The sequences are the tips of the tree. Branches of the tree connect the tips to their (unobservable) ancestral sequences (Holder 2003) Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 3 / 30 Phylogenetics applications Phylogenetic tree A graph depicting the ancestor-descendant relationships between organisms or gene sequences. The sequences are the tips of the tree. Branches of the tree connect the tips to their (unobservable) ancestral sequences (Holder 2003) Some applications: Resolving orthology and paralogy Datation: estimate divergence times Reconstruct ancestral sequences Exhibit sites under positive selection Predict the structure of a molecule. Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 3 / 30 3 leaves 4 leaves A A C A 1 B C C D B B 2 D A A B B C D C D With branch lengths: With clock: A A B B (2n − 5)! C C 2n−3(n − 2)! D D distinct unrooted trees What is a tree? 2 leaves A B Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 4 / 30 4 leaves A C A 1 B C D B 2 D A A B B C D C D With branch lengths: With clock: A A B B (2n − 5)! C C 2n−3(n − 2)! D D distinct unrooted trees What is a tree? 2 leaves 3 leaves A A B C B Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 4 / 30 A 1 B C D 2 A B C D With branch lengths: With clock: A A B B (2n − 5)! C C 2n−3(n − 2)! D D distinct unrooted trees What is a tree? 2 leaves 3 leaves 4 leaves A A C A B C B B D A B C D Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 4 / 30 A 1 B C D 2 A B C D With branch lengths: With clock: A A B B C C D D What is a tree? 2 leaves 3 leaves 4 leaves A A C A B C B B D A B C D (2n − 5)! 2n−3(n − 2)! distinct unrooted trees Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 4 / 30 2 A B C D With branch lengths: With clock: A A B B C C D D What is a tree? 2 leaves 3 leaves 4 leaves A A C A 1 B A B C C D B B D A B C D (2n − 5)! 2n−3(n − 2)! distinct unrooted trees Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 4 / 30 With branch lengths: With clock: A A B B C C D D What is a tree? 2 leaves 3 leaves 4 leaves A A C A 1 B A B C C D B B 2 D A A B B C D C D (2n − 5)! 2n−3(n − 2)! distinct unrooted trees Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 4 / 30 With clock: A B C D What is a tree? 2 leaves 3 leaves 4 leaves A A C A 1 B A B C C D B B 2 D A A B B C D C D With branch lengths: A B (2n − 5)! C 2n−3(n − 2)! D distinct unrooted trees Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 4 / 30 What is a tree? 2 leaves 3 leaves 4 leaves A A C A 1 B A B C C D B B 2 D A A B B C D C D With branch lengths: With clock: A A B B (2n − 5)! C C 2n−3(n − 2)! D D distinct unrooted trees Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 4 / 30 Taxon 2 Taxon 1 Site Taxon 1 AAGACATGTGGCA Taxon 2 AGGAC-TGTGGCA Taxon 3 AGTAC-TGTGA-A Taxon 3 Taxon 4 AGCAC-TGTG--T Taxon 5 AGCACATGTGA-A Taxon 5 Taxon 4 Aligned homologous positions: sites Each site is a realization of a random variable From sequences to trees Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 5 / 30 Taxon 2 Taxon 1 Site Taxon 3 Taxon 5 Taxon 4 Aligned homologous positions: sites Each site is a realization of a random variable From sequences to trees Taxon 1 AAGACATGTGGCA Taxon 2 AGGAC-TGTGGCA Taxon 3 AGTAC-TGTGA-A Taxon 4 AGCAC-TGTG--T Taxon 5 AGCACATGTGA-A Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 5 / 30 Site Aligned homologous positions: sites Each site is a realization of a random variable From sequences to trees Taxon 2 Taxon 1 Taxon 1 AAGACATGTGGCA Taxon 2 AGGAC-TGTGGCA Taxon 3 AGTAC-TGTGA-A Taxon 3 Taxon 4 AGCAC-TGTG--T Taxon 5 AGCACATGTGA-A Taxon 5 Taxon 4 Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 5 / 30 From sequences to trees Taxon 2 Taxon 1 Site Taxon 1 AAGACATGTGGCA Taxon 2 AGGAC-TGTGGCA Taxon 3 AGTAC-TGTGA-A Taxon 3 Taxon 4 AGCAC-TGTG--T Taxon 5 AGCACATGTGA-A Taxon 5 Taxon 4 Aligned homologous positions: sites Each site is a realization of a random variable Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 5 / 30 The phenetic approach Uses an overall similarity measure (commonly used in morphology) Consider that the similarity is (inversely) correlated to the evolutionary distance Build a tree from a pairwise distance matrix: T1 T2 T3 T4 T5 T1 0 T2 2 0 T3 3.5 4.5 0 T4 8 10 8 0 T5 6 8 9 3 0 Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 6 / 30 (Assumes that the rate of change is constant over time: molecular clock) 2 Gather the corresponding groups 3 Recompute distances from the new group: 1.0 T1 d + d 1.0 d = i;k j;k ij;k 1.0 2 2.125 T2 2.0 T3 1.5 T4 2.625 1.5 T5 The WPGMA clustering method Weighted Pair Group Method using Average T1 T2 T3 T4 T5 1 Pick the smallest distance T1 0 T2 2 0 T3 3.5 4.5 0 T4 8 10 8 0 T5 6 8 9 3 0 Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 7 / 30 (Assumes that the rate of change is constant over time: molecular clock) 1 Pick the smallest distance 3 Recompute distances from the new group: di;k + dj;k 1.0 dij;k = 2 2.125 2.0 T3 1.5 T4 2.625 1.5 T5 The WPGMA clustering method Weighted Pair Group Method using Average T1 T2 T3 T4 T5 T1 0 2 Gather the corresponding T2 2 0 groups T3 3.5 4.5 0 T4 8 10 8 0 T5 6 8 9 3 0 1.0 T1 1.0 T2 Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 7 / 30 1 Pick the smallest distance 2 Gather the corresponding groups 1.0 2.125 2.0 T3 1.5 T4 2.625 1.5 T5 The WPGMA clustering method Weighted Pair Group Method using Average T12 T3 T4 T5 T12 0 T3 4 0 T4 9 8 0 3 Recompute distances from the T5 7 9 3 0 new group: 1.0 T1 di;k + dj;k dij;k = 2 1.0 T2 (Assumes that the rate of change is constant over time: molecular clock) Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 7 / 30 (Assumes that the rate of change is constant over time: molecular clock) 2 Gather the corresponding groups 3 Recompute distances from the new group: di;k + dj;k 1.0 dij;k = 2 2.125 2.0 T3 1.5 T4 2.625 1.5 T5 The WPGMA clustering method Weighted Pair Group Method using Average T12 T3 T4 T5 1 Pick the smallest distance T12 0 T3 4 0 T4 9 8 0 T5 7 9 3 0 1.0 T1 1.0 T2 Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 7 / 30 (Assumes that the rate of change is constant over time: molecular clock) 1 Pick the smallest distance 3 Recompute distances from the new group: di;k + dj;k 1.0 dij;k = 2 2.125 2.0 T3 2.625 The WPGMA clustering method Weighted Pair Group Method using Average T12 T3 T4 T5 T12 0 2 Gather the corresponding T3 4 0 groups T4 9 8 0 T5 7 9 3 0 1.0 T1 1.0 T2 1.5 T4 1.5 T5 Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 7 / 30 1 Pick the smallest distance 2 Gather the corresponding groups 1.0 2.125 (Assumes that the rate of 2.0 T3 change is constant over time: molecular clock) 2.625 The WPGMA clustering method Weighted Pair Group Method using Average T12 T3 T45 T12 0 T3 4 0 T45 8 8.5 0 3 Recompute distances from the new group: 1.0 T1 di;k + dj;k dij;k = 2 1.0 T2 1.5 T4 1.5 T5 Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 7 / 30 (Assumes that the rate of change is constant over time: molecular clock) 2 Gather the corresponding groups 3 Recompute distances from the new group: di;k + dj;k 1.0 dij;k = 2 2.125 2.0 T3 2.625 The WPGMA clustering method Weighted Pair Group Method using Average 1 Pick the smallest distance T12 T3 T45 T12 0 T3 4 0 T45 8 8.5 0 1.0 T1 1.0 T2 1.5 T4 1.5 T5 Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 7 / 30 (Assumes that the rate of change is constant over time: molecular clock) 1 Pick the smallest distance 3 Recompute distances from the new group: di;k + dj;k dij;k = 2 2.125 2.625 The WPGMA clustering method Weighted Pair Group Method using Average T12 T3 T45 2 Gather the corresponding T12 0 groups T3 4 0 T45 8 8.5 0 1.0 T1 1.0 1.0 T2 2.0 T3 1.5 T4 1.5 T5 Stukenbrock EH & Dutheil JY (MPItM) Phylogeny reconstruction 19 Janvier 2011 7 / 30 1 Pick the smallest
Recommended publications
  • Species Concepts Should Not Conflict with Evolutionary History, but Often Do
    ARTICLE IN PRESS Stud. Hist. Phil. Biol. & Biomed. Sci. xxx (2008) xxx–xxx Contents lists available at ScienceDirect Stud. Hist. Phil. Biol. & Biomed. Sci. journal homepage: www.elsevier.com/locate/shpsc Species concepts should not conflict with evolutionary history, but often do Joel D. Velasco Department of Philosophy, University of Wisconsin-Madison, 5185 White Hall, 600 North Park St., Madison, WI 53719, USA Department of Philosophy, Building 90, Stanford University, Stanford, CA 94305, USA article info abstract Keywords: Many phylogenetic systematists have criticized the Biological Species Concept (BSC) because it distorts Biological Species Concept evolutionary history. While defences against this particular criticism have been attempted, I argue that Phylogenetic Species Concept these responses are unsuccessful. In addition, I argue that the source of this problem leads to previously Phylogenetic Trees unappreciated, and deeper, fatal objections. These objections to the BSC also straightforwardly apply to Taxonomy other species concepts that are not defined by genealogical history. What is missing from many previous discussions is the fact that the Tree of Life, which represents phylogenetic history, is independent of our choice of species concept. Some species concepts are consistent with species having unique positions on the Tree while others, including the BSC, are not. Since representing history is of primary importance in evolutionary biology, these problems lead to the conclusion that the BSC, along with many other species concepts, are unacceptable. If species are to be taxa used in phylogenetic inferences, we need a history- based species concept. Ó 2008 Elsevier Ltd. All rights reserved. When citing this paper, please use the full journal title Studies in History and Philosophy of Biological and Biomedical Sciences 1.
    [Show full text]
  • Phylogenetic Comparative Methods: a User's Guide for Paleontologists
    Phylogenetic Comparative Methods: A User’s Guide for Paleontologists Laura C. Soul - Department of Paleobiology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA David F. Wright - Division of Paleontology, American Museum of Natural History, Central Park West at 79th Street, New York, New York 10024, USA and Department of Paleobiology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA Abstract. Recent advances in statistical approaches called Phylogenetic Comparative Methods (PCMs) have provided paleontologists with a powerful set of analytical tools for investigating evolutionary tempo and mode in fossil lineages. However, attempts to integrate PCMs with fossil data often present workers with practical challenges or unfamiliar literature. In this paper, we present guides to the theory behind, and application of, PCMs with fossil taxa. Based on an empirical dataset of Paleozoic crinoids, we present example analyses to illustrate common applications of PCMs to fossil data, including investigating patterns of correlated trait evolution, and macroevolutionary models of morphological change. We emphasize the importance of accounting for sources of uncertainty, and discuss how to evaluate model fit and adequacy. Finally, we discuss several promising methods for modelling heterogenous evolutionary dynamics with fossil phylogenies. Integrating phylogeny-based approaches with the fossil record provides a rigorous, quantitative perspective to understanding key patterns in the history of life. 1. Introduction A fundamental prediction of biological evolution is that a species will most commonly share many characteristics with lineages from which it has recently diverged, and fewer characteristics with lineages from which it diverged further in the past. This principle, which results from descent with modification, is one of the most basic in biology (Darwin 1859).
    [Show full text]
  • Lineages, Splits and Divergence Challenge Whether the Terms Anagenesis and Cladogenesis Are Necessary
    Biological Journal of the Linnean Society, 2015, , – . With 2 figures. Lineages, splits and divergence challenge whether the terms anagenesis and cladogenesis are necessary FELIX VAUX*, STEVEN A. TREWICK and MARY MORGAN-RICHARDS Ecology Group, Institute of Agriculture and Environment, Massey University, Palmerston North, New Zealand Received 3 June 2015; revised 22 July 2015; accepted for publication 22 July 2015 Using the framework of evolutionary lineages to separate the process of evolution and classification of species, we observe that ‘anagenesis’ and ‘cladogenesis’ are unnecessary terms. The terms have changed significantly in meaning over time, and current usage is inconsistent and vague across many different disciplines. The most popular definition of cladogenesis is the splitting of evolutionary lineages (cessation of gene flow), whereas anagenesis is evolutionary change between splits. Cladogenesis (and lineage-splitting) is also regularly made synonymous with speciation. This definition is misleading as lineage-splitting is prolific during evolution and because palaeontological studies provide no direct estimate of gene flow. The terms also fail to incorporate speciation without being arbitrary or relative, and the focus upon lineage-splitting ignores the importance of divergence, hybridization, extinction and informative value (i.e. what is helpful to describe as a taxon) for species classification. We conclude and demonstrate that evolution and species diversity can be considered with greater clarity using simpler, more transparent terms than anagenesis and cladogenesis. Describing evolution and taxonomic classification can be straightforward, and there is no need to ‘make words mean so many different things’. © 2015 The Linnean Society of London, Biological Journal of the Linnean Society, 2015, 00, 000–000.
    [Show full text]
  • Comparative Phylogeography As an Integrative Approach to Historical Biogeography
    Journal of Biogeography, 28, 819±825 Comparative phylogeography as an integrative approach to historical biogeography ABSTRACT GUEST EDITORIAL Phylogeography has become a powerful approach for elucidating contemporary geographical patterns of evolutionary subdivision within species and species complexes. A recent extension of this approach is the comparison of phylogeographic patterns of multiple co-distributed taxonomic groups, or `comparative phylogeography.' Recent comparative phylogeographic studies have revealed pervasive and previously unrecognized biogeographic patterns which suggest that vicariance has played a more important role in the historical development of modern biotic assemblages than current taxonomy would indicate. Despite the utility of comparative phylogeography for uncovering such `cryptic vicariance', this approach has yet to be embraced by some researchers as a valuable complement to other approaches to historical biogeography. We address here some of the common misconceptions surrounding comparative phylogeography, provide an example of this approach based on the boreal mammal fauna of North America, and argue that together with other approaches, comparative phylogeography can contribute importantly to our understanding of the relationship between earth history and biotic diversi®cation. Keywords Area cladistics, comparative phylogeography, historical biogeography, vicariance. INTRODUCTION In a recent guest editorial Humphries (2000) presented an overview of historical biogeography. He concentrated largely on comparisons
    [Show full text]
  • Statistical Inference for the Linguistic and Non-Linguistic Past
    Statistical inference for the linguistic and non-linguistic past Igor Yanovich DFG Center for Advanced Study “Words, Bones, Genes and Tools” Universität Tübingen July 6, 2017 Igor Yanovich 1 / 46 Overview of the course Overview of the course 1 Today: trees, as a description and as a process 2 Classes 2-3: simple inference of language-family trees 3 Classes 4-5: computational statistical inference of trees and evolutionary parameters 4 Class 6: histories of languages and of genes 5 Class 7: simple spatial statistics 6 Class 8: regression taking into account linguistic relationships; synthesis of the course Igor Yanovich 2 / 46 Overview of the course Learning outcomes By the end of the course, you should be able to: 1 read and engage with the current literature in linguistic phylogenetics and in spatial statistics for linguistics 2 run phylogenetic and basic spatial analyses on linguistic data 3 proceed further in the subject matter on your own, towards further advances in the field Igor Yanovich 3 / 46 Overview of the course Today’s class 1 Overview of the course 2 Language families and their structures 3 Trees as classifications and as process depictions 4 Linguistic phylogenetics 5 Worries about phylogenetics in linguistics vs. biology 6 Quick overview of the homework 7 Summary of Class 1 Igor Yanovich 4 / 46 Language families and their structures Language families and their structures Igor Yanovich 5 / 46 Language families and their structures Dravidian A modification of [Krishnamurti, 2003, Map 1.1], from Wikipedia Igor Yanovich 6 / 46 Language families and their structures Dravidian [Krishnamurti, 2003]’s classification: Dravidian family South Dravidian Central Dravidian North Dravidian SD I SD II Tamil Malay¯al.am Kannad.a Telugu Kolami Brahui (70M) (38M) (40M) (75M) (0.1M) (4M) (Numbers of speakers from Wikipedia) Igor Yanovich 7 / 46 Language families and their structures Dravidian: similarities and differences Proto-Dr.
    [Show full text]
  • University of Florida Thesis Or Dissertation Formatting
    TOOLS FOR BIODIVERSITY ANALYSES USING NATURAL HISTORY COLLECTIONS AND REPOSITORIES: DATA MINING, MACHINE LEARNING AND PHYLODIVERSITY By CHANDRA EARL A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 2020 1 . © 2020 Chandra Earl 2 . ACKNOWLEDGMENTS I thank my co-chairs and members of my supervisory committee for their mentoring and generous support, my collaborators and colleagues for their input and support and my parents and siblings for their loving encouragement and interest. 3 . TABLE OF CONTENTS page ACKNOWLEDGMENTS .................................................................................................. 3 LIST OF TABLES ............................................................................................................ 5 LIST OF FIGURES .......................................................................................................... 6 ABSTRACT ..................................................................................................................... 8 CHAPTER 1 INTRODUCTION ...................................................................................................... 9 2 GENEDUMPER: A TOOL TO BUILD MEGAPHYLOGENIES FROM GENBANK DATA ...................................................................................................................... 12 Materials and Methods...........................................................................................
    [Show full text]
  • Resolving Postglacial Phylogeography Using High-Throughput Sequencing
    Resolving postglacial phylogeography using high-throughput sequencing Kevin J. Emerson1, Clayton R. Merz, Julian M. Catchen, Paul A. Hohenlohe, William A. Cresko, William E. Bradshaw, and Christina M. Holzapfel Center for Ecology and Evolutionary Biology, University of Oregon, Eugene, OR 97403-5289 Edited by David L. Denlinger, Ohio State University, Columbus, OH, and approved August 4, 2010 (received for review May 11, 2010) The distinction between model and nonmodel organisms is becom- not useful for determining the genetic similarity in closely related ing increasingly blurred. High-throughput, second-generation se- populations of nonmodel species or species for which whole-genome quencing approaches are being applied to organisms based on their resequencing is not yet possible. interesting ecological, physiological, developmental, or evolutionary Phylogeography and phylogenetics have recently benefited from properties and not on the depth of genetic information available for genome-wide SNP detection methods to elucidate patterns of fi them. Here, we illustrate this point using a low-cost, ef cient tech- variation in model taxa or their close relatives (7, 13). The limiting fi nique to determine the ne-scale phylogenetic relationships among step in the applicability of multilocus datasets in nonmodel organ- recently diverged populations in a species. This application of restric- isms has been generating the genetic markers to be used (14). tion site-associated DNA tags (RAD tags) reveals previously unre- solved genetic structure and direction of evolution in the pitcher Baird et al. (4) developed a second generation sequencing ap- plant mosquito, Wyeomyia smithii, from a southern Appalachian proach that allowed for the simultaneous discovery and typing of Mountain refugium following recession of the Laurentide Ice Sheet thousands of SNPs throughout the genome (5).
    [Show full text]
  • Learning the “Game” of Life: Reconstruction of Cell Lineages Through Crispr-Induced Dna Mutations
    LEARNING THE “GAME” OF LIFE: RECONSTRUCTION OF CELL LINEAGES THROUGH CRISPR-INDUCED DNA MUTATIONS JIAXIAO CAI, DA KUANG, ARUN KIRUBARAJAN, MUKUND VENKATESWARAN ABSTRACT. Multi-cellular organisms are composed of many billions of individuals cells that exist by the mutation of a single stem cell. The CRISPR-based molecular-tools induce DNA mutations, which allows us to study the mutation lineages of various multi-ceulluar organisms. However, to date no lineage reconstruction algorithms have been examined for their performance/robustness across various molecular tools, datasets and sizes of lineage trees. In addition, it is unclear whether classical machine learning algorithms, deep neural networks or some combination of the two are the best approach for this task. In this paper, we introduce 1) a simulation framework for the zygote development process to achieve a dataset size required by deep learning models, and 2) various supervised and unsupervised approaches for cell mutation tree reconstruction. In particular, we show how deep generative models (Autoencoders and Variational Autoencoders), unsupervised clustering methods (K-means), and classical tree reconstruction algorithms (UPGMA) can all be used to trace cell mutation lineages. 1. INTRODUCTION Multi-cellular organisms are composed of billions or trillions of different interconnected cells that derive from a single cell through repeated rounds of cell division — knowing the cell lineage that produces a fully developed organism from a single cell provides the framework for understanding when, where, and how cell fate decisions are made. The recent advent of new CRISPR-based molecular tools synthetically induced DNA mutations and made it possible to solve lineage of complex model organisms at single-cell resolution.[1] Starting in early embryogenesis, CRISPR-induced mutations occur stochastically at introduced target sites, and these mutations are stably inherited by the offspring of these cells and immune to further change.
    [Show full text]
  • Math/C SC 5610 Computational Biology Lecture 12: Phylogenetics
    Math/C SC 5610 Computational Biology Lecture 12: Phylogenetics Stephen Billups University of Colorado at Denver Math/C SC 5610Computational Biology – p.1/25 Announcements Project Guidelines and Ideas are posted. (proposal due March 8) CCB Seminar, Friday (Mar. 4) Speaker: Jack Horner, SAIC Title: Phylogenetic Methods for Characterizing the Signature of Stage I Ovarian Cancer in Serum Protein Mas Time: 11-12 (Followed by lunch) Place: Media Center, AU008 Math/C SC 5610Computational Biology – p.2/25 Outline Distance based methods for phylogenetics UPGMA WPGMA Neighbor-Joining Character based methods Maximum Likelihood Maximum Parsimony Math/C SC 5610Computational Biology – p.3/25 Review: Distance Based Clustering Methods Main Idea: Requires a distance matrix D, (defining distances between each pair of elements). Repeatedly group together closest elements. Different algorithms differ by how they treat distances between groups. UPGMA (unweighted pair group method with arithmetic mean). WPGMA (weighted pair group method with arithmetic mean). Math/C SC 5610Computational Biology – p.4/25 UPGMA 1. Initialize C to the n singleton clusters f1g; : : : ; fng. 2. Initialize dist(c; d) on C by defining dist(fig; fjg) = D(i; j): 3. Repeat n ¡ 1 times: (a) determine pair c; d of clusters in C such that dist(c; d) is minimal; define dmin = dist(c; d). (b) define new cluster e = c S d; update C = C ¡ fc; dg Sfeg. (c) define a node with label e and daughters c; d, where e has distance dmin=2 to its leaves. (d) define for all f 2 C with f 6= e, dist(c; f) + dist(d; f) dist(e; f) = dist(f; e) = : (avg.
    [Show full text]
  • S41598-021-95872-0.Pdf
    www.nature.com/scientificreports OPEN Phylogeography, colouration, and cryptic speciation across the Indo‑Pacifc in the sea urchin genus Echinothrix Simon E. Coppard1,2*, Holly Jessop1 & Harilaos A. Lessios1 The sea urchins Echinothrix calamaris and Echinothrix diadema have sympatric distributions throughout the Indo‑Pacifc. Diverse colour variation is reported in both species. To reconstruct the phylogeny of the genus and assess gene fow across the Indo‑Pacifc we sequenced mitochondrial 16S rDNA, ATPase‑6, and ATPase‑8, and nuclear 28S rDNA and the Calpain‑7 intron. Our analyses revealed that E. diadema formed a single trans‑Indo‑Pacifc clade, but E. calamaris contained three discrete clades. One clade was endemic to the Red Sea and the Gulf of Oman. A second clade occurred from Malaysia in the West to Moorea in the East. A third clade of E. calamaris was distributed across the entire Indo‑Pacifc biogeographic region. A fossil calibrated phylogeny revealed that the ancestor of E. diadema diverged from the ancestor of E. calamaris ~ 16.8 million years ago (Ma), and that the ancestor of the trans‑Indo‑Pacifc clade and Red Sea and Gulf of Oman clade split from the western and central Pacifc clade ~ 9.8 Ma. Time since divergence and genetic distances suggested species level diferentiation among clades of E. calamaris. Colour variation was extensive in E. calamaris, but not clade or locality specifc. There was little colour polymorphism in E. diadema. Interpreting phylogeographic patterns of marine species and understanding levels of connectivity among popula- tions across the World’s oceans is of increasing importance for informed conservation decisions 1–3.
    [Show full text]
  • Understanding the Processes Underpinning Patterns Of
    Opinion Understanding the Processes Underpinning Patterns of Phylogenetic Regionalization 1,2, 3,4 1 Barnabas H. Daru, * Tammy L. Elliott, Daniel S. Park, and 5,6 T. Jonathan Davies A key step in understanding the distribution of biodiversity is the grouping of regions based on their shared elements. Historically, regionalization schemes have been largely species centric. Recently, there has been interest in incor- porating phylogenetic information into regionalization schemes. Phylogenetic regionalization can provide novel insights into the mechanisms that generate, distribute, and maintain biodiversity. We argue that four processes (dispersal limitation, extinction, speciation, and niche conservatism) underlie the forma- tion of species assemblages into phylogenetically distinct biogeographic units. We outline how it can be possible to distinguish among these processes, and identify centers of evolutionary radiation, museums of diversity, and extinction hotspots. We suggest that phylogenetic regionalization provides a rigorous and objective classification of regional diversity and enhances our knowledge of biodiversity patterns. Biogeographical Regionalization in a Phylogenetic Era 1 Department of Organismic and Biogeographical boundaries delineate the basic macrounits of diversity in biogeography, Evolutionary Biology, Harvard conservation, and macroecology. Their location and composition of species on either side of University, Cambridge, MA 02138, these boundaries can reflect the historical processes that have shaped the present-day USA th 2 Department of Plant Sciences, distribution of biodiversity [1]. During the early 19 century, de Candolle [2] created one of the University of Pretoria, Private Bag first global geographic regionalization schemes for plant diversity based on both ecological X20, Hatfield 0028, Pretoria, South and historical information. This was followed by Sclater [3], who defined six zoological regions Africa 3 Department of Biological Sciences, based on the global distribution of birds.
    [Show full text]
  • Phylogenetics
    Phylogenetics What is phylogenetics? • Study of branching patterns of descent among lineages • Lineages – Populations – Species – Molecules • Shift between population genetics and phylogenetics is often the species boundary – Distantly related populations also show patterning – Patterning across geography What is phylogenetics? • Goal: Determine and describe the evolutionary relationships among lineages – Order of events – Timing of events • Visualization: Phylogenetic trees – Graph – No cycles Phylogenetic trees • Nodes – Terminal – Internal – Degree • Branches • Topology Phylogenetic trees • Rooted or unrooted – Rooted: Precisely 1 internal node of degree 2 • Node that represents the common ancestor of all taxa – Unrooted: All internal nodes with degree 3+ Stephan Steigele Phylogenetic trees • Rooted or unrooted – Rooted: Precisely 1 internal node of degree 2 • Node that represents the common ancestor of all taxa – Unrooted: All internal nodes with degree 3+ Phylogenetic trees • Rooted or unrooted – Rooted: Precisely 1 internal node of degree 2 • Node that represents the common ancestor of all taxa – Unrooted: All internal nodes with degree 3+ • Binary: all speciation events produce two lineages from one • Cladogram: Topology only • Phylogram: Topology with edge lengths representing time or distance • Ultrametric: Rooted tree with time-based edge lengths (all leaves equidistant from root) Phylogenetic trees • Clade: Group of ancestral and descendant lineages • Monophyly: All of the descendants of a unique common ancestor • Polyphyly:
    [Show full text]