Introduction to Phylogenetics Workshop on Molecular Evolution 2018 Marine Biological Lab, Woods Hole, MA
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Lecture Notes: the Mathematics of Phylogenetics
Lecture Notes: The Mathematics of Phylogenetics Elizabeth S. Allman, John A. Rhodes IAS/Park City Mathematics Institute June-July, 2005 University of Alaska Fairbanks Spring 2009, 2012, 2016 c 2005, Elizabeth S. Allman and John A. Rhodes ii Contents 1 Sequences and Molecular Evolution 3 1.1 DNA structure . .4 1.2 Mutations . .5 1.3 Aligned Orthologous Sequences . .7 2 Combinatorics of Trees I 9 2.1 Graphs and Trees . .9 2.2 Counting Binary Trees . 14 2.3 Metric Trees . 15 2.4 Ultrametric Trees and Molecular Clocks . 17 2.5 Rooting Trees with Outgroups . 18 2.6 Newick Notation . 19 2.7 Exercises . 20 3 Parsimony 25 3.1 The Parsimony Criterion . 25 3.2 The Fitch-Hartigan Algorithm . 28 3.3 Informative Characters . 33 3.4 Complexity . 35 3.5 Weighted Parsimony . 36 3.6 Recovering Minimal Extensions . 38 3.7 Further Issues . 39 3.8 Exercises . 40 4 Combinatorics of Trees II 45 4.1 Splits and Clades . 45 4.2 Refinements and Consensus Trees . 49 4.3 Quartets . 52 4.4 Supertrees . 53 4.5 Final Comments . 54 4.6 Exercises . 55 iii iv CONTENTS 5 Distance Methods 57 5.1 Dissimilarity Measures . 57 5.2 An Algorithmic Construction: UPGMA . 60 5.3 Unequal Branch Lengths . 62 5.4 The Four-point Condition . 66 5.5 The Neighbor Joining Algorithm . 70 5.6 Additional Comments . 72 5.7 Exercises . 73 6 Probabilistic Models of DNA Mutation 81 6.1 A first example . 81 6.2 Markov Models on Trees . 87 6.3 Jukes-Cantor and Kimura Models . -
Phylogeny Codon Models • Last Lecture: Poor Man’S Way of Calculating Dn/Ds (Ka/Ks) • Tabulate Synonymous/Non-Synonymous Substitutions • Normalize by the Possibilities
Phylogeny Codon models • Last lecture: poor man’s way of calculating dN/dS (Ka/Ks) • Tabulate synonymous/non-synonymous substitutions • Normalize by the possibilities • Transform to genetic distance KJC or Kk2p • In reality we use codon model • Amino acid substitution rates meet nucleotide models • Codon(nucleotide triplet) Codon model parameterization Stop codons are not allowed, reducing the matrix from 64x64 to 61x61 The entire codon matrix can be parameterized using: κ kappa, the transition/transversionratio ω omega, the dN/dS ratio – optimizing this parameter gives the an estimate of selection force πj the equilibrium codon frequency of codon j (Goldman and Yang. MBE 1994) Empirical codon substitution matrix Observations: Instantaneous rates of double nucleotide changes seem to be non-zero There should be a mechanism for mutating 2 adjacent nucleotides at once! (Kosiol and Goldman) • • Phylogeny • • Last lecture: Inferring distance from Phylogenetic trees given an alignment How to infer trees and distance distance How do we infer trees given an alignment • • Branch length Topology d 6-p E 6'B o F P Edo 3 vvi"oH!.- !fi*+nYolF r66HiH- .) Od-:oXP m a^--'*A ]9; E F: i ts X o Q I E itl Fl xo_-+,<Po r! UoaQrj*l.AP-^PA NJ o - +p-5 H .lXei:i'tH 'i,x+<ox;+x"'o 4 + = '" I = 9o FF^' ^X i! .poxHo dF*x€;. lqEgrE x< f <QrDGYa u5l =.ID * c 3 < 6+6_ y+ltl+5<->-^Hry ni F.O+O* E 3E E-f e= FaFO;o E rH y hl o < H ! E Y P /-)^\-B 91 X-6p-a' 6J. -
A Phylogenetic Analysis of the Basal Ornithischia (Reptilia, Dinosauria)
A PHYLOGENETIC ANALYSIS OF THE BASAL ORNITHISCHIA (REPTILIA, DINOSAURIA) Marc Richard Spencer A Thesis Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements of the degree of MASTER OF SCIENCE December 2007 Committee: Margaret M. Yacobucci, Advisor Don C. Steinker Daniel M. Pavuk © 2007 Marc Richard Spencer All Rights Reserved iii ABSTRACT Margaret M. Yacobucci, Advisor The placement of Lesothosaurus diagnosticus and the Heterodontosauridae within the Ornithischia has been problematic. Historically, Lesothosaurus has been regarded as a basal ornithischian dinosaur, the sister taxon to the Genasauria. Recent phylogenetic analyses, however, have placed Lesothosaurus as a more derived ornithischian within the Genasauria. The Fabrosauridae, of which Lesothosaurus was considered a member, has never been phylogenetically corroborated and has been considered a paraphyletic assemblage. Prior to recent phylogenetic analyses, the problematic Heterodontosauridae was placed within the Ornithopoda as the sister taxon to the Euornithopoda. The heterodontosaurids have also been considered as the basal member of the Cerapoda (Ornithopoda + Marginocephalia), the sister taxon to the Marginocephalia, and as the sister taxon to the Genasauria. To reevaluate the placement of these taxa, along with other basal ornithischians and more derived subclades, a phylogenetic analysis of 19 taxonomic units, including two outgroup taxa, was performed. Analysis of 97 characters and their associated character states culled, modified, and/or rescored from published literature based on published descriptions, produced four most parsimonious trees. Consistency and retention indices were calculated and a bootstrap analysis was performed to determine the relative support for the resultant phylogeny. The Ornithischia was recovered with Pisanosaurus as its basalmost member. -
Phylogenetic Analysis
Phylogenetic Analysis Aristotle • Through classification, one might discover the essence and purpose of species. Nelson & Platnick (1981) Systematics and Biogeography Carl Linnaeus • Swedish botanist (1700s) • Listed all known species • Developed scheme of classification to discover the plan of the Creator 1 Linnaeus’ Main Contributions 1) Hierarchical classification scheme Kingdom: Phylum: Class: Order: Family: Genus: Species 2) Binomial nomenclature Before Linnaeus physalis amno ramosissime ramis angulosis glabris foliis dentoserratis After Linnaeus Physalis angulata (aka Cutleaf groundcherry) 3) Originated the practice of using the ♂ - (shield and arrow) Mars and ♀ - (hand mirror) Venus glyphs as the symbol for male and female. Charles Darwin • Species evolved from common ancestors. • Concept of closely related species being more recently diverged from a common ancestor. Therefore taxonomy might actually represent phylogeny! The phylogeny and classification of life a proposed by Haeckel (1866). 2 Trees - Rooted and Unrooted 3 Trees - Rooted and Unrooted ABCDEFGHIJ A BCDEH I J F G ROOT ROOT D E ROOT A F B H J G C I 4 Monophyletic: A group composed of a collection of organisms, including the most recent common ancestor of all those organisms and all the descendants of that most recent common ancestor. A monophyletic taxon is also called a clade. Paraphyletic: A group composed of a collection of organisms, including the most recent common ancestor of all those organisms. Unlike a monophyletic group, a paraphyletic group does not include all the descendants of the most recent common ancestor. Polyphyletic: A group composed of a collection of organisms in which the most recent common ancestor of all the included organisms is not included, usually because the common ancestor lacks the characteristics of the group. -
Phylogenetic Systematics Using POY Final Final.Book
Dynamic Homology and Phylogenetic Systematics: A Unified Approach Using POY Ward Wheeler Lone Aagesen Division of Invertebrate Zoology Division of Invertebrate Zoology American Museum of Natural History American Museum of Natural History Claudia P. Arango Julián Faivovich Division of Invertebrate Zoology Division of Vertebrate Zoology American Museum of Natural History American Museum of Natural History Taran Grant Cyrille D’Haese Division of Vertebrate Zoology Département Systématique et Evolution American Museum of Natural History Museum National d'Histoire Naturelle Daniel Janies William Leo Smith Department of Biomedical Informatics Division of Vertebrate Zoology The Ohio State University American Museum of Natural History Andrés Varón Gonzalo Giribet Department of Computer Science Museum of Comparative Zoology City University of New York Department of Organismic & Evolutionary Biology Harvard University Published in cooperation with NASA Fundamental Space Biology, the U.S. Army Research Laboratory, and the U.S. Army Research Office Copyright 2006 by the American Museum of Natural History All rights reserved Printed in the United States of America Library of Congress Cataloging-in-Publication Data Dynamic homology and phylogenetic systematics : a unified approach using POY / Ward Wheeler ... [et al.]. p. cm. "Published in cooperation with NASA Fundamental Space Biology." Includes bibliographical references. ISBN 0-913424-58-7 (alk. paper) 1. Homology (Biology) 2. Animals--Classification. 3. Phylogeny. 4. POY. I. Wheeler, Ward. QH367.5.D96 -
Taxonomy and Classification Goals: Un Ders Tan D Traditi Onal and Hi Erarchi Cal Cl Assifi Cati Ons of Biodiversity, and What Information Classifications May Contain
Taxonomy and classification Goals: Un ders tan d tra ditional and hi erarchi cal cl assifi cati ons of biodiversity, and what information classifications may contain. Readings: 1. Chapter 1. Figure 1-1 from Pough et al. Taxonomy and classification (cont ’d) Some new words This is a cladogram. Each branching that are very poiiint is a nod dEhbhe. Each branch, starti ng important: at the node, is a clade. 9 Cladogram 9 Clade 9 Synapomorphy (Shared, derived character) 9 Monophyly; monophyletic 9 PhlParaphyly; parap hlihyletic 9 Polyphyly; polyphyletic Definitions of cladogram on the Web: A dichotomous phylogenetic tree that branches repeatedly, suggesting the classification of molecules or org anisms based on the time sequence in which evolutionary branches arise. xray.bmc.uu.se/~kenth/bioinfo/glossary.html A tree that depicts inferred historical branching relationships among entities. Unless otherwise stated, the depicted branch lengt hs in a cl ad ogram are arbi trary; onl y th e b ranchi ng ord er is significant. See phylogram. www.bcu.ubc.ca/~otto/EvolDisc/Glossary.html TAKE-HOME MESSAGE: Cladograms tell us about the his tory of the re lati onshi ps of organi sms. K ey word : Hi st ory. Historically, classification of organisms was mainlyypg a bookkeeping task. For this monumental job, Carrolus Linnaeus invented the s ystem of binomial nomenclature that we are all familiar with. (Did you know that his name was Carol Linne? He liidhilatinized his own name th e way h e named speci i!)es!) Merely giving species names and arranging them according to similar groups was acceptable while we thought species were static entities . -
Geo 302D: Age of Dinosaurs LAB 4: Systematics
Geo 302D: Age of Dinosaurs LAB 4: Systematics Systematics is the comparative study of biological diversity, with the intent of determining the relationships between organisms. Humankind has always tried to find ways of organizing the creatures that surround us into categories or classes. Linnaeus was the first to utilize a working system of hierarchical classification, in 1758. It is his classification scheme which most of you are familiar with, as it is still taught in its basic form in grade schools. The system is based upon the organization of life forms into groups based upon their overall similarity. In this course we use phylogenetic systematics, also called cladistics. This technique is used by most professional biologists, zoologists, and paleontologists. In this system, organisms are grouped together on the basis of shared ancestry. A result of using this system is that the ranks (e.g. Kingdom, Phylum, Class, Order, etc.) which many of you were forced to learn in previous science classes are impractical and do not necessarily reflect evolutionary relationships between organisms. Therefore, they are not used in cladistic methodology. Cladograms Cladistics uses branching diagrams called cladograms (or trees) to visually display the hypothesized relationships between taxa (a taxon is any unit of biological diversity; taxa is the plural form of the word). Look at the cladogram below. Relative time runs vertically. A, B, C, and D represent different taxa. They are out at the terminal tips of branches on the tree, so they are called terminal taxa. The points on the tree where branches meet are called nodes. A node represents the point of divergence between evolutionary lineages. -
Heterotachy and Long-Branch Attraction in Phylogenetics. Hervé Philippe, Yan Zhou, Henner Brinkmann, Nicolas Rodrigue, Frédéric Delsuc
Heterotachy and long-branch attraction in phylogenetics. Hervé Philippe, Yan Zhou, Henner Brinkmann, Nicolas Rodrigue, Frédéric Delsuc To cite this version: Hervé Philippe, Yan Zhou, Henner Brinkmann, Nicolas Rodrigue, Frédéric Delsuc. Heterotachy and long-branch attraction in phylogenetics.. BMC Evolutionary Biology, BioMed Central, 2005, 5, pp.50. 10.1186/1471-2148-5-50. halsde-00193044 HAL Id: halsde-00193044 https://hal.archives-ouvertes.fr/halsde-00193044 Submitted on 30 Nov 2007 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. BMC Evolutionary Biology BioMed Central Research article Open Access Heterotachy and long-branch attraction in phylogenetics Hervé Philippe*1, Yan Zhou1, Henner Brinkmann1, Nicolas Rodrigue1 and Frédéric Delsuc1,2 Address: 1Canadian Institute for Advanced Research, Centre Robert-Cedergren, Département de Biochimie, Université de Montréal, Succursale Centre-Ville, Montréal, Québec H3C3J7, Canada and 2Laboratoire de Paléontologie, Phylogénie et Paléobiologie, Institut des Sciences de l'Evolution, UMR 5554-CNRS, Université -
Phylogeny of Angiosperms
Unit 8: Phylogeny of Angiosperms • Terms and concepts ❖ primitive and advanced ❖ homology and analogy ❖ parallelism and convergence ❖ monophyly, Paraphyly, polyphyly ❖ clades • origin& evolution of angiosperms; • co-evolution of angiosperms and animals; • methods of illustrating evolutionary relationship (phylogenetic tree, cladogram). TAXONOMY & SYSTEMATICS • Nomenclature = the naming of organisms • Classification = the assignment of taxa to groups of organisms • Phylogeny = Evolutionary history of a group (Evolutionary patterns & relationships among organisms) Taxonomy = Nomenclature + Classification Systematics = Taxonomy + Phylogenetics Phylogeny-Terms • Phylogeny- the evolutionary history of a group of organisms/ study of the genealogy and evolutionary history of a taxonomic group. • Genealogy- study of ancestral relationships and lineages. • Lineage- A continuous line of descent; a series of organisms or genes connected by ancestor/ descendent relationships. • Relationships are depicted through a diagram better known as a phylogram Evolution • Changes in the genetic makeup of populations- evolution, may occur in lineages over time. • Descent with modification • Evolution may be recognized as a change from a pre-existing or ancestral character state (plesiomorphic) to a new character state, derived character state (apomorphy). • 2 mechanisms of evolutionary change- 1. Natural selection – non-random, directed by survival of the fittest and reproductive ability-through Adaptation 2. Genetic Drift- random, directed by chance events -
Phylogenetic Analysis Aristotle • Through Classification, One Might Discover the Essence and Purpose of Species
Phylogenetic Analysis Aristotle • Through classification, one might discover the essence and purpose of species. Nelson & Platnick (1981) Systematics and Biogeography Carl Linnaeus • Swedish botanist (1700s) • Listed all known species • Developed scheme of classification to discover the plan of the Creator Linnaeus’ Main Contributions 1) Hierarchical classification scheme Kingdom: Phylum: Class: Order: Family: Genus: Species 2) Binomial nomenclature Before Linnaeus physalis amno ramosissime ramis angulosis glabris foliis dentoserratis After Linnaeus Physalis angulata (aka Cutleaf groundcherry) 3) Originated the practice of using the ♂ - (shield and arrow) Mars and ♀ - (hand mirror) Venus glyphs as the symbol for male and female. Charles Darwin • Species evolved from common ancestors. • Concept of closely related species being more recently diverged from a common ancestor. Therefore taxonomy might actually represent phylogeny! The phylogeny and classification of life a proposed by Haeckel (1866). Trees - Rooted and Unrooted Trees - Rooted and Unrooted ABCDEFGHIJ A BCDEH I J F G ROOT ROOT D E ROOT A F B H J G C I Monophyletic: A group composed of a collection of organisms, including the most recent common ancestor of all those organisms and all the descendants of that most recent common ancestor. A monophyletic taxon is also called a clade. Paraphyletic: A group composed of a collection of organisms, including the most recent common ancestor of all those organisms. Unlike a monophyletic group, a paraphyletic group does not include all the descendants of the most recent common ancestor. Polyphyletic: A group composed of a collection of organisms in which the most recent common ancestor of all the included •right organisms is not included, usually because the •left common ancestor lacks the characteristics of the group. -
C3020 – Molecular Evolution Exercises #3: Phylogenetics
C3020 – Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 1-5 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from the earlier version to make computation easier) 1 ACAAACAGTT CGATCGATTT GCAGTCTGGG 2 ACAAACAGTT TCTAGCGATT GCAGTCAGGG 3 ACAGACAGTT CGATCGATTT GCAGTCTCGG 4 ACTGACAGTT CGATCGATTT GCAGTCAGAG 5 ATTGACAGTT CGATCGATTT GCAGTCAGGA O TTTGACAGTT CGATCGATTT GCAGTCAGGG 1. Make a distance matrix using raw distances (number of differences) for the five ingroup sequences. | 1 2 3 4 5 1 | - 2 | 9 - 3 | 2 11 - 4 | 4 12 4 - 5 | 5 13 5 3 - 2. Infer the UPGMA tree for these sequences from your matrix. Label the branches with their lengths. /------------1------------- 1 /---1.25 ---| | \------------1------------- 3 /---------3.375------| | | /------------1.5------------4 | \---0.75---| | \------------1.5----------- 5 \---------5.625--------------------------------------------- 2 To derive this tree, start by clustering the two species with the lowest pairwise difference -- 1 and 3 -- and apportion the distance between them equally on the two branches leading from their common ancestor to the two taxa. Treat them as a single composite taxon, with distances from this taxon to any other species equal to the mean of the distances from that species to each of the species that make up the composite (the use of arithmetic means explains why some branch lengths are fractions). Then group the next most similar pair of taxa -- here* it is 4 and 5 -- and apportion the distance equally. Repeat until you have the whole tree and its lengths. 2 is most distant from all other taxa and composite taxa, so it must be the sister to the clade of the other four taxa. -
Basics of Cladistic Analysis
Basics of Cladistic Analysis Diana Lipscomb George Washington University Washington D.C. Copywrite (c) 1998 Preface This guide is designed to acquaint students with the basic principles and methods of cladistic analysis. The first part briefly reviews basic cladistic methods and terminology. The remaining chapters describe how to diagnose cladograms, carry out character analysis, and deal with multiple trees. Each of these topics has worked examples. I hope this guide makes using cladistic methods more accessible for you and your students. Report any errors or omissions you find to me and if you copy this guide for others, please include this page so that they too can contact me. Diana Lipscomb Weintraub Program in Systematics & Department of Biological Sciences George Washington University Washington D.C. 20052 USA e-mail: [email protected] © 1998, D. Lipscomb 2 Introduction to Systematics All of the many different kinds of organisms on Earth are the result of evolution. If the evolutionary history, or phylogeny, of an organism is traced back, it connects through shared ancestors to lineages of other organisms. That all of life is connected in an immense phylogenetic tree is one of the most significant discoveries of the past 150 years. The field of biology that reconstructs this tree and uncovers the pattern of events that led to the distribution and diversity of life is called systematics. Systematics, then, is no less than understanding the history of all life. In addition to the obvious intellectual importance of this field, systematics forms the basis of all other fields of comparative biology: • Systematics provides the framework, or classification, by which other biologists communicate information about organisms • Systematics and its phylogenetic trees provide the basis of evolutionary interpretation • The phylogenetic tree and corresponding classification predicts properties of newly discovered or poorly known organisms THE SYSTEMATIC PROCESS The systematic process consists of five interdependent but distinct steps: 1.