Phylogenetic Comparative Methods

Total Page:16

File Type:pdf, Size:1020Kb

Phylogenetic Comparative Methods Introduction and Motivation Models of Trait Evolution Along Phylogenies Phylogenetic Comparative Methods Laura Salter Kubatko Departments of Statistics and Evolution, Ecology, and Organismal Biology The Ohio State University [email protected] May 25, 2010 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Models of Trait Evolution Along Phylogenies Introduction I Thus far, we have focused on estimation of the phylogenetic tree as the primary goal of our analysis. I Often, however, the phylogeny itself is not of interest; rather, it is a nusiance parameter. I One setting in which this occurs is the case where we wish to study relationships among traits, either discrete or continuous, across taxa. I We’ll begin by examining some hypothetical data to motivate the method. Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Models of Trait Evolution Along Phylogenies Example 1: Correlation between discrete traits I Consider the following traits for 10 taxa: Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Models of Trait Evolution Along Phylogenies Example 1: Correlation between discrete traits I Consider the following traits for 10 taxa: 0 6 4 0 Fisher’s exact test gives a p-value of 0.0048 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Models of Trait Evolution Along Phylogenies Example 1: Correlation between discrete traits I But .... 0 6 4 0 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Models of Trait Evolution Along Phylogenies Example 1: Correlation between discrete traits I Correlation is completely explained by phylogeny: the tree has 18 branches – the probability of two changes on the same 1 branch is then 18 = 0.056. Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Models of Trait Evolution Along Phylogenies Example 2: Correlation between continuous traits I Consider the following traits for 14 taxa: Character 2 Character Character 1 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Models of Trait Evolution Along Phylogenies Example 2: Correlation between continuous traits I But ..... Character 2 Character Character 1 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies Brownian Motion I The problem is that the taxa are not independent outcomes of the evolutionary process; they’re all related by the phylogeny. I We need to model the evolution of traits along the phylogeny and adjust appropriately for correlation due to shared evolutionary history. I The simplest model of trait evolutionary along a phylogeny is the Brownian motion (BM) model. I BM was proposed by Robert Brown (1773-1858) based on observation that pollen grains suspended in solutions “jiggled” continually. Later theoretical work is due to Einstein and Weiner, among others. I BM was first applied to model trait evolution along phylogenies by Edwards and Cavalli-Sforza in 1964. Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies Brownian Motion I Consider a particle moving in a single dimension (say along the x-axis). I Measure position of particle at small intervals of time. I The movement of the particle in each interval is assumed to: I be independent of movement in other intervals of time I have mean 0 I have constant variance, regardless of position of the particle I Consider n distinct intervals of time. I After n steps, the net displacement is the sum of the displacements at each step. Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies Brownian Motion 2 I Let s be the variance of the displacement at each interval. The variance in the net displacement is then ns2 (since the displacement in intervals is independent). 2 I Now let s → 0 as n → ∞ such that their product is constant. This is the BM or Weiner process. I What is important for us is the distribution of the net displacement after t units of time. 2 I Let σ be the variance per unit time.Then after t units of time, the variance of the net displacement is σ2t. I In addition, the net displacement across an interval of t units of time is N(0, σ2t). Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies BM Along a Phylogeny I How do we apply this to trait evolution along a phylogeny? I Assume that displacements on different branches of a tree are independent; traits evolve over branches of tree according to BM model. x0 v12 v9 x12 v11 x11 x9 v10 v8 v4 x8 v5 x10 v3 v1 v2 v6 v7 x1 x2 x3 x4 x5 x6 x7 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies BM Along a Phylogeny: An Example I Let xi denote the phenotype at node i, and let vi denote the branch length. I As an example, look at value of trait at external node 5, x5. x0 v12 v9 x12 v11 x11 x9 v10 v8 v4 x8 v5 x10 v3 v1 v2 v6 v7 x1 x2 x3 x4 x5 x6 x7 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies BM Along a Phylogeny: An Example I Assume that the state at the root (x0 ) is fixed. I Note that x5 = x0 + (x12 − x0) + (x11 − x12) + (x5 − x11) I The last three terms above are independent draws from normal distributions with mean 0 and variances depending on the vi . x0 v12 v9 x12 v11 x11 x9 v10 v8 v4 x8 v5 x10 v3 v1 v2 v6 v7 x1 x2 x3 x4 x5 x6 x7 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies BM Along a Phylogeny: An Example I We have I E(x5) = x0 2 2 2 I Var(x5) = σ v12 + σ v11 + σ v5 x0 v12 v9 x12 v11 x11 x9 v10 v8 v4 x8 v5 x10 v3 v1 v2 v6 v7 x1 x2 x3 x4 x5 x6 x7 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies BM Along a Phylogeny: An Example I Similarly, for x7, we have I E(x7) = x0 2 2 2 2 I Var(x7) = σ v12 + σ v11 + +σ v10 + σ v7 x0 v12 v9 x12 v11 x11 x9 v10 v8 v4 x8 v5 x10 v3 v1 v2 v6 v7 x1 x2 x3 x4 x5 x6 x7 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies BM Along a Phylogeny: An Example I Note: x5 and x7 are not independent – they share a history up to node x11 2 2 I cov(x5, x7) = σ v12 + σ v11 x0 v12 v9 x12 v11 x11 x9 v10 v8 v4 x8 v5 x10 v3 v1 v2 v6 v7 x1 x2 x3 x4 x5 x6 x7 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies BM Along a Phylogeny I Now do this for all pairs of tips on a tree. I The characters on the tips are jointly multivariate normal with expectation x0 and covariances based on their shared history. I For our example, the variance-covariance matrix is: 0v1 + v8 + v9 v8 + v9 v9 0 0 0 0 1 B v + v v + v + v v 0 0 0 0 C B 8 9 2 8 9 9 C B v v v + v 0 0 0 0 C B 9 9 3 9 C B 0 0 0 v + v v v v C B 4 12 12 12 12 C B 0 0 0 v v + v + v v + v v + v C B 12 5 11 12 11 12 11 12 C @ 0 0 0 v12 v11 + v12 v6 + v10 + v11 + v12 v10 + v11 + v12 A 0 0 0 v12 v11 + v12 v10 + v11 + v12 v7 + v10 + v11 + v12 Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies Inference Under the BM Model I Suppose that p characters are observed – each follows the multivariate normal distribution I We can write a likelihood function as the product over characters (assuming characters evolve independently along the phylogeny) I Want to estimate parameters – e.g., x0i , i = 1, ··· p and 2 σi , i = 1 ··· p I However, we run into a problem – likelihood goes to ∞ as branch length goes to 0 (see Ch. 23 in text for a worked example). Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies Inference Under the BM Model I Some possible solutions: I Assume a molecular clock I Recode data as differences between character states at the tips – contrasts I Look more at the second case – let C be an (n − 1) × n matrix of contrasts. I Let V be the original variance-covariance matrix (multiplied by σ2). I Then for the data re-coded as differences in trait values, which we denote by the vector y, we have y = Cx ∼ N(0, CVCT) Stat 882: Statistical Phylogenetics - Lecture 8 - Part I Introduction and Motivation Brownian Motion Models of Trait Evolution Along Phylogenies Inference Under the BM Model I Next, we’ll discuss computing the likelihood under this model. I Need a notion of independent contrasts.
Recommended publications
  • Lecture Notes: the Mathematics of Phylogenetics
    Lecture Notes: The Mathematics of Phylogenetics Elizabeth S. Allman, John A. Rhodes IAS/Park City Mathematics Institute June-July, 2005 University of Alaska Fairbanks Spring 2009, 2012, 2016 c 2005, Elizabeth S. Allman and John A. Rhodes ii Contents 1 Sequences and Molecular Evolution 3 1.1 DNA structure . .4 1.2 Mutations . .5 1.3 Aligned Orthologous Sequences . .7 2 Combinatorics of Trees I 9 2.1 Graphs and Trees . .9 2.2 Counting Binary Trees . 14 2.3 Metric Trees . 15 2.4 Ultrametric Trees and Molecular Clocks . 17 2.5 Rooting Trees with Outgroups . 18 2.6 Newick Notation . 19 2.7 Exercises . 20 3 Parsimony 25 3.1 The Parsimony Criterion . 25 3.2 The Fitch-Hartigan Algorithm . 28 3.3 Informative Characters . 33 3.4 Complexity . 35 3.5 Weighted Parsimony . 36 3.6 Recovering Minimal Extensions . 38 3.7 Further Issues . 39 3.8 Exercises . 40 4 Combinatorics of Trees II 45 4.1 Splits and Clades . 45 4.2 Refinements and Consensus Trees . 49 4.3 Quartets . 52 4.4 Supertrees . 53 4.5 Final Comments . 54 4.6 Exercises . 55 iii iv CONTENTS 5 Distance Methods 57 5.1 Dissimilarity Measures . 57 5.2 An Algorithmic Construction: UPGMA . 60 5.3 Unequal Branch Lengths . 62 5.4 The Four-point Condition . 66 5.5 The Neighbor Joining Algorithm . 70 5.6 Additional Comments . 72 5.7 Exercises . 73 6 Probabilistic Models of DNA Mutation 81 6.1 A first example . 81 6.2 Markov Models on Trees . 87 6.3 Jukes-Cantor and Kimura Models .
    [Show full text]
  • In Defence of the Three-Domains of Life Paradigm P.T.S
    van der Gulik et al. BMC Evolutionary Biology (2017) 17:218 DOI 10.1186/s12862-017-1059-z REVIEW Open Access In defence of the three-domains of life paradigm P.T.S. van der Gulik1*, W.D. Hoff2 and D. Speijer3* Abstract Background: Recently, important discoveries regarding the archaeon that functioned as the “host” in the merger with a bacterium that led to the eukaryotes, its “complex” nature, and its phylogenetic relationship to eukaryotes, have been reported. Based on these new insights proposals have been put forward to get rid of the three-domain Model of life, and replace it with a two-domain model. Results: We present arguments (both regarding timing, complexity, and chemical nature of specific evolutionary processes, as well as regarding genetic structure) to resist such proposals. The three-domain Model represents an accurate description of the differences at the most fundamental level of living organisms, as the eukaryotic lineage that arose from this unique merging event is distinct from both Archaea and Bacteria in a myriad of crucial ways. Conclusions: We maintain that “a natural system of organisms”, as proposed when the three-domain Model of life was introduced, should not be revised when considering the recent discoveries, however exciting they may be. Keywords: Eucarya, LECA, Phylogenetics, Eukaryogenesis, Three-domain model Background (English: domain) as a higher taxonomic level than The discovery that methanogenic microbes differ funda- regnum: introducing the three domains of cellular life, Ar- mentally from Bacteria such as Escherichia coli or Bacillus chaea, Bacteria and Eucarya. This paradigm was proposed subtilis constitutes one of the most important biological by Woese, Kandler and Wheelis in PNAS in 1990 [2].
    [Show full text]
  • Phylogenetics Topic 2: Phylogenetic and Genealogical Homology
    Phylogenetics Topic 2: Phylogenetic and genealogical homology Phylogenies distinguish homology from similarity Previously, we examined how rooted phylogenies provide a framework for distinguishing similarity due to common ancestry (HOMOLOGY) from non-phylogenetic similarity (ANALOGY). Here we extend the concept of phylogenetic homology by making a further distinction between a HOMOLOGOUS CHARACTER and a HOMOLOGOUS CHARACTER STATE. This distinction is important to molecular evolution, as we often deal with data comprised of homologous characters with non-homologous character states. The figure below shows three hypothetical protein-coding nucleotide sequences (for simplicity, only three codons long) that are related to each other according to a phylogenetic tree. In the figure the nucleotide sequences are aligned to each other; in so doing we are making the implicit assumption that the characters aligned vertically are homologous characters. In the specific case of nucleotide and amino acid alignments this assumption is called POSITIONAL HOMOLOGY. Under positional homology it is implicit that a given position, say the first position in the gene sequence, was the same in the gene sequence of the common ancestor. In the figure below it is clear that some positions do not have identical character states (see red characters in figure below). In such a case the involved position is considered to be a homologous character, while the state of that character will be non-homologous where there are differences. Phylogenetic perspective on homologous characters and homologous character states ACG TAC TAA SYNAPOMORPHY: a shared derived character state in C two or more lineages. ACG TAT TAA These must be homologous in state.
    [Show full text]
  • Phylogenetics of Buchnera Aphidicola Munson Et Al., 1991
    Türk. entomol. derg., 2019, 43 (2): 227-237 ISSN 1010-6960 DOI: http://dx.doi.org/10.16970/entoted.527118 E-ISSN 2536-491X Original article (Orijinal araştırma) Phylogenetics of Buchnera aphidicola Munson et al., 1991 (Enterobacteriales: Enterobacteriaceae) based on 16S rRNA amplified from seven aphid species1 Farklı yaprak biti türlerinden izole edilen Buchnera aphidicola Munson et al., 1991 (Enterobacteriales: Enterobacteriaceae)’nın 16S rRNA’ya göre filogenetiği Gül SATAR2* Abstract The obligate symbiont, Buchnera aphidicola Munson et al., 1991 (Enterobacteriales: Enterobacteriaceae) is important for the physiological processes of aphids. Buchnera aphidicola genes detected in seven aphid species, collected in 2017 from different plants and altitudes in Adana Province, Turkey were analyzed to reveal phylogenetic interactions between Buchnera and aphids. The 16S rRNA gene was amplified and sequenced for this purpose and a phylogenetic tree built up by the neighbor-joining method. A significant correlation between B. aphidicola genes and the aphid species was revealed by this phylogenetic tree and the haplotype network. Specimens collected in Feke from Solanum melongena L. was distinguished from the other B. aphidicola genes on Aphis gossypii Glover, 1877 (Hemiptera: Aphididae) with a high bootstrap value of 99. Buchnera aphidicola in Myzus spp. was differentiated from others, and the difference between Myzus cerasi (Fabricius, 1775) and Myzus persicae (Sulzer, 1776) was clear. Although, B. aphidicola is specific to its host aphid, certain nucleotide differences obtained within the species could enable specification to geographic region or host plant in the future. Keywords: Aphid, genetic similarity, phylogenetics, symbiotic bacterium Öz Obligat simbiyont, Buchnera aphidicola Munson et al., 1991 (Enterobacteriales: Enterobacteriaceae), yaprak bitlerinin fizyolojik olaylarının sürdürülmesinde önemli bir rol oynar.
    [Show full text]
  • Introductory Activities
    TEACHER’S GUIDE Introduction Dean Madden Introductory NCBE, University of Reading activities Version 1.0 CaseCase Studies introduction Introductory activities The activities in this section explain the basic principles behind the construction of phylogenetic trees, DNA structure and sequence alignment. Students are also intoduced to the Geneious software. Before carrying out the activities in the DNA to Darwin Case studies, students will need to understand: • the basic principles behind the construction of an evolutionary tree or phylogeny; • the basic structure of DNA and proteins; • the reasons for and the principle of alignment; • use of some features of the Geneious computer software (basic version). The activities in this introduction are designed to achieve this. Some of them will reinforce what students may already know; others involve new concepts. The material includes extension activities for more able students. Evolutionary trees In 1837, 12 years before the publication of On the Origin of Species, Charles Darwin famously drew an evolutionary tree in one of his notebooks. The Origin also included a diagram of an evolutionary tree — the only illustration in the book. Two years before, Darwin had written to his friend Thomas Henry Huxley, saying: ‘The time will come, I believe, though I shall not live to see it, when we shall have fairly true genealogical trees of each great kingdom of Nature.’ Today, scientists are trying to produce the ‘Tree of Life’ Darwin foresaw, using protein, DNA and RNA sequence data. Evolutionary trees are covered on pages 2–7 of the Student’s guide and in the PowerPoint and Keynote slide presentations.
    [Show full text]
  • 1 "Principles of Phylogenetics: Ecology
    "PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200 Spring 2016 University of California, Berkeley D.D. Ackerly March 7, 2016. Phylogenetics and Adaptation What is to be explained? • What is the evolutionary history of trait x that we see in a lineage (homology) or multiple lineages (homoplasy) - adaptations as states • Is natural selection the primary evolutionary process leading to the ‘fit’ of organisms to their environment? • Why are some traits more prevalent (occur in more species): number of origins vs. trait- dependent diversification rates (speciation – extinction) Some high points in the history of the adaptation debate: 1950s • Modern Synthesis of Genetics (Dobzhansky), Paleontology (Simpson) and Systematics (Mayr, Grant) 1960s • Rise of evolutionary ecology – synthesis of ecology with strong adaptationism via optimality theory, with little to no history; leads to Sociobiology in the 70s • Appearance of cladistics (Hennig) 1972 • Eldredge and Gould – punctuated equilibrium – argue that Modern Synthesis can’t explain pervasive observation of stasis in fossil record; Gould focuses on development and constraint as explanations, Eldredge more on ecology and importance of migration to minimize selective pressure 1979 • Gould and Lewontin – Spandrels – general critique of adaptationist program and call for rigorous hypothesis testing of alternatives for the ‘fit’ between organism and environment 1980’s • Debate on whether macroevolution can be explained by microevolutionary processes • Comparative methods
    [Show full text]
  • The Caper Package: Comparative Analysis of Phylogenetics and Evolution in R
    The caper package: comparative analysis of phylogenetics and evolution in R David Orme April 16, 2018 This vignette documents the use of the caper package for R (R Development Core Team, 2011) in carrying out a range of comparative analysis methods for phylogenetic data. The caper package, and the code in this vignette, requires the ape package (Paradis et al., 2004) along with the packages mvtnorm and MASS. Contents 1 Background 2 2 Comparative datasets 3 2.1 The comparative.data class and objects. .3 2.1.1 na.omit ......................................3 2.1.2 subset ......................................5 2.1.3 [ ..........................................5 2.2 Example datasets . .6 3 Methods and functions provided by caper. 7 3.0.1 Phylogenetic linear models . .7 3.0.2 Fitting phylogenetic GLS models: pgls ....................8 3.1 Optimising branch length transformations: profile.pgls...............9 3.1.1 Criticism and simplification ofpgls models: plot, anova and AIC...... 11 3.2 Phylogenetic independent contrasts . 12 3.2.1 Variable names in contrast functions . 12 3.2.2 Continuous variables: crunch .......................... 13 3.2.3 Categorical variables: brunch .......................... 13 3.2.4 Species richness contrasts: macrocaic ..................... 14 3.2.5 Phylogenetic signal: phylo.d .......................... 15 3.3 Checking and comparing contrast models. 16 3.3.1 Testing evolutionary assumptions: caic.diagnostics............. 16 3.3.2 Robust contrasts: caic.robust ......................... 17 3.3.3 Model criticism: plot .............................. 19 3.3.4 Model comparison: anova & AIC ........................ 20 3.4 Other comparative functions . 21 3.4.1 Tree imbalance: fusco.test .......................... 21 3.5 Phylogenetic diversity: pd.calc, pd.bootstrap and ed.calc............
    [Show full text]
  • Working at the Interface of Phylogenetics and Population
    Molecular Ecology (2007) 16, 839–851 doi: 10.1111/j.1365-294X.2007.03192.x WorkingBlackwell Publishing Ltd at the interface of phylogenetics and population genetics: a biogeographical analysis of Triaenops spp. (Chiroptera: Hipposideridae) A. L. RUSSELL,* J. RANIVO,†‡ E. P. PALKOVACS,* S. M. GOODMAN‡§ and A. D. YODER¶ *Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA, †Département de Biologie Animale, Université d’Antananarivo, Antananarivo, BP 106, Madagascar, ‡Ecology Training Program, World Wildlife Fund, Antananarivo, BP 906 Madagascar, §The Field Museum of Natural History, Division of Mammals, 1400 South Lake Shore Drive, Chicago, IL 60605, USA, ¶Department of Ecology and Evolutionary Biology, PO Box 90338, Duke University, Durham, NC 27708, USA Abstract New applications of genetic data to questions of historical biogeography have revolutionized our understanding of how organisms have come to occupy their present distributions. Phylogenetic methods in combination with divergence time estimation can reveal biogeo- graphical centres of origin, differentiate between hypotheses of vicariance and dispersal, and reveal the directionality of dispersal events. Despite their power, however, phylo- genetic methods can sometimes yield patterns that are compatible with multiple, equally well-supported biogeographical hypotheses. In such cases, additional approaches must be integrated to differentiate among conflicting dispersal hypotheses. Here, we use a synthetic approach that draws upon the analytical strengths of coalescent and population genetic methods to augment phylogenetic analyses in order to assess the biogeographical history of Madagascar’s Triaenops bats (Chiroptera: Hipposideridae). Phylogenetic analyses of mitochondrial DNA sequence data for Malagasy and east African Triaenops reveal a pattern that equally supports two competing hypotheses.
    [Show full text]
  • Phylogenetics: Recovering Evolutionary History COMP 571 Luay Nakhleh, Rice University
    1 Phylogenetics: Recovering Evolutionary History COMP 571 Luay Nakhleh, Rice University 2 The Structure and Interpretation of Phylogenetic Trees unrooted, binary species tree rooted, binary species tree speciation (direction of descent) Flow of time ๏ six extant taxa or operational taxonomic units (OTUs) 3 The Structure and Interpretation of Phylogenetic Trees Phylogenetics-RecoveringEvolutionaryHistory - March 3, 2017 4 The Structure and Interpretation of Phylogenetic Trees In a binary tree on n taxa, how may nodes, branches, internal nodes and internal branches are there? How many unrooted binary trees on n taxa are there? How many rooted binary trees on n taxa are there? ๏ six extant taxa or operational taxonomic units (OTUs) 5 The Structure and Interpretation of Phylogenetic Trees polytomy Non-binary Multifuracting Partially resolved Polytomous ๏ six extant taxa or operational taxonomic units (OTUs) 6 The Structure and Interpretation of Phylogenetic Trees A polytomy in a tree can be resolved (not necessarily fully) in many ways, thus producing trees with higher resolution (including binary trees) A binary tree can be turned into a partially resolved tree by contracting edges In how many ways can a polytomy of degree d be resolved? Compatibility between two trees guarantees that one can back and forth between the two trees by means of node refinement and edge contraction Phylogenetics-RecoveringEvolutionaryHistory - March 3, 2017 7 The Structure and Interpretation of Phylogenetic Trees branch lengths have Additive no meaning tree Additive tree ultrametric rooted at an tree outgroup (molecular clock) 8 The Structure and Interpretation of Phylogenetic Trees bipartition (split) AB|CDEF clade cluster 11 clades (4 nontrivial) 9 bipartitions (3 nontrivial) How many nontrivial clades are there in a binary tree on n taxa? How many nontrivial bipartitions are there in a binary tree on n taxa? How many possible nontrivial clusters of n taxa are there? 9 The Structure and Interpretation of Phylogenetic Trees Species vs.
    [Show full text]
  • High-Throughput Genomic Data in Systematics and Phylogenetics
    ES44CH06-Lemmon ARI 29 October 2013 10:2 High-Throughput Genomic Data in Systematics and Phylogenetics Emily Moriarty Lemmon1 and Alan R. Lemmon2 1Department of Biological Science, Florida State University, Biomedical Research Facility, Tallahassee, Florida 32306; email: [email protected] 2Department of Scientific Computing, Florida State University, Dirac Science Library, Tallahassee, Florida 32306; email: [email protected] Annu. Rev. Ecol. Evol. Syst. 2013. 44:99–121 Keywords First published online as a Review In Advance on high-throughput sequencing, next-generation sequencing, phylogenomics, October 9, 2013 genomic partitioning, target enrichment, hybrid enrichment, anchored The Annual Review of Ecology, Evolution, and phylogenomics, ultraconserved element enrichment, targeted amplicon Systematics is online at ecolsys.annualreviews.org sequencing, transcriptome sequencing, RAD sequencing, locus selection, This article’s doi: model selection, phylogeny estimation by Florida State University on 11/26/13. For personal use only. 10.1146/annurev-ecolsys-110512-135822 Copyright c 2013 by Annual Reviews. Abstract All rights reserved High-throughput genomic sequencing is rapidly changing the field of phy- Annu. Rev. Ecol. Evol. Syst. 2013.44:99-121. Downloaded from www.annualreviews.org logenetics by decreasing the cost and increasing the quantity and rate of data collection by several orders of magnitude. This deluge of data is exerting tremendous pressure on downstream data-analysis methods providing new opportunities for method development. In this review, we present (a) recent advances in laboratory methods for collection of high-throughput phyloge- neticdataand(b) challenges and constraints for phylogenetic analysis of these data. We compare the merits of multiple laboratory approaches, compare methods of data analysis, and offer recommendations for the most promising protocols and data-analysis workflows currently available for phylogenetics.
    [Show full text]
  • Phylogenetics
    Phylogenetics What is phylogenetics? • Study of branching patterns of descent among lineages • Lineages – Populations – Species – Molecules • Shift between population genetics and phylogenetics is often the species boundary – Distantly related populations also show patterning – Patterning across geography What is phylogenetics? • Goal: Determine and describe the evolutionary relationships among lineages – Order of events – Timing of events • Visualization: Phylogenetic trees – Graph – No cycles Phylogenetic trees • Nodes – Terminal – Internal – Degree • Branches • Topology Phylogenetic trees • Rooted or unrooted – Rooted: Precisely 1 internal node of degree 2 • Node that represents the common ancestor of all taxa – Unrooted: All internal nodes with degree 3+ Stephan Steigele Phylogenetic trees • Rooted or unrooted – Rooted: Precisely 1 internal node of degree 2 • Node that represents the common ancestor of all taxa – Unrooted: All internal nodes with degree 3+ Phylogenetic trees • Rooted or unrooted – Rooted: Precisely 1 internal node of degree 2 • Node that represents the common ancestor of all taxa – Unrooted: All internal nodes with degree 3+ • Binary: all speciation events produce two lineages from one • Cladogram: Topology only • Phylogram: Topology with edge lengths representing time or distance • Ultrametric: Rooted tree with time-based edge lengths (all leaves equidistant from root) Phylogenetic trees • Clade: Group of ancestral and descendant lineages • Monophyly: All of the descendants of a unique common ancestor • Polyphyly:
    [Show full text]
  • Handout for the Phylogenetics Lectures
    Handout for the Phylogenetics Lectures Dirk Metzler December 29, 2020 Contents 1 Intro: Outline and Tree Notation 2 2 Distance-Based Phylogeny Reconstruction 6 2.1 What is a distance? . .6 2.2 UPGMA . .6 2.3 Neighbor Joining . .8 3 Parsimony in phylogeny reconstruction 9 3.1 Parsimony of a tree . .9 3.2 Finding parsimonious trees for given data . 11 3.3 Limitations of the parsimony principle . 14 4 Measures for how different two trees are 14 5 Maximum-Likelihood (ML) in phylogeny estimation 16 5.1 What is a likelihood? . 16 5.2 How to compute the likelihood of a tree . 17 5.3 Jukes-Cantor model of sequence evolution . 18 5.4 Reversibility and convergence into equilibrium . 20 5.5 How to search for the ML tree . 22 5.6 Maximum Parsimony from a probabilistic perspective . 23 5.7 Maximum likelihood for pairwise distances . 24 5.8 Consistency of the Maximum-Likelihood method . 24 6 Bootstrapping 25 6.1 The concept of bootstrapping . 25 6.2 Bootstrap for phylogenetic trees . 26 6.3 How can we interprete the bootstrap values? . 27 7 Bayesian phylogeny reconstruction and MCMC 28 7.1 Principles of Bayesian statistics . 28 7.2 MCMC sampling . 29 7.3 Checking convergence of MCMC . 31 7.4 Interpretation of posterior probabilities and robustness . 32 8 Modelling the substitution process on sequences 36 8.1 Transition matrix and rate matrix . 36 8.2 Residence time . 39 8.3 Computing S(t) from the rate matrix R ............................ 40 8.4 A model for transition probabilities in closed form .
    [Show full text]