Local and Relaxed Clocks: the Best of Both Worlds
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Mutations, the Molecular Clock, and Models of Sequence Evolution
Mutations, the molecular clock, and models of sequence evolution Why are mutations important? Mutations can Mutations drive be deleterious evolution Replicative proofreading and DNA repair constrain mutation rate UV damage to DNA UV Thymine dimers What happens if damage is not repaired? Deinococcus radiodurans is amazingly resistant to ionizing radiation • 10 Gray will kill a human • 60 Gray will kill an E. coli culture • Deinococcus can survive 5000 Gray DNA Structure OH 3’ 5’ T A Information polarity Strands complementary T A G-C: 3 hydrogen bonds C G A-T: 2 hydrogen bonds T Two base types: A - Purines (A, G) C G - Pyrimidines (T, C) 5’ 3’ OH Not all base substitutions are created equal • Transitions • Purine to purine (A ! G or G ! A) • Pyrimidine to pyrimidine (C ! T or T ! C) • Transversions • Purine to pyrimidine (A ! C or T; G ! C or T ) • Pyrimidine to purine (C ! A or G; T ! A or G) Transition rate ~2x transversion rate Substitution rates differ across genomes Splice sites Start of transcription Polyadenylation site Alignment of 3,165 human-mouse pairs Mutations vs. Substitutions • Mutations are changes in DNA • Substitutions are mutations that evolution has tolerated Which rate is greater? How are mutations inherited? Are all mutations bad? Selectionist vs. Neutralist Positions beneficial beneficial deleterious deleterious neutral • Most mutations are • Some mutations are deleterious; removed via deleterious, many negative selection mutations neutral • Advantageous mutations • Neutral alleles do not positively selected alter fitness • Variability arises via • Most variability arises selection from genetic drift What is the rate of mutations? Rate of substitution constant: implies that there is a molecular clock Rates proportional to amount of functionally constrained sequence Why care about a molecular clock? (1) The clock has important implications for our understanding of the mechanisms of molecular evolution. -
Phylogeny Codon Models • Last Lecture: Poor Man’S Way of Calculating Dn/Ds (Ka/Ks) • Tabulate Synonymous/Non-Synonymous Substitutions • Normalize by the Possibilities
Phylogeny Codon models • Last lecture: poor man’s way of calculating dN/dS (Ka/Ks) • Tabulate synonymous/non-synonymous substitutions • Normalize by the possibilities • Transform to genetic distance KJC or Kk2p • In reality we use codon model • Amino acid substitution rates meet nucleotide models • Codon(nucleotide triplet) Codon model parameterization Stop codons are not allowed, reducing the matrix from 64x64 to 61x61 The entire codon matrix can be parameterized using: κ kappa, the transition/transversionratio ω omega, the dN/dS ratio – optimizing this parameter gives the an estimate of selection force πj the equilibrium codon frequency of codon j (Goldman and Yang. MBE 1994) Empirical codon substitution matrix Observations: Instantaneous rates of double nucleotide changes seem to be non-zero There should be a mechanism for mutating 2 adjacent nucleotides at once! (Kosiol and Goldman) • • Phylogeny • • Last lecture: Inferring distance from Phylogenetic trees given an alignment How to infer trees and distance distance How do we infer trees given an alignment • • Branch length Topology d 6-p E 6'B o F P Edo 3 vvi"oH!.- !fi*+nYolF r66HiH- .) Od-:oXP m a^--'*A ]9; E F: i ts X o Q I E itl Fl xo_-+,<Po r! UoaQrj*l.AP-^PA NJ o - +p-5 H .lXei:i'tH 'i,x+<ox;+x"'o 4 + = '" I = 9o FF^' ^X i! .poxHo dF*x€;. lqEgrE x< f <QrDGYa u5l =.ID * c 3 < 6+6_ y+ltl+5<->-^Hry ni F.O+O* E 3E E-f e= FaFO;o E rH y hl o < H ! E Y P /-)^\-B 91 X-6p-a' 6J. -
Molecular Evolution
An Introduction to Bioinformatics Algorithms www.bioalgorithms.info Molecular Evolution An Introduction to Bioinformatics Algorithms www.bioalgorithms.info Outline • Evolutionary Tree Reconstruction • “Out of Africa” hypothesis • Did we evolve from Neanderthals? • Distance Based Phylogeny • Neighbor Joining Algorithm • Additive Phylogeny • Least Squares Distance Phylogeny • UPGMA • Character Based Phylogeny • Small Parsimony Problem • Fitch and Sankoff Algorithms • Large Parsimony Problem • Evolution of Wings • HIV Evolution • Evolution of Human Repeats An Introduction to Bioinformatics Algorithms www.bioalgorithms.info Early Evolutionary Studies • Anatomical features were the dominant criteria used to derive evolutionary relationships between species since Darwin till early 1960s • The evolutionary relationships derived from these relatively subjective observations were often inconclusive. Some of them were later proved incorrect An Introduction to Bioinformatics Algorithms www.bioalgorithms.info Evolution and DNA Analysis: the Giant Panda Riddle • For roughly 100 years scientists were unable to figure out which family the giant panda belongs to • Giant pandas look like bears but have features that are unusual for bears and typical for raccoons, e.g., they do not hibernate • In 1985, Steven O’Brien and colleagues solved the giant panda classification problem using DNA sequences and algorithms An Introduction to Bioinformatics Algorithms www.bioalgorithms.info Evolutionary Tree of Bears and Raccoons An Introduction to Bioinformatics Algorithms www.bioalgorithms.info Evolutionary Trees: DNA-based Approach • 40 years ago: Emile Zuckerkandl and Linus Pauling brought reconstructing evolutionary relationships with DNA into the spotlight • In the first few years after Zuckerkandl and Pauling proposed using DNA for evolutionary studies, the possibility of reconstructing evolutionary trees by DNA analysis was hotly debated • Now it is a dominant approach to study evolution. -
Concentration-Affinity Equivalence in Gene Regulation: Convergence Of
Proc. Nad. Acad. Sci. USA Vol. 85, pp. 4784-4788, July 1988 Evolution Concentration-affinity equivalence in gene regulation: Convergence of genetic and environmental effects (phenocopies/parallel evolution/genetic assimilation/expressivity/penetrance) EMILE ZUCKERKANDL* AND RUXTON VILLETt *Linus Pauling Institute of Science and Medicine, 440 Page Mill Road, Palo Alto, CA 94306; and tU.S. Department of Agriculture, Agricultural Research Services, National Program Staff, Building 05S, BARC-West, Beltsville, MD 20705 Communicated by Francisco J. Ayala, March 21, 1988 ABSTRACT It is proposed that equivalent phenotypic affinity can also be altered by a reversible structural change effects can be obtained by either structural changes in macro- in the protein, resulting in a modification of specific fit molecules involved in gene regulation or changes in activity of between macromolecules and brought about by a combina- the structurally unaltered macromolecules. This equivalence tion with, or by the release of, an effector. An important between changes in activity (concentration) and changes in variant of the latter process is a reversible formation of a structure can come into play within physiologically plausible covalent compound between a polynucleotide-binding pro- limits and seems to represent an important interface between tein and some other organic moiety (e.g., acetylation, ADP- environment and genome-namely, between environmentally ribosylation; see ref. 6). The potential equivalence is between determined and genetically determined gene expression. The the effect of a change in component concentration (activity) equivalence principle helps explain the appearance of pheno- under a constant structural state and the effect ofa structural copies. It also points to a general pathway favorable to the modification under constant component concentration (ac- occurrence, during evolution, offrequent episodes correspond- tivity). -
MOLECULAR CLOCKS Definition Introduction
MOLECULAR CLOCKS 583 Kishino, H., Thorne, J. L., and Bruno, W. J., 2001. Performance of Thorne, J. L., Kishino, H., and Painter, I. S., 1998. Estimating the a divergence time estimation method under a probabilistic model rate of evolution of the rate of molecular evolution. Molecular of rate evolution. Molecular Biology and Evolution, 18,352–361. Biology and Evolution, 15, 1647–1657. Kodandaramaiah, U., 2011. Tectonic calibrations in molecular dat- Warnock, R. C. M., Yang, Z., Donoghue, P. C. J., 2012. Exploring ing. Current Zoology, 57,116–124. uncertainty in the calibration of the molecular clock. Biology Marshall, C. R., 1997. Confidence intervals on stratigraphic ranges Letters, 8, 156–159. with nonrandom distributions of fossil horizons. Paleobiology, Wilkinson, R. D., Steiper, M. E., Soligo, C., Martin, R. D., Yang, Z., 23, 165–173. and Tavaré, S., 2011. Dating primate divergences through an Müller, J., and Reisz, R. R., 2005. Four well-constrained calibration integrated analysis of palaeontological and molecular data. Sys- points from the vertebrate fossil record for molecular clock esti- tematic Biology, 60,16–31. mates. Bioessays, 27, 1069–1075. Yang, Z., and Rannala, B., 2006. Bayesian estimation of species Parham, J. F., Donoghue, P. C. J., Bell, C. J., et al., 2012. Best practices divergence times under a molecular clock using multiple fossil for justifying fossil calibrations. Systematic Biology, 61,346–359. calibrations with soft bounds. Molecular Biology and Evolution, Peters, S. E., 2005. Geologic constraints on the macroevolutionary 23, 212–226. history of marine animals. Proceedings of the National Academy Zuckerkandl, E., and Pauling, L., 1962. Molecular disease, evolution of Sciences, 102, 12326–12331. -
Manipulating Underdetermination in Scientific Controversy: the Case of the Molecular Clock
Dartmouth College Dartmouth Digital Commons Open Dartmouth: Published works by Dartmouth faculty Faculty Work 9-1-2007 Manipulating Underdetermination in Scientific Controversy: The Case of the Molecular Clock Michael Dietrich Dartmouth College Robert A. Skipper University of Cincinnati Follow this and additional works at: https://digitalcommons.dartmouth.edu/facoa Part of the Biology Commons Dartmouth Digital Commons Citation Dietrich, Michael and Skipper, Robert A., "Manipulating Underdetermination in Scientific Controversy: The Case of the Molecular Clock" (2007). Open Dartmouth: Published works by Dartmouth faculty. 16. https://digitalcommons.dartmouth.edu/facoa/16 This Article is brought to you for free and open access by the Faculty Work at Dartmouth Digital Commons. It has been accepted for inclusion in Open Dartmouth: Published works by Dartmouth faculty by an authorized administrator of Dartmouth Digital Commons. For more information, please contact [email protected]. Manipulating Underdetermination in Scientiªc Controversy: The Case of the Molecular Clock Michael R. Dietrich Dartmouth College Robert A. Skipper, Jr. University of Cincinnati Where there are cases of underdetermination in scientiªc controversies, such as the case of the molecular clock, scientists may direct the course and terms of dispute by playing off the multidimensional framework of theory evaluation. This is because assessment strategies themselves are underdetermined. Within the framework of assessment, there are a variety of trade-offs between differ- ent strategies as well as shifting emphases as speciªc strategies are given more or less weight in assessment situations. When a strategy is underdetermined, scientists can change the dynamics of a controversy by making assessments using different combinations of evaluation strategies and/or weighting what- ever strategies are in play in different ways. -
The Phylogenetic Handbook: a Practical Approach to Phylogenetic Analysis and Hypothesis Testing, Philippe Lemey, Marco Salemi, and Anne-Mieke Vandamme (Eds.)
10 Selecting models of evolution THEORY David Posada 10.1 Models of evolution and phylogeny reconstruction Phylogenetic reconstruction is a problem of statistical inference. Since statistical inferences cannot be drawn in the absence of probabilities, the use of a model of nucleotide substitution or amino acid replacement – a model of evolution – becomes indispensable when using DNA or protein sequences to estimate phylo- genetic relationships among taxa. Models of evolution are sets of assumptions about the process of nucleotide or amino acid substitution (see Chapters 4 and 9). They describe the different probabilities of change from one nucleotide or amino acid to another along a phylogenetic tree, allowing us to choose among different phylogenetic hypotheses to explain the data at hand. Comprehensive reviews of models of evolution are offered elsewhere (Swofford et al., 1996;Lio` & Goldman, 1998). As discussed in the previous chapters, phylogenetic methods are based on a number of assumptions about the evolutionary process. Such assumptions can be implicit, like in parsimony methods (see Chapter 8), or explicit, like in distance or maximum likelihood methods (see Chapters 5 and 6, respectively). The advantage of making a model explicit is that the parameters of the model can be estimated. Distance methods can only estimate the number of substitutions per site. However, maximum likelihood methods can estimate all the relevant parameters of the model of evolution. Parameters estimated via maximum likelihood have desirable statistical properties: as sample sizes get large, they converge to the true value and The Phylogenetic Handbook: a Practical Approach to Phylogenetic Analysis and Hypothesis Testing, Philippe Lemey, Marco Salemi, and Anne-Mieke Vandamme (eds.). -
What Is the Viewpoint of Hemoglobin, and Does It Matter?
Hist. Phil. Life Sci., 31 (2009), 241-262 What is the Viewpoint of Hemoglobin, and Does It Matter? Jonathan Marks University of North Carolina at Charlotte Department of Anthropology Charlotte, NC 28223, USA ABSTRACT - In this paper I discuss reductive trends in evolutionary anthropology. The first involved the reduction of human ancestry to genetic relationships (in the 1960s) and the second involved a parallel reduction of classification to phylogenetic retrieval (in the 1980s). Neither of these affords greater accuracy than their alternatives; that is to say, their novelty is epistemic, not empirical. As a result, there has been a revolution in classification in evolutionary anthropology, which arguably clouds the biological relationships of the relevant species, rather than clarifying them. Just below the species level, another taxonomic issue is raised by the reinscription of race as a natural category of the human species. This, too, is driven by the convergent interests of cultural forces including conservative political ideologies, the creation of pharmaceutical niche markets, free-market genomics, and old- fashioned scientific racism. KEYWORDS – Molecular anthropology, Systematics, Classification, Human evolution Introduction In this paper I explore the intersection of genetics, taxonomy, and evolutionary anthropology. All three fields are scientific areas saturated with cultural meanings and associations. First, genetics is the scientific study of heredity, but has historically capitalized on non-scientific prejudices about heredity to curry support: hence James Watson’s epigrammatic proclamation in support of the Human Genome Project, “We used to think our fate was in the stars. Now we know, in large measure, our fate is in our genes” (Jaroff 1989; Duster 1990; Nelkin and Lindee 1995). -
A Review of Molecular-Clock Calibrations and Substitution Rates In
Molecular Phylogenetics and Evolution 78 (2014) 25–35 Contents lists available at ScienceDirect Molecular Phylogenetics and Evolution journal homepage: www.elsevier.com/locate/ympev A review of molecular-clock calibrations and substitution rates in liverworts, mosses, and hornworts, and a timeframe for a taxonomically cleaned-up genus Nothoceros ⇑ Juan Carlos Villarreal , Susanne S. Renner Systematic Botany and Mycology, University of Munich (LMU), Germany article info abstract Article history: Absolute times from calibrated DNA phylogenies can be used to infer lineage diversification, the origin of Received 31 January 2014 new ecological niches, or the role of long distance dispersal in shaping current distribution patterns. Revised 30 March 2014 Molecular-clock dating of non-vascular plants, however, has lagged behind flowering plant and animal Accepted 15 April 2014 dating. Here, we review dating studies that have focused on bryophytes with several goals in mind, (i) Available online 30 April 2014 to facilitate cross-validation by comparing rates and times obtained so far; (ii) to summarize rates that have yielded plausible results and that could be used in future studies; and (iii) to calibrate a species- Keywords: level phylogeny for Nothoceros, a model for plastid genome evolution in hornworts. Including the present Bryophyte fossils work, there have been 18 molecular clock studies of liverworts, mosses, or hornworts, the majority with Calibration approaches Cross validation fossil calibrations, a few with geological calibrations or dated with previously published plastid substitu- Nuclear ITS tion rates. Over half the studies cross-validated inferred divergence times by using alternative calibration Plastid DNA substitution rates approaches. Plastid substitution rates inferred for ‘‘bryophytes’’ are in line with those found in angio- Substitution rates sperm studies, implying that bryophyte clock models can be calibrated either with published substitution rates or with fossils, with the two approaches testing and cross-validating each other. -
Relative Model Selection Can Be Sensitive to Multiple Sequence Alignment Uncertainty
bioRxiv preprint doi: https://doi.org/10.1101/2021.08.04.455051; this version posted August 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Relative model selection can be sensitive to multiple sequence alignment uncertainty Stephanie J. Spielman1∗ and Molly L. Miraglia2;3 1Department of Biological Sciences, Rowan University, Glassboro, NJ, 08028, USA. 2Department of Molecular and Cellular Biosciences, Rowan University, Glassboro, NJ, 08028, USA. 3Fox Chase Cancer Center, Philadelphia, PA, 19111, USA. ∗Corresponding author: [email protected] Abstract Background: Multiple sequence alignments (MSAs) represent the fundamental unit of data inputted to most comparative sequence analyses. In phylogenetic analyses in particular, errors in MSA construction have the potential to induce further errors in downstream analyses such as phylogenetic reconstruction itself, ancestral state reconstruction, and divergence time esti- mation. In addition to providing phylogenetic methods with an MSA to analyze, researchers must also specify a suitable evolutionary model for the given analysis. Most commonly, re- searchers apply relative model selection to select a model from candidate set and then provide both the MSA and the selected model as input to subsequent analyses. While the influence of MSA errors has been explored for most stages of phylogenetics pipelines, the potential effects of MSA uncertainty on the relative model selection procedure itself have not been explored. Results: We assessed the consistency of relative model selection when presented with multi- ple perturbed versions of a given MSA. -
Molecular Evolution 1
Molecular Evolution 1. Evolutionary Tree Reconstruction 2. Two Hypotheses for Human Evolution 3. Did we evolve from Neanderthals? 4. Distance-Based Phylogeny 5. Neighbor Joining Algorithm 6. Additive Phylogeny 7. Least Squares Distance Phylogeny 8. UPGMA 9. Character-Based Phylogeny 10. Small Parsimony Problem 11. Fitch and Sankoff Algorithms 12. Large Parsimony Problem 13. Nearest Neighbor Interchange 14. Evolution of Human Repeats 15. Minimum Spanning Trees Section 1: Evolutionary Tree Reconstruction Early Evolutionary Studies • Anatomical features were the dominant criteria used to derive evolutionary relationships between species since Darwin till early 1960s. • The evolutionary relationships derived from these relatively subjective observations were often inconclusive. Some of them were later proven incorrect. DNA Analysis: The Giant Panda Riddle • For roughly 100 years, scientists were unable to figure out to which family the giant panda should belong. • Giant pandas look like bears but have features that are unusual for bears and typical for raccoons, e.g. they do not hibernate. • In 1985, Steven O’Brien and colleagues solved the giant panda classification problem using DNA sequences and algorithms. Evolutionary Tree of Bears and Raccoons Evolutionary Trees: DNA-Based Approach • 40 years ago: Emile Zuckerkandl and Linus Pauling brought reconstructing evolutionary relationships with DNA into the spotlight. • In the first few years after Zuckerkandl and Emile Zuckerkandl Pauling proposed using DNA for evolutionary studies, the possibility -
Empirical Models for Substitution in Ribosomal RNA
Empirical Models for Substitution in Ribosomal RNA Andrew D. Smith, Thomas W. H. Lui and Elisabeth R. M. Tillier Department of Medical Biophysics, University of Toronto and Ontario Cancer Institute, University Health Network, Toronto, Ontario, Canada Corresponding author: Elisabeth R. M. Tillier Ontario Cancer Institute University Health Network 620 University Avenue, Suite 703 Toronto, Ontario M5G 2M9 Canada Phone: 416 946-4501 ext 3978 Fax: 416 946-2397 email: [email protected] URL: http://www.uhnres.utoronto.ca/tillier/ Key words: rRNA evolution, empirical substitution model, RNA structure, simulation, Running head: Empirical rRNA substitution model Nonstandard abbreviations: Percent Accepted Mutations (PAM), Probability model from Blocks (PMB), ribosomal RNA (rRNA), transfer RNA (tRNA), Hidden Markov Model (HMM) 1 Abstract Empirical models of substitution are often used in protein sequence analysis because the large alphabet of amino acids requires that many parameters be estimated in all but the simplest parametric models. When information about structure is used in the analysis of substitutions in structured RNA, a similar situation occurs. The number of parameters necessary to adequately describe the substitution process increases in order to model the substitution of paired bases. We have developed a method to obtain substitution rate matrices empirically from RNA alignments that include structural information in the form of base pairs. Our data consisted of alignments from the European ribosomal RNA database of Bacterial and Eukaryotic Small Subunit and Large Subunit ribosomal RNA (Wuyts et al., 2001a; Wuyts et al., 2002). Using secondary structural information, we converted each sequence in the alignments into a sequence over a 20-symbol code: one symbol for each of the 4 individual bases, and one symbol for each of the 16 ordered pairs.