Research Statement Tony Jhwueng 2011

Summary

My research focuses on developing and analyzing phylogenetic comparative methods (PCMs), on their application to biological data sets, and on their mathematical and statistical properties. Progress in this area requires knowledge of phylogenetics, model selection theory, stochastic differential equations, finite difference methods, and applied statistics, together with programming skills and the development of relevant software. My expertise will allow me to continue to make important contributions to these areas.

Contributions and Impact

Model Selection and Goodness of Fit for PCMs (Thesis): Phylogenetic comparative methods (PCMs) are statistical methods for analyzing ecological and evolutionary data. Many such methods [FIC (Felsenstein 1985); PMM (Housworth et al. 2004); PA (Cheverud et al. 1985; Gittleman and Kot 1990); PGLS (Martins and Hansen 1997; Butler and King 2004)] have been proposed, but their fit to data is rarely assessed. I assess the fit of several PCMs to a large collection of data sets gathered from the literature. I compare the corresponding statistical models using model selection criteria [AIC (Akaike 1977); AICc (Hurvich and Tsai 1990)] to determine whether the more complicated models, which contain additional parameters, provide a significant improvement in fit.

Data were collected through a meta-analysis of published phylogenetic data sets, found by a keyword search in JEB, Blackwell Synergy, and JSTOR using the keywords ((Comparative methods OR Comparative analysis) AND independent contrasts) for the years 2002 to 2005. All data collected for this study consist of (1) averaged trait values with an associated sample size and standard deviation, or with a given standard error, and (2) rooted phylogenetic trees, either a molecular clock tree or a dendrogram.
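To make the model comparison described above concrete, the following is a minimal sketch of how AIC and AICc trade fit against parameter count. The log-likelihood values, the parameter counts, and the sample size are hypothetical and chosen only for illustration; they are not results from the study. The two candidate models are given the same log-likelihood, as would happen if the extra parameter contributed nothing to the fit.

```python
def aic(log_lik: float, k: int) -> float:
    """Akaike Information Criterion for a model with k free parameters."""
    return 2.0 * k - 2.0 * log_lik


def aicc(log_lik: float, k: int, n: int) -> float:
    """Small-sample corrected AIC; n is the sample size."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return aic(log_lik, k) + 2.0 * k * (k + 1) / (n - k - 1)


def rank_models(fits: dict, n: int) -> list:
    """Return (name, AICc, delta AICc) tuples sorted from best to worst.

    fits maps a model name to (maximized log-likelihood, number of parameters).
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in fits.items()}
    best = min(scores.values())
    return sorted(
        ((name, s, s - best) for name, s in scores.items()),
        key=lambda item: item[1],
    )


# Hypothetical fits: identical log-likelihoods, one extra parameter for PMM.
fits = {"FIC": (-42.0, 2), "PMM": (-42.0, 3)}
ranking = rank_models(fits, n=20)
for name, score, delta in ranking:
    print(f"{name}: AICc = {score:.3f}, delta = {delta:.3f}")
```

With equal likelihoods the criterion necessarily ranks the model with fewer parameters first, and the gap grows as the sample size shrinks.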
I use AICc to compare the PCMs and a bootstrapping technique [Burnham and Anderson 2002] to assess whether a simpler model (FIC or ID) or a more complicated model with an additional parameter (PMM, PA, or PGLS) provides a better fit than the other models in the candidate set. Because the distribution of the data is unknown, I use the D'Agostino normality test [D'Agostino 1990] on the univariate data to test the normality assumption of the PCMs: it combines skewness and kurtosis and has been shown to be informative, with good power properties over a broad range of non-normal distributions. I also perturbed the phylogenies by randomly varying the branch lengths without changing the topology, to study the impact of small errors in the phylogeny on the model selection criterion for PCMs.

Conclusion. In this study, FIC and the model assuming no phylogenetic effect are most frequently chosen as the best model. This is due in part to the penalty AICc places on models with more parameters. While we chose AICc for model selection because these models do not form a nested sequence, many of them do include the extreme models as nested components. For instance, when the heritability parameter h in the PMM model converges to 0 or 1, PMM is identical to ID or FIC, respectively, and thus the likelihood for PMM and one of the extreme models is identical in this case. However, AICc penalizes PMM for using an additional parameter. The same situation arises for PA: when ρ = 0, PA is identical to ID. Similarly, OU is identical to FIC when α = 0. On the other hand, although we collected 122 data sets, the majority, 77 of them, do not pass a normality test. Thus, it may be that trait values among different species vary over so wide a range that the normality assumption common to PCMs cannot be applied to describe the data.
Such a wide range of data might result from heterogeneous rates of evolution along different lineages of the phylogeny [O'Meara et al. 2006]. Under these circumstances, robust, nonparametric analytical methods and models of trait evolution that do not lead to multivariate normal data seem to be called for (see future project D).

While there is no evidence that the more complicated models with additional parameters are necessary to describe most comparative data, there remains the question of whether fitting them to data is detrimental to the analyses most commonly employed by researchers. I investigate this question by examining the effect on correlation estimates of fitting different models to bivariate data sets. Although similar approaches appear in the literature (from Ricklefs and Starck 1996 to Revell 2010), that work relies on regression analysis; my approach uses maximum likelihood estimation, because maximum likelihood estimators are consistent and their bias decreases as the sample size increases. I also develop two statistical distributions for bivariate PCMs (OU and PA). Since bivariate models contain more parameters, an efficient optimization algorithm is also developed [analogous to Housworth et al. 2004] to improve parameter estimation. I then use parametric bootstrapping to assess the significance of the correlation between two traits. My results show that most estimated correlations are concordant: if there is a significant positive correlation under one model, using a different model also yields a significant positive correlation. This is a very reassuring result for the practical application of bivariate PCMs. Although the parameter estimates vary over quite a wide range, the proportion of concordance between different PCMs in detecting non-zero correlations in actual data is very high.
Therefore, researchers should apply the PCM they believe best describes the evolutionary mechanisms underlying their data. However, it is unlikely that any given idealized model perfectly describes both traits and their joint evolution in a given bivariate analysis.

Current and Future Projects

A. Modeling PCMs with Hybridizations (with Elizabeth Housworth): Hybrid species are known to share some phenotypes with their parents. The rate of variation of a trait between a hybrid and its parents has not been studied and reported through comparative analysis. Two improved phylogenetic comparative methods (PCMs) are proposed to accommodate data sets that involve hybrids. Instead of using phylogenetic trees, the new methods analyze comparative data by incorporating phylogenetic networks in which ancient hybrids are explicitly identified. The new methods can also be applied to test for heterogeneous hybrid effects when multiple hybrids are present on the network. Simulation studies assessing the robustness of the new methods indicate that the corresponding statistical models are sensitive to the timing of the ancient hybridization. Increased power and decreased bias were found when the hybridization event occurred more recently. Finally, we use the phylogenetic network in [Koblmüller et al. 2008] to analyze the rate of variation in body length of cichlids.

B. Impact of AICc on Models of DNA Evolution (with Brian O'Meara): We investigate the impact of the Akaike Information Criterion (AIC) on the selection of models of nucleotide substitution. Our approach focuses on the sample-size-corrected criterion AICc [Hurvich and Tsai 1990], for which three different definitions of sample size are considered: (i) the number of sites in the alignment, (ii) the number of taxa, and (iii) the product of the number of sites and the number of taxa.
We find that when small data sets are used, in particular when only one site of the alignment is used, version (ii) prefers the simpler model while version (i) chooses more complicated models.

C. A Statistical Approach to an Improved Comparative Method for Studying Adaptation to a Randomly Evolving Environment (with Vasileios Maroulas): We improve a phylogenetic comparative method for correcting maladaptation. We compare our new model (OU-OU) to Hansen's model (BM-OU) [Hansen et al. 2008] by evaluating their ability to estimate phylogenetic inertia and the performance of the evolutionary and optimal regressions. Both models are built around an Ornstein-Uhlenbeck process; the main difference between OU-OU and BM-OU is the stochastic assumption on the optimum. Simulation studies will also assess the bias and power of the estimators for both models.

D. A Non-Gaussian Multivariate Distribution of Phylogenetic Tree Dependency (Independent project): Phylogenetic comparative methods (PCMs) incorporate information on the evolutionary relationships of organisms (phylogenetic trees) to compare species. However, trait values among different species can vary over so wide a range that the normality assumption common to PCMs cannot be applied to describe the data. Ignoring violations of the normality assumption and proceeding with the data analysis often leads to errors and can eventually change the conclusions. We develop robust parametric analytical methods and models of trait evolution that do not lead to multivariate normal data. We expect, through simulation, that our model can fit heavy-tailed data sets (log-normal and log-gamma data), retain good power when normality is violated, and outperform other PCMs that assume normality on a large collection of real comparative data sets.

E. Optimal Taxa Sampling Problem in Phylogenetic Comparative Methods (Independent project): Given a large molecular-clock-based phylogeny with N taxa, it is not always possible to sample all N species. Hence the question: if we can sample k species from these N, which should we choose? For a univariate analysis, this typically means finding the sample that minimizes the variance of the estimate of the mean. This problem will be investigated under various PCMs [FIC, PGLS, PMM, and PA]. Under different evolutionary assumptions, the structures of the similarity matrices for these PCMs are quite different, and the solution may not be the same if the answer depends on more than just the branching order.
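The sampling question in project E can be illustrated with a small brute-force sketch. It assumes a Brownian-motion model with unit rate, under which the covariance between two taxa equals their shared branch length from the root, and it uses the standard generalized least squares result that the variance of the estimated mean is 1 / (1ᵀ C⁻¹ 1) for covariance matrix C. The four-taxon ultrametric tree (((A,B),C),D), its branch lengths, and the function names are hypothetical, and exhaustive search is only practical for small N; this is not the method proposed in the project.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def mean_variance(cov, subset):
    """Variance of the GLS estimate of the mean using only taxa in subset."""
    sub = [[cov[i][j] for j in subset] for i in subset]
    weights = solve(sub, [1.0] * len(subset))  # C^{-1} 1
    return 1.0 / sum(weights)                  # 1 / (1^T C^{-1} 1)


def best_subset(cov, k):
    """Exhaustively search all k-subsets for the smallest mean variance."""
    return min(combinations(range(len(cov)), k),
               key=lambda s: mean_variance(cov, s))


# Hypothetical tree (((A,B),C),D) of total height 1:
# A and B share 0.8 of their history, both share 0.4 with C, none with D.
taxa = ["A", "B", "C", "D"]
cov = [
    [1.0, 0.8, 0.4, 0.0],
    [0.8, 1.0, 0.4, 0.0],
    [0.4, 0.4, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
best = best_subset(cov, 3)
print([taxa[i] for i in best], mean_variance(cov, best))
```

In this toy case the preferred sample keeps C and D and only one of the closely related pair A and B, since a second close relative adds little independent information about the mean. Under other PCMs the covariance matrix would be built differently, which is why the optimal sample may change.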