1 Ultra-Conserved Elements and Morphology Reciprocally Illuminate Conflicting
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/761874; this version posted September 9, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 1 Ultra-Conserved Elements and morphology reciprocally illuminate conflicting 2 phylogenetic hypotheses in Chalcididae (Hymenoptera, Chalcidoidea) 3 4 Astrid Cruaud*1, Gérard Delvare*1,2, Sabine Nidelet1, Laure Sauné1, Sujeevan 5 Ratnasingham3, Marguerite Chartois1, Bonnie B. Blaimer9, Michael Gates5, Seán G. Brady4, 6 Sariana Faure6, Simon van Noort7,8, Jean-Pierre Rossi1 and Jean-Yves Rasplus£1 7 8 1 CBGP, Univ Montpellier, CIRAD, INRA, IRD, Montpellier SupAgro, Montpellier, France. 9 2 CIRAD, UMR CBGP, F-34398 Montpellier, France. 10 3 Centre for Biodiversity Genomics, University of Guelph, Canada. 11 4 Department of Entomology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA 12 5 USDA, ARS, SEL, c/o Smithsonian Institution, National Museum of Natural History, Washington, USA. 13 6 Department of Zoology and Entomology, Rhodes University, Grahamstown, South Africa. 14 7 Research and Exhibitions Department, South African Museum, Iziko Museums of South Africa, P.O. Box 61, Cape Town, 15 8000, South Africa 16 8 Department of Biological Sciences, University of Cape Town, Private Bag, Rondebosch, 7701, Cape Town, South Africa 17 9 North Carolina State University, Raleigh, NC, USA 18 19 £Corresponding author: [email protected] 20 *equal contributors 21 22 Running headline = Conflicting phylogenomic hypotheses in Chalcididae 23 24 25 1 bioRxiv preprint doi: https://doi.org/10.1101/761874; this version posted September 9, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 26 ABSTRACT 27 Recent technical advances combined with novel computational approaches promised the 28 acceleration of our understanding of the tree of life. However, when it comes to hyperdiverse 29 and poorly known groups of invertebrates, studies are still scarce. As published phylogenies 30 will be rarely challenged by future taxonomists, careful attention must be paid to potential 31 analytical bias. We present the first molecular phylogenetic hypothesis for the family 32 Chalcididae, an emblematic group of parasitoid wasps, with a representative sampling (144 33 ingroups and 7 outgroups) that covers all described subfamilies and tribes and 82% of the 34 known genera. Analyses of 538 Ultra-Conserved Elements (UCEs) with supermatrix 35 (RAxML and IQTREE) and gene-tree reconciliation approaches (ASTRAL, ASTRID) 36 resulted in highly supported topologies in overall agreement with morphology but reveal 37 conflicting topologies for some of the deepest nodes. To resolve these conflicts, we explored 38 the phylogenetic tree space with clustering and gene genealogy interrogation methods, 39 analyzed marker and taxon properties that could bias inferences and performed a thorough 40 morphological analysis (130 characters encoded for 40 taxa representative of the diversity). 41 This joint analysis reveals that UCEs enable attainment of resolution between ancestry and 42 convergent /divergent evolution when morphology is not informative enough, but also shows 43 that a systematic exploration of bias with different analytical methods and a careful analysis 44 of morphological features is required to prevent publication of artefactual results. We 45 highlight a GC-content bias for ML approaches, an artefactual mid-point rooting of the 46 ASTRAL tree and a deleterious effect of high percentage of missing data on gene tree 47 reconciliation methods. Based on the results we propose a new classification of the family 48 into eight subfamilies and 10 tribes that lay the foundation for future studies on the 49 evolutionary history of Chalcididae. 50 2 bioRxiv preprint doi: https://doi.org/10.1101/761874; this version posted September 9, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 51 Key words: morphology, systematic bias, topological conflict, Ultra Conserved Elements. 52 53 54 INTRODUCTION 55 At a time when biodiversity studies are of critical importance (Dirzo et al. 2014, 56 Hallmann et al. 2017), efforts made to solve the tree of life are unequal between the different 57 groups of organisms. In animals, most attempts using pangenomic data focus on vertebrates, 58 for which past research has provided a solid framework based on external morphology, 59 anatomy, biology and fossils (Titley et al. 2017). Many teams continuously add to our 60 knowledge of vertebrate groups by performing phylogenomic studies to test previous 61 hypotheses and resolve long-standing taxonomic disputes. When it comes to invertebrate 62 groups, specifically to insects, which are the most speciose terrestrial organisms (Foottit and 63 Adler, 2009), the picture is quite different. Background knowledge is poor for most groups, 64 essentially based on a small number of morphological features, and only a few phylogenomic 65 hypotheses using representative but limited sampling have been published at the family level. 66 This is certainly a consequence of the so-called taxonomic impediment (Ebach et al. 2011, 67 Wägele et al. 2011), difficulty in adapting to new technologies and the inherent complexity of 68 working with hyperdiverse groups. Obtaining a representative taxon sampling is problematic 69 as new species and genera are constantly discovered, whereas some of the described taxa have 70 only been found once. In addition, sampling is complicated by the recent restrictive access 71 regulations to reduce the risk of supposed biopiracy (Prathapan et al. 2018). Finally, in many 72 groups, species complexes do exist that are difficult to untangle based on morphology alone. 73 Consequently, pangenomic data must be obtained from single, often tiny specimens, which is 74 technically challenging. 3 bioRxiv preprint doi: https://doi.org/10.1101/761874; this version posted September 9, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 75 Parasitic wasps and more precisely chalcid wasps (ca 500,000 species, Heraty et al. 76 2013) belong to these hyperdiverse and poorly known groups. As they naturally regulate 77 populations of others insects, chalcid wasps have a key functional role in the ecosystems 78 (Godfray 1994) and are frequently used as biological control agents (Consoli et al. 2010). 79 However, families of chalcid wasps are understudied and only a few family-wide, Sanger 80 based and poorly resolved phylogenetic hypotheses with reduced sampling have been 81 published (Chen et al. 2004, Desjardins et al. 2007, Owen et al. 2007, Burks et al. 2011, 82 Cruaud et al. 2012, Murray et al. 2013, Janšta et al. 2017). Consequently, classifications are 83 still based on morphological characters, although morphological convergence due to similar 84 biology is widespread (van Noort and Compton 1996, Heraty et al. 2013). 85 By reducing stochastic errors, genome-scale data offer greater opportunities to better 86 resolve phylogenetic relationships (Philippe et al. 2005). Phylogenomic trees are usually 87 highly supported but high statistical support is still often confused with accuracy (Lartillot et 88 al. 2007). However, as for reduced-size data sets, when the strongest signal that emerges from 89 the data is not the historical signal, models or methods can be misled and infer incorrect 90 topologies with high support (systematic error; Swofford et al. 1996, Phillips et al. 2004)). 91 With the increase in the number of gene regions, the probability to observe conflicting signal 92 between markers due to violation of model assumption also increases (Kumar et al. 2012, 93 Philippe et al. 2017) and total evidence approaches can lead to incorrect yet highly supported 94 trees. This is why it is crucial to interpret molecular results in the light of morphological and 95 biological data to point out possible contradictions (Wiens 2004). However, in the case of 96 poorly known groups, a feedback on morphological features is not straightforward. 97 Here, we were interested in testing to which extent a total evidence approach using 98 pangenomic data with or without knowledge on the morphology and biology of the target 99 group could lead to artefactual results and fill knowledge gaps. We focused on the 4 bioRxiv preprint doi: https://doi.org/10.1101/761874; this version posted September 9, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 100 Chalcididae (Delvare 2017), the type family of Chalcidoidea that comprises 1,548 described 101 species and 83 genera classified into six subfamilies and seven tribes (Noyes 2018) (Table 1). 102 The family is found on all continents except the polar regions but has its greatest diversity in 103 the tropics. Species diversity within genera varies greatly, and four genera (Antrocephalus, 104 Brachymeria, Conura, Hockeria) represent more than half (54%) of the species diversity 105 while 65 genera (78%) comprise less than five described species. The evolutionary history of 106 the Chalcididae has been the focus of a single study based on a limited sampling (22 taxa) and 107 a few (34) morphological characters (Wijesekara 1997b). Another morphological study has 108 addressed relationships within two tribes (Wijesekara 1997a).