A Consensus Phylogenomic Approach Highlights the Ancient Rapid Radiation of Ericales


bioRxiv preprint doi: https://doi.org/10.1101/816967; this version posted October 24, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Drew A. Larson1,4, Joseph F. Walker2, Oscar M. Vargas3 and Stephen A. Smith1

1Department of Ecology & Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
2Sainsbury Laboratory (SLCU), University of Cambridge, Cambridge, CB2 1LR, UK
3Department of Ecology & Evolutionary Biology, University of California, Santa Cruz, CA, 95064, USA
4Author for correspondence: D. A. Larson ([email protected])

Manuscript received _______; revision accepted _______.

Running Head: A consensus approach to resolving the Ericales phylogeny

ABSTRACT

Premise of study: Large genomic datasets offer the promise of resolving historically recalcitrant phylogenetic problems. However, different methodologies can yield conflicting results, especially when diversification occurs rapidly. Here, we used an array of dataset filtering strategies and species tree methods to infer a consensus topology of Ericales and explored sources of uncertainty associated with an ancient radiation.

Methods: We used a hierarchical clustering approach, along with tree-based homology and orthology detection, to generate multiple phylogenomic datasets. Support for species relationships was inferred from multiple lines of evidence, including shared gene duplications, gene tree conflict, gene-wise edge-based analyses, concatenation, and coalescent-based species tree methods, and these results were summarized in a consensus framework.

Key Results: Our consensus approach supported a topology largely concordant with the current theorized relationships, but suggested that the data are not capable of resolving several early relationships due to lack of informative characters, sensitivity to methodology, and extensive gene tree conflict. We find evidence of one or more paleopolyploidy events before the radiation of ericalean families that likely contributes to the high levels of gene tree conflict observed.

Conclusions: Our approach provides a novel hypothesis regarding the history of Ericales and confidently resolves several nodes. However, we also demonstrate that some ancient divergences are unresolvable with our current data. Whether this is because of rapid and unresolvable ancient speciation or lack of data needs to be addressed with additional data collection efforts. The role that paleopolyploidy plays in generating gene tree conflict warrants further investigation.

Keywords: consensus topology; Ericales; gene duplication; gene tree conflict; multifurcation; phylogenetic uncertainty; phylogenomics; polytomy; rapid radiation; whole genome duplication

INTRODUCTION

The flowering plant clade Ericales contains several ecologically important lineages that shape the structure and function of ecosystems including tropical rainforests (e.g. Lecythidaceae, Sapotaceae, Ebenaceae), heathlands (e.g. Ericaceae), and open habitats (e.g. Primulaceae) around the globe (ter Steege et al., 2006; Hedwall et al., 2013; He et al., 2014; Memiaghe et al., 2016; Moquet et al., 2017). With 22 families comprising ca. 12,000 species (Chase et al., 2016; Stevens, 2001 onward), Ericales are a diverse and disparate clade with an array of economically and culturally important plants. These include agricultural crops such as blueberries (Ericaceae), kiwifruits (Actinidiaceae), sapotas (Sapotaceae), Brazil nuts (Lecythidaceae), and tea (Theaceae), as well as ornamental plants such as cyclamens and primroses (Primulaceae), rhododendrons (Ericaceae), and phloxes (Polemoniaceae). The group has also given rise to multiple parasitic lineages and the carnivorous American pitcher plants (Sarraceniaceae). Although Ericales has been a well-recognized clade throughout the literature (Chase et al., 1993; Anderberg et al., 2002; Schönenberger et al., 2005; Rose et al., 2018), the evolutionary relationships among major clades within Ericales remain contentious.

One of the first molecular studies investigating these deep relationships used three plastid and two mitochondrial loci; the authors concluded that the dataset was unable to resolve major familial relationships (Anderberg et al., 2002). These relationships were revisited with 11 loci (two nuclear, two mitochondrial, and seven chloroplast), where maximum parsimony and Bayesian analyses provided support for the resolution of several early diverging lineages (Schönenberger et al., 2005). Recent work employing three nuclear, nine mitochondrial, and 13 chloroplast loci in a concatenated supermatrix consisting of 49,435 aligned sites and including 4,531 ericalean species, but with 87.6% missing data, provided the most statistically robust hypothesis to date (Rose et al., 2018). Despite the broad sampling utilized in Rose et al. (2018), many relationships remain poorly supported, including several deep divergences that seem to be the result of an ancient, rapid radiation. Moreover, these and other studies have recovered several conflicting interfamilial relationships, highlighting the need to investigate possible biological and methodological explanations for this topological incongruence (Anderberg et al., 2002; Bremer et al., 2002; Schönenberger et al., 2005).

Large genomic datasets can be used to help understand conflicting species tree results, and as these data have become more affordable, it is possible to conduct thorough investigations of elusive relationships across the Tree of Life. Yet, despite the increasing availability of genome-scale datasets, many relationships remain controversial, as research groups recover different answers for the same evolutionary questions, often with seemingly strong support (Shen et al., 2017). One benefit of genome-scale data for phylogenetics (i.e. phylogenomics) is the ability to examine conflicting signal within and among datasets, and a key finding in the phylogenomics literature has been the high prevalence of gene tree conflict at contentious nodes (Brown et al., 2017b; Reddy et al., 2017; Vargas et al., 2017; Walker et al., 2018a). The differences seen across previous analyses may in fact be the result of real biological conflict among the molecular data. However, biological sources of conflict (e.g. hybridization, incomplete lineage sorting, horizontal gene transfer) can also provide valuable information regarding the evolutionary history of lineages. By identifying this conflict, it becomes possible to locate where focused analyses are warranted and future sampling efforts might prove useful. One prominent method for examining genomic conflict is analyzing the relationships among gene trees, which may be obtained through a variety of methods including from transcriptome data (Yang and Smith, 2014). Transcriptomes provide data on hundreds to thousands of coding sequences per sample, and have elucidated many of the most contentious relationships in the green plant phylogeny (e.g. Simon et al., 2012; Wickett et al., 2014; Walker et al., 2017). Transcriptomes also provide the necessary information to identify gene and genome duplications. Gene duplications may be associated with important molecular evolutionary events but also represent a potentially informative event in a lineage’s history, and shared duplications can therefore provide additional evidence for how species are related. These multiple lines of evidence regarding evolutionary history can be applied as metrics of support to generate robust phylogenetic hypotheses.

In this study, we sought to understand the evolutionary history of Ericales and asked whether the available genome-scale data robustly supported the resolution of deep relationships in the clade. We leveraged
