A Phylogenomic Approach Resolves the Backbone of Prunus (Rosaceae) and Identifies Signals of Hybridization and Allopolyploidy
Molecular Phylogenetics and Evolution xxx (xxxx) xxx
Journal homepage: www.elsevier.com/locate/ympev

Richard G. J. Hodel *, Elizabeth Zimmer, Jun Wen
Department of Botany, National Museum of Natural History, MRC 166, Smithsonian Institution, Washington, DC 20013-7012, USA

Keywords: Allopolyploidy; Cytonuclear discord; Gene tree conflict; Hybridization; Phylogenomics; Phylotranscriptomics

ABSTRACT

The genus Prunus, which contains 250–400 species, has ample genomic resources for the economically important taxa in the group including cherries, peaches, and almonds. However, the backbone of Prunus, specifically the position of the racemose group relative to the solitary and corymbose groups, remains phylogenetically uncertain. Surprisingly, phylogenomic analyses to resolve relationships in the genus are lacking. Here, we assemble transcriptomes from 17 Prunus species representing four subgenera, and use existing transcriptome assemblies, to resolve key relationships in the genus using a phylogenomic approach. From the transcriptomes, we constructed 21-taxon datasets of putatively single-copy nuclear genes with 591 and 379 genes, depending on taxon-occupancy filtering. Plastome sequences were obtained or assembled for all species present in the nuclear data set. The backbone of Prunus was resolved consistently in the nuclear and chloroplast phylogenies, but we found substantial cytonuclear discord within subgenera. Our nuclear phylogeny recovered a monophyletic racemose group, contrasting with previous studies finding paraphyly that suggests repeated allopolyploidy early in the evolutionary history of the genus.
However, we detected multiple species with histories consistent with hybridization and allopolyploidy, including a deep hybridization event involving subgenus Amygdalus and the Armeniaca clade in subgenus Prunus. Analyses of gene tree conflict revealed substantial discord at several nodes, including the crown node of the racemose group. Alternative gene tree topologies that conflicted with the species tree were consistent with a paraphyletic racemose group, highlighting the complex reticulated evolutionary history of this group.

* Corresponding author (R. G. J. Hodel).
https://doi.org/10.1016/j.ympev.2021.107118
Received 29 September 2020; Received in revised form 1 February 2021; Accepted 8 February 2021; Available online 18 February 2021
1055-7903/© 2021 Elsevier Inc. All rights reserved.

1. Introduction

The genomic age has demonstrated that the resolution of phylogenetic relationships often requires many unlinked loci from the nuclear genome (Sang, 2002; Zimmer and Wen, 2015). It has become clear that relying on only a few loci, or many linked loci (i.e., chloroplast genomes) can lead to incorrect phylogenomic inferences due to idiosyncrasies or biases in linked organellar genomes (Moore, 1995; Rothfels et al., 2017). Processes such as incomplete lineage sorting, hybridization, polyploidy, and horizontal gene transfer complicate phylogenetic inference (Maddison and Wiens, 1997; Eaton and Ree, 2013; Folk et al., 2018; Knowles et al., 2018). Approximately one quarter of plant species have undergone hybridization (Rieseberg and Willis, 2007), and polyploidy appears to accompany at least 15% of speciation events in angiosperms (Wood et al., 2009), making both key mechanisms driving diversification and complicating phylogenetic inference. Instances of polyploidy and/or hybridization are ubiquitous in the Rosaceae (e.g., Wang et al., 2019; Liu et al., 2020), a well-studied clade of angiosperms.

In many understudied clades, collecting phylogenomic data is expensive and difficult, which explains their lack of resolved relationships. Meanwhile, some clades have ample genomic resources, but still lack phylogenetic resolution of key relationships. One notable example of the latter is Prunus (Rosaceae), an angiosperm genus of approximately 250–400 species that contains economically important taxa such as cherries, peaches, apricots, almonds, and plums. The genus also contains many evergreen and deciduous species occurring throughout the temperate regions of the northern hemisphere and in the tropics and subtropics. The genus is characterized by reproductive features such as superior ovary position and a monocarpellate pistil developing into a drupe, and vegetative characters including a solid pith and leaf glands (Chin et al., 2013; Chin et al., 2014). Within Prunus, there are three major groups defined by inflorescence architecture. These include the deciduous solitary flower group, which includes species such as apricot and peach, the deciduous corymbose group (i.e., cherries) with flowers arranged in a corymb or umbel, and the racemose group characterized by flowers borne on a raceme, and containing both tropical and temperate species and deciduous and evergreen representatives (Chin et al., 2013).

There are substantial genomic resources for the genus Prunus, with genome assemblies available for seven species: apricot (P. armeniaca; Jiang et al., 2019), sweet cherry (P. avium; Shirasawa et al., 2017), almond (P. dulcis; Sánchez-Pérez et al., 2019), Chinese plum (P. mume; Zhang et al., 2012), peach (P. persica; Verde et al., 2013), Japanese plum (P. salicina; Xue et al., 2019), and Yoshino cherry (P. yedoensis; Shirasawa et al., 2019). However, surprisingly, there has not been a phylogenomic investigation focusing on resolving key relationships within Prunus. Phylogenetic studies on Prunus have used several plastid markers, the nuclear ribosomal ITS, and a few nuclear loci and have produced some significant insights, but many questions remain (Bortiri et al., 2001, 2002; Lee and Wen, 2001; Wen et al., 2008; Chin et al., 2010, 2014). Phylogenetic relationships within Prunus are not fully resolved, particularly along the backbone of the phylogeny (Chin et al., 2014; Zhao et al., 2016; Xiang et al., 2017; Zhao et al., 2018; see Table 1). Cytonuclear discord and multiple copies of low-copy nuclear loci suggest that allopolyploidy and/or ancient hybridization is obscuring phylogenetic relationships (Zhao et al., 2016). The relationships among major groups within Prunus defined by inflorescence morphology (solitary, corymbose, and racemose lineages) are poorly resolved (Chin et al., 2014). Chloroplast data show a sister relationship between the solitary and corymbose groups, but nuclear data have not clearly resolved the relationship among the solitary, corymbose, and racemose groups.

Shi et al. (2013) used a concatenated matrix of 12 chloroplast genes, ITS, and two single-copy nuclear genes to infer with maximum support that the racemose lineage was sister to a clade of the solitary + corymbose lineages. The topology recovered by Shi et al. (2013) is concordant with chloroplast phylogenies from other studies (Chin et al., 2014), suggesting that it is possible the concatenation approach means that the signal from their data matrix was dominated by the

[...]

ASTRAL with 11 different gene sets. They recovered a monophyletic racemose lineage with their concatenation analysis and two of the ASTRAL datasets with few genes (113, 166 genes), but a paraphyletic racemose group in all other phylogenies. All of their ASTRAL phylogenies with greater than 256 genes indicated some degree of paraphyly in the racemose group. Although Xiang et al. (2017) were targeting genes to resolve the Rosaceae as opposed to Prunus, the levels of uncertainty along the backbone of Prunus in their study while using hundreds of nuclear genes highlight the need for a study focused on resolving Prunus.

A query of the NCBI sequence read archive (SRA; https://www.ncbi.nlm.nih.gov/sra) found that assembled transcriptomes and/or RNA-Seq data were available for 21 species in Prunus, with at least two species represented in each of the three major groups (Table 2). We have assembled transcriptomes to generate a phylogenomic dataset of hundreds of nuclear genes for 21 Prunus species plus an outgroup. Specifically, our goals are to: 1) use this phylogenomic dataset to resolve the relationships of the backbone of the genus; 2) identify key nodes in the genus with underlying gene tree conflict; 3) investigate the causes of observed conflict at these nodes.

2. Materials and methods

2.1. Dataset construction

We queried publicly available nucleotide sequence databases to generate a phylogenomic dataset for as many Prunus species as possible. Raw reads were downloaded from the NCBI SRA for 21 accessions of Prunus species plus one outgroup, Malus domestica (apple) (Table 2). Reads were either obtained from RNA-Seq or whole genome sequencing projects (Table 2). We downloaded transcriptomes from three Prunus species and the outgroup with annotated draft genomes (P. avium, P. mume, P. persica, and M. domestica). For species with available RNA-Seq data, but not assembled transcriptomes, we assembled transcriptomes for use in this project (N = 17); all analyses were run on the Smithsonian Institution High Performance Cluster (SI/HPC, "Hydra") (https://doi.org/10.25572/SIHPC). For these 17 species, we followed