bioRxiv preprint doi: https://doi.org/10.1101/2021.07.29.454256; this version posted July 30, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

A phylotranscriptome study using silica gel-dried leaf tissues produces an updated robust phylogeny of Ranunculaceae

Running title: RNA-seq using silica gel-dried tissues

Jian He1†, Rudan Lyu1†, Yike Luo1†, Jiamin Xiao1†, Lei Xie1*, Jun Wen2*, Wenhe Li1, Linying Pei4, Jin Cheng3

1 School of Ecology and Nature Conservation, Beijing Forestry University, Beijing, 100083 PR China
2 Department of Botany, National Museum of Natural History, MRC 166, Smithsonian Institution, Washington, DC 20013-7012, USA
3 Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100093 PR China
4 Beijing Engineering Technology Research Center for Garden Plants, Beijing Forestry University Forest Science Co. Ltd., Beijing, 100083, PR China

†These authors contributed equally to this work.
Correspondence: Lei Xie, email: [email protected]; Jun Wen, email: [email protected].

Abstract
The utility of transcriptome data in plant phylogenetics has gained popularity in recent years. However, because RNA degrades much more easily than DNA, the logistics of obtaining fresh tissues has become a major limiting factor for widely applying this method.
Here, we used Ranunculaceae to test whether silica-dried plant tissues could be used for RNA extraction and subsequent phylogenomic studies. We sequenced 27 transcriptomes, 21 from silica gel-dried (SD-samples) and six from liquid nitrogen-preserved (LN-samples) leaf tissues, and downloaded 27 additional transcriptomes from GenBank. Our results showed that although the LN-samples produced slightly better reads than the SD-samples, there were no significant differences in RNA quality and quantity, assembled contig lengths and numbers, and BUSCO comparisons between the two treatments. Using these data, we conducted phylogenomic analyses, including concatenated- and coalescent-based phylogenetic reconstruction, molecular dating, coalescent simulation, phylogenetic network estimation, and whole-genome duplication (WGD) inference. The resulting phylogeny was consistent with previous studies but had higher resolution and statistical support. The 11 core Ranunculaceae tribes grouped into two chromosome type clades (T- and R-types), with high support. Discordance among gene trees is likely due to hybridization and introgression, ancient genetic polymorphism, and incomplete lineage sorting. Our results strongly support one ancient hybridization event within the R-type clade and three WGD events in Ranunculales. Evolution of the three Ranunculaceae chromosome types is likely not directly related to WGD events. By clearly resolving the Ranunculaceae phylogeny, we demonstrated that SD-samples can be used for RNA-seq and phylotranscriptomic studies of angiosperms.

Keywords
chromosomal type, phylotranscriptomics, Ranunculaceae, RNA-seq, silica-dried leaf tissue

1 INTRODUCTION
With the recent advances in high-throughput sequencing and analytical methods, phylogenetic reconstruction using genome-wide sequence data has become widely used in plant evolutionary studies (Johnson et al., 2012; Yu et al., 2018). However, whole-genome sequencing for densely sampled phylogenetic analyses has remained impractical and unnecessary due to high costs and computational limitations. Hence, researchers have developed reduced-representation methods (e.g., genome skimming, restriction site-associated DNA sequencing or RAD-seq, target enrichment sequencing such as Hyb-seq, and transcriptome sequencing or RNA-seq) as practical tools for phylogenetic studies (Zimmer & Wen, 2015; McKain et al., 2018).

Genome skimming is one of the most widely applied partitioning strategies for phylogenetic inference, and is especially efficient for obtaining complete plastid genome sequences of plants (Dodsworth, 2015; Liu et al., 2018; Zhai et al., 2019; He et al., 2019; Liu et al., 2020; Wang et al., 2020). However, it is often of limited utility for obtaining enough single-copy nuclear genes for phylogenetic analyses, especially for non-model plant taxa that have huge genomes and no whole-genome reference (McKain et al., 2018; but see Liu et al., 2021). Hyb-seq is another widely used genome partitioning method that can target low-copy nuclear genes. Like genome skimming,
Like genome skimming, 61 Hyb-seq can use almost all kinds of tissue samples (e.g., silica gel-preserved, flash- 62 frozen, fresh, or even old herbarium materials) (Yu et al., 2018; McKain et al., 2018; 63 Reichelt et al., 2021; Wang et al., 2021). However, Hyb-seq requires a complex 64 laboratory protocol involving bait capture. Furthermore, this method often results in a 4 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.29.454256; this version posted July 30, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. 65 high proportion of missing data, and also cannot be used to detect ancient whole 66 genome duplication (WGD) based on paralogous genes. 67 In recent years, using RNA-seq to reconstruct phylogenetic relationships 68 (phylotranscriptomics) and gene family evolution has gained popularity because of its 69 relatively low cost and improved analytical pipelines (Wen et al., 2013, 2015; Wickett et 70 al., 2014; Landis et al., 2017; Zeng et al., 2017; One Thousand Plant Transcriptomes 71 Initiative, 2019; Cheon et al., 2020; Alejo-Jacuinde et al., 2020). Using that method, 72 researchers can often assemble thousands of genes (especially single-copy nuclear 73 genes) from plant taxa under study and use them for both inferring phylogenetic 74 relationships and gene family evolution (Yang and Smith, 2014; Yang et al., 2015; 75 Xiang et al., 2017). Compared to Hyb-seq, RNA-seq uses relatively simple 76 experimental protocols to generate more complete nuclear data, and paralogous genes 77 from RNA-seq data can also be used to infer WGD (McKain et al., 2018). 78 While improvements in sequencing and extraction protocols have made RNA-seq 79 much easier in plants (Romero et al., 2014; Yang et al., 2017), its application in 80 phylogenetic study still remains challenging. 
Because RNA is less stable than DNA, RNA-seq requires more stringent material preservation techniques. Previous studies have used fresh or liquid nitrogen flash-frozen plant tissues, or fresh tissue quickly soaked in RNA stabilization solution, subsequently preserved in an ultra-low temperature (-80 ℃) freezer (Yu et al., 2018; McKain et al., 2018; Dodsworth et al., 2019). However, because phylotranscriptomic studies often focus on non-model plant taxa, they usually require extensive field work and broad taxon sampling schemes. The logistics of using liquid nitrogen tanks in the field or expensive RNA stabilization solution to preserve collected plant tissues at multiple locations, not to mention quick access to an ultra-low temperature freezer for subsequent laboratory work, greatly limit the practicality of using RNA-seq for phylogenetic studies (Zimmer & Wen, 2015; Yang et al., 2017).

Traditional DNA-based molecular phylogenetics, DNA barcoding, as well as genome skimming and RAD-seq methods, often use silica gel-dried leaf tissues (Narzary et al., 2015; Yu et al., 2018). This sampling method is cheaper and amenable to collecting and transporting large numbers of samples. Much emphasis has been placed on how different plant tissues, preservation methods, and RNA extraction protocols may impact the quantity and quality of extracted RNA (Johnson et al., 2012; Romero et al., 2014; Yang et al., 2017), but no empirical study has explored silica gel-dried plant tissues for RNA-seq.
This may be because, even though total RNA may be extracted from silica gel-dried plant tissues, it is not possible to quantitatively measure gene expression from this kind of sample for evo-devo studies. However, a phylotranscriptomic study needs to obtain large nuclear data sets for the pertinent plant taxa for analysis, and it does not need to quantitatively measure gene expression.