Chloroplast Genome of an Extremely Endangered Conifer Thuja sutchuenensis Franch.: Gene Organization, Comparative and Phylogenetic Analysis
Physiol Mol Biol Plants (March 2020) 26(3):409–418
https://doi.org/10.1007/s12298-019-00736-7

RESEARCH ARTICLE

Tao Yu1 • Bing-Hong Huang2 • Yuyang Zhang1 • Pei-Chun Liao2 • Jun-Qing Li1

Received: 5 May 2019 / Revised: 24 October 2019 / Accepted: 22 November 2019 / Published online: 1 January 2020
© Prof. H.S. Srivastava Foundation for Science and Society 2020

Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s12298-019-00736-7) contains supplementary material, which is available to authorized users.

Corresponding authors: Pei-Chun Liao, Jun-Qing Li
1 Forestry College, Beijing Forestry University, 35 Qinghua East Road, Haidian District, Beijing 100083, China
2 School of Life Science, National Taiwan Normal University, 88 Ting-Chow Rd., Sec. 4, Taipei 116, Taiwan

Abstract
Thuja sutchuenensis is a critically endangered tertiary relict species of Cupressaceae from southwestern China. We sequenced the complete chloroplast (cp) genome of T. sutchuenensis, which is 129,776 bp long and contains 118 unique genes: 82 protein-coding genes, 32 tRNA genes, and 4 rRNA genes. The genome structure, gene order, and GC content are similar to those of other typical gymnosperm cp genomes. Thirty-eight simple sequence repeats were identified in the T. sutchuenensis cp genome. We also found an apparent inversion between trnT and psbK that differs between the genera Thuja and Thujopsis. In addition, positive selection signals were detected in seven genes with high Ka/Ks ratios. The phylogeny reconstructed from locally collinear blocks of the cp genomes of 21 gymnosperm species is similar to previous inferences. We also inferred a Late Miocene divergence between T. sutchuenensis and T. standishii, dated to ~11.05 Mya from the cp genomes. These results will be helpful for future studies of Cupressaceae phylogeny as well as studies in population genetics, systematics, and cp genetic engineering.

Keywords Thuja sutchuenensis · Chloroplast genome · Sequence divergence · Non-synonymous substitution · Phylogenomics

Introduction
The chloroplast (cp) is a semiautonomous organelle in plants, encoding several key proteins involved in photosynthesis and in interactions between plants and the surrounding environment (Saski et al. 2007; Daniell et al. 2016). With the rapid development of high-throughput sequencing technologies, an increasing number of sequenced cp genomes can easily be acquired from online databases such as the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/genomes/) (Daniell et al. 2016). In almost all vascular plant species, the cp genome is between 120 and 160 kb long and contains about 130 genes (Chumley et al. 2006). In most gymnosperms, the cp genome is paternally inherited and exhibits very little or no recombination (Yi et al. 2013). Because of these relatively simple features, cp sequences are commonly used as DNA barcodes for genetic identification, in plant systematics, and in studies of plant biodiversity, biogeography, and adaptation (Wambugu et al. 2015; Brozynska et al. 2016).

Thuja sutchuenensis is a tertiary relict species of Cupressaceae (Tang et al. 2015). It is found in Sichuan Province and Chongqing Municipality in China, with an altitudinal range between 800 and 2100 m (mostly between 1000 and 1500 m) (Qiaoping et al. 2015). The species was first discovered by a French botanist, Paul Guillaume Farges, who collected specimens from 1892 to 1900, and it was described in 1899. It was not seen again for almost 100 years. T.
sutchuenensis was then classified as extinct in the wild (EW) until it was rediscovered in northeastern Chongqing in 1999 (Qiaoping et al. 2015). It was reassessed as critically endangered (CR) in 2003 (Qiaoping et al. 2015; Gao et al. 2018). Southwest China is rich in vegetation, but after decades of excessive deforestation and habitat destruction, the original forest of T. sutchuenensis was fragmented, and only a small area remains. Protection of this critically endangered plant requires adequate basic research.

The study of conservation genetics and the reconstruction of evolutionary history depend on suitable molecular markers. However, molecular research on T. sutchuenensis is scarce. Previous population genetics studies based on inter-simple sequence repeats (ISSR) can roughly describe genetic diversity but cannot comprehensively illustrate evolutionary history, owing to the limitations of dominant markers and the small number of loci used (Liu et al. 2013), particularly considering the large size of the T. sutchuenensis genome [1C = 12.10 pg (Zonneveld 2012)]. The large genome size makes the development of ample nuclear markers difficult. By contrast, homologous genes are easier to obtain from the small cp genome, which is also easier to sequence and provides comparable genetic information. In this study, we used an Illumina MiSeq platform to assemble the cp genome of T. sutchuenensis in order to (1) deepen the understanding of the cp genome structure of T. sutchuenensis and (2) develop effective molecular markers for T. sutchuenensis that can be applied to conservation genetics and evolutionary inference. We also reconstructed a phylogenomic tree with other published cp genomes of Cupressaceae species in order to obtain a more robust phylogenetic inference in the subfamily Cupressoideae.

Materials and methods

Plant materials and DNA sequencing
Young leaves from T. sutchuenensis trees in the Chinese Academy of Forestry were collected and dried immediately with silica gel for preservation. Whole genomic DNA was extracted using the plant genome DNA extraction kit (TIANGEN, Beijing, China) following the manufacturer's protocol. Paired-end libraries with an average insert size of approximately 350 bp were prepared for the Illumina MiSeq platform according to the Illumina standard method at Megagenomics Company (Beijing, China). Briefly, the DNA was fragmented by ultrasound. The fragment ends were then repaired and phosphorylated: DNA polymerase repaired the fragments into blunt-ended dsDNA, T4 polynucleotide kinase phosphorylated the 5′ ends, and an A tail was added to the 3′ ends. Sequencing adapters were ligated, and fragments within a narrow size range were selected according to the requirements of subsequent analysis. Finally, the target fragments were enriched to obtain the library for sequencing (Meyer and Kircher 2010). We generated 9.5 Gb of total data with a 150-bp average read length.

Genome assembly and annotation
Raw sequence data were filtered for high-quality reads using NGSQC (Dai et al. 2010) with default settings. Clean reads were assembled using MITObim v1.8 (Hahn et al. 2013) and NOVOPlasty (Dierckxsens et al. 2016), with the cp genome of T. standishii as the reference (Qu et al. 2017). SAMtools (Li et al. 2009) was used to recheck the sequences based on the degree of coverage. Gaps between the plastome contigs were filled using Sanger sequencing. The specific primers used for PCR are shown in Supplementary Table S1. We annotated protein-coding genes using CpGAVAS (Yong and Zheng 2012) and checked gene boundaries by comparison with the T. standishii cp genome using BLASTN at NCBI (http://www.ncbi.nlm.nih.gov). The error-corrected SQN file of the T. sutchuenensis cp genome sequence was submitted to GenBank (accession number: MH784400). The circular gene map of the cp genome (Fig. 1) was drawn with OrganellarGenomeDRAW (OGDRAW v1.2) (Lohse et al. 2007).

Fig. 1 Map of the chloroplast genome of Thuja sutchuenensis. Genes inside and outside of the circle are transcribed in the clockwise and counterclockwise directions, respectively. Genes belonging to different functional groups are color-coded. The dashed area in the inner circle indicates the GC content of the chloroplast genome (colour figure online)

Identifying cp SSRs
Simple sequence repeats (SSRs) in the T. sutchuenensis and T. standishii cp genomes were detected with the MISA Perl script (http://pgrc.ipk-gatersleben.de/misa/). The minimum number of repeat units was set to 10 for mono-, 6 for di-, and 5 for tri- to hexanucleotides.

Genome-wide homologous comparison and divergence of coding gene sequences
Whole cp genomes of T. sutchuenensis, T. standishii, and Thujopsis dolabrata were aligned in MAUVE (Darling et al. 2004) under default settings to test for rearrangement events. Coding genes in each cp genome were identified and aligned with MAFFT (Katoh et al. 2005). To identify positive selection in the T. sutchuenensis and T. standishii cp genomes, the synonymous (Ks) and non-synonymous (Ka) substitution rates of each coding gene were calculated in DnaSP 5.0 (Librado and Rozas 2009).

Phylogenetic analysis and divergence time estimation
Twenty-one cp genomes of Cupressoideae were used to reconstruct the phylogenomic tree. Species of subfamilies Taiwanioideae (Taiwania cryptomerioides and T. flousiana), Cunninghamioideae (Cunninghamia lanceolata), Sequoioideae (Metasequoia glyptostroboides), and Taxo-
analysis between locally collinear block sequences and T. sutchuenensis genes sequence. PhyML (Guindon et al. 2009) and MrBayes (Huelsenbeck