Comparative Transcriptome Analysis of Alpinia Oxyphylla Miq. Reveals
Total Page:16
File Type:pdf, Size:1020Kb
Yuan et al. BMC Genomic Data (2021) 22:19 BMC Genomic Data https://doi.org/10.1186/s12863-021-00973-4 RESEARCH ARTICLE Open Access Comparative transcriptome analysis of Alpinia oxyphylla Miq. reveals tissue-specific expression of flavonoid biosynthesis genes Lin Yuan1, Kun Pan1, Yonghui Li1,BoYi2* and Bingmiao Gao1* Abstract Background: Alpinia oxyphylla Miq. is an important edible and medicinal herb, and its dried fruits are widely used in traditional herbal medicine. Flavonoids are one of the main chemical compounds in A. oxyphylla; however, the genetic and molecular mechanisms of flavonoid biosynthesis are not well understood. We performed transcriptome analysis in the fruit, root, and leaf tissues of A. oxyphylla to delineate tissue-specific gene expression and metabolic pathways in this medicinal plant. Results: In all, 8.85, 10.10, 8.68, 6.89, and 8.51 Gb clean data were obtained for early-, middle-, and late-stage fruits, leaves, and roots, respectively. Furthermore, 50,401 unigenes were grouped into functional categories based on four databases, namely Nr (47,745 unigenes), Uniprot (49,685 unigenes), KOG (20,153 unigenes), and KEGG (27,285 unigenes). A total of 3110 differentially expressed genes (DEGs) and five distinct clusters with similar expression patterns were obtained, in which 27 unigenes encoded 13 key enzymes associated with flavonoid biosynthesis. In particular, 9 DEGs were significantly up-regulated in fruits, whereas expression of 11 DEGs were highly up-regulated in roots, compared with those in leaves. Conclusion: The DEGs and metabolic pathway related to flavonoids biosynthesis were identified in root, leaf, and different stages of fruits from A. oxyphylla. These results provide insights into the molecular mechanism of flavonoid biosynthesis in A. oxyphylla and application of genetically engineered varieties of A. oxyphylla. Keywords: Alpinia oxyphylla, Transcriptome analysis, Differentially expressed genes, Secondary metabolites, Flavonoid biosynthesis Background constituent of the four southern Chinese medicines in Alpinia oxyphylla Miq., a member of the Zingiberaceae China [2, 3]. The fruits of A. oxyphylla are widely used in family, is an important plant species for traditional the treatment of ulcerations, gastralgia, diarrhea, demen- Chinese medicine, which originates in the Hainan tia, diabetes, and Alzheimer’s disease [4–9]. Numerous Province and is widely cultivated in southern China [1]. studies have reported that the fruits of A. oxyphylla are The dried fruits of A. oxyphylla are regarded as a valuable rich in flavonoids, diarylheptanoids, terpenoids, volatile drug that has a long clinical history as a well-known oils, and steroids and their glycosides [10–13]. Among these compounds, flavonoids and terpenoids are the main * Correspondence: [email protected]; [email protected] active ingredients of A. oxyphylla fruits, which have been 2Department of Pharmacy, 928th Hospital of PLA Joint Logistics Support found to exert various pharmacological activities [13]. Force, Haikou 571159, China Usually, there are variations in the distribution of sec- 1Key Laboratory of Tropical Translational Medicine of the Ministry of Education, Hainan Key Laboratory for Research and Development of Tropical ondary metabolites in different tissues of higher plants Herbs, Hainan Medical University, Haikou 571199, China [14–16]. The concentration of chemical constituents was © The Author(s). 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Yuan et al. BMC Genomic Data (2021) 22:19 Page 2 of 10 comparable in roots and leaves of A. oxyphylla, but was unigenes for the whole transcriptome, and N50 was cal- significantly higher in fruits [17]. In addition, the content culated to be 1567 bp and 1073 bp and the mean lengths of chemical compounds in the fruits of A. oxyphylla har- were 916 bp and 658 bp. The average GC content of vested at different times indicates that the 45-day har- contigs and unigenes for the A. oxyphylla transcriptome vested fruit had the highest content of chemicals [17, were 43.76 and 43.78%, respectively (Table 1). 18] The metabolic processes and regulatory mechanisms of these chemical compounds in different tissues and Functional annotation and classification fruits at different stages have not yet been elucidated. To investigate the function of unigenes, annotation was The transcriptome is a complete set of RNA tran- performed based on four databases. A total of 50,401 scripts in a cell at a specific developmental stage, and unigenes were grouped into the databases, non- provides information on gene expression and regulation redundant protein (Nr) (47,745 unigenes), Universal Pro- related to a variety of cellular processes including sec- tein (Uniport) (49,685 unigenes), EuKaryotic Ortholo- ondary metabolite biosynthesis [19, 20]. With the devel- gous Groups (KOG) (20,153 unigenes), and Kyoto opment of next-generation sequencing, RNA sequencing Encyclopedia of Genes and Genomes (KEGG) (27,285 is an effective method for investigating the metabolic unigenes), respectively, while an additional 89,725 uni- pathways influenced by active ingredients and associated genes were not found in these databases. A detailed gene expression in different tissues or samples, such as comparison of the unigenes annotated by four different flavonoid biosynthesis in Ampelopsis megalophylla [21], databases are illustrated in Fig. 1. terpenoids metabolism in ginseng roots [22] and poly- GO analysis illustrated that 37,555 unigenes of A. oxy- saccharide and alkaloid content in Dendrobium [23]. To phylla were annotated into three categories: molecular date, there are no studies on the genetic modification of function (30,356), cellular component (20,203), and bio- A. oxyphylla either toward increased production of logical process (26,368), respectively (Supplementary secondary metabolites or biomass accumulation. There- Fig. 1 in Additional file 1). The binding (19,730) and fore, it is important to explore the whole genome tran- catalytic activity (17,452) functional groups were the scriptome of A. oxyphylla to identify candidate genes most prominent molecular functions. A total of 20,153 contributing to metabolic processes and regulatory unigenes of A. oxyphylla were further annotated and mechanisms. grouped into 25 molecular families in KOG database In this study, the differentially expressed genes (DEGs) (Supplementary Fig. 2 in Additional file 1). These mo- and metabolic pathway related to flavonoids biosynthesis lecular families were grouped into four categories: infor- were identified in root, leaf, and different stages of fruits mation storage and processing (5575), cellular processes from A. oxyphylla. Therefore, the results of this study and signaling (7377), metabolism (6180), and poorly may serve as a significant resource for developing genet- characterized (5803). For KEGG analysis, 29,211 uni- ically engineered varieties of A. oxyphylla with improved genes of A. oxyphylla had significant matches in the quality and yield. database and were assigned to five primary categories: cellular processes (3324), environmental information Results processing (2571), genetic information processing De novo assembly (5073), metabolism (13,599), and organismal systems The three tissue samples (fruits of different developmen- (4644) (Supplementary Fig. 3 in Additional file 1). A ma- tal stages, leaves, and roots) of A. oxyphylla were jority of unigenes were assigned to metabolism, and glo- sequenced using Illumina HiSeq 4000 which generated bal and overview maps had the highest number of approximately 29.50, 33.67, 28.93, 22.98, and 27.84 mil- annotated unigenes (5005). lion pair-end short reads with a length of 150 bp for early-fruits, middle-fruits, late-fruits, leaves, and roots, Differential gene expression analysis respectively. After filtering out low-quality reads and There were 35,278 DEGs identified between the leaf vs adapters, we obtained 8.85, 10.10, 8.68, 6.89, and 8.51 fruit sample, including 15,063 up-regulated and 20,215 Gb clean data for each sample, and the clean data ratio down-regulated DEGs in fruit (Fig. 2a). A total of 34,846 were estimated to be 99.84, 99.85, 99.84, 99.80, and DEGs were identified between root vs. fruit sample, in- 99.86%, respectively (Table 1). The lllumina reads have cluding 14,807 up-regulated and 20,039 down-regulated been deposited in the Sequence Read Archive (SRA) DEGs in fruit (Fig. 2b). There were 19,776 DEGs be- database at NCBI (https://www.ncbi.nlm.nih.gov/sra) tween