Arch Biol Sci. 2017;69(1):181-190 DOI: 10.2298/ABS160123125P

De novo characterization of the Lycium ruthenicum transcriptome and analysis of its digital gene expression profiles during fruit development and ripening

Yong Peng1, Huiqin Ma2 and Shangwu Chen1,3,*

1 Beijing Laboratory for Food Quality and Safety, College of Food Sciences and Nutrition Engineering, China Agricultural University, Beijing, China
2 College of Horticulture, China Agricultural University, Beijing, China
3 Beijing Advanced Innovation Center for Food Nutrition and Human Health, China Agricultural University, Beijing, China

*Corresponding author: [email protected]
Received: January 23, 2016; Revised: March 16, 2016; Accepted: April 4, 2016; Published online: November 11, 2016

Abstract: Lycium ruthenicum Murr., which belongs to the family Solanaceae, is a resource for Chinese traditional medicine and nutraceutical foods. In this study, RNA sequencing was applied to obtain raw reads of L. ruthenicum fruit at different stages of ripening, and a de novo assembly of its transcriptome was performed. Approximately 52.45 million 100-bp paired-end raw reads were generated from the samples by deep RNA-seq analysis. These short reads were assembled into 164814 contigs, and the contigs were assembled into 84968 non-redundant unigenes using the Trinity method. Assembled sequences were annotated with gene descriptions, gene ontology (GO), clusters of orthologous groups (COG) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway terms. Digital gene expression analysis was applied to compare gene-expression patterns at different fruit developmental stages. These results add to the existing sequence resources for Lycium spp. during the fruit-ripening stages and are valuable for further functional studies of genes involved in the nutraceutical quality of L. ruthenicum fruit.

Key words: de novo characterization; digital gene expression profiles; fruit development; Lycium ruthenicum; transcriptome

© 2017 by the Serbian Biological Society. How to cite this article: Peng Y, Ma H, Chen S. De novo characterization of the Lycium ruthenicum transcriptome and analysis of its digital gene expression profiles during fruit development and ripening. Arch Biol Sci. 2017;69(1):181-90.

INTRODUCTION

L. ruthenicum Murr., from the genus Lycium of the family Solanaceae, is mainly distributed in the desert areas of northwestern China and Central Asia. In China, it is mainly found in Qinghai, Ningxia and Xinjiang provinces. It is an ideal plant for preventing soil desertification and alleviating soil salinity and alkalinity because of its drought and salt resistance, traits that are very important for the ecosystem and agriculture in arid areas [1].

The fruit of L. ruthenicum, a Chinese traditional herb, is used in Tibetan medicine for the treatment of heart disease, abnormal menstruation and menopause, as well as a nutraceutical food [2]. Studies on L. ruthenicum have mainly focused on chemical components such as anthocyanins [3-5], essential oils [6], polysaccharides [7] and their pharmacological properties [8,9]. Recently, an increasing number of studies have aimed to reveal the genetic information of L. ruthenicum in terms of shrub population and structure, sequence-related amplified polymorphism (SRAP) markers [10] and simple sequence repeat (SSR) markers [11]. These studies have enriched the genetic knowledge of L. ruthenicum. SSRs are variable and ubiquitous in eukaryotic genomes, and are useful tools for evaluating genetic diversity, population structure and fingerprint mapping [12,13]. Recently, next-generation sequencing (NGS) technology was created as an innovative approach for high-throughput sequence determination, and it has dramatically improved the speed and efficiency of gene discovery.

Peel color is a very important fruit characteristic that is mainly determined by the content of anthocyanins, which also confer beneficial properties to many fruits and vegetables [14]. L. ruthenicum contains abundant acylated anthocyanins such as petunidin diglucoside, petunidin glucoside and malvidin glucopyranoside [15]. The anthocyanin pathway is regulated by the MYB-bHLH-WD40 transcription factor complexes comprising R2R3-MYB, basic-helix-loop-helix (bHLH) and WD-repeat (WDR) proteins. Meanwhile, the R3-MYB and R2R3-MYB repressors play a role in reducing anthocyanin production [16].

The aim of this study was to uncover transcriptome information for L. ruthenicum fruit and compare its digital gene expression (DGE) profiles at various developmental stages. Initially, we sequenced the transcriptome of L. ruthenicum fruit with Illumina sequencing technology. Furthermore, we compared the gene expression profiles of fruits among different fruit ripening stages by the digital gene expression system, aiming to characterize the transcriptome related to the biosynthesis of anthocyanins in L. ruthenicum. The assembled, annotated transcriptome sequences and gene expression profiles provide valuable information for a better understanding of gene expression patterns at different fruit developmental stages of L. ruthenicum.

MATERIALS AND METHODS

Plant material

L. ruthenicum fruits were collected in a semi-wild, semi-desert area near Yinchuan, Ningxia province, China. The fruits were harvested at different developmental stages: the green fruit peel period (P1), initiation of peel color change (P2) and the fully mature, black peel period (P3) (Fig. 1). Collected fruits were cleaned, frozen in liquid nitrogen and stored at -80°C.

Fig. 1. The phenotype of L. ruthenicum fruits at different stages. Green fruit peel period (A), initiation of peel color change (B), fully mature black peel (C).

RNA extraction and RNA sequencing

Total RNA was extracted from the whole fruit by the CTAB method [17] and treated with DNaseI (Takara Biotech Incorporation) to remove contaminating DNA. The cDNA library for the reference transcriptome was prepared by pooling the RNA extracted from the different fruit-ripening stages (P1, P2 and P3). Beijing Genomics Institute (BGI, Shenzhen, China) performed the subsequent enrichment of mRNAs, fragmentation, addition of adapters, PCR amplification and RNA sequencing according to the next-generation RNA-seq protocol and manufacturer’s instructions [18-20]. The quality and integrity of the RNA samples were examined using the Agilent 2100 Bioanalyzer, and samples with RNA integrity number (RIN) values higher than 7.0 were used. Poly(A) mRNA was isolated using oligo(dT)-coupled magnetic beads (Illumina), the mRNA was fragmented, and cDNA was synthesized using random hexamer primers (Sangon Biotech) and reverse transcriptase (Invitrogen). The libraries were sequenced using the Illumina HiSeq 2000 platform after end-repair, adaptor ligation and PCR amplification. Libraries that gave reads unevenly distributed among the gene regions were discarded and replaced.

Assembly and annotation of the reference transcriptome

The raw sequencing reads were first filtered by removing invalid and low-quality reads, which included reads with adaptor contamination, reads with N percentages greater than 5% and reads in which more than 50% of the bases had a quality lower than the Q20 level (an error probability of 1%). The short reads were first assembled into longer but gapless contigs using Trinity software [20], and the contigs were further assembled to obtain the unigenes that could not be extended on either end.

We used Blast2GO to obtain gene ontology (GO) annotations of the unigenes [21], classified the unigenes’ GO functions using WEGO software [22] and connected some unigenes with metabolic pathways using the KEGG database [23] (E-value ≤1.0E-5). The sequence direction of the unigenes was determined according to the best alignment results. When the results conflicted among databases, the direction was determined successively by the NCBI non-redundant protein sequence database (NR), Swiss-Prot, KEGG and clusters of orthologous groups (COG). When a unigene did not align to any database, we applied ESTScan to predict coding regions and determine sequence direction [24]. SSRs were screened by the SSR detecting program MicroSAtellite (MISA) (http://pgrc.ipk-gatersleben.de/misa/misa.html). The criteria for SSR identification were as follows: 12 nt for mono-nucleotide, di-nucleotide and tri-nucleotide SSRs; 16 nt for tetra-nucleotide SSRs; 20 nt for penta-nucleotide SSRs; and 24 nt for hexa-nucleotide SSRs.

Preparation and sequencing of DGE library

After extracting total RNAs from the samples, mRNAs were enriched from the total RNAs using oligo(dT) magnetic beads (Illumina). Using the fragmentation buffer (Life Technologies), the mRNAs were fragmented into short fragments (about 200 bp), then the first-strand cDNAs were synthesized by random hexamer-primer (Sangon Biotech) using the mRNA fragments as templates. Buffer, dNTPs, RNase H and DNA polymerase I (Invitrogen) were added to synthesize the second strand. The double-strand cDNAs were purified with a QiaQuick PCR extraction kit and washed with ethidium bromide (0.5 μg/ml) buffer for end repair and poly(A) addition. Finally, sequencing adaptors were ligated to the fragments. The fragments were purified by agarose gel electrophoresis and enriched by PCR amplification using adapters as primers. The library products were sequenced using the Illumina HiSeq 2000 platform.

DGE analysis and mapping

The original image data were transferred into sequence data via base calling. Filtering and quality control checks (base composition analysis, per base sequence quality and per sequence quality scores) were performed using FastQC (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/). We filtered out the reads with adapters, reads with more than 5% unknown bases and low-quality reads in which more than 50% of the bases had a quality lower than the Q20 level.

Clean reads were mapped to reference sequences using the SOAP2 [25] aligner. No more than 2 mismatches were allowed in the alignment. Quality control of alignment included the statistics of alignment analysis, sequencing saturation analysis, distribution of reads on reference genes analysis and coverage analysis. For gene expression analysis, we used the RPKM (Reads Per Kilobase of exon model per Million mapped reads) [26] method to calculate the gene expression level. Based on gene expression, we selected differentially expressed genes (DEGs) and carried out further functional analysis.

Evaluation of DGEs

We analyzed the expression levels to compare the gene expression in three different fruit ripening periods. We used “false discovery rate (FDR) ≤0.001 [27] and the absolute value of log2 Ratio >1” as the threshold to judge the significance of the differences in gene expression. In the pathway analysis, we mapped the differentially expressed genes in the KEGG database and identified significantly enriched genes related to metabolic or signal-transduction pathways.

RESULTS

Sequencing and assembly of clean reads

We obtained an overview of the gene-expression profiles of L. ruthenicum fruit at different stages of ripening. Approximately 52.45 million 100-bp paired-end raw reads were generated from the L. ruthenicum fruit by deep RNA-seq analysis. After filtering out the repetitive, low-complexity and low-quality reads, we obtained 51.39 million clean 100-bp reads. These short reads were assembled into 164814 contigs ranging from 200 to over 3000 bp, with a mean size of 323 bp. Q20 level and GC content were 99.01% and 45.78%, respectively. The contigs were assembled into 84968 non-redundant unigenes with a mean size of 610 bp using the Trinity method for paired-end joining and gap-filling. The 84968 unigenes’ total consensus sequences were divided into 9377 distinct clusters and 64432 distinct singletons after clustering.

Functional annotation of unigenes using public databases

The functions of a protein can be predicted from the annotation of known proteins in the NCBI NR, Swiss-Prot and KEGG databases. We obtained 33868 unigenes after mapping unigene sequences to the Swiss-Prot database (Table 1). BLASTx was used to annotate the sample’s unigenes against the NCBI NR protein database with an E-value of less than 1.0E-5, yielding 50880 genes. The E-value distribution statistics revealed that 46.4% of the mapped sequences have strong homology (E-value smaller than 1.0E-60), and 53.6% of the sequences have E-values ranging between 1.0E-5 and 1.0E-60. The similarity distribution showed that 63.6% of the sequences have higher than 80% similarity, and the highest percentage of unigenes mapped to Solanum tuberosum sequences (Table 2).

Table 1. Summary of the annotation of the L. ruthenicum fruit transcriptome against different databases.

Unigenes in database        Number    Percentage
Total number of unigenes    84968     100
NR                          50880     59.88
NT                          54876     64.58
Swiss-Prot                  33868     39.86
COG                         17133     20.16
GO                          30874     36.34
KEGG                        26269     30.92

NR: NCBI non-redundant protein sequences; NT: NCBI non-redundant nucleotide sequences; COG: Clusters of Orthologous Groups; GO: Gene Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes.

Table 2. Characteristics of the homology search of L. ruthenicum unigenes against the NCBI NR database.

Distribution               Category                                 Percentage
E-value distribution       0                                        20.94
                           0~1E-100                                 13.26
                           1E-100~1E-60                             12.25
                           1E-60~1E-45                              8.01
                           1E-45~1E-30                              13.1
                           1E-30~1E-15                              18.68
                           1E-15~1E-5                               13.76
Species distribution       Solanum tuberosum                        28.36
                           Solanum lycopersicum                     15.05
                           Pyrenophora teres f. teres 0-1           1.7
                           Pyrenophora tritici-repentis Pt-1C-BFP   1.67
                           Nicotiana tabacum                        1.35
                           Vitis vinifera                           1.22
                           Others                                   15.31
Similarity distribution    18%~40%                                  3.76
                           40%~60%                                  10.55
                           60%~80%                                  22.1
                           80%~95%                                  50.26
                           95%~100%                                 13.33

The unigenes were annotated with Enzyme Commission (EC) numbers from BLASTx alignments against the KEGG database (E-value ≤1.0E-5) to determine the biological pathways in L. ruthenicum fruit. KEGG analysis assigned 26269 unigenes to approximately 38635 items and 128 known metabolic or signaling pathways, of which 6330 (24.1%) unigenes were related to metabolic pathways, 3380 (12.87%) to biosynthesis of secondary metabolites, 1479 (5.63%) to plant-pathogen interactions, 1404 (5.34%) to RNA transport, 1304 (4.96%) to plant hormone signal transduction and 497 (including 420 individual unigenes) to flavonoid-biosynthesis pathway categories, closely related to flavonoid, flavone, flavonol, isoflavonoid and anthocyanin biosynthesis for nutraceutical quality formation.

Fig. 2. COG functional classification of L. ruthenicum sequences. The x-axis indicates the number of unigenes in each category and the y-axis indicates the COG categories. In total, 17133 sequences were assigned to 25 COG classifications.
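The RPKM normalization and the DEG thresholds stated in the Methods (FDR ≤0.001 and absolute log2 ratio >1) can be illustrated with a minimal Python sketch. It assumes the standard RPKM definition of Mortazavi et al. [26], RPKM = 10^9 × C / (N × L), where C is the number of reads mapped to a unigene, N is the total number of mapped reads in the library and L is the unigene length in bp. The function names, the example numbers and the small floor that keeps the ratio defined for unexpressed genes are illustrative assumptions and are not taken from the pipeline used in this study. The FDR is treated as a precomputed input.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def classify_deg(rpkm_ref, rpkm_test, fdr, floor=0.01):
    """Return 'up', 'down' or None using FDR <= 0.001 and |log2 ratio| > 1.

    `floor` is a hypothetical pseudo-value that replaces zero or very low
    RPKM so that the ratio stays defined; it is an assumption of this sketch.
    """
    log2_ratio = math.log2(max(rpkm_test, floor) / max(rpkm_ref, floor))
    if fdr <= 0.001 and abs(log2_ratio) > 1:
        return "up" if log2_ratio > 0 else "down"
    return None


# Made-up example: one 1000-bp unigene in two libraries,
# each with 10 million mapped reads.
earlier = rpkm(100, 1000, 10_000_000)
later = rpkm(500, 1000, 10_000_000)
print(earlier, later, classify_deg(earlier, later, fdr=0.0005))
```

Applying the FDR cutoff together with the fold-change cutoff means that a large ratio alone is not enough: a gene with a fivefold change but an FDR of 0.01 would not be reported as a DEG under these thresholds.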

Fig. 3. GO annotation analysis for L. ruthenicum unigenes. The results are classified in three main categories: biological process, cellular component and molecular function. A total of 30874 unigenes of the pooled samples of L. ruthenicum fruit were assigned for GO annotation and functional classification. The left y-axis indicates the percentage of a specific category of genes in the main category. The right y-axis indicates the number of genes in a category.
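The minimum tract lengths used for SSR identification in the Methods (12 nt for mono-, di- and tri-nucleotide repeats, 16 nt for tetra-, 20 nt for penta- and 24 nt for hexa-nucleotide repeats) can be sketched as a small perfect-repeat scanner. This is a simplified Python illustration and not the MISA program itself: the function names are hypothetical, only perfect (uninterrupted) repeats are reported, and compound SSRs are not handled.

```python
import re

# Minimum tract length in nt for each motif size, as stated in the Methods.
MIN_TRACT_NT = {1: 12, 2: 12, 3: 12, 4: 16, 5: 20, 6: 24}


def is_primitive(motif):
    """True when the motif is not a repeat of a shorter unit (e.g. ATAT = AT x 2)."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_nt in MIN_TRACT_NT.items():
        min_repeats = -(-min_nt // k)  # ceiling division
        # One motif copy followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


print(find_ssrs("TT" + "AG" * 6 + "CC"))
```

Skipping non-primitive motifs prevents the same tract from being counted twice, for example an (AT)8 tract being reported again as (ATAT)4.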

COG and GO classification

We aligned the unigenes to the COG database to predict their possible functions and analyzed their functional classification. A total of 17133 unigenes were assigned to 25 COG classification groups (Fig. 2). Among these groups, the “general function prediction only” cluster (5019; 29.29%) ranked as the largest group, followed by “transcription” (2552; 14.9%), “posttranslational modification, protein turnover, chaperones” (2524; 14.73%) and “replication, recombination and repair” (2458; 14.34%). Secondary metabolites and related transporters ranked in the third group. However, only a few unigenes were matched to the “nuclear structure” (4; 0.02%) and “extracellular structures” (21; 0.12%) categories. In addition, 1588 (9.27%) unigenes were related to “function unknown” and could be undiscovered genes in L. ruthenicum.

GO is the system for sorting gene descriptions according to three ontology categories and functional subcategories. A total of 30874 unigenes of the pooled sample of L. ruthenicum fruit were assigned for GO annotation and functional classification using Blast2GO and WEGO. In the biological process (BP) category, the major function terms were “metabolic process” (20915 terms), “cellular process” (18094), “single-organism process” (14247) and “response to stimulus” (7152). Within the cellular component (CC) category, most unigenes were assigned to “cell” (15606 terms), “cell part” (15606), “organelle” (11579) and “membrane” (7846). In the molecular function (MF) category, the dominating function terms were “binding” (16960), “catalytic activity” (16595), “transporter activity” (1861) and “structural molecule activity” (1025) (Fig. 3).

Frequency and distribution of SSRs

All of the unigenes were analyzed for SSR motifs, and 5172 SSRs were identified in 4653 (5.48%) unigenes, with an average frequency of one SSR per 10.01 kb. Most unigenes contained 1 SSR, while there were 499 unigenes containing multiple SSRs with up to 212 SSRs present in compound formation. The trinucleotide repeats were most abundant (38.5%), followed by mononucleotide (31.11%) and dinucleotide repeats (26.84%) in various groups of SSRs. The hexanucleotide (1.82%), pentanucleotide (0.95%) and tetranucleotide (0.79%) repeats represented a smaller proportion of the motif classes. Among the dinucleotide repeats, AG/CT was the most abundant (59.8%), followed by AC/GT (25.07%) and AT/AT (14.7%). The trinucleotide repeats, including AAG/CTT, AAC/GTT, ATC/ATG and ACC/GGT, were predominant in this part (Fig. 4).

Fig. 4. Distribution and frequency of SSR classification in different nucleotide types of L. ruthenicum. The x-axis indicates the SSR type. The y-axis indicates the number of SSR motifs in each category. A total of 5172 SSRs were identified in 4653 unigenes.

Sequences mapping to the reference transcriptome database

We mapped the tag sequences of the three fruit-stage libraries to the fruit-reference library of L. ruthenicum with the DGE method. Approximately 72544, 68100 and 75212 unigenes were mapped to the reference database in the P1, P2 and P3 fruit stages, respectively. For all fruit-stage DGEs, the number of detected genes vs. the sequenced read amount was displayed as a saturation function curve when the number of clean reads reached above 10 million. Among the 20 most abundantly expressed genes of the P3 stage of L. ruthenicum, genes such as cytochrome P450, 2S sulfur-rich storage protein, lipid transfer protein LTP1 precursor, flower-specific defensin, chitin-binding lectin 1 and pathogenesis-related proteins were expressed in all three fruit stages.

Differential expression profiles at the different fruit stages

To obtain the digital expression profile of the DEGs among the three fruit development stages of L. ruthenicum, we used the RPKM method to perform gene expression analysis pairwise among the P1, P2 and P3 libraries. In the “P2 vs. P1,” “P3 vs. P1” and “P3 vs. P2” comparisons we identified 9479, 15953 and 10344 DEGs, respectively, under FDR ≤0.001 and absolute value of log2-ratio >1 (Fig. 5). In total, 6761 upregulated and 2718 downregulated genes were detected in P2 vs. P1, 6178 upregulated and 9775 downregulated genes in P3 vs. P1, and 1760 upregulated and 8584 downregulated genes in P3 vs. P2 (Table 3). In addition, 2943, 286 and 6155 genes were uniquely expressed in the P1, P2 and P3 stages, respectively, while 62232 genes were expressed during all three stages.

Fig. 5. Venn diagram of uniquely expressed genes pairwise among the P1, P2 and P3 fruit stages of L. ruthenicum. The sum of the numbers in each large circle represents the total number of differentially expressed genes between different stages; the overlapping part of the circles represents common differentially expressed genes among combinations.

Table 3. Differential expression of genes in different stages of L. ruthenicum.

Compared groups    Differentially expressed genes    Up-regulated    Down-regulated
P2/P1              9479                              6761            2718
P3/P1              15953                             6178            9775
P3/P2              10344                             1760            8584

The differential expression profiles obtained by pairwise comparison were further delineated by analyzing the DEGs for enriched GO terms in the GO domains of BP, CC and MF and for enriched KEGG pathway ontology (KO) terms illustrating the fruit’s developmental changes. For fruit stages P2 vs. P1, a total of 156 functional categories were found changed for GO terms by DEGs, including 81 in BP, 41 in CC and 34 in MF. For P3 vs. P1 fruit stages, a total of 246 functional groups were found changed, and in the P3 vs. P2 comparison, 333 changed functional groups were found, including 188, 79 and 66 groups for BP, CC and MF, respectively. The parallel DEGs assigned KO terms for enriched KEGG pathway annotation with pairwise comparisons of the three fruit ripening stages of L. ruthenicum indicated significant changes in biosynthesis of secondary metabolites and metabolic pathways during fruit development and ripening.

A total of 458 DEGs related to the “flavonoid biosynthesis” (126), “phenylpropanoid biosynthesis” (156) and “starch and sucrose metabolism” (176) categories were detected during the ripening stages according to gene annotation and biosynthesis pathway analysis (P ≤0.05). Hierarchical cluster analysis of the DEGs at the different ripening stages was performed to examine the similarity and diversity of gene expression. We clustered the three categories of DEGs mentioned above for the different fruit ripening stages. The expression of genes related to flavonoid biosynthesis was downregulated in the P2 stage and upregulated in the P3 stage (Fig. 6A). In addition, the overall trend in expression level of genes related to phenylpropanoid biosynthesis was ascending (Fig. 6B), while the overall trend in expression level of genes related to starch and sucrose metabolism successively declined during fruit ripening (Fig. 6C).

Fig. 6. Hierarchical cluster analysis of differentially expressed transcripts of L. ruthenicum fruits. Clustering of (A) flavonoid biosynthesis, (B) phenylpropanoid biosynthesis-related and (C) starch and sucrose metabolism genes expressed in three ripening stages. Red and green represent high and low levels of gene expression, respectively. All data were normalized by standard normal distribution. The different ripening stages are shown in the columns and the unigenes in the rows.

The expression levels of leucoanthocyanidin dioxygenase, flavonoid 3-glucosyltransferase precursor, anthocyanidin 3-O-glucoside 5-O-glucosyltransferase and flavonoid 3’,5’-hydroxylase [28,29] (Fig. 7A-D), genes related to pigment formation and color, were considerably higher at the fully ripe stage. These increments were positively correlated with the expression of R2R3-MYB [30,31] (Fig. 7E), the plant transcription factor which is highly correlated to anthocyanin accumulation, while the expression of translation elongation factor 1, used as a housekeeping gene, was nearly equal during all three stages (Fig. 7F).

Fig. 7. Analyses of differentially expressed genes during L. ruthenicum fruit development. The gene expression levels of (A) leucoanthocyanidin dioxygenase; (B) flavonoid 3-glucosyltransferase precursor; (C) anthocyanidin 3-O-glucoside 5-O-glucosyltransferase; (D) flavonoid 3’,5’-hydroxylase; (E) R2R3-MYB transcription factor; (F) translation elongation factor 1 were determined by calculating the RPKM of each gene.

DISCUSSION

Next-generation sequencing technology has emerged as a timesaving approach in genetic analysis, comparative genomics and functional genomics of non-model organisms. Genome information on L. ruthenicum fruit or related species was absent. In this study, the RNA-seq data from different ripening stages of L. ruthenicum fruit provide a comprehensive resource for DEG searches and for research on the molecular mechanism of anthocyanin synthesis.

The annotation results imply that the Illumina HiSeq 2000 sequencing project in this study produced a substantial number of assembled transcripts of L. ruthenicum fruits. The species distribution returned from the NR database demonstrates that the transcriptome of L. ruthenicum is more closely related to that of Solanum tuberosum than to other plant genomes present in current public databases.

Overall, “biosynthesis of secondary metabolites” accounted for about half of the unigenes related to metabolic pathways in KEGG data mapping, suggesting that the secondary metabolic processes are active pathways in L. ruthenicum fruit development. It is noticeable that 903 unigenes (5.27%) were classified into the group of “secondary metabolite biosynthesis, transport and catabolism” in COG categories, implying that those secondary metabolite processes play crucial roles in L. ruthenicum fruit development.

Among the dinucleotide SSRs, the most abundant motif was AG/CT, similar to the situation in a broad array of species such as Capsicum annuum [32,33], Solanum melongena [13], wheat [34] and barley [35]. AAG/CTT was the most common trimeric motif present in L. ruthenicum, a situation that almost matches that in peppers (Capsicum) [36], tomato [37], peanut [38] and radish [39]. The SSRs recognized in the present research provide an abundant resource for L. ruthenicum molecular marker studies.

Based on the DGE data of L. ruthenicum, multiple genes exhibited different expression patterns in different ripening stages. A total of 2943, 286 and 6155 genes were uniquely expressed in stages P1, P2 and P3, respectively, implying that gene expression is more active in the fully ripe stage than in the previous stages.

In L. ruthenicum, nearly all the structural and regulatory genes involved in the anthocyanin pathway were significantly upregulated at the fully ripe stage [29]. The same trend was documented in this study, where the anthocyanin accumulation-related genes were increasingly expressed during fruit ripening, reaching significantly increased expression at the fully ripe stage.

Fruit skin coloration is a unique pattern in the life cycle of fruiting and is mainly attributed to anthocyanin pigments [40]. The content of anthocyanins and the color of the fruit skin are related; the increase in anthocyanin biosynthesis-related gene expression causes anthocyanin accumulation and results in fruit color deepening. In an anthocyanin biosynthesis-related study, anthocyanin profiling confirmed that anthocyanins were increasingly accumulated during fruit ripening in L. ruthenicum, and sharply increased in fully expanded mature fruit [29]. On the other hand, for the starch and sucrose metabolism-related DEGs, the gene expression decreased during L. ruthenicum fruit development. This suggests that during fruit ripening sucrose synthesis shows a trend of initial increase followed by a decrease [41] and would be reduced in fully ripened fruit.

Availability of supporting data

All clean reads generated by Illumina sequencing have been deposited in the Sequence Read Archive (SRA) database (http://www.ncbi.nlm.nih.gov/sra) under the accession no. ID SRP071022.

Acknowledgments: We thank the Agriculture and Animal Husbandry Bureau of Yinchuan City for providing L. ruthenicum fruit samples.

Authors’ contribution: PY, MHQ and CSW conceived and designed the experiments, contributed reagents/materials/analysis tools and wrote the paper. PY performed the experiments and analyzed the data.

Conflict of interest disclosure: The authors declare that no competing interests exist.

REFERENCES

1. Chen H, Pu L, Cao J, Ren X. Current Research State and Exploitation of Lycium ruthenicum Murr. Heilongjiang Agr Sci. 2008;5:155-7.
2. Gan QM LG, Li PY, Zhuoma DZ, Chen YR, Zuo ZC. Study on the development and utilization of Tibetan medicine Lycium ruthenicum Murr. Qinghai Sci Tech. 1997;4(1):17-9.
3. Zheng J, Ding C, Wang L, Li G, Shi J, Li H, Wang HL, Suo YR. Anthocyanins composition and antioxidant activity of wild Lycium ruthenicum Murr. from Qinghai-Tibet Plateau. Food Chem. 2011;126(3):859-65.
4. Li J, Qu W, Zhang S, Lv H. Study on antioxidant activity of pigment of Lycium ruthenicum. China J Chinese Mater Med. 2006;31(14):1179-83.
5. Kosar M, Altintas A, Kirimer N, Baser KHC. Determination of the free radical scavenging activity of Lycium extracts. Chem Nat Compd. 2003;39(6):531-5.
6. Altintas A, Kosar M, Kirimer N, Baser K, Demirci B. Composition of the essential oils of Lycium barbarum and L. ruthenicum fruits. Chem Nat Compd. 2006;42(1):24-5.
7. Wang JH, Chen XQ, Zhang WJ. Study on hypoglycemic function of polysaccharides from Lycium ruthenicum Murr. fruit and its mechanism. Food Sci. 2009;30(5):244-8.
8. Peng Q, Liu H, Shi S, Li M. Lycium ruthenicum polysaccharide attenuates inflammation through inhibiting TLR4/NF-κB signaling pathway. Int J Biol Macromol. 2014;67:330-5.
9. Li J, Yuan H, Zeng X-C, Han B, Shi D-H. Toxicological assessment of pigment of Lycium ruthenicum Murr. Food Sci. 2007;28(7):470-5.
10. Liu Z, Shu Q, Wang L, Yu M, Hu Y, Zhang H, Tao Y, Shao Y. Genetic diversity of the endangered and medically important Lycium ruthenicum Murr. revealed by sequence-related amplified polymorphism (SRAP) markers. Biochem Syst Ecol. 2012;45:86-97.
11. Chen H, Zhong Y. Microsatellite markers for Lycium ruthenicum (Solananeae). Mol Biol Rep. 2014;41(9):5545-8.
12. Wang S-Z, Pan L, Hu K, Chen C-Y, Ding Y. Development and characterization of polymorphic microsatellite markers in Momordica charantia (Cucurbitaceae). Am J Bot. 2010;97(8):e75-e78.
13. Vilanova S, Manzur JP, Prohens J. Development and characterization of genomic simple sequence repeat markers in eggplant and their application to the study of diversity and relationships in a collection of different cultivar types and origins. Mol Breeding. 2012;30(2):647-60.
14. Liu W, Lu XY, He GY, Gao X, Li MX, Wu JH, Li ZJ, Wu JH, Wang JC, Luo C. Cytosolic protection against ultraviolet induced DNA damage by blueberry anthocyanins and anthocyanidins in hepatocarcinoma HepG2 cells. Biotechnol Lett. 2013;35(4):491-8.
15. Jin H, Liu Y, Yang F, Wang J, Fu D, Zhang X, Peng XJ, Liang XM. Characterization of anthocyanins in wild Lycium ruthenicum Murray by HPLC-DAD/QTOF-MS/MS. Anal Methods. 2015;7:4947-56.
16. Liu J, Osbourn A, Ma P. MYB transcription factors as regulators of phenylpropanoid metabolism in plants. Mol Plant. 2015;8(5):689-708.
17. Reid KE, Olsson N, Schlosser J, Peng F, Lund ST. An optimized grapevine RNA isolation procedure and statistical determination of reference genes for real-time RT-PCR during berry development. BMC Plant Biol. 2006;6(1):27.
18. Liang C, Liu X, Yiu SM, Lim BL. De novo assembly and characterization of Camelina sativa transcriptome by paired-end sequencing. BMC Genomics. 2013;14(1):146.
19. Wang H, Jiang J, Chen S, Qi X, Peng H, Li P, Song A, Guan Z, Fang W, Liao Y, Chen F. Next-generation sequencing of the Chrysanthemum nankingense (Asteraceae) transcriptome permits large-scale unigene assembly and SSR marker discovery. PLoS One. 2013;8(4):e62293.
20. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29(7):644-52.
21. Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21(18):3674-6.
22. Ye J, Fang L, Zheng H, Zhang Y, Chen J, Zhang Z, Wang J, Li ST, Li RQ, Bolund L, Wang J. WEGO: a web tool for plotting GO annotations. Nucleic Acids Res. 2006;34(suppl 2):W293-W297.
23. Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, Katayama T, Kawashima S, Okuda S, Tokimatsu T, Yamanishi Y. KEGG for linking genomes to life and the environment. Nucleic Acids Res. 2008;36(Suppl. 1):D480-D484.
24. Iseli C, Jongeneel CV, Bucher P. ESTScan: a program for detecting, evaluating, and reconstructing potential coding regions in EST sequences. Proc Int Conf Intell Syst Mol Biol. 1999:138-48.
25. Li R, Yu C, Li Y, Lam T-W, Yiu S-M, Kristiansen K, Wang J. SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics. 2009;25(15):1966-7.
26. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq.
33. [...] DNA sequence, and their potential utility for genetic mapping. Plant Sci. 2007;172(3):640-8.
34. Nicot N, Chiquet V, Gandon B, Amilhat L, Legeai F, Leroy P, Bernard M, Sourdille P. Study of simple sequence repeat (SSR) markers from wheat expressed sequence tags (ESTs). Theor Appl Genet. 2004;109(4):800-5.
35. Thiel T, Michalek W, Varshney R, Graner A. Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor Appl Genet. 2003;106(3):411-22.
36. Shirasawa K, Ishii K, Kim C, Ban T, Suzuki M, Ito T, Muranaka T, Kobayashi M, Nagata N, Isobe S, Tabata S. Development of Capsicum EST-SSR markers for species identification and in silico mapping onto the tomato genome sequence. Mol Breeding. 2013;31(1):101-10.
37. Shirasawa K, Asamizu E, Fukuoka H, Ohyama A, Sato S, Nakamura Y, Tabata S, Sasamoto S, Wada T, Kishida Y, Tsuruoka H, Fujishiro T, Yamada M, Isobe S. An interspecific linkage map of SSR and intronic polymorphism markers in
Nat Method. 2008;5(7):621-8. tomato. Theor Appl Genet. 2010;121(4):731-9. 27. Benjamini Y, Yekutieli D. The control of the false discov- 38. Koilkonda P, Sato S, Tabata S, Shirasawa K, Hirakawa H, ery rate in multiple testing under dependency. Ann Stat. Sakai H, Sasamoto S, Watanabe A, Wada T, Kishida Y, 2001;29(4):1165-88. Tsuruoka H, Fujishiro T, Yamada M, Kohara M, Suzuki S, 28. De Jong W, Eannetta N, De Jong D, Bodis M. Candidate gene Hasegawa M, Kiyoshima H, Isobe S. Large-scale develop- analysis of anthocyanin pigmentation loci in the Solanaceae. ment of expressed sequence tag-derived simple sequence Theor Appl Genet. 2004;108(3):423-32. repeat markers and diversity analysis in Arachis spp. Mol 29. Zeng S, Wu M, Zou C, Liu X, Shen X, Hayward A, Liu C, Breeding. 2012;30(1):125-38. Wang Y. Comparative analysis of anthocyanin biosynthesis 39. Shirasawa K, Oyama M, Hirakawa H, Sato S, Tabata S, during fruit development in two Lycium species. Physiol Fujioka T, Kimizuka-Takagi C, Sasamoto S, Watanabe A, Plantarum. 2014;150(4):505-16. Kato M, Kishida Y, Kohara M, Takahashi C, Tsuruoka H, 30. Kiferle C, Fantini E, Bassolino L, Povero G, Spelt C, Wada T, Sakai T, Isobe S. An EST-SSR linkage map of Rapha- Buti S, Giuliano G, Quattrocchio F, Koes R, Perata P, nus sativus and comparative genomics of the Brassicaceae. Gonzali S. Tomato R2R3-MYB Proteins SlANT1 and DNA Res. 2011;18(4):221-32. SlAN2: Same Protein Activity, Different Roles. PloS One. 40. Kayesh E, Shangguan L, Korir NK, Sun X, Bilkish N, 2015;10(8):e0136365. Zhang Y, Han J, Song CN, Cheng ZM, Fang JG. Fruit skin 31. Czemmel S, Heppel SC, Bogs J. R2R3 MYB transcription color and the role of anthocyanin. Acta Physiol Plant. factors: key regulators of the flavonoid biosynthetic pathway 2013;35(10):2879-90. in grapevine. Protoplasma. 2012;249(Suppl 2):S109-S118. 41. Zhao J, Li H, Xi W, An W, Niu L, Cao Y, Han J, Song CN, 32. Ince A, Onus A, Elmasulu S, Bilgen M, Karaca M. In silico Cheng ZM, Fang JG. 
Changes in sugars and organic acids in data mining for development of Capsicum microsatellites. wolfberry (Lycium barbarum L.) fruit during development Acta Hortic. 2004;729:123-7. and maturation. Food Chem. 2015;173:718-24. 33. Portis E, Nagy I, Sasvári Z, Stágel A, Barchi L, Lanteri S. The design of Capsicum spp. SSR assays via analysis of in silico