molecules

Article Identification of a Novel Specific Cucurbitadienol Synthase Allele in Siraitia grosvenorii Correlates with High Catalytic Efficiency

Jing Qiao 1, Zuliang Luo 1, Zhe Gu 1, Yanling Zhang 2, Xindan Zhang 2 and Xiaojun Ma 1,* 1 Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China; [email protected] (J.Q.); [email protected] (Z.L.); [email protected] (Z.G.) 2 Guilin GFS Monk Fruit Corp., Guilin 541006, China; [email protected] (Y.Z.); [email protected] (X.Z.) * Correspondence: [email protected]; Tel.: +86-10-57833155

 Received: 7 December 2018; Accepted: 24 January 2019; Published: 11 February 2019 

Abstract: Mogrosides, the main bioactive compounds isolated from the fruits of Siraitia grosvenorii, are a group of cucurbitane-type triterpenoid glycosides that exhibit a wide range of notable biological activities and are commercially available worldwide as natural sweeteners. However, the extraction cost is high due to their relatively low contents in plants. Therefore, molecular breeding needs to be achieved when conventional plant breeding can hardly improve the quality so far. In this study, the levels of 21 active mogrosides and two precursors in 15 S. grosvenorii varieties were determined by HPLC-MS/MS and GC-MS, respectively. The results showed that the variations in mogroside V content may be caused by the accumulation of cucurbitadienol. Furthermore, a total of four wild-type cucurbitadienol synthase protein variants (50R573L, 50C573L, 50R573Q, and 50C573Q) based on two missense mutation single nucleotide polymorphism (SNP) sites were discovered. An in vitro reaction analysis indicated that 50R573L had the highest activity, with a specific activity of 10.24 nmol min−1 mg−1. In addition, a site-directed mutant, namely, 50K573L, showed a 33% enhancement of catalytic efficiency compared to wild-type 50R573L. Our findings identify a novel cucurbitadienol synthase allele correlates with high catalytic efficiency. These results are valuable for the molecular breeding of luohanguo.

Keywords: Siraitia grosvenorii; molecular breeding; site-directed mutant; mogrosides; cucurbitadienol synthase; single nucleotide polymorphism (SNP)

1. Introduction Siraitia grosvenorii (luohanguo or monk fruit) is an herbaceous perennial of the Cucurbitaceae family. It is principally cultivated in Guilin city, Guangxi Province, China [1]. The fruit of S. grosvenorii has been used in China as a natural sweetener and as a folk remedy for the treatment of lung congestion, sore throat and constipation for hundreds of years [2]. To date, luohanguo products have been approved as dietary supplements in Japan, the US, New Zealand and Australia [3,4]. Mogrosides, the major bioactive components isolated from the fruits of S. grosvenorii, are a mixture of cucurbitane-type triterpenoid glycosides that have been proven to be powerful and zero-calorie sweeteners and can hence be used as a sucrose substitute for patients with diabetes and patients who are obese [5]. Because of their complex structures (Table A1), the chemical synthesis of these compounds is inherently difficult [6]. Currently, these valuable chemicals are mainly produced through their extraction from the fruits of S. grosvenorii. With the rapid rise in market demand, the production of

Molecules 2019, 24, 627; doi:10.3390/molecules24030627 www.mdpi.com/journal/molecules Molecules 2019, 24, 627 2 of 21 luohanguo extracts has increased rapidly from two tons in 2002 to 60 tons in 2007, becoming one of Molecules 2018, 23, x FOR PEER REVIEW 2 of 23 the fastest growing industries of traditional Chinese medicine extracts [7]. However, the extraction yield of theseextraction ingredients from the is fruits limited of S. bygrosvenorii difficulties. With in theS. rapid grosvenorii rise in cultivation,market demand, including the production a requirement of for heavyluohanguo artificial pollination,extracts has increased a scarcity rapidly of appropriate from two tons cultivatable in 2002 to 60 land tons andin 2007, a high becoming purification one of cost due to thethe presence fastest growing of seeds industries [8,9]. In of addition,traditional theChines lowe medicine contents extracts of the [7]. main However, active the components extraction also yield of these ingredients is limited by difficulties in S. grosvenorii cultivation, including a result in a high cost of extraction. According to the statistics of our team, the extraction cost can be requirement for heavy artificial pollination, a scarcity of appropriate cultivatable land and a high reduced bypurification 1% when cost the due mogroside to the presence V (MV) of seeds content [8,9]. increasesIn addition, by the 0.1%. low contents Therefore, of the the main production active of high-qualitycomponentsS. grosvenorii also resultis an in urgent a high issue cost of for extrac researchtion. According and in the to production the statistics field. of our team, the In recentextraction years, cost our can team be hasreduced selected by 1% many when new theS. mogroside grosvenorii Vvarieties (MV) content with betterincreases agronomic by 0.1%. traits, such as higherTherefore, yield, the improved production resistance of high-quality to various S. grosvenorii diseases is an and urgent seedless issuefruits for research [10–13 and]. In in this the study, production field. we chose 15 S. grosvenorii varieties (E1, E2, E3, E5, E12, E23, E29, S2, S3, S10, S13, C2, C3, C6 and W4). In recent years, our team has selected many new S. grosvenorii varieties with better agronomic From thesetraits, varieties, such as higher elite germplasms,yield, improved which resistance constitute to various an diseases important and seedless resource fruits for [10–13].S. grosvenorii In breeding,this may study, be developed.we chose 15 S. Thegrosvenorii macroscopic varieties (E1, characteristics E2, E3, E5, E12, of E23, the E29, 15 S.S2, grosvenorii S3, S10, S13,varieties C2, C3, are very similar,C6 and especially W4). From beyond these varieties, the flowering elite germplas andms, fruiting which constitute periods. an However, important resource the quality for S. varies dramatically.grosvenorii Therefore, breeding, further may evaluations be developed. of S.The grosvenorii macroscopicgermplasms characteristics based of onthethe 15 activeS. grosvenorii component content arevarieties important are very to similar, better serveespecially molecular beyond breedingthe flowering purposes. and fruiting Because periods. all However, of these varietiesthe quality varies dramatically. Therefore, further evaluations of S. grosvenorii germplasms based on the are cultivated under the same conditions, we hypothesize that effective genetic polymorphisms in active component content are important to better serve molecular breeding purposes. Because all of functionalthese genes varieties involved are cultivated in MV biosynthesisunder the same are cond oneitions, of the we most hypothesize important that effective reasons genetic for the MV content change.polymorphisms in functional genes involved in MV biosynthesis are one of the most important To date,reasons key for genes the MV involved content change. in the biosynthesis of mogrosides have been successfully cloned and characterized,To date, including key genes genesinvolved from in the five biosynthesis enzyme families:of mogrosides the have squalene been successfully epoxidase cloned (SQE) [14], cucurbitadienoland characterized, synthase including (CS) [15 ],genes epoxide from hydrolasefive enzyme (EPH),families: cytochrome the squalene P450epoxidase (CYP450) (SQE) [14], [2,9 ] and cucurbitadienol synthase (CS) [15], epoxide (EPH), cytochrome P450 (CYP450)[2,9] and UDP-glucosyltransferaseUDP-glucosyltransferase (UGT) (UGT) families families (Figure (Figure1) [ 16 1)]. Squalene[16]. Squalene is thought is thought to be to the be initialthe initial substrate and precursorsubstrate for triterpenoidand precursor and for triterpenoid sterol biosynthesis. and sterol SQE biosynthesis. has been SQE generally has been recognized generally recognized as the common rate-limitingas the enzyme common in therate-limiting common pathwayenzyme in from the mevalonatecommon pathway (MVA) from and mevalonate methylerythritol (MVA) phosphateand (MEP) pathways,methylerythritol catalyzing phosphate squalene (MEP) to pathways, 2,3-oxidosqualene catalyzing squalene [17,18]. to In 2,3-oxidosqualeneS. grosvenorii, the[17,18]. initial In S. step in the biosynthesisgrosvenorii of, the cucurbitane-type initial step in themogrosides biosynthesis of is cucurbitane-type the cyclization mogrosides of 2,3-oxidosqualene is the cyclization to form of the 2,3-oxidosqualene to form the triterpenoid skeleton of cucurbitadienol, which is catalyzed by CS. triterpenoid skeleton of cucurbitadienol, which is catalyzed by CS. EPH and CYP450 further oxidize EPH and CYP450 further oxidize cucurbitadienols to produce mogrol, which is glycosylated by UGT cucurbitadienolsto form MV. to produce mogrol, which is glycosylated by UGT to form MV.

Figure 1. Genes involved in the mogroside biosynthesis pathway. 2,3-oxidosqualene is converted cucurbitadienol by cucurbitadienol synthase (CS), while cucurbitadienol is further converted to mogrosides through epoxide hydrolase (EPH), cytochrome P450s (CYP450) and UDP-glucosyltransferases (UGT). Single arrows represent the one-step conversions, while triple arrows represent multiple steps. Molecules 2019, 24, 627 3 of 21

Single nucleotide polymorphism (SNP) derived markers, identified in coding sequences of different genes, have been developed to discriminate very similar cultivars [19–21]. Technological improvements make the use of SNP attractive for high throughput use in marker-assisted breeding, for population studies [22] and to develop high-density linkage maps for map-based gene discovery [23,24]. In the resulting 2.44 Mbp of aligned sequence of soybean, a total of 5551 SNPs were discovered, including 4712 single-base changes and 839 indels for an average nucleotide diversity of Theta = 0.000997 [25]. No SNP were analyzed in S grosvenorii so far though the genome was assembled successfully [26]. This study analyzes the variation of 23 profiles in 15 S. grosvenorii varieties aimed to identify favorite allele of SgSC gene that can be used increase the production of them in the breeding program. A holistic targeted secondary metabolomics analysis was conducted to quantitatively determine the contents of 21 mogrosides and two intermediates in 15 varieties of luohanguo samples using high-performance liquid chromatography with tandem mass spectrometry (HPLC-MS/MS) or gas chromatography mass spectrometry (GC-MS). SgCS genes from a total of 15 S. grosvenorii varieties were cloned to investigate the enzyme activity of the allele gene products. Using a combined approach that relied on SNP analyses, yeast expression, and site-directed mutagenesis, the key amino acid residues underlying triterpene product efficiency were identified, providing new insight into the molecular breeding of high-quality luohanguo.

2. Results

2.1. Determination of the Levels of 21 Mogrosides in 15 S. grosvenorii Varieties The developed HPLC-MS/MS method was applied to determine the levels of the 21 mogrosides in the fruits of different S. grosvenorii varieties. The quantitative analyses were performed by means of an external standard method [27]. The results are summarized in Table A2, and a graphical representation of the results is shown in Figure2. The content of each targeted mogroside and the total content of each targeted analyte in the fruits of different S. grosvenorii varieties varied dramatically (7.77–19.97 mg/g). Among these, the level of M5 was much higher than that of any other mogroside. In the fruit of S. grosvenorii, the average total content of M5 was 63.86-fold, 15.86-fold, 5.05-fold and 11.26-fold higher than the contents of M2, M3, M4 and M6, respectively. According to a previous report, ripe fruits mainly contain M5, while unripe fruits have higher levels of M2 and M3. The M5 content significantly increased, while the levels of M2 and M3 dramatically decreased with increasing growing time (disappeared after 70 DAA) [28,29]. In our study, we did not find MIIA1 in most of the samples and found only a small amount of MIII. The average values of MIII, MIIIE, MIIIA1, and MIIIA2 were 0.10, 0.62, 0.06 and 0.15 mg/g, respectively. In terms of individual constituents, MV was the predominant component in all samples. The MV content accounts for 45.23–63.58% of the total content of the 21 targeted analytes. This result is consistent with previous reports that MV was found in a sample of whole fruits of S. grosvenorii at levels of 49.29–66.96% [27]. Moreover, its content was in the range of 4.86–13.49 mg/g in 15 varieties of luohanguo samples, all of whose contents, except the S10 variety, were all higher than the 5 mg/g set in the Chinese Pharmacopeia. Then, 11-E-MV and MIVE were the second and third most abundant sweet mogrosides in the tested samples, respectively. The average values of 11-E-MV and MIVE were 2.67 and 1.12 mg/g, respectively. MIIE, MIIIX, MIVA and SI are the biosynthetic precursor components for the synthesis of MV. MIIE was detected in E2, E3, E5 and E12 but not in the other samples. MIIIX was not detected because the standard was unavailable. The MIVA content ranged from 0.16 to 1.80 mg/g, with a mean value of 0.73 mg/g. As the sweetest among the mogrosides, SI was the fifth most abundant sweet mogroside in the tested varieties. The mean content of SI was 0.81 mg/g and ranged from 0.26 to 1.21 mg/g. The results showed that the contents of the mogrosides were significantly different between the different varieties of S. grosvenorii. Molecules 2018, 23, x FOR PEER REVIEW 4 of 23 sweet mogroside in the tested varieties. The mean content of SI was 0.81 mg/g and ranged from 0.26 to 1.21 mg/g. The results showed that the contents of the mogrosides were significantly different between the different varieties of S. grosvenorii. In order to explore the clustering of S. grosvenorii samples from different species, hierarchical clustering analysis (HCA) was performed on 21 mogrosides content. 15 species were analyzed using HCA of the peak areas of the 21 analytical markers in the LC–MS/MS. The between-groups linkage method was chosen as average linkage and the squared Euclidean distance was selected to establish clusters, and to assess the resemblance of the 15 species of samples. The resulting hierarchical cluster heat-map in Figure A1 produced well-defined clusters and showed the peak area profile of the 21 mogrosides peaks. Species were obviously grouped into four main clusters. Species S10 and S13 were categorized into Cluster I. S2, E23 and E5 were grouped into Cluster II. E1 was defined into third distinct cluster (Cluster III). The remaining species including C2, C3, C6, W4, S3, E29, E3, E12 and E2 were put into Cluster IV. These results agreed with a simple visual comparison of content analysis, and with the ratio among different mogrosides (Figure 2), which can reflect their quality. Combining these findings, we further predicted that the magnitude of the four groups’ quality should be ordered as follows: Cluster III > Cluster II > Cluster IV > Cluster I. Although the total mogrosides content was not different, significant variation (p < 0.05) in MV concentration was observed for the elite germplasm lines S2 and E1. These results showed that species S2 and E1 could beMolecules useful2019 to ,breeders24, 627 for S. grosvenorii quality improvement. 4 of 21

Figure 2. Graphical representation of the content of each mogroside and the total content in the different Figure 2. Graphical representation of the content of each mogroside and the total content in the varieties of luohanguo. MV was the predominant component in all samples. 11-E-MV and MIVE were different varieties of luohanguo. MV was the predominant component in all samples. 11-E-MV and the second and third most abundant sweet mogrosides in the tested samples, respectively. S2 and E1 MIVE were the second and third most abundant sweet mogrosides in the tested samples, can be used as the elite germ to breeders for S. grosvenorii quality improvement. Abbreviations: MIIA1, respectively. S2 and E1 can be used as the elite germ to breeders for S. grosvenorii quality mogroside IIA1; MIIA, mogroside IIA; MIIA2, mogroside IIA2; MIIE, mogroside IIE; MIIIA1, mogroside improvement. Abbreviations: MIIA1, mogroside IIA1; MIIA, mogroside IIA; MIIA2, mogroside IIA2; IIIA1; MIIIA2, mogroside IIIA2; MIIIE, mogroside IIIE; MIII, mogroside III; 11-O-SI, 11-O-siamenoside I; MIIE, mogroside IIE; MIIIA1, mogroside IIIA1; MIIIA2, mogroside IIIA2; MIIIE, mogroside IIIE; MIVE, mogroside IVE; MIVA, mogroside IVA; SI, siamenoside I; 11-D-O-MV, 11-deoxymogroside V; MIII, mogroside III; 11-O-SI, 11-O-siamenoside I; MIVE, mogroside IVE; MIVA, mogroside IVA; SI, 11-E-MV, 11-epi-mogroside V; IMV, isomogroside V; MV, mogroside V; O-MV, 11-oxo-mogroside V; siamenoside I; 11-D-O-MV, 11-deoxymogroside V; 11-E-MV, 11-epi-mogroside V; IMV, 11-O-MVI, 11-oxo-mogroside VI; MVIB, mogroside VIB; MVIA, mogroside VI A; MVI, mogroside VI. isomogroside V; MV, mogroside V; O-MV, 11-oxo-mogroside V; 11-O-MVI, 11-oxo-mogroside VI; MVIB,In order mogroside to explore VIB; theMVIA, clustering mogroside of S.VI grosvenoriiA; MVI, mogrosidesamples VI. from different species, hierarchical clustering analysis (HCA) was performed on 21 mogrosides content. 15 species were analyzed using 2.2. The Determination of Squalene and Cucurbitadienol Levels in the Fruit of S. grosvenorii HCA of the peak areas of the 21 analytical markers in the LC–MS/MS. The between-groups linkage method was chosen as average linkage and the squared Euclidean distance was selected to establish clusters, and to assess the resemblance of the 15 species of samples. The resulting hierarchical cluster heat-map in Figure A1 produced well-defined clusters and showed the peak area profile of the 21 mogrosides peaks. Species were obviously grouped into four main clusters. Species S10 and S13 were categorized into Cluster I. S2, E23 and E5 were grouped into Cluster II. E1 was defined into third distinct cluster (Cluster III). The remaining species including C2, C3, C6, W4, S3, E29, E3, E12 and E2 were put into Cluster IV. These results agreed with a simple visual comparison of content analysis, and with the ratio among different mogrosides (Figure2), which can reflect their quality. Combining these findings, we further predicted that the magnitude of the four groups’ quality should be ordered as follows: Cluster III > Cluster II > Cluster IV > Cluster I. Although the total mogrosides content was not different, significant variation (p < 0.05) in MV concentration was observed for the elite germplasm lines S2 and E1. These results showed that species S2 and E1 could be useful to breeders for S. grosvenorii quality improvement.

2.2. The Determination of Squalene and Cucurbitadienol Levels in the Fruit of S. grosvenorii When the contents of related intermediates (squalene, cucurbitadienol) in luohanguo were analyzed by GC-MS, the results showed significant variations among the different varieties (p < 0.05 Molecules 2018, 23, x FOR PEER REVIEW 5 of 23

Molecules 2019, 24, 627 5 of 21 When the contents of related intermediates (squalene, cucurbitadienol) in luohanguo were analyzed by GC-MS, the results showed significant variations among the different varieties (p < 0.05 andand 0.01, 0.01, respectively). respectively). For For example, example, the squalenethe squa contentlene content of C2 wasof C2 1.24 was mg/g, 1.24 whereas mg/g, thewhereas squalene the contentsqualene of E3content was onlyof E3 0.03 was mg/g only (Figure 0.033 a).mg/g Similar (Figure to squalene, 3a). Similar the contentsto squalene, of cucurbitadienol the contents inof thecucurbitadienol different varieties in the of different luohanguo varieties were obviouslyof luohanguo different. were obviously The mean different. content of The cucurbitadienol mean content wasof cucurbitadienol 0.50 mg/g and was ranged 0.50 from mg/g 0.17 and to ranged 1.80 mg/g from (Figure 0.17 to3 b).1.80 mg/g (Figure 3b).

FigureFigure 3. 3.Graphical Graphical representationrepresentation of of the the ( a(a)) squalene squalene and and ( b(b)) cucurbitadienol cucurbitadienol content content in in the the different different varietiesvarieties of of luohanguo. luohanguo. The results revealed a wide range of variability among the 15 S. grosvenorii varieties for 23 The results revealed a wide range of variability among the 15 S. grosvenorii varieties for 23 quantitative components (AppendixA Tables A2 and A3). The squalene and cucurbitadienol quantitative components (Appendix Tables A2 and A3). The squalene and cucurbitadienol coefficients of variation (CVs) were higher than the mogroside CV. The CVs of squalene and coefficients of variation (CVs) were higher than the mogroside CV. The CVs of squalene and cucurbitadienol were 135.03 and 86.03%, respectively. In contrast, the lowest CV belonged to the cucurbitadienol were 135.03 and 86.03%, respectively. In contrast, the lowest CV belonged to the MV MV (21.12%). Because the M2, M3 and M4 contents were very low in ripe fruit, we speculated that (21.12%). Because the M2, M3 and M4 contents were very low in ripe fruit, we speculated that catalysis by UGTs is not a rate-limiting step for MV biosynthesis in plants. Moreover, squalene and catalysis by UGTs is not a rate-limiting step for MV biosynthesis in plants. Moreover, squalene and cucurbitadienol, the precursor of mogrosides, accumulates in fresh fruit. SgCS and CYP450 are the cucurbitadienol, the precursor of mogrosides, accumulates in fresh fruit. SgCS and CYP450 are the that can catalyze these two substrates, respectively. We proposed that the conversion of enzymes that can catalyze these two substrates, respectively. We proposed that the conversion of squalene and cucurbitadienol through improved activity of SgCS and CYP450 was proportional to the squalene and cucurbitadienol through improved activity of SgCS and CYP450 was proportional to accumulation of MV. In this paper, we mainly focus on the enzyme catalytic efficiency of SgCS, and we the accumulation of MV. In this paper, we mainly focus on the enzyme catalytic efficiency of SgCS, think it is necessary to study the catalytic efficiency of CYP450 in the future. When cucurbitadienol and we think it is necessary to study the catalytic efficiency of CYP450 in the future. When increases under high SgCS catalytic efficiency, the content of M2, M3 and M4 may increase, and then it cucurbitadienol increases under high SgCS catalytic efficiency, the content of M2, M3 and M4 may may lead to the increase of pharmaceutical ingredients MV via the same biosynthetic route they shared. increase, and then it may lead to the increase of pharmaceutical ingredients MV via the same 2.3.biosynthetic SNP Identification route they in ORFshared. Region of the SgCS Gene

2.3. SNPSgCS Identification is a member in of ORF the oxidosqualene Region of the SgCS cyclase Gene (OSC ) gene family and catalyzes the cyclization of 2,3-oxidosqualene to cucurbitadienol. This step, catalyzed by OSCs, is the key branch-point leading SgCS is a member of the oxidosqualene cyclase (OSC) gene family and catalyzes the cyclization to triterpenoid or sterol synthesis [30–32]. A 2800 bp full-length cDNA sequence of the SgCS gene of 2,3-oxidosqualene to cucurbitadienol. This step, catalyzed by OSCs, is the key branch-point encoding a 759-residue protein (between 200 and 2479 bp) was obtained from 15 varieties. To detect leading to triterpenoid or sterol synthesis [30–32]. A 2800 bp full-length cDNA sequence of the SgCS sequence polymorphisms among different cultivars, we aligned the ORF sequences, which revealed gene encoding a 759-residue protein (between 200 and 2479 bp) was obtained from 15 varieties. To a total of 4 SNPs among 15 varieties. No InDels were distributed in the ORF region of the SgCS gene. detect sequence polymorphisms among different cultivars, we aligned the ORF sequences, which The changes in nucleotides at the 84 and 148 sites were due to transitions (A-G or C-T), whereas revealed a total of 4 SNPs among 15 varieties. No InDels were distributed in the ORF region of the transversions (A-T, A-C) existed in 618 and 1962 sites. Of the SNPs, only 148 sites were missense SgCS gene. The changes in nucleotides at the 84 and 148 sites were due to transitions (A-G or C-T), mutations that caused changes in amino acid sequences between Arg (R) and Cys (C). To date, whereas transversions (A-T, A-C) existed in 618 and 1962 sites. Of the SNPs, only 148 sites were a universal platform that can allow data comparisons across different laboratories has been used. missense mutations that caused changes in amino acid sequences between Arg (R) and Cys (C). To The sequence alignment analysis of the 15 SgCS clones against the expressed sequence tags (ESTs) date, a universal platform that can allow data comparisons across different laboratories has been within GenBank (GenBank accession number: HQ128567) identified another SNP site in 1718 bp, used. The sequence alignment analysis of the 15 SgCS clones against the expressed sequence tags and this SNP is a missense mutation (Figure A2). Therefore, four types of SgCS protein variants (ESTs) within GenBank (GenBank accession number: HQ128567) identified another SNP site in 1718 (50R573L, 50C573L, 50R573Q and 50C573Q) based on the missense mutation SNP sites (Figure4) bp, and this SNP is a missense mutation (Figure A2). Therefore, four types of SgCS protein variants were identified.

Molecules 2018, 23, x FOR PEER REVIEW 6 of 23

(50R573L,Molecules 2019 50C573L,, 24, 627 50R573Q and 50C573Q) based on the missense mutation SNP sites (Figure6 of 4) 21 were identified.

FigureFigure 4. MissenseMissense mutations mutations of of SgCS SgCS in in the 15 S. grosvenorii varieties. Four Four types types of of SgCS SgCS protein protein variantsvariants (50R573L, (50R573L, 50C573L, 50C573L, 50R573Q 50R573Q and and 50C573Q) 50C573Q) base basedd on on the the missense missense mutation mutation SNP SNP sites sites were were identified.identified. The The number number is is the the position position of of the the protei proteinn sequence, sequence, R, R, C, C, L, L, and and Q Q is is corresponding corresponding amino amino acidacid Arg, Arg, Cys, Cys, Leu, Leu, Gln. Gln. 2.4. Activity Comparison 2.4. Activity Comparison To measure cucurbitadienol synthetic activity, the four different types (50R573L, 50C573L, 50R573Q To measure cucurbitadienol synthetic activity, the four different types (50R573L, 50C573L, and 50C573Q) of CS genes from S. grosvenorii were codon-optimized, synthesized and subcloned into 50R573Q and 50C573Q) of CS genes from S. grosvenorii were codon-optimized, synthesized and the yeast expression vector pCEV-G4-Km. The resulting recombinant plasmids and the empty vector subcloned into the yeast expression vector pCEV-G4-Km. The resulting recombinant plasmids and pCEV-G4-Km, used as a negative control, were transformed into the BY4742 strain. Colonies were the empty vector pCEV-G4-Km, used as a negative control, were transformed into the BY4742 strain. randomly picked from the plate and verified by PCR amplification. All correct colonies were cultivated Colonies were randomly picked from the plate and verified by PCR amplification. All correct in yeast extract peptone dextrose medium (YPD) medium with 2% glucose for 3 days. GC-MS analysis colonies were cultivated in yeast extract peptone dextrose medium (YPD) medium with 2% glucose of the cell extract confirmed production of cucurbitadienol. Protein variants 50R573L and 50C573L for 3 days. GC-MS analysis of the cell extract confirmed production of cucurbitadienol. Protein generated products with similar cucurbitadienol yields (0.365 and 0.300 mg/g yeast cells), whereas variants 50R573L and 50C573L generated products with similar cucurbitadienol yields (0.365 and 50R573Q and 50C573Q produced much smaller amounts of cucurbitadienol (0.015 and 0.022 mg/g 0.300 mg/g yeast cells), whereas 50R573Q and 50C573Q produced much smaller amounts of yeast cells). At the same time, 50C573Q accumulated 3.7 times more squalene than 50R573L. These results cucurbitadienol (0.015 and 0.022 mg/g yeast cells). At the same time, 50C573Q accumulated 3.7 times demonstrate that SgCS variants from cultivated S. grosvenorii are highly divergent, as inferred from their more squalene than 50R573L. These results demonstrate that SgCS variants from cultivated S. product contents (Figure5a,b). grosvenorii are highly divergent, as inferred from their product contents (Figure 5a,b). The approach adopted here was to search for the SNP site of the gene that is known to be closely The approach adopted here was to search for the SNP site of the gene that is known to be linked to a trait of interest, namely, content. Unexpectedly, SNPs were tested for significant deviation closely linked to a trait of interest, namely, content. Unexpectedly, SNPs were tested for significant from equal distribution between the two groups of high cucurbitadienol content and low cucurbitadienol deviation from equal distribution between the two groups of high cucurbitadienol content and low content with the χ2 test. The results suggested that there were no significant relationships between these cucurbitadienol content with the χ2 test. The results suggested that there were no significant SNP sites and cucurbitadienol content in the different S. grosvenorii varieties. A possible explanation relationships between these SNP sites and cucurbitadienol content in the different S. grosvenorii is that the accumulation of cucurbitadienol is a dynamic process because these classes of secondary varieties. A possible explanation is that the accumulation of cucurbitadienol is a dynamic process metabolites share a common synthetic pathway [33]. The cucurbitadienol content in plants depends because these classes of secondary metabolites share a common synthetic pathway [33]. The not only on the activity of SgCS but also on the downstream enzyme that can use cucurbitadienol as cucurbitadienol content in plants depends not only on the activity of SgCS but also on the a substrate for synthase mogrosides. downstream enzyme that can use cucurbitadienol as a substrate for synthase mogrosides. Measuring Km values in the presence of detergents is notoriously difficult because the concentrations of the substrate and enzyme are distorted by the biphasic aqueous and micellar system [34,35]. Although most of the enzyme and substrate are probably constrained to the restricted volume of the micelle, the soluble proportion is not readily determined. We consequently compared the catalytic competence of these enzymes using homogenate assays with substrate at a concentration (250 µM) well above the literature Km of 25–125 µM described for several plant cyclases. The activities of purified variants were tested, and the results are shown in Table1. 50R573L showed the highest activity as the best variant. The specific activity of 50C573L showed a slight decline, with a loss of

Molecules 2019, 24, 627 7 of 21 approximately 20% of activity compared to 50R573L. Inferences from in vitro experiments of 50R573L and 50C573L were nearly in agreement with the results of the in vivo experiments, indicating that the in vivo assay can serve as an alternative to the in vitro assay when the enzyme activities of 50R573Q and 50C573Q are too weak for measurements of Km and k . Molecules 2018, 23, x FOR PEER REVIEW cat 7 of 23

FigureFigure 5.5. ProductionProduction ofof cucurbitadienolcucurbitadienol byby anan engineeredengineered S.S. cerevisiaecerevisiae strain.strain. ((AA)) GC-MSGC-MS analysisanalysis ofof cellcell extractsextracts ofof BY4742,BY4742, whichwhich hadhad wild-typewild-type SgCS.SgCS. SqualeneSqualene (RT(RT == 16.6316.63 min)min) andand cucurbitadienolcucurbitadienol (RT(RT == 19.9819.98 min)min) ((B)) SqualeneSqualene andand cucurbitadienolcucurbitadienol productionproduction byby thethe differentdifferent yeastyeast strainsstrains withwith fourfour wild-typewild-type SgCSsSgCSs (50R573L,(50R573L, 50C573L,50C573L, 50R573Q50R573Q andand 50C573Q).50C573Q). ((CC)) CucurbitadienolCucurbitadienol productionproduction byby thethe differentdifferent yeastyeast strainsstrains withwith SgCSSgCS mutantsmutants (50A573L,(50A573L, 50D573L,50D573L, 50E573L,50E573L, 50H573L50H573L andand 50K573L).50K573L).

TableMeasuring 1. Determination Km values of kinetic in the parameters presence of fourof dete wildrgents types ofis SgCS notoriously (50R573L, 50C573L,difficult 50R573Q,because the concentrationsand 50C573Q). of Thethe kinetic substrate parameters and enzyme of the reaction are distorted were assayed by with the 3(S)-oxidosqualenebiphasic aqueous as sbustrateand micellar systemat 30 [34,35].◦C and pHAlthough 7.0. most of the enzyme and substrate are probably constrained to the restricted volume of the micelle, the soluble proportion is not readily determined. We consequently compared the catalyticKm competenceKcat of theseKcat/Km enzymes usingSpecific homogenate Activity assays Relativewith substrate at a (µM) (min−1) (µM−1 min−1) (nmol min−1 mg−1) Activity (%) concentration (250 µM) well above the literature Km of 25–125 µM described for several plant cyclases. 50R573LThe activities 0.29 of purified 0.88 variants 2.98 were tested, and 10.24the results are shown 100 in Table 1. 50C573L 0.32 0.77 2.41 8.28 80.8 50R573L 50R573Qshowed theND highest(a) activityND as the NDbest variant. The specific ND activity of 50C573L - showed a slight decline,50C573Q with a loss ND of approximately ND 20% ND of activity compared ND to 50R573L. Inferences - from in vitro experiments of 50R573L and 50C573L(a) were: not detected. nearly in agreement with the results of the in vivo experiments, indicating that the in vivo assay can serve as an alternative to the in vitro assay when the enzyme activities of 50R573Q and 50C573Q are too weak for measurements of Km and kcat. The amino acid sequence of SgCS shares high similarity with the sequences of CSs (85% with CcCS from C. colocynthis, 89% with CpCS from C. pepo and 84% with CsCS from C. sativus). Residues 50R and 573L in SgCS correspond to 52R and 578L, respectively, in CcCS and CpCS and 70R and 599L, respectively, in CsCS. Previous study has shown that SgCS, which had the highest cucurbitadienol yield, is the wild-type CS enzyme with highest catalytic efficiency [36]. Therefore, SgCS can be used as a guideline for other CS enzyme modification.

Molecules 2019, 24, 627 8 of 21

The amino acid sequence of SgCS shares high similarity with the sequences of CSs (85% with CcCS from C. colocynthis, 89% with CpCS from C. pepo and 84% with CsCS from C. sativus). Residues 50R and 573L in SgCS correspond to 52R and 578L, respectively, in CcCS and CpCS and 70R and 599L, respectively, in CsCS. Previous study has shown that SgCS, which had the highest cucurbitadienol yield, is the wild-type CS enzyme with highest catalytic efficiency [36]. Therefore, SgCS can be used as a guideline for other CS enzyme modification.

2.5. Improving the Activity of SgCS by Site-Directed Mutagenesis To rationally produce a more catalytically efficient SgCS, site-directed mutagenesis of 50R573L was performed. We constructed five variants (50A573L, 50D573L, 50E573L, 50H573L and 50K573L) to verify the effects of 50 amino acid residues in SgCS. The subsequent enzyme assay by GC-MS revealed that the cucurbitadienol contents of 50A573L, 50E573L and 50H573L were reduced slightly in comparison with that of the wild-type, whereas that of 50K573L increased by approximately 33.5% (Figure5c). Notably, the catalytic efficiency of the 50K573L mutant was enhanced by approximately 1.62-fold compared to that of 50C573L. These results showed that lysine was the optimal amino acid for the 50 position, with higher activity. Therefore, to rationally produce more catalytically efficient SgCSs, site-saturation mutagenesis of the 50 position needs to be further performed individually.

3. Discussion S. grosvenorii is an important herbal crop with multiple economic and pharmacological uses. Mogrosides, the main effective components of luohanguo, are partial substitutes for sucrose because of their extremely sweet and noncaloric characteristics, and increasing progress is being made in terms of molecular breeding and purification processes. That the production and distribution of mogrosides in monk fruits is regulated by genetic factors, growth times, environmental factors, physiological factors, and chemical factors is generally accepted. In this study, 15 varieties of S. grosvenorii were cultivated under the same conditions and harvested at the same time, suggesting that the differences in the active component contents were most likely caused by genetic differences. We hypothesize that effective genetic polymorphisms in functional genes involved in MV biosynthesis constitute one of the most important mechanisms of changes in MV content. The biosynthesis pathway of mogrosides has been extensively studied, and several genes have been identified. The initial committed step in the biosynthesis of cucurbitane-type mogrosides is the cyclization of 2,3-oxidosqualene to form the triterpenoid skeleton of cucurbitadienol. This step is catalyzed by CS, which has been functionally characterized in several plants, including Cucurbita pepo [37], Citrullus colocynthis [38], Cucumis sativus [39] and S. grosvenorii [15]. However, no kinetic data for CS from these plants have been reported. In this study, the two key amino acid sites that determine the enzyme activity of cucurbitadienol synthesis were successfully identified through comprehensive SNP analysis among 15 varieties of S. grosvenorii. An in vitro study indicated that the wild-type 50R573L enzyme has quite good efficiency considering that many triterpene cyclases have very low efficiencies [34]. The apparent Km values for 3(S)-oxisqualene were determined to be 55µM for lanosterol cyclase in rat [40] and 33.8 µM for β-Amyrin cyclase in E. tirucalli L. [35], the reported values for pea cyclases being 25 and 50 µM, respectively [41]. The use of SNP information combined with site-directed mutation is an effective approach to enhance enzyme activity. In this study, this approach enabled us to identify a mutant 50K573L that had 133% efficiency compared to wild-type 50R573L. To rationally produce more catalytically efficient SgCSs, site-saturation mutagenesis of the 50 position needs to be further performed individually. During the evolutionary process, plants have perfected their systems to produce diverse catalogs of compounds in adaptations to both natural and artificial selection processes. Rapid advances in genome sequencing technologies have greatly accelerated studies of the molecular basis underlying these evolutionary events [42,43], which provide deep insights into nature’s strategy for survival in a specific ecological niche. Phylogenetic analysis showed that S. grosvenorii diverged from the Molecules 2019, 24, 627 9 of 21

Cucurbitaceae family approximately 40.95 million years ago [26]. We generated and analyzed a multiple sequence alignment to test whether the 50 and 573 amino sites of CS were conserved in the Cucurbitaceae plant family. The substitution of 573L with 573Q resulted in an almost complete loss of function in terms of producing cucurbitadienol, indicating that this amino acid site may serve as an active-site residue for 2,3-oxidosqualene cyclization to cucurbitadienol. Previous study showed that a point mutations Cys (C) to Tyr (Y) at position 393 of cucumber CS which disables the catalytic capacity of CcCS inhibiting bitterness biosynthesis in cucumber [39]. However, in contrast to the well-studied functionally characterized SgCS, key amino acid residues responsible for the formation of cucurbitadienol have not been identified. Therefore, further studies based on homologous modeling and site-directed mutagenesis should be carried out. Integrating “good genes”, into high-yielding, high-content cultivars continues to be one of primary objectives of many breeding programs [44]. Although substantial efforts have been made during the past twenty years to develop an optimal yield of the final product MV S. grosvenorii cultivars using conventional plant breeding methods, there has been limited success in achieving the desired goal [45]. After crossings one needs to screen thousands of individual plants for their performance, which is time consuming and costly as the plants have to be cultivated till there are fruits. With the advent of molecular biology techniques, it was presumed that developing high-quality cultivars would be convenient and relatively less time consuming. Results from our studies identified genetic sources of high SgCS efficiency that could be useful to breeders for S. grosvenorii improvement. In S. grosvenorii, squalene is converted to cucurbitadienol by SQE, and cucurbitadienol is further converted to mogroside compounds by CYP450 and UGT. After the metabolic pathway of mogroside in S. grosvenorii was identified [2,9,16,26,46], the SNP sites of SQE, CYP450s and UGT could be discovered to reveal a new hypothesis that effective genetic polymorphisms in functional genes involved in MV biosynthesis can result in changes in MV content. Due to the large collection of high-efficiency mutants, the further step for breeding the “super” S. grosvenorii cultivar should focus on cotransformation of all of these homozygous functional enzymes with high activity into the elite germplasm S2 or E1. Additionally, these mutants can be valuable gene resources for the production of mogrosides by metabolic engineering.

4. Materials and Methods

4.1. Chemicals and Reagents The following reference compounds were purchased from Chengdu Must Bio-Technology Co., Ltd. (Sichuan, China): mogroside IIA (MIIA), mogroside IIA1 (MIIA1), mogroside IIA2 (MIIA2), mogroside IIE (MIIE), mogroside III (MIII), mogroside IIIE (MIIIE), mogroside IIIA1 (MIIIA1), mogroside IIIA2 (MIIIA2), 11-O-siamenoside I (11-O-SI), siamenoside I (SI), mogroside IVA (MIVA), mogroside IVE (MIVE), 11-deoxymogroside V (11-D-O-MV), 11-epi-mogroside V (11-E-MV), MV, 11-oxo-mogroside V (O-MV), isomogroside V (IMV), 11-oxo-mogroside VI (11-O-MVI), mogroside VI (MVI), mogroside VI A (MVIA) and mogroside VI B (MVIB), all with purity > 98% as determined by HPLC-DAD-ELSD. Squalene and 3(S)-oxidosqualene were obtained from J&K Scientific Ltd. (Beijing, China). HPLC-grade formic acid, methanol, acetonitrile and hexane were obtained from Fisher (Emerson, IA, USA). Other reagents and chemicals were of analytical grade and purchased from Sinopharm Chemical Regent Beijing (Beijing, China). Deionized water was prepared using a Milli-Q purification system (Millipore, MA, USA). Plant genomic DNA kits and DNA Marker II were obtained from TianGen Biotech Co. Ltd. (Beijing, China). KOD-Plus DNA Polymerase was purchased from TOYOBO Biotech Co., Ltd. (Shanghai, China). Primer synthesis and DNA sequencing were conducted at Sangon Biotech Company (Beijing, China). Restriction enzymes and DNA were purchased from Takara Biotechnology Co. Ltd. (Dalian, China). The host strain S. cerevisiae BY4742 was purchased from Invitrogen (Carlsbad, CA, USA). The yeast pCEV-G4-Km expression plasmid (Addgene plasmid #46819) was a gift from Lars Molecules 2019, 24, 627 10 of 21

Nielsen and Claudia Vickers [47]. Other reagents were purchased from Beijing Chemical Corporation (Beijing, China) unless otherwise specified.

4.2. Sample Collection and Preparation Fifteen S. grosvenorii varieties were cultivated on the farm of Lingui Monk Fruit Production Base (25.2◦ N, 110.0◦ E, Guangxi, China) and harvested 80 days after flowering. The samples were authenticated as the fruits of S grosvenorii Swingle by Prof. Xiaojun Ma from the Institute of Medicinal Plant Development. All fruits were sealed in plastic bags and stored at 20 ◦C before analysis. Plant leaves were cut and frozen in liquid nitrogen and saved at −80 ◦C for DNA extraction. The dry fruit samples were powdered to a homogeneous size (ca. 50 mesh) by a disintegrator (Shanghai Shuli Instrument, Shanghai, China). Approximately 0.5 g of powder from each sample was accurately weighed and introduced into a 50-mL capped conical flask with 25 mL of methanol/water (80:20, v/v). The flask was sealed and sonicated in a KQ-300 ultrasonic water bath (Kunshan Ultrasonic Instrument, Jiangsu, China) operating at 40 kHz with an output power of 300 W for 30 min at room temperature. A duplicate extract was prepared. Both extracts were mixed and transferred into a volumetric flask and then diluted to 100 mL with methanol/water (80:20, v/v) and filtered through a 0.22 µm microporous membrane. For the analysis of the nonpolar compounds squalene and cucurbitadienol from S. grosvenorii, 50 mL of the above mentioned methanol/water extracts was extracted three times with the same volume of n-hexane. The combined extracts were dried under reduced pressure distillation and dissolved in 1 mL of n-hexane.

4.3. LC–MS/MS Analysis of 21 Mogrosides The HPLC system consisted of an Agilent Technologies 1260 Series LC system (Agilent, USA) equipped with an automatic degasser, a quaternary pump, and an autosampler. Chromatographic separations were performed on an Agilent Poroshell 120 SB C18 column (100 mm × 2.1 mm, 2.7 µm) by gradient elution using a mobile phase consisting of (A) water (containing 0.1% formic acid) and (B) acetonitrile with the following gradient procedure: 0–8 min 25% B, 11 min 80% B, 11.01–11.50 min 80% B and 11.51–15.0 min 20% B, with a flow rate of 0.20 mL/min. The injection volume was 2.0 µL. The column effluent was monitored using a 4000 QTRAP® LC–MS/MS (AB Sciex, Toronto, Canada). Ionization was achieved using electrospray ionization (ESI) in the negative-ion mode with nitrogen as the nebulizer. Multiple reaction monitoring (MRM) scanning was employed for quantification. The source settings and instrument parameters for each MRM transition were optimized not only to maximize the generated deprotonated analyte molecule ([M − H]−) of each targeted mogroside but also to efficiently produce its characteristic fragment/product ions. The electrospray voltage was set at −4500 V, and the source temperature was 500 ◦C. The curtain gas (CUR), nebulizer gas (GS1), and heater gas (GS2) were set at 15, 50, and 40 psi, respectively. The compound-dependent instrumental parameters of two individual precursor-to-product ion transitions specific for each analyte, including the precursor ion, two product ions, declustering potential (DP), entrance potential (EP), collision energy (CE), and collision cell exit potential (CXP), were optimized and are listed in Table2. The dwell time was 400 ms for each MRM transition. LC–MS/MS chromatograms of the standards of 21 mogrosides are presented in Figure A3a, and the samples are shown in Figure A3b. Molecules 2019, 24, 627 11 of 21

Table 2. MS parameters of 21 mogrosides.

Molecular Analytes t (min) m/z Precursor m/z Productions DP (V) CE (eV) Formula R (a) MIIA C42H72O14 16.4 799.5 637.6 /475.5 −170 −65 (a) MIIA1 C42H72O14 16.7 799.5 637.6 /475.5 −170 −65 M2 (a) MIIA2 C42H72O14 15.27 799.5 637.6 /475.5 −170 −65 (a) MIIE C42H72O14 13.4 799.5 637.6 /475.5 −170 −65 (a) MIII C48H82O19 11.56 961.6 799.4 /637.3 −170 −70 (a) MIIIE C48H82O19 11.85 961.6 799.4 /637.3 −170 −70 M3 (a) MIIIA1 C48H82O19 14.31 961.6 799.4 /637.3 −170 −70 (a) MIIIA2 C48H82O19 12.07 961.6 799.4 /637.3 −170 −70 (a) 11-O-SI C54H90O24 9.32 1121.6 959.6 /797.4 −220 −75 (a) SI C54H92O24 9.82 1123.6 961.6 /799.2 −220 −75 M4 (a) MIVA C54H92O24 10.20 1123.6 961.6 /799.2 −220 −75 (a) MIVE C54H92O24 10.77 1123.6 961.6 /799.2 −220 −75 (a) 11-D-O-MV C60H102O28 12.48 1269.8 1107.7 /945.6 −230 −90 (a) 11-E-MV C60H100O29 7.12 1283.8 1121.6 /959.5 −220 −85 (a) M5 MV C60H102O29 8.44 1285.8 1123.7 /961.7 −220 −90 (a) O-MV C60H102O29 6.52 1285.8 1123.7 /961.7 −220 −90 (a) IMV C60H102O29 9.35 1285.8 1123.7 /961.7 −220 −90 (a) 11-O-MVI C66H110O34 5.29 1445.8 1283.5 /1121.7 −240 −100 (a) MVI C66H112O34 6.15 1447.8 1285.6 /1123.6 −230 −105 M6 (a) MVIA C66H112O34 7.33 1447.8 1285.6 /1123.6 −230 −105 (a) MVIB C66H112O34 7.13 1447.8 1285.6 /1123.6 −230 −105 Source temperature (◦C) 500 Ionization voltage (V) −4500 Ion source (GS1) setting (psi) 50 Ion source (GS2) setting (psi) 40 Curtain gas setting (psi) 15 CAD high Dwell time (ms) 400 EP (V) −10 CXP (V) −15 (a) Product ion used for quantification. Abbreviations: tR, retention time. DP, declustering potential; CE, collision energy; CAD, Collision activation dissociation; EP, Entrance potential; CXP, Collision cell exit potential.

4.4. SNP Analysis The dried S. grosvenorii materials were wiped with 75% ethanol and ground into powder. Total DNA was extracted from approximately 100 mg of the powder with the plant genomic DNA kit following the manufacturer’s instructions and dissolved in 30 µL of sterile water. SgCS sequences were amplified from genomic DNA by polymerase chain reaction (PCR) using the SgCS-F and SgCS-R primers (Table A4). The PCR mixture (30 µL) contained template DNA 0.8 µL, forward primer (10 µM) 0.7 µL, reverse primer (10 µM) 0.7 µL, 10× PCR Buffer 3.0 µL, KOD-Plus-Neo DNA polymerase 0.6 µL, ◦ dNTP (2 mM) 3.0 µL, MgSO4 (25 mM) 1.2 µL and ddH2O 20.0 µL. The PCR conditions were 94 C for 2 min, followed by 30 cycles at 98 ◦C for 10 s, 59 ◦C for 30 s and 68 ◦C for 2 min 30 s, with a final incubation at 68 ◦C for 7 min. PCR products were examined by 1.5% agarose gel electrophoresis before bidirectional DNA sequencing on a 3730XL sequencer (Applied Biosystems, Foster City, CA, USA). Sequences were aligned using DNAman (version 8.0, Lynnon Biosoft, Quebec, Canada), and SNPs were identified by visual inspection of the alignments.

4.5. Cloning 4 Different Copies of SgCS Genes in Yeast Four different copies of SgCS genes were codon-optimized for synthesis according to the codon bias of yeast and cloned into the BamHI/EcoRI sites of the pCEV-G4-Km yeast expression vector under the control of the TEF1 promoter to construct pCEV-50R573L, pCEV-50C573L, pCEV-50R573Q and pCEV-50C573Q, respectively. The sequences that showed in National Center for Biotechnology Information (NCBI) database was obtained and codon-optimized as described by Qiao previously [36]. Molecules 2019, 24, 627 12 of 21

4.6. SgCS Mutagenesis Experiments Mutagenesis of the 50 sites was performed using a Site-directed Mutagenesis Kit (Biomed, Beijing, China), and the corresponding degenerate primers are presented in Table A4 with the substitutions underlined. The PCR products were then purified by agarose gel electrophoresis and transformed into Trans1-T1 E. coli. The sequences of the mutant genes in the resulting plasmid (pCEV-G4-Km) were confirmed by Sanger sequencing using the oligonucleotide primers pCEV-Seq-F and pCEV-Seq-R (Table A4).

4.7. Yeast Transformation and Cell Cultivation The plasmids were transfected into S. cerevisiae strain BY4742 using the Frozen-EZ yeast transformation II kit purchased from Zymo Research (CA, USA) and selected for growth on YPD plates with 200 mg/L G418. The empty pCEV-G4-Km vector was also introduced into BY4742 as a control. The recombinant cells were first inoculated into 15 mL culture tubes containing 2 mL of YPD medium with 200 mg/L G418 and grown at 30 ◦C and 250 rpm to an OD600 of approximately 1.0. Flasks (250 mL) containing 100 mL of medium were then inoculated to an OD600 of 0.05 with the seed cultures. Strains were grown at 30 ◦C and 250 rpm for 3 days, and all optical densities at 600 nm (OD600) were measured using a Shimadzu UV-2550 spectrophotometer.

4.8. In Vitro Activity The host strain S. cerevisiae BY4742 barboring different gene types of SgCS were grown under identical conditions to those described above and collected by centrifugation. Each yeast strain was suspended in 2 volumes of 100 mM sodium phosphate buffer, pH 7, and lysed using an Emulsiflex-C5 homogenizer. After lysis, 100 mM sodium phosphate buffer, pH 7, was added to generate a 20% slurry. A solution of 3(S)-oxidosqualene and Triton X-100 was added to the homogenate aliquots (350 µL) to a final concentration of 1 mg/mL substrate and 0.1% Triton X-100. After 0.5, 1, 3, 5, and 10 h, the reactions were terminated by adding two volumes of ethanol. The denatured protein was removed by centrifugation, and the supernatant was concentrated to dryness under a nitrogen stream. The residue was resuspended in n-hexane and analyzed by GC-MS.

4.9. GC-MS Analysis of Yeast and Plant Extracts Cells were collected by centrifugation at 10,000× g for 5 min, refluxed with 5 mL of 20% KOH/50% ethanol and extracted three times with the same volume of n-hexane. The combined extracts were dried under reduced pressure distillation and dissolved in 1 mL of n-hexane. GC-MS was performed on a Thermo Scientific ISQ Single Quadrupole GC-MS (Thermo Scientific, Waltham, MA USA). Separation of the components of the nonpolar active ingredients was carried out on an HP-5 MS capillary column (5% phenyl and 95% dimethylpolysiloxane, 30 m × 0.25 mm × 0.25 µm). An electron ionization system with an ionization energy of 70 eV was used in the mass spectrometer. Helium gas was used as the carrier gas, and the carrier flow rate was 1.5 mL/min. The temperatures of both the injector and MS transfer line were 250 ◦C, and the ion source temperature was 220 ◦C. The initial oven temperature was 70 ◦C, which was held for 2 min, and the temperature was then programmed to linearly increase at a rate of 20 ◦C/min up to 260 ◦C and finally increase by 10 ◦C/min up to 300 ◦C, where it was held for 10 min. A 1 µL sample was injected automatically into the monitoring system in 1:10 split mode. Under these conditions, squalene and cucurbitadienol eluted at 14.73 and 18.85 min, respectively.

4.10. Data Processing and Statistics Data were expressed as the mean ± standard deviation. IBM SPSS statistics software 22.0 (SPSS Inc., Chicago, IL, USA) was used for data analysis, including descriptive statistics of the data, Molecules 2019, 24, 627 13 of 21 correlation analysis of the factors, regression analysis and the chi-square test. The boxplots were generated with Origin 8 (OriginLab Co., Northampton, MA, USA).

5. Conclusions In this study, we conducted a biosynthesis-based secondary metabolomics analysis of 15 S. grosvenorii varieties. A high catalytic efficiency SgCS enzyme was obtained by SNP analysis and site-directed mutation. The genotypes and chemical differences of 15 S. grosvenorii varieties were revealed by SNP analysis of cucurbitadienol and quantitative analysis of secondary metabolomics, respectively. The 15 varieties showed significant differences in their secondary metabolite profiles. A total of four wild-type SgCS protein variants based on two missense mutation SNP sites were discovered. Moreover, a site-directed mutant, namely, 50K573L, produced a 33% enhancement in efficiency compared to wild-type 50R573L. Our findings thus identify a novel cucurbitadienol synthase allele correlates with high efficiency and provide new insight into molecular breeding of S. grosvenorii. Future studies are planned to include more varieties and functional genes, such as CYP450 and UGT, to produce high-quality S. grosvenorii. Gene editing technology can be used to knock out the “bad gene” and transfer the “better gene” into the elite S. grosvenorii germ. The “high catalytic efficiency” plant can be obtained and verified by PCR, Western Blot analysis and so on, which will lay the foundation for molecular design breeding.

Author Contributions: Conceptualization, X.M. and J.Q.; Methodology, Z.L. and Z.G.; Formal Analysis, J.Q.; Investigation, Y.Z.; Resources, Y.Z. and X.Z.; Data Curation, J.Q. and Z.L.; Writing-Original Draft Preparation, J.Q.; Writing-Review & Editing, X.M.; Project Administration, Z.L. and X.M.; Funding Acquisition, X.M. Funding: This work was supported by the National Natural Science Foundation of China (NO. 81573521), Beijing Municipal Natural Science Foundation (NO. 5172028) and CAMS Innovation Fund for Medical Sciences (NO. 2017-I2 M-1-013). Acknowledgments: We thank Yuan Zhou (Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences) for providing the cucurbitadienol standard. Conflicts of Interest: The authors declare no conflicts of interest.

Abbreviations

11-D-O-MV 11-deoxymogroside V 11-E-MV 11-epi-mogroside V O-MV 11-oxo-mogroside V 11-O-SI 11-O-siamenoside I 11-O-MVI 11-oxo-mogroside VI CS cucurbitadienol synthase CYP450 cytochrome P450s EPH epoxide IMV isomogroside V MIIA mogroside IIA MIIA1 mogroside IIA1 MIIA2 mogroside IIA2 MIIE mogroside IIE MIII mogroside III MIIIA1 mogroside IIIA1 MIIIA2 mogroside IIIA2 MIIIE mogroside IIIE MIVE mogroside IVE MIVA mogroside IVA MV mogroside V MVI mogroside VI Molecules 2019, 24, 627 14 of 21

MVIA mogroside VI A MVIB mogroside VIB SI siamenoside I SNP single nucleotide polymorphism SQE squalene epoxidases UGT UDP-glucosyltransferase

Appendix A Figure A1. Hierarchical clustering analysis of S. grosvenorii varieties. Figure A2. Alignment of PCR fragment showing the SNP sites of SgCS. Figure A3. Typical LC–MS/MS total ion chromatograms. Table A1. Chemical structures of 21 mogrosides from luohanguo. Table A2. Contents (mg/g) of 21 mogrosides in tested samples. MoleculesTable A3 2018. Contents, 23, x FOR (mg/g) PEER ofREVIEW squalene and cucurbitadienol in tested samples. 15 of 23 Table A4. Primers used in this study.

FigureFigure A1. A1. HierarchicalHierarchical clustering clustering analysis analysis of of S.S. grosvenorii grosvenorii varieties.varieties. Species Species were were grouped grouped into into four four mainmain clusters. clusters. Species Species S10 S10 and and S13 S13 were were categorized categorized into into Cluster Cluster I. I. S2, S2, E23 E23 and and E5 E5 were were grouped grouped into into ClusterCluster II. II. E1 E1 was was defined defined into into third third distinct distinct cluste clusterr (Cluster (Cluster III). III). The The remain remaininging species species including including C2, C2, C3, C6, W4, S3, E29, E3, E12 and E2 were put into Cluster IV. C3, C6, W4, S3, E29, E3, E12 and E2 were put into Cluster IV.

Molecules 2019, 24, 627 15 of 21 Molecules 2018, 23, x FOR PEER REVIEW 16 of 23

Figure A2.A2. AlignmentAlignment of of PCR PCR fragment fragment showing showing the SNPthe SN sitesP ofsites SgCS. of SgCS. Missense Missense mutation mutation are marked are withmarked stars. with stars.

Molecules 20182019, 2423, 627x FOR PEER REVIEW 1716 of of 23 21

Figure A3. Typical LC–MS/MS total ion chromatograms. (a) standard solutions of 21 analytes Figure A3. Typical LC–MS/MS total ion chromatograms. (a) standard solutions of 21 analytes (b) (b) real sample. real sample.

Molecules 2018, 23, x FOR PEER REVIEW 18 of 23 Molecules 2018, 23, x FOR PEER REVIEW 18 of 23 Table A1. Chemical structures of 21 mogrosides from luohanguo. Molecules 2018, 23, x FORTable PEER A1. REVIEW Chemical structures of 21 mogrosides from luohanguo. 18 of 23 Molecules 2018, 23, x FOR PEER REVIEW 18 of 23 Molecules 2018, 23, x FOR PEER REVIEW OR3 18 of 23 MoleculesMolecules 2018 ,2018 23, x, 23FOR, x FORPEER PEER REVIEW REVIEW 18 of18 23 of 23 Molecules 20192018, 2423, 627x FOR TablePEER REVIEW A1. Chemical structures of 21 mogrosidesOR3 from luohanguo. 1817 of of 23 21 Molecules 2018, 23, x FOR PEER REVIEW R2 18 of 23 Table A1. Chemical structuresMolecules 2018 of 21, 23 mo, x grosidesFOR PEER from REVIEW luohanguo. 18 of 23 Table A1. Chemical structures of 21 mogrosides from luohanguo. Molecules 2018, 23, x FORTable PEER A1. REVIEW Chemical structuresR2 of 21 mogrosides fromOH luohanguo. 18 of 23 Table A1. Chemical structures of 21 mogrosidesOR from3 luohanguo. TableOR A1.3 Chemical structures of 21 mogrosides fromOH luohanguo. TableTable A1. A1. ChemicalChemical structures structures of of 21 21 mo mogrosidesgrosidesOR from3 from luohanguo. luohanguo. OR 3 Table A1. Chemical structuresR2 of 21 mogrosides 3from luohanguo. OR R2 OR 3 R1O R2 3 OH R 2 OR3 OH 2 OR R O R 3 OH 1 R 2 OH 2 OH R2 OH R2 OH Commom Name CAS No. R1O -R1 -R2 OH -R3 R1O OH Commom Name CAS No. R1O -R1 -R2 -R3 2-1 mogroside IIA 1613527-65-3R O 1 -H α-OH Glc Glc 1 R1O 2-1 R1O -O H 6-1 mogroside IIA 1613527-65-3R O -H α-OH Glc Glc mogrosideCommom IIA1 Name 88901-44-4 CAS No. 1 -H -R1 α -R 2 Glc-R3 Glc R O Commom Name CAS No. -R1 -R21 -R3 6-1 6-11 -OH 2 3 mogrosideCommom IIA1 Name 88901-44-4 CAS No. -H -R1 α-OH-R 2 Glc-R3 Glc2-1 mogrosideCommomCommom IIA2 Name Name 88901-45-5 CAS CAS No. No. -RGlc -R1 Glc -Rα 2 -R -O H -H -R3Glc Glc Commommogroside Name IIA CAS 1613527-65-3No. -R1 -H 2-1 -R2 α -R3 mogroside IIA 1613527-65-3 -H α-OH Glc6-1 Glc 2-1 mogrosideCommommogroside IIA2 Name IIA 88901-45-5 1613527-65-3 CAS No. Glc -H1 -Ra 1 α-OHα2 -O-RH2 -H 3 Glc-R3 2-1 Glc mogrosideCommommogrosidemogroside IIe Name IIA IIA 88901-38-6 1613527-65-3 CAS 1613527-65-3 No. -H-RGlc -H Glc α-O-RHα -OH -RGlc 2-1Glc 6-1 Glc mogrosideCommommogroside IIA Name IIA1 1613527-65-3 CAS88901-44-4 No. -H -R-H1 6-1 α-OH-Rα 2 Glc-R3Glc Glc Glc Commom Name CAS -ONo.H -R1 -R2 -R3 6-12-1 mogroside IIA1 88901-44-4 mogroside IIe -H 88901-38-6α GlcGlca Glc α-O-OH -OH -OH H Glc 6-12-16-1 mogrosidemogrosidemogrosidemogrosidemogroside III IIA IIAIIA1 IIA1 130567-83-8 88901-44-4 1613527-65-3 88901-44-4 1613527-65-3 -H -HGlc -H -H 6-1 α-OαHα -Oα H Glc6-1GlcGlc2-1GlcGlc Glc mogrosidemogroside IIA1 IIAIIA2 88901-44-4 1613527-65-388901-45-5 -H -H Glc Glc α-OHα -OH Glc-HGlc Glc Glc 6-1 -OH -OH 6-12-1Glc mogroside IIA2 88901-45-5 mogroside IIAGlc Glc 1613527-65-3α -HGlc 6-1 -OαH Glc 6-1 Glc6-1 mogrosidemogrosidemogrosidemogrosidemogroside III IIA1 IIA1IIA2 IIA2 130567-83-8 88901-45-5 88901-44-4 88901-45-5 88901-44-4 -HGlc -HGlc Glc α-OαH-Oα -OHα-OH H Glc-H2-1 Glc6-1 mogrosidemogroside IIIe IIA2 88901-37-5 88901-45-5 6-1Glc a Glc α α -O H Glc-HGlc GlcGlcGlcGlc Glc mogrosidemogroside IIA2 IIA1IIe 88901-45-5 88901-44-488901-38-6 Glc -H Glc α-OHα -H Glc6-1 Glc Glca -OH Glc a -OH 2-16-1 mogroside IIe 88901-38-6 mogrosidemogroside IIA1 IIe 88901-38-6 88901-44-4α -HGlc Glc6-1a 6-1 -OαH -O H GlcGlc Glc mogrosidemogrosidemogrosidemogroside IIIe IIA2 IIA2IIe 88901-37-5 88901-45-5 88901-38-6 88901-45-5 Glca 6-1 α-OαH-Oα -OHα -OH H -HGlc Glc-H Glc6-1 mogrosidemogroside IIIA1 IIe 88901-42-2 88901-38-6 -H Glca GlcGlc Glc Glc α α -OH Glc mogrosidemogroside IIe IIA2III 88901-38-6 88901-45-5130567-83-8 Glc Glc6-1 Glc α-OHα -H 6-1 Glc Glc mogrosideGlc III 130567-83-8-OH -OH 2-1 6-1 mogroside III 130567-83-8 mogroside IIA2 88901-45-5α GlcGlca Glc -OαH -O H -HGlc Glc6-1 mogrosidemogrosidemogrosidemogroside IIIA1 IIe IIeIII 88901-42-2 88901-38-6 130567-83-8 88901-38-6 -H GlcGlcGlc a α α-Oα -OHα -OH H Glc6-1Glc Glc2-1 Glc mogroside III 130567-83-8 Glc Glca -OHα-OH Glc GlcGlc mogrosidemogrosidemogroside III IIeIIIe IIIe 130567-83-8 88901-37-5 88901-38-688901-37-5 2-1 α α GGlcl2-1c Glc Glc -OH Glca -OH Glc mogroside IIIe 88901-37-5 mogroside IIe 88901-38-6α Glc6-1Glc Glc α 6-12-16-16-1 mogrosidemogroside III IIIe 130567-83-8 88901-37-5 GlcGlc Glc -OαH-Oα-OH -OH H Glc Glc 2-16-1 Glc mogrosidemogrosidemogroside IIIA2 IIIIIIe 88901-43-3 88901-37-5 130567-83-8 Glc Glc Glc α α -Oα H GlcGlc 2-1GlcGlcGlcGlc Glc Glc mogrosidemogroside IIIe IIIIIIA1 88901-37-5 130567-83-888901-42-2 -H 6-1 α-OHα Glc Glc6-1Glc Glc mogroside IIIA1 88901-42-2 mogroside -H III 130567-83-8α-OH Glc6-1 Glc α-OH 6-1 6-1 mogrosidemogroside IIIA2 IIIA1 88901-43-3 88901-42-2 -HGlc α-OH -O H GlcGlc 6-1Glc2-1 Glc2-1Glc 11-O-siamenosidemogrosidemogrosidemogroside IIIe IIIeIIIA1 I 1793003-50-5 88901-37-5 88901-42-2 88901-37-5 Glc -H GlcGlc αO-Oα -OHα -OH H GlcGlcGlcGlcGlc2-1GlcGlc Glc mogrosidemogroside IIIA1 IIIe 88901-42-2 88901-37-5 -H 2-1Glc α-OH α -OH Glc GlcGlc Glc Glc 6-1Gl2-1c6-1 mogroside IIIe 88901-37-5 Glc α-OH Glc2-1Glc GlcGlc6-1 11-O-siamenosidemogroside IIIA1 I 1793003-50-5 88901-42-2 -HG lc 6-1 αO-O H 2-1Glc 6-1Glc mogroside IIIA1 88901-42-2 -H -Oα -OH H GGlclcGlc GlcGlc mogrosidemogroside IIIA1IIIA26-1 IIIA2 88901-43-3 88901-42-288901-43-3 -H Glc Glc α Gl2-1c Glc6-1 -OH Glc 6-1 -OH GlcGlc2-1 Glc mogroside IIIA2 88901-43-3 mogroside IIIA1Glc Glc 88901-42-2α -H 6-1 6-1 α 6-1Glc2-16-12-1 mogroside IIIA2 88901-43-3 Glc α-OH Glc mogrosidemogroside IVa IIIA2 88901-41-1 88901-43-3 Glc 6-1Glc Glc O α-OH GGlclc Glc Glc mogroside11-O-siamenoside IIIA2 I 88901-43-3 1793003-50-5 Glc 6-1Glc α-OH O GlcG l2-1c GGlclc Glc Glc6-1 Glc 6-1Glc 6-1 11-O-siamenoside I 1793003-50-5 O Glc6-1 Glc2-1 Glc mogrosidemogroside11-O-siamenoside11-O-siamenoside IVa IIIA2 I I 88901-41-11793003-50-5 88901-43-3 1793003-50-5 Glc Glc Glc6-16-1 -OαOH-O HO GlcGGlclc6-1Glc Glc Glc siamenosidemogroside11-O-siamenoside I IIIA2 I 126105-12-2 1793003-50-5 88901-43-3 Glc 2-1 Glc α -OαO -OH H GlcGlc 11-O-siamenosidemogroside IIIA2 I 1793003-50-5 88901-43-3 Glc GlcGlc6-1 GlcGlc O α Glc Glc -OH 2-1Glc6-1Gl2-1c6-1 mogroside IIIA2 88901-43-3 GlcGlc Glc -OαH Glc Glc 6-1 siamenoside11-O-siamenoside I I 126105-12-2 1793003-50-5 GGlc lc 6-1 α O 2-1Glc 6-1Glc 11-O-siamenoside11-O-siamenosidemogroside IVa I 1793003-50-588901-41-1 1793003-50-5 GlcGlc O O GGlclc6-1Glc GlcGlc mogroside6-1 IVa 88901-41-1 6-1 Glc Gl2-1c 2-1Glc Glc mogroside IVa 88901-41-1 11-O-siamenoside I 1793003-50-5O Glc 6-1 O GlcGlc 6-12-1Glc mogrosideGlc IVa Glc 88901-41-1 Glc6-1Glc 6-1Glc O 2-1Glc2-16-16-1 mogrosidemogroside IVE IVa 89590-95-4 88901-41-1 Glc 6-1Glc Glc α-OH -OOH GGlclc 6-1Glc Glc Glc mogrosidesiamenoside IVa I 88901-41-1 126105-12-2 Glc 6-1Glc O α GlcGl2-1c Glc Glc -OH Glc6-1 Glc 2-16-1GlcGl6-1c siamenoside I 126105-12-2 α Glc6-1 -OH -OH Glc2-16-1 Glc mogrosidemogrosidesiamenoside IVE IVa I 89590-95-4 88901-41-1 126105-12-2 GlcGlc6Glc-1 Glc6-16-1 α α O-O H GlcGGlclc6-1GlcGlc6-1 6-1Glc 11-deoxymogrosidesiamenosidesiamenoside I V I 1707161-17-8 126105-12-2 126105-12-2 Glc 2-1 GlcGlc -H α Glc GlcGlc siamenosidemogrosidemogroside I IVa 126105-12-2 88901-41-1 88901-41-1 GlcGlc6-1 GlcGlc α-OH O O 6-1Glc2-16-1Glc GlcGlc 6-1 2-1Gl2-1c6-1 mogroside IVa 88901-41-1 Glc Glc O GlcGlc Glc6-1Glc 11-deoxymogrosidesiamenoside I V 1707161-17-8 126105-12-2 GlcGGlclc Glc6-1 -Hα -OH 2-1Glc 2-1Glc6-1 GlcGlc -O H GGlclcGlc GlcGlc siamenosidesiamenosidemogroside IVE I 6-1 126105-12-289590-95-4 126105-12-2 Glc Glc α α-O H Gl2-1c GGlclc6-1 Glc Glc 2-1 -OH GlcGlc2-1 Glc mogroside IVE 89590-95-4 siamenosidemogroside GlcI IVEGlc 89590-95-4 126105-12-2α-OH Glc 6-1Glc α 6-12-12-16-1 mogroside IVE 89590-95-4 6Glc-1 6-1 Glc -OHα-OH Glc GlcGlc2-12-1 Glc 11-epi-mogrosidemogroside IVE V 2146088-12-0 89590-95-4 Glc 6-1Glc6-1Glc Glc β α GlcGl2-1cGlcGlc Glc mogroside11-deoxymogroside IVE 6-1 V 89590-95-4 1707161-17-8 Glc GlcGlc Glc α-OH-H Glc 6-1 Glc Glc6-1 Glc 2-1GlcGl6-1c 11-deoxymogroside V 1707161-17-8 Glc Glc -H 6-16-1 β-OH Glc Glc2-1Glc Glc 11-epi-mogrosidemogroside11-deoxymogroside11-deoxymogroside IVE V V 2146088-12-0 89590-95-4 1707161-17-8 Glc GlcGlc6-1 Glc α-O-H H GGlclc6-1Glc 2-1GlcGlc 11-deoxymogroside V 1707161-17-8 1707161-17-8 Glc2-16Glc-1 6-1GlcGlc -H -H-O H Glc Glc2-1 11-deoxymogrosidemogrosidemogroside IVE V 1707161-17-8 89590-95-4 89590-95-4 Glc GlcGlc6-1Glc Glc -H α α-O H Gl2-1c Glc2-1Glc GlcGlc V Glc -OH 6-1Gl2-1c6-1 mogroside IVE 89590-95-4 Glc6-16-1 Glc α 2-1Glc 6-1Glc mogroside11-deoxymogroside V V 88901-36-4 1707161-17-8 GGlclc 6-1-1Glc α-O-HH Glc Glc 6-1 11-deoxymogroside11-epi-mogroside V V 1707161-17-82146088-12-0 Glc Glc6-1Glc6-1 Glc -Hβ -OH Glc GGlclc6-1Glc GlcGlc 11-deoxymogroside6-1 V 1707161-17-8 Glc Glc6-1 GlcGlcGlc -H GlcGlc2-16-1 6-1Glc 11-epi-mogroside V 2146088-12-0 11-deoxymogrosideGlc VGlc 1707161-17-8β-OH Glc6-1 6-1Glc -H Glc2-1 2-1Glc6-1 mogroside11-epi-mogroside V V 88901-36-4 2146088-12-0 Glc GlcGlc6-1 Glc α-OHβ -OH 6-1Glc2-12-1 Glc 11-epi-mogroside11-epi-mogroside V 2146088-12-0 2-16Glc-1 Glc β GlcGl2-1c Glc 11-epi-mogroside V 2146088-12-02146088-12-0 Glc Glc β-OH Gl2-1c 2-1 6-1Gll2-1c6-1 V 6-16-1 2-1Glc Gl6-1c Glc 11-oxo-mogroside11-epi-mogroside V V 126105-11-1 2146088-12-0 GGlclc 6-1-1Glc βO-OH GlcGlc Glc 6-1 11-epi-mogrosidemogroside V V 2146088-12-088901-36-4 Glc Glc6-1Glc6-1 Glc βα--OOHH Glc GGlclc6-1Glc GlcGlc 11-epi-mogroside6-1 V 2146088-12-0 Glc Glc6-1 GlcGlcGlc β-OH GlcGlc2-16-1 6-1Glc mogroside V 88901-36-4 11-epi-mogrosideGlc V Glc 2146088-12-0α-OH Glc6-1 6-1Glc β-OH Glc2-1 2-1Glc6-1 11-oxo-mogrosidemogroside V V 126105-11-1 88901-36-4 Glc GlcGlc6-1 Glc O α-OH 6-1Glc2-12-1 Glc mogroside V 88901-36-4 2-16Glc-1 Glc α GlcGl2-1c Glc mogroside V 88901-36-4 Glc Glc α-OH Gl2-1c 2-1 mogroside V 88901-36-4 6-1Gll2-1c6-1 4-16-1 2-1Glc Gl6-1c Glc isomogrosidemogroside VV 1126032-65-2 88901-36-4 GGlclc 6-1-1Glc α-OαH-OH GlcGlc Glc 6-1 mogroside11-oxo-mogroside V V 88901-36-4126105-11-1 Glc Glc6-1Gl6c-1 Glc α -OOH Glc GGlclc6-1 Glc mogroside V 6-1 88901-36-4 Glc 6-1 GlcGlc α-OH GlcGlc2-16-1 Glc6-1Glc Glc 11-oxo-mogroside V 126105-11-1 mogroside VGlc Glc 88901-36-4O Glc4-1Glc6-1Glc Glc α-OH Glc2-1 2-1Glc6-1 isomogroside11-oxo-mogroside V V 1126032-65-2 126105-11-1 Glc GlcG6-1lc Glc α-OH O 6-1Glc2-1 Glc 11-oxo-mogroside V 126105-11-1 2-16Glc-1 Glc O GlcGl2-1c Glc2-1 11-oxo-mogroside V 126105-11-1 Glc Glc O Gl2-1c 2-1 11-oxo-mogroside 6-1 6-1Gll2-1c6-1 6-1 2-1 Gl6-1c 11-oxo-mogroside V 126105-11-1 126105-11-1 GlcGGlclc Glc64-1-1 O GlcGGlclc GlcGlc 11-oxo-mogrosideisomogrosideV VIV 1421942-59-7 1126032-65-2 6-1 Glc O α-OH Glc GGlclc6-1 6-1Glc 11-oxo-mogroside4-1 V 126105-11-1 6-1GlcG6lc-1 6-1 GlcGlc O G6-1lcGlc6-1 Glc isomogroside V 1126032-65-2 11-oxo-mogroside11-oxo-mogrosideGlc V GVl c 126105-11-1 126105-11-1α-OH 2-1Glc Glc4-1Glc Glc O O Gl2-1cGlc2-1 6-1Glc 11-oxo-mogrosideisomogroside VIV 1421942-59-7 1126032-65-2 Glc GlcGlc4-1GlcG lc O α-OH Glc 6-1Glc2-1Glc Glc isomogroside V 1126032-65-2 2-14-1Glc Glc α 2-1 2-1 isomogroside V 1126032-65-2 GGl2-1lcc Glc α-OH GGlcl2-1cGlc Glc Gll2-1c6-1 6-14-16-1 6-1 6-1 isomogroside V 1126032-65-2 GlcGGlclc Glc4-1Glc -OαH-OH Glc2-1GGlclc GGlcl6-1c Glc mogroside11-oxo-mogrosideisomogroside VI 6-1 VI V 89590-98-7 1126032-65-2 1421942-59-7 Glc Glc6-1 Glc α -OOH Glc GGlclc6-1 6-1Glc isomogroside V 1126032-65-2 6-1G4-1lc 6-14-1Glc α G6-1lc 6-1 11-oxo-mogroside VI 1421942-59-7 isomogrosideisomogrosideGlc V V Glc 1126032-65-2 1126032-65-2O 2-1GGlclc Glc6-1GGlclc Glc α-OHα -OH Gl2-1cGlc2-1 Glc6-1Glc Glc mogroside11-oxo-mogroside VI VI 89590-98-7 1421942-59-7 Glc 6-1Glc2-1Glc Glc α-OH O Glc 6-1Glc2-1Glc Glc 11-oxo-mogroside2-1 VI 1421942-59-7 Glc 2-1 Glc O GlcGl2-1c Glc2-1 11-oxo-mogroside VI 1421942-59-7 Gl2-1c 2-1 O Gl2-1c 2-1 Gl2-1c6-1 6-1Gll2-1c6-1 2-1Glc6-1 6-1Glc6-1 2-1Glc 6-1Glc mogroside11-oxo-mogroside11-oxo-mogroside VI AG lc VI 2146088-13-1 1421942-59-7 GlcGlc α-OHO GlcGlc GGlclc 11-oxo-mogrosidemogroside VI 6-1 VI 1421942-59-7 1421942-59-789590-98-7 Glc GGlclc6-1Glc6-1 Glc Glc α -OOH Glc GGlclc6-1 6-1Glc Glc2-1 6-1Glc Glc2-16-1 6-1Glc mogroside VI 89590-98-7 11-oxo-mogroside11-oxo-mogrosideVI Glc VI Glc VI 1421942-59-7 1421942-59-7α-OH GlcGlc6-1Glc2-16-1Glc 6-1Glc O O GGlcl2-1c 2-1GlcGlc6-1 Glc mogrosidemogroside VI A VI 2146088-13-1 89590-98-7 Glc 6-1Glc2-1Glc Glc Glc α-OHα -OH 6-1Glc2-1 Glc mogroside VI2-1 89590-98-7 2-1 α 2-1 mogroside VI 89590-98-7 GlcGlc 2-1Glc α-OH GGlcl2-1cGlc Glc2-1 Gll2-1c6-1 Gll2-1c6-1 2-1Glc6-1 6-1Glc 2-1Glc6-1 6-1Glc mogrosidemogroside VI VIBG l c 2149606-17-5 89590-98-7 Gllc Glc6-1 6-1 α-OαH-OH GGlclcGlc GlcGlc mogroside VI A 89590-98-72146088-13-1 Glc GGlclc6-1Glc6-1 Glc Glc α -OH GGlclc6-1 6-1Glc 6-1 6-1 2-1 2-1 2-1 6-1 mogroside VI A 2146088-13-1 mogrosidemogroside VIGlc VI 89590-98-7 89590-98-7α-OH GlcGlc6-1Glc6-1Glc Glc6-1 α-OH -OH GlcGlc6-1 Glc6-1Glc Glc mogrosidemogrosidemogroside VI B VI A Glc Glc 2149606-17-5 2146088-13-1 89590-98-7 Glc Glc2-16-1 6-1 α-OHα -Oα H Glc 6-1Glc2-1Glc Glc mogroside VI A 2146088-13-1 6Glc-1 Glc 6-1Glc Glc α-OH Glc Glc Glc mogroside VI A 2146088-13-1 GlcGl2-1c 2-1 Glc α-OH Glc2-1Gl2-1c Glc2-1 Glc Glc Glc Gll2-1c6-1 6-1 6-1 2-1 2-1 mogroside VI A 2146088-13-1 GGlcllc 6-1-1 α-OH Glc2-1GGlclc 6-16-1Glc Glc Glc 6-1 Glc -O H Glc GGlcGlclcGlc GlcGlc mogroside VI AB 2146088-13-12149606-17-5 Glc6-16-1 Glc Glc α Glc6-1 6-1 6-1 -OH Glc Glc6-1 6-1 6-1 -OH Gl2-1cGlc2-1 6-1 Glc mogroside VI B 2149606-17-5 mogrosidemogroside VIGlc A VIGlc A 2146088-13-1 2146088-13-1α Glc 6-1Glc Glc α -OH Glc2-12-16Glc-1 Glc Glc mogrosidemogroside VI AB 2149606-17-5 2146088-13-1a GlcGlc6-1 Glc Glc α-OαH Glc 6-1Glc Glc mogroside VI B 2149606-17-5 Glc = β-D-glucopyranosyl.2-16Glc-1 Glc α-OH mogroside VI B 2149606-17-5 α-OH GlcGl2-1Glcc2-1Glc 2-1 Glc Glc G2-1lc a Glc 6-1 2-1 6-1 Glc = β-D-glucopyranosyl. -OH GGlclGlcc 2-16-1Glc mogroside VI B 2149606-17-5 Glc 6-1Glc α GlcGlcGlc Glc mogroside VI B 2149606-17-5 2-1Glc Glc α-OH Glc Glc6-1 6-1 -OH Glc2-1 2-1 Glc6-1 mogroside VI B 2149606-17-5 GlcGlc 6Glc-1 α -OH 2-12-1Glc Glc mogroside VI B 2149606-17-5 Glc Glc α 2-1Glc Glc a 2-1GlcGlc mogroside VI B 2149606-17-5 Glc = β-D-glucopyranosyl. Glc 2-1 a Glc = β-D-glucopyranosyl. Glc2-1 a Glc = β-D-glucopyranosyl. 2-1Glc a Glc = β-D-glucopyranosyl. Glc Glc = β-D-glucopyranosyl. 2-1Glc 2-1 a Glc = β-D-glucopyranosyl. Glc a Glc = β-D-glucopyranosyl. Glc a Glca = β-D-glucopyranosyl. a GlcGlc = = ββ-D-glucopyranosyl.-D-glucopyranosyl.

Molecules 2019, 24, 627 18 of 21

Table A2. Contents (mg/g) of 21 mogrosides in tested samples (n = 3, mean ± SD).

MVI MVIA MVIB 11-O-MVI 11-O-MV MV IMV 11-E-MV 11-D-O-MV SI MIVA MIVE 11-O-SI MIII MIIIE MIIIA2 MIIIA1 MIIE MIIA2 MIIA MIIA1 E1 0.66 ± 0.19 0.34 ± 0.10 0.89 ± 0.26 0.23 ± 0.07 0.33 ± 0.07 12.31 ± 2.83 1.75 ± 0.42 3.87 ± 1.01 0.26 ± 0.08 1.16 ± 0.26 0.74 ± 0.16 1.31 ± 0.30 0.30 ± 0.07 0.04 ± 0.01 0.68 ± 0.12 0.04 ± 0.01 0.29 ± 0.07 ND 0.14 ± 0.04 0.19 ± 0.05 ND E2 0.35 ± 0.02 0.15 ± 0.02 0.40 ± 0.02 0.11 ± 0.01 0.27 ± 0.00 10.43 ± 0.35 1.05 ± 0.07 2.91 ± 0.09 0.29 ± 0.02 0.83 ± 0.05 1.80 ± 0.05 1.02 ± 0.03 0.18 ± 0.01 0.61 ± 0.03 0.33 ± 0.03 0.19 ± 0.02 0.12 ± 0.00 0.09 ± 0.01 0.06 ± 0.01 0.04 ± 0.00 ND E3 0.71 ± 0.08 0.30 ± 0.02 0.84 ± 0.03 0.22 ± 0.01 0.34 ± 0.01 10.90 ± 0.48 0.93 ± 0.04 4.09 ± 0.13 0.45 ± 0.03 1.12 ± 0.05 0.88 ± 0.04 1.45 ± 0.10 0.33 ± 0.02 0.13 ± 0.01 0.89 ± 0.05 0.04 ± 0.01 0.23 ± 0.01 0.01 ± 0.00 0.12 ± 0.02 0.14 ± 0.01 ND E5 0.74 ± 0.06 0.36 ± 0.03 0.89 ± 0.07 0.19 ± 0.01 0.39 ± 0.03 12.68 ± 0.35 1.41 ± 0.04 3.80 ± 0.28 0.50 ± 0.05 0.50 ± 0.04 1.74 ± 0.12 0.08 ± 0.01 0.04 ± 0.00 ND 0.02 ± 0.00 0.09 ± 0.01 ND 0.08 ± 0.01 ND ND ND E12 0.56 ± 0.04 0.29 ± 0.01 0.67 ± 0.04 0.14 ± 0.00 0.28 ± 0.03 10.80 ± 0.43 0.89 ± 0.05 2.91 ± 0.20 0.34 ± 0.03 1.03 ± 0.09 1.72 ± 0.14 1.08 ± 0.06 0.21 ± 0.01 0.39 ± 0.03 0.49 ± 0.05 0.12 ± 0.01 0.25 ± 0.03 0.02 ± 0.00 0.08 ± 0.01 0.09 ± 0.00 ND E23 0.44 ± 0.02 0.21 ± 0.02 0.59 ± 0.02 0.12 ± 0.00 0.28 ± 0.01 12.15 ± 0.06 1.28 ± 0.05 2.82 ± 0.07 0.44 ± 0.01 1.07 ± 0.03 0.81 ± 0.03 0.91 ± 0.01 0.17 ± 0.01 0.07 ± 0.00 0.80 ± 0.02 0.03 ± 0.00 0.17 ± 0.01 ND 0.06 ± 0.01 0.05 ± 0.00 ND E29 0.46 ± 0.04 0.18 ± 0.00 0.45 ± 0.05 0.11 ± 0.01 0.23 ± 0.02 10.76 ± 0.36 0.96 ± 0.08 2.54 ± 0.14 0.29 ± 0.02 0.74 ± 0.05 0.37 ± 0.03 0.99 ± 0.03 0.14 ± 0.00 0.01 ± 0.00 0.26 ± 0.01 0.03 ± 0.00 0.12 ± 0.01 ND 0.06 ± 0.01 0.05 ± 0.01 ND S2 0.69 ± 0.07 0.30 ± 0.02 0.76 ± 0.05 0.18 ± 0.01 0.41 ± 0.02 13.49 ± 0.63 0.75 ± 0.01 3.91 ± 0.18 0.54 ± 0.05 1.21 ± 0.09 0.85 ± 0.06 1.33 ± 0.10 0.21 ± 0.01 0.05 ± 0.00 0.47 ± 0.04 0.05 ± 0.01 0.27 ± 0.02 ND 0.12 ± 0.01 0.09 ± 0.01 ND S3 0.24 ± 0.01 0.14 ± 0.01 0.34 ± 0.02 0.05 ± 0.00 0.22 ± 0.02 10.84 ± 0.45 0.61 ± 0.02 2.02 ± 0.13 0.25 ± 0.01 0.72 ± 0.04 0.46 ± 0.02 0.75 ± 0.04 0.09 ± 0.01 0.01 ± 0.00 0.08 ± 0.00 ND 0.15 ± 0.01 ND 0.07 ± 0.01 0.04 ± 0.00 ND S10 0.13 ± 0.00 0.04 ± 0.01 0.17 ± 0.01 0.03 ± 0.00 0.05 ± 0.00 4.86 ± 0.05 0.35 ± 0.02 1.01 ± 0.01 0.01 ± 0.00 0.26 ± 0.00 0.16 ± 0.00 0.44 ± 0.02 0.07 ± 0.00 ND 0.13 ± 0.00 ND 0.05 ± 0.00 ND 0.02 ± 0.01 0.04 ± 0.00 ND S13 0.27 ± 0.01 0.10 ± 0.01 0.17 ± 0.01 0.05 ± 0.00 0.08 ± 0.01 7.24 ± 0.20 0.28 ± 0.01 1.10 ± 0.02 0.28 ± 0.00 0.82 ± 0.02 0.18 ± 0.00 2.10 ± 0.01 0.13 ± 0.00 0.01 ± 0.00 0.88 ± 0.02 0.07 ± 0.00 0.08 ± 0.00 ND 0.02 ± 0.00 0.11 ± 0.00 ND C2 0.35 ± 0.03 0.16 ± 0.02 0.47 ± 0.04 0.10 ± 0.01 0.17 ± 0.02 9.91 ± 0.61 1.14 ± 0.10 1.83 ± 0.17 0.25 ± 0.03 0.76 ± 0.07 0.23 ± 0.03 1.10 ± 0.10 0.18 ± 0.01 0.11 ± 0.01 1.90 ± 0.15 0.08 ± 0.01 0.10 ± 0.01 ND 0.02 ± 0.00 0.99 ± 0.09 0.05 ± 0.01 C3 0.37 ± 0.03 0.16 ± 0.02 0.41 ± 0.05 0.14 ± 0.01 0.23 ± 0.05 10.70 ± 0.46 0.87 ± 0.26 2.87 ± 0.60 0.29 ± 0.01 1.01 ± 0.14 0.22 ± 0.04 1.35 ± 0.11 0.25 ± 0.03 0.04 ± 0.05 0.97 ± 0.81 0.02 ± 0.00 0.10 ± 0.02 ND ND 0.38 ± 0.48 0.02 ± 0.05 C6 0.37 ± 0.02 0.16 ± 0.01 0.46 ± 0.02 0.11 ± 0.01 0.23 ± 0.01 10.20 ± 0.34 0.89 ± 0.04 2.40 ± 0.09 0.26 ± 0.02 1.06 ± 0.04 0.44 ± 0.02 1.88 ± 0.06 0.27 ± 0.01 0.04 ± 0.00 0.79 ± 0.03 0.10 ± 0.01 0.14 ± 0.01 ND 0.07 ± 0.01 0.13 ± 0.01 ND W4 0.31 ± 0.03 0.22 ± 0.02 0.40 ± 0.03 0.07 ± 0.01 0.22 ± 0.02 10.10 ± 0.56 0.37 ± 0.04 2.05 ± 0.13 0.21 ± 0.02 0.73 ± 0.04 0.33 ± 0.03 0.98 ± 0.04 0.14 ± 0.01 0.02 ± 0.00 0.55 ± 0.02 0.03 ± 0.01 0.15 ± 0.01 ND 0.04 ± 0 0.25 ± 0.01 ND Average 0.44 0.21 0.53 0.12 0.25 10.49 0.90 2.67 0.31 0.87 0.73 1.12 0.18 0.10 0.62 0.06 0.15 0.01 0.06 0.17 0.00 Content Coefficient of 43.37 46.02 45.88 50.25 40.14 21.12 45.94 37.29 41.87 30.98 78.03 44.31 46.9 166.85 82.1 80.17 58.66 – 88.59 157.39 – Variation (%)

Table A3. Contents (mg/g) of squalene and cucurbitadienol in tested samples (n = 3, mean ± SD).

E1 E2 E3 E5 E12 E23 E29 S2 S3 S10 S13 C2 C3 C6 W4 Average Content CV (%) Squalene 0.10 ± 0.01 0.28 ± 0.13 0.03 ± 0.00 0.08 ± 0.01 0.07 ± 0.01 0.07 ± 0.01 0.24 ± 0.09 0.10 ± 0.02 0.06 ± 0.02 0.07 ± 0.00 0.17 ± 0.04 1.24 ± 0.15 0.39 ± 0.13 0.17 ± 0.01 0.21 ± 0.05 0.22 135.03 Cucurbitadienol 0.48 ± 0.05 0.62 ± 0.24 0.23 ± 0.05 0.25 ± 0.02 0.22 ± 0.00 0.20 ± 0.01 0.36 ± 0.07 0.27 ± 0.06 0.18 ± 0.04 0.17 ± 0.02 0.56 ± 0.07 1.80 ± 0.19 1.01 ± 0.22 0.57 ± 0.03 0.55 ± 0.11 0.50 86.03 Molecules 2019, 24, 627 19 of 21

Table A4. Primers used in this study.

SgCS-F ATGTGGAGGTTAAAGGTC SgCS-R TTATTCAGTCAAAACCCG pCEV- Seq-F CGATGACCTCCCATTGATA pCEV- Seq-R CGTTCTTAATACTAACATAACT 50A-F ATTGTTGCAAGTTCATAAAGCTGCTAAAGCTTTTCATGATGATAGA 50A-R TCTATCATCATGAAAAGCTTTAGCAGCTTTATGAACTTGCAACAAT 50C-F ATTGTTGCAAGTTCATAAAGCTTGTAAAGCTTTTCATGATGATAGA 50C-R TCTATCATCATGAAAAGCTTTACAAGCTTTATGAACTTGCAACAAT 50D-F ATTGTTGCAAGTTCATAAAGCTGATAAAGCTTTTCATGATGATAGA 50D-R TCTATCATCATGAAAAGCTTTATCAGCTTTATGAACTTGCAACAAT 50E-F ATTGTTGCAAGTTCATAAAGCTGAAAAAGCTTTTCATGATGATAGA 50E-R TCTATCATCATGAAAAGCTTTTTCAGCTTTATGAACTTGCAACAAT 50H-F ATTGTTGCAAGTTCATAAAGCTCATAAAGCTTTTCATGATGATAGA 50H-R TCTATCATCATGAAAAGCTTTATGAGCTTTATGAACTTGCAACAAT 50K-F ATTGTTGCAAGTTCATAAAGCTAAAAAAGCTTTTCATGATGATAGA 50K-R TCTATCATCATGAAAAGCTTTTTTAGCTTTATGAACTTGCAACAAT The nucleotides in bold and underlined were the corresponding mutant sites.

References

1. Chaturvedula, V.S.P.; Prakash, I. Cucurbitane Glycosides from Siraitia grosvenorii. J. Carbohydr. Chem. 2011, 30, 16–26. [CrossRef] 2. Zhang, J.; Dai, L.; Yang, J.; Liu, C.; Men, Y.; Zeng, Y.; Cai, Y.; Zhu, Y.; Sun, Y. Oxidation of Cucurbitadienol Catalyzed by CYP87D18 in the Biosynthesis of Mogrosides from Siraitia grosvenorii. Plant Cell Physiol. 2016, 57, 1000–1007. [CrossRef] 3. Li, C.; Lin, L.M.; Sui, F.; Wang, Z.M.; Huo, H.R.; Jiang, T.L. Chemistry and pharmacology of Siraitia grosvenorii: A review. Chin. J. Nat. Med. 2014, 12, 89–102. [CrossRef] 4. Pawar, R.S.; Krynitsky, A.J.; Rader, J.I. Sweeteners from plants–with emphasis on Stevia rebaudiana (Bertoni) and Siraitia grosvenorii (Swingle). Anal. Bioanal. Chem. 2013, 405, 4397–4407. [CrossRef] 5. Xia, Y.; Riverohuguet, M.E.; Hughes, B.H.; Marshall, W.D. Isolation of the sweet components from Siraitia grosvenorii. Food Chem. 2008, 107, 1022–1028. [CrossRef] 6. Qing, Z.X.; Zhao, H.; Tang, Q.; Mo, C.M.; Huang, P.; Cheng, P.; Yang, P.; Yang, X.Y.; Liu, X.B.; Zheng, Y.J. Systematic identification of flavonols, flavonol glycosides, triterpene and siraitic acid glycosides from Siraitia grosvenorii using high-performance liquid chromatography/quadrupole-time-of-flight mass spectrometry combined with a screening strategy. J. Pharm. Biomed. Anal. 2017, 138, 240–248. [CrossRef] 7. Fu, W.; Ma, X.; Tang, Q.; Mo, C. Karyotype analysis and genetic variation of a mutant in Siraitia grosvenorii. Mol. Biol. Rep. 2012, 39, 1247. [CrossRef] 8. Bai, L.H.; Ma, X.J.; Mo, C.M.; Shi, L.; Feng, S.X.; Jiang, X.J. Study on quantitative assessment of Siraitia grosvenorii germplasms by general index. China J. Chin. Mater. Med. 2007, 32, 2482–2484. 9. Tu, D.; Ma, X.; Zhao, H.; Mo, C.; Tang, Q.; Wang, L.; Huang, J.; Pan, L. Cloning and expression of SgCYP450-4 from Siraitia grosvenorii. Acta Pharm. Sin. B 2016, 6, 614–622. [CrossRef] 10. Ma, X.J.; Mo, C.M.; Bai, L.H.; Feng, S.X. A New Siraitia grosvenorii Cultivar ‘Yongqing 1’. Acta Hortic. Sin. 2008, 35, 1855. 11. Liu, W.J.; Mo, C.M.; Ma, X.J.; Zhang, H.Y.; Sun, B.X.; Jiang, X.J. Breeding of superior Siraitia grosvenorii cultivars producing glycosides-V. Guihaia 2010, 30, 881–883. 12. Mo, C.M.; Ma, X.J.; Liu, W.J.; Bai, L.H.; Tang, Q.; Feng, S.X. Genetic effects of the main characteristics of Siraitia grosvenorii. Guihaia 2009, 29, 899–904. 13. Pan, L.; Mo, C.; Ma, X.; Feng, S.; Tang, Q.; Bai, L.; Wei, R.; Xing, A.; Branch, L. The Construction of the Aseptic Strain of the Tissue Culture of Seedless Siraitia grosvenorii. Chin. Agric. Sci. Bull. 2013, 29, 150–155. 14. Zhao, H.; Guo, J.; Tang, Q.; Guo, L.; Huang, L.; Ma, X. Cloning and expression analysis of squalene epoxidase genes from Siraitia grosvenorii. China J. Chin. Mater. Med. 2018, 43, 3255–3262. Molecules 2019, 24, 627 20 of 21

15. Dai, L.; Liu, C.; Zhu, Y.; Zhang, J.; Men, Y.; Zeng, Y.; Sun, Y. Functional Characterization of Cucurbitadienol Synthase and Triterpene Glycosyltransferase Involved in Biosynthesis of Mogrosides from Siraitia grosvenorii. Plant Cell Physiol. 2015, 56, 1172–1182. [CrossRef] 16. Itkin, M.; Davidovich-Rikanati, R.; Cohen, S.; Portnoy, V.; Doron-Faigenboim, A.; Oren, E.; Freilich, S.; Tzuri, G.; Baranes, N.; Shen, S.; et al. The biosynthetic pathway of the nonsugar, high-intensity sweetener mogroside V from Siraitia grosvenorii. Proc. Natl. Acad. Sci. USA 2016, 113, E7619–E7628. [CrossRef] 17. Zhao, H.; Tang, Q.; Mo, C.; Bai, L.; Tu, D.; Ma, X. Cloning and characterization of squalene synthase and cycloartenol synthase from Siraitia grosvenorii. Acta Pharm. Sin. B 2017, 7, 215–222. [CrossRef] 18. Su, H.; Liu, Y.; Xiao, Y.; Tan, Y.; Gu, Y.; Liang, B.; Huang, H.; Wu, Y. Molecular and biochemical characterization of squalene synthase from Siraitia grosvenorii. Biotechnol. Lett. 2017, 39, 1–10. [CrossRef] 19. Nicole, S.; Barcaccia, G.; Erickson, D.L.; Kress, J.W.; Lucchin, M. The coding region of the UFGT gene is a source of diagnostic SNP markers that allow single-locus DNA genotyping for the assessment of cultivar identity and ancestry in grapevine (Vitis vinifera L.). BMC Res. Notes 2013, 6, 502. [CrossRef] 20. Primmer, C.R.; Borge, T.; Lindell, J.; Saetre, G.-P. Single-nucleotide polymorphism characterization in species with limited available sequence information: High nucleotide diversity revealed in the avian genome. Mol. Ecol. 2010, 11, 603–612. [CrossRef] 21. Song, W.; Qiao, X.; Chen, K.; Wang, Y.; Ji, S.; Feng, J.; Li, K.; Lin, Y.; Ye, M. Biosynthesis-Based Quantitative Analysis of 151 Secondary Metabolites of Licorice to Differentiate Medicinal Glycyrrhiza Species and Their Hybrids. Anal. Chem. 2017, 89, 3146–3153. [CrossRef] 22. Begovich, A.B.; Carlton, V.E.H.; Honigberg, L.A.; Schrodi, S.J.; Chokkalingam, A.P.; Alexander, H.C.; Ardlie, K.G.; Huang, Q.; Smith, A.M.; Spoerke, J.M. A Missense Single-Nucleotide Polymorphism in a Gene Encoding a Protein Tyrosine Phosphatase (PTPN22) Is Associated with Rheumatoid Arthritis. Am. J. Hum. Genet. 2004, 75, 330–337. [CrossRef] 23. Hakim, I.R.; Kammoun, N.G.; Makhloufi, E.; Rebaï, A. Discovery and potential of SNP markers in characterization of Tunisian olive germplasm. Diversity 2009, 2, 2679–2685. [CrossRef] 24. Hayashi, K.; Hashimoto, N.; Daigen, M.; Ashikawa, I. Development of PCR-based SNP markers for rice blast resistance genes at the Piz locus. Theor. Appl. Genet. 2004, 108, 1212–1220. [CrossRef] 25. Choi, I.Y.; Hyten, D.L.; Matukumalli, L.K.; Song, Q.; Chaky, J.M.; Quigley, C.V.; Chase, K.; Lark, K.G.; Reiter, R.S.; Yoon, M.S. A soybean transcript map: Gene distribution, haplotype and single-nucleotide polymorphism analysis. Genet 2007, 176, 685. [CrossRef] 26. Xia, M.; Han, X.; He, H.; Yu, R.; Zhen, G.; Jia, X.; Cheng, B.; Deng, X.W. Improved de novo genome assembly and analysis of the Chinese cucurbit Siraitia grosvenorii, also known as monk fruit or luo-han-guo. Gigascience 2018, 7, 1–9. [CrossRef] 27. Luo, Z.; Shi, H.; Zhang, K.; Qin, X.; Guo, Y.; Ma, X. Liquid chromatography with tandem mass spectrometry method for the simultaneous determination of multiple sweet mogrosides in the fruits of Siraitia grosvenorii and its marketed sweeteners. J. Sep. Sci. 2016, 39, 4124–4135. [CrossRef] 28. Zhang, H.; Yang, H.; Zhang, M.; Wang, Y.; Wang, J.; Yau, L.; Jiang, Z.; Hu, P. Identification of flavonol and triterpene glycosides in Luo-Han-Guo extract using ultra-high performance liquid chromatography/quadrupole time-of-flight mass spectrometry. J. Food Compos. Anal. 2012, 25, 142–148. [CrossRef] 29. Lu, F.; Li, D.; Fu, C.; Liu, J.; Huang, Y.; Chen, Y.; Wen, Y.; Nohara, T. Studies on chemical fingerprints of Siraitia grosvenorii fruits (Luo Han Guo) by HPLC. J. Nat. Med. 2012, 66, 70–76. [CrossRef] 30. Qiao, J.; Liu, J.; Liao, J.; Luo, Z.; Ma, X.; Ma, G. Identification of Key Amino Acid Residues Determining Product Specificity of 2,3-Oxidosqualene Cyclase in Siraitia grosvenorii. Catalysts 2018, 8, 577. [CrossRef] 31. Abe, I.; Rohmer, M.; Prestwich, G.D. Enzymatic Cyclization of Squalene and Oxidosqualene to Sterols and Triterpenes. Chem. Rev. 1993, 93, 2189–2206. [CrossRef] 32. Suzuki, M.; Xiang, T.; Ohyama, K.; Seki, H.; Saito, K.; Muranaka, T.; Hayashi, H.; Katsube, Y.; Kushiro, T.; Shibuya, M. in dicotyledonous plants. Plant Cell Physiol. 2006, 47, 565–571. [CrossRef] 33. Qiao, J.; Luo, Z.; Li, Y.; Ren, G.; Liu, C.; Ma, X. Effect of Abscisic Acid on Accumulation of Five Active Components in Root of Glycyrrhiza uralensis. Molecules 2017, 22, 1982. [CrossRef] 34. Lodeiro, S.; Schulz-Gasch, T.; Matsuda, S.P. Enzyme redesign: Two mutations cooperate to convert cycloartenol synthase into an accurate lanosterol synthase. J. Am. Chem. Soc. 2005, 127, 14132–14133. [CrossRef] Molecules 2019, 24, 627 21 of 21

35. Hoshino, T. β-Amyrin biosynthesis: Catalytic mechanism and substrate recognition. Org. Biomol. Chem. 2017, 15, 1–55. [CrossRef] 36. Qiao, J.; Luo, Z.; Cui, S.; Zhao, H.; Tang, Q.; Mo, C.; Ma, X.; Ding, Z. Modification of isoprene synthesis to enable production of curcurbitadienol synthesis in Saccharomyces cerevisiae. J. Ind. Microbiol. Biotechnol. 2018. [CrossRef] 37. Shibuya, M.; Adachi, S.; Ebizuka, Y. Cucurbitadienol synthase, the first committed enzyme for cucurbitacin biosynthesis, is a distinct enzyme from cycloartenol synthase for phytosterol biosynthesis. Tetrahedron 2004, 60, 6995–7003. [CrossRef] 38. Davidovich-Rikanati, R.; Shalev, L.; Baranes, N.; Meir, A.; Itkin, M.; Cohen, S.; Zimbler, K.; Portnoy, V.; Ebizuka, Y.; Shibuya, M. Recombinant yeast as a functional tool for understanding bitterness and cucurbitacin biosynthesis in watermelon (Citrullus spp.). Yeast 2015, 32, 103–114. 39. Shang, Y.; Ma, Y.; Zhou, Y.; Zhang, H.; Duan, L.; Chen, H.; Zeng, J.; Zhou, Q.; Wang, S.; Gu, W. Biosynthesis, regulation, and domestication of bitterness in cucumber. Science 2014, 346, 1084–1088. [CrossRef] 40. Kusano, M.; Abe, I.; Sankawa, U.; Ebizuka, Y. Purification and some properties of squalene-2,3-epoxide: Lanosterol cyclase from rat liver. Chem. Pharm. Bull. 1991, 39, 239. [CrossRef] 41. Abe, I.; Ebizuka, Y.; Seo, S.; Sankawa, U. Purification of squalene-2,3-epoxide cyclases from cell suspension cultures of Rabdosia japonica Hara. FEBS Lett. 1989, 249, 100–104. [CrossRef] 42. Zerikly, M.; Challis, G.L. Strategies for the discovery of new natural products by genome mining. ChemBioChem 2009, 10, 625–633. [CrossRef] 43. Boutanaev, A.M.; Moses, T.; Zi, J.; Nelson, D.R.; Mugford, S.T.; Peters, R.J.; Osbourn, A. Investigation of terpene diversification across multiple sequenced plant genomes. Proc. Natl. Acad. Sci. USA 2015, 112, E81–E88. [CrossRef][PubMed] 44. Hutmacher, R.B.; Ulloa, M.; Wright, S.D.; Campbell, B.T.; Percy, R.; Wallace, T.; Myers, G.; Bourland, F.; Weaver, D.; Peng, C. Elite Upland Cotton Germplasm-Pool Assessment of Fusarium Wilt Resistance in California. Agron. J. 2013, 105, 1635. [CrossRef] 45. Ma, X.J.; Mo, C.M. Prospects of molecular breeding in medical plants. China J. Chin. Mater. Med. 2017, 42, 2021–2031. 46. Qi, T.; Ma, X.; Mo, C.M.; Wilson, I.; Song, C.; Zhao, H.; Yang, Y.F.; Fu, W.; Qiu, D.Y. An efficient approach to finding Siraitia grosvenorii triterpene biosynthetic genes by RNA-seq and digital gene expression analysis. BMC Genom. 2011, 12, 343. 47. Vickers, C.E.; Bydder, S.F.; Zhou, Y.; Nielsen, L.K. Dual gene expression cassette vectors with antibiotic selection markers for engineering in Saccharomyces cerevisiae. Microb. Cell Fact. 2013, 12, 96. [CrossRef] [PubMed]

Sample Availability: Samples of the compounds morgrosides and squalene are available from the authors.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).