catalysts

Article Identification of Key Amino Acid Residues Determining Product Specificity of 2,3-Oxidosqualene Cyclase in Siraitia grosvenorii

Jing Qiao, Jiushi Liu, Jingjing Liao, Zuliang Luo, Xiaojun Ma * and Guoxu Ma *

Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100193, China; [email protected] (J.Q.); [email protected] (J.L.); [email protected] (J.L.); [email protected] (Z.L.) * Correspondence: [email protected] (X.M.); mgxfl[email protected] (G.M.); Tel.: +86-10-5783-3155 (X.M. & G.M.)  Received: 25 October 2018; Accepted: 19 November 2018; Published: 22 November 2018 

Abstract: Sterols and triterpenes are structurally diverse bioactive molecules generated through cyclization of linear 2,3-oxidosqualene. Based on carbocationic intermediates generated during the initial substrate preorganization step, oxidosqualene cyclases (OSCs) are roughly segregated into a dammarenyl cation group that predominantly catalyzes triterpenoid precursor products and a protosteryl cation group which mostly generates sterol precursor products. The mechanism of conversion between two scaffolds is not well understood. Previously, we have characterized a promiscuous OSC from Siraitia grosvenorii (SgCS) that synthesizes a novel cucurbitane-type triterpene cucurbitadienol as its main product. By integration of homology modeling, molecular docking and site-directed mutagenesis, we discover that five key amino acid residues (Asp486, Cys487, Cys565, Tyr535, and His260) may be responsible for interconversions between chair–boat–chair and chair–chair–chair conformations. The discovery of euphol, dihydrolanosterol, dihydroxyeuphol and tirucallenol unlocks a new path to triterpene diversity in nature. Our findings also reveal mechanistic insights into the cyclization of oxidosqualene into cucurbitane-type and lanostane-type skeletons, and provide a new strategy to identify key residues determining OSC specificity.

Keywords: chemical diversity; oxidosqualene cyclase; homology modeling; site-directed mutagenesis; cucurbitadienol

1. Introduction Sterols and triterpenes have been shown to exert a wide spectrum of important biological activities [1,2]. They are structurally diverse molecules, over 20,000 steroids and triterpenoids with 200 different skeletons have been found in eukaryotes [3]. These compounds display promising pharmaceutical activity and remarkable stability, as a result, they have attracted much attention from researchers [4,5]. However, naturally producing these ingredients is limited by the difficulties in plant cultivation and artificial extraction; therefore, further exploration of their synthesis and pharmaceutical applications were encouraged. Nevertheless, the challenges of the chemical synthesis of sterols and triterpenes in high yields are still enormous, mainly because of their dramatic structural micro-heterogeneity [6,7]. The important natural products catalyzed by metabolic from the biosynthetic pathway appears to be a promising alternative, which has attracted growing interest and made progress in the production of triterpenoids and sterols. Metabolic enzymes in the biosynthetic pathway appear to be a promising alternative candidate to catalyze the formation of important natural products,

Catalysts 2018, 8, 577; doi:10.3390/catal8120577 www.mdpi.com/journal/catalysts Catalysts 2018, 8, 577 2 of 28 for this reason, they attracted growing interest and made tremendous strides on the productions of the triterpenoids and sterols. In the process of biosynthesis of all triterpenoids and sterols, the first step is to cyclize a 30-carbon precursor, 2,3-oxidosqualene, that arises from the isoprenoid pathway and is catalyzed by oxidosqualene cyclases (OSCs) [8]. It triggers a complex chain reaction to produce tricyclic, tetracyclic, or pentacyclic molecules. At this point, the divergence of the sterol and triterpenoid biosynthetic pathways depends on different types of OSC. Cyclization of 2,3-oxidosqualene in the chair–boat–chair (CBC) conformation leads to a protosteryl cation intermediate, the precursor of the sterols via the formation of cycloartenol, lanosterol, and parkeol [9,10]. On the contrary, 2,3-oxidosqualene in the chair–chair–chair (CCC) conformation is cyclized into a dammarenyl carbocation intermediate, which subsequently gives rise to diverse triterpenoid skeletons after further rearrangements, see Figure A1. Since then, intensive studies have revealed the function of the OSC family but its functional diversity is still unclear and further research is needed. Studies on the cyclization mechanisms go back to the 1950s [11]. Much evidence has shown that the functional diversity of OSCs have been regulated by several key amino acid residues. Up to now, four sterol synthases and two triterpene synthases for the identification and function of the active sites in OSCs have been performed. They are from Homo sapiens (HsLAS) and Saccharomyces cerevisiae (SceERG7) [12–15], squalene-hopene cyclase from Alicyclobacillus acidocaldarius (AacSHC) [16], cycloartenol synthase from Arabidopsis thaliana (AthCAS) [17–20], β-amyrin synthase from Avena strigose (AsbAS) [21], and Euphorbia tirucalli (EtAS) [22]. Mutagenesis of key amino acid residues may lead to loss of functions and a change of product types or profiles. Although many mutagenesis studies have focused on the species of S. cerevisiae and A. thaliana, progress in the study of OSC in medicinal plants has lagged in comparison. Moreover, there has been very little research reported on the key residues which are essential in functional conversion between sterol and triterpenes synthases. Therefore, it is of great significance to search the key amino acids of OSCs, so as to expand their range of products to create more novel structurally diverse products. Siraitia grosvenorii, a member of Cucurbitaceae family, is a traditional herb mainly planted in the Guangxi Zhuang Autonomous Region in China. Its genome includes two members of the OSC gene family, cucurbitadienol synthase (CS) cyclizes oxidosqualene to form the cucurbitadienol triterpenoid skeleton, which is a distinct step in phytosterol biosynthesis [23,24]. So far, CS has been functionally characterized from Cucurbita pepo [25], Citrullus colocynthis [26], and Cucumis sativus [27]. A previous study suggests that four indirectly interacting residues of CS in C. sativus may act collaboratively in the choice of substrate folding type and five-member-ring formation reaction for plant OSCs. Here, using a combined approach that relied on homology modeling, residue coevolution analysis, as well as site-directed mutagenesis, the key amino acid residues and reaction mechanism underlying triterpene product specificity were identified, providing catalytic insights into the functional conversion between sterols and triterpenes.

2. Results

2.1. Catalytic Mechanism Prediction of Cucurbitadienol Biosynthesis in S. grosvenorii Figure A2 shows the homology model of SgCS using the human LaS crystal structure [28] as a template, where the presumed active sites responsible for the construction of the cucurbitadienol scaffold are highlighted. During optimization of the model, no major structural changes were observed for residues, and refinement consisted only of fine-tuning the coordinates of the active site sidechain atoms. The active sites of all residues are the same for the SgCS model and LAS, Asp486, Cys487, and Cys565 in SgCS coincide with their counterparts in LAS Asp455, Cys456, and Cys533, respectively. Molecular docking and energy calculation of quantum chemistry are useful for explaining the experimental results in theory. In order to verify the hypothesis that cucurbitadienol is catalyzed via the C4 intermediate. As shown in Figure A3, the geometry optimization and the single-point Catalysts 2018, 8, 577 3 of 28

Catalysts 2018, 8, x FOR PEER REVIEW 3 of 28 energy calculations were performed on 2,3-oxidosqualene and C-4 cation substrates, respectively. Thecalculations AutoDock were (V4.2) performed was conducted on 2,3 to-oxidosqualene dock both 2,3-oxidosqualene and C-4 cation substrates and C-4, cation respectively. into the The active siteAutoDock pocket of (V4.2) SgCS, was and conducted the data into Figuredock both1 indicated 2,3-oxidosqualene that both and 2,3-oxidosqualene C-4 cation into the and active C-4 cationsite pocket of SgCS, and the data in Figure 1 indicated that both 2,3-oxidosqualene and C-4 cation could could combine with the active site excellently. The computed binding free energy of the C-4 cation combine with the active site excellently. The computed binding free energy of the C-4 cation (−12.46 (−12.46 kcal/mol) is significantly larger in magnitude than the binding free energy of 2,3-oxidosqualene kcal/mol) is significantly larger in magnitude than the binding free energy of 2,3-oxidosqualene (−10.86 kcal/mol). A proper position and orientation of the conserved side-chains of His260, Trp413, (−10.86 kcal/mol). A proper position and orientation of the conserved side-chains of His260, Trp413, Phe475,Phe475, Trp613, Trp613 and, and Phe729 Phe729 was was required required withwith the purpose of of stabilization stabilization of of intermediate intermediate tertiary tertiary cationscations at at C-4. C-4. Active Active site site residues residues Tyr535Tyr535 possiblypossibly participates in in CH CH--ππ complexationcomplexation with with His260 His260 becausebecause of of the the closer close distancer distance (3.3(3.3 Å)Å ) ofof thethe C-4C-4 cation compared compared to to the the 2,3 2,3-oxidosqualene-oxidosqualene (4.1 (4.1 Å ). Å). TheseThese results results clearly clearly indicated indicated that the that Tyr535 the Tyr535 and His260 and together His260 could together trigger could the trigger 4β-demethylation the 4β- viad oxidativeemethylation decarboxylation. via oxidative decarboxylation. InIn this this study, study, the the homology homology modeling modeling andand molecular docking docking indicated indicated that that Asp486, Asp486, Cys487, Cys487, Cys565,Cys565, Tyr535, Tyr535 and, and His260 His260 in SgCS in maySgCS be may regarded be regard as aned active-site as an active residue-site for residue the 2,3-oxidosqualene for the 2,3- cyclization.oxidosqualene These cyclization. residues wereThese subjected residues were to site-directed subjected to mutagenesis site-directed experimentsmutagenesis experiments to identify the functionsto identify of the the residues.functions of the residues.

FigureFigure 1. 1.The The twotwo primary energetically energetically favored favored binding binding clusters clusters of the of 2, the3-oxidosqualene 2,3-oxidosqualene and C-4 and C-4cation cation in inthe the active active sites sites of SgCS ofSgCS from docking from docking studies. studies. (a) 2D docked (a) 2D view docked of ligand view SgCS of ligand with 2,3 SgCS- withoxidosqualene 2,3-oxidosqualene receptor; receptor; (b) 2D docked (b) 2D view docked of ligandview of SgCS ligand with SgCS C-4 cationwith C-4 receptor; cation (c receptor;) 2,3- (c)oxidosqualene 2,3-oxidosqualene and protein and protein interaction interaction hydrogen hydrogen bond interaction bond interaction view; (d) view;C-4 cation (d) C-4and cationprotein and proteininteraction interaction hydrogen hydrogen bond interaction bond interaction view. 3D view. representation 3D representation of SgCS is of shown SgCS isin showncartoonin (gray), cartoon (gray),the sticks the sticks with cyan with carbon cyan carbon atoms atomsrepresent represent amino acid amino residues acid residuesand the sticks and thewith sticks yellow with carbon yellow atoms represent 2,3-oxidosqualene and C-4 cation compounds. The red dotted lines denote hydrogen carbon atoms represent 2,3-oxidosqualene and C-4 cation compounds. The red dotted lines denote bonds. hydrogen bonds. 2.2.2.2. Bioinformatic Bioinformatic Analysis Analysis of of SgCS SgCS TheThe Table Table A1 A has1 has listed listed the the deduced deduced proteinprotein parameters that that are are predicted predicted by by the the online online Pxpasy Pxpasy’s’s ProtParamProtParam tool, tool, the the SgCS SgCS was was divided divided into into a stable protein protein and and no no signal signal peptides. peptides. Our Our results results suggestedsuggested that that SgCS SgCS was was mainly mainly found found inin thethe nucleusnucleus or or cytoplasm. cytoplasm. The The putative putative protein protein sequences sequences of SgCSof SgCS had had negative negative Grand Grand average average ofof hydropathicityhydropathicity ( (GRAVY)GRAVY) values values which which were were the the indication indication of hydrophilic proteins. Moreover, two transmembrane domains were found in SgCS, including 117– of hydrophilic proteins. Moreover, two transmembrane domains were found in SgCS, including 194 AAs and 601–641 AA. 117–194 AAs and 601–641 AA. The CS cloned from different species was analyzed using multiple sequence comparison The CS cloned from different species was analyzed using multiple sequence comparison analyses, analyses, and the results indicated the presence of the DCTAE motif, which included the catalytic andaspartic the results acid indicatedresidue at thethe presence486 site and of thewas DCTAEproposed motif, to be whichthe initiation included of the the cyclization catalytic aspartic reaction acid, residuesee Figure at the 2. 486 [8] site. SgCS and also was contains proposed three to berepeats the initiation of QW motifs of the (conserved cyclization motifs reaction, rich see in aromatic Figure2[ 8]. SgCSamino also acids, contains starting three with repeats Q-Gln of QWand motifsending (conserved with W-Trp) motifs which rich have in aromaticstructure amino-stabilizing acids, effects starting withand Q-Gln may andbe useful ending in withthe stabilization W-Trp) which of havethe carbocation structure-stabilizing intermediates effects during and cyclization may be useful [29]. inIn the stabilization of the carbocation intermediates during cyclization [29]. In addition, SgCS has a 260-site

Catalysts 2018, 8, 577 4 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 4 of 28 histidineaddition, residue SgCS has in thea 260 MWCHCR-site histidine motif, residue that in has thebeen MWCHCR proposed motif, to that play has an been important proposed role to in pla they processan important of stabilization role in the of process the protosteryl of stabilization cationintermediate of the protosteryl [30]. cation intermediate [30].

Figure 2. Multiple sequence alignment of the putative SgCS protein and three other cucurbitadienol synthases,Figure 2. GenBankMultiple sequence accession alignment numbers: of CcCSthe putative (KM821405), SgCS protein CpCS and (AB116238), three other CsCS cucurbitadienol (AIT72030), SgCSsynthases, (HQ128567). GenBank The accession highly conserved numbers: DCTAE CcCS (KM821405), motif marks CpCS in red. (AB116238), A terpene synthase CsCS (AIT72030), feature of QXXXXXWSgCS (HQ128567). motif is singleThe highly underlined. conserved MWCHCR DCTAE motifmotif ismarks shown in inred. the A double terpene underline. synthase feature Important of catalyticQXXXXXW residues motif are is single marked underlined. with stars MWCHCR (Asp486, motif Cys487, is shown Cys565, in the Tyr535, double and underline. His260) Important which are necessarycatalytic forresidu 2,3-oxidosqualenees are marked with cyclization stars (Asp486, into cucurbitadienol. Cys487, Cys565, Tyr535, and His260) which are necessary for 2,3-oxidosqualene cyclization into cucurbitadienol. 2.3. DCTAE Motif for the Initiation of the Polycyclization Cascade 2.3.Corey DCTAE [30 Motif] proposed for the Initiation that the D456of the Polycyclization residue of the Cascade DCTAE motif of S. cerevisiae LAS initiates the ring-formingCorey [30] reaction. proposed Asp486 that inthe SgCS D456 coincidesresidue of with the DC itsTAE counterparts motif of S. in cerevisiaeS. cerevisiae LAS, LASinitiates Asp456. the Atring first,-forming we constructed reaction. three Asp486 variants in SgCS (D486N, coincide D486E,s with D486A), its counterparts to verify the in functionS. cerevisiae of this, LAS amino Asp456. acid residueAt first of, we SgCS. constructed As shown three in Figure variants3a, (D486N, all of the D486E, mutants D486A), completely to verify lost the their function function of tothis produce amino cucurbitadienolacid residue of because SgCS. As of ashown single-point in Figure mutation 3a, all at of the the Asp486 mutants position. completely This lost demonstrates their function that theto acidicproduce residue cucurbitadienol of the Asp is regarded because asof a a proton single donor-point to mutation initiate the at cucurbitadienolthe Asp486 position. ring-forming This reaction.demonstrates Interestingly, that the the acidic D486A residue and D486N of the Aspmutants is regard showeded as multiple a proton profiles donor of to products initiate with the a molecularcucurbitadienol mass of ring (m/-formingz) 426, whereas reaction. D486E Interestingly, produced a the minor D486A amount and of D486Ncompound mutants1 (unidentified) showed asmultiple the sole product,profiles of see products Figure3 withc. Compound a molecular2 wasmass distinguished of (m/z) 426, whereas from an D486E authentic produced cucurbitadienol a minor standard.amount Theof compound retention times1 (unidentified) of compounds as the3, 4, sole5, and product7 were, see 19.86, Figure 20.71, 3c. 22.52, Compound and 25.52 2 was min relativedistinguished to cucurbitadienol, from an authentic and cucurbitadienol were identified standard. as euphol, The dihydrolanosterol, retention times of compounds dihydroxyeuphol, 3, 4, 5, andand tirucallenol, 7 were 19.86, respectively, 20.71, 22.52 the, dataand 25.52is shown min in relative Figures to A4 cucurbitadienol,–A7. These additional and were four identified compounds as caneuphol, be assigned dihydrolanosterol, to different C–C dihydroxyeuphol bond-forming, and conformations. tirucallenol, Compoundrespectively, 4 thewas data formed is shown through in Figures A4–A7. These additional four compounds can be assigned to different C–C bond-forming a chair–boat–chair–chair envelope (C–B–C–C) conformation featuring a trans (CH3-14 and H-13) conformations. Compound 4 was formed through a chair–boat–chair–chair envelope (C–B–C–C) orientation at the C–D ring fused carbons to give a 6/6/6/5-fused tetracyclic protosteryl cation. conformation featuring a trans (CH3-14 and H-13) orientation at the C–D ring fused carbons to give The subsequent reactions of hydride shifts from H-9 to C-8, H-13 to C-17, and H-17 to C-20 together a 6/6/6/5-fused tetracyclic protosteryl cation. The subsequent reactions of hydride shifts from H-9 to with the methyl shifts from CH3-8 to C-14 and CH3-14 to C-13 give the final skeleton of compound 4. C-8, H-13 to C-17, and H-17 to C-20 together with the methyl shifts from CH3-8 to C-14 and CH3-14 The other compounds, 3 and 7, initiate the chair–boat–chair (C–C–C) conformation of A/B/C rings to C-13 give the final skeleton of compound 4. The other compounds, 3 and 7, initiate the chair–boat– as well as the chair and boat of D ring, respectively, to give the damarrenyl cation path. Further chair (C–C–C) conformation of A/B/C rings as well as the chair and boat of D ring, respectively, to hydride and methyl shifts formed compounds 3 and 7, while compound 5 experienced an additional give the damarrenyl cation path. Further hydride and methyl shifts formed compounds 3 and 7, while hydrogenation reaction, see Figure4. compound 5 experienced an additional hydrogenation reaction, see Figure 4.

Catalysts 2018, 8, 577 5 of 28

Catalysts 2018, 8, x FOR PEER REVIEW 5 of 28 Based upon the human LAS structure analysis by X-ray crystallography, Thoma [28] proposed that the acidityBased upon of D455 the human of the DCTAELAS structure sequence analysis was increasedby X-ray crystallography by the hydrogen, Thoma bond [28] formations proposed with that the acidity of D455 of the DCTAE sequence was increased by the hydrogen bond formations with Cys456 and Cys533, versus their counterparts in SgCS with cysteine at position 487 (C487) and cysteine Cys456 and Cys533, versus their counterparts in SgCS with cysteine at position 487 (C487) and at position 565 (C565), respectively. In order to explore whether additional residues C487 and C485 are cysteine at position 565 (C565), respectively. In order to explore whether additional residues C487 included,and C485 we areconstructed included, six we variants constructed (C487M, six variants C487R, (C487M, C487A, C565M,C487R, C487A, C565R C565M, and C565A), C565R to and verify theC565A), functions to ofverify these the two functions amino of acid these residues two amino of SgCS.acid residues Results of showed SgCS. Result thats the showed hydrogen-bond that the withhydrogen carboxyl-bond residue with Asp486 carboxyl of residue cys487 Asp486 is stronger of cys487 than is that strong wither than cys565 that because with cys565 the because activity the of the C487Aactivity variant of the is aboutC487A 20% variant of the is wildabout type, 20% whereasof the wild the type, C565A whereas is about the 35%, C565A as shownis about in 35% Figure, as 3a. Theshown product in Figure profiles 3a. of The 487 product and 565 profiles site mutants of 487 and are 565 summarized site mutants inare Figure summarized3b. The in Met Figure and 3b. Ala mutantsThe Met produced and Ala compounds mutants produced3–7 as part compound of the products 3–7 as profile part of in the addition product to profile cucurbitadienol. in addition On to the contrary,cucurbitadienol. the Arg mutants On the onlycontrary produced, the Arg a mutants small number only produced of compound a small1 number. of compound 1.

FigureFigure 3. Product 3. Product promiscuity promiscuity of of sgCS sgCS mutants mutants forfor cyclization.cyclization. ( (aa) )Cucurbitadienol Cucurbitadienol content content in SgCS in SgCS mutants;mutants; (b) ( productb) product profile profile of of SgCS SgCS mutants; mutants; ((cc)) extractedextracted ion ion chromatogram chromatogram of ofSgCS SgCS and and mutants mutants activity.activity. The The control control panel panel (pCEV-G (pCEV-4G-K4-Kmmempty empty vector) shows shows the the yeast yeast BY4742 BY4742 accumulate accumulate lanosterol lanosterol (RT = 15.51 min) in the existence of lanosterol synthase activity. In the presence of SgCS, cucurbitadienol (compound 2) is accumulated. Catalysts 2018, 8, x FOR PEER REVIEW 6 of 28

Catalysts(RT 2018 = 1,5.518, 577 min) in the existence of lanosterol synthase activity. In the presence of SgCS6 of, 28 cucurbitadienol (compound 2) is accumulated.

O OH + H H H H H hydride and methyl shifts hydrogenation H + HO HO HO

C-C-C-C 3 5

O + H H H H

C-B-C-C hydride and methyl shifts hydrogenation H HO HO HO O 4

C-C-C-B

O OH + H H

hydride and methyl shifts H HO + HO H 7

FigureFigure 4. 4.OxidosqualeneOxidosqualene cyclases cyclases convert convert oxidosqualene toto diverse diverse cyclic cyclic structures. structures.

2.4. Functions of the Tyr535 and His260 Residue 2.4. Functions of the Tyr535 and His260 Residue A homology model suggested that the Tyr535 should synergize with the His260 to biosynthesize A homology model suggested that the Tyr535 should synergize with the His260 to cucurbitadienol more precisely. Residues His260 and Tyr535 in SgCS correspond to His234 and biosynthesize cucurbitadienol more precisely. Residues His260 and Tyr535 in SgCS correspond Tyr510 in Erg7, respectively. Wu [12,13,31] proposed that Tyr510 points to the monocyclic C-10 cation (lanosterolto His234 numbering) and Tyr510 and in thatErg7, the respectively. His234:Tyr510 Wu catalytic [12,13,31 base] proposed dyad forms that theTyr510 organized points local to the structuremonocyclic of Erg7. C-10 A cation mutant (lanosterol of Y535L numbering) of SgCS did and not that afford the any His234:Tyr510 aberrant cyclization catalytic products, base dyad andforms only cucurbitadienol the organized local was produced structure in of a Erg7. very smallA mutant amount. of By Y535L contrast, of SgCS the Y535W did not and afford Y535A any variantsaberrant of SgCScyclization yielded products,3 to 7 in and addition only cucurbitadienol to cucurbitadienol. was Furthermore, produced in thea very Trp small residue amount. has theBy largest contrast, steric the volume Y535W amongand Y535A all the variants proteinogenic of SgCS yielded amino acids, 3 to 7 andin addition thus, the to cucurbitadienol. production of sterolFurthermore compounds, the3 toTrp7 inresidue Y535W has is significantlythe largest steric higher volume than Y535A. among Similar all the to proteinogenic compound 3 aminoto 7, theacids, content and of lanosterolthus, the production (RT = 15.51 min) of sterol in Y535W compounds was obviously 3 to 7 in increased Y535W compared is significantly to that higher of Y535A, than seeY535A. Figure 5Similar. These to results compound demonstrated 3 to 7, the that content substitution of lanosterol of Tyr535 (R withT = 1 other5.51 min) amino in acids Y535W with was smallerobviously or larger increased steric bulks compared alters theto that organized of Y535A local, see structure, Figure thus5. These influences results the dem polycyclizationonstrated that pathway,substitution resulting of in Tyr535 the formation with other of the amino products acids3 to with7. smaller or larger steric bulks alters the organized local structure, thus influences the polycyclization pathway, resulting in the formation of the products 3 to 7.

Catalysts 2018, 8, 577 7 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 7 of 28

Figure 5. Extracted ion chromatogram of the 535 site and Y535L/C565R double mutant activity. Figure 5. Extracted ion chromatogram of the 535 site and Y535L/C565R double mutant activity. Next, position 260-directed mutagenesis was used to design more promiscuous OSCs properly, Next, position 260-directed mutagenesis was used to design more promiscuous OSCs properly, the product profiles of mutants are shown in Table1. All of the mutants produced compound 5 the product profiles of mutants are shown in Table 1. All of the mutants produced compound 5 and and 7 as part of the product profile. Interestingly, substitution with acidic amino acid (Asp, Glu), 7 as part of the product profile. Interestingly, substitution with acidic amino acid (Asp, Glu), aromatic aromatic amino acids (Phe, Trp, Tyr), and hydroxyl amino acid (Ser, Thr) at this position abolish amino acids (Phe, Trp, Tyr), and hydroxyl amino acid (Ser, Thr) at this position abolish cucurbitadienol biosynthesis, consistent with a specific role for His260 in C4-C8 bond formation. cucurbitadienol biosynthesis, consistent with a specific role for His260 in C4-C8 bond formation. The The H260X mutants substituted with hydroxyl amino acids (H260S and H260T) afforded the same H260X mutants substituted with hydroxyl amino acids (H260S and H260T) afforded the same product profile including compound 1, 5, and 7. The H260A, H260R, and H260Y mutants produced product profile including compound 1, 5, and 7. The H260A, H260R, and H260Y mutants produced a minor amount of compound 4, in addition to other products. However, none of the mutants showed a minor amount of compound 4, in addition to other products. However, none of the mutants showed a specific product profile. Our data reveal that an aromatic amino acid residue at position 260 is a specific product profile. Our data reveal that an aromatic amino acid residue at position 260 is important for promiscuity. important for promiscuity. Table 1. Product profile of S. cerevisiae BY4742 expressing the SgCSH260X site-saturated mutants. Table 1. Product profile of S. cerevisiae BY4742 expressing the SgCSH260X site-saturated mutants. Product Profile Product Profile AminoAmino Acids Acids Substitution Substitution 1 21 2 33 44 55 6 6 7 7 Ala (A)Ala (A) 18 18 19 19 55 3737 5 5 16 16 Cys (C)Cys (C) 43 43 2424 32 32 Asp (D) 20 59 21 Glu (E)Asp (D) 4320 3459 21 23 Phe (F)Glu (E) 43 5334 23 47 Gly (G)Phe (F) 10 18 5253 47 20 Ile (I)Gly (G) 24 10 36 18 2352 20 17 Lys (K) 58 9 33 Leu (L)Ile (I) 1124 32 36 3123 417 21

Met (M)Lys (K) 43 29 1758 9 33 11 Asn (N)Leu (L) 48 11 9 32 3231 4 2 218 Pro (P)Met (M) 15 43 21 29 2217 18 11 24 Gln (Q) 71 3 17 9 Asn (N) 48 9 32 2 8 Arg (R) 20 6 49 4 22 Ser (S)Pro (P) 815 21 5622 18 24 36 Thr (T)Gln (Q) 10 71 3 5517 935 Val (V)Arg (R) 25 20 6 4049 134 22 22 Trp (W)Ser (S) 8 4656 24 36 30 Tyr (Y) 20 52 28 Thr (T) 10 55 35 Val (V) 25 40 13 22 Trp (W) 46 24 30

Catalysts 2018, 8, x FOR PEER REVIEW 8 of 28

Tyr (Y) 20 52 28

2.5. Double Mutants Determining Product Specificity Various possible SgCS mutants were created via replacing amino acid residues at each of these five sitesCatalysts in combination2018, 8, 577 and expressing them in BY4742. The D486E/C487R double mutants8 of 28 obtained an ability to specifically produce compound 1, while the production was not increased in comparison with single2.5. Double mutant Mutants D486E Determining or C487R. Product The Specificity D486A/C487A double mutants also didn’t change the product profile compared to the D486A or C487A single mutant. Interestingly, compound 6 synthesis Various possible SgCS mutants were created via replacing amino acid residues at each of these was significantly increased in the double mutant Y535L/C565R, producing at least up to 15 times five sites in combination and expressing them in BY4742. The D486E/C487R double mutants obtained more productan ability than to specifically that of the produce single compound mutant,1 see, while Figure the production 5. These was results not increased showed in that comparison residues may develop withsynergism single mutant in the D486E choice or C487R. of an Theextraordinary D486A/C487A substrate double mutants folding also intermediate didn’t change the for product OSCs. profile compared to the D486A or C487A single mutant. Interestingly, compound 6 synthesis was 2.6. Reactionsignificantly Mechanism increased of SgCS in the double mutant Y535L/C565R, producing at least up to 15 times more product than that of the single mutant, see Figure5. These results showed that residues may develop A onesynergism-step reaction in the choice to form of an the extraordinary tetracyclic substrate cucurbitadienol folding intermediate molecule for liner OSCs. squalene is considered to be a very complex biochemical reaction in nature. Based on the results of our investigation, we 2.6. Reaction Mechanism of SgCS proposed a hypothetical mechanism of the SgCS-catalyzed 2,3-oxidosqualene cyclization reaction of cucurbitadienolA one-step, shown reaction in Figure to form 6. Firstly, the tetracyclic the cyclization cucurbitadienol reaction molecule is initiated liner squalene by Asp486 is considered protonating to be a very complex biochemical reaction in nature. Based on the results of our investigation, the epoxide group of prefolded 2,3-oxidosqualene. Cys487 and Cys565 serve as hydrogen-bonding we proposed a hypothetical mechanism of the SgCS-catalyzed 2,3-oxidosqualene cyclization reaction of partners cucurbitadienol, with Asp486; shown therefore in Figure, there6. Firstly, is an the increased cyclization reaction acidity is of initiated Asp486 by Asp486 in SgCS. protonating Then, the OSC cyclizationthe cascade epoxide group stops of with prefolded the tertiary 2,3-oxidosqualene. protosterol Cys487 cation and at Cys565 C-20 when serve as the hydrogen-bonding formation of the five- memberedpartners D-ring with is Asp486; accomplished. therefore, thereSecondly, is an increased the protosterol acidity of Asp486 C-20 cation in SgCS. is Then, transformed the OSC to the cucurbitadienolcyclization C- cascade4/C-8 cation stops withvia a the rearranging tertiary protosterol skeleton cation. Ultimately, at C-20 when the thehighly formation conserved of the His 260 five-membered D-ring is accomplished. Secondly, the protosterol C-20 cation is transformed to the is the basic residue that is close enough to accept the proton in the specific deprotonation of the C- cucurbitadienol C-4/C-8 cation via a rearranging skeleton. Ultimately, the highly conserved His 260 is 4/C-8 cucurbitadienolthe basic residue cation that is closethat enoughterminates to accept catalysis. the proton The in hydroxy the specific group deprotonation of Tyr535 of the that C-4/C-8 is hydrogen- bonded tocucurbitadienol His 260 would cation be that in terminatesan appropriate catalysis. position The hydroxy for groupthe final of Tyr535 deprotonation that is hydrogen-bonded step. to His 260 would be in an appropriate position for the final deprotonation step.

Figure 6.Figure A proposed 6. A proposed tentative tentative mechanism mechanism for for thethe SgCS-catalyzed SgCS-catalyzed cyclization cyclization reaction. reaction.

3. Discussion

Catalysts 2018, 8, 577 9 of 28

3. Discussion We identified five key amino acid positions which regulate the functional switch between cucurbitane-type and lanostane-type triterpenoids synthesis using structural modeling. The use of protein homology modeling combined with molecular docking is a useful method to discover new enzymes, and it reveals a chemical diversity of catalytic mechanisms of enzymes [32]. In the present study, this method enabled us to identify the novel phytosterols, euphol, dihydroxyeuphol, and tirucallenol, and uncover an unknown field regarding OSC cyclization. As far as we know, biosynthesis of these compounds has not been reported before, their structures are quite similar to cycloartenol’s and lanosterol’s, that are the precursors of sterol biosynthesis. The products of SgCS mutants may participant in synthesizing new steroid hormones, and, therefore, affect the growth and development of a plant. A recent research of bacterial OSCs (HsLAS) has found four sites at positions Trp230, His232, Tyr503, and Asn697 that may lead to a change in product from tetracyclic triterpenoid lansterol to pentacyclic products, but keep a protosteryl-type C-B-C conformation [33]. Two residues, at position Phe696 and Ser699, were identified in the studies of Avena strigose and Euphorbia tirucalli b-amyrin synthase (SAD1), which determined the functional interconversion of b-amyrin and lansterol synthases. These enzymes produce pentacyclic and tetracyclic triterpenes from the same dammarane-type C–C–C conformation [21,34]. Most of the tetracyclic products catalyzed via the C–B–C intermediate are precursors for primary metabolites, such as cholesterol and brassinosteroids (BRs) [35]. Although, one exception is cucurbitadienol synthase from cucurbit plants [25,27]. In this study, the five key residue positions in SgCS, Asp486, Cys487, Cys565, Tyr535, and His260, are responsible for the alternation between the C–B–C and C–C–C conformations. This is the first enzyme to report the conformational change from the C–C–C to the C–B–C folding conformation among the OSC mutagenesis studies reported hitherto. Over the course of evolution, in order to adapt to both natural and artificial selections, the plants produce diverse catalogs of compounds via improvements to their system. Research on engineering enzymes has progressed, such as a broadened substrate, reaction, and product specificity. In this study, because of a single-point mutation, the product scope of SgCS for oxisqualene-cyclization was significantly enlarged. It has proven to be dramatically more difficult to redesign enzymes to have a stringent specificity. There are various factors that can be used to improve the product specificity and stereoselectivity. During the cyclization process, specific cation–pi interactions mediated by aromatic residues and cation intermediates are important for OSC’s generation of the specific final product [20,36–38]. It is worthy to note that we found two residues, Cys565 and Tyr535, can act collaboratively in the choice of the specific substrate folding intermediate for OSCs. These results offered a possibility that may produce promiscuous OSC variants to create specific products. Since the 1950s, the response mechanism of OSC has been studied in depth. Many OSCs have been cloned and sequenced since the first DNA sequence of Candida albicans OSC was reported in 1990 [39]. However, isolating OSCs in an active form and characterizing their enzymatic properties in vitro has been very difficult because they are membrane-bound proteins. To date, only the X-ray crystallographic structures of the two triterpene cyclases of SHC and human lanosterol cyclase have been solved and the structure of cucurbitadienol synthase has not yet been elucidated. Recently, a report describing the functional analyses of the active sites in cucurbitadienol synthase have appeared, but at present, studies on the catalytic mechanism of cucurbitadienol synthase remain insufficient, and further research is needed to better understand cucurbitadienol biosynthesis [40]. To acquire detailed information on the polycyclization cascades promoted by triterpene cyclases, collaborations involving multiple research fields, such as molecular biology, structural biology, biochemistry, and organic chemistry, will be required. Catalysts 2018, 8, 577 10 of 28

4. Materials and Methods

4.1. Chemicals and General Methods The host strain S. cerevisiae BY4742 was purchased from Invitrogen (Carlsbad, CA, USA). Chemical reagents were purchased from Sigma Aldrich (St. Louis, MO, USA), J&K Scientific Ltd. (Beijing, China), and Taihe Biotechnology Co. Ltd. (Beijing, China). KOD-Plus DNA Polymerase was purchased from TOYOBO Biotech Co. Ltd. (Shanghai, China). Primers synthesis and DNA sequencing were obtained from Sangon Biotech Company (Beijing, China). Restriction enzymes and DNA were purchased from Takara Biotechnology Co. Ltd. (Dalian, China). The yeast expression plasmids pCEV-G4-Km (Addgene plasmid # 46819) were a gift from Lars Nielsen and Claudia Vickers [41]. Yeast products were detected using an Acquity UPLC™ system (Waters Corp. USA) with an Acquity BEH C18 column (100 mm × 2.1 mm i.d. 1.7 µm), and purified by a Lumiere K-1001 pump, a Lumiere K-2501 single λ absorbance detector and a YMC-Pack ODS-A column (5 µm, 10 × 250 mm, YMC, Kyoto, Japan). ESI-MS wasperformed on a LTQ-Obitrap XL spectrometer (Thermo Fisher Scientific, Boston, MA, USA). The conversion rates of the substrate were calculated from peak areas of products as analyzed by UPLC at their maximum absorption wavelength, respectively. Compounds were characterized by 1H NMR at 600 MHz and 13C APT at 150 MHz on Bruker AV III 600 NMR spectrometers. Chemical shifts (δ) were referenced to internal solvent resonances and were given in parts per million (ppm). Coupling constants (J) were given in hertz (Hz).

4.2. Molecular Modeling The comparison of different amino acid sequences using alignment was carried out by ClustalX (http://www.clustal.org). The sequence alignments were edited and analyzed using ESPript 3.0 (http://espript.ibcp.fr/ESPript/ESPript/). The three-dimensional structure models of SgCS were constructed using Modeller 9.17 (Accelrys, Inc., San Diego, CA, USA). The human lanosterol synthase (PDB accession code 1W6J) was used as the template for modeling. The generated three-dimensional structures were improved by using the loop refined program of Modeller 9.17. The final models were checked by Procheck (http://nihserver. mbi.ucla.edu/511SAVES/) and 90.4% of residues were located in the most favored regions in the Ramachandran plot.

4.3. Molecular Docking AutoDock (V4.2) was used with an empirical free-energy function to evaluate binding free energies and the Lamarckian genetic algorithm (LGA) to search for favorable binding positions [42]. The empirical scoring function which contains hydrogen bonding, electrostatics, conform, torsion, and solvent terms, was trained to calculate the affinity between ligand and protein [43]. It was used to dock compounds into the SgCS protein. The AutoGrid was used to calculate the grid maps which can define the search region and represent the protein during the docking process, the dimensions of grid maps were 50 Å × 50 Å × 40 Å centered on coordinates 26.271, 68.084, and 7.604, with a spacing of 0.375 Å between the grid points. The LGA parameters were accepted as number of GA runs 100, population size 150, maximum number of evals 2,500,000 generations and other parameters were left at the default values.

4.4. Mutagenesis Experiments of SgCS The CDS of SgCS biosynthetic gene (GenBank accession No. HQ128567) was codon-optimized (Box A1) for synthesis and cloned into the BamHI/EcoRI sites of the pCEV-G4-Km-USER [44] yeast expression vector under the control of the TEF1 promoter to construct pCEV-G4-Km-USER-SgCS. The Site-Directed Mutagenesis Kit obtained from Biomed (Beijing, China) was used to detect the mutagenesis, and the corresponding degenerate primers are given in Table A2 with the substitutions underlined. After separation by 1% agarose gel electrophoresis, the resulting PCR products were Catalysts 2018, 8, 577 11 of 28 directly transformed into the Trans1-T1 E. coli. Sanger sequencing was used to determine the sequence of the mutation in the resulting plasmid (pCEV-G4-Km) using the oligonucleotide primers pCEV-Seq, the data is shown in Table A2.

4.5. Yeast Transformation and Cell Cultivation The plasmids were transfected into S. cerevisiae strains BY4742 using the Frozen-EZ yeast transformation II kit obtained from Zymo Research (Orange, CA, USA) and selected for growth on YPD plates with 200 mg/L G418. The empty pCEV-G4-Km vector was also introduced into BY4742 as a control. The recombinant cells were transferred and cultured in 2 mL YPD medium with 200 mg/L G418 and in a humidified incubator at 30 ◦C, and 250 rpm to 600 nm (OD600) of approximately 1.0. Then flasks (250 mL) containing 100 mL medium were then inoculated to an OD600 of 0.05 with the seed cultures. After inoculation, incubate the strains at 30 ◦C, 250 rpm for 3 days. Moreover, Shimadzu UV-2550 spectrophotometer was used to detect the optical densities at OD600.

4.6. UPLC-Q-TOF Analysis of Yeast Extracts Yeast cells were harvest from culture using a centrifuge at 10,000× g for 5 min, then washed with 5 mL of 20% KOH/50% ethanol and extracted three times with the same volume of n-hexane. The recombinant extracts were distilled and dried under reduced pressure and dissolved in 1 mL of acetonitrile. An Acquity UPLCTM system was used to detect chromatography using a BEH rapid resolution RP-C18 column (1.7 µm, 2.1 × 100 mm, Waters). The mobile phase consisted of 0.05% formic acid aqueous solution (v/v, solvent A) and acetonitrile (solvent B) with a gradient elution: 0–50 min, 90–100.0% (B); 50–60 min, 100.0% (B). The flow was 0.3 mL/min and the injection volume was 3 µL.

4.7. Compound Purification from Yeast Yeast harboring 535W SgCS mutant was cultured (50 L) and then the mixture was centrifuged at 3000 rpm for 10 min, the pellet (600 g) was resuspended in 3 L 20% KOH/50% EtOH (1:1), and treated with an alkali lysis method for 1 h at 95 ◦C, and then double extraction of with the same volume of n-hexane. After retrieval of the n-hexane, the residue (2.4 g) was subjected to a column chromatograph (CC) on a silica gel (200–300 mesh, 3 × 10 cm) with a gradient of petroleum ether-CH2Cl2 (6:1–1:2) to yield seven fractions (Fr.1–Fr.7). Fr.3 (2:1, 0.26 g) was purified by using semi-preparative HPLC eluting with CH3CN-H2O (98:2) on a YMC RP C-18 column to give compounds 3 (Rt 65.0 min), 4 (Rt 73.3 min), 5 (Rt 77.7 min), 7 (Rt 102.6 min). The flow rate was kept constant at 6 mL/min and the detection wavelength was set at 203 nm.

4.8. Compound Structure Elucidation by NMR Data Analysis NMR analyses of compounds 3–5, and 7 were performed on a BrukerAvance III 600 analysis system with the solvent of CDCl3, and the structures were elucidated according to optical rotation, ESIMS, and NMR data (1H NMR, 13C APT, and NOESY spectra) and compared with the reported ones.

20 + 1 Product 3: [α]D = +32.4 (c 0.10, CHCl3); ESIMS: m/z 449 [M + Na] ; H NMR (600 MHz, CDCl3) δH: 5.10 (1H, t, J = 7.2 Hz, H-24), 3.22 (1H, dd, J = 12.0, 4.8 Hz, H-3), 0.67 (3H, s, H-18), 0.81 (3H, s, H-29), 0.86 (3H, s, H-30), 0.91 (3H, d, J = 7.2 Hz, H-21), 0.97 (3H, s, H-28), 0.99 (3H, s, H-19), 1.28 (3H, s, H-26), 1 1.66 (3H, s, H-27); H NMR (150 MHz, CDCl3) δC: 35.6 (C-1), 27.9 (C-2), 79.0 (C-3), 38.9 (C-4), 50.1 (C-5), 18.3 (C-6), 28.2 (C-7), 134.3 (C-8), 130.9 (C-9), 37.0 (C-10), 21.0 (C-11), 26.5 (C-12), 44.4 (C-13), 49.8 (C-14), 31.0 (C-15), 30.8 (C-16), 50.4 (C-17), 15.8 (C-18), 19.2 (C-19), 36.2 (C-20), 18.7 (C-21), 36.4 (C-22), 24.9 (C-23), 125.3 (C-24), 131.0 (C-25), 18.6 (C-26), 25.8 (C-27), 24.3 (C-28), 15.4 (C-29), 28.0 (C-30).

20 + 1 Product 4: [α]D = +62.2 (c 0.10, CHCl3); ESIMS: m/z 447 [M + Na] ; H NMR (600 MHz, CDCl3) δH: 3.25 (1H, dd, J = 12.0, 1.8 Hz, H-3), 0.81 (3H, s, H-18), 0.98 (3H, s, H-19), 0.88 (3H, d, J = 7.2 Hz, H-21), Catalysts 2018, 8, 577 12 of 28

0.86 (3H, s, H-28), 0.92 (3H, d, J = 6.0 Hz, H-26), 0.93 (3H, d, J = 6.0 Hz, H-27), 1.01 (3H, s, H-29), 0.70 1 (3H, s, H-30); H NMR (150 MHz, CDCl3) δC: 36.1 (C-1), 27.9 (C-2), 79.0 (C-3), 38.9 (C-4), 50.9 (C-5), 21.0 (C-6), 28.2 (C-7), 134.3 (C-8), 130.9 (C-9), 37.0 (C-10), 18.3 (C-11), 25.5 (C-12), 44.4 (C-13), 49.8 (C-14), 31.0 (C-15), 29.7 (C-16), 50.4 (C-17), 18.8 (C-18), 18.6 (C-19), 36.2 (C-20), 19.1 (C-21), 35.6 (C-22), 31.3 (C-23), 25.6 (C-24), 31.3 (C-25), 24.2 (C-26), 24.8 (C-27), 15.4 (C-28), 28.0 (C-29), 24.3 (C-30).

20 + 1 Product 5: [α]D = +33.7 (c 0.10, CHCl3); ESIMS: m/z 447 [M + Na] ; H NMR (600 MHz, CDCl3) δH: 3.23 (1H, dd, J = 12.0, 1.8 Hz, H-3), 0.68 (3H, s, H-18), 0.81 (3H, s, H-29), 0.86 (3H, s, H-30), 0.90 (3H, d, J = 7.2 Hz, H-21), 0.97 (3H, s, H-28), 1.00 (3H, s, H-19), 0.87 (3H, d, J = 6.0 Hz, H-26), 0.89 (3H, d, J = 6.0 1 Hz, H-27); H NMR (150 MHz, CDCl3) δC: 35.5 (C-1), 27.9 (C-2), 79.0 (C-3), 38.9 (C-4), 50.1 (C-5), 18.4 (C-6), 28.2 (C-7), 134.4 (C-8), 130.9 (C-9), 37.0 (C-10), 21.1 (C-11), 26.3 (C-12), 44.4 (C-13), 49.8 (C-14), 31.0 (C-15), 30.5 (C-16), 50.4 (C-17), 15.8 (C-18), 19.1 (C-19), 36.2 (C-20), 18.9 (C-21), 36.2 (C-22), 31.2 (C-23), 25.5 (C-24), 31.1 (C-25), 24.2 (C-26), 25.7 (C-27), 24.3 (C-28), 15.4 (C-29), 27.9 (C-30).

20 + 1 Product 7: [α]D = +11.7 (c 0.10, CHCl3); ESIMS: m/z 449 [M + Na] ; H NMR (600 MHz, CDCl3) δH: 5.08 (1H, t, J = 7.2 Hz, H-24), 3.25 (1H, dd, J = 12.0, 4.8 Hz, H-3), 0.68 (3H, s, H-18), 0.81 (3H, s, H-29), 0.86 (3H, s, H-30), 0.85 (3H, d, J = 7.2 Hz, H-21), 0.97 (3H, s, H-28), 0.99 (3H, s, H-19), 1.28 (3H, s, H-26), 1 1.66 (3H, s, H-27); H NMR (150 MHz, CDCl3) δC: 35.6 (C-1), 27.8 (C-2), 79.0 (C-3), 38.9 (C-4), 50.1 (C-5), 18.4 (C-6), 28.2 (C-7), 134.3 (C-8), 130.9 (C-9), 37.1 (C-10), 21.0 (C-11), 26.5 (C-12), 44.5 (C-13), 49.7 (C-14), 31.0 (C-15), 30.8 (C-16), 50.4 (C-17), 15.8 (C-18), 19.2 (C-19), 36.2 (C-20), 24.1 (C-21), 36.4 (C-22), 24.9 (C-23), 125.1 (C-24), 131.0 (C-25), 18.6 (C-26), 25.8 (C-27), 24.3 (C-28), 15.4 (C-29), 28.0 (C-30).

5. Conclusions To summarize, we developed a reliable homology model of SgCS to explore the potential function of the active site. Molecular docking and energy calculation of quantum chemistry were useful for explaining the catalytic mechanism in theory. The dynamic domain of SgCS was constituted by five covarying residues including Asp486, Cys487, Cys565, Tyr535, and His260. We described a site-directed mutagenesis strategy for the synthesis of cucurbitane-type and lanostane-type derivatives via an oxidosqualene cyclization reaction using SgCS as a safe, economical, and eco-friendly catalyst. The results suggested that the active site of SgCS accounted for the alternation between C–B–C and C–C–C conformations. A lot of useful information about enzyme evolution can be provided by exploring the untapped catalytic promiscuity of natural enzymes. OSCs can polycyclize compounds in a one-step reaction, as a result, it is currently being applied as an interesting biocatalyst in a variety of biotechnological processes. Its tight stereo control, when compared to chemically produced compounds, makes this enzyme greatly superior to catalyze the reaction. Further studies are needed to broaden the substrate range of the enzyme, which makes OSCs a promising tool to produce novel, unnatural cyclic products.

Author Contributions: Conceptualization, X.M. and J.Q.; Methodology, Z.L., J.L. (Jingjing Liao) and G.M.; Formal Analysis, G.M.; Data Curation, J.Q. and Z.L.; Writing-Original Draft Preparation, J.Q. and G.M.; Writing-Review & Editing, X.M.; Manuscript Revise, J.L. (Jiushi Liu); Project Administration, G.M. and X.M.; Funding Acquisition, X.M. Funding: This work was supported by the National Natural Science Foundation of China (NO. 81573521), Beijing Municipal Natural Science Foundation (NO. 5172028) and CAMS Innovation Fund for Medical Sciences (NO. 2017-I2 M-1-013). Conflicts of Interest: The authors declare no conflict of interest. Catalysts 2018, 8, 577 13 of 28

Appendix A

Box A1. Code-Optimized SgCS gene sequence.

GGATCC GCCACC ATGTGGAGATTGAAAGTTGGTGCTGAATCTGTTGGTGAAAATGATGAAAAATGGTTGAAATCTA TCTCTAACCACTTGGGTAGACAAGTTTGGGAATTTTGTCCAGATGCTGGTACTCAACAACAATTG TTGCAAGTTCATAAAGCTAGAAAAGCTTTTCATGATGATAGATTCCATAGAAAACAGTCTTCTGA TTTGTTTATCACTATCCAATACGGTAAAGAAGTTGAAAACGGTGGTAAAACTGCTGGTGTTAAAT TGAAAGAAGGTGAAGAAGTTAGAAAAGAAGCTGTTGAATCTTCTTTGGAAAGAGCTTTGTCTTT TTATTCTTCTATCCAAACTTCTGATGGTAACTGGGCTTCTGATTTGGGTGGTCCAATGTTTTTGTTG CCAGGTTTGGTTATTGCTTTGTATGTTACTGGTGTTTTGAATTCTGTTTTGTCTAAGCATCATAGAC AAGAAATGTGTAGATATGTTTATAATCACCAGAACGAAGATGGTGGTTGGGGTTTGCATATTGAA GGTCCATCTACTATGTTTGGTTCTGCTTTGAATTATGTTGCTTTGAGATTGTTGGGTGAAGATGCTA ATGCTGGTGCTATGCCAAAAGCTAGAGCTTGGATTTTGGATCATGGTGGTGCTACTGGTATTACTT CTTGGGGTAAATTGTGGTTGTCTGTTTTGGGTGTTTATGAATGGTCTGGTAATAATCCATTGCCAC CAGAATTTTGGTTGTTTCCATATTTTTTGCCATTTCATCCAGGTAGAATGTGGTGTCATTGTAGAAT GGTTTATTTGCCAATGTCTTATTTGTACGGTAAAAGATTTGTTGGTCCAATTACTCCAATTGTTTTG TCTTTGAGAAAAGAATTGTATGCTGTTCCATATCATGAAATCGATTGGAATAAATCTAGAAACAC TTGCGCTAAAGAAGATTTGTATTACCCACATCCAAAAATGCAAGATATTTTGTGGGGTTCTTTGC ATCATGTTTATGAACCATTGTTTACTAGATGGCCAGCTAAAAGATTGAGAGAAAAAGCTTTGCAA ACTGCTATGCAACATATTCATTATGAAGATGAAAATACTAGATATATTTGTTTGGGTCCAGTTAAC AAGGTTTTGAATTTGTTATGTTGTTGGGTTGAAGATCCATATTCTGATGCTTTTAAATTACATTTAC AAAGAGTTCATGATTATTTATGGGTTGCTGAAGATGGTATGAAAATGCAAGGTTATAATGGTTCT CAATTATGGGATACTGCTTTTTCTATCCAAGCTATTGTTTCTACTAAGTTAGTTGACAATTACGGTC CTACTTTAAGAAAGGCTCATGATTTTGTTAAGTCTTCTCAAATCCAACAAGATTGTCCTGGTGATC CTAATGTTTGGTATAGACACATTCACAAGGGTGCTTGGCCTTTTTCTACTAGAGATCACGGTTGGT TAATTTCTGACTGTACTGCTGAAGGTTTAAAGGCTGCTTTAATGTTATCTAAGTTACCTTCTGAAA CTGTTGGTGAATCTTTAGAAAGAAATAGATTATGCGACGCAGTTAATGTTTTATTATCTTTACAGA ACGACAATGGTGGTTTCGCATCTTATGAATTAACAAGATCTTACCCTTGGTTAGAATTAATTAATC CTGCAGAAACATTCGGTGACATTGTTATTGACTACCCTTACGTTGAGTGTACATCTGCAACAATG GAGGCATTAACATTATTCAAGAAGTTACACCCTGGTCACAGAACAAAGGAGATAGACACAGCA ATAGTTAGAGCAGCAAATTTCTTAGAGAATATGCAAAGAACAGACGGTTCTTGGTACGGTTGTTG GGGTGTTTGTTTCACATACGCAGGTTGGTTCGGTATAAAGGGTTTAGTTGCAGCAGGTAGAACAT ACAATAATTGCTTAGCAATAAGAAAGGCATGCGACTTCTTATTATCTAAGGAGTTACCTGGTGGT GGTTGGGGTGAGTCTTACTTATCTTGCCAGAATAAGGTTTACACAAACTTAGAGGGTAACAGAC CTCACTTAGTTAACACAGCATGGGTTTTAATGGCATTAATAGAGGCAGGTCAGGCAGAGAGAGA CCCTACACCTTTACACAGAGCAGCAAGATTATTAATAAACTCTCAGTTAGAGAACGGTGACTTCC CTCAGCAGGAGATAATGGGTGTTTTCAACAAGAACTGCATGATAACATACGCAGCATACAGAAA CATATTCCCTATCTGGGCATTAGGTGAGTACTGCCACAGAGTTTTAACAGAG GAATTC Nucleotides indicating restriction sites are underlined and bold.

Table A1. Parameters computed by the online ExPASy Proteomics Server.

Protein Characteristic SgCS Number of the deduced amino acids 759 Molecular weight (kDa) 86.4 Theoretical pI 6.64 Instability index 38.07 Grand average of hydropathicity (GRAVY) −0.313

Table A2. List of primer sequences.

Mutants Primers (50→30) USER cassette oligo for pCEV-G4-Km, F: CTAGTGCTGAGGCATGTTTAAACGACCCTCAGCTTAAT SpeI/PacI overhang. R: TAAGCTGAGGGTCGTTTAAACATGCCTCAGCA F: GGCATGTTUATGTGGAGATTGAAAGTTGGTGC SgCS-Wild R: GGGTCGTTUTCACTCTGTTAAAACTCTGTGGCAGTAC F: CGATGACCTCCCATTGATA pCEV-Seq R: CGTTCTTAATACTAACATAACT F: AGATCACGGTTGGTTAATTTCTGAAGTACTGCTGAAGGTTTAAAGGC D486E R: GCCTTTAAACCTTCAGCAGTACTTCAGAAATTAACCAACCGTGATCT Catalysts 2018, 8, x FOR PEER REVIEW 14 of 28 Catalysts 2018, 8, 577 14 of 28 F: GGCATGTTUATGTGGAGATTGAAAGTTGGTGC SgCS-Wild R: GGGTCGTTUTCACTCTGTTAAAACTCTGTGGCAGTAC F: CGATGACCTCCCATTGATA pCEV-Seq R: CGTTCTTAATACTAACATAACTTable A2. Cont. F: AGATCACGGTTGGTTAATTTCTGAAGTACTGCTGAAGGTTTAAAGGC MutantsD486E Primers (50→30) R: GCCTTTAAACCTTCAGCAGTACTTCAGAAATTAACCAACCGTGATCT F: AGATCACGGTTGGTTAATTTCTAACGTACTGCTGAAGGTTTAAAGGC D486N F: AGATCACGGTTGGTTAATTTCTAACGTACTGCTGAAGGTTTAAAGGC D486N GTT R: GCCTTTAAACCTTCAGCAGTACR: GCCTTTAAACCTTCAGCAGTACGTTAGAAATTAACCAACCGTGATCTAGAAATTAACCAACCGTGATCT F: AGATCACGGTTGGTTAATTTCTF: AGATCACGGTTGGTTAATTTCTGCTGCTGTACTGCTGAAGGGTACTGCTGAAGGTTTAAAGGCTTTAAAGGC D486AD486A R: GCCTTTAAACCTTCAGCAGTACR: GCCTTTAAACCTTCAGCAGTACAGCAGCAGAAATTAACCAACCGTGATCTAGAAATTAACCAACCGTGATCT F: GACTACCCTTACGTTGAGF: GACTACCCTTACGTTGAGATGACATCTGCAACAATGGAGGATGACATCTGCAACAATGGAGG C487MC487M R: CCTCCATTGTTGCAGATGTR: CCTCCATTGTTGCAGATGTCATCTCAACGTAAGGGTAGTCCATCTCAACGTAAGGGTAGTC F: GACTACCCTTACGTTGAGF: GACTACCCTTACGTTGAGCGTACATCTGCAACAATGGAGGCGTACATCTGCAACAATGGAGG C487RC487R R: CCTCCATTGTTGCAGATGTR: CCTCCATTGTTGCAGATGTACGACGCTCAACGTAAGGGTAGTCCTCAACGTAAGGGTAGTC F: GACTACCCTTACGTTGAGF: GACTACCCTTACGTTGAGGCTACATCTGCAACAATGGAGGGCTACATCTGCAACAATGGAGG C487AC487A R: CCTCCATTGTTGCAGATGTR: CCTCCATTGTTGCAGATGTAGCAGCCTCAACGTAAGGGTAGTCCTCAACGTAAGGGTAGTC F: TGACTACCCTTACGTTGAGF: TGACTACCCTTACGTTGAGATGATGACATCTGCAACAATGGAGACATCTGCAACAATGGAG C565MC565M R: CTCCATTGTTGCAGATGTR: CTCCATTGTTGCAGATGTCATCTCAACGTAAGGGTAGTCACATCTCAACGTAAGGGTAGTCA F: TGACTACCCTTACGTTGAGF: TGACTACCCTTACGTTGAGCGTACATCTGCAACAATGGAGCGTACATCTGCAACAATGGAG C565RC565R R: CTCCATTGTTGCAGATGTR: CTCCATTGTTGCAGATGTACGACGCTCAACGTAAGGGTAGTCACTCAACGTAAGGGTAGTCA F: TGACTACCCTTACGTTGAGF: TGACTACCCTTACGTTGAGGCTACATCTGCAACAATGGAGGCTACATCTGCAACAATGGAG C565AC565A R: CTCCATTGTTGCAGATGTR: CTCCATTGTTGCAGATGTAGCAGCCTCAACGTAAGGGTAGTCACTCAACGTAAGGGTAGTCA F: CAATGGTGGTTTCGCATCTF: CAATGGTGGTTTCGCATCTTGGTGGGAATTAACAAGATCTTACCCGAATTAACAAGATCTTACCC Y535WY535W R: GGGTAAGATCTTGTTAATTCR: GGGTAAGATCTTGTTAATTCCCACCAAGATGCGAAACCACCATTGAGATGCGAAACCACCATTG F: CAATGGTGGTTTCGCATCTCTAGAATTAACAAGATCTTACCC F: CAATGGTGGTTTCGCATCTCTAGAATTAACAAGATCTTACCC Y535LY535L R: GGGTAAGATCTTGTTAATTCR: GGGTAAGATCTTGTTAATTCTAGTAGAGATGCGAAACCACCATTGAGATGCGAAACCACCATTG F: CAATGGTGGTTTCGCATCTGCTGAATTAACAAGATCTTACCC Y535A F: CAATGGTGGTTTCGCATCTGCTGAATTAACAAGATCTTACCC Y535A R: GGGTAAGATCTTGTTAATTCR: GGGTAAGATCTTGTTAATTCAGCAGCAGATGCGAAACCACCATTGAGATGCGAAACCACCATTG F: TCCAGGTAGAATGTGGTGTCGTTGTAGAATGGTTTATTTGCC H260R F: TCCAGGTAGAATGTGGTGTCGTTGTAGAATGGTTTATTTGCC H260R R: GGCAAATAAACCATTCTACAACGACACCACATTCTACCTGGA R: GGCAAATAAACCATTCTACAACGACACCACATTCTACCTGGA F: TCCAGGTAGAATGTGGTGTTGGTGTAGAATGGTTTATTTGCC H260W F: TCCAGGTAGAATGTGGTGTTGGTGTAGAATGGTTTATTTGCC H260W R: GGCAAATAAACCATTCTACACCAACACCACATTCTACCTGGA R: GGCAAATAAACCATTCTACACCAACACCACATTCTACCTGGA F: TCCAGGTAGAATGTGGTGTGCTTGTAGAATGGTTTATTTGCC H260A F: TCCAGGTAGAATGTGGTGTGCTTGTAGAATGGTTTATTTGCC H260A R: GGCAAATAAACCATTCTACAAGCACACCACATTCTACCTGGA R: GGCAAATAAACCATTCTACAAGCACACCACATTCTACCTGGA F: AGATCACGGTTGGTTAATTTCTGCTGCTCTGCTGAAGGTTTAAAGGC DC486AA F: AGATCACGGTTGGTTAATTTCTGCTGCTCTGCTGAAGGTTTAAAGGC DC486AA R: GCCTTTAAACCTTCAGCAGAGCAGCAGAAATTAACCAACCGTGATCT R: GCCTTTAAACCTTCAGCAGAGCAGCAGAAATTAACCAACCGTGATCT F: CACGGTTGGTTAATTTCTGACTGTGCTGCTGCTGGTTTAAAGGCTGCTTTAATGTTA TAE4890AAA F: CACGGTTGGTTAATTTCTGACTGTGCTGCTGCTGGTTTAAAGGCTGCTTTAATGTTA TAE4890AAA R: TAACATTAAAGCAGCCTTTAAACCAGCAGCAGCACAGTCAGAAATTAACCAACCGTG R: TAACATTAAAGCAGCCTTTAAACCAGCAGCAGCACAGTCAGAAATTAACCAACCGTG F: GTCTTTTTATTCTTCTATCGCTACTTCTGATGGTAACGCTGCTTCTGATTTGGGTGGT QW113AA F: GTCTTTTTATTCTTCTATCGCTACTTCTGATGGTAACGCTGCTTCTGATTTGGGTGGT QW113AA R: ACCACCCAAATCAGAAGCAGCGTTACCATCAGAAGTAGCGATAGAAGAATAAAAAGAC F: GTAGATATGTTTATAATCACR: ACCACCCAAATCAGAAGCGCTAACGAAGATGGTGGTGCTAGCGTTACCATCAGAAGTAGCGGTTTGCATATTGAAGGTGATAGAAGAATAAAAAGAC QW163AA R: ACCTTCAATATGCAAACCF: GTAGATATGTTTATAATCACAGCACCACCATCTTCGTTAGCGCTAACGAAGATGGTGGTGCTGTGATTATAAACATATCTACGGTTTGCATATTGAAGGT QW163AA F: GCAAATTTCTTAGAGAATATGR: ACCTTCAATATGCAAACCGCTAGAACAGACGGTTCTGCTAGCACCACCATCTTCGTTAGCGTGATTATAAACATATCTACTACGGTTGTTGGGGTGTTT QW603AA R: AAACACCCCAACAACCGTAF: GCAAATTTCTTAGAGAATATGAGCAGAACCGTCTGTTCTAGCGCTAGAACAGACGGTTCTGCTCATATTCTCTAAGAAATTTGCTACGGTTGTTGGGGTGTTT QW603AA F: ATCACGGTTGGTTAATTTCTR: AAACACCCCAACAACCGTAGCTGCTGCTGCTGCTAGCAGAACCGTCTGTTCTAGCGGTTTAAAGGCTGCTTTAATGCATATTCTCTAAGAAATTTGC DCTAE4865A R: CATTAAAGCAGCCTTTAAACF: ATCACGGTTGGTTAATTTCTCAGCAGCAGCAGCAGGCTGCTGCTGCTGCTCAGAAATTAACCAACCGTGATGGTTTAAAGGCTGCTTTAATG DCTAE4865A Site-saturation F: TCCAGGTAGAATGTGGTGTR: CATTAAAGCAGCCTTTAAACNNKTGTAGAATGGTTTATTTCAGCAGCAGCAGCAG CAGAAATTAACCAACCGTGAT mutagenesis 260 R: GGCAAATAAACCATTCTACAF: TCCAGGTAGAATGTGGTGTMNNNNKACACCACATTCTACCTTGTAGAATGGTTTATTT Site-saturation mutagenesis 260 The underlinedR: GGCAAATAAACCATTCTACA oligonucleotides MNNrepresentACACCACATTCTACCT mutagenized sites. The underlined oligonucleotides represent mutagenized sites.

Figure A1. Two substrate prefolding types during the initial oxidosqualene cyclization reactions. Figure A1. Two substrate prefolding types during the initial oxidosqualene cyclization reactions.

Catalysts 2018, 8, 577 15 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 15 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 15 of 28

Figure A2. Homology model of cucurbitadienol synthase constructed using the human LAS crystal Figure A2.A2. Homology model of cucurbitadienol synthase constructed using the human LAS crystal structure as a template. During optimization of the model no major structural changes were observed structure as a template. During optimization of the model model no no major major structural structural changes were observed for active site residues, and refinement consisted only on fine-tuning the coordinates of active site for active site residues, and refinementrefinement consisted only on fine fine-tuning-tuning the coordinates of active site sidechain atoms. sidechain atoms.

Figure A A3.3. ConformationsConformations of 2,3 of-oxidosqualene 2,3-oxidosqualene and C and-4 cation C-4 after cation geometry after geometry optimization. optimization. (a,b) 2,3- Figure A3. Conformations of 2,3-oxidosqualene and C-4 cation after geometry optimization. (a,b) 2,3- (oxidosqualenea,b) 2,3-oxidosqualene (c,d) C-4 (cation.c,d) C-4 Yellow: cation. carbon Yellow: atom; carbon Red: atom; oxygen Red: atom. oxygen atom. oxidosqualene (c,d) C-4 cation. Yellow: carbon atom; Red: oxygen atom.

Catalysts 2018, 8, 577 16 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 16 of 28

(a)

(b)

Figure A4. Cont.

Catalysts 2018, 8, 577 17 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 17 of 28

(c)

(d)

Figure A4. Cont.

Catalysts 2018, 8, 577 18 of 28

Catalysts 2018, 8, x FOR PEER REVIEW 18 of 28 Relative Abundance

qj-B-11_180503161403 # 1 RT: 0.00 AV: 1 NL: 9.55E5 T: ITMS + p ESI Full ms [200.00-2000.00] 449.4167 100 95 90 85 80 75 70 65 60 55 50 45 40 35 30 25 20 1043.6667 15 326.2500 10 481.3333 685.4167 727.4167 5 875.0834 1101.6667 1613.8334 1927.3334 0 200 400 600 800 1000 1200 1400 1600 1800 2000 m/z (e)

FigureFigure A4. A4.( a(a))1 HNMR1HNMR spectrumspectrum ofof compoundcompound3 3.(. b(b))13 13CAPTCAPT spectrumspectrum ofof compoundcompound3 3.(. c(c))HSQC HSQC spectrumspectrum of of compound compound3 3.(. d(d)) HMBC HMBC spectrum spectrum of of compound compound3 3.(. e(e)) ESIMS ESIMS spectrum spectrum of of compound compound3 3. .

Catalysts 2018, 8, 577 19 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 19 of 28

(a)

(b)

Figure A5. Cont.

Catalysts 2018, 8, 577 20 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 20 of 28

(c)

(d)

Figure A5. Cont.

Catalysts 2018, 8, x FOR PEER REVIEW 21 of 28 CatalystsCatalysts2018, 82018, 577, 8, x FOR PEER REVIEW 2121 of 28of 28

Relative Abundance qj-B-8_180625122450Relative #Abundance 1 RT: 0.00 AV: 1 NL: 1.48E5 T: ITMS + p ESI Full ms [200.00-2000.00] qj-B-8_180625122450451.2500 # 1 RT: 0.00 AV: 1 NL: 1.48E5 100 T: ITMS + p ESI Full ms [200.00-2000.00] 451.2500 95 100 90 95 90 85 85 80 80 75 75 70 70 65 65 60 60 55 55 50 50 45 45 40 40 35 35 30 413.3333 30 413.3333 671.3333 25 671.3333 25 20 20 1058.5834 15 1058.5834 15 10 10 1685.3334 309.2500 1140.0000 1685.3334 5 309.2500 917.0000 1140.0000 5 917.0000 0 200 0 400 600 800 1000 1200 1400 1600 1800 2000 200 400 600 800 1000m/z 1200 1400 1600 1800 2000 m/z

(e) (e) Figure A5. (a) 1HNMR spectrum of compound 4. (b) 13CAPT spectrum of compound 4. (c) HSQC FigureFigure A5. ( aA)51. HNMR(a) 1HNMR spectrum spectrum of compound of compound4.(b 4). 13(bCAPT) 13CAPT spectrum spectrum of compoundof compound4.( c4). HSQC(c) HSQC spectrum of compound 4. (d) HMBC spectrum of compound 4. (e) ESIMS spectrum of compound 4. spectrumspectrum of compound of compound4.(d )4 HMBC. (d) HMBC spectrum spectrum of compound of compound4.(e 4). ESIMS (e) ESIMS spectrum spectrum of compound of compound4. 4.

(a) (a)

Figure A6. Cont.

Catalysts 2018, 8, 577 22 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 22 of 28

(b)

(c)

Figure A6. Cont.

Catalysts 2018, 8, 577 23 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 23 of 28

(d)

qj-B-7_180625124610 # 1 RT: 0.00 AV: 1 NL: 1.35E5 T: ITMS + p ESI Full ms [200.00-2000.00] 451.2500 100

95 Relative Abundance 90

85 80 75

70 65

60 55 50

45 40

35 30 413.2500 25 671.3333 20 1058.5834 15

10 309.1667 1219.1667 1370.7500 5 940.3334 0 200 400 600 800 1000 1200 1400 1600 1800 2000 m/z

(e)

FigureFigure A6. A6.( a(a) )1 HNMR1HNMR spectrumspectrum ofof compound compound5 .(5. b(b) )13 13CAPTCAPT spectrumspectrum ofof compoundcompound5 5.(. c(c)) HSQC HSQC spectrumspectrum of of compound compound5 5.(. d(d)) HMBC HMBC spectrum spectrum of of compound compound5 .(5. e(e)) ESIMS ESIMS spectrum spectrum of of compound compound5 .5.

Catalysts 2018, 8, 577 24 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 24 of 28

(a)

(b)

Figure A7. Cont.

Catalysts 2018, 8, 577 25 of 28 Catalysts 2018, 8, x FOR PEER REVIEW 25 of 28

(c)

(d)

Figure A7. Cont.

CatalystsCatalysts2018 2018, ,8 8,, 577 x FOR PEER REVIEW 26 ofof 2828

qj-B-5_180625124610 # 1 RT: 0.00 AV: 1 NL: 1.63E5 T: ITMS + p ESI Full ms [200.00-2000.00] 449.4167 100 95 90 85 80 75 Relative Abundance 70 326.2500 65 60 55 50 45 1101.6667 40 35 30 25 481.3333 20 579.4167 685.4167 15 951.0834 10 785.5000 1186.4167 5 0 200 400 600 800 1000 1200 1400 1600 1800 2000 m/z

(e)

FigureFigure A7.A7.( a(a))1 1HNMRHNMR spectrumspectrum ofof compoundcompound7 7.(. b(b))13 13CAPTCAPT spectrumspectrum ofof compoundcompound 77.(. (cc)) HSQCHSQC spectrumspectrum ofof compoundcompound7 7.(. (dd)) HMBCHMBC spectrumspectrum of of compound compound7 7.(. e(e)) ESIMS ESIMS spectrumspectrum ofof compoundcompound7 7. .

ReferencesReferences

1.1. Hill,Hill, R.A.;R.A.; Connolly,Connolly, J.D. J.D. Triterpenoids. Triterpenoids.Nat. Nat. Prod. Prod. Rep. Rep.2013 2013,,30 30,, 1028–1065. 1028–1065. [ CrossRef][PubMed] 2.2. Nes,Nes, W.R.W.R. Role Role of of sterols sterols in in membranes. membranes.Lipids Lipids1974 1974, ,9 9,, 596–612. 596–612. [ CrossRef][PubMed] 3.3. Xu,Xu, R.;R.; Fazio, G.C.; Matsuda, Matsuda, S.P. S.P. On On the the origins origins of of triterpenoid triterpenoid skeletal skeletal diversity. diversity. PhytochemistryPhytochemistry 20042004, 65,, 65261, 261–291.–291. [CrossRef][PubMed] 4.4. Augustin,Augustin, J.M.; J.M.; Kuzina, Kuzina, V.; V.; Andersen, Andersen, S.B.; S.B.; Bak, Bak, S. S. Molecular Molecular activities, activities, biosynthesis biosynthesis and and evolution evolution of of triterpenoidtriterpenoid saponins.saponins.Phytochemistry Phytochemistry2011 2011,,72 72,, 435–457. 435–457. [CrossRef][PubMed] 5.5. Moses,Moses, T.;T.; Papadopoulou,Papadopoulou, K.K.;K.K.; Osbourn,Osbourn, A.A. MetabolicMetabolic andand functionalfunctional diversitydiversity ofof saponins,saponins, biosyntheticbiosynthetic intermediatesintermediates andand semi-syntheticsemi-synthetic derivatives. CritCrit.. Rev Rev.. Biochem Biochem.. Mol Mol.. Biol Biol.. 20142014, 49, 49, 439, 439–462.–462. [CrossRef] 6. [Kovganko,PubMed] N.V.; Ananich, S.K. The Chemical Synthesis of Sterols: Latest Advances. Chem. Nat. Compd. 1999, 6. Kovganko,35, 229–259. N.V.; Ananich, S.K. The Chemical Synthesis of Sterols: Latest Advances. Chem. Nat. Compd. 1999, 7. 35Yu,, 229–259. B.; Sun, [J.CrossRef Current] synthesis of triterpene saponins. Chem. Asian J. 2010, 4, 642–654. 7.8. Yu,Abe, B.; I.; Sun, Rohmer, J. Current M.; Prestwich, synthesis of G.D. triterpene Enzymatic saponins. CyclizationChem. Asianof Squalene J. 2010 ,and4, 642–654. Oxidosqualene [CrossRef to][ SterolsPubMed and] 8. Abe,Triterpenes. I.; Rohmer, Chem M.;. Rev Prestwich,. 1993, 93 G.D., 2189 Enzymatic–2206. Cyclization of Squalene and Oxidosqualene to Sterols and 9. Triterpenes.Kolesnikova,Chem. M.D.; Rev. Xiong,1993 Q.;, 93 Lodeiro,, 2189–2206. S.; Hua, [CrossRef L.; Matsuda,] S.P.T. Lanosterol biosynthesis in plants. Arch. 9. Kolesnikova,Biochem. Biophys M.D.;. 2006 Xiong,, 447, 87 Q.;–95. Lodeiro, S.; Hua, L.; Matsuda, S.P.T. Lanosterol biosynthesis in plants. 10. Arch.Suzuki, Biochem. M.; Xiang, Biophys. T.; Ohyama,2006, 447 ,K.; 87–95. Seki, [ CrossRefH.; Saito,][ K.;PubMed Muranaka,] T.; Hayashi, H.; Katsube, Y.; Kushiro, T.; 10. Suzuki,Shibuya, M.; M. Xiang, Lanosterol T.; Ohyama, Synthase K.; in Seki, Dicotyledonous H.; Saito, K.; Plants. Muranaka, Plant T.;Cell Hayashi, Physiol. 2006 H.; Katsube,, 47, 565– Y.;571. Kushiro, T.; 11. Shibuya,Eschenmoser, M. Lanosterol A.; Ruzicka, Synthase L.; Jeger, in Dicotyledonous O.; Arigoni, D. Plants. Zur KenntnisPlant Cell derPhysiol. Triterpene.2006, 47 , 190. 565–571. Mitteilung. [CrossRef Eine] [stereochemischePubMed] Interpretation der biogenetischen Isoprenregel bei den Triterpenen. Helv. Chim. Acta 1955, 38, 1890–1904. 12. Wu, T.K.; Griffin, J.H. Conversion of a plant oxidosqualene-cycloartenol synthase to an oxidosqualene- lanosterol cyclase by random mutagenesis. Biochemistry 2002, 41, 8238–8244.

Catalysts 2018, 8, 577 27 of 28

11. Eschenmoser, A.; Ruzicka, L.; Jeger, O.; Arigoni, D. Zur Kenntnis der Triterpene. 190. Mitteilung. Eine stereochemische Interpretation der biogenetischen Isoprenregel bei den Triterpenen. Helv. Chim. Acta 1955, 38, 1890–1904. [CrossRef] 12. Wu, T.K.; Griffin, J.H. Conversion of a plant oxidosqualene-cycloartenol synthase to an oxidosqualene- lanosterol cyclase by random mutagenesis. Biochemistry 2002, 41, 8238–8244. [CrossRef][PubMed] 13. Wu, T.K.; Liu, Y.T.; Chang, C.H.; Yu, M.T.; Wang, H.J. Site-saturated mutagenesis of histidine 234 of Saccharomyces cerevisiae oxidosqualene-lanosterol cyclase demonstrates dual functions in cyclization and rearrangement reactions. J. Am. Chem. Soc. 2006, 128, 6414–6419. [CrossRef][PubMed] 14. Chang, C.H.; Chen, Y.C.; Tseng, S.W.; Liu, Y.T.; Wen, H.Y.; Li, W.H.; Huang, C.Y.; Ko, C.Y.; Wang, T.T.; Wu, T.K. The cysteine 703 to isoleucine or histidine mutation of the oxidosqualene-lanosterol cyclase from Saccharomyces cerevisiae generates an iridal-type triterpenoid. Biochimie 2012, 94, 2376–2381. [CrossRef] [PubMed] 15. Chang, C.H.; Wen, H.Y.; Shie, W.S.; Lu, C.T.; Li, M.E.; Liu, Y.T.; Li, W.H.; Wu, T.K. Protein engineering of oxidosqualene-lanosterol cyclase into triterpene monocyclase. Org. Biomol. Chem. 2013, 11, 4214–4219. [CrossRef][PubMed] 16. Sato, T.; Hoshino, T. Catalytic function of the residues of phenylalanine and tyrosine conserved in squalene-hopene cyclases. Biosci. Biotechnol. Biochem. 2001, 65, 2233–2242. [CrossRef][PubMed] 17. Hart, E.A.; Ling, H.; Darr, L.B.; Wilson, W.K.; Jihai Pang, A.; Matsuda, S.P.T. Directed Evolution To Investigate Steric Control of Enzymatic Oxidosqualene Cyclization. An Isoleucine-to-Valine Mutation in Cycloartenol Synthase Allows Lanosterol and Parkeol Biosynthesis. J. Am. Chem. Soc. 1999, 121, 9887–9888. [CrossRef] 18. Herrera, J.B.R.; And, W.K.W.; Matsuda, S.P.T. A Tyrosine-to-Threonine Mutation Converts Cycloartenol Synthase to an Oxidosqualene Cyclase that Forms Lanosterol as Its Major Product. J. Am. Chem. Soc. 2000, 122, 6765–6766. [CrossRef] 19. Meyer, M.M.; Xu, R.; Matsuda, S.P. Directed evolution to generate cycloartenol synthase mutants that produce lanosterol. Org. Lett. 2002, 4, 1395–1398. [CrossRef][PubMed] 20. Lodeiro, S.; Schulz-Gasch, T.; Matsuda, S. Enzyme Redesign: Two Mutations Cooperate to Convert Cycloartenol Synthase into an Accurate Lanosterol Synthase. J. Am. Chem. Soc. 2005, 127, 14132–14133. [CrossRef] [PubMed] 21. Salmon, M.; Thimmappa, R.B.; Minto, R.E.; Melton, R.E.; Hughes, R.K.; O’Maille, P.E.; Hemmings, A.M.; Osbourn, A. A conserved amino acid residue critical for product and substrate specificity in plant triterpene synthases. Proc. Natl. Acad. Sci. USA 2016, 113, E4407–E4414. [CrossRef][PubMed] 22. Ito, R.; Masukawa, Y.; Nakada, C.; Amari, K.; Nakano, C.; Hoshino, T. β-Amyrin synthase from Euphorbia tirucalli. Steric bulk, not the π-electrons of Phe, at position 474 has a key role in affording the correct folding of the substrate to complete the normal polycyclization cascade. Org. Biomol. Chem. 2014, 12, 3836–3846. [CrossRef][PubMed] 23. Dai, L.; Liu, C.; Zhu, Y.; Zhang, J.; Men, Y.; Zeng, Y.; Sun, Y. Functional Characterization of Cucurbitadienol Synthase and Triterpene Glycosyltransferase Involved in Biosynthesis of Mogrosides from Siraitia grosvenorii. Plant Cell Physiol. 2015, 56, 1172–1182. [CrossRef][PubMed] 24. Zhao, H.; Tang, Q.; Mo, C.; Bai, L.; Tu, D.; Ma, X. Cloning and characterization of squalene synthase and cycloartenol synthase from Siraitia grosvenorii. Acta Pharm. Sin. B 2017, 7, 215–222. [CrossRef][PubMed] 25. Shibuya, M.; Adachi, S.; Ebizuka, Y. Cucurbitadienol synthase, the first committed enzyme for cucurbitacin biosynthesis, is a distinct enzyme from cycloartenol synthase for phytosterol biosynthesis. Tetrahedron 2004, 35, 6995–7003. [CrossRef] 26. Davidovich-Rikanati, R.; Shalev, L.; Baranes, N.; Meir, A.; Itkin, M.; Cohen, S.; Zimbler, K.; Portnoy, V.; Ebizuka, Y.; Shibuya, M. Recombinant yeast as a functional tool for understanding bitterness and cucurbitacin biosynthesis in watermelon (Citrullus spp.). Yeast 2015, 32, 103–114. [PubMed] 27. Shang, Y.; Ma, Y.; Zhou, Y.; Zhang, H.; Duan, L.; Chen, H.; Zeng, J.; Zhou, Q.; Wang, S.; Gu, W. Biosynthesis, regulation, and domestication of bitterness in cucumber. Science 2014, 346, 1084–1088. [CrossRef][PubMed] 28. Thoma, R.; Schulzgasch, T.; D’Arcy, B.; Benz, J.; Aebi, J.; Dehmlow, H.; Hennig, M.; Stihle, M.; Ruf, A. Insight into steroid scaffold formation from the structure of human oxidosqualene cyclase. Nature 2004, 432, 118–122. [CrossRef][PubMed] 29. Poralla, K.; Hewelt, A.; Prestwich, G.D.; Abe, I.; Reipen, I.; Sprenger, G.A. specific amino acid repeat in squalene and oxidosqualene cyclases. Trends Biochem. Sci. 1994, 19, 157–158. [CrossRef] Catalysts 2018, 8, 577 28 of 28

30. Corey, E.J.; Cheng, H.; Baker, C.H.; Matsuda, S.P.T.; Li, D.; Song, X. Studies on the Substrate Binding Segments and Catalytic Action of Lanosterol Synthase. Affinity Labeling with Carbocations Derived from Mechanism-Based Analogs of 2,3-Oxidosqualene and Site-Directed Mutagenesis Probes. J. Am. Chem. Soc. 1997, 119, 1289–1296. [CrossRef] 31. Wu, T.K.; Liu, Y.T.; Chang, C.H. Histidine residue at position 234 of oxidosqualene-lanosterol cyclase from saccharomyces cerevisiae simultaneously influences cyclization, rearrangement, and deprotonation reactions. Chem. Biochem. 2005, 6, 1177–1181. [CrossRef][PubMed] 32. Weng, J.K. The evolutionary paths towards complexity: A metabolic perspective. New Phytol. 2014, 201, 1141–1149. [CrossRef][PubMed] 33. Banta, A.B.; Wei, J.H.; Gill, C.C.; Giner, J.L.; Welander, P.V. Synthesis of arborane triterpenols by a bacterial oxidosqualene cyclase. Proc. Natl. Acad. Sci. USA 2016, 114, 245–250. [CrossRef][PubMed] 34. Ito, R.; Hashimoto, I.; Masukawa, Y.; Hoshino, T. Effect of cation-π interactions and steric bulk on the catalytic action of oxidosqualene cyclase: A case study of Phe728 of β-amyrin synthase from Euphorbia tirucalli L. Chem. Eur. J. 2013, 19, 17150–17158. [CrossRef][PubMed] 35. Thimmappa, R.; Geisler, K.; Louveau, T.; O’Maille, P.; Osbourn, A. Triterpene Biosynthesis in Plants. Annu. Rev. Plant Biol. 2014, 65, 225–257. [CrossRef][PubMed] 36. Wendt, K.U.; Lenhart, A.; Schulz, G.E. The structure of the membrane protein squalene-hopene cyclase at 2.0 å resolution 1. J. Mol. Biol. 1999, 286, 175–187. [CrossRef][PubMed] 37. Kushiro, T.; Shibuya, M.; Masuda, K.; Ebizuka, Y. Mutational Studies on Triterpene Synthases: Engineering into β-Amyrin Synthase. J. Am. Chem. Soc. 2000, 122, 6818–6824. [CrossRef] 38. Schulz–Gasch, T.; Stahl, M. Mechanistic insights into oxidosqualene cyclizations through homology modeling. J. Comput. Chem. 2003, 24, 741–753. [CrossRef][PubMed] 39. Kelly, R.; Miller, S.M.; Lai, M.H.; Kirsch, D.R. Cloning and characterization of the 2,3-oxidosqualene cyclase-coding gene of Candida albicans. Gene 1990, 87, 177–183. [CrossRef] 40. Ma, Y.; Zhou, Y.; Ovchinnikov, S.; Greisen, P., Jr.; Huang, S.; Yi, S. New insights into substrate folding preference of plant OSCs. Sci. Bull. 2016, 61, 1407–1412. [CrossRef] 41. Vickers, C.E.; Bydder, S.F.; Zhou, Y.; Nielsen, L.K. Dual gene expression cassette vectors with antibiotic selection markers for engineering in Saccharomyces cerevisiae. Microb. Cell Fact. 2013, 12, 96. [CrossRef] [PubMed] 42. Morris, G.M.; Goodsell, D.S.; Halliday, R.S.; Huey, R.; Hart, W.E.; Belew, R.K.; Olson, A.J. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J. Comput. Chem. 2015, 19, 1639–1662. [CrossRef] 43. Huey, R.; Morris, G.M.; Olson, A.J.; Goodsell, D.S. A semiempirical free energy force field with charge-based desolvation. J. Comput. Chem. 2010, 28, 1145–1152. [CrossRef][PubMed] 44. Geuflores, F.; Noureldin, H.H.; Nielsen, M.T.; Halkier, B.A. USER fusion: A rapid and efficient method for simultaneous fusion and cloning of multiple PCR products. Nucleic Acids Res. 2007, 35, e55. [CrossRef] [PubMed]

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).