
Materials and methods

Metabolome analysis

Root samples were collected from 2-month-old and 2-year-old S. baicalensis plants maintained in Shanghai Chenshan Botanical Garden, ground into powder in liquid nitrogen, and then freeze-dried. For each sample, 20 mg of powder was suspended in 2 ml of 70% methanol and extracted in an ultrasonic water bath for 2 h. After centrifugation at 12,000 g for 10 min, the supernatant was filtered through a 0.2 μm Millipore filter before metabolite analysis. Samples were analyzed using a UPLC-ESI-MS/MS system (UPLC, Shim-pack UFLC SHIMADZU CBM30A, www.shimadzu.com.cn/; MS, Applied Biosystems 6500 Q TRAP, www.appliedbiosystems.com.cn/). The analytical conditions were as follows. UPLC: column, Waters ACQUITY UPLC HSS T3 C18 (1.8 µm, 2.1 mm × 100 mm); mobile phase, solvent A (pure water with 0.04% acetic acid) and solvent B (acetonitrile with 0.04% acetic acid). Sample measurements were performed with a gradient program that started at 95% A, 5% B. Within 10 min, a linear gradient to 5% A, 95% B was programmed, and a composition of 5% A, 95% B was kept for 1 min. Subsequently, a composition of 95% A, 5% B was restored within 0.10 min and maintained for 2.9 min. The column oven was set to 40 °C and the injection volume was 2 μl. The effluent was alternatively connected to an ESI-triple quadrupole-linear ion trap (Q TRAP)-MS. LIT and triple quadrupole (QQQ) scans were acquired on a triple quadrupole-linear ion trap mass spectrometer (Q TRAP), API 6500 Q TRAP UPLC/MS/MS System, equipped with an ESI Turbo Ion-Spray interface, operating in positive and negative ion modes and controlled by Analyst 1.6.3 software (AB Sciex).
The ESI source operation parameters were as follows: ion source, turbo spray; source temperature, 550 °C; ion spray voltage (IS), 5500 V (positive ion mode)/-4500 V (negative ion mode); ion source gas I (GSI), gas II (GSII), and curtain gas (CUR) were set at 50, 60, and 30.0 psi, respectively; the collision gas (CAD) was high. Instrument tuning and mass calibration were performed with 10 and 100 μmol/L polypropylene glycol solutions in QQQ and LIT modes, respectively. QQQ scans were acquired as MRM (Multiple Reaction Monitoring) experiments with the collision gas (nitrogen) set to 5 psi. DP (Declustering Potential) and CE (Collision Energy) were further optimized for individual MRM transitions. A specific set of MRM transitions was monitored for each period according to the metabolites eluted within this period.
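The UPLC gradient program described above can be written as a set of time/composition breakpoints. The following is a minimal sketch, assuming linear interpolation between the stated breakpoints (including the 0.10 min return step); the function names are illustrative and are not part of the instrument software.

```python
# Breakpoints (time in min, % solvent B) taken from the UPLC gradient description:
# start at 5% B, linear ramp to 95% B at 10 min, hold 1 min,
# return to 5% B within 0.10 min, hold 2.9 min (total run of 14 min).
GRADIENT = [(0.0, 5.0), (10.0, 95.0), (11.0, 95.0), (11.1, 5.0), (14.0, 5.0)]


def percent_b(t_min):
    """Return % solvent B at time t_min, interpolating linearly between breakpoints."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_a(t_min):
    """Return % solvent A, the complement of solvent B in a binary gradient."""
    return 100.0 - percent_b(t_min)
```

For instance, `percent_b(5.0)` gives the composition halfway through the linear ramp.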

Genome-wide identification of SbUGT and SbUGAT genes

The hidden Markov model (HMM) profile of Pfam PF00201 (http://pfam.xfam.org/) was used to extract full-length glycosyltransferase candidates from the S. baicalensis genome with HMMER (Eddy 1998). Candidates were then filtered to retain sequences between 200 and 600 amino acids in length.
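The length filter applied to the HMMER hits can be sketched as follows. This is an illustration only: it assumes the candidate proteins are available as FASTA text and that the 200 and 600 amino acid bounds are inclusive, which the text does not state. The helper names are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {sequence id: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the id.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def filter_by_length(records, min_len=200, max_len=600):
    """Keep sequences whose length (ignoring a trailing stop '*') is within bounds.

    Inclusive bounds are an assumption made for this sketch.
    """
    return {
        name: seq
        for name, seq in records.items()
        if min_len <= len(seq.rstrip("*")) <= max_len
    }
```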

Multiple sequence alignments and phylogenetic tree construction were performed using MEGA X (Kumar et al., 2018). The Neighbor-Joining tree was constructed under the default parameters from the candidates together with UGT sequences from Arabidopsis thaliana (downloaded from http://www.p450.kvl.dk/UGT.shtml) and known UGTs with the following accession numbers: BpUGAT (AB190262), SlUGT1 (AB362989), AmUGTcg10 (AB362988), PfUGT50 (AB362991), SiUGT23 (AB362990), VvGT5 (AB499074), and Sb3GT1 (MK577650). 7-O SbUGT and SbUGAT candidates were screened according to their annotated function and subfamily classification. A Maximum-Likelihood tree was then constructed under the default parameters from the sequences of the 7-O SbUGT and SbUGAT candidates.

Gene cloning

The complete open reading frames (ORFs) of the SbUGT and SbUGAT genes were amplified by RT-PCR using the primers listed in Supplementary Table 3. cDNA templates were chosen according to the tissue-specific expression patterns of the SbUGT and SbUGAT genes. The ORFs of SbUGT1 and SbUGT10 were obtained by de novo synthesis (GenScript, Nanjing, China). Fragments were cloned into the entry vector pDONR207 and the prokaryotic expression vector pYesdest17 using the Gateway BP Clonase II Enzyme Kit and LR Clonase II Enzyme Kit (Invitrogen, MA, USA), respectively, according to the manufacturer’s instructions.

Crude enzyme extraction and protein purification

The successfully constructed vectors were transformed into E. coli Rosetta (DE3) competent cells (Weidi Biotech, Shanghai, China). After growth at 37 °C for 12 h, transformant colonies were first grown in 10 mL of LB liquid medium with 100 mg/mL ampicillin at 37 °C and 180 rpm for approximately 12 h, and then transferred into 200 mL of LB liquid medium with 100 mg/mL ampicillin at 37 °C in a shaking incubator until the OD600 reached 0.6–0.8. Isopropyl β-D-thiogalactopyranoside (IPTG) was added to a final concentration of 1 mM, and the cultures were incubated at 16 °C and 120 rpm for 16 h. pET28a-transformed E. coli Rosetta (DE3) was used as a control.

For crude enzyme extraction, E. coli cells were harvested by centrifugation at 12,000 rpm and resuspended in 50 mM phosphate buffer (pH 8.0) containing 0.5 mM phenylmethanesulfonylfluoride (PMSF), 300 mM NaCl, and 2 mM β-mercaptoethanol. High-pressure cell disruption equipment (Constant Systems, Northants, UK) was used to lyse the E. coli cells. After centrifugation at 4 °C and 12,000 rpm for 20 min, approximately 10 mL of supernatant (crude protein) was collected. An equal volume of 60% glycerol was added to the supernatant for storage at -80 °C.

For protein purification, E. coli cells were harvested by centrifugation at 12,000 rpm and resuspended in 10 mL of buffer A [50 mM phosphate buffer (pH 8.0), 0.5 mM PMSF, 300 mM NaCl, 2 mM β-mercaptoethanol and 10 mM imidazole]. High-pressure cell disruption equipment (Constant Systems, Northants, UK) was used to lyse the E. coli cells. After centrifugation at 4 °C and 12,000 rpm for 20 min, the supernatant was mixed with 1 mL of Ni–nitrilotriacetic acid (NTA) agarose (Qiagen, Germany) and stirred at 4 °C for 1 h. The mixture was packed into a column and washed three times at 4 °C with 5 mL of buffer B [50 mM phosphate buffer (pH 8.0), 0.5 mM PMSF, 300 mM NaCl, 2 mM β-mercaptoethanol and 20 mM imidazole]. The protein was eluted with 1 mL of buffer C [50 mM phosphate buffer (pH 8.0), 0.5 mM PMSF, 300 mM NaCl, 2 mM β-mercaptoethanol and 250 mM imidazole], and the imidazole was removed by ultrafiltration. Protein concentrations were determined using the Bradford method (Bradford 1976), and the proteins were analyzed by 10% SDS-polyacrylamide gel electrophoresis.

In vitro enzyme assays and kinetic studies

Crude enzyme assays were performed in a 100 μl reaction volume containing 100 mM Tris-HCl buffer (pH 7.0), 0.5 mM sugar donor (UDP-glucose or UDP-glucuronic acid), 5 μl of extracted protein and 100 μM substrate. The reaction was incubated for 2 h at 37 °C. Methanol was then added to a final concentration of 70% to quench the reaction. The reaction mixture was filtered through a 0.2 μm Millipore filter and analyzed by LC–MS.
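As a worked illustration of the quenching step, the volume of methanol needed to bring a reaction to a given final concentration follows from a simple dilution balance. The sketch below assumes that pure methanol is added, that the concentration is expressed as v/v, and that volumes are additive; none of these details is stated in the protocol, and the function name is illustrative.

```python
def quench_volume(reaction_ul, final_fraction, stock_fraction=1.0):
    """Volume of quencher (in µl) to add so that it reaches final_fraction (v/v).

    Solves stock_fraction * V / (reaction_ul + V) = final_fraction for V,
    assuming additive volumes (an approximation for methanol/water mixtures).
    """
    if not 0 < final_fraction < stock_fraction <= 1:
        raise ValueError("need 0 < final_fraction < stock_fraction <= 1")
    return reaction_ul * final_fraction / (stock_fraction - final_fraction)


# For a 100 µl reaction and a 70% final concentration of pure methanol:
methanol_ul = quench_volume(100.0, 0.70)
```

Under these assumptions, a 100 μl reaction needs roughly 233 μl of methanol.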

For kinetic measurements, the substrate was used at concentrations ranging from 0.5 to 200 μM, and the reaction time was reduced to 10 min. Km and Vmax values were calculated from the Eadie-Hofstee plot.
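The Eadie-Hofstee plot linearizes the Michaelis-Menten equation as v = Vmax − Km·(v/[S]), so a straight-line fit of v against v/[S] gives Vmax as the intercept and −Km as the slope. The sketch below illustrates that calculation with an ordinary least-squares fit; the data are synthetic and the function name is illustrative, not the software actually used in this study.

```python
def eadie_hofstee(substrate, rates):
    """Estimate (Km, Vmax) by least-squares fit of v against v/[S].

    Eadie-Hofstee form: v = Vmax - Km * (v / [S]).
    """
    x = [v / s for s, v in zip(substrate, rates)]
    y = list(rates)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept  # Km, Vmax


# Synthetic, noise-free example (hypothetical values, not measured data):
# rates generated from Michaelis-Menten with Km = 20 µM and Vmax = 5.
conc = [0.5, 2.0, 10.0, 50.0, 200.0]
rate = [5.0 * s / (20.0 + s) for s in conc]
km, vmax = eadie_hofstee(conc, rate)
```

With real data, measurement error in v affects both axes of this plot, which is one reason nonlinear regression of the Michaelis-Menten equation is often used alongside it.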

Standard compounds

Baicalein and baicalin were purchased from Sigma-Aldrich (St. Louis, MO, USA), and oroxin A was purchased from Yuanye-Biotech (Shanghai, China). Baicalein was dissolved in dimethyl sulfoxide (DMSO), while baicalin and oroxin A were dissolved in methanol.

Metabolite analyses

Metabolites were analyzed using an Agilent 1260 Infinity II HPLC (high-performance liquid chromatography) system. Chromatographic separation was carried out on a Phenomenex Luna C18 (2) column (100 × 2 mm, 3 μm) with a guard column. The mobile phase consisted of 0.1% (v/v) formic acid in water (A) and 1:1 acetonitrile/MeOH + 0.1% formic acid (B), and the flow rate was set to 0.26 mL/min. The gradient program was as follows: 0–3 min, 20% B; 20 min, 50% B; 20–30 min, 50% B; 36 min, 30% B; 37 min, 20% B; and 37–43 min, 20% B. The detection wavelength was 280 nm. The injection volume was 20 μl and the column temperature was 35 °C. The products of enzyme assays were quantified by comparing the areas of the individual peaks with standard curves obtained from standard compounds.
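Quantification against a standard curve can be sketched as follows. The example assumes a linear relationship between peak area and concentration, which is a common choice but is not specified in the text; the calibration values are hypothetical and the function names are illustrative.

```python
def fit_standard_curve(concentrations, areas):
    """Least-squares fit of peak area = slope * concentration + intercept."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_a = sum(areas) / n
    scc = sum((c - mean_c) ** 2 for c in concentrations)
    sca = sum((c - mean_c) * (a - mean_a) for c, a in zip(concentrations, areas))
    slope = sca / scc
    intercept = mean_a - slope * mean_c
    return slope, intercept


def quantify(area, slope, intercept):
    """Convert a measured peak area into a concentration using the fitted curve."""
    return (area - intercept) / slope


# Hypothetical calibration points (concentration in µM, peak area at 280 nm).
std_conc = [0.0, 10.0, 50.0, 100.0]
std_area = [3.0, 123.0, 603.0, 1203.0]
slope, intercept = fit_standard_curve(std_conc, std_area)
sample_conc = quantify(603.0, slope, intercept)
```

In practice a sample area should fall within the range covered by the standards, since the fit is not validated outside it.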

LC–MS/MS was carried out on a Thermo Q Exactive Plus. Chromatographic separation was carried out on a Phenomenex Luna C18 (2) column (100 × 2 mm, 3 μm) using the same gradient described above. Mass spectra were acquired in negative ion mode with a heated ESI source, and the parameters were as follows: aux. gas flow, 10 L/min; aux. gas heater, 350 °C; sheath gas flow, 40 L/min; spray voltage, 3.5 kV; capillary temperature, 320 °C. For full-scan MS/data-dependent (ddMS2) analysis, spectra were recorded in the m/z range of 50–750 at a resolution of 17,500 with automatic gain control (AGC) targets of 1 × 10^6 and 2 × 10^5, respectively.

Reference

Bradford, M. M. (1976). A rapid and sensitive method for the quantitation of microgram quantities of protein using the principle of protein dye binding. Anal. Biochem. 72, 248-254.

Eddy, S. R. (1998). Profile hidden Markov models. Bioinformatics. 14, 755.

Kumar, S., Stecher, G., Li, M., Knyaz, C. and Tamura, K. (2018). MEGA X: molecular evolutionary genetics analysis across computing platforms. Molecular Biology and Evolution. 35, 1547-1549.

Supplementary Figure 1. Representative 7-O flavonoid glycosides detected from roots of S. baicalensis. Red boxes indicate the groups that differ between the sugar moieties.

Supplementary Figure 2. Phylogenetic tree of SbUBGT and SbUBGAT proteins with UGTs from Arabidopsis thaliana. The Neighbor-Joining method was used to construct the tree with bootstrap support (n=3000). Circles before the labels represent candidate genes from S. baicalensis; triangles before the labels represent functional UGTs that have been reported. A glucuronosyltransferase (UGT1A1) from Homo sapiens was used as an outgroup.

Supplementary Figure 3. Alignment of SbUGT and SbUGAT protein sequences. The consensus sequences are highlighted in red. The arrows indicate the amino acid residues that differ between SbUGTs and SbUGATs and are responsible for the functional divergence between these two types of glycosyltransferases.

Supplementary Figure 4. SDS-PAGE analysis of purified SbUGT and SbUGAT proteins. a. Lanes from left to right show protein markers (M), SbUGT1 (1), SbUGT2 (2), SbUGT3 (3), SbUGT7 (4), SbUGT8 (5) and SbUGT9 (6). b. Lanes from left to right show protein markers (M), SbUGAT3 (1), SbUGAT4 (2), SbUGAT5 (3) and SbUGAT6 (4).

Supplementary Figure 5. Nonlinear regressions of the Michaelis−Menten equation for SbUBGTs and SbUBGATs

Supplementary Table 1. Flavonoid glycosides detected in the root metabolome

| Compounds | Molecular Weight (Da) | Formula | Ionization model | Class | CAS | R1 | R2 | R3 | YR1 | YR2 | YR3 | mix01 | mix02 | mix03 |
| Baicalin | 446.07 | C21H18O11 | [M+H]+ | Flavonoid | 21967-41-9 | 1.85E+08 | 1.80E+08 | 1.67E+08 | 2.19E+08 | 1.16E+08 | 1.76E+08 | 1.64E+08 | 1.65E+08 | 1.57E+08 |
| 5-O-glucoside | 432.09 | C21H20O10 | [M+H]+ | Flavonoid | 28757-27-9 | 9.00E+00 | 9.00E+00 | 8.86E+06 | 4.48E+07 | 1.17E+08 | 9.00E+00 | 3.97E+07 | 4.49E+07 | 3.87E+07 |
| O-malonylhexoside | 548.09 | C25H24O14 | [M+H]+ | Flavonoid | - | 5.53E+07 | 1.91E+07 | 1.76E+07 | 3.33E+07 | 2.67E+07 | 5.26E+07 | 4.44E+07 | 3.65E+07 | 3.28E+07 |
| 7-O-galactoside | 462.10 | C22H22O11 | [M+H]+ | Flavonoid | - | 7.79E+07 | 4.48E+07 | 3.59E+07 | 9.27E+06 | 3.70E+07 | 2.23E+07 | 3.87E+07 | 3.69E+07 | 3.30E+07 |
| Wogonoside | 460.10 | C26H20O8 | [M+H]+ | Flavonoid | - | 3.25E+07 | 3.29E+07 | 1.83E+07 | 4.07E+07 | 2.06E+07 | 2.42E+07 | 2.92E+07 | 2.85E+07 | 2.70E+07 |
| Tetrahydroxyflavone 7-O-β-D-glucuronide | 462.06 | C21H18O12 | [M-H]- | Flavonoid | - | 3.22E+06 | 4.07E+06 | 2.38E+06 | 1.16E+07 | 8.64E+06 | 1.22E+07 | 7.39E+06 | 7.83E+06 | 8.06E+06 |
| 7-O-β-D-glucuronide | 448.08 | C21H20O11 | [M-H]- | Flavonoid | - | 3.21E+06 | 4.98E+06 | 6.38E+06 | 4.89E+06 | 1.03E+07 | 9.87E+06 | 7.39E+06 | 7.37E+06 | 7.57E+06 |
| 7-O-glucuronide | 460.10 | C22H20O11 | [M+H]+ | Flavonoid | - | 1.00E+07 | 9.51E+06 | 6.79E+06 | 8.86E+06 | 4.96E+06 | 2.58E+06 | 7.76E+06 | 7.32E+06 | 6.98E+06 |
| Luteolin 7-O-glucoside () | 448.08 | C21H20O11 | [M+H]+ | Flavonoid | 5373-11-5 | 3.00E+06 | 4.78E+06 | 6.81E+06 | 5.50E+06 | 1.64E+07 | 1.42E+07 | 6.39E+06 | 7.11E+06 | 7.41E+06 |
| 7-O-glucuronide | 430.09 | C21H18O10 | [M+H]+ | Flavonoid | - | 5.65E+06 | 7.95E+06 | 5.97E+06 | 1.12E+07 | 4.59E+06 | 6.10E+06 | 6.35E+06 | 6.09E+06 | 5.17E+06 |
| Apigenin 7-O-glucoside (Cosmosiin) | 432.09 | C21H20O10 | [M-H]- | Flavonoid | 578-74-5 | 2.81E+05 | 1.73E+05 | 3.41E+05 | 5.44E+06 | 9.75E+06 | 1.29E+05 | 4.67E+06 | 4.91E+06 | 4.37E+06 |
| Viscidulin Ⅲ 2-O-β-D-glucopyranoside | 492.13 | C23H24O12 | [M+H]+ | Flavonoid | - | 4.24E+06 | 8.83E+05 | 8.94E+06 | 2.25E+06 | 5.84E+06 | 3.01E+06 | 3.97E+06 | 3.83E+06 | 3.61E+06 |
|  | 462.06 | C21H18O12 | [M+H]+ | Flavonoid | 27740-01-8 | 1.32E+06 | 2.00E+06 | 1.41E+06 | 6.12E+06 | 5.96E+06 | 7.91E+06 | 2.71E+06 | 3.16E+06 | 3.60E+06 |
| Luteolin 7-O-glucuronide | 462.06 | C21H18O12 | [M+H]+ | Flavonoid | 29741-10-4 | 1.54E+06 | 2.00E+06 | 1.18E+06 | 6.06E+06 | 5.72E+06 | 7.66E+06 | 2.73E+06 | 2.97E+06 | 3.14E+06 |
| 5,7-dihydroxy-2-methoxyflavone 7-O-glucuronide | 462.08 | C21H18O12 | [M+H]+ | Flavonoid | - | 1.51E+06 | 1.85E+06 | 1.23E+06 | 6.01E+06 | 5.66E+06 | 7.28E+06 | 2.75E+06 | 3.10E+06 | 2.98E+06 |
| 5,7-dihydroxy-6,8-dimethoxyflavone 7-O-glucoside | 490.11 | C23H22O12 | [M+H]+ | Flavonoid | - | 5.56E+06 | 2.63E+06 | 1.98E+06 | 3.45E+06 | 1.89E+06 | 2.44E+06 | 2.98E+06 | 2.75E+06 | 2.56E+06 |
| Chrysoeriol 7-O-[β-D-glucuronopyranosyl-(1→2)-O-β-D-glucuronopyranoside] | 652.10 | C28H28O18 | [M+H]+ | Flavonoid | - | 3.95E+06 | 2.84E+06 | 1.36E+06 | 2.78E+06 | 1.29E+06 | 3.22E+06 | 2.65E+06 | 2.39E+06 | 2.32E+06 |
|  | 578.14 | C27H30O14 | [M+H]+ | Flavonoid | 17306-46-6 | 4.38E+06 | 2.30E+06 | 2.94E+06 | 1.27E+06 | 2.30E+06 | 1.37E+06 | 2.92E+06 | 2.31E+06 | 1.94E+06 |
| Diosmetin 7-O-glucuronide | 476.08 | C22H20O12 | [M+H]+ | Flavonoid | 35110-20-4 | 3.32E+06 | 3.38E+06 | 1.51E+06 | 1.72E+06 | 1.28E+06 | 1.52E+06 | 1.71E+06 | 1.73E+06 | 1.60E+06 |
| Apigenin 7-O-(6'-O-acetyl)-β-D-glucoside | 474.10 | C23H22O11 | [M+H]+ | Flavonoid | - | 1.63E+06 | 1.10E+06 | 7.50E+05 | 1.63E+06 | 7.53E+05 | 2.16E+06 | 1.48E+06 | 1.40E+06 | 1.21E+06 |
| Diosmetin 7-O-(6'-O-malonyl)-β-D-glucoside | 548.09 | C25H24O14 | [M+H]+ | Flavonoid | - | 1.31E+06 | 1.94E+06 | 2.67E+06 | 3.81E+04 | 1.05E+05 | 2.63E+05 | 1.19E+06 | 1.02E+06 | 8.78E+05 |
| O-glucuronic acid | 460.08 | C22H20O11 | [M-H]- | Flavonoid | - | 5.01E+05 | 5.77E+05 | 3.68E+05 | 1.18E+06 | 1.04E+06 | 1.14E+06 | 9.08E+05 | 9.83E+05 | 9.11E+05 |
| Acacetin 7-O-rutinoside | 592.15 | C28H32O14 | [M+H]+ | Flavonoid | 480-36-4 | 2.82E+05 | 3.88E+05 | 1.87E+05 | 7.08E+05 | 4.33E+05 | 9.41E+05 | 5.89E+05 | 6.42E+05 | 5.55E+05 |
| Scuteamoenoside | 464.13 | C22H24O11 | [M+H]+ | Flavonoid | 123914-35-2 | 6.01E+05 | 7.58E+05 | 3.02E+05 | 8.41E+05 | 3.36E+05 | 4.02E+05 | 6.43E+05 | 5.38E+05 | 4.79E+05 |
| Chrysoeriol 5-O-hexoside | 462.10 | C22H22O11 | [M-H]- | Flavonoid | - | 6.70E+05 | 4.59E+05 | 6.26E+05 | 4.67E+05 | 3.80E+05 | 4.94E+05 | 4.05E+05 | 4.78E+05 | 3.64E+05 |
| Luteolin O-sinapoylhexoside | 654.13 | C32H30O15 | [M+H]+ | Flavonoid | - | 1.60E+05 | 9.47E+04 | 2.08E+05 | 5.25E+04 | 1.17E+06 | 4.10E+05 | 4.18E+05 | 4.55E+05 | 3.02E+05 |
| Chrysoeriol O-hexosyl-O-hexoside | 624.14 | C28H32O16 | [M+H]+ | Flavonoid | - | 1.19E+06 | 1.92E+05 | 3.76E+05 | 1.15E+05 | 1.39E+05 | 1.41E+05 | 4.36E+05 | 3.22E+05 | 3.16E+05 |
| Chrysoeriol O-glucuronic acid-O-hexoside | 638.12 | C28H30O17 | [M+H]+ | Flavonoid | - | 1.38E+06 | 2.53E+05 | 1.43E+05 | 8.79E+04 | 1.08E+05 | 1.69E+05 | 3.13E+05 | 3.13E+05 | 2.79E+05 |
| Luteolin 7,3'-O-β-D-diglucoside | 610.13 | C27H30O16 | [M+H]+ | Flavonoid | 257-724-7 | 7.19E+04 | 1.12E+05 | 3.39E+05 | 2.95E+05 | 3.25E+05 | 4.74E+05 | 2.93E+05 | 3.13E+05 | 2.36E+05 |
| Chrysoeriol 7-O-hexoside | 462.10 | C22H22O11 | [M-H]- | Flavonoid | - | 9.97E+05 | 2.88E+05 | 2.46E+05 | 9.00E+00 | 5.78E+05 | 1.49E+05 | 2.44E+05 | 2.66E+05 | 3.03E+05 |
| Luteolin 7-O-β-D-rutinoside | 594.13 | C27H30O15 | [M-H]- | Flavonoid | - | 7.16E+04 | 3.92E+04 | 8.68E+04 | 9.85E+04 | 2.32E+05 | 1.31E+05 | 1.18E+05 | 1.12E+05 | 1.09E+05 |
| Luteolin 3',7-O-diglucoside | 610.13 | C27H30O16 | [M+H]+ | Flavonoid | 52187-80-1 | 3.72E+04 | 3.22E+04 | 1.06E+05 | 1.43E+05 | 1.82E+05 | 2.10E+05 | 1.03E+05 | 1.17E+05 | 1.13E+05 |
| Chrysoeriol 7-O-rutinoside | 608.15 | C28H32O15 | [M-H]- | Flavonoid | - | 2.18E+04 | 3.62E+04 | 1.08E+04 | 5.57E+04 | 3.92E+04 | 2.83E+04 | 3.63E+04 | 4.38E+04 | 3.47E+04 |
| 7-O-glucoside | 448.08 | C21H20O11 | [M-H]- | Flavonols | 16290-07-6 | 1.65E+06 | 1.21E+06 | 3.94E+06 | 3.21E+06 | 2.86E+06 | 3.10E+06 | 3.08E+06 | 3.58E+06 | 2.99E+06 |
| Isorhamnetin 3-O-rutinoside | 624.14 | C28H32O16 | [M-H]- | Flavonols | 604-80-8 | 2.02E+05 | 3.13E+05 | 3.47E+05 | 8.98E+05 | 1.43E+06 | 1.18E+06 | 1.09E+06 | 1.05E+06 | 9.98E+05 |
| 3,7-O-β-D-diglucoside | 626.12 | C27H30O17 | [M+H]+ | Flavonols | - | 2.28E+05 | 4.96E+05 | 7.13E+05 | 4.66E+05 | 1.82E+06 | 1.40E+06 | 9.09E+05 | 7.92E+05 | 8.64E+05 |
| Syringetin 3-O-hexoside | 508.10 | C23H24O13 | [M+H]+ | Flavonols | - | 3.43E+05 | 2.32E+05 | 6.13E+05 | 4.76E+05 | 3.13E+06 | 5.82E+05 | 6.36E+05 | 7.73E+05 | 7.82E+05 |
| 6-hydroxykaempferol 6,7-O-diglucoside | 626.12 | C27H30O17 | [M+H]+ | Flavonols | - | 1.75E+05 | 3.37E+05 | 5.97E+05 | 2.84E+05 | 7.42E+05 | 6.32E+05 | 4.99E+05 | 4.59E+05 | 4.11E+05 |
| Kaempferol 3-O-rutinoside (Nicotiflorin) | 594.13 | C27H30O15 | [M-H]- | Flavonols | 17650-84-9 | 5.75E+04 | 6.99E+04 | 8.94E+04 | 6.43E+04 | 1.75E+05 | 1.38E+05 | 1.24E+05 | 1.22E+05 | 1.26E+05 |
| Kaempferol 3-O-robinobioside (Biorobin) | 594.13 | C27H30O15 | [M-H]- | Flavonols | 17297-56-2 | 4.44E+04 | 6.53E+04 | 9.75E+04 | 9.09E+04 | 1.66E+05 | 1.36E+05 | 1.05E+05 | 1.26E+05 | 1.30E+05 |
| Quercetin 7-O-(6'-O-malonyl)-β-D-glucoside | 550.07 | C24H22O15 | [M+H]+ | Flavonols | - | 2.15E+05 | 1.13E+05 | 5.77E+04 | 7.38E+04 | 1.28E+05 | 7.40E+04 | 1.32E+05 | 1.21E+05 | 8.31E+04 |
| Kaempferol 3-O-glucoside-7-O-rhamnoside | 594.13 | C27H30O15 | [M+H]+ | Flavonols | - | 3.97E+04 | 1.01E+05 | 5.02E+04 | 4.53E+04 | 1.67E+05 | 7.78E+04 | 8.09E+04 | 7.71E+04 | 7.19E+04 |
| 6-hydroxykaempferol 3,6-O-diglucoside | 626.12 | C27H30O17 | [M+H]+ | Flavonols | - | 1.18E+05 | 6.36E+04 | 7.73E+04 | 1.39E+04 | 1.10E+05 | 7.83E+04 | 6.71E+04 | 5.93E+04 | 5.57E+04 |
| Isorhamnetin 3-O-β-(2''-O-acetyl-β-D-glucuronide) | 534.08 | C24H22O14 | [M-H]- | Flavonols | - | 1.62E+04 | 2.49E+04 | 2.90E+04 | 7.61E+03 | 3.92E+04 | 1.52E+04 | 4.11E+04 | 3.51E+04 | 2.51E+04 |
| Hyperin | 464.08 | C21H20O12 | [M-H]- | Flavonols | 482-36-0 | 2.40E+04 | 3.29E+04 | 3.11E+04 | 1.69E+04 | 5.53E+04 | 2.23E+04 | 2.78E+04 | 3.48E+04 | 2.21E+04 |
| Isorhamnetin O-acetyl-hexoside | 520.10 | C24H24O13 | [M-H]- | Flavonols | - | 2.12E+04 | 2.40E+04 | 2.48E+04 | 1.38E+04 | 3.83E+04 | 1.20E+04 | 2.55E+04 | 2.86E+04 | 2.46E+04 |
| Kaempferide 3-O-β-D-glucuronide | 476.08 | C22H20O12 | [M-H]- | Flavonols | - | 3.08E+04 | 2.93E+04 | 2.67E+04 | 4.44E+03 | 2.57E+04 | 2.01E+03 | 1.68E+04 | 1.44E+04 | 1.64E+04 |
| Eriodictyol 7-O-glucoside | 450.10 | C21H22O11 | [M+H]+ | Dihydroflavone | 38965-51-4 | 4.68E+06 | 7.24E+06 | 1.04E+07 | 8.71E+06 | 2.33E+07 | 1.91E+07 | 1.16E+07 | 1.13E+07 | 1.16E+07 |
| 7-O-glucoside | 434.10 | C21H22O10 | [M-H]- | Dihydroflavone | 529-55-5 | 5.82E+04 | 3.79E+04 | 8.41E+04 | 1.48E+05 | 1.66E+05 | 9.11E+04 | 1.17E+05 | 1.47E+05 | 1.06E+05 |
| Naringin | 580.15 | C27H32O14 | [M-H]- | Dihydroflavone | 10236-47-2 | 9.83E+04 | 3.38E+04 | 1.24E+05 | 2.74E+04 | 2.93E+04 | 9.00E+00 | 5.63E+04 | 5.25E+04 | 8.70E+04 |
|  | 522.11 | C24H26O13 | [M+H]+ |  | 491-74-7 | 1.01E+07 | 6.51E+06 | 1.61E+07 | 3.08E+07 | 4.76E+07 | 3.90E+07 | 2.84E+07 | 2.61E+07 | 2.27E+07 |
| Calycosin 7-O-glucoside | 446.10 | C22H22O10 | [M+H]+ | Isoflavones | 20633-67-4 | 5.94E+06 | 1.62E+07 | 1.12E+07 | 1.26E+07 | 2.08E+07 | 2.35E+07 | 1.46E+07 | 1.41E+07 | 1.36E+07 |
| Iristectorin B | 492.11 | C23H24O12 | [M+H]+ | Isoflavones | 94396-09-5 | 1.15E+07 | 2.55E+06 | 2.57E+07 | 5.93E+06 | 1.51E+07 | 7.57E+06 | 1.30E+07 | 1.06E+07 | 9.57E+06 |
| Engeletin | 434.10 | C21H22O10 | [M-H]- | Dihydroflavonol | 572-31-6 | 2.38E+05 | 2.49E+05 | 2.50E+05 | 4.37E+05 | 2.84E+05 | 2.50E+05 | 3.84E+05 | 3.92E+05 | 3.02E+05 |
| Malvidin 3-O-galactoside | 493.11 | C23H25O12 | [M]+ | Anthocyanins | 30113-37-2 | 1.96E+05 | 6.76E+05 | 1.13E+06 | 9.00E+00 | 4.98E+07 | 5.72E+07 | 1.28E+07 | 1.44E+07 | 1.38E+07 |
| Malvidin 3-O-glucoside (Oenin) | 493.11 | C23H25O12 | [M]+ | Anthocyanins | 18470-06-9 | 1.11E+05 | 2.97E+05 | 6.17E+05 | 9.00E+00 | 2.58E+07 | 2.79E+07 | 7.09E+06 | 7.08E+06 | 7.28E+06 |
| Cyanidin 3-O-malonylhexoside | 534.08 | C24H22O14 | [M+H]+ | Anthocyanins | - | 1.34E+06 | 1.86E+06 | 9.00E+00 | 4.15E+06 | 3.38E+05 | 5.35E+05 | 9.76E+05 | 8.68E+05 | 1.02E+06 |
| Cyanidin 3-O-glucoside (Kuromanin) | 449.09 | C21H21O11 | [M]+ | Anthocyanins | 7084-24-4 | 5.02E+05 | 4.76E+05 | 9.00E+00 | 3.62E+06 | 9.00E+00 | 9.00E+00 | 7.47E+05 | 8.12E+05 | 8.16E+05 |
| Malvidin O-hexoside | 492.11 | C23H24O12 | [M+H]+ | Anthocyanins | - | 1.37E+04 | 3.86E+04 | 5.99E+04 | 9.00E+00 | 2.50E+06 | 2.98E+06 | 7.10E+05 | 7.59E+05 | 7.07E+05 |
| Cyanin chloride | 611.13 | C27H31O16 | [M]+ | Anthocyanins | 2611-67-8 | 6.57E+04 | 1.35E+05 | 4.33E+05 | 3.04E+05 | 1.50E+06 | 1.28E+06 | 5.58E+05 | 5.33E+05 | 5.43E+05 |
| Cyanidin 3-O-galactoside | 448.08 | C21H20O11 | [M+H]+ | Anthocyanins | 27661-36-5 | 1.06E+05 | 1.25E+05 | 9.11E+05 | 3.89E+05 | 9.00E+00 | 9.00E+00 | 3.05E+05 | 2.98E+05 | 2.39E+05 |
| Malvidin 3,5-diglucoside (Malvin) | 655.16 | C29H35O17 | [M]+ | Anthocyanins | 16727-30-3 | 9.00E+00 | 9.00E+00 | 9.00E+00 | 9.00E+00 | 2.04E+05 | 2.44E+05 | 6.77E+04 | 6.77E+04 | 7.32E+04 |
| Isoschaftoside | 564.12 | C26H28O14 | [M+H]+ | Flavonoid carbonoside | 52012-29-0 | 1.12E+07 | 7.15E+06 | 9.52E+06 | 1.34E+07 | 3.48E+07 | 1.88E+07 | 1.66E+07 | 1.50E+07 | 1.44E+07 |
| Apigenin 6,8-C-diglucoside | 594.13 | C27H30O15 | [M+H]+ | Flavonoid carbonoside | 23666-13-9 | 5.73E+05 | 3.35E+05 | 7.33E+05 | 9.03E+05 | 1.94E+06 | 1.07E+06 | 9.78E+05 | 8.52E+05 | 9.05E+05 |
| Apigenin C-hexosyl-O-pentoside | 564.12 | C26H28O14 | [M+H]+ | Flavonoid carbonoside | - | 4.15E+05 | 3.05E+05 | 4.45E+05 | 5.67E+05 | 1.26E+06 | 7.16E+05 | 6.06E+05 | 6.60E+05 | 5.41E+05 |
| 8-C-hexosyl-hesperetin-O-hexoside | 626.16 | C28H34O16 | [M+H]+ | Flavonoid carbonoside | - | 1.26E+05 | 2.14E+05 | 3.47E+05 | 1.83E+05 | 8.99E+05 | 8.04E+05 | 4.18E+05 | 4.13E+05 | 3.89E+05 |
| 8-C-hexosyl-luteolin-O-pentoside | 580.12 | C26H28O15 | [M+H]+ | Flavonoid carbonoside | - | 3.95E+05 | 2.47E+05 | 1.21E+06 | 1.05E+05 | 2.35E+05 | 1.50E+05 | 3.58E+05 | 3.44E+05 | 3.27E+05 |
| 8-C-hexosyl-apigenin-O-feruloylhexoside | 770.17 | C37H38O18 | [M+H]+ | Flavonoid carbonoside | - | 7.25E+03 | 1.94E+04 | 1.68E+04 | 4.18E+04 | 4.19E+04 | 7.87E+04 | 4.05E+04 | 2.52E+04 | 2.39E+04 |
| Schaftoside | 564.12 | C26H28O14 | [M+H]+ | Flavonoid carbonoside | 51938-32-0 | 2.34E+04 | 1.50E+04 | 2.24E+04 | 1.93E+04 | 8.14E+04 | 3.21E+04 | 1.94E+04 | 2.39E+04 | 1.91E+04 |

Supplementary Table 2. The list of enzyme names, gene loci, and subfamilies of predicted 7-O glycosyltransferases in S. baicalensis

| Enzyme names | Gene locus | Subfamily |
| SbUBGT | | |
| SbUGT1 | Sb06g31800 | UGT73B |
| SbUGT2 | Sb06g31810 | UGT73B |
| SbUGT3 | Sb06g31820 | UGT73B |
| SbUGT4 | Sb01g11070 | UGT73C |
| SbUGT5 | Sb01g11080 | UGT73C |
| SbUGT6 | Sb03g36110 | UGT73D |
| SbUGT7 | Sb03g36130 | UGT73D |
| SbUGT8 | Sb03g36140 | UGT73D |
| SbUGT9 | Sb03g36150 | UGT73D |
| SbUGT10 | Sb03g36160 | UGT73D |
| SbUBGAT | | |
| SbUGAT1.1 | Sb01g50771.p1 | UGT88D |
| SbUGAT1.2 | Sb01g50771.p2 | UGT88D |
| SbUGAT2 | Sb01g50791 | UGT88D |
| SbUGAT3 | Sb01g51711 | UGT88D |
| SbUGAT4 | Sb01g56811 | UGT88D |
| SbUGAT5 | Sb01g56821 | UGT88D |
| SbUGAT6 | Sb09g13460 | UGT88D |

Supplementary Table 3. Primers used for the cloning of SbUGT and SbUGAT genes

| Gene names | Forward (5′-3′) | Reverse (5′-3′) |
| SbUGT2 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGGACAACTCCACATAGTCC | GGGGACCACTTTGTACAAGAAAGCTGGGTTTTAGTTTAAGCCCTGTTTCATAGG |
| SbUGT3 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGAAGAGCTACATATTGTCCTTC | GGGGACCACTTTGTACAAGAAAGCTGGGTTTTATGCCCTGTTTCTTAGGAGTG |
| SbUGT4 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGCTTCCCAAGTTGATGA | GGGGACCACTTTGTACAAGAAAGCTGGGTTTCAAGAAATAGTAAGTTGTTGCAC |
| SbUGT5 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGATCAACTCCACTTTCTTATG | GGGGACCACTTTGTACAAGAAAGCTGGGTTTTAATTTGCTTCCATGAGATGTTG |
| SbUGT6 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGCGTCCCATTCAAACCT | GGGGACCACTTTGTACAAGAAAGCTGGGTTCTATTTCAAGGTGTGTTTCAAAA |
| SbUGT7 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGCCACTCAACCTTGCC | GGGGACCACTTTGTACAAGAAAGCTGGGTTTCAAGAAATGCAAGGTGATGAT |
| SbUGT8 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGCAATCCATGGAGAAATAC | GGGGACCACTTTGTACAAGAAAGCTGGGTTTTAGACCCCATTATTGGCCT |
| SbUGT9 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGCAGTCCATGGAGAAATAC | GGGGACCACTTTGTACAAGAAAGCTGGGTTTCAATTATTCTGAACCATAATTTCA |
| SbUGAT1 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGAAAAATCAATGGAAGGCA | GGGGACCACTTTGTACAAGAAAGCTGGGTTTCAATGGTGGGCAGTAAAGAGA |
| SbUGAT2 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGATTTTTATAGTGCAGGAATGGG | GGGGACCACTTTGTACAAGAAAGCTGGGTTTTAATCAAGGACGGTGGAGTGAA |
| SbUGAT3 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGAAGAAGACACCATTGTAATT | GGGGACCACTTTGTACAAGAAAGCTGGGTTTCAATCAAGGGTGGCGGAGA |
| SbUGAT4 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGAAGACACACTTGTGATCTACA | GGGGACCACTTTGTACAAGAAAGCTGGGTTTTAATCCCGAGTGGCGAGAAG |
| SbUGAT5 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGCGGACACCATGGTTC | GGGGACCACTTTGTACAAGAAAGCTGGGTTTTAAACCCGAGTCACCGCC |
| SbUGAT6 | GGGGACAAGTTTGTACAAAAAAGCAGGCTTCATGGAAGACACCATTGTTCTGTATG | GGGGACCACTTTGTACAAGAAAGCTGGGTTTTAATCCCGAGTGGCGAGAAG |
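Each primer in Supplementary Table 3 consists of a Gateway attB adapter followed by a gene-specific region. The sketch below splits a primer into these two parts, using the SbUGT2 pair as an example with the line-wrapped fragments from the table joined in order. The attB1 and attB2 adapter sequences are the standard Gateway ones; the assignment of wrapped fragments to the forward or reverse primer is our reading of the table layout, and the function name is illustrative.

```python
# Standard Gateway adapters, written as the fragments seen in the table.
ATTB1 = "GGGGACAAGTTTGTACAAAAA" + "AGCAGGCT"
ATTB2 = "GGGGACCACTTTGTACAAGAAAGCT" + "GGGT"

# SbUGT2 primers, joined from the wrapped fragments in Supplementary Table 3.
SBUGT2_FWD = "GGGGACAAGTTTGTACAAAAA" + "AGCAGGCTTCATGGGACAACTC" + "CACATAGTCC"
SBUGT2_REV = "GGGGACCACTTTGTACAAGAAAGCT" + "GGGTTTTAGTTTAAGCCCTGTTTCAT" + "AGG"


def split_primer(primer, adapter):
    """Return (adapter, remainder) if the primer starts with the adapter."""
    primer = primer.upper().replace(" ", "")
    if not primer.startswith(adapter):
        raise ValueError("primer does not start with the expected attB adapter")
    return adapter, primer[len(adapter):]


_, fwd_specific = split_primer(SBUGT2_FWD, ATTB1)
_, rev_specific = split_primer(SBUGT2_REV, ATTB2)
# In this example the forward remainder includes the ATG start codon of the ORF.
has_start_codon = "ATG" in fwd_specific
```

The same check can be applied to the other primer pairs to confirm that the fragments were joined in the right order.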