
FlavonQ: An Automated Data Processing Tool for Profiling Flavone/flavonol Glycosides Using Ultra High-performance Liquid Chromatography Diode Array Detection and High-Resolution Accurate-Mass Mass Spectrometry (UHPLC HRAM-MS)

Supporting Information

Mengliang Zhang †, Jianghao Sun †, and Pei Chen ∗

Food Composition and Methods Development Lab, Beltsville Human Nutrition Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland 20705-2350, United States

Table of Contents

Instruments
Sample Preparation
Table S1. Comparison of peak area measurement methods
Table S2. Typical flavone and flavonol aglycone groups and common neutral losses observed in HRMS
Table S3. Typical glycosyl groups and common neutral losses observed in HRMS
Table S4. Typical acyl groups and common neutral losses observed in HRMS
Table S5. Molar relative response factor based on rutin for flavone and flavonol glycosides
Table S6. Tentative identification result by FlavonQ for precursor ion m/z 755.1825
Table S7. Tentative identification result by FlavonQ for precursor ion m/z 755.1825 with characteristic ion m/z 285 in MS²
Table S8. Example of result output
Figure S1. Different peak area integration methods
Figure S2. Parameter optimization processing flowchart of ‘FlavonQ’
Figure S3. Screen shot for METLIN search result example of m/z 609.1467

†These authors contributed equally to this manuscript.

∗Corresponding author: Tel.: +1 301 504 8144; fax: +1 301 504 8314. E-mail address: [email protected]

Instruments. The UHPLC-DAD HRAM MS system consisted of an LTQ Orbitrap XL mass spectrometer with an Accela 1250 binary pump, a PAL HTC Accela TMO autosampler, a PDA detector (Thermo Fisher Scientific, San Jose, CA), and a G1316A column compartment (Agilent, Palo Alto, CA). The chromatographic separation was achieved on a UHPLC column (200 mm × 2.1 mm i.d., 1.9 µm, Hypersil Gold AQ RP-C18) (Thermo Fisher Scientific, Inc., Waltham, MA) with an HPLC/UHPLC pre-column filter (UltraShield, Analytical Scientific Instruments, Richmond, CA) at a flow rate of 0.3 mL/min. The mobile phase consisted of a combination of A (0.1% formic acid in water, v/v) and B (0.1% formic acid in acetonitrile, v/v). The linear gradient was from 4 to 20% B (v/v) at 40 min, to 35% B at 60 min, and to 100% B at 61 min, and was held at 100% B to 65 min. The PDA was set at 350 and 280 nm to record the peaks, and UV-vis spectra were recorded from 190 to 600 nm. The mass spectrometer was operated in negative ionization mode under the following conditions: sheath gas at 70 (arbitrary units), auxiliary and sweep gas at 15 (arbitrary units), spray voltage at 4.8 kV, capillary temperature at 300 °C, capillary voltage at 15 V, and tube lens at 70 V. The mass range was from m/z 100 to 1,500 with a resolution of 30,000, FTMS AGC target at 2 × 10⁵, FT-MS/MS AGC target at 1 × 10⁵, isolation width of 1.5 amu, and maximum ion injection time of 500 ms. The most intense ion was selected for the data-dependent scan to obtain its MS², MS³, and MS⁴ product ions with a normalized collision energy of 35%.
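As a rough illustration of the gradient program described above, the following Python sketch interpolates the mobile-phase composition at any time point. It assumes that the run starts at 4% B at 0 min and that every segment is linear; the names GRADIENT and percent_b are hypothetical and are not part of FlavonQ or the instrument software.

```python
# Gradient breakpoints (time in min, % B) transcribed from the text above.
# Assumption: the run starts at 4% B at 0 min and each segment is linear.
GRADIENT = [(0.0, 4.0), (40.0, 20.0), (60.0, 35.0), (61.0, 100.0), (65.0, 100.0)]


def percent_b(t_min):
    """Return % B at time t_min by linear interpolation between breakpoints."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the 0-65 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole run")


# Print the composition at a few illustrative time points.
for t in (0, 20, 40, 50, 60.5, 65):
    print(f"{t:>5} min: {percent_b(t):.1f}% B")
```

The remainder of the mobile phase at each time point is solvent A, so % A is simply 100 minus the returned value.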

MATLAB R2012b (MathWorks Inc., Natick, MA) was used to process the data. All calculations were performed on an Intel Core i7 3.4 GHz personal computer with 16 GB RAM running the Microsoft Windows 7 Professional x64 operating system (Microsoft Corp., Redmond, WA).

Sample Preparation. Each powdered sample (250 mg) was extracted with 5.00 mL of methanol/water (60:40, v/v) using sonication for 60 min at room temperature, and the slurry mixture was centrifuged at 2500 rpm for 15 min (IEC Clinical Centrifuge, Damon/IEC Division, Needham, MA). The supernatant was filtered through a 17 mm (0.45 µm) PVDF syringe filter (VWR Scientific, Seattle, WA), and 2 µL of the extract was used for each injection.

The rutin standard solutions at concentrations of 5, 25, 50, 100, and 250 ng/mL were prepared by diluting a 100 µg/mL rutin stock solution with methanol/water (60:40, v/v), and 10 µL of each standard solution was injected into the system.
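The arithmetic behind the rutin standards can be checked with a short Python sketch. It expresses the stock in ng/mL, then reports the overall dilution factor for each standard and the mass of rutin delivered by a 10 µL injection. The function names are illustrative only, and the sketch makes no claim about how the dilutions were actually carried out (for example, in one step or serially).

```python
STOCK_NG_PER_ML = 100 * 1000          # 100 µg/mL stock expressed in ng/mL
STANDARDS_NG_PER_ML = [5, 25, 50, 100, 250]
INJECTION_UL = 10                     # injection volume for standards, in µL


def dilution_factor(target_ng_per_ml):
    """Overall fold dilution needed to reach the target from the stock."""
    return STOCK_NG_PER_ML / target_ng_per_ml


def on_column_ng(conc_ng_per_ml, injection_ul):
    """Mass of rutin (ng) contained in one injection (1 mL = 1000 µL)."""
    return conc_ng_per_ml * injection_ul / 1000


for conc in STANDARDS_NG_PER_ML:
    print(f"{conc:>4} ng/mL: {dilution_factor(conc):>7.0f}-fold dilution, "
          f"{on_column_ng(conc, INJECTION_UL):.2f} ng on column")
```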


Table S1. Comparison of peak area measurement methods

Peak area ratios by different methods*

Peak ID | Retention time (min) | Drop perpendicular | Simple Gaussian fit | Manual | Second-derivative Gaussian transformation | Valley
14 | 11.12 | 0.2 | 0.2 | 0.6 | 0.2 | 0.7
15 | 11.33 | 0.4 | 0.4 | 0.9 | 0.4 | 1.1
16 | 11.71 | 10.5 | 10.8 | 8.9 | 13.5 | 8.6
17 | 12.32 | 2.1 | 1.9 | 2.1 | 0.2 | 2.6
18 | 12.59 | 13.2 | 13.4 | 13.0 | 12.1 | 10.1
19 | 13.38 | 11.4 | 11.3 | 10.9 | 12.1 | 7.1
20 | 13.73 | 2.6 | 2.5 | 5.8 | 3.3 | 3.3
21 | 14.19 | 9.3 | 9.2 | 9.5 | 10.8 | 6.9
22 | 14.59 | 6.4 | 6.3 | 6.7 | 7.3 | 4.4
23 | 15.19 | 7.5 | 7.4 | 7.1 | 9.4 | 4.8
24 | 15.91 | 4.6 | 4.6 | 3.0 | 0.1 | 3.5
25 | 16.13 | 31.9 | 31.9 | 31.6 | 30.6 | 47.1

* Each peak area is normalized to the sum of 11 peak areas.


Table S2. Typical flavone and flavonol aglycone groups and common neutral losses observed in HRMS

Aglycone groups | Name (typical) | MW (neutral loss, Da) | Formula
Mono-hydroxyflavone | 3-Hydroxyflavone | 238.0630 | C15H10O3
Di-hydroxyflavone | 7,8-Dihydroxyflavone | 254.0579 | C15H10O4
Hydroxymethoxyflavone | 7-Methoxyflavonol | 268.0735 | C16H12O4
Tri-hydroxyflavone | | 270.0528 | C15H10O5
Di-hydromethoxyflavone | Acacetin | 284.0685 | C16H12O5
Tetra-hydroxyflavone | ; | 286.0477 | C15H10O6
Tri-hydroxymethoxyflavone | ; Diosmetin | 300.0634 | C16H12O6
Penta-hydroxyflavone | ; ; | 302.0426 | C15H10O7
Tetra-hydroxymethoxyflavone | ; | 316.0583 | C16H12O7
Hexa-hydroxyflavone | ; | 318.0376 | C15H10O8
Penta-hydroxymethoxyflavone | | 332.0532 | C16H12O8
Hepta-hydroxyflavone | 8-Hydroxyquercetagetin | 334.0325 | C15H10O9


Table S3. Typical glycosyl groups and common neutral losses observed in HRMS

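Glycosyl groups | Name | MW (neutral loss, Da) | Formula
Monosaccharides | Xylose | 132.0422 | C5H8O4
Monosaccharides | Arabinose | 132.0422 | C5H8O4
Monosaccharides | Apiose | 132.0422 | C5H8O4
Monosaccharides | Rhamnose | 146.0579 | C6H10O4
Monosaccharides | Glucose | 162.0528 | C6H10O5
Monosaccharides | Galactose | 162.0528 | C6H10O5
Monosaccharides | Glucuronic acid | 176.0315 | C6H8O6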


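Table S4. Typical acyl groups and common neutral losses observed in HRMS

Acyl groups | Name | MW (neutral loss, Da) | Formula
Hydroxycinnamoyls | p-Coumaroyl | 146.0347 | C9H6O2
Hydroxycinnamoyls | Caffeoyl | 162.0317 | C9H6O3
Hydroxycinnamoyls | Feruloyl | 176.0473 | C10H8O3
Hydroxycinnamoyls | Sinapoyl | 206.0579 | C11H10O4
Hydroxybenzoyls | p-Hydroxybenzoyl | 120.0211 | C7H4O2
Hydroxybenzoyls | Galloyl | 152.0109 | C7H4O4
Dicarboxylic acid acyls | Oxalyl | 71.9847 | C2O3
Dicarboxylic acid acyls | Malonyl | 86.0004 | C3H2O3
Dicarboxylic acid acyls | Succinoyl | 100.0160 | C4H4O3
Dicarboxylic acid acyls | Maloyl | 116.0109 | C4H4O4
Other aliphatic and other acyls | Acetyl | 42.0105 | C2H2O
Other aliphatic and other acyls | Propionyl | 56.0262 | C3H4O
Other aliphatic and other acyls | Butyryl | 70.0418 | C4H6O
Other aliphatic and other acyls | Sulfate | 79.9568 | SO3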
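To show how the building blocks in Tables S2 to S4 combine, the Python sketch below adds the elemental formulas of one aglycone and several residues and derives the neutral monoisotopic mass and the [M-H]⁻ value. It is an illustration only: the atomic masses, the proton mass, and the helper names parse_formula and mono_mass are assumptions introduced here and are not taken from FlavonQ.

```python
import re
from collections import Counter

# Assumed monoisotopic masses (Da) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "S": 31.97207100}
PROTON = 1.00727647  # assumed proton mass used to form [M-H]-


def parse_formula(formula):
    """Parse a simple formula such as 'C6H10O5' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def mono_mass(counts):
    """Monoisotopic mass of a set of element counts."""
    return sum(MONO[element] * n for element, n in counts.items())


# Building blocks taken from the formula columns of Tables S2-S4.
aglycone = parse_formula("C15H10O6")   # tetra-hydroxyflavone
hexosyl = parse_formula("C6H10O5")     # glucose or galactose residue
coumaroyl = parse_formula("C9H6O2")    # p-coumaroyl residue

# The residues are neutral losses, so their formulas add directly.
total = aglycone + hexosyl + hexosyl + coumaroyl
formula = "".join(f"{element}{total[element]}" for element in ("C", "H", "O"))
neutral = mono_mass(total)
deprotonated = neutral - PROTON
print(formula, round(neutral, 4), round(deprotonated, 4))
```

The summed formula is C36H36O18, and the [M-H]⁻ value should land near m/z 755.183. The last decimal places depend on the atomic-mass constants and on how the lost proton is treated, both of which are assumptions in this sketch.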


Table S5. Molar relative response factor based on rutin for flavone and flavonol glycosides.

Compound name | Molecular weight | λmax (nm) | MRRF_R^354 *
rutin | 610 | 354 | 1
quercetin 3-O-glucoside | 464 | 354 | 1
quercetin 3-O-galactoside | 464 | 354 | 1
quercetin 3-O-rhamnoside | 448 | 354 | 1
quercetin 3-O-arabinosylglucoside | 596 | 354 | 1
isorhamnetin-3-O-glucoside | 478 | 354 | 1
isorhamnetin-3-O-rutinoside | 624 | 354 | 1
myricetin-3-O-rhamnoside | 464 | 352 | 1
kaempferol 3-O-glucoside | 448 | 348 | 1
kaempferol 3-O-rutinoside | 594 | 348 | 1
kaempferol 3-O-robinoside-7-O-rhamnoside | 740 | 348 | 1
syringetin-3-O-glucoside | 508 | 358 | 1
syringetin-3-O-galactoside | 508 | 358 | 1
quercetin | 302 | 372 | 1.05
myricetin | 318 | 372 | 1.05
 | 316 | 370 | 1.05
rhamnetin | 316 | 370 | 1.05
isorhamnetin | 316 | 370 | 1.05
syringetin | 346 | 374 | 1.05
kaempferol | 286 | 368 | 1.05
quercetin-4′-O-glucoside | 464 | 366 | 1.05
myricetin 3′,4′,5′-trimethoxy | 360 | 366 | 1.05
kaempferol 7-O-neohesperidoside | 594 | 366 | 1.05
robinetin | 302 | 362 | 1.05
luteolin | 286 | 348 | 1.2
luteolin 7-O-glucoside | 448 | 348 | 1.2
diosmin | 608 | 348 | 1.2
luteolin 7,3′-O-diglucoside | 610 | 342 | 1.2
diosmetin | 300 | 348 | 1.2
neodiosmin | 608 | 348 | 1.2
orientin | 448 | 348 | 1.2
homoorientin | 448 | 348 | 1.2
chrysoeriol | 300 | 348 | 1.2
luteolin 6-methoxy | 316 | 348 | 1.2
luteolin 6,7-dimethoxy | 330 | 346 | 1.2
apigenin | 270 | 336 | 0.97
apigenin 7-O-glucoside | 432 | 336 | 0.97
rhoifolin | 578 | 336 | 0.97
genkwanin | 284 | 336 | 0.97
acacetin | 284 | 334 | 0.97
isorhoifolin | 578 | 336 | 0.97
vitexin | 578 | 336 | 0.97
isovitexin | 432 | 336 | 0.97
isovitexin-7-O-glucoside | 594 | 336 | 0.97
scutellarein | 286 | 336 | 0.97
scutellarin | 462 | 336 | 0.97
linarin | 592 | 334 | 0.97
fortunellin | 592 | 334 | 0.97
hinokiflavone (3′,6-biapigenin) | 538 | 338 | 1.94
cupressuflavone (8,8′-biapigenin) | 538 | 330 | 1.94

* MRRF_R^354: molar relative response factor based on rutin at 354 nm.
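One way the factors in Table S5 might be applied is sketched below in Python. The sketch rests on an assumption that the table itself does not spell out: that the MRRF is the molar UV response of a compound divided by that of rutin at 354 nm, so a rutin-equivalent amount is divided by the MRRF on a molar basis. The function name and the example input value are hypothetical.

```python
RUTIN_MW = 610.0  # molecular weight of rutin as listed in Table S5


def analyte_concentration(rutin_equivalent, analyte_mw, mrrf):
    """Convert a rutin-equivalent mass concentration into an analyte one.

    Assumed definition (not stated in the table): MRRF is the molar UV
    response of the analyte divided by that of rutin at 354 nm.
    """
    if mrrf <= 0:
        raise ValueError("MRRF must be positive")
    moles_as_rutin = rutin_equivalent / RUTIN_MW  # molar amount if it were rutin
    analyte_moles = moles_as_rutin / mrrf         # correct for molar response
    return analyte_moles * analyte_mw             # back to a mass concentration


# Molecular weights and MRRF values from Table S5; the 10.0 input is made up.
for name, mw, mrrf in [("rutin", 610, 1.0),
                       ("luteolin 7-O-glucoside", 448, 1.2),
                       ("apigenin 7-O-glucoside", 432, 0.97)]:
    print(name, round(analyte_concentration(10.0, mw, mrrf), 3))
```

With this convention, rutin maps onto itself, and a compound with a larger MRRF yields a smaller molar amount for the same peak area.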


Table S6. Tentative identification result by FlavonQ for precursor ion m/z 755.1825

No. | [M-H]⁻ weight | Error (ppm) | Tentative identification | Formula
1 | 755.1823 | -0.21 | Tetrahydroxyflavone, 2 Pentosyl, 1 Sinapoyl | C36H36O18
2 | 755.1823 | -0.21 | Hexahydroxyflavone, 2 Rhamnose, 1 p-Coumaroyl | C36H36O18
3 | 755.1823 | -0.21 | Pentahydroxydimethoxyflavone, 2 Pentosyl, 1 p-Coumaroyl | C36H36O18
4 | 755.1823 | -0.21 | Pentahydroxymethoxyflavone, 1 Pentosyl, 1 Rhamnose, 1 p-Coumaroyl | C36H36O18
5 | 755.1823 | -0.21 | Trihydroxyflavone, 2 Rhamnose, 1 Acetyl, 1 Galloyl | C36H36O18
6 | 755.1823 | -0.21 | Trihydroxyflavone, 1 Pentosyl, 1 Rhamnose, 1 Propionyl, 1 Galloyl | C36H36O18
7 | 755.1823 | -0.21 | Trihydroxyflavone, 2 Pentosyl, 1 Butyryl, 1 Galloyl | C36H36O18
8 | 755.1823 | -0.21 | Pentahydroxyflavone, 1 Rhamnose, 1 Hexosyl, 1 p-Coumaroyl | C36H36O18
9 | 755.1823 | -0.21 | Pentahydroxyflavone, 2 Rhamnose, 1 Caffeoyl | C36H36O18
10 | 755.1823 | -0.21 | Pentahydroxyflavone, 2 Rhamnose, 1 Acetyl, 1 p-Hydroxybenzoyl | C36H36O18
11 | 755.1823 | -0.21 | Pentahydroxyflavone, 1 Pentosyl, 1 Rhamnose, 1 Feruloyl | C36H36O18
12 | 755.1823 | -0.21 | Pentahydroxyflavone, 1 Pentosyl, 1 Rhamnose, 1 Propionyl, 1 p-Hydroxybenzoyl | C36H36O18
13 | 755.1823 | -0.21 | Pentahydroxyflavone, 2 Pentosyl, 1 Butyryl, 1 p-Hydroxybenzoyl | C36H36O18
14 | 755.1823 | -0.21 | Dihydromethoxyflavone, 1 Rhamnose, 1 Propionyl, 1 Maloyl, 1 Galloyl | C36H36O18
15 | 755.1823 | -0.21 | Dihydromethoxyflavone, 1 Pentosyl, 1 Butyryl, 1 Maloyl, 1 Galloyl | C36H36O18
16 | 755.1823 | -0.21 | Dihydromethoxyflavone, 1 Pentosyl, 1 Glucuronic acid, 1 Caffeoyl | C36H36O18
17 | 755.1823 | -0.21 | Dihydromethoxyflavone, 1 Pentosyl, 1 Glucuronic acid, 1 Acetyl, 1 p-Hydroxybenzoyl | C36H36O18
18 | 755.1823 | -0.21 | Dihydromethoxyflavone, 1 Pentosyl, 1 Rhamnose, 1 Oxalyl, 1 p-Hydroxybenzoyl | C36H36O18
19 | 755.1823 | -0.21 | Dihydromethoxyflavone, 2 Pentosyl, 1 Malonyl, 1 p-Hydroxybenzoyl | C36H36O18
20 | 755.1823 | -0.21 | Dihydromethoxyflavone, 1 Pentosyl, 1 Rhamnose, 1 Acetyl, 1 Galloyl | C36H36O18
21 | 755.1823 | -0.21 | Dihydromethoxyflavone, 2 Pentosyl, 1 Propionyl, 1 Galloyl | C36H36O18
22 | 755.1823 | -0.21 | Tetrahydroxymethoxyflavone, 1 Pentosyl, 1 Hexosyl, 1 p-Coumaroyl | C36H36O18
23 | 755.1823 | -0.21 | Tetrahydroxymethoxyflavone, 1 Pentosyl, 1 Rhamnose, 1 Caffeoyl | C36H36O18
24 | 755.1823 | -0.21 | Tetrahydroxymethoxyflavone, 1 Pentosyl, 1 Rhamnose, 1 Acetyl, 1 p-Hydroxybenzoyl | C36H36O18
25 | 755.1823 | -0.21 | Tetrahydroxymethoxyflavone, 2 Pentosyl, 1 Feruloyl | C36H36O18
26 | 755.1823 | -0.21 | Tetrahydroxymethoxyflavone, 2 Pentosyl, 1 Propionyl, 1 p-Hydroxybenzoyl | C36H36O18
27 | 755.1823 | -0.21 | Heptahydroxyflavone, 1 Rhamnose, 1 Butyryl, 1 Sinapoyl | C36H36O18
28 | 755.1823 | -0.21 | Dihydroxyflavone, 1 Rhamnose, 1 Hexosyl, 1 Acetyl, 1 Galloyl | C36H36O18
29 | 755.1823 | -0.21 | Dihydroxyflavone, 1 Pentosyl, 1 Hexosyl, 1 Propionyl, 1 Galloyl | C36H36O18
30 | 755.1823 | -0.21 | Hydroxymethoxyflavone, 1 Pentosyl, 1 Hexosyl, 1 Acetyl, 1 Galloyl | C36H36O18
31 | 755.1823 | -0.21 | Tetrahydroxyflavone, 2 Hexosyl, 1 p-Coumaroyl | C36H36O18
32 | 755.1823 | -0.21 | Tetrahydroxyflavone, 1 Rhamnose, 1 Hexosyl, 1 Caffeoyl | C36H36O18
33 | 755.1823 | -0.21 | Tetrahydroxyflavone, 1 Rhamnose, 1 Hexosyl, 1 Acetyl, 1 p-Hydroxybenzoyl | C36H36O18
34 | 755.1823 | -0.21 | Tetrahydroxyflavone, 1 Pentosyl, 1 Hexosyl, 1 Feruloyl | C36H36O18
35 | 755.1823 | -0.21 | Tetrahydroxyflavone, 1 Pentosyl, 1 Hexosyl, 1 Propionyl, 1 p-Hydroxybenzoyl | C36H36O18
36 | 755.1823 | -0.21 | Hexahydroxyflavone, 1 Hexosyl, 1 Butyryl, 1 Sinapoyl | C36H36O18
37 | 755.1823 | -0.21 | Hexahydroxyflavone, 1 Rhamnose, 2 Butyryl, 1 Galloyl | C36H36O18
38 | 755.1823 | -0.21 | Trihydroxymethoxyflavone, 1 Pentosyl, 1 Hexosyl, 1 Caffeoyl | C36H36O18
39 | 755.1823 | -0.21 | Trihydroxymethoxyflavone, 1 Pentosyl, 1 Hexosyl, 1 Acetyl, 1 p-Hydroxybenzoyl | C36H36O18
40 | 755.1823 | -0.21 | Pentahydroxymethoxyflavone, 1 Pentosyl, 2 Butyryl, 1 Galloyl | C36H36O18
41 | 755.1823 | -0.21 | Pentahydroxymethoxyflavone, 1 Hexosyl, 1 Propionyl, 1 Sinapoyl | C36H36O18
42 | 755.1823 | -0.21 | Pentahydroxymethoxyflavone, 1 Rhamnose, 1 Propionyl, 1 Butyryl, 1 Galloyl | C36H36O18
43 | 755.1823 | -0.21 | Pentahydroxydimethoxyflavone, 1 Hexosyl, 1 Acetyl, 1 Sinapoyl | C36H36O18
44 | 755.1823 | -0.21 | Pentahydroxydimethoxyflavone, 1 Rhamnose, 2 Propionyl, 1 Galloyl | C36H36O18
45 | 755.1823 | -0.21 | Pentahydroxydimethoxyflavone, 1 Rhamnose, 1 Acetyl, 1 Butyryl, 1 Galloyl | C36H36O18
46 | 755.1823 | -0.21 | Pentahydroxydimethoxyflavone, 1 Pentosyl, 1 Propionyl, 1 Butyryl, 1 Galloyl | C36H36O18
47 | 755.1823 | -0.21 | Trihydroxyflavone, 2 Hexosyl, 1 Caffeoyl | C36H36O18
48 | 755.1823 | -0.21 | Trihydroxyflavone, 2 Hexosyl, 1 Acetyl, 1 p-Hydroxybenzoyl | C36H36O18
49 | 755.1823 | -0.21 | Monohydroxyflavone, 2 Hexosyl, 1 Acetyl, 1 Galloyl | C36H36O18
50 | 755.1823 | -0.21 | Pentahydroxyflavone, 1 Hexosyl, 2 Butyryl, 1 Galloyl | C36H36O18
51 | 755.1823 | -0.21 | Dihydromethoxyflavone, 1 Hexosyl, 1 Butyryl, 1 Malonyl, 1 Galloyl | C36H36O18
52 | 755.1823 | -0.21 | Dihydromethoxyflavone, 1 Hexosyl, 1 Propionyl, 1 Succinoyl, 1 Galloyl | C36H36O18
53 | 755.1823 | -0.21 | Tetrahydroxymethoxyflavone, 1 Hexosyl, 1 Propionyl, 1 Butyryl, 1 Galloyl | C36H36O18
54 | 755.1823 | -0.21 | Heptahydroxyflavone, 1 Hexosyl, 2 Butyryl, 1 p-Hydroxybenzoyl | C36H36O18
55 | 755.1857 | 4.24 | Dihydromethoxyflavone, 2 Rhamnose, 1 Acetyl, 1 Propionyl, 1 Sulfate | C33H40O18S1
56 | 755.1857 | 4.24 | Dihydromethoxyflavone, 1 Pentosyl, 1 Rhamnose, 2 Propionyl, 1 Sulfate | C33H40O18S1
57 | 755.1857 | 4.24 | Dihydromethoxyflavone, 1 Pentosyl, 1 Rhamnose, 1 Acetyl, 1 Butyryl, 1 Sulfate | C33H40O18S1
58 | 755.1857 | 4.24 | Dihydromethoxyflavone, 2 Pentosyl, 1 Propionyl, 1 Butyryl, 1 Sulfate | C33H40O18S1
59 | 755.1857 | 4.24 | Monohydroxyflavone, 3 Rhamnose, 1 Sulfate | C33H40O18S1
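The kind of search that produces a candidate list like Table S6 can be sketched in Python as a brute-force enumeration of aglycone, glycosyl, and acyl combinations whose [M-H]⁻ value falls within a ppm window of the precursor. This is not the FlavonQ code. The 5 ppm window, the limits of three glycosyl and one acyl unit, the proton mass, and the function names are assumptions, and only a small subset of the building blocks from Tables S2 to S4 is included to keep the example short.

```python
from itertools import combinations_with_replacement

PROTON = 1.007276  # assumed proton mass (Da) for the [M-H]- calculation

# Small illustrative subset of Tables S2-S4 (neutral-loss masses in Da).
AGLYCONES = {"Trihydroxyflavone": 270.0528,
             "Tetrahydroxyflavone": 286.0477,
             "Pentahydroxyflavone": 302.0426}
GLYCOSYLS = {"Pentosyl": 132.0422, "Rhamnose": 146.0579, "Hexosyl": 162.0528}
ACYLS = {"Caffeoyl": 162.0317, "Feruloyl": 176.0473, "Sinapoyl": 206.0579}


def enumerate_candidates(precursor_mz, tol_ppm=5.0, max_glycosyl=3, max_acyl=1):
    """List (aglycone, sugars, acyls, m/z, ppm) tuples within the ppm window."""
    hits = []
    for aglycone, aglycone_mass in AGLYCONES.items():
        for n_gly in range(1, max_glycosyl + 1):
            for sugars in combinations_with_replacement(GLYCOSYLS, n_gly):
                for n_acyl in range(max_acyl + 1):
                    for acyls in combinations_with_replacement(ACYLS, n_acyl):
                        neutral = (aglycone_mass
                                   + sum(GLYCOSYLS[s] for s in sugars)
                                   + sum(ACYLS[a] for a in acyls))
                        mz = neutral - PROTON
                        ppm = (mz - precursor_mz) / precursor_mz * 1e6
                        if abs(ppm) <= tol_ppm:
                            hits.append((aglycone, sugars, acyls,
                                         round(mz, 4), round(ppm, 2)))
    return hits


def filter_by_aglycone_ion(hits, fragment_nominal_mz):
    """Keep candidates whose deprotonated aglycone has the given nominal m/z."""
    return [h for h in hits
            if round(AGLYCONES[h[0]] - PROTON) == fragment_nominal_mz]


hits = enumerate_candidates(755.1825)
for hit in hits:
    print(hit)
print(len(filter_by_aglycone_ion(hits, 285)), "candidates retain the m/z 285 aglycone ion")
```

Restricting the list to candidates whose aglycone ion appears at m/z 285 mirrors the narrowing from Table S6 to Table S7, where only tetrahydroxyflavone-based assignments remain.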

Table S7. Tentative identification result by FlavonQ for precursor ion m/z 755.1825 with characteristic ion m/z 285 in MS²

No. | [M-H]⁻ weight | Error (ppm) | Tentative identification | Formula
1 | 755.1823 | -0.21 | Tetrahydroxyflavone, 2 Pentosyl, 1 Sinapoyl | C36H36O18
2 | 755.1823 | -0.21 | Tetrahydroxyflavone, 2 Hexosyl, 1 p-Coumaroyl | C36H36O18
3 | 755.1823 | -0.21 | Tetrahydroxyflavone, 1 Rhamnose, 1 Hexosyl, 1 Caffeoyl | C36H36O18
4 | 755.1823 | -0.21 | Tetrahydroxyflavone, 1 Rhamnose, 1 Hexosyl, 1 Acetyl, 1 p-Hydroxybenzoyl | C36H36O18
5 | 755.1823 | -0.21 | Tetrahydroxyflavone, 1 Pentosyl, 1 Hexosyl, 1 Feruloyl | C36H36O18
6 | 755.1823 | -0.21 | Tetrahydroxyflavone, 1 Pentosyl, 1 Hexosyl, 1 Propionyl, 1 p-Hydroxybenzoyl | C36H36O18

Table S8. Example of result output.

Peak ID | RT (min) | RT start | RT end | Peak height | Peak width | Peak area | R (%) | Target ions (m/z) | Concentration (mg/100g) | Identification results
15 | 12.32 | 12.09 | 12.38 | 1.8e5 | 0.29 | 2.7e7 | 94.1 | 933.2365 | 18.93 | Fragment: m/z 285; Multiple results; One example: Name: Tetrahydroxyflavone, 4 Pentosyl, 1 p-Hydroxybenzoyl; Formula: C42H46O24
16 | 12.59 | 12.38 | 13.07 | 7.6e5 | 0.69 | 1.9e8 | 97.2 | 1125.2909 | 159.11 | Fragment: m/z 301; Multiple results; One example: Name: Pentahydroxyflavone, 4 Hexosyl, 1 Feruloyl; Formula: C49H58O30
17 | 13.38 | 13.07 | 13.61 | 7.4e5 | 0.54 | 1.6e8 | 95.3 | 993.2495 | 118.97 | Fragment: m/z 301; Multiple results; One example: Name: Pentahydroxyflavone, 3 Hexosyl, 1 Sinapoyl; Formula: C44H50O26
18 | 13.73 | 13.61 | 13.99 | 3.3e5 | 0.39 | 3.6e7 | 96.7 | 963.2418 | 25.85 | Fragment: m/z 301; Multiple results; One example: Name: Pentahydroxyflavone, 3 Pentosyl, 1 Rhamnose, 1 p-Hydroxybenzoyl; Formula: C43H48O25
23 | 16.13 | 15.97 | 16.82 | 1.8e6 | 0.85 | 4.6e8 | 100 | 639.1551 | 215.32 | Fragment: m/z 315; Name: Tetrahydroxymethoxyflavone, 2 Hexosyl; Formula: C28H32O17


[Figure S1: four panels (A to D), each plotting Response (× 10⁵) against Retention time (min) over 13 to 14.5 min; panel C carries the label "Second-derivative Gaussian filter".]

Figure S1. Different peak area integration methods: (A) drop perpendicular method, (B) valley method, (C) second-derivative Gaussian transformation method, and (D) simple Gaussian fit method.
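Two of the integration methods named in the caption can be contrasted on a synthetic trace with the Python sketch below. It assumes the common textbook definitions: the drop perpendicular method splits two fused peaks with a vertical line at the valley and integrates each part down to a zero baseline, whereas the valley method draws a sloping baseline from each peak boundary to the valley point and integrates only the area above it. The synthetic peak parameters and all function names are made up for illustration and are not taken from FlavonQ.

```python
import math


def gaussian(t, height, center, sigma):
    return height * math.exp(-0.5 * ((t - center) / sigma) ** 2)


def trapezoid(ts, ys):
    """Trapezoidal area under sampled points."""
    return sum((ts[i + 1] - ts[i]) * (ys[i] + ys[i + 1]) / 2
               for i in range(len(ts) - 1))


def segment_area(ts, ys, a, b):
    """Area between sample indices a and b (inclusive), down to zero."""
    return trapezoid(ts[a:b + 1], ys[a:b + 1])


def drop_perpendicular(ts, ys, start, valley, end):
    """Split at the valley with a vertical line; integrate to a zero baseline."""
    return segment_area(ts, ys, start, valley), segment_area(ts, ys, valley, end)


def valley_to_valley(ts, ys, start, valley, end):
    """Subtract the area under a straight baseline drawn to the valley point."""
    areas = []
    for a, b in ((start, valley), (valley, end)):
        under_baseline = (ts[b] - ts[a]) * (ys[a] + ys[b]) / 2
        areas.append(segment_area(ts, ys, a, b) - under_baseline)
    return tuple(areas)


# Synthetic trace: two partially overlapping peaks between 13.0 and 14.5 min.
ts = [13.0 + 0.005 * i for i in range(301)]
ys = [gaussian(t, 8e5, 13.50, 0.06) + gaussian(t, 4e5, 13.75, 0.06) for t in ts]

# Peak boundaries: first and last points above 0.1% of the maximum response.
threshold = 0.001 * max(ys)
above = [i for i, y in enumerate(ys) if y > threshold]
start, end = above[0], above[-1]
# Valley: lowest point between the two apex positions (indices 100 and 150).
valley = min(range(100, 151), key=lambda i: ys[i])

drop = drop_perpendicular(ts, ys, start, valley, end)
valley_areas = valley_to_valley(ts, ys, start, valley, end)
print("drop perpendicular:", [f"{a:.3e}" for a in drop])
print("valley method:     ", [f"{a:.3e}" for a in valley_areas])
```

On overlapping peaks the valley baseline removes part of each peak, so under these assumptions both valley areas come out smaller than their drop perpendicular counterparts.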


1. Set thresholds for slope, amplitude, and peak width.
2. Good peak detection? If no, reset the parameters and return to step 1; if yes, continue.
3. Choose reference peak ID.
4. Check UV and MS spectra.
5. Decide threshold for R.
6. Qualitative/quantitative analysis.

Figure S2. Parameter optimization processing flowchart of ‘FlavonQ’.

Figure S3. Screen shot for METLIN search result example of m/z 609.1467.