<<

Untargeted Workfl ow Using UHPLC/Quadrupole Orbitrap and SIEVE 2.1 Software

Junhua Wang, David A. Peake, Mark Sanders, Michael Athanas, and Yingying Huang Thermo Fisher Scientifi c, San Jose, CA Untargeted Metabolomics Workflow Using UHPLC/Quadrupole Orbitrap Mass Spectrometer and SIEVE 2.1 Software Junhua Wang, David A. Peake, Mark Sanders, Michael Athanas, Yingying Huang Thermo Fisher Scientific, San Jose, CA

FIGURE 1. Untargeted metabolomics workflow FIGURE 3. Metabolomics data analysis. SIEVE 2.1 software, Thermo Scientific™ Table 1. Statics of component m/z 170.081 from SIEVE 2.1 software FIGURE 8. SIEVE 2.1 software with KEGG pathway visualization Overview TraceFinder™ 3.1 software, and mzCloud support untargeted and targeted and XCMS Online Purpose: Demonstrate a generic, integrated workflow for untargeted metabolomics metabolomics workflows. study using a UHPLC/benchtop Thermo Scientific™ Orbitrap™ mass spectrometer SIEVE 2.1 Software XCMS Online Untargeted Targeted and informatics software. Ratio 5.0 Fold change 4.8

Differential Advanced p-value 3.44e-4 p-value 3.4e-4 Component Component Pathway (Statistical) Quantitation Statistical Detection Identification Annotation Introduction Analysis Analysis RT 2.03 min RT 2.03 min Metabolomics is a rapidly growing field of post-genomic biology, aiming to • SIEVE • SIEVE • SIEVE • SIEVE • SIEVE • SimcaP IDs: IDs: comprehensively characterize the small molecules in biological systems. Nonbiological • XCMS • mzCloud • TraceFinder • KEGG • Partek Norepinephrine Norepinephrine systematic biases from instrument calibration or the order of sample injection account • mzMine • METLIN • Cytoscape • GeneData Pyridoxine Pyridoxine (VB6) for the most significant errors in LC/TOF-MS data [1]. Here we present a workflow • R Oxidopiamine, etc. Hydroxydopamin, etc. using a UHPLC/benchtop quardrupole Orbitrap platform and automated data analysis software for untargeted metabolomic profiling of plasma samples for biomarker UHPLC provides fast for high throughput analysis, the typical peak discovery from the Zucker diabetes fatty (ZDF) rat model. The optimal conditions for width is 4–6 seconds. Our method can baseline resolve Isoleucine and Leucine, sample preparation, liquid chromatography (LC), column, (MS), generating peak width 1.2 s at FWHM. Refer to Figure 4. For such narrow peaks, FIGURE 5. PCA by SIEVE 2.1 software (A), Numbers in SIEVE 2.1 software (B), and data processing parameters are explored. Q Exactive mass spectrometer operating at 70,000 resolution acquires >15 point across Numbers in XCMS Online (C) XIC the peak without losing sensitivity (A). C p-value: 3.76e-3 ChemSpider ID: 389841 Ratio: 2.86 (2S)-4-Acetamido-2- Methods The QC of each run was performed by monitoring the intensity of IS, d5-hippuric acid B Composition: C6H12N2O3 aminobutanoic acid (B), and the overall base peak (C). When all samples were finished, the selected ion A 38,268 features Sample Preparation Branched amino acids are known hallmarks of Type II Diabetes [3] chromatograms can be quickly viewed with RawMeat 2.1 (a free app with SIEVE 2.1 2,075 components 3,848 Plasma samples were deproteinized with organic solvent. Four extraction solvent software). By inspecting the RT and intensity of the IS, runs with large retention time p<0.05 systems including methanol (MeOH), acetonitrile (ACN), acetone, and 1:1:1 of the drift and bad injections can be excluded from the following data analysis. 403 Results TABLE 2. Representative metabolites that are significantly changed above were tested in this work. Endogenous metabolites were reconstituted in p<0.05 440 „real peaks‟ methanol/water (1:9) containing isotopically labeled internal standard (IS), d5-hippuric Challenges in Untargeted Metabolomics Study [M+H]+ Time Fatty/Lean P-value Δppm Formula Name 206 854.5657 7.25 2.81 3.65E-02 4.0 C50H80NO8P PC44:10 acid for LC-MS analysis. Solvent blank, pooled QC, and biological samples were . Complexity of biological samples FIGURE 4. Method validation, data quality control in metabolomics application 838.6292 8.84 2.48 6.45E-03 3.0 C48H88NO8P PC 40:4 836.6133 8.55 4.54 2.22E-02 0.0 C46H88NO8P PC38:2 analyzed in a randomized injection order. RT: 1.6471 - 1.8824 SM: 7B 212 . Leucine 812.6095 8.58 2.79 3.08E-02 4.0 C46H86NO8P PC 38:3 Diversity of small molecule metabolites: polar and non-polar analytes FWHM 1.5 s; 1.8147 NL: 132.1020 4.76E8 100 Ratio > 2 788.6127 9.04 2.13 3.56E-02 3.0 C44H86NO8P PC36:1 Take an aliquot of 200 µL plasma 7 Hz, Res m/z= 778.5351 7.20 4.15 3.38E-03 0.0 C42H78NO8P PC34:3 . Ionization requires both positive and negative ion 90 132.1012- A 70K 132.1026 768.5498 8.01 2.14 3.17E-04 4.0 C43H78NO8P PC35:4 80 MS . Δ Urine_Junh 764.5196 7.44 3.71 6.37E-04 3.0 C43H74NO8P PC35:6 Wide range of concentrations 70 ppm 0.8 ua_1_10_4 754.5347 7.29 2.67 1.35E-02 3.0 C42H76NO8P PC34:4   60 Isoleucine 834.5936 8.10 2.83 7.13E-04 0.0 C44H83NO13 LacCer(d18:1/14:0) Add 600 µL (3 ) of cold 4 C organic solvent and vortex (3 min) . 1.7255 FIGURE 6. SIEVE 2.1 software output of component m/z 170.081, showing No universal method for chromatographic separation 50 132.1020 836.6041 8.01 2.42 1.10E-03 4.0 C44H85NO13 LacCer(d18:0/14:0) 813.6790 9.72 2.40 3.84E-02 4.0 C47H93N2O6P SM(d18:2/24:0) 40 alignment across different samples (A), adducts grouping (B), peak . 675.5412 7.20 2.29 3.34E-02 3.0 C37H75N2O6P SM(d16:1/16:0)

Multiple sources of variability Abundance Relative 30 integration (C), and trend intensity view (D) 546.3525 5.84 2.33 9.23E-03 4.0 C28H52NO7P LysoPC(20:3) Spin down at 4 C, 4000  g (5 min) 20 . Structure elucidation of unknowns is expensive: lack of synthetic standards 526.2910 5.69 2.42 1.67E-04 3.0 C27H44NO7P LysoPE(22:6) 10 A 153.0521 1.79 2.46 1.56E-03 3.0 C4H12N2S2 Cystamine 0 C 230.0954 0.80 2.53 2.80E-02 1.0 C9H15N3O2S Ergothioneine Preparing for the UHPLC-MS Data Acquisition 1.65 1.70 1.75 1.80 1.85 245.0917 3.56 3.93 2.16E-03 1.0 C13H12N2O3 Haematopodin Take supernatant and divide into 3 portions (each 200 µL), dry down (30–60 min) Time (min) 172.1693 5.10 3.18 4.19E-02 1.0 C10H21NO decanamide RT: 0.00 - 12.01 SM: 7B 302.3043 5.48 2.28 2.50E-02 3.0 C18H39NO2 Sphinganine Prior to the real samples analysis, a solvent blank with internal standard (IS) is injected 2.83 NL: 185.0966 1.94E8 204.1228 3.75 3.10 1.83E-02 1.0 C9H17NO4 Acetylcarnitine at the beginning to check the solvent and the LC-MS status. The injections of the real 100 m/z= 232.1534 2.26 4.49 1.99E-03 2.0 C11H21NO4 Butyryl-L-carnitine 185.0957- 344.2786 5.31 2.13 1.57E-02 2.0 C19H37NO4 Lauroylcarnitine Reconstitute in 400 µL (9:1) H2O:CH3OH containing IS, vortex, and sonicate (2 min) samples should be randomized in order to eliminate systematic bias. Triplicate 80 B 185.0975 B 400.3409 5.53 2.10 3.49E-04 2.0 C23H45NO4 Palmitoyl-L-carnitine injections of the pooled plasma are intermittently repeated throughout the whole batch 60 IS=d5-hippuric acid MS ZDF_F1_2 D 466.3160 4.97 0.07 3.52E-03 0.0 C26H43NO6 Glycocholic Acid to validate consistent performance of the overall system. The experimental design and 40 357.2780 5.39 0.15 3.03E-03 2.0 C24H36O2 THA 15 features grouped 355.2626 5.13 0.17 8.89E-03 1.0 C24H34O2 delta2-THA Liquid Chromatography 20 run sequence are shown in Figure 2. to one component 170.0810 2.03 0.20 3.44E-04 0.0 C8H11NO3 Norepinephrine, 5-Hydroxydopamine Relative Abundance Relative 0 184.0968 2.69 0.23 2.70E-02 1.0 C9H13NO3 Normetanephrine; Methylnoradrenaline UHPLC separation was implemented on a Thermo Scientific™ Dionex™ UltiMate™ 2.48 NL: pressure gradient) pump using Thermo Scientific™ GOLD™ 205.0967 3.17E9 212.1279 3.74 0.34 4.04E-03 1.0 C11H17NO3 Methoxamine 3000 HPG (high- Hypersil 100 m/z= 226.1433 4.20 0.29 8.10E-03 1.0 C12H19NO3 N-Methylmescaline; Terbutaline  FIGURE 2. UHPLC/MS Experimental Design and Run Sequence. Left, schematic Base Peak 80.0000- RP C18 column at 450 µL/min, column temperature at 55 C. LC solvents were 0.1% 80 11.31 161.0918 2.45 0.35 3.76E-03 2.0 C6H12N2O3 (2S)-4-Acetamido-2-aminobutanoic acid showing the vials and sample names. Right, detailed content and overall time C 6.38 149.0117 1200.0000 224.0915 2.59 0.34 3.56E-02 1.0 C11H13NO4 Acetyl-L-tyrosine – 60 391.2823 MS FA (A) and 0.1% FA in MeOH (B). Apply linear gradient from 0.5 50% B for 5.5 min, 5.88 ZDF_F1_2 146.1173 1.44 0.42 3.54E-02 1.0 C7H15NO2 DL-Aminoheptanoic acid for each step. 496.3373 8.35 followed by increasing to 98% at 6 min, hold 98% B for 6 min, then decrease to 0.5% 40 732.5496 FIGURE 7. Searching MS/MS spectrum of component m/z 184.0607 in mzCloud. at 13 min, then equilibrate for another 2 min. 20 The entry of 4-pyridoxic acid matches the spectrum with the highest score. a b Abundance Relative 6 Step | Preconditioning LC under one polarity (+ or -) 0 Conclusion 1 0 2 4 6 8 10 12 4-Pyridoxic acid_pos_HCD_QE #141 RT: 1.26 AV: 1 NL: 2.70E8 Mass Spectrometry 1. Solvent blank spiked with IS, 5-10X | (check signal) T: FTMS + p ESI Full ms2 [email protected] [50.00-205.00] 166.0500 Control Time (min) 100 3 Solvent 2. QC mixture with IS, 3-5X | (check LC system status) 3h 95 An efficient and robust workflow for untargeted metabolomics is presented here. The The Thermo Scientific™ Q Exactive™ mass spectrometer was operated under sample 90 I blank with IS 3. Solvent blank with IS, 1X | (check carry over) 85 reliable high-resolution, accurate-mass (HR/AM) performance of the Q Exactive LC- 80 with IS #1 • If carry over, wash followed by repeating step 2 & 3 | (clean) (ESI) positive, negative, and polarity switching modes. Full Comparing Differential Analysis Results from SIEVE 2.1 software and XCMS 75 MS system eliminates the need for technical replicates on biological samples. The • If no carry over, proceed to next step 70 scan (m/z 67–1000) used resolution 70,000 with automatic gain control (AGC) target Online 65 6 60 MS/MS of m/z 184.0606 from sample superior S/N in Orbitrap data allows efficient data reduction in SIEVE 2.1 software, of 110 ions and a maximum ion injection time (IT) of 35 ms. Data-dependent MS/MS 2 7 55 The results from SIEVE 2.1 software are compared to XCMS Online, an academia- 50 mzCloud search and score resulting in much reduced data analysis effort. KEGG pathway visualization allows were acquired on a “Top5” data Treated 45

-dependent mode using the following parameters: QC Step | Experiment I: Positive, R=70K Abundance Relative n sample 40 quick access to biological pathway mapping. The MS spectral library mzCloud 5 Mixture developed open-source software for metabolomics data analysis (Table 1) (2). As 35  II with IS 4. Solvent blank no IS, 1X | (for background subtraction) resolution 17,500; AGC 1 10 ions; maximum IT 80 ms; 2.0 amu isolation window; 30 148.0393 with IS facilitates accurate compound identification. ± #1 5. Pooled plasma with IS, 3X | (check method and column) 14h shown in Figure 5, the significant components (p-value <0.05, ratio > 2.0) identified by 25 normalized collision energy 35% 50%; underfill ratio 1.0%; Apex trigger 2–4 s, and 20 138.0551 184.0607 6. Control sample with IS #1, 1X | SIEVE 2.1 software and XCMS Online are similar, while the SIEVE software started 15 dynamic exclusion 6 s. Source ionization parameters were: spray voltage, 3.8 kV; 4 7. Treated sample with IS #1, 1X | 10  with a much smaller (~20 less) number of components because of its capability to 5 0 capillary temperature, 300 C; and S-Lens level, 45. 8. Run all biological replicates with a randomized order | (all) 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 4-pyridoxic acid Control m/z Solvent Control automatically remove solvent background, thus saving users time and labor in data References sampleControl III blank sample Data Analysis withsample IS review (Figure 6). 1. Wang S.Y., Kuo, C.H., Tseng, Y. J. Anal. Chem., 2013, 85 (2), 1037–1046. NO IS with IS #2with IS Step | Experiment II: MS/MS Positive 15m #2 2. Scripps Center For Metabolomics Home Page. https://xcmsonline.scripps.edu/index.php Differential analyses of the obese versus lean plasma were performed using Thermo 5 # n 9. Pooled plasma, top5 ddMS2 HCD | (dd-MS/MS) Metabolite Identification Guided by mzCloud Scientific™ SIEVE™ 2.1 software, which executes background subtraction, 3. m/zCloud Home Page. https://www.mzcloud.org/home.aspx n An example of SIEVE 2.1 software with KEGG pathway visualization demonstrated Treated 8 The mzCloud is the first cloud-based MS spectral trees library built for small molecule component detection, peak alignment, differential analysis (Figure 1). Statistical results, 9 Pooled Treated with component m/z 161.0918 is shown in Figure 8. All matching metabolites are – sampleTreated Step | Experiment III: Negative, R=70K 14h 4. Muoio, D.M.; Newgard, C.B. Nature Rev. Mol. Cell. Biol. 2008, 9, 193 205. IV plasma sample structural elucidation (3). It contains a standard reference database, an elucidated putative IDs, and pathways were generated through searching ChemSpider and withsample IS 4-8. Run all biological replicates as Experiment I | (all samples) labeled with black dots. with IS with IS KEGG™. Metabolites of interest were identified via MS/MS mass spectral database #2with IS compound database with putative structures, and a database of virtual spectra. KEGG is a trademark of Kyoto University, Kanehisa Lab. All other trademarks are the property of Thermo Fisher #2# n Q Exactive HCD MS/MS spectra of interested metabolites can be searched against Table 2 shows some of the representative metabolites that are significantly changed. matching. The *.raw files were converted to mzXML format using ProteoWizard and Scientific and its subsidiaries. Step | Experiment IV: MS/MS Negative 15m m/zCloud. The matching compounds are scored and the fragment ions are annotated Up-regulated metabolites in fatty vs. lean rat plasma are shown in green while down- also analyzed by XCMS Online [2] to compare the results. 10. Repeat as in Experiment II, step 9 | (dd-MS/MS) This information is not intended to encourage use of these products in any manners that might infringe the accordingly (Figure 7). regulated metabolites are shown in blue (p-value <0.05, ratio >2). intellectual property rights of others.

2 Untargeted Metabolomics Work ow Using UHPLC/Quadrupole Orbitrap Mass Spectrometer and SIEVE 2.1 Software Untargeted Metabolomics Workflow Using UHPLC/Quadrupole Orbitrap Mass Spectrometer and SIEVE 2.1 Software Junhua Wang, David A. Peake, Mark Sanders, Michael Athanas, Yingying Huang Thermo Fisher Scientific, San Jose, CA

FIGURE 1. Untargeted metabolomics workflow FIGURE 3. Metabolomics data analysis. SIEVE 2.1 software, Thermo Scientific™ Table 1. Statics of component m/z 170.081 from SIEVE 2.1 software FIGURE 8. SIEVE 2.1 software with KEGG pathway visualization Overview TraceFinder™ 3.1 software, and mzCloud support untargeted and targeted and XCMS Online Purpose: Demonstrate a generic, integrated workflow for untargeted metabolomics metabolomics workflows. study using a UHPLC/benchtop Thermo Scientific™ Orbitrap™ mass spectrometer SIEVE 2.1 Software XCMS Online Untargeted Targeted and informatics software. Ratio 5.0 Fold change 4.8

Differential Advanced p-value 3.44e-4 p-value 3.4e-4 Component Component Pathway (Statistical) Quantitation Statistical Detection Identification Annotation Introduction Analysis Analysis RT 2.03 min RT 2.03 min Metabolomics is a rapidly growing field of post-genomic biology, aiming to • SIEVE • SIEVE • SIEVE • SIEVE • SIEVE • SimcaP IDs: IDs: comprehensively characterize the small molecules in biological systems. Nonbiological • XCMS • mzCloud • TraceFinder • KEGG • Partek Norepinephrine Norepinephrine systematic biases from instrument calibration or the order of sample injection account • mzMine • METLIN • Cytoscape • GeneData Pyridoxine Pyridoxine (VB6) for the most significant errors in LC/TOF-MS data [1]. Here we present a workflow • R Oxidopiamine, etc. Hydroxydopamin, etc. using a UHPLC/benchtop quardrupole Orbitrap platform and automated data analysis software for untargeted metabolomic profiling of plasma samples for biomarker UHPLC provides fast chromatography for high throughput analysis, the typical peak discovery from the Zucker diabetes fatty (ZDF) rat model. The optimal conditions for width is 4–6 seconds. Our method can baseline resolve Isoleucine and Leucine, sample preparation, liquid chromatography (LC), column, mass spectrometry (MS), generating peak width 1.2 s at FWHM. Refer to Figure 4. For such narrow peaks, FIGURE 5. PCA by SIEVE 2.1 software (A), Numbers in SIEVE 2.1 software (B), and data processing parameters are explored. Q Exactive mass spectrometer operating at 70,000 resolution acquires >15 point across Numbers in XCMS Online (C) XIC the peak without losing sensitivity (A). C p-value: 3.76e-3 ChemSpider ID: 389841 Ratio: 2.86 (2S)-4-Acetamido-2- Methods The QC of each run was performed by monitoring the intensity of IS, d5-hippuric acid B Composition: C6H12N2O3 aminobutanoic acid (B), and the overall base peak (C). When all samples were finished, the selected ion A 38,268 features Sample Preparation Branched amino acids are known hallmarks of Type II Diabetes [3] chromatograms can be quickly viewed with RawMeat 2.1 (a free app with SIEVE 2.1 2,075 components 3,848 Plasma samples were deproteinized with organic solvent. Four extraction solvent software). By inspecting the RT and intensity of the IS, runs with large retention time p<0.05 systems including methanol (MeOH), acetonitrile (ACN), acetone, and 1:1:1 of the drift and bad injections can be excluded from the following data analysis. 403 Results TABLE 2. Representative metabolites that are significantly changed above were tested in this work. Endogenous metabolites were reconstituted in p<0.05 440 „real peaks‟ methanol/water (1:9) containing isotopically labeled internal standard (IS), d5-hippuric Challenges in Untargeted Metabolomics Study [M+H]+ Time Fatty/Lean P-value Δppm Formula Name 206 854.5657 7.25 2.81 3.65E-02 4.0 C50H80NO8P PC44:10 acid for LC-MS analysis. Solvent blank, pooled QC, and biological samples were . Complexity of biological samples FIGURE 4. Method validation, data quality control in metabolomics application 838.6292 8.84 2.48 6.45E-03 3.0 C48H88NO8P PC 40:4 836.6133 8.55 4.54 2.22E-02 0.0 C46H88NO8P PC38:2 analyzed in a randomized injection order. RT: 1.6471 - 1.8824 SM: 7B 212 . Leucine 812.6095 8.58 2.79 3.08E-02 4.0 C46H86NO8P PC 38:3 Diversity of small molecule metabolites: polar and non-polar analytes FWHM 1.5 s; 1.8147 NL: 132.1020 4.76E8 100 Ratio > 2 788.6127 9.04 2.13 3.56E-02 3.0 C44H86NO8P PC36:1 Take an aliquot of 200 µL plasma 7 Hz, Res m/z= 778.5351 7.20 4.15 3.38E-03 0.0 C42H78NO8P PC34:3 . Ionization requires both positive and negative ion 90 132.1012- A 70K 132.1026 768.5498 8.01 2.14 3.17E-04 4.0 C43H78NO8P PC35:4 80 MS . Δ Urine_Junh 764.5196 7.44 3.71 6.37E-04 3.0 C43H74NO8P PC35:6 Wide range of concentrations 70 ppm 0.8 ua_1_10_4 754.5347 7.29 2.67 1.35E-02 3.0 C42H76NO8P PC34:4   60 Isoleucine 834.5936 8.10 2.83 7.13E-04 0.0 C44H83NO13 LacCer(d18:1/14:0) Add 600 µL (3 ) of cold 4 C organic solvent and vortex (3 min) . 1.7255 FIGURE 6. SIEVE 2.1 software output of component m/z 170.081, showing No universal method for chromatographic separation 50 132.1020 836.6041 8.01 2.42 1.10E-03 4.0 C44H85NO13 LacCer(d18:0/14:0) 813.6790 9.72 2.40 3.84E-02 4.0 C47H93N2O6P SM(d18:2/24:0) 40 alignment across different samples (A), adducts grouping (B), peak . 675.5412 7.20 2.29 3.34E-02 3.0 C37H75N2O6P SM(d16:1/16:0)

Multiple sources of variability Abundance Relative 30 integration (C), and trend intensity view (D) 546.3525 5.84 2.33 9.23E-03 4.0 C28H52NO7P LysoPC(20:3) Spin down at 4 C, 4000  g (5 min) 20 . Structure elucidation of unknowns is expensive: lack of synthetic standards 526.2910 5.69 2.42 1.67E-04 3.0 C27H44NO7P LysoPE(22:6) 10 A 153.0521 1.79 2.46 1.56E-03 3.0 C4H12N2S2 Cystamine 0 C 230.0954 0.80 2.53 2.80E-02 1.0 C9H15N3O2S Ergothioneine Preparing for the UHPLC-MS Data Acquisition 1.65 1.70 1.75 1.80 1.85 245.0917 3.56 3.93 2.16E-03 1.0 C13H12N2O3 Haematopodin Take supernatant and divide into 3 portions (each 200 µL), dry down (30–60 min) Time (min) 172.1693 5.10 3.18 4.19E-02 1.0 C10H21NO decanamide RT: 0.00 - 12.01 SM: 7B 302.3043 5.48 2.28 2.50E-02 3.0 C18H39NO2 Sphinganine Prior to the real samples analysis, a solvent blank with internal standard (IS) is injected 2.83 NL: 185.0966 1.94E8 204.1228 3.75 3.10 1.83E-02 1.0 C9H17NO4 Acetylcarnitine at the beginning to check the solvent and the LC-MS status. The injections of the real 100 m/z= 232.1534 2.26 4.49 1.99E-03 2.0 C11H21NO4 Butyryl-L-carnitine 185.0957- 344.2786 5.31 2.13 1.57E-02 2.0 C19H37NO4 Lauroylcarnitine Reconstitute in 400 µL (9:1) H2O:CH3OH containing IS, vortex, and sonicate (2 min) samples should be randomized in order to eliminate systematic bias. Triplicate 80 B 185.0975 B 400.3409 5.53 2.10 3.49E-04 2.0 C23H45NO4 Palmitoyl-L-carnitine injections of the pooled plasma are intermittently repeated throughout the whole batch 60 IS=d5-hippuric acid MS ZDF_F1_2 D 466.3160 4.97 0.07 3.52E-03 0.0 C26H43NO6 Glycocholic Acid to validate consistent performance of the overall system. The experimental design and 40 357.2780 5.39 0.15 3.03E-03 2.0 C24H36O2 THA 15 features grouped 355.2626 5.13 0.17 8.89E-03 1.0 C24H34O2 delta2-THA Liquid Chromatography 20 run sequence are shown in Figure 2. to one component 170.0810 2.03 0.20 3.44E-04 0.0 C8H11NO3 Norepinephrine, 5-Hydroxydopamine Relative Abundance Relative 0 184.0968 2.69 0.23 2.70E-02 1.0 C9H13NO3 Normetanephrine; Methylnoradrenaline UHPLC separation was implemented on a Thermo Scientific™ Dionex™ UltiMate™ 2.48 NL: pressure gradient) pump using Thermo Scientific™ GOLD™ 205.0967 3.17E9 212.1279 3.74 0.34 4.04E-03 1.0 C11H17NO3 Methoxamine 3000 HPG (high- Hypersil 100 m/z= 226.1433 4.20 0.29 8.10E-03 1.0 C12H19NO3 N-Methylmescaline; Terbutaline  FIGURE 2. UHPLC/MS Experimental Design and Run Sequence. Left, schematic Base Peak 80.0000- RP C18 column at 450 µL/min, column temperature at 55 C. LC solvents were 0.1% 80 11.31 161.0918 2.45 0.35 3.76E-03 2.0 C6H12N2O3 (2S)-4-Acetamido-2-aminobutanoic acid showing the vials and sample names. Right, detailed content and overall time C 6.38 149.0117 1200.0000 224.0915 2.59 0.34 3.56E-02 1.0 C11H13NO4 Acetyl-L-tyrosine – 60 391.2823 MS FA (A) and 0.1% FA in MeOH (B). Apply linear gradient from 0.5 50% B for 5.5 min, 5.88 ZDF_F1_2 146.1173 1.44 0.42 3.54E-02 1.0 C7H15NO2 DL-Aminoheptanoic acid for each step. 496.3373 8.35 followed by increasing to 98% at 6 min, hold 98% B for 6 min, then decrease to 0.5% 40 732.5496 FIGURE 7. Searching MS/MS spectrum of component m/z 184.0607 in mzCloud. at 13 min, then equilibrate for another 2 min. 20 The entry of 4-pyridoxic acid matches the spectrum with the highest score. a b Abundance Relative 6 Step | Preconditioning LC under one polarity (+ or -) 0 Conclusion 1 0 2 4 6 8 10 12 4-Pyridoxic acid_pos_HCD_QE #141 RT: 1.26 AV: 1 NL: 2.70E8 Mass Spectrometry 1. Solvent blank spiked with IS, 5-10X | (check signal) T: FTMS + p ESI Full ms2 [email protected] [50.00-205.00] 166.0500 Control Time (min) 100 3 Solvent 2. QC mixture with IS, 3-5X | (check LC system status) 3h 95 An efficient and robust workflow for untargeted metabolomics is presented here. The The Thermo Scientific™ Q Exactive™ mass spectrometer was operated under sample 90 I blank with IS 3. Solvent blank with IS, 1X | (check carry over) 85 reliable high-resolution, accurate-mass (HR/AM) performance of the Q Exactive LC- 80 with IS #1 • If carry over, wash followed by repeating step 2 & 3 | (clean) electrospray ionization (ESI) positive, negative, and polarity switching modes. Full Comparing Differential Analysis Results from SIEVE 2.1 software and XCMS 75 MS system eliminates the need for technical replicates on biological samples. The • If no carry over, proceed to next step 70 scan (m/z 67–1000) used resolution 70,000 with automatic gain control (AGC) target Online 65 6 60 MS/MS of m/z 184.0606 from sample superior S/N in Orbitrap data allows efficient data reduction in SIEVE 2.1 software, of 110 ions and a maximum ion injection time (IT) of 35 ms. Data-dependent MS/MS 2 7 55 The results from SIEVE 2.1 software are compared to XCMS Online, an academia- 50 mzCloud search and score resulting in much reduced data analysis effort. KEGG pathway visualization allows were acquired on a “Top5” data Treated 45

-dependent mode using the following parameters: QC Step | Experiment I: Positive, R=70K Abundance Relative n sample 40 quick access to biological pathway mapping. The MS spectral library mzCloud 5 Mixture developed open-source software for metabolomics data analysis (Table 1) (2). As 35  II with IS 4. Solvent blank no IS, 1X | (for background subtraction) resolution 17,500; AGC 1 10 ions; maximum IT 80 ms; 2.0 amu isolation window; 30 148.0393 with IS facilitates accurate compound identification. ± #1 5. Pooled plasma with IS, 3X | (check method and column) 14h shown in Figure 5, the significant components (p-value <0.05, ratio > 2.0) identified by 25 normalized collision energy 35% 50%; underfill ratio 1.0%; Apex trigger 2–4 s, and 20 138.0551 184.0607 6. Control sample with IS #1, 1X | SIEVE 2.1 software and XCMS Online are similar, while the SIEVE software started 15 dynamic exclusion 6 s. Source ionization parameters were: spray voltage, 3.8 kV; 4 7. Treated sample with IS #1, 1X | 10  with a much smaller (~20 less) number of components because of its capability to 5 0 capillary temperature, 300 C; and S-Lens level, 45. 8. Run all biological replicates with a randomized order | (all) 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 4-pyridoxic acid Control m/z Solvent Control automatically remove solvent background, thus saving users time and labor in data References sampleControl III blank sample Data Analysis withsample IS review (Figure 6). 1. Wang S.Y., Kuo, C.H., Tseng, Y. J. Anal. Chem., 2013, 85 (2), 1037–1046. NO IS with IS #2with IS Step | Experiment II: MS/MS Positive 15m #2 2. Scripps Center For Metabolomics Home Page. https://xcmsonline.scripps.edu/index.php Differential analyses of the obese versus lean plasma were performed using Thermo 5 # n 9. Pooled plasma, top5 ddMS2 HCD | (dd-MS/MS) Metabolite Identification Guided by mzCloud Scientific™ SIEVE™ 2.1 software, which executes background subtraction, 3. m/zCloud Home Page. https://www.mzcloud.org/home.aspx n An example of SIEVE 2.1 software with KEGG pathway visualization demonstrated Treated 8 The mzCloud is the first cloud-based MS spectral trees library built for small molecule component detection, peak alignment, differential analysis (Figure 1). Statistical results, 9 Pooled Treated with component m/z 161.0918 is shown in Figure 8. All matching metabolites are – sampleTreated Step | Experiment III: Negative, R=70K 14h 4. Muoio, D.M.; Newgard, C.B. Nature Rev. Mol. Cell. Biol. 2008, 9, 193 205. IV plasma sample structural elucidation (3). It contains a standard reference database, an elucidated putative IDs, and pathways were generated through searching ChemSpider and withsample IS 4-8. Run all biological replicates as Experiment I | (all samples) labeled with black dots. with IS with IS KEGG™. Metabolites of interest were identified via MS/MS mass spectral database #2with IS compound database with putative structures, and a database of virtual spectra. KEGG is a trademark of Kyoto University, Kanehisa Lab. All other trademarks are the property of Thermo Fisher #2# n Q Exactive HCD MS/MS spectra of interested metabolites can be searched against Table 2 shows some of the representative metabolites that are significantly changed. matching. The *.raw files were converted to mzXML format using ProteoWizard and Scientific and its subsidiaries. Step | Experiment IV: MS/MS Negative 15m m/zCloud. The matching compounds are scored and the fragment ions are annotated Up-regulated metabolites in fatty vs. lean rat plasma are shown in green while down- also analyzed by XCMS Online [2] to compare the results. 10. Repeat as in Experiment II, step 9 | (dd-MS/MS) This information is not intended to encourage use of these products in any manners that might infringe the accordingly (Figure 7). regulated metabolites are shown in blue (p-value <0.05, ratio >2). intellectual property rights of others.

Thermo Scienti c Poster Note • PN ASMS13_T382_JWang_E 07/13S 3 Untargeted Metabolomics Workflow Using UHPLC/Quadrupole Orbitrap Mass Spectrometer and SIEVE 2.1 Software Junhua Wang, David A. Peake, Mark Sanders, Michael Athanas, Yingying Huang Thermo Fisher Scientific, San Jose, CA

FIGURE 1. Untargeted metabolomics workflow FIGURE 3. Metabolomics data analysis. SIEVE 2.1 software, Thermo Scientific™ Table 1. Statics of component m/z 170.081 from SIEVE 2.1 software FIGURE 8. SIEVE 2.1 software with KEGG pathway visualization Overview TraceFinder™ 3.1 software, and mzCloud support untargeted and targeted and XCMS Online Purpose: Demonstrate a generic, integrated workflow for untargeted metabolomics metabolomics workflows. study using a UHPLC/benchtop Thermo Scientific™ Orbitrap™ mass spectrometer SIEVE 2.1 Software XCMS Online Untargeted Targeted and informatics software. Ratio 5.0 Fold change 4.8

Differential Advanced p-value 3.44e-4 p-value 3.4e-4 Component Component Pathway (Statistical) Quantitation Statistical Detection Identification Annotation Introduction Analysis Analysis RT 2.03 min RT 2.03 min Metabolomics is a rapidly growing field of post-genomic biology, aiming to • SIEVE • SIEVE • SIEVE • SIEVE • SIEVE • SimcaP IDs: IDs: comprehensively characterize the small molecules in biological systems. Nonbiological • XCMS • mzCloud • TraceFinder • KEGG • Partek Norepinephrine Norepinephrine systematic biases from instrument calibration or the order of sample injection account • mzMine • METLIN • Cytoscape • GeneData Pyridoxine Pyridoxine (VB6) for the most significant errors in LC/TOF-MS data [1]. Here we present a workflow • R Oxidopiamine, etc. Hydroxydopamin, etc. using a UHPLC/benchtop quardrupole Orbitrap platform and automated data analysis software for untargeted metabolomic profiling of plasma samples for biomarker UHPLC provides fast chromatography for high throughput analysis, the typical peak discovery from the Zucker diabetes fatty (ZDF) rat model. The optimal conditions for width is 4–6 seconds. Our method can baseline resolve Isoleucine and Leucine, sample preparation, liquid chromatography (LC), column, mass spectrometry (MS), generating peak width 1.2 s at FWHM. Refer to Figure 4. For such narrow peaks, FIGURE 5. PCA by SIEVE 2.1 software (A), Numbers in SIEVE 2.1 software (B), and data processing parameters are explored. Q Exactive mass spectrometer operating at 70,000 resolution acquires >15 point across Numbers in XCMS Online (C) XIC the peak without losing sensitivity (A). C p-value: 3.76e-3 ChemSpider ID: 389841 Ratio: 2.86 (2S)-4-Acetamido-2- Methods The QC of each run was performed by monitoring the intensity of IS, d5-hippuric acid B Composition: C6H12N2O3 aminobutanoic acid (B), and the overall base peak (C). When all samples were finished, the selected ion A 38,268 features Sample Preparation Branched amino acids are known hallmarks of Type II Diabetes [3] chromatograms can be quickly viewed with RawMeat 2.1 (a free app with SIEVE 2.1 2,075 components 3,848 Plasma samples were deproteinized with organic solvent. Four extraction solvent software). By inspecting the RT and intensity of the IS, runs with large retention time p<0.05 systems including methanol (MeOH), acetonitrile (ACN), acetone, and 1:1:1 of the drift and bad injections can be excluded from the following data analysis. 403 Results TABLE 2. Representative metabolites that are significantly changed above were tested in this work. Endogenous metabolites were reconstituted in p<0.05 440 „real peaks‟ methanol/water (1:9) containing isotopically labeled internal standard (IS), d5-hippuric Challenges in Untargeted Metabolomics Study [M+H]+ Time Fatty/Lean P-value Δppm Formula Name 206 854.5657 7.25 2.81 3.65E-02 4.0 C50H80NO8P PC44:10 acid for LC-MS analysis. Solvent blank, pooled QC, and biological samples were . Complexity of biological samples FIGURE 4. Method validation, data quality control in metabolomics application 838.6292 8.84 2.48 6.45E-03 3.0 C48H88NO8P PC 40:4 836.6133 8.55 4.54 2.22E-02 0.0 C46H88NO8P PC38:2 analyzed in a randomized injection order. RT: 1.6471 - 1.8824 SM: 7B 212 . Leucine 812.6095 8.58 2.79 3.08E-02 4.0 C46H86NO8P PC 38:3 Diversity of small molecule metabolites: polar and non-polar analytes FWHM 1.5 s; 1.8147 NL: 132.1020 4.76E8 100 Ratio > 2 788.6127 9.04 2.13 3.56E-02 3.0 C44H86NO8P PC36:1 Take an aliquot of 200 µL plasma 7 Hz, Res m/z= 778.5351 7.20 4.15 3.38E-03 0.0 C42H78NO8P PC34:3 . Ionization requires both positive and negative ion 90 132.1012- A 70K 132.1026 768.5498 8.01 2.14 3.17E-04 4.0 C43H78NO8P PC35:4 80 MS . Δ Urine_Junh 764.5196 7.44 3.71 6.37E-04 3.0 C43H74NO8P PC35:6 Wide range of concentrations 70 ppm 0.8 ua_1_10_4 754.5347 7.29 2.67 1.35E-02 3.0 C42H76NO8P PC34:4   60 Isoleucine 834.5936 8.10 2.83 7.13E-04 0.0 C44H83NO13 LacCer(d18:1/14:0) Add 600 µL (3 ) of cold 4 C organic solvent and vortex (3 min) . 1.7255 FIGURE 6. SIEVE 2.1 software output of component m/z 170.081, showing No universal method for chromatographic separation 50 132.1020 836.6041 8.01 2.42 1.10E-03 4.0 C44H85NO13 LacCer(d18:0/14:0) 813.6790 9.72 2.40 3.84E-02 4.0 C47H93N2O6P SM(d18:2/24:0) 40 alignment across different samples (A), adducts grouping (B), peak . 675.5412 7.20 2.29 3.34E-02 3.0 C37H75N2O6P SM(d16:1/16:0)

Multiple sources of variability Abundance Relative 30 integration (C), and trend intensity view (D) 546.3525 5.84 2.33 9.23E-03 4.0 C28H52NO7P LysoPC(20:3) Spin down at 4 C, 4000  g (5 min) 20 . Structure elucidation of unknowns is expensive: lack of synthetic standards 526.2910 5.69 2.42 1.67E-04 3.0 C27H44NO7P LysoPE(22:6) 10 A 153.0521 1.79 2.46 1.56E-03 3.0 C4H12N2S2 Cystamine 0 C 230.0954 0.80 2.53 2.80E-02 1.0 C9H15N3O2S Ergothioneine Preparing for the UHPLC-MS Data Acquisition 1.65 1.70 1.75 1.80 1.85 245.0917 3.56 3.93 2.16E-03 1.0 C13H12N2O3 Haematopodin Take supernatant and divide into 3 portions (each 200 µL), dry down (30–60 min) Time (min) 172.1693 5.10 3.18 4.19E-02 1.0 C10H21NO decanamide RT: 0.00 - 12.01 SM: 7B 302.3043 5.48 2.28 2.50E-02 3.0 C18H39NO2 Sphinganine Prior to the real samples analysis, a solvent blank with internal standard (IS) is injected 2.83 NL: 185.0966 1.94E8 204.1228 3.75 3.10 1.83E-02 1.0 C9H17NO4 Acetylcarnitine at the beginning to check the solvent and the LC-MS status. The injections of the real 100 m/z= 232.1534 2.26 4.49 1.99E-03 2.0 C11H21NO4 Butyryl-L-carnitine 185.0957- 344.2786 5.31 2.13 1.57E-02 2.0 C19H37NO4 Lauroylcarnitine Reconstitute in 400 µL (9:1) H2O:CH3OH containing IS, vortex, and sonicate (2 min) samples should be randomized in order to eliminate systematic bias. Triplicate 80 B 185.0975 B 400.3409 5.53 2.10 3.49E-04 2.0 C23H45NO4 Palmitoyl-L-carnitine injections of the pooled plasma are intermittently repeated throughout the whole batch 60 IS=d5-hippuric acid MS ZDF_F1_2 D 466.3160 4.97 0.07 3.52E-03 0.0 C26H43NO6 Glycocholic Acid to validate consistent performance of the overall system. The experimental design and 40 357.2780 5.39 0.15 3.03E-03 2.0 C24H36O2 THA 15 features grouped 355.2626 5.13 0.17 8.89E-03 1.0 C24H34O2 delta2-THA Liquid Chromatography 20 run sequence are shown in Figure 2. to one component 170.0810 2.03 0.20 3.44E-04 0.0 C8H11NO3 Norepinephrine, 5-Hydroxydopamine Relative Abundance Relative 0 184.0968 2.69 0.23 2.70E-02 1.0 C9H13NO3 Normetanephrine; Methylnoradrenaline UHPLC separation was implemented on a Thermo Scientific™ Dionex™ UltiMate™ 2.48 NL: pressure gradient) pump using Thermo Scientific™ GOLD™ 205.0967 3.17E9 212.1279 3.74 0.34 4.04E-03 1.0 C11H17NO3 Methoxamine 3000 HPG (high- Hypersil 100 m/z= 226.1433 4.20 0.29 8.10E-03 1.0 C12H19NO3 N-Methylmescaline; Terbutaline  FIGURE 2. UHPLC/MS Experimental Design and Run Sequence. Left, schematic Base Peak 80.0000- RP C18 column at 450 µL/min, column temperature at 55 C. LC solvents were 0.1% 80 11.31 161.0918 2.45 0.35 3.76E-03 2.0 C6H12N2O3 (2S)-4-Acetamido-2-aminobutanoic acid showing the vials and sample names. Right, detailed content and overall time C 6.38 149.0117 1200.0000 224.0915 2.59 0.34 3.56E-02 1.0 C11H13NO4 Acetyl-L-tyrosine – 60 391.2823 MS FA (A) and 0.1% FA in MeOH (B). Apply linear gradient from 0.5 50% B for 5.5 min, 5.88 ZDF_F1_2 146.1173 1.44 0.42 3.54E-02 1.0 C7H15NO2 DL-Aminoheptanoic acid for each step. 496.3373 8.35 followed by increasing to 98% at 6 min, hold 98% B for 6 min, then decrease to 0.5% 40 732.5496 FIGURE 7. Searching MS/MS spectrum of component m/z 184.0607 in mzCloud. at 13 min, then equilibrate for another 2 min. 20 The entry of 4-pyridoxic acid matches the spectrum with the highest score. a b Abundance Relative 6 Step | Preconditioning LC under one polarity (+ or -) 0 Conclusion 1 0 2 4 6 8 10 12 4-Pyridoxic acid_pos_HCD_QE #141 RT: 1.26 AV: 1 NL: 2.70E8 Mass Spectrometry 1. Solvent blank spiked with IS, 5-10X | (check signal) T: FTMS + p ESI Full ms2 [email protected] [50.00-205.00] 166.0500 Control Time (min) 100 3 Solvent 2. QC mixture with IS, 3-5X | (check LC system status) 3h 95 An efficient and robust workflow for untargeted metabolomics is presented here. The The Thermo Scientific™ Q Exactive™ mass spectrometer was operated under sample 90 I blank with IS 3. Solvent blank with IS, 1X | (check carry over) 85 reliable high-resolution, accurate-mass (HR/AM) performance of the Q Exactive LC- 80 with IS #1 • If carry over, wash followed by repeating step 2 & 3 | (clean) electrospray ionization (ESI) positive, negative, and polarity switching modes. Full Comparing Differential Analysis Results from SIEVE 2.1 software and XCMS 75 MS system eliminates the need for technical replicates on biological samples. The • If no carry over, proceed to next step 70 scan (m/z 67–1000) used resolution 70,000 with automatic gain control (AGC) target Online 65 6 60 MS/MS of m/z 184.0606 from sample superior S/N in Orbitrap data allows efficient data reduction in SIEVE 2.1 software, of 110 ions and a maximum ion injection time (IT) of 35 ms. Data-dependent MS/MS 2 7 55 The results from SIEVE 2.1 software are compared to XCMS Online, an academia- 50 mzCloud search and score resulting in much reduced data analysis effort. KEGG pathway visualization allows were acquired on a “Top5” data Treated 45

-dependent mode using the following parameters: QC Step | Experiment I: Positive, R=70K Abundance Relative n sample 40 quick access to biological pathway mapping. The MS spectral library mzCloud 5 Mixture developed open-source software for metabolomics data analysis (Table 1) (2). As 35  II with IS 4. Solvent blank no IS, 1X | (for background subtraction) resolution 17,500; AGC 1 10 ions; maximum IT 80 ms; 2.0 amu isolation window; 30 148.0393 with IS facilitates accurate compound identification. ± #1 5. Pooled plasma with IS, 3X | (check method and column) 14h shown in Figure 5, the significant components (p-value <0.05, ratio > 2.0) identified by 25 normalized collision energy 35% 50%; underfill ratio 1.0%; Apex trigger 2–4 s, and 20 138.0551 184.0607 6. Control sample with IS #1, 1X | SIEVE 2.1 software and XCMS Online are similar, while the SIEVE software started 15 dynamic exclusion 6 s. Source ionization parameters were: spray voltage, 3.8 kV; 4 7. Treated sample with IS #1, 1X | 10  with a much smaller (~20 less) number of components because of its capability to 5 0 capillary temperature, 300 C; and S-Lens level, 45. 8. Run all biological replicates with a randomized order | (all) 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 4-pyridoxic acid Control m/z Solvent Control automatically remove solvent background, thus saving users time and labor in data References sampleControl III blank sample Data Analysis withsample IS review (Figure 6). 1. Wang S.Y., Kuo, C.H., Tseng, Y. J. Anal. Chem., 2013, 85 (2), 1037–1046. NO IS with IS #2with IS Step | Experiment II: MS/MS Positive 15m #2 2. Scripps Center For Metabolomics Home Page. https://xcmsonline.scripps.edu/index.php Differential analyses of the obese versus lean plasma were performed using Thermo 5 # n 9. Pooled plasma, top5 ddMS2 HCD | (dd-MS/MS) Metabolite Identification Guided by mzCloud Scientific™ SIEVE™ 2.1 software, which executes background subtraction, 3. m/zCloud Home Page. https://www.mzcloud.org/home.aspx n An example of SIEVE 2.1 software with KEGG pathway visualization demonstrated Treated 8 The mzCloud is the first cloud-based MS spectral trees library built for small molecule component detection, peak alignment, differential analysis (Figure 1). Statistical results, 9 Pooled Treated with component m/z 161.0918 is shown in Figure 8. All matching metabolites are – sampleTreated Step | Experiment III: Negative, R=70K 14h 4. Muoio, D.M.; Newgard, C.B. Nature Rev. Mol. Cell. Biol. 2008, 9, 193 205. IV plasma sample structural elucidation (3). It contains a standard reference database, an elucidated putative IDs, and pathways were generated through searching ChemSpider and withsample IS 4-8. Run all biological replicates as Experiment I | (all samples) labeled with black dots. with IS with IS KEGG™. Metabolites of interest were identified via MS/MS mass spectral database #2with IS compound database with putative structures, and a database of virtual spectra. KEGG is a trademark of Kyoto University, Kanehisa Lab. All other trademarks are the property of Thermo Fisher #2# n Q Exactive HCD MS/MS spectra of interested metabolites can be searched against Table 2 shows some of the representative metabolites that are significantly changed. matching. The *.raw files were converted to mzXML format using ProteoWizard and Scientific and its subsidiaries. Step | Experiment IV: MS/MS Negative 15m m/zCloud. The matching compounds are scored and the fragment ions are annotated Up-regulated metabolites in fatty vs. lean rat plasma are shown in green while down- also analyzed by XCMS Online [2] to compare the results. 10. Repeat as in Experiment II, step 9 | (dd-MS/MS) This information is not intended to encourage use of these products in any manners that might infringe the accordingly (Figure 7). regulated metabolites are shown in blue (p-value <0.05, ratio >2). intellectual property rights of others.

4 Untargeted Metabolomics Work ow Using UHPLC/Quadrupole Orbitrap Mass Spectrometer and SIEVE 2.1 Software Untargeted Metabolomics Workflow Using UHPLC/Quadrupole Orbitrap Mass Spectrometer and SIEVE 2.1 Software Junhua Wang, David A. Peake, Mark Sanders, Michael Athanas, Yingying Huang Thermo Fisher Scientific, San Jose, CA

FIGURE 1. Untargeted metabolomics workflow FIGURE 3. Metabolomics data analysis. SIEVE 2.1 software, Thermo Scientific™ Table 1. Statics of component m/z 170.081 from SIEVE 2.1 software FIGURE 8. SIEVE 2.1 software with KEGG pathway visualization Overview TraceFinder™ 3.1 software, and mzCloud support untargeted and targeted and XCMS Online Purpose: Demonstrate a generic, integrated workflow for untargeted metabolomics metabolomics workflows. study using a UHPLC/benchtop Thermo Scientific™ Orbitrap™ mass spectrometer SIEVE 2.1 Software XCMS Online Untargeted Targeted and informatics software. Ratio 5.0 Fold change 4.8

Differential Advanced p-value 3.44e-4 p-value 3.4e-4 Component Component Pathway (Statistical) Quantitation Statistical Detection Identification Annotation Introduction Analysis Analysis RT 2.03 min RT 2.03 min Metabolomics is a rapidly growing field of post-genomic biology, aiming to • SIEVE • SIEVE • SIEVE • SIEVE • SIEVE • SimcaP IDs: IDs: comprehensively characterize the small molecules in biological systems. Nonbiological • XCMS • mzCloud • TraceFinder • KEGG • Partek Norepinephrine Norepinephrine systematic biases from instrument calibration or the order of sample injection account • mzMine • METLIN • Cytoscape • GeneData Pyridoxine Pyridoxine (VB6) for the most significant errors in LC/TOF-MS data [1]. Here we present a workflow • R Oxidopiamine, etc. Hydroxydopamin, etc. using a UHPLC/benchtop quardrupole Orbitrap platform and automated data analysis software for untargeted metabolomic profiling of plasma samples for biomarker UHPLC provides fast chromatography for high throughput analysis, the typical peak discovery from the Zucker diabetes fatty (ZDF) rat model. The optimal conditions for width is 4–6 seconds. Our method can baseline resolve Isoleucine and Leucine, sample preparation, liquid chromatography (LC), column, mass spectrometry (MS), generating peak width 1.2 s at FWHM. Refer to Figure 4. For such narrow peaks, FIGURE 5. PCA by SIEVE 2.1 software (A), Numbers in SIEVE 2.1 software (B), and data processing parameters are explored. Q Exactive mass spectrometer operating at 70,000 resolution acquires >15 point across Numbers in XCMS Online (C) XIC the peak without losing sensitivity (A). C p-value: 3.76e-3 ChemSpider ID: 389841 Ratio: 2.86 (2S)-4-Acetamido-2- Methods The QC of each run was performed by monitoring the intensity of IS, d5-hippuric acid B Composition: C6H12N2O3 aminobutanoic acid (B), and the overall base peak (C). When all samples were finished, the selected ion A 38,268 features Sample Preparation Branched amino acids are known hallmarks of Type II Diabetes [3] chromatograms can be quickly viewed with RawMeat 2.1 (a free app with SIEVE 2.1 2,075 components 3,848 Plasma samples were deproteinized with organic solvent. Four extraction solvent software). By inspecting the RT and intensity of the IS, runs with large retention time p<0.05 systems including methanol (MeOH), acetonitrile (ACN), acetone, and 1:1:1 of the drift and bad injections can be excluded from the following data analysis. 403 Results TABLE 2. Representative metabolites that are significantly changed above were tested in this work. Endogenous metabolites were reconstituted in p<0.05 440 „real peaks‟ methanol/water (1:9) containing isotopically labeled internal standard (IS), d5-hippuric Challenges in Untargeted Metabolomics Study [M+H]+ Time Fatty/Lean P-value Δppm Formula Name 206 854.5657 7.25 2.81 3.65E-02 4.0 C50H80NO8P PC44:10 acid for LC-MS analysis. Solvent blank, pooled QC, and biological samples were . Complexity of biological samples FIGURE 4. Method validation, data quality control in metabolomics application 838.6292 8.84 2.48 6.45E-03 3.0 C48H88NO8P PC 40:4 836.6133 8.55 4.54 2.22E-02 0.0 C46H88NO8P PC38:2 analyzed in a randomized injection order. RT: 1.6471 - 1.8824 SM: 7B 212 . Leucine 812.6095 8.58 2.79 3.08E-02 4.0 C46H86NO8P PC 38:3 Diversity of small molecule metabolites: polar and non-polar analytes FWHM 1.5 s; 1.8147 NL: 132.1020 4.76E8 100 Ratio > 2 788.6127 9.04 2.13 3.56E-02 3.0 C44H86NO8P PC36:1 Take an aliquot of 200 µL plasma 7 Hz, Res m/z= 778.5351 7.20 4.15 3.38E-03 0.0 C42H78NO8P PC34:3 . Ionization requires both positive and negative ion 90 132.1012- A 70K 132.1026 768.5498 8.01 2.14 3.17E-04 4.0 C43H78NO8P PC35:4 80 MS . Δ Urine_Junh 764.5196 7.44 3.71 6.37E-04 3.0 C43H74NO8P PC35:6 Wide range of concentrations 70 ppm 0.8 ua_1_10_4 754.5347 7.29 2.67 1.35E-02 3.0 C42H76NO8P PC34:4   60 Isoleucine 834.5936 8.10 2.83 7.13E-04 0.0 C44H83NO13 LacCer(d18:1/14:0) Add 600 µL (3 ) of cold 4 C organic solvent and vortex (3 min) . 1.7255 FIGURE 6. SIEVE 2.1 software output of component m/z 170.081, showing No universal method for chromatographic separation 50 132.1020 836.6041 8.01 2.42 1.10E-03 4.0 C44H85NO13 LacCer(d18:0/14:0) 813.6790 9.72 2.40 3.84E-02 4.0 C47H93N2O6P SM(d18:2/24:0) 40 alignment across different samples (A), adducts grouping (B), peak . 675.5412 7.20 2.29 3.34E-02 3.0 C37H75N2O6P SM(d16:1/16:0)

Multiple sources of variability Abundance Relative 30 integration (C), and trend intensity view (D) 546.3525 5.84 2.33 9.23E-03 4.0 C28H52NO7P LysoPC(20:3) Spin down at 4 C, 4000  g (5 min) 20 . Structure elucidation of unknowns is expensive: lack of synthetic standards 526.2910 5.69 2.42 1.67E-04 3.0 C27H44NO7P LysoPE(22:6) 10 A 153.0521 1.79 2.46 1.56E-03 3.0 C4H12N2S2 Cystamine 0 C 230.0954 0.80 2.53 2.80E-02 1.0 C9H15N3O2S Ergothioneine Preparing for the UHPLC-MS Data Acquisition 1.65 1.70 1.75 1.80 1.85 245.0917 3.56 3.93 2.16E-03 1.0 C13H12N2O3 Haematopodin Take supernatant and divide into 3 portions (each 200 µL), dry down (30–60 min) Time (min) 172.1693 5.10 3.18 4.19E-02 1.0 C10H21NO decanamide RT: 0.00 - 12.01 SM: 7B 302.3043 5.48 2.28 2.50E-02 3.0 C18H39NO2 Sphinganine Prior to the real samples analysis, a solvent blank with internal standard (IS) is injected 2.83 NL: 185.0966 1.94E8 204.1228 3.75 3.10 1.83E-02 1.0 C9H17NO4 Acetylcarnitine at the beginning to check the solvent and the LC-MS status. The injections of the real 100 m/z= 232.1534 2.26 4.49 1.99E-03 2.0 C11H21NO4 Butyryl-L-carnitine 185.0957- 344.2786 5.31 2.13 1.57E-02 2.0 C19H37NO4 Lauroylcarnitine Reconstitute in 400 µL (9:1) H2O:CH3OH containing IS, vortex, and sonicate (2 min) samples should be randomized in order to eliminate systematic bias. Triplicate 80 B 185.0975 B 400.3409 5.53 2.10 3.49E-04 2.0 C23H45NO4 Palmitoyl-L-carnitine injections of the pooled plasma are intermittently repeated throughout the whole batch 60 IS=d5-hippuric acid MS ZDF_F1_2 D 466.3160 4.97 0.07 3.52E-03 0.0 C26H43NO6 Glycocholic Acid to validate consistent performance of the overall system. The experimental design and 40 357.2780 5.39 0.15 3.03E-03 2.0 C24H36O2 THA 15 features grouped 355.2626 5.13 0.17 8.89E-03 1.0 C24H34O2 delta2-THA Liquid Chromatography 20 run sequence are shown in Figure 2. to one component 170.0810 2.03 0.20 3.44E-04 0.0 C8H11NO3 Norepinephrine, 5-Hydroxydopamine Relative Abundance Relative 0 184.0968 2.69 0.23 2.70E-02 1.0 C9H13NO3 Normetanephrine; Methylnoradrenaline UHPLC separation was implemented on a Thermo Scientific™ Dionex™ UltiMate™ 2.48 NL: pressure gradient) pump using Thermo Scientific™ GOLD™ 205.0967 3.17E9 212.1279 3.74 0.34 4.04E-03 1.0 C11H17NO3 Methoxamine 3000 HPG (high- Hypersil 100 m/z= 226.1433 4.20 0.29 8.10E-03 1.0 C12H19NO3 N-Methylmescaline; Terbutaline  FIGURE 2. UHPLC/MS Experimental Design and Run Sequence. Left, schematic Base Peak 80.0000- RP C18 column at 450 µL/min, column temperature at 55 C. LC solvents were 0.1% 80 11.31 161.0918 2.45 0.35 3.76E-03 2.0 C6H12N2O3 (2S)-4-Acetamido-2-aminobutanoic acid showing the vials and sample names. Right, detailed content and overall time C 6.38 149.0117 1200.0000 224.0915 2.59 0.34 3.56E-02 1.0 C11H13NO4 Acetyl-L-tyrosine – 60 391.2823 MS FA (A) and 0.1% FA in MeOH (B). Apply linear gradient from 0.5 50% B for 5.5 min, 5.88 ZDF_F1_2 146.1173 1.44 0.42 3.54E-02 1.0 C7H15NO2 DL-Aminoheptanoic acid for each step. 496.3373 8.35 followed by increasing to 98% at 6 min, hold 98% B for 6 min, then decrease to 0.5% 40 732.5496 FIGURE 7. Searching MS/MS spectrum of component m/z 184.0607 in mzCloud. at 13 min, then equilibrate for another 2 min. 20 The entry of 4-pyridoxic acid matches the spectrum with the highest score. a b Abundance Relative 6 Step | Preconditioning LC under one polarity (+ or -) 0 Conclusion 1 0 2 4 6 8 10 12 4-Pyridoxic acid_pos_HCD_QE #141 RT: 1.26 AV: 1 NL: 2.70E8 Mass Spectrometry 1. Solvent blank spiked with IS, 5-10X | (check signal) T: FTMS + p ESI Full ms2 [email protected] [50.00-205.00] 166.0500 Control Time (min) 100 3 Solvent 2. QC mixture with IS, 3-5X | (check LC system status) 3h 95 An efficient and robust workflow for untargeted metabolomics is presented here. The The Thermo Scientific™ Q Exactive™ mass spectrometer was operated under sample 90 I blank with IS 3. Solvent blank with IS, 1X | (check carry over) 85 reliable high-resolution, accurate-mass (HR/AM) performance of the Q Exactive LC- 80 with IS #1 • If carry over, wash followed by repeating step 2 & 3 | (clean) electrospray ionization (ESI) positive, negative, and polarity switching modes. Full Comparing Differential Analysis Results from SIEVE 2.1 software and XCMS 75 MS system eliminates the need for technical replicates on biological samples. The • If no carry over, proceed to next step 70 scan (m/z 67–1000) used resolution 70,000 with automatic gain control (AGC) target Online 65 6 60 MS/MS of m/z 184.0606 from sample superior S/N in Orbitrap data allows efficient data reduction in SIEVE 2.1 software, of 110 ions and a maximum ion injection time (IT) of 35 ms. Data-dependent MS/MS 2 7 55 The results from SIEVE 2.1 software are compared to XCMS Online, an academia- 50 mzCloud search and score resulting in much reduced data analysis effort. KEGG pathway visualization allows were acquired on a “Top5” data Treated 45

-dependent mode using the following parameters: QC Step | Experiment I: Positive, R=70K Abundance Relative n sample 40 quick access to biological pathway mapping. The MS spectral library mzCloud 5 Mixture developed open-source software for metabolomics data analysis (Table 1) (2). As 35  II with IS 4. Solvent blank no IS, 1X | (for background subtraction) resolution 17,500; AGC 1 10 ions; maximum IT 80 ms; 2.0 amu isolation window; 30 148.0393 with IS facilitates accurate compound identification. ± #1 5. Pooled plasma with IS, 3X | (check method and column) 14h shown in Figure 5, the significant components (p-value <0.05, ratio > 2.0) identified by 25 normalized collision energy 35% 50%; underfill ratio 1.0%; Apex trigger 2–4 s, and 20 138.0551 184.0607 6. Control sample with IS #1, 1X | SIEVE 2.1 software and XCMS Online are similar, while the SIEVE software started 15 dynamic exclusion 6 s. Source ionization parameters were: spray voltage, 3.8 kV; 4 7. Treated sample with IS #1, 1X | 10  with a much smaller (~20 less) number of components because of its capability to 5 0 capillary temperature, 300 C; and S-Lens level, 45. 8. Run all biological replicates with a randomized order | (all) 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 4-pyridoxic acid Control m/z Solvent Control automatically remove solvent background, thus saving users time and labor in data References sampleControl III blank sample Data Analysis withsample IS review (Figure 6). 1. Wang S.Y., Kuo, C.H., Tseng, Y. J. Anal. Chem., 2013, 85 (2), 1037–1046. NO IS with IS #2with IS Step | Experiment II: MS/MS Positive 15m #2 2. Scripps Center For Metabolomics Home Page. https://xcmsonline.scripps.edu/index.php Differential analyses of the obese versus lean plasma were performed using Thermo 5 # n 9. Pooled plasma, top5 ddMS2 HCD | (dd-MS/MS) Metabolite Identification Guided by mzCloud Scientific™ SIEVE™ 2.1 software, which executes background subtraction, 3. m/zCloud Home Page. https://www.mzcloud.org/home.aspx n An example of SIEVE 2.1 software with KEGG pathway visualization demonstrated Treated 8 The mzCloud is the first cloud-based MS spectral trees library built for small molecule component detection, peak alignment, differential analysis (Figure 1). Statistical results, 9 Pooled Treated with component m/z 161.0918 is shown in Figure 8. All matching metabolites are – sampleTreated Step | Experiment III: Negative, R=70K 14h 4. Muoio, D.M.; Newgard, C.B. Nature Rev. Mol. Cell. Biol. 2008, 9, 193 205. IV plasma sample structural elucidation (3). It contains a standard reference database, an elucidated putative IDs, and pathways were generated through searching ChemSpider and withsample IS 4-8. Run all biological replicates as Experiment I | (all samples) labeled with black dots. with IS with IS KEGG™. Metabolites of interest were identified via MS/MS mass spectral database #2with IS compound database with putative structures, and a database of virtual spectra. KEGG is a trademark of Kyoto University, Kanehisa Lab. All other trademarks are the property of Thermo Fisher #2# n Q Exactive HCD MS/MS spectra of interested metabolites can be searched against Table 2 shows some of the representative metabolites that are significantly changed. matching. The *.raw files were converted to mzXML format using ProteoWizard and Scientific and its subsidiaries. Step | Experiment IV: MS/MS Negative 15m m/zCloud. The matching compounds are scored and the fragment ions are annotated Up-regulated metabolites in fatty vs. lean rat plasma are shown in green while down- also analyzed by XCMS Online [2] to compare the results. 10. Repeat as in Experiment II, step 9 | (dd-MS/MS) This information is not intended to encourage use of these products in any manners that might infringe the accordingly (Figure 7). regulated metabolites are shown in blue (p-value <0.05, ratio >2). intellectual property rights of others.

Thermo Scienti c Poster Note • PN ASMS13_T382_JWang_E 07/13S 5 Untargeted Metabolomics Workflow Using UHPLC/Quadrupole Orbitrap Mass Spectrometer and SIEVE 2.1 Software Junhua Wang, David A. Peake, Mark Sanders, Michael Athanas, Yingying Huang Thermo Fisher Scientific, San Jose, CA

FIGURE 1. Untargeted metabolomics workflow FIGURE 3. Metabolomics data analysis. SIEVE 2.1 software, Thermo Scientific™ Table 1. Statics of component m/z 170.081 from SIEVE 2.1 software FIGURE 8. SIEVE 2.1 software with KEGG pathway visualization Overview TraceFinder™ 3.1 software, and mzCloud support untargeted and targeted and XCMS Online Purpose: Demonstrate a generic, integrated workflow for untargeted metabolomics metabolomics workflows. study using a UHPLC/benchtop Thermo Scientific™ Orbitrap™ mass spectrometer SIEVE 2.1 Software XCMS Online Untargeted Targeted and informatics software. Ratio 5.0 Fold change 4.8

Differential Advanced p-value 3.44e-4 p-value 3.4e-4 Component Component Pathway (Statistical) Quantitation Statistical Detection Identification Annotation Introduction Analysis Analysis RT 2.03 min RT 2.03 min Metabolomics is a rapidly growing field of post-genomic biology, aiming to • SIEVE • SIEVE • SIEVE • SIEVE • SIEVE • SimcaP IDs: IDs: comprehensively characterize the small molecules in biological systems. Nonbiological • XCMS • mzCloud • TraceFinder • KEGG • Partek Norepinephrine Norepinephrine systematic biases from instrument calibration or the order of sample injection account • mzMine • METLIN • Cytoscape • GeneData Pyridoxine Pyridoxine (VB6) for the most significant errors in LC/TOF-MS data [1]. Here we present a workflow • R Oxidopiamine, etc. Hydroxydopamin, etc. using a UHPLC/benchtop quardrupole Orbitrap platform and automated data analysis software for untargeted metabolomic profiling of plasma samples for biomarker UHPLC provides fast chromatography for high throughput analysis, the typical peak discovery from the Zucker diabetes fatty (ZDF) rat model. The optimal conditions for width is 4–6 seconds. Our method can baseline resolve Isoleucine and Leucine, sample preparation, liquid chromatography (LC), column, mass spectrometry (MS), generating peak width 1.2 s at FWHM. Refer to Figure 4. For such narrow peaks, FIGURE 5. PCA by SIEVE 2.1 software (A), Numbers in SIEVE 2.1 software (B), and data processing parameters are explored. Q Exactive mass spectrometer operating at 70,000 resolution acquires >15 point across Numbers in XCMS Online (C) XIC the peak without losing sensitivity (A). C p-value: 3.76e-3 ChemSpider ID: 389841 Ratio: 2.86 (2S)-4-Acetamido-2- Methods The QC of each run was performed by monitoring the intensity of IS, d5-hippuric acid B Composition: C6H12N2O3 aminobutanoic acid (B), and the overall base peak (C). When all samples were finished, the selected ion A 38,268 features Sample Preparation Branched amino acids are known hallmarks of Type II Diabetes [3] chromatograms can be quickly viewed with RawMeat 2.1 (a free app with SIEVE 2.1 2,075 components 3,848 Plasma samples were deproteinized with organic solvent. Four extraction solvent software). By inspecting the RT and intensity of the IS, runs with large retention time p<0.05 systems including methanol (MeOH), acetonitrile (ACN), acetone, and 1:1:1 of the drift and bad injections can be excluded from the following data analysis. 403 Results TABLE 2. Representative metabolites that are significantly changed above were tested in this work. Endogenous metabolites were reconstituted in p<0.05 440 „real peaks‟ methanol/water (1:9) containing isotopically labeled internal standard (IS), d5-hippuric Challenges in Untargeted Metabolomics Study [M+H]+ Time Fatty/Lean P-value Δppm Formula Name 206 854.5657 7.25 2.81 3.65E-02 4.0 C50H80NO8P PC44:10 acid for LC-MS analysis. Solvent blank, pooled QC, and biological samples were . Complexity of biological samples FIGURE 4. Method validation, data quality control in metabolomics application 838.6292 8.84 2.48 6.45E-03 3.0 C48H88NO8P PC 40:4 836.6133 8.55 4.54 2.22E-02 0.0 C46H88NO8P PC38:2 analyzed in a randomized injection order. RT: 1.6471 - 1.8824 SM: 7B 212 . Leucine 812.6095 8.58 2.79 3.08E-02 4.0 C46H86NO8P PC 38:3 Diversity of small molecule metabolites: polar and non-polar analytes FWHM 1.5 s; 1.8147 NL: 132.1020 4.76E8 100 Ratio > 2 788.6127 9.04 2.13 3.56E-02 3.0 C44H86NO8P PC36:1 Take an aliquot of 200 µL plasma 7 Hz, Res m/z= 778.5351 7.20 4.15 3.38E-03 0.0 C42H78NO8P PC34:3 . Ionization requires both positive and negative ion 90 132.1012- A 70K 132.1026 768.5498 8.01 2.14 3.17E-04 4.0 C43H78NO8P PC35:4 80 MS . Δ Urine_Junh 764.5196 7.44 3.71 6.37E-04 3.0 C43H74NO8P PC35:6 Wide range of concentrations 70 ppm 0.8 ua_1_10_4 754.5347 7.29 2.67 1.35E-02 3.0 C42H76NO8P PC34:4   60 Isoleucine 834.5936 8.10 2.83 7.13E-04 0.0 C44H83NO13 LacCer(d18:1/14:0) Add 600 µL (3 ) of cold 4 C organic solvent and vortex (3 min) . 1.7255 FIGURE 6. SIEVE 2.1 software output of component m/z 170.081, showing No universal method for chromatographic separation 50 132.1020 836.6041 8.01 2.42 1.10E-03 4.0 C44H85NO13 LacCer(d18:0/14:0) 813.6790 9.72 2.40 3.84E-02 4.0 C47H93N2O6P SM(d18:2/24:0) 40 alignment across different samples (A), adducts grouping (B), peak . 675.5412 7.20 2.29 3.34E-02 3.0 C37H75N2O6P SM(d16:1/16:0)

Multiple sources of variability Abundance Relative 30 integration (C), and trend intensity view (D) 546.3525 5.84 2.33 9.23E-03 4.0 C28H52NO7P LysoPC(20:3) Spin down at 4 C, 4000  g (5 min) 20 . Structure elucidation of unknowns is expensive: lack of synthetic standards 526.2910 5.69 2.42 1.67E-04 3.0 C27H44NO7P LysoPE(22:6) 10 A 153.0521 1.79 2.46 1.56E-03 3.0 C4H12N2S2 Cystamine 0 C 230.0954 0.80 2.53 2.80E-02 1.0 C9H15N3O2S Ergothioneine Preparing for the UHPLC-MS Data Acquisition 1.65 1.70 1.75 1.80 1.85 245.0917 3.56 3.93 2.16E-03 1.0 C13H12N2O3 Haematopodin Take supernatant and divide into 3 portions (each 200 µL), dry down (30–60 min) Time (min) 172.1693 5.10 3.18 4.19E-02 1.0 C10H21NO decanamide RT: 0.00 - 12.01 SM: 7B 302.3043 5.48 2.28 2.50E-02 3.0 C18H39NO2 Sphinganine Prior to the real samples analysis, a solvent blank with internal standard (IS) is injected 2.83 NL: 185.0966 1.94E8 204.1228 3.75 3.10 1.83E-02 1.0 C9H17NO4 Acetylcarnitine at the beginning to check the solvent and the LC-MS status. The injections of the real 100 m/z= 232.1534 2.26 4.49 1.99E-03 2.0 C11H21NO4 Butyryl-L-carnitine 185.0957- 344.2786 5.31 2.13 1.57E-02 2.0 C19H37NO4 Lauroylcarnitine Reconstitute in 400 µL (9:1) H2O:CH3OH containing IS, vortex, and sonicate (2 min) samples should be randomized in order to eliminate systematic bias. Triplicate 80 B 185.0975 B 400.3409 5.53 2.10 3.49E-04 2.0 C23H45NO4 Palmitoyl-L-carnitine injections of the pooled plasma are intermittently repeated throughout the whole batch 60 IS=d5-hippuric acid MS ZDF_F1_2 D 466.3160 4.97 0.07 3.52E-03 0.0 C26H43NO6 Glycocholic Acid to validate consistent performance of the overall system. The experimental design and 40 357.2780 5.39 0.15 3.03E-03 2.0 C24H36O2 THA 15 features grouped 355.2626 5.13 0.17 8.89E-03 1.0 C24H34O2 delta2-THA Liquid Chromatography 20 run sequence are shown in Figure 2. to one component 170.0810 2.03 0.20 3.44E-04 0.0 C8H11NO3 Norepinephrine, 5-Hydroxydopamine Relative Abundance Relative 0 184.0968 2.69 0.23 2.70E-02 1.0 C9H13NO3 Normetanephrine; Methylnoradrenaline UHPLC separation was implemented on a Thermo Scientific™ Dionex™ UltiMate™ 2.48 NL: pressure gradient) pump using Thermo Scientific™ GOLD™ 205.0967 3.17E9 212.1279 3.74 0.34 4.04E-03 1.0 C11H17NO3 Methoxamine 3000 HPG (high- Hypersil 100 m/z= 226.1433 4.20 0.29 8.10E-03 1.0 C12H19NO3 N-Methylmescaline; Terbutaline  FIGURE 2. UHPLC/MS Experimental Design and Run Sequence. Left, schematic Base Peak 80.0000- RP C18 column at 450 µL/min, column temperature at 55 C. LC solvents were 0.1% 80 11.31 161.0918 2.45 0.35 3.76E-03 2.0 C6H12N2O3 (2S)-4-Acetamido-2-aminobutanoic acid showing the vials and sample names. Right, detailed content and overall time C 6.38 149.0117 1200.0000 224.0915 2.59 0.34 3.56E-02 1.0 C11H13NO4 Acetyl-L-tyrosine – 60 391.2823 MS FA (A) and 0.1% FA in MeOH (B). Apply linear gradient from 0.5 50% B for 5.5 min, 5.88 ZDF_F1_2 146.1173 1.44 0.42 3.54E-02 1.0 C7H15NO2 DL-Aminoheptanoic acid for each step. 496.3373 8.35 followed by increasing to 98% at 6 min, hold 98% B for 6 min, then decrease to 0.5% 40 732.5496 FIGURE 7. Searching MS/MS spectrum of component m/z 184.0607 in mzCloud. at 13 min, then equilibrate for another 2 min. 20 The entry of 4-pyridoxic acid matches the spectrum with the highest score. a b Abundance Relative 6 Step | Preconditioning LC under one polarity (+ or -) 0 Conclusion 1 0 2 4 6 8 10 12 4-Pyridoxic acid_pos_HCD_QE #141 RT: 1.26 AV: 1 NL: 2.70E8 Mass Spectrometry 1. Solvent blank spiked with IS, 5-10X | (check signal) T: FTMS + p ESI Full ms2 [email protected] [50.00-205.00] 166.0500 Control Time (min) 100 3 Solvent 2. QC mixture with IS, 3-5X | (check LC system status) 3h 95 An efficient and robust workflow for untargeted metabolomics is presented here. The The Thermo Scientific™ Q Exactive™ mass spectrometer was operated under sample 90 I blank with IS 3. Solvent blank with IS, 1X | (check carry over) 85 reliable high-resolution, accurate-mass (HR/AM) performance of the Q Exactive LC- 80 with IS #1 • If carry over, wash followed by repeating step 2 & 3 | (clean) electrospray ionization (ESI) positive, negative, and polarity switching modes. Full Comparing Differential Analysis Results from SIEVE 2.1 software and XCMS 75 MS system eliminates the need for technical replicates on biological samples. The • If no carry over, proceed to next step 70 scan (m/z 67–1000) used resolution 70,000 with automatic gain control (AGC) target Online 65 6 60 MS/MS of m/z 184.0606 from sample superior S/N in Orbitrap data allows efficient data reduction in SIEVE 2.1 software, of 110 ions and a maximum ion injection time (IT) of 35 ms. Data-dependent MS/MS 2 7 55 The results from SIEVE 2.1 software are compared to XCMS Online, an academia- 50 mzCloud search and score resulting in much reduced data analysis effort. KEGG pathway visualization allows were acquired on a “Top5” data Treated 45

-dependent mode using the following parameters: QC Step | Experiment I: Positive, R=70K Abundance Relative n sample 40 quick access to biological pathway mapping. The MS spectral library mzCloud 5 Mixture developed open-source software for metabolomics data analysis (Table 1) (2). As 35  II with IS 4. Solvent blank no IS, 1X | (for background subtraction) resolution 17,500; AGC 1 10 ions; maximum IT 80 ms; 2.0 amu isolation window; 30 148.0393 with IS facilitates accurate compound identification. ± #1 5. Pooled plasma with IS, 3X | (check method and column) 14h shown in Figure 5, the significant components (p-value <0.05, ratio > 2.0) identified by 25 normalized collision energy 35% 50%; underfill ratio 1.0%; Apex trigger 2–4 s, and 20 138.0551 184.0607 6. Control sample with IS #1, 1X | SIEVE 2.1 software and XCMS Online are similar, while the SIEVE software started 15 dynamic exclusion 6 s. Source ionization parameters were: spray voltage, 3.8 kV; 4 7. Treated sample with IS #1, 1X | 10  with a much smaller (~20 less) number of components because of its capability to 5 0 capillary temperature, 300 C; and S-Lens level, 45. 8. Run all biological replicates with a randomized order | (all) 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 4-pyridoxic acid Control m/z Solvent Control automatically remove solvent background, thus saving users time and labor in data References sampleControl III blank sample Data Analysis withsample IS review (Figure 6). 1. Wang S.Y., Kuo, C.H., Tseng, Y. J. Anal. Chem., 2013, 85 (2), 1037–1046. NO IS with IS #2with IS Step | Experiment II: MS/MS Positive 15m #2 2. Scripps Center For Metabolomics Home Page. https://xcmsonline.scripps.edu/index.php Differential analyses of the obese versus lean plasma were performed using Thermo 5 # n 9. Pooled plasma, top5 ddMS2 HCD | (dd-MS/MS) Metabolite Identification Guided by mzCloud Scientific™ SIEVE™ 2.1 software, which executes background subtraction, 3. m/zCloud Home Page. https://www.mzcloud.org/home.aspx n An example of SIEVE 2.1 software with KEGG pathway visualization demonstrated Treated 8 The mzCloud is the first cloud-based MS spectral trees library built for small molecule component detection, peak alignment, differential analysis (Figure 1). Statistical results, 9 Pooled Treated with component m/z 161.0918 is shown in Figure 8. All matching metabolites are – sampleTreated Step | Experiment III: Negative, R=70K 14h 4. Muoio, D.M.; Newgard, C.B. Nature Rev. Mol. Cell. Biol. 2008, 9, 193 205. IV plasma sample structural elucidation (3). It contains a standard reference database, an elucidated putative IDs, and pathways were generated through searching ChemSpider and withsample IS 4-8. Run all biological replicates as Experiment I | (all samples) labeled with black dots. with IS with IS KEGG™. Metabolites of interest were identified via MS/MS mass spectral database #2with IS compound database with putative structures, and a database of virtual spectra. KEGG is a trademark of Kyoto University, Kanehisa Lab. All other trademarks are the property of Thermo Fisher #2# n Q Exactive HCD MS/MS spectra of interested metabolites can be searched against Table 2 shows some of the representative metabolites that are significantly changed. matching. The *.raw files were converted to mzXML format using ProteoWizard and Scientific and its subsidiaries. Step | Experiment IV: MS/MS Negative 15m m/zCloud. The matching compounds are scored and the fragment ions are annotated Up-regulated metabolites in fatty vs. lean rat plasma are shown in green while down- also analyzed by XCMS Online [2] to compare the results. 10. Repeat as in Experiment II, step 9 | (dd-MS/MS) This information is not intended to encourage use of these products in any manners that might infringe the accordingly (Figure 7). regulated metabolites are shown in blue (p-value <0.05, ratio >2). intellectual property rights of others.

6 Untargeted Metabolomics Work ow Using UHPLC/Quadrupole Orbitrap Mass Spectrometer and SIEVE 2.1 Software www.thermoscientific.com ©2013 Thermo Fisher Scientific Inc. All rights reserved. ISO is a trademark of the International Standards Organization. KEGG Thermo Fisher Scientific, is a trademark of Kyoto University, Kanehisa Lab. All other trademarks are the property of Thermo Fisher Scientific, Inc. and its San Jose, CA USA subsidiaries. Specifications, terms and pricing are subject to change. Not all products are available in all countries. Please consult is ISO 9001:2008 Certified. your local sales representative for details.

Africa-Other +27 11 570 1840 Europe-Other +43 1 333 50 34 0 Japan +81 45 453 9100 Spain +34 914 845 965 Australia +61 3 9757 4300 Finland/Norway/Sweden Latin America +1 561 688 8700 Switzerland +41 61 716 77 00 Austria +43 1 333 50 34 0 +46 8 556 468 00 Middle East +43 1 333 50 34 0 UK +44 1442 233555 Belgium +32 53 73 42 41 France +33 1 60 92 48 00 Netherlands +31 76 579 55 55 USA +1 800 532 4752 Canada +1 800 530 8447 Germany +49 6103 408 1014 New Zealand +64 9 980 6700 China +86 10 8419 3588 India +91 22 6742 9434 Russia/CIS +43 1 333 50 34 0 Denmark +45 70 23 62 60 Italy +39 02 950 591 South Africa +27 11 570 1840 ASMS13_T382_JWang_E 07/13S