RmassBank Decelopments: XCMS+CAMERA=MetShot

Steffen Neumann, Erik Müller

Leibniz Institute of Plant Biochemistry

ThOB 13.6.2013

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Metabolomics – The Pipeline

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013

Outline

1 Acquisition and Processing of LC-MS/MS

2 Summary & Outlook

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Outline

1 Acquisition and Processing of LC-MS/MS

2 Summary & Outlook

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Instrument Data-dependent MS/MS

Instrument Auto-MS/MS Full scan first

Then: MS/MS of most intense 800 peak(s) Online 600 m/z Problems: 400 Boring peaks: plasticiser, solvent, . . . 200 Molecular ion / Biologically relevant peaks may not be 0 500 1000 1500 2000 the most intense RT DDIT Noise spectra / Too few / too precursor selection in 8 LC-MS/MS runs many peaks

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Instrument Data-dependent MS/MS

Instrument Auto-MS/MS Full scan first Then: MS/MS of most intense peak(s) Online Problems: Boring peaks: plasticiser, solvent, . . . Molecular ion / Biologically relevant peaks may not be the most intense Noise spectra / Too few / too many peaks

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Instrument Data-dependent MS/MS

Instrument Auto-MS/MS Full scan first Then: MS/MS of most intense peak(s) Online Problems: Boring peaks: plasticiser, solvent, . . . Molecular ion / Biologically relevant peaks may not be the most intense Noise spectra / Too few / too many peaks

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Instrument Data-dependent MS/MS

Instrument Auto-MS/MS Full scan first Then: MS/MS of most intense peak(s) Online Problems: Boring peaks: plasticiser, solvent, . . . Molecular ion / Biologically relevant peaks may not be the most intense Noise spectra / Too few / too many peaks

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Single spectrum extraction

m/z accuracy: Low intensity signals

→ low m/z accuracy ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ●

285.080 20 ppm ● ● ● ●●● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● m/z ● ● ● ● Spectra merging ● ● → improves m/z accuracy 285.065 530 535 540 545 550 ●● ● ● Co-elution / co-isolation: ● ● ● ● ● ● 60000 MS/MS spectrum contains ● ● Intensity ● ● ● ● ●

20000 ● fragment peaks from more ● ● ● ●● ● ●●● ●●●●●●●●● ●●●●●●●●● ● 0

than one metabolite 530 535 540 545 550 Seconds Poor identification

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Single spectrum extraction

m/z accuracy:

Low intensity signals 50 → low m/z accuracy Counts 95 40 89 Spectra merging 83 77 → improves m/z accuracy 30 72 66 60 54

Co-elution / co-isolation: dppm 20 48 42 MS/MS spectrum contains 36 30 10 24 fragment peaks from more 19 13 than one metabolite 7 0 1 Poor identification 4 6 8 10 12 14 log(intensity)

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Acquisition of LC-MS/MS

Manual LC-MS/MS acquisition: Data-dependent MS/MS: Choose m/z – RT window(s) Selects most intense peak(s) → build instrument method(s) Low quality spectra: 2nd run/injection for MS/MS “wrong” precursor(s) Manual extraction m/z accuracy effective MS1 scan rate of MS/MS spectrum reduced Controlled quality data Online, little control Offline, low throughput

Choice of two approaches with different drawbacks

MetShot is a hybrid approach, where the instrument methods are created automatically after the profiling and statistics

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 MetShot: nearline data-dependent tandem MS

   

Peaks of Interest Library Search LC/MS data MS/MS data de-Novo Identification XCMS MS/MS Method XCMS centWave Generation Peak Table  Peak Tables

Statistics CAMERA

CAMERA

MS/MS Spectra

Steffen Neumann, Andrea Thum and Christoph Böttcher: Nearline acquisition and processing of liquid chromatography-tandem mass spectrometry data. Metabolomics (2012)

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Scheduling of MS/MS spectra

MSMSpos_Col0_1_10eV_1−A,1_01_9844

“Features of Interest” list from LC-MS profiling 800 Optimal packing of LC-MS/MS: 600 m/z most important first

without RT overlap 400 method files for Bruker,

Waters, Agilent, TraML XML 200 Ordered by p-Value 0 200 400 600 800

(biological relevance) Seconds

⇒ Only few LC-MS/MS runs to cover many peaks R package: http://msbi.ipb-halle.de/msbi/MetShot

Steffen Neumann, Andrea Thum and Christoph Böttcher: Nearline acquisition and processing of liquid chromatography-tandem mass spectrometry data. Metabolomics (2012)

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Scheduling of MS/MS spectra

“Features of Interest” list

from LC-MS profiling 800 Optimal packing of

LC-MS/MS: 600 m/z most important first

without RT overlap 400 method files for Bruker, Waters, Agilent, TraML XML 200 Ordered by p-Value 0 200 400 600 800

(biological relevance) RT

⇒ Only few LC-MS/MS runs to cover many peaks R package: http://msbi.ipb-halle.de/msbi/MetShot

Steffen Neumann, Andrea Thum and Christoph Böttcher: Nearline acquisition and processing of liquid chromatography-tandem mass spectrometry data. Metabolomics (2012)

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 MetShot: nearline data-dependent tandem MS

   

Peaks of Interest Library Search LC/MS data MS/MS data de-Novo Identification XCMS MS/MS Method XCMS centWave Generation Peak Table  Peak Tables

Statistics CAMERA

CAMERA

MS/MS Spectra

Steffen Neumann, Andrea Thum and Christoph Böttcher: Nearline acquisition and processing of liquid chromatography-tandem mass spectrometry data. Metabolomics (2012)

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Collect and Assemble LC-MS and LC-MS/MS

Feature detection with XCMS in LC-MS/MS raw data Collect compound spectra with CAMERA Assign spectra to nearest precursor peak of interest

Steffen Neumann, Andrea Thum and Christoph Böttcher: Nearline acquisition and processing of liquid chromatography-tandem mass spectrometry data. Metabolomics (2012)

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Evaluation of extracted spectra

Xanthohumol 15 eV Xanthotoxin 15 eV

Sample Sample PB005728 PB006163

Mixture of 27 standards Manual spectra extraction

Intensity with vendor software, Intensity available in MassBank MassBank spectral score automatic vs. manual −1000 −500 0 500 1000 −1000 −500 0 500 1000 200 250Score >0.95 300 for 350 25 of 27 120 140 160 180 200 220 mz mz

Naringin 15 eV Bergapten 15 eV

Sample Sample PB005723 PB006144 Steffen Neumann, Andrea Thum and Christoph Böttcher: Nearline acquisition and processing of liquid chromatography-tandem mass spectrometry data. Metabolomics (2012)

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Intensity Intensity −1000 −500 0 500 1000 −1000 −500 0 500 1000

150 200 250 300 350 400 120 140 160 180 200 220

mz mz

Rotenone 15 eV Cinchonine 15 eV

Sample Sample PB005742 PB005842 Intensity Intensity −1000 −500 0 500 1000 −1000 −500 0 500 1000

150 200 250 300 350 400 150 200 250 300

mz mz

Anisic acid 15 eV Erythromycin 15 eV

Sample Sample PB006062 PB005802 Intensity Intensity −1000 −500 0 500 1000 −1000 −500 0 500 1000

120 125 130 135 200 300 400 500 600 700

mz mz Xanthohumol 15 eV Xanthotoxin 15 eV

Sample Sample PB005728 PB006163 Intensity Intensity

−1000 −500Evaluation 0 500 1000 of extracted spectra−1000 −500 0 500 1000 200 250 300 350 120 140 160 180 200 220

mz mz

Naringin 15 eV Bergapten 15 eV

Sample Sample PB005723 PB006144

Mixture of 27 standards Manual spectra extraction

Intensity with vendor software, Intensity available in MassBank MassBank spectral score automatic vs. manual −1000 −500 0 500 1000 −1000 −500 0 500 1000 150 200 250Score 300 >0.95 350 400 for 25 of 27 120 140 160 180 200 220 mz mz

Rotenone 15 eV Cinchonine 15 eV

Sample Sample PB005742 PB005842 Steffen Neumann, Andrea Thum and Christoph Böttcher: Nearline acquisition and processing of liquid chromatography-tandem mass spectrometry data. Metabolomics (2012)

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Intensity Intensity −1000 −500 0 500 1000 −1000 −500 0 500 1000

150 200 250 300 350 400 150 200 250 300

mz mz

Anisic acid 15 eV Erythromycin 15 eV

Sample Sample PB006062 PB005802 Intensity Intensity −1000 −500 0 500 1000 −1000 −500 0 500 1000

120 125 130 135 200 300 400 500 600 700

mz mz S4: Tandem Mass Spectra of Natural Product Standards

Table 2: Similarity scores between the automatically extracted tandem mass spectra Evaluation of extracted spectraand the reference spectra deposited in MassBank. The similarity score is based on the modified cosine distance used in the MassBank system. Sheet2 Compound Name MassBank Score Anisic acid PB006062 0.9981 Xanthotoxin PB006163 0.9966 Genistein PB005713 0.9953 Amentoflavone PB006304 0.9934 Xanthohumol PB005728 0.9919 Biochanin A PB005708 0.9906 Cinchonine PB005842 0.9902 Mixture of 27 standards PB005703 0.9889 4-Methylumbelliferyl glucoronide PB006265 0.9888 Manual spectra extraction PB005942 0.9882 Chlorogenic Acid PB006181 0.9872 with vendor software, PB005881 0.9864 (+/-) Salsolinol PB006041 0.9861 available in MassBank Rotenone PB005742 0.9849 Naringin PB005723 0.9832 MassBank spectral score Indole-3-acetyl-L-valine PB005789 0.9831 Rutin PB006202 0.9828 automatic vs. manual Indole-3-carboxylic acid PB005786 0.9803 Reserpine PB005761 0.9789 Hesperidin methyl chalcone PB006242 0.9771 Score >0.95 for 25 of 27 Chelidonine PB005902 0.9740 Erythromycin PB005802 0.9692 3,4,5-Trimethoxycinnamic acid PB005822 0.9679 PB006205 0.9654 Bergapten PB006144 0.9527 Indole-3-acetonitrile PB006045 0.9440 Emetine PB005910 0.8247 Steffen Neumann, Andrea Thum and Christoph Böttcher: Nearline acquisition and processing of liquid chromatography-tandem mass spectrometry data. Metabolomics (2012)

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013

5

Page 5 MetShot: nearline data-dependent tandem MS

   

Peaks of Interest Library Search LC/MS data MS/MS data de-Novo Identification XCMS MS/MS Method XCMS centWave Generation Peak Table  Peak Tables

Statistics CAMERA

CAMERA

MS/MS Spectra

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Extracting MS2 spectra

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Other additions: Visualisation

RMassBank focussed on record generation Added summaries for different processing stages Added Visualisations

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Parser and Record validation

Parser for MassBank v1 and v2 records Useful for validation, conversion etc. Rules for validation: Consistent instrument names Mass of SMILES consistent with isolation window No ions above isolation window Empty peak lists Create a HTML report table Can be used on MassBank servers in general ⇒ curation of MassBank data

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Developments under the hood

RMassBank GUI is great for casual use RMassBank workflows are great for repeated analysis But flexible MS/MS acquisition strategies require flexible data analysis Simplification and Modularisation planned

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Outline

1 Acquisition and Processing of LC-MS/MS

2 Summary & Outlook

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Summary

    MetShot aproach Peaks of Interest Library Search LC/MS data MS/MS data de-Novo High-quality MS/MS spectra Identification XCMS MS/MS Method XCMS centWave Biologically relevant features Generation Peak Table  Peak Tables

Signal processing with Statistics CAMERA CAMERA XCMS and CAMERA MS/MS Spectra

QueryQuery Spectrum Spectrum Metabolite identification

SpectralSpectral DB DB FragmenterFragmenter With reference spectra (MassBank) (MassBank)(MassBank) (MetFrag)(MetFrag) With in silico tools (MetFrag) SpectraSpectra CompoundCompound With a combination thereof (MetFusion) MatchesMatches CandidatesCandidates

MetFusionMetFusion

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 IntegratedIntegrated Result Result List List ofof reranked reranked CandidatesCandidates Thanks to . . .

MSBI @ IPB Halle: Christoph Ruttkies Michael Gerlich D. Trutschel, D. Schober

Acknowledgements: Profs. Nishioka, Arita and Horai (MassBank Japan) Ralf Tautenhahn (SCRIPPS, Metlin Fragments) Emma Schymanski (eawag, Zürich) Funding: DFG grant “ChemFrag”, EU FP7 “COSMOS”

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013 Only supplemental slides beyond this point

S. Neumann (IPB-Halle.DE) RMassBank ThOB 13.6.2013