
Biomass and Metabolomics at the MPI for Molecular Plant Physiology

Bioen Workshop on Metabolomics of Sugarcane, FAPESP, Sao Paulo, 07 December 2009

Lothar Willmitzer
Max-Planck-Institut für Molekulare Pflanzenphysiologie, 14476 Potsdam-Golm

The Max Planck Society

• Founded in 1948 as a research organization intended to work complementary to the universities
• 78 institutes
• 276 departments (led by Max Planck Directors)
• 12,200 employees (incl. 4,200 scientists) + 9,600 junior scientists
• Budget: 1.3 billion euro = ≤2% of total R&D in Germany; funded by federal government + 16 states (1:1)

Our research activities are

- focused on basic research
- curiosity-based and hypothesis-driven
- open to, but not focused on, applications

“Knowledge must precede application” (Max Planck)

17 Nobel prizes since 1948

2007 Chemistry: Gerd Ertl
2005 Physics: Theodor Hänsch
1995 Chemistry: Paul Crutzen
1995 Medicine: Christiane Nüsslein-Volhard
1991 Medicine:
1988 Chemistry: Johann Deisenhofer, Robert Huber, Hartmut Michel
1986 Physics: Ernst Ruska
1985 Physics: Klaus von Klitzing
1984 Medicine: Georges Köhler
1973 Medicine:
1967 Chemistry: Manfred Eigen
1964 Medicine: Feodor Lynen
1963 Chemistry: Karl Ziegler
1954 Physics: Walter Bothe

Commercial success = Technology transfer

“Max Planck Innovation GmbH” since 1979:

• 2,025 patents filed
• 1,186 licence contracts
• 154 M€ patent licences
• 60 start-up companies

The Potsdam-Golm campus hosts the University of Potsdam, Fraunhofer research institutes and Max Planck institutes

MPI für Molekulare Pflanzenphysiologie

Structure of the MPI-MP

• 3 departments hosting 12 research groups, mostly led by young scientists on limited contracts
• 2 independent Max Planck research groups
• 3 infrastructure groups
• 2 university guest groups
• 2 Systems Biology guest groups

• Approx 350 employees

• 90+ postdocs and 90+ PhD students

• More than 50% come from abroad, covering 25+ countries

• Working language is English

Biological Processes studied at the MPI-MP

Biomass

• Plant energetics (photosynthesis, respiration)
• Abiotic stress (temperature, nutrients)
• Metabolism and growth
• Cell wall biosynthesis

Create genetic diversity, grow it in defined environmental conditions and subject it to broad molecular phenotyping

• Genetic diversity: transgenics and natural variation (Arabidopsis, tomato, rice)
• Genotype-environment analysis
• RNA analysis (arrays, RT-PCR)
• Proteomics facility (LC-MS, 2D)
• Enzyme HTP platform
• Data mining (bioinformatics)
• Metabolite analysis
• Single cell analysis
• Subcellular analysis

Metabolomics

• The metabolome comprises all small molecules present in a given biological system
• Metabolomics aims at the quantitative determination of all small molecules
• The molecules of the metabolome vary hugely in three parameters: concentration, structure and chemical behaviour
• In contrast to RNA and protein, metabolites are not linearly related to the DNA sequence

Metabolomics - Challenges

• Increase the fraction of molecules we see
• Annotate more of them
• Quantitate them (relative and absolute)
• Discriminate a true metabolite from a contaminant
• Make sure that what we measure reflects the situation in the cell (sampling)
• Subcellular distribution, labelling (fluxomics)

Metabolomics - Extraction

• Metabolomics aims at a complete coverage of all small molecules
• Thus as little prefractionation as possible is applied
• We start from a simple methanol/chloroform/water extract, which is analysed on three different platforms

GC-MS: the workhorse for 100-150 metabolites, mostly from primary metabolism

Robotic derivatisation & full extract injection: “What Is This Peak?”

[Figure: GC-MS total ion chromatogram (Scan EI+, TIC, 1.20e8) of sample 8343AO01, acquired on 09-Dec-1998 at 12:59:56; x-axis: retention time (7.5 to 45 min), y-axis: relative intensity (%)]

LTQ FT Ultra

• Resolution: > 1 000 000
• Mass range: m/z 50-2000
• Dynamic range: 1 000

Mass Accuracy

Phenylalanine [M+H]+

Mass error (ppm)    Mass error (Da)
102                 - 0.017
5.93                - 0.00103
0.26                + 0.00004

Formula calculation depends on mass accuracy and resolution

[M+H]+ mass: 181.07066
Allowed chemical elements: C = 30, H = 50, N = 5, O = 10, P = 5, S = 5

• 500 ppm = 0.09085 Da error → 268 predicted formulas
• 100 ppm = 0.01817 Da error → 54 predicted formulas
• 10 ppm = 0.00181 Da error → 6 predicted formulas
• 1 ppm = 0.00018 Da error → 1 predicted formula → C6H12O6

Whole metabolome 13CO2 labelling strategy

• Arabidopsis grown in parallel on 12CO2 and on 13CO2
• Metabolite extraction (MeOH:CHCl3:H2O)
• UPLC FT-ICR MS
• Peak extraction: 12C chemical formula and retention time, 13C chemical formula and retention time
• DB search: matched 12C/13C formulas with RT tolerance 0.05 min, giving unambiguous or ambiguous hits
• Isotope pattern and MS/MS analysis of ambiguous hits → unambiguous annotated formula

Parallel analysis of 12C/13C labelled compounds allows discrimination of biological from non-biological material and annotation of the number of carbon atoms.
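The formula calculation and labelling slides above can be sketched together in a few lines. This is only an illustration under stated assumptions: the ion is taken to be singly protonated, the element numbers on the slide are read as upper limits, standard monoisotopic masses are used, and no chemical plausibility rules are applied, so the candidate counts will not necessarily match the 268/54/6/1 shown on the slide. The function names (`candidate_formulas`, `ppm_window`, `ppm_error`, `ion_mz`) are hypothetical. The optional `n_carbon` argument shows how a carbon count obtained from 12C/13C labelling would narrow the search.

```python
from itertools import product

# Monoisotopic masses in Da (standard reference values, rounded).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "P": 30.9737620, "S": 31.9720707}
PROTON = 1.00727646  # mass of H+ in Da
HEAVY = ("C", "N", "O", "P", "S")


def ion_mz(counts):
    """m/z of the singly protonated ion [M+H]+ for a dict of atom counts."""
    return sum(MONO[e] * n for e, n in counts.items()) + PROTON


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def ppm_window(mz, ppm):
    """Half-width in Da of a +/- ppm tolerance window around mz."""
    return mz * ppm * 1e-6


def formula_string(counts):
    """Render atom counts as a formula string, e.g. C6H12O6."""
    return "".join(f"{e}{n if n > 1 else ''}" for e, n in counts.items() if n > 0)


def candidate_formulas(mz, ppm, limits, n_carbon=None):
    """Neutral formulas whose [M+H]+ ion lies within +/- ppm of mz.

    limits maps element -> assumed maximum atom count. If n_carbon is
    given, only that carbon count is tried.
    """
    tol = ppm_window(mz, ppm)
    neutral = mz - PROTON
    ranges = [range(limits[e] + 1) for e in HEAVY]
    if n_carbon is not None:
        ranges[0] = range(n_carbon, n_carbon + 1)
    hits = []
    for heavy_counts in product(*ranges):
        rest = neutral - sum(MONO[e] * n for e, n in zip(HEAVY, heavy_counts))
        # The tolerance is far below 0.5 Da, so at most one hydrogen
        # count can fill the remaining mass.
        n_h = round(rest / MONO["H"])
        if 0 <= n_h <= limits["H"] and abs(rest - n_h * MONO["H"]) <= tol:
            c, n, o, p, s = heavy_counts
            hits.append(formula_string(
                {"C": c, "H": n_h, "N": n, "O": o, "P": p, "S": s}))
    return hits


LIMITS = {"C": 30, "H": 50, "N": 5, "O": 10, "P": 5, "S": 5}
for ppm in (500, 100, 10, 1):
    found = candidate_formulas(181.07066, ppm, LIMITS)
    print(f"{ppm:>3} ppm: +/-{ppm_window(181.07066, ppm):.5f} Da, "
          f"{len(found)} candidates")

# Mass accuracy example: phenylalanine (C9H11NO2) measured 0.017 Da too low.
phe = ion_mz({"C": 9, "H": 11, "N": 1, "O": 2})
print(f"Phe [M+H]+ = {phe:.5f}, error = {ppm_error(phe - 0.017, phe):.0f} ppm")
```

The loop runs over the five heavier elements only and solves for hydrogen, which keeps the search to tens of thousands of combinations instead of millions.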

[Figure: base peak chromatograms (RT 0.00-13.01 min) of the 12C-grown sample At_12C_pos1 (NL 1.62E7, acquired 10/7/2008 8:41:49 PM) and the 13C-grown sample at_13c_pos1 (NL 1.57E7, acquired 10/7/2008 10:23:16 PM), with peaks marked at 7.27 min and 8.23 min]

[Figure: full-scan mass spectra (FTMS + p NSI, m/z 100-1300) of the two marked peaks]

• Peak at about 7.3 min: m/z 741.22238 in the 12C sample (scan #515, RT 7.33) and m/z 774.33350 in the 13C sample (scan #514, RT 7.30); annotated C33
• Peak at about 8.2 min: m/z 382.14236 in the 12C sample (scan #582, RT 8.21, isotope peaks at 383.14574 and 384.13859) and m/z 382.14225 in the 13C sample (scan #591, RT 8.27), i.e. no mass shift
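The two spectra pairs above illustrate how the 12C/13C comparison yields a carbon count. A minimal sketch, assuming singly charged ions and complete 13C labelling (incomplete labelling would give a smaller shift); the function names and the 0.01 Da residual cut-off are hypothetical choices, not values from the slides:

```python
C13_C12_SHIFT = 1.0033548  # mass difference between 13C and 12C in Da


def carbon_count(mz_12c, mz_13c, charge=1, max_residual=0.01):
    """Number of carbons implied by the shift between the unlabelled and
    the fully 13C-labelled ion, or None if the shift is not close to an
    integer multiple of the 13C-12C mass difference."""
    shift = (mz_13c - mz_12c) * charge
    n = round(shift / C13_C12_SHIFT)
    if n < 0 or abs(shift - n * C13_C12_SHIFT) > max_residual:
        return None
    return n


def is_biological(mz_12c, mz_13c, charge=1):
    """A compound made by the plant takes up label, so its mass shifts;
    a contaminant shows the same mass in both samples."""
    n = carbon_count(mz_12c, mz_13c, charge)
    return n is not None and n > 0


# m/z values read from the two peaks in the spectra above.
print(carbon_count(741.22238, 774.33350))   # peak at about 7.3 min
print(is_biological(382.14236, 382.14225))  # peak at about 8.2 min
```

With the values from the slide, the first pair gives 33 carbons, in line with the C33 annotation, and the second pair shows no shift and would be classed as non-biological.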

Relative quantitation via spiking with a 13C labeled metabolome

A. Workflow

• Samples 1, 2 and 3 (12C) and a 13CO2-labelled sample
• Metabolite extraction
• Spike fixed amount (1:1) of the 13C-labelled extract
• UPLC FT-ICR MS
• Peak extraction
• Spectra alignment
• Differential peak detection
• DB search

B. [Figure: 12C/13C dilution series; y-axis: ratio 13C/12C (0 to 35), x-axis: mixing ratio 12C/13C (1:1, 1:2, 1:4, 1:20, three replicates each)]
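The spiking workflow above can be sketched as follows. Because every sample receives the same amount of 13C-labelled extract, the 12C/13C intensity ratio of a metabolite is comparable across samples. The intensities below are invented for illustration and the function names are hypothetical; note that the figure on the slide plots the inverse ratio (13C/12C).

```python
def labelled_ratio(intensity_12c, intensity_13c):
    """Intensity of the sample peak relative to the spiked 13C standard."""
    if intensity_13c <= 0:
        raise ValueError("13C standard peak missing or non-positive")
    return intensity_12c / intensity_13c


def fold_changes(peaks, reference):
    """peaks maps sample name -> (12C intensity, 13C intensity) for one
    metabolite. Returns each sample's level relative to the reference."""
    ratios = {name: labelled_ratio(*pair) for name, pair in peaks.items()}
    return {name: r / ratios[reference] for name, r in ratios.items()}


# Hypothetical peak intensities for one metabolite in three samples.
peaks = {
    "sample_1": (2.0e6, 1.0e6),
    "sample_2": (1.0e6, 1.0e6),
    "sample_3": (3.0e6, 2.0e6),
}
changes = fold_changes(peaks, reference="sample_1")
for name, value in changes.items():
    print(f"{name}: {value:.2f}-fold relative to sample_1")
```

Dividing by the 13C peak cancels run-to-run differences in injection and ionisation, which is the reason for spiking a fixed amount into every sample.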

MS/MS for elucidation of structures

[Figure: IRMPD MS/MS spectrum, ESI negative mode, [M-H]-; x-axis: m/z (100 to 1000), y-axis: relative abundance (%). Labelled peaks: C27H28O16 at m/z 609.14234 (13C-labelled: 636.23571) and fragment C15H9O7 at m/z 301.03397 (13C-labelled: 316.08536); further labels at m/z 303 and m/z 318]

Strategic Overview of the Platform

[Diagram: platform workflow]

• Arabidopsis grown on 12CO2 and on 13CO2
• Metabolite extraction (MeOH:CHCl3:H2O)
• Platforms: GC-TOF and FT-ICR-MS
• Aqueous phase: UPLC separation, pos. and neg. mode
• Chloroform phase (lipids): UPLC fractionation, pos. and neg. mode
• UPLC peak detection and extraction → peak lists → differential database searches → compound annotation and result interpretation

Summary Metabolomics

• Starting from a MeOH-H2O/CHCl3 extract, three platforms allow:

  - analysis of 150 primary metabolites (GC-TOF)

  - analysis of 500-1000 secondary metabolites (UPLC-FT-ICR-MS)

  - analysis of about 140 complex lipids

Metabolomics: Applications

• Metabolomics and Genetics: Annotation of gene function

• Metabolomics and Diagnostics: Discrimination/Identification/Prediction of states

• Metabolomics and Systems Biology: Integration of different data sets

Metabolomics and Genetics: Annotation of gene function using KO populations and overexpressors

Metabolic Genomics: From Gene to Function - Directly

Metabolic Genomics at metanomics: Process Overview

[Diagram: process overview. Knockout and transfer of additional genes from “gene donors” yield collections of plant lines (transformants A, wild type, transformants B), which go through functional performance analysis and metabolic analyses]

Arabidopsis thaliana, the “domestic pet” plant:

• about 27,000 genes
• genetic blueprint known
• small
• grows rapidly
• large progeny

[Figure: Fig. 4, metabolites per plant line shown as x-fold ratio in comparison to wild type control (scale: 0.2, 1, 2)]

Metabolomics and Diagnostics: Discrimination/Identification/Prediction of states

Metabolic profiling applied to two ecotypes and one mutant each

• Col-2: wild type and dgd1 mutant
• C24: wild type and sdd1 mutant

Arabidopsis Profiling: PCA Cluster Analysis

[Figure: PCA score plot of the four genotypes; axes: Factor 1, Factor 2, Factor 4]
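A principal component analysis like the one on this slide projects each sample's metabolite profile onto the directions of largest variance, so that similar genotypes cluster together. The sketch below is dependency-free and extracts only the first component by power iteration; a numerical library would normally be used instead. The profile values are invented and the function name is hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of the first principal component.

    rows is a list of samples, each a list of metabolite levels.
    The leading eigenvector of the covariance matrix is found by
    power iteration; its sign is arbitrary.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    vec = [1.0 + 0.1 * j for j in range(p)]  # arbitrary start vector
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # no variance in the data
            break
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centered]
    return vec, scores


# Hypothetical levels of three metabolites in four samples:
# the first two rows mimic one genotype, the last two another.
profiles = [
    [1.0, 2.0, 0.5],
    [1.1, 2.1, 0.4],
    [3.0, 0.5, 0.6],
    [3.2, 0.4, 0.5],
]
loadings, scores = first_principal_component(profiles)
for row, score in zip(profiles, scores):
    print(row, f"PC1 score = {score:+.2f}")
```

Samples of the same group end up with scores of the same sign, which is the separation a score plot makes visible.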

Medical applications of Metabolomics

• Differentiation of diabetics/nondiabetics

• Diagnostics of cancer and non-cancer tissues (kidney cancer)

• Prediction of health risk (myocardial infarction)

• Differentiation of responders/non-responders (antidepressants)

Separation of diabetic and nondiabetic male and female patients

Metabolic profiling allows assignment of normal and clear cell kidney carcinoma

Prediction of myocardial infarction by metabolic signature

[Figure: ROC curves; y-axis: sensitivity (0 to 100), x-axis: 100 - specificity (0 to 100). Curves: standard model; standard model and five metabolites; age, sex and five metabolites]
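The ROC comparison on this slide rests on the area under the curve (AUC). As a rough sketch of how such a curve and its area are obtained from predicted risk scores: each threshold on the score gives one point of sensitivity against 1 - specificity, and the area follows from the trapezoidal rule. The scores and labels are invented, the function names are hypothetical, and the significance tests quoted on the slide are not covered.

```python
def roc_points(scores, labels):
    """(false positive rate, true positive rate) for every threshold,
    from the strictest to the most lenient. labels are 1 (event) or 0."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical risk scores; label 1 marks a later myocardial infarction.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
auc = area_under_curve(roc_points(scores, labels))
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to guessing and 1.0 to perfect separation, which gives a scale for the values of 0.844 to 0.89 reported for the models on the slide.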

Prediction was assessed with receiver operating characteristic (ROC) curves. A model including age, sex and the five metabolites identified here (red ROC; AUC 0.844) was not inferior to a standard model including all established risk factors (age, sex, smoking, education, alcohol consumption, physical activity, hormone replacement therapy, BMI, hypertension, diabetes, HDL/total cholesterol ratio and C-reactive protein) (blue ROC; AUC 0.86; p=0.385). A model with all established risk factors (standard model) plus the five metabolites (AUC 0.89) outperformed the standard model alone (p=0.033).

Metabolomics and systems biology: Network analysis and integration with transcript data

E. coli response towards different stresses on the transcriptome and metabolome

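[Figure: experimental design. Conditions: control, heat, cold, oxidative stress, glucose-lactose shift. Growth curve (optical density at 600 nm against time) with the point of stress application, the adaptation phase and sampling points marked M and T]

Network construction by correlation analysis (metabolite A vs B)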

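[Figure: scatter plots of metabolite A against metabolite B, and the resulting correlation networks for control growth, cold stress, heat stress, oxidative stress and lactose shift, together with their overlap]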
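The network construction shown above can be sketched in a few steps: correlate every pair of metabolites within one condition, keep pairs above a threshold as edges, and intersect the edge sets across conditions to find the overlap. The sketch assumes Pearson correlation and a cut-off of 0.9, neither of which is stated on the slides; the data and function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant profile carries no correlation
    return sxy / math.sqrt(sxx * syy)


def correlation_network(profiles, threshold=0.9):
    """profiles maps metabolite -> levels across the samples of one
    condition. Returns the set of metabolite pairs with |r| >= threshold."""
    edges = set()
    for a, b in combinations(sorted(profiles), 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            edges.add((a, b))
    return edges


def stable_edges(networks):
    """Edges present in every condition-specific network."""
    return set.intersection(*networks)


# Hypothetical metabolite levels over four time points per condition.
control = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8.5], "C": [4, 1, 3, 2]}
heat = {"A": [1, 2, 3, 4], "B": [1, 3, 5, 7], "C": [1, 2, 3, 4.2]}
networks = [correlation_network(control), correlation_network(heat)]
stable = stable_edges(networks)
print("control:", sorted(networks[0]))
print("heat:   ", sorted(networks[1]))
print("stable: ", sorted(stable))
```

Edges that appear only under one stress are the differential part of the network, while the intersection corresponds to the overlap shown in the figure.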

Superimposition of stable networks and metabolic pathways

Differential network properties identify candidate molecules

Canonical correlation analysis allows identification of transcripts correlating with metabolite changes

Coordinated expression of transcripts and metabolites

Conclusion

• Metabolomics is rapidly developing into a mature technology that already covers several thousand small molecules
• Combined with genetic diversity, it allows the high-throughput annotation of gene function
• Probably due to its proximity to the ultimate phenotype, it has a surprisingly high predictive power
• It is an integral part of systems biology

Thank you for your attention!

Patrick Giavalisco, Bettina Seiwert, Aenne Eckardt, Szymon Josefcuk, Jedrzey Szymanski, Alvaro Inostroza
Zoran Nikoloski (Goforsys), Joachim Selbig (UP), Sebastian Klie

Thanks to FAPESP for support!