2. Introduction to Shotgun Proteomics

MDC m/z

Introduction to mass spectrometry
Gunnar Dittmar

What is mass spectrometry?

What is the accuracy of a mass spectrometer?

Accuracy: 0.1 u ≈ 1.66 × 10^-28 kg
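The conversion from atomic mass units to kilograms can be checked with a few lines of Python. This is only a sketch: the constant is the CODATA value of the unified atomic mass unit, and the function name `u_to_kg` is made up for this example.

```python
# CODATA value of the unified atomic mass unit (u) in kilograms.
KG_PER_U = 1.66053906660e-27


def u_to_kg(mass_u):
    """Convert a mass given in atomic mass units (u) to kilograms."""
    return mass_u * KG_PER_U


# An accuracy of 0.1 u expressed in kilograms.
print(f"0.1 u = {u_to_kg(0.1):.2e} kg")
```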

Examples:
Hydrogen atom: 1 u
Oxygen atom: 16 u
Neutron: 1 u
Electron: 0.0005 u

Uses for mass spectrometry

Identification of chemicals
• quality control
• biological degradation

Water quality control

Control of food and beverages
• check for pesticides, toxins etc.

Detection of
• explosives
• toxic chemicals

Doping

Mass spectrometry general setup

Samples -> Ionization -> Mass analyzer -> Detector -> Data analysis

General principle of mass spectrometry

[Figure: positively charged ions (+) leave the ion source, pass through the electrostatic field of the mass analyzer and reach the detector]

Mass spectrometry

Sir Joseph John Thomson, Nobel prize in physics for the discovery of the electron, 1906

Sir Francis Aston, first fully functional mass spectrometer in 1919

Ionization

soft ionization (Proteomics)
• ESI: electrospray ionization
• MALDI: matrix-assisted laser desorption ionization

hard ionization (FACS-MS)
• Plasma (CyTOF)

MALDI

Image from Ekman et al., Mass spectrometry

• laser is used
• sample is crystallized with a matrix
• mostly singly charged ions
• pulsed method
• rather resistant to salt

Image from Ekman et al., Mass spectrometry

ESI

[Figure: ESI source with nanoflow needle, 2000 - 3000 V, 150°C and mass spectrometer inlet; charged droplets (+) carry the peptide PFASGHFK into the mass spectrometer]

Mass spectrometers

Mass spectrometers in proteomics

• Ion traps
• LTQ
• OrbiTrap
• FT-ICR
• TOF (time of flight)
• QQQ (triple quadrupoles)

Quadrupoles

[Figure: four rods of alternating polarity (+ and -) driven by an RF voltage]

Quadrupole

Quadrupoles of the AB Sciex Q-TRAP 5500

Orbitrap

[Figure: Orbitrap with the coordinates r, z and φ]

Korsunskii M.I., Basakutsa V.A. Sov. Physics-Tech. Phys. 1958; 3: 1396.
Knight R.D. Appl. Phys. Lett. 1981; 38: 221.
Gall L.N., Golikov Y.K., Aleksandrov M.L., Pechalina Y.E., Holin N.A. SU Pat. 1247973, 1986.

Orbitraps

• high mass accuracy
• relatively fast

Mass spectrometers for proteomics

Hybrid mass spectrometers

Shotgun proteomics on an OrbiTrap

Bottom-up/Top-down

Bottom-up

peptides

Top-down

Top-down MS

Liquid chromatography coupled mass spectrometry

[Figure: chromatogram, intensity (I) over time (t)]

MS Spectrum

Top5 identification cycle

Mass spectrometry of peptides

[Figure: the peptides PFASGHFK, TSSSGHR and HLFWTK are ionized (+), pass the mass analyzer and reach the mass detector; signal plotted over m/z]

Mass measurement of a peptide

• exact mass of a peptide
• no sequence information

Tandem MS or MS/MS

[Figure: from the peptides PFASGHFK, TSSSGHR and HLFWTK, one peptide ion is selected in the mass analyzer, broken up in the fragmentation cell, and its fragments are measured at the mass detector (m/z)]

Fragments of PFASGHFK: P, PF, PFA, PFAS, PFASG, SGHFK, ASGHFK, GHFK, HFK, FASGHFK

How to get the sequence information from mass spectra?

N-terminal fragments: PFASGHFK, PFASGHF, PFASG, PFAS, PF, P

C-terminal fragments: PFASGHFK, FASGHFK, ASGHFK, GHFK, HFK, FK, K
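The two fragment ladders for PFASGHFK can be turned into m/z values with a short script. This is a minimal sketch that assumes singly charged fragments and standard monoisotopic residue masses. The N-terminal and C-terminal fragments are commonly called b- and y-ions, a naming the slides do not use, and the function names are chosen for this example only. The intact peptide is left out of both ladders.

```python
# Monoisotopic residue masses (u) for the amino acids in PFASGHFK.
RESIDUE_MASS = {
    "P": 97.05276, "F": 147.06841, "A": 71.03711, "S": 87.03203,
    "G": 57.02146, "H": 137.05891, "K": 128.09496,
}
WATER = 18.010565   # added to C-terminal fragments
PROTON = 1.007276   # charge carrier for z = 1


def b_ions(peptide):
    """N-terminal fragments (P, PF, PFA, ...) as (sequence, m/z) pairs."""
    ions = []
    running = 0.0
    for i in range(1, len(peptide)):
        running += RESIDUE_MASS[peptide[i - 1]]
        ions.append((peptide[:i], running + PROTON))
    return ions


def y_ions(peptide):
    """C-terminal fragments (K, FK, HFK, ...) as (sequence, m/z) pairs."""
    ions = []
    running = WATER
    for i in range(1, len(peptide)):
        running += RESIDUE_MASS[peptide[-i]]
        ions.append((peptide[-i:], running + PROTON))
    return ions


for name, mz in b_ions("PFASGHFK") + y_ions("PFASGHFK"):
    print(f"{name:>8}  {mz:9.4f}")
```

The mass difference between neighbouring peaks of one ladder equals one residue mass, which is what allows the sequence to be read from the spectrum.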

Top5 cycling between MS and MS/MS

MS
MS/MS of the 1st peak
MS/MS of the 2nd peak
MS/MS of the 3rd peak
MS/MS of the 4th peak
MS/MS of the 5th peak
MS
MS/MS of the 1st peak
MS/MS of the 2nd peak
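One Top5 cycle can be sketched as follows, assuming the "peaks" are ranked by intensity in the MS scan. The peak list, the function names and the tuple layout `(m/z, intensity)` are hypothetical. Real instruments apply further rules (for example charge-state filters) that are not covered here.

```python
def top_n_precursors(ms1_peaks, n=5):
    """Return the n most intense (m/z, intensity) peaks of one MS scan."""
    return sorted(ms1_peaks, key=lambda peak: peak[1], reverse=True)[:n]


def acquisition_cycle(ms1_peaks, n=5):
    """List the scans of one cycle: one MS scan followed by up to n MS/MS scans."""
    scans = ["MS"]
    for rank, (mz, _intensity) in enumerate(top_n_precursors(ms1_peaks, n), start=1):
        scans.append(f"MS/MS of peak {rank} (m/z {mz:.2f})")
    return scans


# Hypothetical MS scan: (m/z, intensity) pairs.
peaks = [
    (400.2, 10.0), (512.3, 80.0), (623.8, 55.0), (701.4, 5.0),
    (845.9, 120.0), (923.4, 95.0), (1010.5, 30.0),
]
for scan in acquisition_cycle(peaks):
    print(scan)
```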

MS - MS/MS

Proteomes

What is actually a proteome?

Proteome

• “genome” refers to all genes in a given organism
• the term “proteome” was coined by Marc Wilkins in 1994 to describe the protein complement of the genome
• refers to all proteins present in a sample (organism, cell, body fluid etc.)
• but how can we identify all proteins?
• classical approach: two-dimensional gel electrophoresis

2D-gel based proteomics

Is this a complete proteome?

Gel based proteomics

Coomassie and silver staining can only detect the most abundant proteins: not sensitive enough for complete proteome analysis!

Protein abundance in the proteome

Protein: copy number
Serum albumin: 1E+10
Transcription factors: 10 - 1E5

Dynamic range

Mount Everest: 8850 m
bacterium: 1 µm
difference: 1e10

Chromatography coupled proteomics

[Figure: workflow: proteins -> peptides -> chromatography (intensity over time) -> ESI -> MS (intensity over m/z) -> MS/MS (intensity over m/z) -> database search]

Protease digest

Reversed phase high performance liquid chromatography (rpHPLC)

[Figure: pump A and pump B deliver the mobile phases to a mixer; the mixture flows through the stationary phase (C18) to the analyzer]

Buffer A: water + HAc
Buffer B: organic solvent + HAc

1. mixed analytes in the mobile phase (low amount of organic solvent) reach the stationary phase (C18)
2. adsorption (hydrophobic interaction with C18 chains)
3. higher amount of organic solvent: desorption
4. high amount of organic solvent: separated analytes

Nano-flow HPLC

smaller column -> lower flow rate -> higher concentration of peptides -> higher sensitivity

Ideal: 20 nl/min
In practice: 200 nl/min (0.2 µl/min)

Nano HPLC coupled to an LTQ-OrbiTrap mass spectrometer

Sensitivity

Mass resolution

Critical parameters for mass spectrometers

• Sensitivity
  • ability to detect small amounts of peptides
• Dynamic range
  • ability to detect peptides with big differences in abundance
• Speed
  • ability to fragment many peptides per second
• Resolution
  • ability to differentiate peptides with similar m/z

Resolution in mass spectrometry

[Figure: peak over m/z]

Average mass and monoisotopic mass

Resolution 1,000 Resolution 10,000

centroid = average mass

Average m/z = 923.93
monoisotopic m/z = 923.40
z = 2

Mass of singly charged peptide ([M+H]+) = 923.40 × 2 - 1.008 = 1845.79
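The same calculation can be written for any charge state. This is a minimal sketch with function names chosen for this example. It uses the proton mass 1.007276 u where the slide rounds to 1.008, which does not change the result at two decimals.

```python
PROTON = 1.007276  # mass of a proton in u


def singly_charged_mass(mz, z):
    """[M+H]+ of a peptide observed at m/z with charge z."""
    return mz * z - (z - 1) * PROTON


def neutral_mass(mz, z):
    """Neutral peptide mass M, with all z protons removed."""
    return (mz - PROTON) * z


print(f"[M+H]+ = {singly_charged_mass(923.40, 2):.2f}")
print(f"M      = {neutral_mass(923.40, 2):.2f}")
```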

peptide: AEGWNFQDEHGEDRR

The importance of high resolution

peptide mixture:
AEGWNFQDEHGEDRR ([M+H]+ = 1845.79)
VSAYVKPMITHALPYR ([M+H]+ = 1846.05)

[Figure: spectra of this mixture at R = 1,000, R = 5,000, R = 10,000 and R = 60,000]
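A rough estimate of the resolution needed to separate the two peptides follows from the common definition R = m/Δm. The sketch below assumes that definition and ignores peak shape and overlapping isotope patterns, so it gives only a lower bound. The function name is made up for this example.

```python
def required_resolution(mass_1, mass_2):
    """Resolution R = m / delta_m needed to tell two masses apart."""
    mean_mass = (mass_1 + mass_2) / 2
    return mean_mass / abs(mass_1 - mass_2)


needed = required_resolution(1845.79, 1846.05)
print(f"required R is about {needed:.0f}")

# Compare with the resolutions of the four spectra.
for r in (1_000, 5_000, 10_000, 60_000):
    verdict = "separated" if r >= needed else "not separated"
    print(f"R = {r:>6}: {verdict}")
```

Under these assumptions the threshold lies between R = 5,000 and R = 10,000.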

Sensitivity: attomolar (10^-18)

Dynamic range of an LTQ-Orbitrap

Mount Everest: 8850 m

LTQ-Orbitrap: 5000 (Mount Everest vs. human: 1.8 m)
Q-Exactive: 100 000 (Mount Everest vs. petri dish: 10 cm)

Data interpretation

Interpretation of MS/MS spectra

N-terminal fragments: PFASGHFK, PFASGHF, PFASG, PFAS, PF, P

C-terminal fragments: PFASGHFK, FASGHFK, ASGHFK, GHFK, HFK, FK, K

Automatic data analysis

We know:

• mass of precursor peptide (from MS scan)
• masses of fragments (from MS/MS scan)
• enzyme we have used
• organism we are analyzing

1. organism -> database with all human proteins
2. enzyme -> in silico digest -> all theoretical human peptides
3. mass of precursor -> select peptides matching the precursor mass (high mass accuracy important!) -> candidate peptides
4. masses of fragments -> compare theoretical fragment masses with observed fragment masses

Candidate peptides: score 96 (identified peptide), score 5, score 2
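The four steps above can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the cleavage rule (cut after K or R, but not before P, no missed cleavages), the 10 ppm precursor tolerance, the 0.02 fragment tolerance, the one-protein "database", and the toy score that simply counts matching fragments. Real search engines use more elaborate scores.

```python
import re

# Monoisotopic residue masses in u (standard values for the 20 amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def digest(protein):
    """Step 2: in silico digest (assumed rule: cut after K or R, not before P)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fragment_masses(peptide):
    """Singly charged N-terminal and C-terminal fragment m/z values."""
    masses = []
    for i in range(1, len(peptide)):
        n_term = sum(RESIDUE_MASS[aa] for aa in peptide[:i]) + PROTON
        c_term = sum(RESIDUE_MASS[aa] for aa in peptide[i:]) + WATER + PROTON
        masses.extend([n_term, c_term])
    return masses


def score(peptide, observed, tol=0.02):
    """Step 4: toy score = number of theoretical fragments found in the spectrum."""
    return sum(
        any(abs(theo - obs) <= tol for obs in observed)
        for theo in fragment_masses(peptide)
    )


def search(precursor_mass, observed, proteins, tol_ppm=10.0):
    """Steps 1 to 4: return (score, peptide) pairs, best first."""
    peptides = {pep for protein in proteins for pep in digest(protein)}
    hits = []
    for pep in peptides:
        error_ppm = abs(peptide_mass(pep) - precursor_mass) / precursor_mass * 1e6
        if error_ppm <= tol_ppm:  # step 3: precursor mass filter
            hits.append((score(pep, observed), pep))
    return sorted(hits, reverse=True)


# Hypothetical one-protein "database" built from the peptides on the slides.
database = ["PFASGHFKTSSSGHRHLFWTK"]
spectrum = fragment_masses("PFASGHFK")  # simulated, noise-free MS/MS peaks
print(search(peptide_mass("PFASGHFK"), spectrum, database))
```

The precursor filter in step 3 is what makes high mass accuracy pay off: the narrower the tolerance, the fewer candidate peptides have to be scored.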

From peptides to proteins: the protein inference problem

William of Ockham, 1288-1348:

Ockham's razor principle:

entia non sunt multiplicanda praeter necessitatem

(entities should not be multiplied beyond necessity)

-> Report the smallest list of proteins that is sufficient to explain all identified peptides
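One simple way to approximate such a smallest list is a greedy set cover: repeatedly pick the protein that explains the most still-unexplained peptides. This is a sketch of that heuristic only. It is not taken from the cited work, it is not guaranteed to find the true minimum, and the peptide and protein names are hypothetical.

```python
def minimal_protein_list(peptide_to_proteins):
    """Greedy approximation of the smallest protein list explaining all peptides.

    peptide_to_proteins maps each identified peptide to the proteins containing it.
    """
    protein_to_peptides = {}
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides.setdefault(protein, set()).add(peptide)

    unexplained = set(peptide_to_proteins)
    reported = []
    while unexplained and protein_to_peptides:
        # Sorting the names makes ties resolve alphabetically (deterministic).
        best = max(
            sorted(protein_to_peptides),
            key=lambda prot: len(protein_to_peptides[prot] & unexplained),
        )
        gained = protein_to_peptides[best] & unexplained
        if not gained:  # remaining peptides map to no protein at all
            break
        reported.append(best)
        unexplained -= gained
    return reported


# Hypothetical example: protein A alone explains all three peptides.
identified = {
    "pep1": ["Protein A", "Protein B"],
    "pep2": ["Protein A"],
    "pep3": ["Protein A", "Protein C"],
}
print(minimal_protein_list(identified))
```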

Nesvizhskii and Aebersold, 2005

Controlling the false-positive rate in shotgun proteomic data

• every search can contain false positive and false negative identifications
• manual verification of individual identifications is not feasible
• target-decoy database searching can be used to estimate the false positive rate
• this strategy is independent of the search engine used

Target-decoy database searching

MS-Data: Spectrum 1, Spectrum 2, Spectrum 3, Spectrum 4, …
Protein Database: Protein A, Protein B, Protein C, Protein D, …
Search engine

Results
Spectrum 1 -> Protein C
Spectrum 2 -> no match
Spectrum 3 -> Protein A
Spectrum 4 -> Protein F
…

This list can contain false positive IDs, but we do not know how many

“Control” database (should not contain the protein sequences you had in your sample)

MS-Data: Spectrum 1, Spectrum 2, Spectrum 3, Spectrum 4, …
“Control” database: Control Protein A, Control Protein B, Control Protein C, Control Protein D, …
Search engine (e.g. MASCOT)

Results
Spectrum 7 -> Control Protein T

This is a false positive hit (by definition)

The false positive rate (in %) is similar to the ratio of

(hits to control DB / hits to correct DB) × 100

For example:

(5 hits to control DB / 5000 hits to correct DB) × 100 = 0.1 %
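The ratio above is easy to compute once both searches have been run. This is a minimal sketch with a function name and argument names chosen for this example.

```python
def false_positive_rate_percent(control_hits, target_hits):
    """Estimated false positive rate in %: hits to control DB / hits to correct DB."""
    if target_hits <= 0:
        raise ValueError("need at least one hit to the correct database")
    return control_hits / target_hits * 100


# The example from the slide: 5 control hits, 5000 hits to the correct DB.
print(f"{false_positive_rate_percent(5, 5000):.1f} %")
```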

Which control database?

should be as similar to the target database as possible
• same number of proteins
• same length of proteins
• same amino acid composition

-> either reversed or randomized database

Advantage of reversed database:
• a defined control database is generated for every target database
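Building a reversed control database takes only a few lines. The sketch below assumes the target database is a simple mapping from protein name to sequence. The "Reversed" name prefix follows the slides, while the function names and example sequences are hypothetical.

```python
def reversed_database(target):
    """Reverse every sequence: same number, length and composition of proteins."""
    return {f"Reversed {name}": seq[::-1] for name, seq in target.items()}


def target_decoy_database(target):
    """Combine target and reversed proteins into one database for a single search."""
    combined = dict(target)
    combined.update(reversed_database(target))
    return combined


# Hypothetical two-protein target database.
target = {
    "Protein A": "PFASGHFKTSSSGHR",
    "Protein B": "HLFWTKAEGWNFQDEHGEDRR",
}
for name, seq in target_decoy_database(target).items():
    print(f"{name}: {seq}")
```

Because the reversal is deterministic, the same target database always yields the same control database, which is the advantage named above.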

MS-Data (Spectrum 1, Spectrum 2, …) searched against the reversed database

Results
Spectrum 5 -> Reversed Protein C, score 5

MS-Data (Spectrum 1, Spectrum 2, …) searched against the target database

Results
Spectrum 5 -> Protein G, score 67

Target-Decoy Database

MS-Data: Spectrum 1, Spectrum 2, Spectrum 3, Spectrum 4, …
Target-Decoy Database: Protein A, Protein B, Protein C, Protein D, …, Reversed Protein A, Reversed Protein B, Reversed Protein C, Reversed Protein D, …
Search engine (e.g. MASCOT)

Results
Spectrum 1 -> Protein C
Spectrum 2 -> no match
Spectrum 3 -> Protein A
Spectrum 4 -> Protein F
Spectrum 5 -> Protein G
Spectrum 6 -> Reversed Protein D
…

Summary: shotgun proteomics

• separation at the peptide level (RP-HPLC)
• high-throughput online LC-MS/MS
• higher sensitivity
• higher dynamic range
• higher speed
• peptide identification by database search
• protein inference problem
• target-decoy database to estimate false-positive rates

Thank you

Current techniques in proteomics is next...