Fundamentals of Biological Mass Spectrometry and Proteomics
Total Page:16
File Type:pdf, Size:1020Kb
Fundamentals of Biological Mass Spectrometry and Proteomics Steve Carr Broad Institute of MIT and Harvard Modern Mass Spectrometer (MS) Systems Orbitrap Q-Exactive Triple Quadrupole Discovery/Global Experiments Targeted MS MS systems used for proteomics have 4 tasks: • Create ions from analyte molecules • Separate the ions based on charge and mass • Detect ions and determine their mass-to-charge • Select and fragment ions of interest to provide structural information (MS/MS) Electrospray MS: ease of coupling to liquid-based separation methods has made it the key technology in proteomics Possible Sample Inlets Syringe Pump Sample Injection Loop Liquid Autosampler, HPLC Capillary Electrophoresis Expansion of the Ion Formation and Sampling Regions Nitrogen Drying Gas Electrospray Atmosphere Vacuum Needle 3- 5 kV Liquid Nebulizing Gas Droplets Ions Containing Solvated Ions Isotopes Most elements have more than one stable isotope. For example, most carbon atoms have a mass of 12 Da, but in nature, 1.1% of C atoms have an extra neutron, making their mass 13 Da. Why do we care? Mass spectrometers “see” the isotope peaks provided the resolution is high enough. If an MS instrument has resolution high enough to resolve these isotopes, better mass accuracy is achieved. Stable isotopes of most abundant elements of peptides Element Mass Abundance H 1.0078 99.985% 2.0141 0.015 C 12.0000 98.89 13.0034 1.11 N 14.0031 99.64 15.0001 0.36 O 15.9949 99.76 16.9991 0.04 17.9992 0.20 Monoisotopic mass and isotopes We use instruments that resolve the isotopes enabling us to accurately measure the monoisotopic mass MonoisotopicMonoisotopic mass; all 12C, mass no 13C atoms corresponds to 13 lowestOne massC atom peak Two 13C atoms Angiotensin I (MW = 1295.6) (M+H)+ = C62 H90 N17 O14 TheWhen monoisotopic the isotopes mass of aare molecule clearly is the resolved sum of the the accurate monoisotopic masses for the massmost abundant isotope of each element present. As the number of atoms of any given element increases,is used the as percentage it is the of most the population accurate of molecules measurement. having one or more atoms of a heavier isotope of this element also increases. The most significant contributors to the isotopic peak pattern for peptides is the 13C isotope of carbon (1.1%) and 15N peak of nitrogen (0.36%). Example of electrospray mass spectrum of mixture of 3 peptides Isotope spacing = 1.0: Ion is singly charged: (M+H)1+ MW = 584.2 Isotope spacing = 0.5: 2+ Isotope spacing = 0.3: Doubly charged: (M+2H) Triply charged: (M+3H)3+ MW = 1096.2 MW = 1806.6 586.2 603.5 Example of electrospray mass spectrum of mixture of 3 peptides Peptide 3: MW = 1806.6 F-G-G-F-T-G-A-R-K-S-A-R-K-L-A-N-Q Peptide 2: MW = 1096.2 F-G-G-F-T-G-A-R-K-S-A Peptide 1: MW = 584.2 F-G-G-F-T-G Peptide 1 (M+H)1+ Peptide 2 1-6 Peptide 3 (M +2H)2+ (M+3H)3+ Nociceptin 1-11 1-11 1-17 Nociceptin 586.2 Peptide 2 Nociceptin (-H, +Na) 603.5 How we sequence peptides: MS/MS intact peptide parent ions Scan m/z Pass Pass 350-1200 All ions All ions MS Collision MS 1 Cell (off) MS 2 MS or mass spectrum HPLC Column MS/MS Collision Cell (on) Pass m/z Fragment Pass MS/MS spectrum of doubly 834-838 all ions All ions charge ion at m/z 836.5 MS/MS means using two mass analyzers (combined in one instrument) to select an analyte (ion) from a mixture, then generate fragments from it to give structural information. Dominant fragment ions observed by collision- induced dissociation (CID) of peptides b ions y ions Direct cleavage Rearrangement of of peptide bond mobile proton Example electrospray MS/MS spectrum of a peptide F-I-I-V-G-Y-V-D-D-T-Q-F-V-R 1199.53 100 90 80 880.46 70 Δ=115 Δ=115 979.49 Asp Asp 60 Δ=128 Δ=99 Δ=57 Δ=99 Gln Val 50 792.35 Gly Val Δ=147 693.26 Phe 1142.53 1298.60 40 706.62 Δ=101 1497.46 Thr Δ=163 30 Tyr 650.44 1251.46 765.40 20 421.33 473.15 549.46 907.26 1124.44 1398.48 10 261.30 818.64 Relative Abundance, % Relative 0 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 1600 m/z Example electrospray MS/MS spectrum of a peptide 11 10 9 8 7 6 5 4 3 2 y ions F-I-I-V-G-Y-V-D-D-T-Q-F-V-R b ions 2 3 4 6 7 8 9 10 11 12 13 1199.53 100 Δ=115 y 90 10 Asp y7 80 Δ=115 y8 Asp 880.46 Δ=99 70 Val 979.49 Δ=57 b7 Gly 60 Δ=128 b6 Δ=99 y11 Gln Val Δ=163 y9 50 b3 792.35 b Δ=101 693.26 Tyr 13 b4 1142.53 1298.60 40 706.62 Thr 1497.46 y5 y b10 y 6 b11 30 y 3 2 b8 y4 650.44 1251.46 b b 765.40 12 20 2 421.33 b 473.15 549.46 907.26 9 1124.44 1398.48 10 261.30 818.64 Relative Abundance, % Relative 0 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 1600 m/z Discovery Proteomics: differential expression profiling by MS Biological Samples LC-MS/MS Data Analysis (case vs. control) Protein Mixtures • Biofluids Separate and Analyze Search DB using peptide • Tissue lysates Peptides by LC-MS/MS m/z and sequence • digest to peptides • m/z and intensity of peptides • Peptide identity • fractionate peptides • rich pattern • Protein identity • Fragment ions for sequence • Relative abundance Most analyses of proteins are done by digestion of proteins to peptides (“bottom-up” proteomics) Advantages: • Data acquisition easily automated • Fragmentation of tryptic peptides well understood • Reliable software available for analysis • Separation of peptides to create less complex subsets of the proteome for MS analysis is far easier than for proteins (relates to breadth and depth of coverage) Disadvantages: • Simple relationship between peptide and protein lost • Took highly complex mixture and made it 20-100x more complex • Puts high analytical demands on instrumentation Obtaining sequence information on intact proteins: “top-down” proteomics Most useful for single proteins or relatively simple mixtures (1) Can distinguish sequence variants Enables deciphering of combinatorial modification “codes” on proteins like histones (2). While useful, its not suitable for most biomedical applications yet: – requires highly specialized instrumentation – cannot be easily applied to complex biological samples – Data interpretation is far more difficult and less automated – Breadth and depth of coverage of the proteome is orders of magnitude less than for bottom-up proteomics 1. Tran et al.Nature, 2011, 480(7376) p. 254-8 2. Tian et al. Genome Biology 2012, 13:R86 A selective look into the proteomic “tool chest” Reduction and Alkylation Routinely done prior to enzymatic digestion to break disulfide bonds, unfolding proteins to make them more susceptible to enzymatic cleavage A selective look into the proteomic “tool chest” Highly Specific Proteases Trypsin C-terminal to Arg and Lys Lys-C C-terminal to Lys Staph. V8 C-terminal to Glu and Asp Asp-N N-terminal to Asp Non-Specific Proteases Chymotrypsin C-terminal to aromatic, aliphatic (e.g., Tyr, Trp, Phe, Leu) Proteinase K, Thermolysis C-terminal to aromatic, aliphatic Discovery Proteomics: differential expression profiling by MS Biological Samples LC-MS/MS Data Analysis (case vs. control) Protein Mixtures • Biofluids Separate and Analyze Search DB using peptide • Tissue lysates Peptides by LC-MS/MS m/z and sequence • digest to peptides • m/z and intensity of peptides • Peptide identity • fractionate peptides • rich pattern • Protein identity • Fragment ions for sequence • Relative abundance Peptide Sequencing by LC/MS/MS MS-1 Collision MS-2 peptide from Q I K L COOHCell protein of interest E NH F 2 NH2 Y NH2 Y G NH 1D or 2D LC 2 Y G G G Y G G F separation Y NH2 G COOH Y G G F L NH2 L NH2 F COOH G G F L COOH G F L COOH L F COOH M A P L S A COOH A R N Trypsin digest S P N Reduce, alkylate Select mass from Sequence peptide of interest peptide of interest proteins Automated Peptide Sequencing by LC/MS/MS RT: 0.01 - 90.15 340 Total Ion Current Trace produced during LC-MS/MSNL: 10 0 505 12 54 B 9.54E 7 A 510 80 385 B ase Peak F : + 465 c Full ms [ 60 1150 12 9 9 300.00 - 335 415 645 10 70 114 5 2000.00] 535 695 10 15 10 6 0 40 295 555 630 705 175 955 116 5 1175 13 0 4 18 0 250 650 795 860 875 925 13 59 20 585 760 970 13 0 9 Relative Abundance Relative 170 41 31 14 0 13 6 9 14 6 9 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 T ime (min) 10 0 535.8 10 0 500.6 684.6 B 403.9 A 750.0 50 493.6 50 616.6 713.0 Relative Abundance Relative Relative Abundance Relative 0 0 500 10 0 0 150 0 2000 500 10 0 0 150 0 2000 m/z m/z 644.0 10 0 658.9 10 0 MS/MS 535.8 MS/MS 500.6 653.0 50 598.5 526.9 10 0 50 10 0 596.7 MS/MS 403.9 700.5 MS/MS 684.6 511.0 213.1 469.5 565.1 10 14 .