Post-Translational Modification Identification by Mass Spectrometry

Liwen Zhang
Mass Spectrometry and Proteomics Facility, The Ohio State University
Summer Workshop 2015

PTM: Chemical Modifications on Proteins
• Increase the functional diversity of the proteome
• Regulate activity, localization and interaction
Image from https://www.lifetechnologies.com/overview‐post‐translational‐modification.html

Histone Modifications
[Figure: histone modifications. Labels: arginine methylation, lysine methylation, acetylation, sumoylation (K-GGXXX), phosphorylation, ubiquitination (K-GG), citrullination]

Post-translational Modifications

Modification              Residue  Mass shift (Da)   Function
Disulphide bond formation C        -2.0157           Protein stability
Methylation               K/R      14.0157           Regulation of gene expression
Hydroxylation             P        15.9949           Protein stability and protein-ligand interaction
Oxidation of Met          M        15.9949           Usually introduced during digestion
Acetylation               K        42.0106           Protein stability, protection of N terminus, regulation of protein-DNA interactions
S-Glutathionylation       C        305.0682          Oxidative stress; preventing irreversible oxidation of protein thiol; control of cell-signaling pathways by modulating protein function
Phosphorylation           S/T/Y    79.9663           Signaling, activation/inactivation of enzyme activities, modulation of molecular interactions
Nitration                 Y        44.9851           Oxidative damage during inflammation
Ubiquitination            K        114.0429          Destruction signal
Oxidation of Cys          C        31.9898/47.9847   Oxidative/nitrosative stress (sulfinic/sulfonic acid, +2 O/+3 O)
Nitrosylation             C        28.9902           Oxidative/nitrosative stress
Glycosylation             N/S/T    varies            Excreted proteins, cell-cell recognition/signaling; O-GlcNAc: reversible, regulatory functions

Methods for PTM Detection
• Specific fluorescence dyes: Pro-Q Diamond (phosphorylation), Pro-Q Emerald (glycosylation)
• Specific antibodies: nitration, phosphorylation, methylation, ubiquitination, acetylation
• Mass shift/pI change: gel approaches
Pros: high sensitivity; selective for the modification
Cons: lack of accuracy; interference from other residues; cannot locate the actual modification sites
The alternative: mass spectrometry.

Peptide Modification Identification by MS/MS
• Virtually anything that shifts the mass can be determined by MS
• MS/MS identifies both the modification AND its location
• You must have an idea of what modification you are looking for
• First use a protein stain to examine various modifications
• Western analysis vs. MS analysis: MS detection depends on the ionization efficiency and abundance of the peptides, so it may not favor modified peptides
• LC-MS/MS allows identification of low-abundance modifications

MS/MS Fragments and Different Fragmentation Techniques
[Figure: peptide backbone H2N-CHR1-CO-NH-CHR2-CO-...-NH-CHRn-COOH showing the N-terminal fragments a1/b1/c1 ... an-1/bn-1/cn-1 and the C-terminal fragments xn-1/yn-1/zn-1 ... x1/y1/z1]
CID: a, b and y ions; loss of H2O, NH3, side chain
PSD: a, b and y ions
IRMPD: b and y ions; loss of side chain
ECD: c, y and z ions
ETD: c, y and z ions

Sequence Nomenclature for Mass Ladder
Each residue contributes 56 Da (the backbone unit NH-CH-CO) plus its side chain R, so Mass = 56 Da + R. Adjacent ions in a series therefore differ by one residue:
b2 - b1 = 56 Da + R2
b3 - b2 = 56 Da + R3
Each y-ion step (y2 - y1, y3 - y2) likewise equals 56 Da plus the side chain of the residue added.
[Figure: example MS/MS spectrum with labeled peaks at m/z 401, 529, 586, 723, 852, 965, 1052, 1166, 1295, 1424 and 1598; residues are read off the peak spacings]

Worked example: peptide GLLEDLGYDVVVK (m/z axis 200 to 1400). The slides build the ladder one ion at a time; the complete assignment is:
b series: b3 283.82, b4 413.06 (+129.24, Glu), b5 528.07 (+115.01, Asp), b6 641.08 (+113.01, Leu), b7 698.27 (+57.19, Gly), b8 861.35 (+163.08, Tyr), b9 976.45 (+115.10, Asp), b10 1075.51 (+99.06, Val), b11 1174.68 (+99.17, Val), b12 1273.72 (+99.04, Val)
y series: y2 246.00, y3 345.32 (+99.32, Val), y4 444.29 (+98.97, Val), y5 559.27 (+114.98, Asp), y6 722.22 (+162.95, Tyr), y7 779.16 (+56.94, Gly), y8 892.36 (+113.20, Leu), y9 1007.39 (+115.03, Asp), y10 1136.48 (+129.09, Glu), y11 1249.63 (+113.15, Leu)

Database Searching
Mascot
• 16-node cluster for high-speed data processing
• Protein sequences are digested and fragmented in silico, which produces an enormous peak list
• Raw MS/MS data are converted to a peak list and compared against the in silico peak list

Critical Points of Database Searching
• The enzyme must work properly (no non-specific cleavage or missed cuts)
• The mass accuracy limits must be set appropriately
• Any modifications must be accounted for (e.g. modified cysteine)
• The database must contain the protein

Mascot Database Search Engine
[Figure: Mascot search form. Callouts: enzyme used; mass tolerance of 20 ppm for Orbitrap data, 1.5 Da for LTQ data]

MASCOT Results
[Figure: Mascot result report. Callouts: protein name, protein molecular weight, protein Mascot Mowse score, number of spectra matched, number of peptides matched, emPAI (the Exponentially Modified Protein Abundance Index)]

Example 1: Phosphorylation Identification by MS/MS
Peptide: 41-TSSDNNVS(PO4)VTNVSVAK-56, phosphorylated at S48
Precursor m/z = 850.54 (2+); [M - H3PO4] = 802.43 (2+)
Notation: bn* = bn - H3PO4; yn* = yn - H3PO4
b ions: b4 391.12, b5 505.27, b6 619.29, b7 718.37, b9 984.39, b10 1085.46, b11 1199.51, b12 1298.46, b13 1385.54, b14 1484.46
b* ions: b9* 886.36, b11* 1101.51, b12* 1200.51, b13* 1287.76, b14* 1386.63, b15* 1457.78
Residues read from b4 to b14: Asn, Asn, Val, Ser(PO4)-Val, Thr, Asn, Val, Ser, Val. The b7 to b9 gap is 266.02 (87 + 80 + 99).
y ions: y3 317.28, y4 404.24, y5 503.32, y6 617.36, y7 718.37, y8 817.37, y9 984.39, y10 1083.45, y11 1197.51, y12 1311.59, y13 1426.56
y* ions: y9* 886.36, y11* 1099.56, y12* 1213.70
Residues read from y3 to y13: Ser, Val, Asn, Thr, Val, Ser(PO4), Val, Asn, Asn, Asp. The y8 to y9 gap is 167.02 (87 + 80).
[Figure: annotated MS/MS spectrum, m/z 300 to 1500]

Example 2: Nitration Identification by MS/MS
Peptide: 241-EAFPAY(NO2)RQPPESL-253, nitrated at Y246
Measured m/z = 775.61 (2+); theoretical m/z = 775.37 (2+)
b ions: b3 348.10, b4 445.22, b7 880.42, b8 1008.45, b9 1105.51, b10 1202.57, b11 1331.59; doubly charged: b10 601.99, b11 666.50, b12 709.87
Residues read from the b ladder: Pro (b3 to b4); Gln, Pro, Pro, Glu (b7 to b11)
y ions: y2 219.15, y3 348.10, y4 445.22, y5 542.31, y6 670.43, y7 826.20, y8 1034.50, y9 1105.51, y10 1202.57, y11 1349.69; doubly charged: y11 675.55
Residues read from y2 to y11: Glu, Pro, Pro, Gln, Arg, Tyr(NO2), Ala, Pro, Phe. The y7 to y8 gap is 208.3 (163 + 45).
[Figure: annotated MS/MS spectrum, m/z 200 to 1400]

Example 3: Identification of Cysteine Oxidation by MS/MS
Peptide: 529-VGSVLQEGC(SO3)EKISSLYGDLR-548
b ions: b6 584.26, b7 713.38, b8 770.40, b9 921.28, b10 1050.21, b11 1178.34, b12 1291.39, b13 1378.34, b14 1465.42, b15 1578.48, b16 1741.58, b17 1798.75, b18 1913.59; doubly charged: b19 1014.18
Residues read from b6 to b18: Glu, Gly, Cys(SO3), Glu, Lys, Ile, Ser, Ser, Leu, Tyr, Gly, Asp. The b8 to b9 gap is 150.88 (103 + 48).
y ions: y4 460.24, y5 623.30, y6 736.40, y7 823.41, y8 910.39, y9 1023.55, y10 1151.56, y11 1280.54, y12 1431.51, y13 1488.48, y14 1617.48, y15 1745.66, y16 1858.62; doubly charged: y15 873.54, y16 930.16, y17 979.48
Residues read from y4 to y16: Tyr, Leu, Ser, Ser, Ile, Lys, Glu, Cys(SO3), Gly, Glu, Gln, Leu. The y11 to y12 gap is 150.97 (103 + 48).
[Figure: annotated MS/MS spectrum, m/z 400 to 2000]

Example 4: Identification of DMPO Adduct by MS/MS
Peptide: 900-EEWKWFRC(DMPO)PTLL-911, DMPO adduct at C907
DMPO = C6H9NO (111.0684 Da)
Theoretical m/z = 859.94 (2+); observed m/z = 859.89 (2+)
b ions: b4 573.11, b5 759.32 (+186.21, Trp), b6 906.56 (+147.24, Phe), b7 1062.49 (+155.93, Arg), b8 1276.67 (+214.18 = 103 + 111, Cys-DMPO), b10 1474.67 (+198.00 = 97 + 101, Pro-Thr); doubly charged: b8 638.89, b10 738.12, b11 794.64
y ions: y7 960.47, y9 1274.57; doubly charged: y9 638.31, y11 795.57
[Figure: annotated MS/MS spectrum, m/z 500 to 1500]

PTM Identification: When the Modification Is on the Terminus of the Peptide
Peptide: 44-X-KEGVVHGVATVAEK-60, where X is the unknown N-terminal part
b ions: b2 357.19, b3 486.30 (+129.11, E), b4 543.30 (+57.00, G), b5 642.31 (+99.01, V), b6 741.36 (+99.05, V), b7 878.49 (+137.13, H), b8 935.48 (+56.99, G), b9 1034.45 (+98.97, V), b10 1105.51 (+71.06, A), b11 1206.44 (+100.93, T), b12 1305.59 (+99.15, V), b13 1376.51 (+70.92, A), b14 1505.59 (+129.08, E); doubly charged: b14 753.58
y ions: y2 276.20, y3 347.18 (+70.98, A), y4 446.27 (+99.09, V), y5 547.27 (+101.00, T), y6 618.41 (+71.14, A), y7 717.41 (+99.00, V), y8 774.38 (+56.97, G), y9 911.36 (+136.98, H), y10 1010.47 (+99.11, V), y11 1109.47 (+99.00, V), y12 1166.53 (+57.06, G), y13 1295.49 (+128.96, E), y14 1423.55 (+128.06, K); doubly charged: y14 712.63
Because the b and y ladders cover every residue except X, the mass of X follows from either series:
Mass of X = (M+H) - y14 = 826.45*2 - 1 - 1423.55 = 228.35
Mass of X = b2 - Mass of Lysine - H = 357.19 - 128.02 - 1.01 = 228.16
[Figure: annotated MS/MS spectrum, m/z 200 to 1600]
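The terminal-modification arithmetic (Mass of X from the precursor and y14, or from b2) can be sketched in a few lines of Python. This is a minimal illustration rather than the facility's procedure. The function names are hypothetical, and the proton and lysine masses are standard monoisotopic values, not the rounded figures on the slide, so the results differ slightly from 228.35 and 228.16.

```python
PROTON = 1.00728  # proton mass in Da (standard value; the slide rounds it to 1 or 1.01)
LYS = 128.09496   # monoisotopic lysine residue mass (standard value; the slide uses 128.02)


def mh_from_mz(mz, charge):
    """Convert an observed m/z at a given charge to the singly protonated mass (M+H)+."""
    return mz * charge - (charge - 1) * PROTON


def unknown_from_y(precursor_mz, charge, y_ion):
    """Unknown N-terminal mass: (M+H)+ minus the y ion that covers all known residues."""
    return mh_from_mz(precursor_mz, charge) - y_ion


def unknown_from_b(b_ion, known_residue_masses):
    """Unknown N-terminal mass: b ion minus the known residues minus the proton."""
    return b_ion - sum(known_residue_masses) - PROTON


# Values read from the slide: precursor 826.45 (2+), y14 1423.55, b2 357.19.
x_from_y = unknown_from_y(826.45, 2, 1423.55)
x_from_b = unknown_from_b(357.19, [LYS])
print(f"X from y14: {x_from_y:.2f} Da; X from b2: {x_from_b:.2f} Da")
```

The two estimates come out near 228.3 and 228.1 Da. Agreement within a few tenths of a dalton is about what ion-trap fragment data can support, so X would then be compared against candidate residue-plus-modification masses.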
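The mass-ladder idea (adjacent b or y ions differ by one residue mass) can be sketched as follows, using the GLLEDLGYDVVVK peaks from the slides as input. The residue masses are standard monoisotopic values that do not appear in the slides, the function names are hypothetical, and the 0.3 Da tolerance is an assumption chosen to suit ion-trap data. Leu and Ile are isobaric, so the sketch reports both as L.

```python
# Monoisotopic residue masses in Da (standard values, not taken from the slides).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728
WATER = 18.01056


def b_ions(seq):
    """Theoretical singly charged b ions b1..b(n-1): N-terminal residue sums plus a proton."""
    total, out = PROTON, []
    for aa in seq[:-1]:
        total += RESIDUE_MASS[aa]
        out.append(total)
    return out


def y_ions(seq):
    """Theoretical singly charged y ions y1..y(n-1): C-terminal residue sums plus water and a proton."""
    total, out = WATER + PROTON, []
    for aa in reversed(seq[1:]):
        total += RESIDUE_MASS[aa]
        out.append(total)
    return out


def read_ladder(peaks, tol=0.3):
    """Assign the closest residue to each gap between consecutive ladder peaks ('?' if none fits)."""
    calls = []
    for low, high in zip(peaks, peaks[1:]):
        gap = high - low
        best = min(RESIDUE_MASS, key=lambda aa: abs(RESIDUE_MASS[aa] - gap))
        calls.append(best if abs(RESIDUE_MASS[best] - gap) <= tol else "?")
    return "".join(calls)


# Observed b3..b12 and y2..y11 from the GLLEDLGYDVVVK slides.
observed_b = [283.82, 413.06, 528.07, 641.08, 698.27, 861.35, 976.45, 1075.51, 1174.68, 1273.72]
observed_y = [246.00, 345.32, 444.29, 559.27, 722.22, 779.16, 892.36, 1007.39, 1136.48, 1249.63]
print(read_ladder(observed_b))  # residues 4..12, N to C
print(read_ladder(observed_y))  # residues 11..3, C to N
```

Reading the b ladder gives the sequence from the N-terminal side and the y ladder gives it in reverse, which is why the two reads mirror each other on the slide.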
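In Examples 1 to 4 the modified residue shows up as a ladder gap that matches no plain residue, such as 167.02 = 87 + 80 for Ser(PO4). A minimal sketch of that lookup is below. The shift values follow the modification table and the DMPO example, with the standard +3 O value for cysteine sulfonic acid. The function name `explain_gap` and the 0.3 Da tolerance are assumptions made for illustration.

```python
# Monoisotopic residue masses in Da (standard values) for the residues in the PTM table.
RESIDUE_MASS = {
    "S": 87.03203, "T": 101.04768, "Y": 163.06333, "C": 103.00919,
    "K": 128.09496, "R": 156.10111, "M": 131.04049, "P": 97.05276,
}

# (modification, target residues, mass shift in Da)
PTM_SHIFTS = [
    ("Methylation", "KR", 14.0157),
    ("Hydroxylation", "P", 15.9949),
    ("Oxidation", "M", 15.9949),
    ("Nitrosylation", "C", 28.9902),
    ("Acetylation", "K", 42.0106),
    ("Nitration", "Y", 44.9851),
    ("Trioxidation (SO3H)", "C", 47.9847),
    ("Phosphorylation", "STY", 79.9663),
    ("DMPO adduct", "C", 111.0684),
    ("Ubiquitination (GG)", "K", 114.0429),
    ("S-Glutathionylation", "C", 305.0682),
]


def explain_gap(gap, tol=0.3):
    """Return every (residue, modification) whose combined mass matches a ladder gap within tol."""
    hits = []
    for name, residues, shift in PTM_SHIFTS:
        for aa in residues:
            if abs(RESIDUE_MASS[aa] + shift - gap) <= tol:
                hits.append((aa, name))
    return hits


# Gaps reported in Examples 1 to 4.
for gap in (167.02, 208.3, 150.88, 214.18):
    print(gap, explain_gap(gap))
```

A gap can match more than one candidate at wide tolerances, which is one reason the slides stress knowing which modification you are looking for before searching.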
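The point that mass accuracy limits must be set appropriately (20 ppm for Orbitrap, 1.5 Da for LTQ) can be made concrete with the Example 2 precursor, measured at 775.61 against a theoretical 775.37. The sketch below is an illustration only. The function names are hypothetical, and the tolerance is applied directly to m/z values, a simplification, since a search engine may apply it to the neutral peptide mass instead.

```python
def mass_error(observed, theoretical):
    """Return the error as (absolute difference in Da, relative difference in ppm)."""
    delta = observed - theoretical
    return delta, delta / theoretical * 1e6


def within_tolerance(observed, theoretical, tol, unit):
    """Check a match against a tolerance given in 'Da' or 'ppm'."""
    delta_da, delta_ppm = mass_error(observed, theoretical)
    if unit == "Da":
        return abs(delta_da) <= tol
    if unit == "ppm":
        return abs(delta_ppm) <= tol
    raise ValueError(f"unknown tolerance unit: {unit}")


# Example 2 precursor: measured vs. theoretical m/z (both 2+).
measured, theoretical = 775.61, 775.37
delta_da, delta_ppm = mass_error(measured, theoretical)
print(f"error: {delta_da:.2f} Da = {delta_ppm:.0f} ppm")
print("passes 1.5 Da window:", within_tolerance(measured, theoretical, 1.5, "Da"))
print("passes 20 ppm window:", within_tolerance(measured, theoretical, 20, "ppm"))
```

An error of roughly 0.24 Da is acceptable under the 1.5 Da ion-trap window but falls far outside 20 ppm, so a tolerance copied from the wrong instrument class would either reject this match or admit many false ones.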

