Post-translational Modification Identification by Mass Spectrometry
Liwen Zhang Mass Spectrometry and Proteomics Facility The Ohio State University
Summer Workshop 2015 PTM: Chemical Modifications on Proteins Increases the Functional Diversity of the Proteome Regulate Activity, Localization and Interaction
Image from https://www.lifetechnologies.com/overview‐post‐translational‐modification.html Histone Modifications
Arginine Methylation Lysine Methylation
Acetylation Sumoylation
K‐GGXXX
Phosphorylation Ubiquinylation
Citrullination
K‐GG Post-translational Modifications
Modified Modification Mass Shift (Da) Function Residue
Disulphide bond C ‐2.0157 formation protein stability
Methylation K/R 14.0157 regulation of gene expression
Hydroxylation P 15.9949 Protein stability and protein-ligand interaction
Oxidation of Met M 15.9949 usually introduced during digestion
Acetylation K 42.0106 Protein stability, protection of N terminus, regualtion of protein-DNA interactions oxidative stress; preventing irreversible oxidation of protein thiol; control of cell- 305.0682 S‐Glutathionylation C signaling pathways by modulating protein function Signaling, activation/inactivation of enzyme activities, modulation of molecular Phosphorylation S/T/Y 79.9663 interactions
Nitration Y 44.9851 oxidative damage during inflammation
Ubiquitination K 114.0429 destruction signal
Oxidation of Cys C 31.9721/47.9847 oxidative/nitrosative stress Nitrosylation C 28.9902 oxidative/nitrosative stress Excreted proteins, cell-cell recognition/signaling O-GlcNAc, reversible, regulatory Glycosylation N/S/T ‐‐‐‐‐‐‐‐ functions Method for PTM Detections
• Specific Fluorescence Dye: Pro-Q Diamond---Phosphorylation Pro-Q Emerald---Glycosylation • Specific Antibodies: Nitration; Phosphorylation; Methylation; Ubiquitination; Acetylation • Mass Shift/PI Change: Gel Approaches
Pros: High sensitivity, Favor the modification Cons: Lack of accuracy Interference from other residues Lack the ability to locate the actual modification sites
Mass Spectrometry!!!!! Peptide Modifications Identification by MS/MS
• Virtually anything that shifts the mass can be determined by MS
• Using MS/MS allows for identification of the modification AND location
• You must have an idea of what modification you are looking for
• First use protein stain to examine various modifications
• Western analysis VS MS analysis
Cons: Detection based on ionization efficiency and abundance of the peptides, thus may not favor modified peptides.
• LC-MS/MS allows for the identification of low abundant modifications MSMS Fragments and Different Fragmentation Technique
xn-1 yn-1 zn-1 xn-2 yn-2 zn-2 x1 y1 z1 R O R O R O R O 1 2 n-1 n H2N CCN CC … N CC N CCOH H H H H H H
a1 b1 c1 a2 b2 c2 an-1 bn-1 cn-1
CID: a, b and y ion, loss of H2O, NH3, side-chain PSD: a, b and y ion IRMPD: b and y ion, loss of side-chain ECD: c, y and z ions ETD: c, y and z ions Sequence Nomenclature for Mass Ladder
X b2‐b1=56Da + R2 b3‐b2=56Da + R2 X b1 b2 b3
Mass=56Da+R
y3 y2 y1
y3‐y2=56Da + R2 y2‐y1=56Da + R2 1598
723 965 1166 529 1424 401 852 586 1052 1295 QGH E L S N EER G L L E D L G Y D V V V K
200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 m/z b12 G L L E D L G Y D V V V K y2
1273.72 b12
246.00 y2
200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 m/z b11 b12 G L L E D L G Y D V V V K y3 y2
1174.68 1273.72 b11 b12
99.04 Val
246.00 345.32 y2 y3
99.32 Val
200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 m/z b10 b11 b12 G L L E D L G Y D V V V K y4y3 y2
1075.51 1174.68 1273.72 b10 b11 b12
99.17 99.04 Val Val
246.00 345.32 444.29 y2 y3 y4
99.32 98.97 Val Val
200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 m/z b9 b10 b11 b12 G L L E D L G Y D V V V K y5 y4y3 y2
976.45 1075.51 1174.68 1273.72 b9 b10 b11 b12
99.06 99.17 99.04 Val Val Val
246.00 345.32 444.29 559.27 y2 y3 y4 y5
99.32 98.97 114.98 Val Val Asp
200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 m/z b3 b4 b5 b6b7 b8 b9 b10 b11 b12 G L L E D L G Y D V V V K y11 y10 y9 y8 y7 y6 y5 y4y3 y2
283.82 413.06 528.07 641.08 698.27 861.35 976.45 1075.51 1174.68 1273.72 b3 b4 b5 b6 b7 b8 b9 b10 b11 b12
129.24 115.01 113.01 57.19 163.08 115.10 99.06 99.17 99.04 Glu Asp Leu Gly Tyr Asp Val Val Val
246.00 345.32 444.29 559.27 722.22 779.16 892.36 1007.39 1136.48 1249.63 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11
99.32 98.97 114.98 162.95 56.94 113.20 115.03 129.09 113.19 Val Val Asp Tyr Gly Leu Asp Glu Leu
200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 m/z Database Searching
Mascot •16 node cluster for high-speed data processing
Protein sequences are digested and fragmented In Silico which produces an enormous peak list
Raw MS/MS data is converted to a peak list and compared against the In Silico peak list.
Critical point of database searching •The enzyme must work properly (no non-specific cleavage and missed cuts) •The mass accuracy limits must be set appropriately •Any modifications must be accounted for (modified cysteine) •The database must contain the protein Mascot Database Search Engine
Enzyme Used
20ppm for orbitrap 1.5 Da for LTQ MASCOT Results
Protein Mol. Weight
Protein Mascot # of spectra matched Mowse Score # of peptides matched * Protein Name
The Exponentially Modified Protein Abundance Index (emPAI)
* Example 1 — Phosphorylation Identification by MS/MS
b9 b11 b12 b13 b14 b4 b5 b6 b7 b9* b10 b11* b12* b13* b14* b15* 48 41T S S D N N V S(PO4) V T N V S V A K56 y13 y12 y11 y10y9 y8 y7 y6 y5 y4 y3 y12* y11* y9* bn*=bn-H3PO4 2+ yn*=yn-H3PO4 m/z=850.54 802.432+ 391.12 505.27 619.29 718.37 M-H3PO4 984.39 1085.46 1199.51 1298.46 1385.54 1484.46 b4 b5 b6 b7 266.02 b9 b10 b11 b12 b13 b14 (87+80+99) Asn Asn Val ThrAsn Val Ser Val Ser(PO4)-Val
886.36 1200.51 1386.63 b9* b12* b14* 1287.76 1101.51 b13* 1457.78 b11* b15*
317.28 404.24 503.32 617.36 718.37 817.37 984.39 1083.45 1197.51 1311.59 1426.56 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12 y13 167.02 (87+80) Ser Val Asn Thr Val Val Asn Asn Asp Ser(PO4) 886.36 y9* 1099.56 1213.70 y11* y12*
300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 m/z Example 2 — Nitration Identification by MS/MS
b3 b4 b7 b8b9 b10 b11 b12 246 241E A F P A Y (N2O) R Q P P E S L253 y11 y10 y9 y8 y7 y6 y5 y4 y3 y2
Measured m/z = 775.612+ Theoretical m/z = 775.372+
348.10 445.22 880.42 1008.45 1105.51 1202.57 1331.59 b3 b4 b7 b8 b9 b10 b11 601.992+ b10 Pro Gln Pro Pro Glu
219.15 348.10 445.22 542.31 670.43 826.20 1034.50 1105.51 1202.57 1349.69 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 675.552+ y11 208.3 (163+45) Glu Pro Pro Gln Arg Ala Pro Phe Tyr(N2O)
666.502+ b11 709.872+ b12
200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 m/z Example 3 — Identification of Cysteine Oxidation by MS/MS y17 y16 y15 y14 y13 y12 y11 y10 y9 y8 y7 y6 y5 y4 529 548 V G S V L Q E G C(SO3) E K I S S L Y G D L R b6 b7 b8 b9 b10 b11 b12 b13 b14 b15 b16 b17 b18 b19
584.26 713.38 1050.21 1178.34 1291.39 1465.42 1578.48 1741.58 1913.59 b6 b7 b10 b11 b12 b14 b15 b16 b18 770.40 921.28 1378.34 1798.75 b8 b9 b13 b17 150.88 (103+48) EGC(SO3) EKISSL Y GD
1014.182+ b19
460.24 623.30 736.40 910.39 1023.55 1151.56 1280.54 1431.51 1617.48 1745.66 1858.62 y4 y5 y6 y8 y9 y10 y11 y12 y14 y15 y16 823.41 1488.48 y7 y13 150.97 (103+48) Y L S S I K E C(SO3) G E QL
873.542+ 930.162+ y15 y16
979.482+ y17
400 600 800 1000 1200 1400 1600 1800 2000 m/z Example 4 — Identification of DMPO Adduct by MS/MS
b4 b5 b6 b7 b8 b10 b11 907 900E E W K W F R C (DMPO) P T L L911 y11 y9 y7
Theoretical m/z=859.942+ 2+ 794.642+ Observed m/z=859.89 2+ 738.12 b11 b10
DMPO=C6H9NO (111.0684 Da)
573.11 759.32 906.56 1062.49 1276.67 1474.67 b4 b5 b6 b7 b8 b10
186.21 97.24 155.93 214.18(103+111) 198.00 (97+101) Trp Phe Arg C-DMPO Pro-Thr
638.892+ b8 638.312+ y9 795.572+ 1274.57 y11 y9
960.47 y7
500 600 700 800 900 1000 1100 1200 1300 1400 1500 m/z PTM Identification: when modification is on the terminus of the peptide
b2 b3 b4 b5 b6b7 b8 b9 b10 b11 b12 b13 b14 44X K E G V V H G V A T V A E K60 y14 y13 y12 y11 y10 y9 y8 y7 y6 y5 y4y3 y2
753.582+ 543.30 b14 935.48 1105.51 b4 b8 b10 357.19 486.30 642.31 741.36 878.49 1034.45 1206.44 1305.59 1376.51 1505.59 b2 b3 b5 b6 b7 b9 b11 b12 b13 b14
129.11 57.00 99.01 99.05 137.13 56.99 97.97 71.06 100.93 99.15 70.92 129.98 E G V V H G V A T V A E
276.20 347.18 446.27 547.27 618.41 717.41 911.36 1010.47 1166.53 1295.49 1423.55 y2 y3 y4 y5 y6 y7 774.38 y9 y10 1109.47y12 y13 y14 y8 y11
70.98 99.09 101.00 71.13 99.00 56.97 136.98 99.11 99.00 57.06 128.96 128.06 A V T A V G H V V G E K 712.632+ y14
200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 1600 m/z PTM Identification: when modification is on the terminus of the peptide
b2 b3 b4 b5 b6b7 b8 b9 b10 b11 b12 b13 b14 44X K E G V V H G V A T V A E K60 y14 y13 y12 y11 y10 y9 y8 y7 y6 y5 y4y3 y2
Mass of X=(M+H)-y14=826.45*2-1-1423.55=228.35 Mass of X=b2-Mass of Lysine-H=357.19-128.02-1.01=228.16
753.582+ 543.30 b14 935.48 1105.51 b4 b8 b10 357.19 486.30 642.31 741.36 878.49 1034.45 1206.44 1305.59 1376.51 1505.59 b2 b3 b5 b6 b7 b9 b11 b12 b13 b14
129.11 57.00 99.01 99.05 137.13 56.99 97.97 71.06 100.93 99.15 70.92 129.98 E G V V H G V A T V A E
276.20 347.18 446.27 547.27 618.41 717.41 911.36 1010.47 1166.53 1295.49 1423.55 y2 y3 y4 y5 y6 y7 774.38 y9 y10 1109.47y12 y13 y14 y8 y11
70.98 99.09 101.00 71.13 99.00 56.97 136.98 99.11 99.00 57.06 128.96 128.06 A V T A V G H V V G E K 712.632+ y14
200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 1600 m/z PTM Identification: when modification is on the terminus of the peptide
b2 b3 b4 b5 b6b7 b8 b9 b10 b11 b12 b13 b14 44X K E G V V H G V A T V A E K60 y14 y13 y12 y11 y10 y9 y8 y7 y6 y5 y4y3 y2
Modification occurs on lysine residue Mass of modification =Mass of X-Mass of Lysine=228.16-128.09=100.05Da Succinylation: Addition of –COCH2CH2CO- to lysine residue (+100.0115Da)
Theoretical m/z 826.45432+ Observed m/z 826.45732+ Mass Error 3.63 ppm
High mass accuracy instrument needed Use high mass accuracy for MS/MS too MSn can be performed to resolve the structure of modification if necessary Identification of PTMs:
Know the Modification (Stability, Mass Shift, possible modification site)
High Samples Purity
High Concentration
High Mass Accuracy/Resolution Difficulties for PTMs Identification
Generating Peptides containing modified residue --Different Enzymes/Digestion Condition
Low population of modifications (usually less than 1%) --Enrichment --Targeted Method
Modification not stable --Experiment Condition (reducing reagents, enzymes) --Fresh sample
Modification labile in MS analysis --Neutral Loss Method --Different Fragmentation Method Enzymes for Proteome Research
Trypsin K-X and R-X
Endoprotease Lys-C K-X except when X = P
Endoprotease Arg-C R-X except when X = P
Endoprotease Asp-N X-D
Endoprotease Glu-C E-X except when X = P
Chymotrypsin L-X, F-X, Y-X and W-X
Cyanogen Bromide X-M Using the CORRECT Enzyme
1 MAALKLLSSG LRLGASARSS RGALHKGCVC YFSVSTRHHT KFYTDPVEAV 51 KDIPNGATLL VGGFGLCGIP ENLIGALLKT GVKDLTAVSN NAGVDNFGLG 101 LLLRSKQIKR MISSYVGENA EFERQFLSGE LEVELTPQGT LAERIRAGGA 151 GVPAFYTSTG YGTLVQEGGS PIKYNKDGSV AIASKPREVR EFNGQHFILE 201 EAITGDFALV KAWKADRAGN VIFRKSARNF NLPMCKAAGT TVVEVEEIVD 251 IGSFAPEDIH IPKIYVHRLI KGEKYEKRIE RLSLRKEGDG KGKSGKPGGD 301 VRERIIKRAA LEFEDGMYAN LGIGIPLLAS NFISPNMTVH LQSENGVLGL Phosphorylation 351 GPYPLKDEAD ADLINAGKET VTVLPGASFF SSDESFAMIR GGHVNLTMLG 401 AMQVSKYGDL ANWMIPGKMV KGMGGAMDLV SSSKTKVVVT MEHSAKGNAH 451 KIMEKCTLPL TGKQCVNRII TEKGVFDVDK KNGLTLIELW EGLTVDDIKK 501 STGCDFAVSP NLMPMQQIST
Tryspin: 309AALEFEDGMYANLGIGIPLLASNFISPNMTVHLQSENGVLGLGPYPLK356 Chymotrypsin: 329ASNF332, GluC: 315DGMYANLGIGIPLLASNFISPNMTVHLQSE344 Optimize Digestion Conditions 1 MADESSDAAG EPQPAPAPVRRRSSANYRAY ATEPHAKKKS KISASRKLQL 51 KTLMLQIAKQ EMEREAEER R GEKGRVLRTR CQPLELDGLG FEELQDLCRQ 101 LHARVDKVDE ERYDVEAKVT KNITEIADLT QKIYDLRGKF KRPTLRRVRI 151 SADAMMQALL GTRAKESLDL RAHLKQVKKE DIEKENREVG DWRKNIDALS 201 GMEGRKKKFE G Sequence coverage: 97.2%
Mr(calc Start Observed Mr(expt) ppm M Score Peptide ) AA2-20 932.9369 1863.8592 1863.9 -0.34 0 104 M.ADESSDAAGEPQPAPAPVR.R AA2-21 674.3265 2019.9576 2020 -1.67 1 95 M.ADESSDAAGEPQPAPAPVRR.R AA2-22 726.3605 2176.0598 2176.1 -1.06 2 66 M.ADESSDAAGEPQPAPAPVRRR.S AA23-37 555.9385 1664.7936 1664.8 1.78 1 55 R.SSANYRAYATEPHAK.K AA23-39 641.3364 1920.9874 1921 3.59 3 44 R.SSANYRAYATEPHAKKK.S AA29-37 494.2502 986.4858 986.48 3.69 0 26 R.AYATEPHAK.K AA29-39 622.3464 1242.6782 1242.7 4.97 2 44 R.AYATEPHAKKK.S AA29-41 486.9404 1457.7994 1457.8 0.25 3 35 R.AYATEPHAKKKSK.I Trypsin; 1:200; RT; 20 min AA29-46 658.37 1972.0882 1972.1 1.43 4 34 R.AYATEPHAKKKSKISASR.K AA47-59 382.7495 1526.9691 1527 7.14 2 35 R.KLQLKTLMLQIAK.Q Digestion AA52-64 530.952 1589.8341 1589.8 4.51 1 85 K.TLMLQIAKQEMER.E AA52-69 735.7083 2204.1029 2204.1 4.57 2 49 K.TLMLQIAKQEMEREAEER.R AA60-69 436.1964 1305.5675 1305.6 4.31 1 25 K.QEMEREAEER.R AA76-99 730.1263 2916.476 2916.5 5.98 2 31 R.VLRTRCQPLELDGLGFEELQDLCR.Q AA79-99 850.4134 2548.2183 2548.2 5.24 1 95 R.TRCQPLELDGLGFEELQDLCR.Q AA81-99 1146.5381 2291.0616 2291.1 2.36 0 115 R.CQPLELDGLGFEELQDLCR.Q AA81-104 725.1068 2896.3981 2896.4 0.75 1 69 R.CQPLELDGLGFEELQDLCRQLHAR.V AA105-118 565.6145 1693.8217 1693.8 3.43 2 41 R.VDKVDEERYDVEAK.V AA105-121 1012.028 2022.0415 2022 7.19 3 39 R.VDKVDEERYDVEAKVTK.N AA119-132 787.4489 1572.8833 1572.9 7 1 63 K.VTKNITEIADLTQK.I AA119-137 745.4208 2233.2405 2233.2 3.91 2 65 K.VTKNITEIADLTQKIYDLR.G AA119-139 605.5953 2418.3522 2418.3 1.68 3 54 K.VTKNITEIADLTQKIYDLRGK.F AA122-132 623.3372 1244.6599 1244.7 -1.06 0 79 K.NITEIADLTQK.I AA122-137 953.5199 1905.0252 1905 2.38 1 66 K.NITEIADLTQKIYDLR.G AA122-139 697.7211 2090.1414 2090.1 2.03 2 48 K.NITEIADLTQKIYDLRGK.FQ ALLGTR.A AA164-171 466.2661 930.5176 930.51 4.52 1 58 R.AKESLDLR.A AA138-146 368.2345 1101.6816 1101.7 4.08 2 26 R.GKFKRPTLR.R AA138-147AA176-187 420.2665 505.9445 1257.7776 1514.8118 1257.8 1514.8 -0.46 4.36 3 3 26 R.GKFKRPTLRR.V 59 K.QVKKEDIEKENR.E AA148-163AA188-205 866.964 678.3385 1731.9134 2031.9937 1731.9 2032 0.61 6.93 1 2 79 R.VRISADAMMQALLGTR.A 43 R.EVGDWRKNIDALSGMEGR.K AA150-163AA194-205 739.3765 645.8306 1476.7385 1289.6466 1476.7 1289.6 -2.94 5.29 0 107 1 R.ISADAMM 78 R.KNIDALSGMEGR.K AA195-205 581.7822 1161.5499 1161.5 4.39 0 95 K.NIDALSGMEGR.K AA195-206 645.8309 1289.6472 1289.6 5.77 1 54 K.NIDALSGMEGRK.K AA195-207 709.878 1417.7414 1417.7 4.74 2 46 K.NIDALSGMEGRKK.K AA195-211 627.329 1878.9651 1879 1.58 4 29 K.NIDALSGMEGRKKKFEG.- Difficulties for PTMs Identification
Generating Peptides containing modified residue --Different Enzymes/Digestion Condition
Low population of modifications --Enrichment --Targeted Method (For better MSMS) PTM Enrichment Techniques (Start with mg of samples)
Acetylation: Anti-acetyl lysine polyclonal antibody
Phosphopeptides: Immobilized Metal Affinity Chromatography (Fe3+) Metal Oxide Affinity Chromatography (TiO2, ZrO2) Reversible Covalent Binding (Techniques for phosphopeptide enrichment prior to analysis by mass spectrometry. Mass Spectrom Rev. 2010 Jan- Feb;29(1):29-54.) Phospho-Threonine Antibody (P-Thr-Polyclonal)
Nitration: Anti‐Nitrotyrosine polyclonal antibody
Ubiquitination: K-ε-GG–specific antibody (enrichment has to be done prior to other treatment on lysine)
Glycopeptides: Lectin affinity enrichment Covalent Interactions Chromatographic separation (Glycopeptide enrichment and separation for protein glycosylation analysis., J Sep Sci. 2012 Sep;35(18):2341-72).
Methylation: Anti-methyl lysine/araginine antibody Enrichment for Phosphorylation
extract cells Cell Culture
cell lysis
TiO2 beads peptides
In-solution digestion protein
2D LC-MS/MS Phosphopeptides/proteins Identification
Slide courtesy of Nilini S. Ranbaduge NL: 4.05E7 Base Peak m/z= 516.7690- 516.7896 MS 17625_1_rer un
NL: 5.98E5 Example---Targeted MS Scan Base(for Peak Known Modifications) m/z= 479.2634- RT: 19.61 - 39.93 27.80 100 479.2826 90 27.74 MS 2+ 80 Full Scan 27.69 27.93 27.96 SIC of 516.7793 (SLVLC(CAM)TPSR) 70 17625_1_rer 60 28.01 un 50 27.61 28.06 e c
n 100:1 a d
n 27.55 u b A
e 40 tiv la
e 27.50 R 30 28.13 27.44 20 27.38 28.21 10 27.30 28.36 20.9021.45 22.89 23.82 25.54 30.13 25.95 27.19 28.56 29.07 31.48 33.6734.17 36.04 38.15 38.65 100 0 27.80 90 80 27.74 70 27.66 27.82 27.96 60 2+ 50 27.61 SIC of 479.2730 (SLVLfGlyTPSR ) 27.55 28.01 40 28.06 30 27.50 35.91 39.60 27.44 28.13 36.09 20 27.36 39.55 10 27.25 28.21 31.90 32.09 39.49 20.02 21.30 21.75 23.65 24.59 27.0635.59 33.86 28.64 25.85 31.76 33.41 36.25 38.68 0 20 22 24 26 28 30 32 34 36 38 Time (min)
2+ 2+ MS/MS of 516.7793 (SLVLC(CAM)TPSR) MS/MS of 479.2730 (SLVLfGlyTPSR )
17625_1_rerun # 2715-9589 RT: 18.73-51.92 AV: 99 NL: 5.71E3 T: Average spectrum MS2 516.79 (2715-9589) 733.45 100 95 90 85 620.39 80 507.76 NOT Observed!!!! 75 70 65 60
55 832.54 50 45 Relative Abundance 40 Cys(CAM)462.29 Leu Val 35 30 416.76 25 201.05 487.72
20 359.31 544.28 575.37 173.20 Thr 15 Val Leu Cys(CAM)674.26 Thr 10 255.07 423.26 347.30 367.27300.06 704.44 407.78 413.05 602.00 5 288.20 319.31 656.23 886.49 213.06 804.48 773.42 0 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 m/z NL: 2.47E8 Base Peak m/z= 85.0000- 2000.0000 MS 17625_1_rer un
Example---Targeted MS Scan (for Known Modifications) NL: 8.17E8 RT: 19.94 - 50.13 100 Base Peak 35.88 90 Full Scan MS 17625_1_Ta 42.95 49.23 80 41.90 43.66 49.07 70 rget_2 44.67 40.05 35.99 60 36.06 40.1540.20 36.27 45.51
e 50 c n a
d 28.36 n u b A e 40 tiv la e R 30 28.13 28.54 28.21 34.20 36.46 38.26 46.39 20 28.67 32.04 33.80 46.58 28.08 31.93 32.09 36.57 47.16 10 21.16 23.85 24.09 27.55 28.90 47.47 0 29.16 41.95 100 2+ 2+ 90 Targeted @ 479.2730 (SLVLfGlyTPSR )/516.7793 (SLVLC(CAM)TPSR) 80 70 60 50 40 44.90 30 44.81 20 35.82 40.13 35.9541.86 43.30 49.38 10 36.3928.69 40.32 46.00 45.16 48.12 24.70 25.82 28.26 32.14 32.44 36.59 38.21 39.95 34.25 0 20.00 22.08 28.92 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 Time (min)
42.09 MS/MS of 479.27302+ (SLVLfGlyTPSR ) 2+ MS/MS of 516.7793 (SLVLC(CAM)TPSR) 17625_1_Target_2 #667-2166 RT: 19.74-39.25 AV: 264 NL: 3.92E4 17625_1_Target_2 # 799-1295 RT: 22.03-28.48 AV: 40 NL: 1.03E3 T: ITMS + c ESI d Full ms2 [email protected] [130.00-1045.00] T: ITMS + c ESI d Full ms2 [email protected] [120.00-970.00] 733.44 581.28 100 100
95 95 359.31 90 620.39 90 85 85
80 80
75 75
70 70
65 65 832.51 60 60 55 55 507.74 50 50 45 45 40 Relative Abundance 416.77
40 Relative Abundance 470.22 35 35 Cys(CAM)Leu Val 30 460.28 30 460.36 201.05 25 757.45 25 Thr fGly 545.38 Leu 658.34 Val 20 359.31 413.05 20 173.20 300.06 Thr 573.18 15 402.85 15 413.10 299.98 328.66 10 201.02 631.30 674.27 446.40 255.09 173.19 Val Leu 594.32 10 Val Leu Cys(CAM) Thr 388.76 517.26 5 255.10 671.51 fGly Thr 730.39 555.09 229.03 781.32 5 656.23 715.43Pro 702.58 213.08 407.74 498.76 545.24 602.35 814.46Ser 858.31 0 262.26 310.81 430.13 689.49 771.31 886.44 150 200 250 300 350 400 450 500 550 600 650 700 750 800 0 m/z 150 200 250 300 350 400 450 500 550 600 650 700 750 800 850 900 m/z Difficulties for PTMs Identification
Generating Peptides containing modified residue --Different Enzymes/Digestion Condition
Low population of modifications --Enrichment --Targeted Method
Modification not stable --Experiment Condition (reducing reagents, enzymes) --Fresh sample Adjust Experiment Condition to Preserve Modifications
y12Ay11A y10A y9A y8A y6A y5A y4A y3A y2A 185 199 Chain A N A C187 G S G Y D F D V F V V R S S y5B y3B 138L V E G C L V G G R146 Chain B b3B 142 A 274.25 373.31 520.42 619.48 734.51 996.57 1159.62 1216.57 1303.641360.66 y2A y3A y4A y5A y6A y8A y9A y10A y11A y12A
VFVD FDYGSG
B 408.17 465.20 552.31 609.30 772.30 1034.35 1149.39 1248.41 1395.53 1494.50 1593.59
G S G Y FD D V F V V
289.23342.33 501.10 y3B b3B y5B
250 400 600 800 1000 1200 1400 1600 m/z Using Different Enzyme to Protect Modifications
y12 y10 y9 y8 y7 y6 y5 y4 y3 202 194A G R P G M G V O G P E T S L208 b3 b5 b6 b7 b8 b9 b10 b11 b12 b13 b14 Theoretical m/z =783.40892+, Observed m/z =783.40802+, Mass Error =1.15ppm O=Pyrrolysine
285.21 439.23 570.30 627.43 726.35 963.39 1020.49 1117.55 1246.54 1347.71 b3 b5 b6 b7 b8 b9 b10 b11 b12 b13 154.13 (97+57) 130.97 57.13 98.92 237.04 57.10 97.06 128.99 101.17 P‐G M G V O G P E T 674.472+ b13 718.062+ b14
623.822+ b12
320.13 449.05 546.24 603.21 840.38 939.24 996.30 1127.37 1281.45 y3 y4 y5 y6 y7 y8 y9 y10 y12
128.92 97.19 56.97 237.17 98.86 57.06 131.07 154.08 (57+97) E P G O V G M G‐P
200 300 400 500 600 700 800m/z 900 1000 1100 1200 1300 1400 LTQ DATA—Fresh Sample
1 MSDKSELKAE LERKKQRLAQ IREEKKRKEE ERKKKETDQK KEAVAPVQEE 51 SDLEKKRREA EALLQSMGLT PESPIVFSEY WVPPPMSPSS KSVSTPSEAG 101 SQDSGDGAVG SRTLHWDTDP SVLQLHSDSD LGRGPIKLGM AKITQVDFPP 151 REIVTYTKET QTPVMAQPKE DEEEDDDVVA PKPPIEPEEE KTLKKDEEND 201 SKAPPHELTE EEKQQILHSE EFLSFFDHST RIVERALSEQ INIFFDYSGR 251 DLEDKEGEIQ AGAKLSLNRQ FFDERWSKHR VVSCLDWSSQ YPELLVASYN 301 NNEDAPHEPD GVALVWNMKY KKTTPEYVFH CQSAVMSATF AKFHPNLVVG 351 GTYSGQIVLW DNRSNKRTPV QRTPLSAAAH THPVYCVNVV GTQNAHNLIS 401 ISTDGKICSW SLDMLSHPQD SMELVHKQSK AVAVTSMSFP VGDVNNFVVG 451 SEEGSVYTAC RHGSKAGISE MFEGHQGPIT GIHCHAAVGA VDFSHLFVTS 501 SFDWTVKLWT TKNNKPLYSF EDNADYVYDV MWSPTHPALF ACVDGMGRLD 551 LWNLNNDTEV PTASISVEGN PALNRVRWTH SGREIAVGDS EGQIVIYDVG 601 EQIAVPRNDE WARFGRTLAE INANRADAEE EAATRIPA Sequence Coverage 39%; T95 was phosphorylated Orbitrap XL Data‐‐‐Two Days old, Stored at 4C
1 MSDKSELKAE LERKKQRLAQ IREEKKRKEE ERKKKETDQK KEAVAPVQEE 51 SDLEKKRREA EALLQSMGLT PESPIVFSEY WVPPPMSPSS KSVSTPSEAG 101 SQDSGDGAVG SRTLHWDTDP SVLQLHSDSD LGRGPIKLGM AKITQVDFPP 151 REIVTYTKET QTPVMAQPKE DEEEDDDVVA PKPPIEPEEE KTLKKDEEND 201 SKAPPHELTE EEKQQILHSE EFLSFFDHST RIVERALSEQ INIFFDYSGR 251 DLEDKEGEIQ AGAKLSLNRQ FFDERWSKHR VVSCLDWSSQ YPELLVASYN 301 NNEDAPHEPD GVALVWNMKY KKTTPEYVFH CQSAVMSATF AKFHPNLVVG 351 GTYSGQIVLW DNRSNKRTPV QRTPLSAAAH THPVYCVNVV GTQNAHNLIS 401 ISTDGKICSW SLDMLSHPQD SMELVHKQSK AVAVTSMSFP VGDVNNFVVG 451 SEEGSVYTAC RHGSKAGISE MFEGHQGPIT GIHCHAAVGA VDFSHLFVTS 501 SFDWTVKLWT TKNNKPLYSF EDNADYVYDV MWSPTHPALF ACVDGMGRLD 551 LWNLNNDTEV PTASISVEGN PALNRVRWTH SGREIAVGDS EGQIVIYDVG 601 EQIAVPRNDE WARFGRTLAE INANRADAEE EAATRIPA
Sequence Coverage 53%; No Phosphorylation Detected
Detected 30 times!!! Difficulties for PTMs Identification
Generating Peptides containing modified residue --Different Enzymes/Digestion Condition
Low population of modifications --Enrichment --Targeted Method
Modification not stable --Experiment Condition (reducing reagents, enzymes) --Fresh sample
Modification labile in MS analysis --Neutral Loss Method --Different Fragmentation Method Neutral Loss Scan ScanDissociate Scan
http://www.biotechniques.com/BiotechniquesJournal/2006/June/Analysis-of-posttranslational-modifications-of- proteins-by-tandem-mass-spectrometry YKVPQLEIVPNsAEER Alpha Casein
NL of Phosphorylation: 98 (+1), 49 (+2)…
1070.62 1561.73 b9 y13 1777.84 391.23 504.24 b15 b3 y4
1463.75 y13* 784.36 y7* • Help Identify the Sequence but don’t Give Information about the Location of the
1070.62 Modification if Multiple Sites 883.43 b9 y8* Coexisting. • Sensitive not Ideal 391.23 996.51 b3 y9* 1125.55 1238.64 y10* y11* 292.16 1562.82 b2 y14*
Hydrothermal synthesis of α-Fe2O3@SnO2 core–shell nanotubes for highly selective enrichment of phosphopeptides for mass spectrometry analysis Nanoscale, 2010,2, 1892-1900 ECD/ETD: Keep the Phosphorylation Intact!!!
H3PO4 Group may Leave upon CID Dissociation; Actual Amino Acid Location of H3PO4 Group May not be Clear if >2 Sites in the Sequence
xn-1 yn-1 zn-1 xn-2 yn-2 zn-2 x1 y1 z1
R O R O R O R O 1 2 n-1 n H2N CCN CC … N CC N CCOH H H H H H H
a1 b1 c1 a2 b2 c2 an-1 bn-1 cn-1
CID: a, b and y ion, loss of H2O, NH3, side-chain ECD/ETD: c, y and z ions
Electron Capture Dissociation (ECD, ICR)/ Electron Transfer Dissociation (ETD, Traps) will Generate Peptide Fragments without Losing the Sidechain Phosphorylation
Sweet, Anal. Chem. 2006, 78, 7563-7569 Leann M. Mikesh et al. Biochimica et Biophysica Acta 1764 (2006) 1811–1822 http://openi.nlm.nih.gov/detailedresult.php?img=2849712_zjw0031035580003&req=4 Choose a Good Enzyme to Generate The GG Tag —the Identification of Ubiquitination by MS/MS
MQIFVKTLTG KTITLEVEPS DTIENVKAKI QDKEGIPPDQ QRLIFAGKQL EDGRTLSDYN IQKESTLHLV LRLRGG Trypsin!!!
+3Ub +2Ub +1Ub Unmodified Choose a Good Enzyme to Generate Signature Modification Group —the Identification of Ubiquitination by MS/MS
A proteomics approach to understanding protein ubiquitination Nature Biotechnology 21, 921 - 926 (2003) Unknown Truncation Sites Identification
Separate Suspected Truncation Product(s) from Intact Protein
Separation on 1D SDS PAGE Intact MS Measurement Cut the band (truncated protein) Digestion Top down Sequencing LC/MSMS
Peptide Sequencing MASCOT Protein sequence
Suggested truncation sites
Search against new sequences
Validation Example---Unknown Truncation Sites Identification
MASCOT RESULT: 1 MVLWILWRPF GFSGRFLKLE SHSITESKSL IPVAWTSLTQ MLLEAPGIFL 51 LGQRKRFSTM PETETHERET ELFSPPSDVR GMTKLDRTAF KKTVNIPVLK 101 VRKEIVSKLM RSLKRAALQR PGIRRVIEDP EDKESRLIML DPYKIFTHDS 151 FEKAELSVLE QLNVSPQISK YNLELTYEHF KSEEILRAVL PEGQDVTSGF 201 SRIGHIAHLN LRDHQLPFKH LIGQVMIDKN PGITSAVNKI NNIDNMYRNF 251 QMEVLSGEQN MMTKVRENNY TYEFDFSKVY WNPRLSTEHS RITELLKPGD 301 VLFDVFAGVG PFAIPVAKKN CTVFANDLNP ESHKWLLYNC KLNKVDQKVK 351 VFNLDGKDFL QGPVKEELMQ LLGLSKERKP SVHVVMNLPA KAIEFLSAFK 401 WLLDGQPCSS EFLPIVHCYS FSKDANPAED VRQRAGAVLG ISLEACSSVH 451 LVRNVAPNKE MLCITFQIPA SVLYKNQTRN PENHEDPPLK RQRTAEAFSD 501 EKTQIVSNT ~260AA =~27‐28kDa Conservation of structure and mechanism by Trm5 enzymes RNA 2013 Sep; 19(9): 1192–1199 Example---Unknown Truncation Sites Identification
Constructed new sequences (a total of 24) and searched the data against these sequences: >16502 Sequence 1 (241-509) NNIDNMYRNFQMEVLSGEQNMMTKVRENNYTYEFDFSKVYWNPRLSTEHSRITELLKPGDVLFDVFAGVGPFAIP VAKKNCTVFANDLNPESHKWLLYNCKLNKVDQKVKVFNLDGKDFLQGPVKEELMQLLGLSKERKPSVHVVMNLPA KAIEFLSAFKWLLDGQPCSSEFLPIVHCYSFSKDANPAEDVRQRAGAVLGISLEACSSVHLVRNVAPNKEMLCITFQI PASVLYKNQTRNPENHEDPPLKRQRTAEAFSDEKTQIVSNT < >16502 Sequence 2 (242-509) NIDNMYRNFQMEVLSGEQNMMTKVRENNYTYEFDFSKVYWNPRLSTEHSRITELLKPGDVLFDVFAGVGPFAIPVA KKNCTVFANDLNPESHKWLLYNCKLNKVDQKVKVFNLDGKDFLQGPVKEELMQLLGLSKERKPSVHVVMNLPAKAI EFLSAFKWLLDGQPCSSEFLPIVHCYSFSKDANPAEDVRQRAGAVLGISLEACSSVHLVRNVAPNKEMLCITFQIPAS VLYKNQTRNPENHEDPPLKRQRTAEAFSDEKTQIVSNT < …..
>16502 Sequence 23 (263-509) TKVRENNYTYEFDFSKVYWNPRLSTEHSRITELLKPGDVLFDVFAGVGPFAIPVAKKNCTVFANDLNPESHKWLLYN CKLNKVDQKVKVFNLDGKDFLQGPVKEELMQLLGLSKERKPSVHVVMNLPAKAIEFLSAFKWLLDGQPCSSEFLPI VHCYSFSKDANPAEDVRQRAGAVLGISLEACSSVHLVRNVAPNKEMLCITFQIPASVLYKNQTRNPENHEDPPLKRQ RTAEAFSDEKTQIVSNT < >16502 Sequence 24 (264-509) KVRENNYTYEFDFSKVYWNPRLSTEHSRITELLKPGDVLFDVFAGVGPFAIPVAKKNCTVFANDLNPESHKWLLYNC KLNKVDQKVKVFNLDGKDFLQGPVKEELMQLLGLSKERKPSVHVVMNLPAKAIEFLSAFKWLLDGQPCSSEFLPIV HCYSFSKDANPAEDVRQRAGAVLGISLEACSSVHLVRNVAPNKEMLCITFQIPASVLYKNQTRNPENHEDPPLKRQR TAEAFSDEKTQIVSNT <
MASSMATRIX Results: hit# score decoy% protein description hit1 3533 0.00% 16502 Sequence 12 (252-509) hit2 3522 0.00% 16502 Sequence 23 (263-509) hit3 3519 0.00% 16502 Sequence 14 (254-509) hit4 3512 0.00% 16502 Sequence 21 (261-509) b2 b3 b4 b5b6 b8 b9 b10 b11 b12 M E V L S G E Q N M M T K 252 y12 y11 y10 y9 y8 y5 y4 y3 y2 264 Observed m/z =749.34412+ 2+ Theoretical m/z =749.3409 1025.27 Mass Error=4.27ppm y9
1400 248.21 379.20 510.34 624.17 938.20 1138.37 1237.34 1366.52 y2 y3 y4 y5 y8 y10 y11 y12 1200 130.99 131.14 113.83 314.03 (57+129+128) 87.07 113.10 98.97 129.18 Met Met Asn Gly-Glu-Gln Ser Leu Val Leu
800
Intensity 260.92 360.13 473.13 560.08 617.17 874.34 988.20 1119.23 1250.21 1351.26 b2 b3 b4 b5 b6 b8 b9 b10 b11 b12
400 99.21 113.00 86.95 57.09 257.17(128+129) 113.86 131.03 130.98 101.05 Val Leu Ser Gly Glu-Gln Asn Met Met Thr
200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 m/z Conservation of structure and mechanism by Trm5 enzymes RNA. 2013 Sep; 19(9): 1192–1199 b2 b3 b5b7 b8 b9 b10 V L S G E Q N M M T K 254 y10 y9 y8 y7 y6 y5 y4 y3 y2 264 Observed m/z =619.30162+ Theoretical m/z =619.29942+ Mass Error=3.55ppm 1025.23 y9 610.232+ M‐H2O 5000
248.11 379.17 510.11 624.08 752.17 881.22 938.30 1138.25 y2 y3 y4 y5 y6 y7 y8 y10
4000 131.06 130.94 113.97 128.09 129.05 57.08 86.93 113.02 Met Met Asn Gln Glu Gly Ser Leu
3000 213.01 300.13 486.24 728.26 859.15 990.43 1091.04 Intensity b2 b3 b5 b7 b8 b9 b10
2000
87.12 186.11 (57+129) 242.02 (128+114) 130.89 131.28 100.61 1000 Ser Gly-Glu Gln-Asn Met Met Thr
200 300 400 500 600 700 800 900 1000 1100 1200 m/z
Conservation of structure and mechanism by Trm5 enzymes RNA. 2013 Sep; 19(9): 1192–1199 b2 b3 b4 b6 b7 b9 b10 b11 b13 b14 b15 b17 261M M T K V R E N N Y T Y E F D F S K278 y17 y16 y15 y14 y13 y12 y11 y10 y9 y8 y7 y6 y5 y4 y3 y2
906.412+ 1020.812+ Observed m/z =768.35733+ y14 y16 856.742+ 906.412+ 1086.462+ Theoretical m/z =768.35393+ 1350 y13 y15 y17 Mass Error=4.42ppm 1250 99.34 127.78 101.02 131.30 Val Lys Thr Met
234.08 381.14 496.40 643.26 772.16 935.14 1036.28 1199.31 1313.47 1427.34 1556.88 1000 y2 y3 y4 y5 y6 y7 y8 y9 y10 y11 y12 961.622+ b15
147.06 115.26 146.86 128.90 162.98 101.14 163.03 114.16 113.87 129.54 750 Phe Asp Phe Glu Tyr Thr Tyr Asn Asn Glu
904.472+ 2+ 2+ 2+ 2+ Intensity b14 438.92 552.89 634.302+ 684.622+ 830.87 1078.65 b7 b9 b10 b11 b13 b17 500 263.06 364.12 492.11 747.66 b2 b3 b4 b6
250 101.06 127.99 255.55 (99+156) Thr Lys Val-Arg
200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 1600 m/z Conservation of structure and mechanism by Trm5 enzymes RNA. 2013 Sep; 19(9): 1192–1199 b4 b3 b6b7 b8 b9 b10 b11 b13 b14 b15 263T K V R E N N Y T Y E F D F S K278 y14 y13 y12 y10 y9 y8 y7 y6 y5 y4 y3
935.40 Observed m/z =680.99663+ y7 Theoretical m/z =680.99363+ 2+ 900 906.36 1036.27 y14 y8 Mass Error=4.40ppm 778.642+ 856.942+ 800 y12 y13 156.60 98.84 Arg Val
830.892+ b13 381.32 496.36 643.34 772.20 1199.22 1313.39 600 y3 y4 y5 y6 y9 y10
115.04 146.98 128.86 163.20 100.87 162.95 114.17 Asp Phe Glu Tyr Thr Tyr Asn 904.692+ 307.442+ 699.792+ 2+ Intensity b14 948.15 400 b5 b11 b15
485.56 728.46 842.88 1005.38 1106.55 1269.44 b4 b6 b7 b8 b9 b10
200 242.90 (129+114) 114.42 162.50 101.17 162.89 Glu-Asn Asn Tyr Thr Tyr
300 400 500 600 700 800 900 1000 1100 1200 1300 1400 m/z
Conservation of structure and mechanism by Trm5 enzymes RNA. 2013 Sep; 19(9): 1192–1199 Glycosylation—A Different Story
“The cell surface landscape is richly decorated with oligosaccharides anchored to proteins or lipids within the plasma membrane. Cell surface oligosaccharides mediate the interactions of cells with each other and with extracellular matrix components.” Science 291:2337
Why Glycosylation Investigation is SO HARD???
• Only Protein Modification Requiring Detailed Structural Characterization (sugar chain/branched) • Poor Ionization efficiency (large mass increase, sometimes has negative charge, hydrophobicity..) • Heterogeneity • Large Size
• Enrichment (Lectin Affinity Binding) • Derivatization (Permethylation) • LC Separation (PGC Column for Glycans, HILIC Column for Glycopeptides) • Both MALDI and ESI will Work • High Mass Accuracy Desired • Negative Mode Works Better • ETD Improve the MSMS Identification
Determination of site-specific glycan heterogeneity on glycoproteins Nature Protocols 7, 1285–1298 (2012)