Using Mascot to characterize modifications

ASMS 2003

Post-translational Modifications (PTMs)

• Help understand complex biological systems
• Phosphorylation is one of the most important protein PTMs
• Mascot allows up to 9 variable modifications to be specified at a time
• Use variable modifications sparingly, then follow up with an error tolerant search
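To see why variable modifications are best used sparingly, consider how the number of candidate forms of a single peptide grows with each modification added to the search. The sketch below is illustrative only: the peptide sequence and the modification-to-residue mapping are hypothetical, and it does not describe how Mascot enumerates candidates internally.

```python
from math import prod

def count_modified_forms(sequence, variable_mods):
    """Count distinct modification arrangements for one peptide.

    variable_mods maps a modification name to the residues it can occupy.
    Each residue is either unmodified or carries one applicable modification.
    """
    choices_per_residue = []
    for residue in sequence:
        applicable = [m for m, targets in variable_mods.items() if residue in targets]
        choices_per_residue.append(1 + len(applicable))
    return prod(choices_per_residue)

# Hypothetical peptide and modification sets, for illustration only.
peptide = "NSTKMQSTYK"
one_mod = {"Phospho": "ST"}
three_mods = {"Phospho": "STY", "Deamidated": "NQ", "Oxidation": "M"}

print(count_modified_forms(peptide, one_mod))     # forms with one variable modification
print(count_modified_forms(peptide, three_mods))  # forms with three variable modifications
```

Every extra candidate form is another chance of a random match, which is the practical reason for keeping the list of variable modifications short.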


This slide shows that two proteins have scores above the significance threshold. One of them is beta-Casein variant CnH.


The other protein is beta-casein precursor.


When we look at the detail, we see that Serine or Threonine is phosphorylated.


With a peptide mass fingerprint, it is not possible to map the modification sites. One needs tandem MS (MS/MS) data to do that.


This slide shows at least 5 proteins with scores above the threshold.


Four peptides identified in this slide have scores higher than 34, the significance threshold. Deamidation seems to be a common modification for all of them.
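Mascot reports a score as -10·log10(P), where P is the probability that the observed match is a random event, so a threshold of 34 can be translated back into a probability. The sketch below shows that conversion. The `identity_threshold` helper assumes a threshold of the form -10·log10(alpha / N), with N the number of candidate peptides tested; that form and the default `alpha` of 0.05 are assumptions made for illustration, not values taken from this search.

```python
import math

def score_to_probability(score):
    """Probability that a match with this score is a random event."""
    return 10 ** (-score / 10)

def identity_threshold(n_candidates, alpha=0.05):
    """Score required for significance alpha with n_candidates trial peptides (assumed form)."""
    return -10 * math.log10(alpha / n_candidates)

print(f"P at score 34: {score_to_probability(34):.1e}")
print(f"Threshold for 1500 candidates: {identity_threshold(1500):.1f}")
```

Because the scale is logarithmic, every 10 points of score corresponds to a tenfold drop in the probability of a chance match.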


Here we see four other peptides that have scores higher than the significance threshold.


Let’s focus on Query 34, where the ion score was 101. We see a rich MS/MS spectrum.


Most of the y ions have been identified.


From the error graph, one can see that the calibration is quite good. I would believe that this peptide is deamidated.
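Calibration matters for this call because deamidation changes the mass by only about 0.984 Da, which is close to the 1.003 Da spacing between isotope peaks. The sketch below estimates the mass accuracy needed to tell the two apart. The mass constants are standard monoisotopic values; the peptide masses are hypothetical and are not taken from the search shown here.

```python
DEAMIDATION_SHIFT = 0.98402   # Da, monoisotopic change for Asn -> Asp or Gln -> Glu
C13_SPACING = 1.00335         # Da, mass difference between 13C and 12C

def ppm(delta_mass, reference_mass):
    """Express a mass difference in parts per million of a reference mass."""
    return delta_mass / reference_mass * 1e6

def ambiguity_window_ppm(peptide_mass):
    """Accuracy needed to separate deamidation from picking the +1 isotope peak."""
    return ppm(C13_SPACING - DEAMIDATION_SHIFT, peptide_mass)

# Hypothetical peptide masses in Da, for illustration.
for mass in (1000.0, 2000.0, 3000.0):
    print(f"{mass:.0f} Da: {ambiguity_window_ppm(mass):.1f} ppm")
```

A consistently small fragment error across the series, as in the error graph, is what makes the deamidated assignment more convincing than an isotope mis-pick.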


This example shows a phosphorylated peptide. There are two possible phosphorylation sites, but the score for phospho-serine (102) is far greater than that for phospho-threonine. In such cases, there is no doubt about the location of the modification.


The spectrum shows a strong run of y ions that have undergone neutral loss of phosphate.
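The neutral loss can be sketched numerically: a y ion that contains the phosphorylated residue appears about 98 Da (H3PO4) below its expected position, while y ions that do not contain the site are unaffected. The example below uses a hypothetical phosphopeptide, not the one on the slide, with standard monoisotopic masses and singly charged ions.

```python
# Monoisotopic residue masses (Da) for the residues used in this example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259,
}
WATER = 18.01056
PROTON = 1.00728
HPO3 = 79.96633     # mass added by phosphorylation
H3PO4 = 97.97690    # mass removed by the neutral loss

def y_ions(sequence, phospho_index):
    """Return (label, m/z, m/z after neutral loss or None) for singly charged y ions.

    phospho_index is the 0-based position of the phosphorylated residue.
    """
    ions = []
    mass = WATER + PROTON
    for i in range(len(sequence) - 1, -1, -1):
        mass += RESIDUE_MASS[sequence[i]]
        if i == phospho_index:
            mass += HPO3
        # Only fragments that include the phosphorylated residue can lose H3PO4.
        loss = mass - H3PO4 if i <= phospho_index else None
        ions.append((f"y{len(sequence) - i}", mass, loss))
    return ions

# Hypothetical phosphopeptide with the phosphate on Ser (index 3).
for label, mz, loss in y_ions("LTASPEK", 3):
    print(label, f"{mz:.3f}", "-" if loss is None else f"{loss:.3f}")
```

The first y ion that shows the shifted partner marks the modified residue, which is how a run of neutral-loss y ions supports the site assignment.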

Background

• Eukaryotic chromosomes: DNA is associated with histone proteins to form nucleosomes, the building blocks of chromatin
• PTMs to histones regulate access to DNA in chromatin
• These modifications can change the net charge and structure of histones
• Histone H3 is modified on the N-terminus (rich in Arg and Lys)
• Protease cleavage results in hydrophilic peptides (poorly retained on an RP-HPLC column)


In eukaryotic chromosomes, large amounts of DNA are compacted by association with histone proteins to form nucleosomes, the building blocks of chromatin. Access to DNA in chromatin is regulated by post-translational modifications (PTMs) of histones, such as methylation and phosphorylation. Such modifications can change the histone's net charge and structure, thus playing a pivotal role in the control of chromatin structure and function. The majority of histone H3 modifications occur on the 1-50 residue N-terminal tail, which is rich in Arginine and Lysine residues. Upon protease cleavage, this tail produces peptides that are very hydrophilic and poorly retained on an RP-HPLC column.

Background continued…

• CAD mass spectra of highly multiply charged peptides are difficult to interpret.
• Treating peptides with propionic anhydride converts N-termini and Lysines to propionyl amides. This decreases the net charge of the peptides and increases their hydrophobicity.
• The peptides can then be retained on RP-HPLC columns, and their CAD spectra are simpler.


The collisionally-activated dissociation (CAD) mass spectra of highly multiply charged peptides are very difficult to interpret. Their research efforts focus on developing methods to better analyze histones by HPLC and mass spectrometry. Treatment of peptides with propionic anhydride converts the free amine groups on N-termini, and on Lysine residues that are unmodified or carry a single methyl group, to propionyl amides. This decreases the net charge of the peptides and increases their hydrophobicity, which facilitates analysis by increasing retention times on an HPLC column and simplifying the CAD mass spectra.

Methods

• Histone protein + Glu-C…isolate 1-50 residue N-terminus by off-line HPLC.
• Second digestion with Chymotrypsin of the 1-50 fragment generates appropriate length peptides for MS/MS.
• Derivatization with propionic anhydride.
• On-line RP-HPLC with direct elution into an electrospray ionization quadrupole ion trap mass spectrometer.
• MS/MS followed by peptide and PTM identification using Mascot.


Their methods begin with an enzymatic digestion of the histone protein with Glu-C and isolation of the 1-50 residue amino terminus by off-line HPLC. A second digestion with Chymotrypsin is performed on the 1-50 fragment to produce peptides of suitable lengths for MS/MS experiments. Peptides are then derivatized by the addition of the propionic anhydride reagent. The mixture of peptides is separated by on-line RP-HPLC before direct elution into an electrospray ionization quadrupole ion trap mass spectrometer. Tandem mass spectrometry experiments are then performed to fragment the ions, and post-translational modification sites are determined by searching the data with Mascot.

Results

• Glu-C digestion gave the 1-50 piece
• Chymotrypsin digestion produced 1-5, 1-19, 1-20, 6-19, 6-20, 20-39, 21-39, 23-39, 24-39, 21-41, 40-50, 42-50 and other random pieces
• Primary modifications…found on Lys & Ser
• Mods could also be on Arg & Thr
• Ser & Thr can be phosphorylated
• Lys can have mono, di or tri-methyl groups or an acetyl group
• Arg can have mono or di methyl groups
• Propionyl amide can be on the N-terminus or on unmodified Lys or Lys with mono-methyl group
• Mascot is one of the few software programs available for searching data with multiple modifications!

Digestion of the H3 protein with Glu-C produced the 1-50 piece, which was then isolated by off-line HPLC. A Chymotrypsin digestion of the 1-50 piece primarily produced the peptides 1-5, 1-19, 1-20, 6-19, 6-20, 20-39, 21-39, 23-39, 24-39, 21-41, 40-50, 42-50 and other random pieces. The primary modifications found were on Lys and Ser residues. However, modifications could also be present on Arg and Thr residues. Ser and Thr can be modified by the addition of a phosphate group. Lys can be modified by the addition of mono, di or tri-methyl groups or by the addition of an acetyl group. Arg residues can be modified by the addition of one or two methyl groups. In addition, propionic anhydride creates propionyl amide groups on the amino terminus of peptides and on Lys residues that are unmodified or modified by one methyl group. Lys residues containing di-methyl, tri-methyl or acetyl groups are not modified by the propionic anhydride reagent. Thus a vast number of modifications and combinations of modifications can be expected when analyzing histone peptides. Currently, Mascot is one of the few software programs able to search data that may contain multiple modifications. For the search, the following modifications were defined: propionyl amide (Pr) on the peptide N-terminus; propionyl amide (Pr), propionyl amide plus methyl (Pr-Me), di-methyl (di-Me) and tri-methyl (tri-Me) on Lys; phosphorylation on Ser; and methyl (Me) on Arg. Mascot was then used to search the database.
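The size of this search space can be made concrete with a small count. The sketch below takes the 20-39 peptide, which has four Lys residues, and enumerates the Lys states described above using nominal (integer) mass shifts. It is a simplification made for illustration: Ser phosphorylation and Arg methylation are ignored, and at nominal mass tri-methyl and acetyl are indistinguishable.

```python
from itertools import product

# Nominal mass shifts (Da) for the lysine states described in the text.
LYS_STATES = {
    "Pr": 56,       # unmodified Lys, propionylated by the reagent
    "Me+Pr": 70,    # mono-methyl Lys, also propionylated
    "diMe": 28,
    "triMe": 42,
    "Ac": 42,
}
N_TERM_PR = 56      # the peptide N-terminus is propionylated
N_LYSINES = 4       # number of Lys residues in the 20-39 peptide

def total_shift(states):
    """Total nominal mass added to the peptide for one assignment of Lys states."""
    return N_TERM_PR + sum(LYS_STATES[s] for s in states)

combos = list(product(LYS_STATES, repeat=N_LYSINES))

# Reference form: three propionylated Lys and one di-methyl Lys.
target = total_shift(("Pr", "Pr", "diMe", "Pr"))
isobaric = [c for c in combos if total_shift(c) == target]

print(len(combos), "Lys state assignments in total")
print(len(isobaric), "assignments share the nominal shift of", target, "Da")
```

Many different assignments produce the same precursor mass, so only the fragment ions can distinguish them.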

[Figure: three MS/MS spectra (relative abundance vs. m/z, 200 to 2000) of histone H3 20-39 peptides, annotated with b and y ions. The predicted fragment masses printed with each spectrum are listed below; the b series reads from the N-terminus and the y series is listed from the full-length value down to the C-terminal residue.]

Top spectrum: N-term-Pr, K23-Pr, K27-Pr, K36-diMe, K37-Pr
Sequence: Leu Ala Thr Lys Ala Ala Arg Lys Ser Ala Pro Ala Thr Gly Gly Val Lys (diMe) Lys Pro His
b series: 170 241 342 526 597 668 824 1008 1095 1166 1263 1334 1435 1492 1549 1648 1804 1988 2085 2240
y series: 2240 2071 2000 1899 1715 1644 1573 1417 1233 1146 1075 978 907 806 749 692 593 437 253 156

Middle spectrum: N-term-Pr, K23-Pr, K27-diMe, K36-Pr, K37-Pr
Sequence: Leu Ala Thr Lys Ala Ala Arg Lys (diMe) Ser Ala Pro Ala Thr Gly Gly Val Lys Lys Pro His
b series: 170 241 342 526 597 668 824 980 1067 1138 1235 1306 1407 1464 1521 1620 1804 1988 2085 2240
y series: 2240 2071 2000 1899 1715 1644 1573 1417 1261 1174 1103 1006 935 834 777 720 621 437 253 156

Bottom spectrum: N-term-Pr, K23-Ac, K27-Me, K36-diMe, K37-Pr
Sequence: Leu Ala Thr Lys (Ac) Ala Ala Arg Lys (Me) Ser Ala Pro Ala Thr Gly Gly Val Lys (diMe) Lys Pro His
b series: 170 241 342 512 583 654 810 1008 1095 1166 1263 1334 1435 1492 1549 1648 1804 1988 2085 2240
y series: 2240 2071 2000 1899 1729 1658 1587 1431 1233 1146 1075 978 907 806 749 692 593 437 253 156

The complexity of analyzing histone peptide samples is shown by comparison of three mass spectra of 20-39 residue peptides, all having the same parent m/z of 748.2 (charge state = +3) but different PTM sites. Modifications were identified using the Mascot software program. Pr = chemical modification with a propionyl group, Ac = endogenous acetylation, Me = endogenous methylation, diMe = endogenous dimethylation.


This slide shows three peptides with the same parent mass that differ in their PTM sites. The sites that can be modified are four Lysines, the amino terminus, a Ser residue and an Arg residue, so the number of possible combinations of PTM sites is large. The bottom spectrum demonstrates how four different PTMs can be identified by Mascot. Mascot was used to identify the combinations of modifications on histone peptides.
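The fragment masses printed with these spectra can be reproduced from nominal (integer) residue masses, which also shows why the three forms share a parent mass. The sketch below is a reconstruction under stated assumptions: it uses integer masses, singly charged fragments, and a top, middle, bottom order for the three modification patterns; it is not Mascot's own calculation.

```python
# Nominal (integer) residue and modification masses in Da.
RESIDUE = {"Leu": 113, "Ala": 71, "Thr": 101, "Lys": 128, "Arg": 156,
           "Ser": 87, "Pro": 97, "Gly": 57, "Val": 99, "His": 137}
MOD = {"Pr": 56, "Me": 14, "diMe": 28, "Ac": 42}

SEQUENCE = ("Leu Ala Thr Lys Ala Ala Arg Lys Ser Ala Pro Ala Thr "
            "Gly Gly Val Lys Lys Pro His").split()

def residue_masses(lys_mods, n_term_mod="Pr"):
    """lys_mods maps a 0-based position to the modification names on that Lys."""
    masses = [RESIDUE[r] for r in SEQUENCE]
    masses[0] += MOD[n_term_mod]
    for pos, mods in lys_mods.items():
        masses[pos] += sum(MOD[m] for m in mods)
    return masses

def b_ions(masses):
    """b1 to b(n-1): running sum from the N-terminus plus one proton."""
    out, total = [], 1
    for m in masses[:-1]:
        total += m
        out.append(total)
    return out

def y_ions(masses):
    """y1 to yn: running sum from the C-terminus plus water (18) and one proton (1)."""
    out, total = [], 19
    for m in reversed(masses):
        total += m
        out.append(total)
    return out

# Positions 3, 7, 16 and 17 correspond to K23, K27, K36 and K37.
top = residue_masses({3: ["Pr"], 7: ["Pr"], 16: ["diMe"], 17: ["Pr"]})
middle = residue_masses({3: ["Pr"], 7: ["diMe"], 16: ["Pr"], 17: ["Pr"]})
bottom = residue_masses({3: ["Ac"], 7: ["Me", "Pr"], 16: ["diMe"], 17: ["Pr"]})

for name, masses in (("top", top), ("middle", middle), ("bottom", bottom)):
    print(name, "b:", b_ions(masses))
    print(name, "y:", y_ions(masses))
```

The full-length y value is identical for all three forms, while the b and y values between the modified Lys residues differ, and those are the fragments that localize each modification.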

[Figure: MS/MS spectrum (relative abundance vs. m/z, 300 to 1900) of the histone H3 23-39 peptide. Scan header: SB07390203 #3219, RT 96.92, AV 1, NL 8.67E5, + c d Full ms2, scan range 265.00-2000.00.]

Sequence: Lys Ala Ala Arg Lys (Me) Ser Ala Pro Ala Thr Gly Gly Val Lys (Me) Lys Pro His
b series: 241 312 383 539 737 824 895 992 1063 1164 1221 1278 1377 1575 1759 1856 2011
y series: 2011 1771 1700 1629 1473 1275 1188 1117 1020 949 848 791 734 635 437 253 156
Labeled peaks (m/z): 320.17, 383.22, 520.16, 539.41, 551.94, 635.48, 737.53, 867.60, 895.56, 967.00, 1063.64, 1117.62, 1118.71, 1164.64, 1275.60, 1349.67, 1377.76, 1475.85, 1548.87, 1575.85, 1592.98, 1593.90, 1622.94, 1731.89, 1760.00, 1760.98, 1772.01


Mascot correctly identified the endogenous Lys methylations on K27 and K36, as well as the chemically introduced propionyl groups on K37 and on K23 (the amino terminus of this peptide).


Poster

Tuesday (#941): “Analysis of Human Histone H3 Post-Translational Modification Site Patterns from Cells Arrested During Mitosis by Tandem Mass Spectrometry”

Benjamin A. Garcia (1), Scott A. Busby (1), A. Celeste Dunsmoor (1), Jeffrey Shabanowitz (1), Cynthia M. Barber (2), C. David Allis (2), Donald F. Hunt (1, 3)

(1) Department of Chemistry and (3) Department of Pathology, University of Virginia, Charlottesville, VA
(2) Department of Biochemistry and Molecular Genetics, University of Virginia Health Science Center, Charlottesville, VA
