PROTEOMICS & PEPTIDE SYNTHESIS CORE

Phosphorylation Site Mapping: maximizing the outcome

Presented by Henriette A. Remmer, Ph.D.

Director, BRCF Proteomics & Peptide Synthesis Core
Research Assistant Scientist, Dept. of Biological Chemistry

Phosphoproteomics

• Phosphorylation on Ser, Thr and Tyr is the prevalent post-translational modification (PTM) and occurs on over 30% of the human proteome.
• The human genome encodes 518 protein kinases and approx. 150 protein phosphatases.
• Phosphorylation occurs mainly on Ser residues (86%), followed by Thr residues (12%) and Tyr residues (2%).
• Phosphorylated peptides usually have an abundance of less than 5%; selective enrichment is therefore recommended prior to analysis.
• Localize phosphorylation sites on individual proteins
• Localize phosphorylation sites on proteomes to analyze information flux through signaling pathways

• Phosphoproteomics workflow
• Localization of phosphorylation sites using Scaffold software and manual verification
• Parameters to vary in order to maximize the number of phosphorylation sites

Phosphoproteomics Workflow

SAMPLES → In-gel digestion → Phosphopeptide enrichment → LC-MS/MS → DATA ANALYSIS

• Samples: Coomassie stained gel bands
• In-gel digestion: reduction and alkylation (carbamidomethyl Cys) followed by tryptic digest
• Phosphopeptide enrichment: TiO2 tip; elution with 5% NH4OH (in water) and 5% pyrrolidone (in water)
• LC-MS/MS: separation of peptides by LC; MS of peptide (parent ion); MS/MS of peptide to determine fragment ions/peptide sequence
• Data analysis: database search (protein ID and phosphorylation as variable modification on S, T, Y); software used: Mascot, Scaffold, Scaffold PTM; manual spectrum review to verify modification

The Process of Proteomic Analysis

Proteins → Proteolytic Cleavage → Peptides → Mass spectrometry analysis:

Mascot: ion score >45, protein score >78
X!Tandem: -log E value < 0.01
Scaffold (Peptide Prophet): probability >95%

Data analysis and interpretation begins with a database search

• The experimentally obtained MS/MS spectra of the peptides are matched against theoretical MS/MS spectra generated from the protein sequences in public databases (NCBI and/or UniProt).
• Complete or partial peptide sequences are assigned, and proteins are identified via their unique peptides.
• Search engines use threshold-based scoring to distinguish positive (correct) peptide identifications.
• Scaffold uses probability-based scoring.
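The two filtering styles above (a raw score threshold versus a probability cutoff) can be sketched in a few lines of Python. This is only an illustration, not how Mascot or Scaffold work internally: the `PSM` record, its field names and the example values are hypothetical, and only the two cutoffs (Mascot ion score >45, Scaffold peptide probability >95%) are taken from the slide.

```python
from dataclasses import dataclass

# Cutoffs quoted on the slide; everything else in this sketch is illustrative.
MASCOT_ION_SCORE_MIN = 45.0
SCAFFOLD_PROBABILITY_MIN = 0.95


@dataclass
class PSM:
    """Hypothetical peptide-spectrum match record (not a Mascot/Scaffold type)."""
    peptide: str
    mascot_ion_score: float
    peptide_probability: float  # expressed as a fraction between 0 and 1


def passes_score_threshold(psm: PSM) -> bool:
    """Threshold-based acceptance on the search engine score."""
    return psm.mascot_ion_score > MASCOT_ION_SCORE_MIN


def passes_probability(psm: PSM) -> bool:
    """Probability-based acceptance, comparable across experiments."""
    return psm.peptide_probability > SCAFFOLD_PROBABILITY_MIN


# Made-up example matches.
psms = [
    PSM("DGGK", 52.1, 0.99),
    PSM("AEFSPK", 31.0, 0.97),
    PSM("LLTYR", 47.5, 0.80),
]

accepted_by_score = [p.peptide for p in psms if passes_score_threshold(p)]
accepted_by_probability = [p.peptide for p in psms if passes_probability(p)]
print(accepted_by_score)
print(accepted_by_probability)
```

The two lists need not agree, which is one reason to report which criterion was applied.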

NCBI = National Center for Biotechnology Information; http://www.ncbi.nlm.nih.gov/
UniProt Knowledgebase: www.expasy.org

What is Scaffold?

• Scaffold is software that displays the data and results from protein identification and from qualitative and quantitative LC-MS/MS analysis.
• Scaffold has a “free viewer” version for viewing data, which users can download from www.proteomesoftware.com.
• Scaffold also allows the user to identify, compare and quantitate proteins, and to review and validate MS/MS spectra and sequence assignments, including those related to posttranslational modifications (localize PTMs).
• Scaffold’s scoring is based on the Peptide Prophet* algorithm.
• Probability-based scoring allows for easy data interpretation and comparability between experiments. It has been verified on many samples.

* A. Keller, A. Nesvizhskii, A. Kolker, R. Aebersold (2002), Anal. Chem. 74, 5383-5392

Data Output from a protein ID experiment in Scaffold

Scaffold displays the list of identified proteins (non-redundant protein list) and provides additional information to support the ID.

Data Output from a protein ID/GeLCMS experiment in Scaffold

The Protein View shows the sequence coverage map for each protein and for each sample.

The Protein View also shows the MS/MS data for each identified peptide, as well as the posttranslational modifications present.

The fragmentation table displays the fragment ions observed.

Peptide Fragmentation by MS/MS (CID)

(N-terminus) Asp(1)------Gly(2)------Gly(3)------Lys(4) (C-terminus)

y-ions (C-terminal fragments): y3 = Gly-Gly-Lys, y2 = Gly-Lys, y1 = Lys

b-ions (N-terminal fragments): b1 = Asp, b2 = Asp-Gly, b3 = Asp-Gly-Gly

Peptides fragment at the peptide bond and yield two ion series:
• The b-ions contain the N-terminus of the peptide; they are numbered from left to right.
• The y-ions contain the C-terminus of the peptide; they are numbered from right to left.
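As a worked illustration of the two ion series, the sketch below computes singly charged b- and y-ion m/z values for the example peptide Asp-Gly-Gly-Lys (DGGK). It assumes standard monoisotopic residue masses and a charge of +1; the function names are illustrative and are not part of Mascot or Scaffold.

```python
# Monoisotopic residue masses (Da) for the residues used here, plus S, T, Y.
RESIDUE_MASS = {
    "G": 57.021464, "S": 87.032028, "T": 101.047679,
    "D": 115.026943, "K": 128.094963, "Y": 163.063329,
}
WATER = 18.010565   # H2O, carried by every y-ion (C-terminal fragment)
PROTON = 1.007276   # charge carrier for a singly charged ion


def b_ions(peptide: str) -> list[float]:
    """m/z of b1..b(n-1): N-terminal fragments, numbered left to right."""
    ions, total = [], 0.0
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        ions.append(total + PROTON)
    return ions


def y_ions(peptide: str) -> list[float]:
    """m/z of y1..y(n-1): C-terminal fragments, numbered right to left."""
    ions, total = [], 0.0
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        ions.append(total + WATER + PROTON)
    return ions


for name, series in (("b", b_ions("DGGK")), ("y", y_ions("DGGK"))):
    for number, mz in enumerate(series, start=1):
        print(f"{name}{number}: {mz:.4f}")
```

Matching these theoretical values against the observed peaks is, in essence, what the fragmentation table summarizes.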

In Scaffold, b-ions are displayed in red and y-ions in blue.

Localization of PTMs

– The mass of the PTM is included in the Mascot database search. The Mascot files are uploaded into Scaffold, and the sites of modification are reported in Scaffold.

– Manual spectrum review is necessary to verify modified peptides.

– Scaffold PTM is a software package that assists in verifying the localization of PTMs. It is especially recommended for phosphorylation site mapping.

Localization of phosphorylation sites in Scaffold PTM

Localizing phosphorylation sites in peptides: the A-Score

• The Ambiguity Score (A-Score) measures the probability of correct phosphorylation site localization within a peptide, based on the presence and intensity of site-determining peaks in the MS/MS spectrum.

• The A-Score is calculated as the difference between the probabilistic scores of the top two possible sites.

• The algorithm uses a peak depth of up to 10 peaks per 100 m/z.

• A-Scores of >13 are considered significant.
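The arithmetic behind such a score can be sketched as follows. This is a simplified illustration, not the Scaffold PTM implementation: it assumes each candidate site is scored as -10·log10 of a cumulative binomial probability, that the chance of a random match equals the peak depth divided by 100, and that only site-determining ions are counted. The function names and example numbers are hypothetical. On this scale, a score of 13 corresponds to a probability of about 0.05.

```python
from math import comb, log10


def binomial_score(n_ions: int, n_matched: int, peak_depth: int) -> float:
    """-10*log10 of the chance of matching at least n_matched of n_ions at random."""
    p = peak_depth / 100.0  # assumed chance that one ion hits a peak by accident
    tail = sum(
        comb(n_ions, k) * p**k * (1.0 - p) ** (n_ions - k)
        for k in range(n_matched, n_ions + 1)
    )
    return -10.0 * log10(tail)


def ambiguity_score(n_site_ions: int, matched_best: int,
                    matched_second: int, peak_depth: int) -> float:
    """Difference between the scores of the top two candidate sites."""
    best = binomial_score(n_site_ions, matched_best, peak_depth)
    second = binomial_score(n_site_ions, matched_second, peak_depth)
    return best - second


# Made-up example: 8 site-determining ions, 6 matched for the best site,
# 1 matched for the runner-up, at a peak depth of 5 peaks per 100 m/z.
example = ambiguity_score(n_site_ions=8, matched_best=6,
                          matched_second=1, peak_depth=5)
print(f"A-Score: {example:.1f}, significant: {example > 13}")
```

When both candidate sites match the same number of site-determining ions, the difference is zero and the site cannot be localized from that spectrum.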

The peptide score tab provides visual confirmation of a modification’s likelihood on one amino acid over the other possible options in the peptide.

Analysis of phosphorylation sites on the Q-Exactive instrument, with enrichment and CID

Sequence Coverage Map (33% sequence coverage):

7 confirmed phosphorylation sites: T156, S159, S360, S377, S431, S664, S666.

1 phospho moiety present, but exact site NOT confirmed: S441 and S545

T156: Spectrum and fragmentation table: phosphorylation site directly confirmed by b-ion series.

S360: Spectrum and fragmentation table: phosphorylation indirectly confirmed by peptide mass. Fragment ions show that the other Ser/Thr sites are not phosphorylated.

Scaffold PTM results for phosphorylation site S360

S545: phospho moiety present (by parent ion mass); exact localization of the site not confirmed due to poor fragmentation of this long peptide.

Scaffold PTM results for phosphorylation site S545

Analysis of phosphorylation sites on the FUSION instrument, with enrichment and CID

Sequence Coverage Map (67% sequence coverage):

17 confirmed phosphorylation sites: S28, S30, T63, S142, T156, S159, S309, S354, S360, S377, S431, S441, S545, S559, S664, S666

Not confirmed: S418, S448, T449, S450, S451, T535, Y544, T547, S551, S553

S545: confirmed phosphorylation site

Significantly more MS/MS spectra were obtained, among them a few with much better peptide fragmentation, which allow the phosphorylation site to be assigned with confidence.

Peptide Fragmentation by MS/MS (ETD)

(N-terminus) Asp(1)------Gly(2)------Gly(3)------Lys(4) (C-terminus)

z-ions (C-terminal fragments): z3 = Gly-Gly-Lys, z2 = Gly-Lys, z1 = Lys

c-ions (N-terminal fragments): c1 = Asp, c2 = Asp-Gly, c3 = Asp-Gly-Gly

Peptides fragment along the backbone and yield two ion series:
• The c-ions contain the N-terminus of the peptide; they are numbered from left to right.
• The z-ions contain the C-terminus of the peptide; they are numbered from right to left.
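The c- and z-ion series can be computed in the same way as b- and y-ions, with different terminal offsets. The sketch below is illustrative only. It assumes singly charged ions and monoisotopic masses, and it reports z-ions in the radical z• (z+1) form, a convention that varies between software packages. The peptide DGSK and its phosphorylated serine are hypothetical, chosen to show that only fragments containing the modified residue shift by the phospho mass (HPO3, 79.966331 Da).

```python
RESIDUE_MASS = {"D": 115.026943, "G": 57.021464, "S": 87.032028, "K": 128.094963}
AMMONIA = 17.026549   # NH3
WATER = 18.010565     # H2O
HYDROGEN = 1.007825   # H atom, added for the radical z-dot form
PROTON = 1.007276     # charge carrier
PHOSPHO = 79.966331   # HPO3 added to a phosphorylated Ser/Thr/Tyr


def residue_masses(peptide: str, phospho_sites=()) -> list[float]:
    """Residue masses, adding the phospho group at the given 1-based positions."""
    return [
        RESIDUE_MASS[aa] + (PHOSPHO if pos in phospho_sites else 0.0)
        for pos, aa in enumerate(peptide, start=1)
    ]


def c_ions(peptide: str, phospho_sites=()) -> list[float]:
    """m/z of c1..c(n-1): N-terminal fragments."""
    ions, total = [], 0.0
    for mass in residue_masses(peptide, phospho_sites)[:-1]:
        total += mass
        ions.append(total + AMMONIA + PROTON)
    return ions


def z_ions(peptide: str, phospho_sites=()) -> list[float]:
    """m/z of z1..z(n-1) in the z-dot form: C-terminal fragments."""
    ions, total = [], 0.0
    for mass in reversed(residue_masses(peptide, phospho_sites)[1:]):
        total += mass
        ions.append(total + WATER - AMMONIA + HYDROGEN + PROTON)
    return ions


# Mass shift per fragment when Ser at position 3 of DGSK carries a phosphate.
c_shift = [p - u for p, u in zip(c_ions("DGSK", {3}), c_ions("DGSK"))]
z_shift = [p - u for p, u in zip(z_ions("DGSK", {3}), z_ions("DGSK"))]
print([round(s, 4) for s in c_shift])
print([round(s, 4) for s in z_shift])
```

Fragments that do not contain the modified serine show no shift, which is what makes the remaining ones site-determining.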

ETD = Electron Transfer Dissociation: the peptide backbone dissociates between the amide nitrogen and the alpha carbon (N–Cα bond), producing c and z ions. The amino acid side chains stay intact, which makes ETD useful for phosphorylation site mapping.

The power of ETD

Analysis of phosphorylation sites on the Q-Exactive instrument, with enrichment, CID and tryptic digest

Sequence coverage: 19%

• 1 confirmed phosphorylation site: S75.
• Seven possible phosphorylation sites are present in the C-terminal peptide.
• 2 phospho moieties are present, as confirmed by parent ion mass.
• The exact location of these two sites is not confirmed by fragment spectra.

Analysis of phosphorylation sites on the Q-Exactive instrument, with enrichment, ETD and Glu-C digest

Sequence coverage: 67%

11 phosphorylation sites unambiguously confirmed: S75, S80, T84, S87, S97, S100, S101, S102, S150, S196 and S198.

Phosphorylation likely but not unambiguously confirmed: S129.

Summary and Conclusion

• The Orbitrap Fusion Lumos instrument with its increased sensitivity allows for more phosphorylation sites to be detected and assigned with confidence.

• Mascot/Scaffold and Scaffold PTM are proven and efficient tools to help assign phosphorylation sites with confidence.

• Phosphorylated peptides usually have an abundance of less than 5%, therefore selective enrichment is recommended prior to analysis.

• Using different proteases and/or collision modes (CID/ETD) can enable the confident assignment of more phosphorylation sites

• Start with sufficient protein

• The combination of all these factors yields the best possible project outcome
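To illustrate why a second protease can help, the sketch below digests a made-up sequence in silico with trypsin and with Glu-C and counts the candidate Ser/Thr/Tyr sites per peptide. The cleavage rules are simplified assumptions: trypsin cleaves after K or R unless the next residue is P; Glu-C is taken to cleave after E only, although its real specificity depends on the buffer; missed cleavages are ignored. The sequence and function names are hypothetical.

```python
def digest(sequence: str, cleave_after: str, blocked_by: str = "") -> list[str]:
    """Cut C-terminal to any residue in cleave_after, unless the next residue blocks it."""
    peptides, start = [], 0
    for i in range(len(sequence) - 1):  # never cut after the last residue
        if sequence[i] in cleave_after and sequence[i + 1] not in blocked_by:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


def candidate_sites(peptide: str) -> int:
    """Number of residues that could carry a phosphate (S, T or Y)."""
    return sum(peptide.count(aa) for aa in "STY")


SEQUENCE = "MAKSSESTEGSYETSEK"  # invented sequence, for illustration only

tryptic = digest(SEQUENCE, cleave_after="KR", blocked_by="P")
glu_c = digest(SEQUENCE, cleave_after="E")

for label, peptides in (("Trypsin", tryptic), ("Glu-C", glu_c)):
    print(label, [(p, candidate_sites(p)) for p in peptides])
```

In this invented case the single long tryptic peptide carries all eight candidate sites, whereas each Glu-C peptide carries at most two, so far fewer site combinations have to be distinguished per spectrum.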