A Thesis

entitled

Applications of Ion Mobility Mass Spectrometry - Screening for SUMOylation and Other Post-Translational Modifications

by

Quentin Dumont

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the Master of Science Degree in Chemistry

______Dr. Wendell P. Griffith, Committee Chair

______Dr. Timothy C. Mueser, Committee Member

______Dr. Max O. Funk, Committee Member

______Dr. Patricia R. Komuniecki, Dean, College of Graduate Studies

The University of Toledo December 2012

Copyright 2012 Quentin Dumont ©

This document is copyrighted material. Under copyright law, no parts of this document may be reproduced without the expressed permission of the author.

An Abstract of

Screening for SUMOylation and Other Post-Translational Modifications – Applications of Ion Mobility Mass Spectrometry

by

Quentin Dumont

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the Master of Science Degree in Chemistry

The University of Toledo

December 2012

Post-translational modifications (PTMs) of proteins are fundamental processes that trigger, regulate and terminate most cellular mechanisms through the covalent attachment of chemical moieties to substrate proteins. Conjugation to the Small Ubiquitin-like MOdifier (SUMO) is a highly conserved and regulated post-translational modification that is often restricted to highly specific cellular events and a number of obligatory nuclear regulatory processes. Although much is known about the processes of SUMOylation and deSUMOylation, exactly how SUMO and its 4 human isoforms mediate their various functions remains unclear. This is due in part to the extremely low abundance of modified proteins (about 2% of a given substrate will become modified), and to the large expense of time and effort required to identify and characterize these SUMOylated proteins by current methods.

This thesis presents the development and application of an ion mobility mass spectrometry (IMMS)-based method of screening for isopeptides from SUMOylated substrates. The model conjugates poly-SUMO2 and poly-SUMO3 were digested by trypsin/chymotrypsin to produce small linear peptides and isopeptides with a QQQTGG tag on the substrate. Using solution conditions to promote higher charge states and IMMS, mass spectra for the larger isopeptides with a +3 charge state were extracted from the mass-mobility plot. These isopeptides were confirmed using tandem mass spectrometry. Interestingly, a neutral loss of 17 Da was observed for all isopeptides, which resulted from the loss of ammonia due to the acid-catalyzed rearrangement of the N-terminal residue of the isopeptide tag. The method was also applied to in vitro SUMOylated RanGAP1 and Sp100 fragments, two known SUMO substrates.

IMMS was also applied to the identification of tyrosine sulfation in peptides using an approach that was developed for phosphorylated peptides. Analysis of a trypsin digest of phosphorylated α-casein showed the expected decrease in drift time. A similar reduction in drift time was observed for the tyrosine-sulfated synthetic peptide DY*MGWMDF-NH2 (Spp) relative to its unmodified analog (DSpp). This decrease in drift time was a result of the modified peptides becoming more compact due to electrostatic interaction between the negatively charged modification and the positively charged N-terminus. The tyrosine-sulfated peptide and its unmodified analog DSpp were spiked in equimolar amounts into a digest of bovine serum albumin. The band corresponding to the modified peptide was clearly identified in the much more complex mass-mobility plot, with a significant shift in the drift time for the tyrosine-sulfated peptide.

IMMS is a highly promising method for analysis of PTMs due to its sensitivity and selectivity. In this work we demonstrated its utility in screening digests for isopeptides from SUMO-conjugated substrates and tyrosine-sulfated peptides using model analytes. Future work will involve testing the limits of these methods including peptide sequence, peptide size and detection limits in more complex cellular matrices.


Acknowledgements

I would like to express my deepest appreciation to my advisor Dr. Wendell P. Griffith for his constant help, his guidance and limitless patience. Thank you for teaching me so much about mass spectrometry, and for the excellent balance between your encouragement when morale was low and your criticism when I was losing focus. I could not have wished for a better advisor! I would also like to thank my committee members, Dr. Funk and Dr. Mueser, for their helpful suggestions and for staying onboard with me despite my complete change of project. I am grateful to the Chemistry Department staff for making my time here so smooth, and to Leif Hansen, who always helped me with the MALDI when I needed it. I want to acknowledge Dr. Anderson, Dr. Bryant-Friedrich, Dr. Lind, Dr. Funk, and Dr. Ronning for sharing their knowledge during their amazing classes. I have learned so much! I also want to give special thanks to Edith Kippenhan, who is probably the best TA advisor on Earth!

I would like to thank my past and present labmates, Jingshu, David, Steven, Camille, Anthony and Mallory, and the grad students from other labs for the wonderful time spent here; it was fantastic to meet all of these amazing people. Special thanks go to Lucile, who was always there for me, and with whom I have been on the craziest trips of my life! I am thankful to my family and friends in France for their unconditional love and support. Thank you all for this unforgettable experience!

Table of Contents

Abstract ...... iii

Acknowledgements ...... v

Table of Contents ...... vi

List of Figures ...... xii

List of Abbreviations ...... xvi

Introduction ...... 1

1.1. From Genes to Proteins: a Multi-Step Process With Increased Complexity ...... 1

1.1.1. Protein Synthesis: General Process ...... 1

1.1.2. The Diversification of the Proteome ...... 2

1.2. Post-Translational Modifications: Diversity and Functions ...... 4

1.2.1. The Different Types of PTMs ...... 4

1.2.2. The Effect of PTMs on the Protein Substrates ...... 5

1.3. Project Goal ...... 7

1.4. Organization of the Thesis ...... 9

Mass Spectrometry and its Role in Proteomics ...... 11

2.1. Principles of Mass Spectrometry ...... 11

2.1.1. Fundamentals ...... 11

2.1.2. The Development of MS for Proteomics ...... 12

2.2. Instrument Parts ...... 13

2.2.1. Ionization Sources ...... 14

2.2.2. Mass Analyzers ...... 18

2.3. Synapt HDMS Instrument: Components and Features ...... 22

2.3.1. Instrument Parts ...... 22

2.3.2. Experimental Approaches ...... 24

2.4. Applications of MS to Proteomics ...... 28

2.4.1. Definition of Proteomics ...... 28

2.4.2. Protein Identity, Quantity and Structural Features ...... 30

2.4.3. PTMs and Proteomics ...... 32

Characterization of SUMOylated proteins by MS ...... 34

3.1. Ubiquitin and SUMO ...... 34

3.1.1. Structure and Functions of Ub ...... 34

3.1.2. An Ubiquitin-Like Modifier: SUMO ...... 36

3.2. Existing Methods for the Detection of Ubls by MS ...... 40

3.2.1. Methods for Ubiquitin ...... 40

3.2.2. Yeast Analog ...... 43

3.2.3. Mutagenesis ...... 44

3.2.4. Use of High-end MS Instrumentation ...... 45

3.2.5. Software Tools ...... 46

3.2.6. Griffith/Cotter: Tailored- Approach ...... 48

3.3. Requirements for a New Method ...... 51

Material and Methods ...... 53

4.1. Materials ...... 53

4.2. In vitro SUMOylation of Proteins ...... 54

4.2. SDS-PAGE Gels ...... 54

4.3. Enzymatic Digestion ...... 55

4.4. Zip Tip Procedures ...... 56

4.4.1. Manufacturer Protocol ...... 56

4.4.2. Preparation of Oligonucleotides Procedure ...... 57

4.5. Mass Spectrometry ...... 57

4.5.1. Nano-ESI-TOF/IM MS ...... 57

4.5.3. MALDI MS ...... 58

4.6. In silico Digests ...... 59

4.7. Spiking of BSA Digests ...... 59

Pre-Screening Method for SUMOylated Proteins ...... 60

5.1. Proof-of-Concept for the Screening Method ...... 60

5.1.1. Overview ...... 60

5.1.2. ESI-MS Analysis of the Poly-SUMO-2 Digest ...... 61

5.1.3. Ion Mobility Analysis of the Poly-SUMO-2 Digest ...... 63

5.1.4. Other SUMO Isoforms ...... 69

5.2. Applications of the Method ...... 70

5.2.1. In vitro SUMOylation: Sp100 and RanGAP1 ...... 70

Analysis of Tyrosine O-Sulfation by IMMS ...... 74

6.1. Physiological Functions of Small PTMs ...... 74

6.1.1. Sulfation ...... 74

6.1.2. Phosphorylation ...... 76

6.1.3. Sulfation, Phosphorylation: Similarities and Differences ...... 77

6.2. Project Goal ...... 79

6.3. Optimization of the IMMS Experiments: Phosphorylation ...... 80

6.3.1. Model System ...... 80

6.3.2. Results ...... 81

6.4. Screening for Sulfated Peptides ...... 85

6.4.1. Model System ...... 85

6.4.2. Analysis of Spp and DSpp ...... 85

6.4.3. Spiking of a Bovine Serum Albumin Digest with Spp ...... 88

6.4.4. The effect of positive versus negative ionization mode ...... 90

6.5. State of the Project and Conclusion ...... 94

Conclusions and Future Directions ...... 96

7.1. Conclusions ...... 96

7.2. Future Directions ...... 98

References ...... 100

Appendix A: Tandem MS of the linear SUMO-2 peptides ...... 111

Appendix B: Tandem MS simulated distribution of SUMO-2 isopeptides ...... 116

Appendix C: ESI-MS and MALDI-MS data for Sp100 and RanGAP1 fragments ...... 119

Appendix D: α-casein peptides identified ...... 123

Appendix E: BSA peptides identified ...... 127

List of Tables

1.1 Known protein post-translational modifications of amino acid side chains ...... 5

1.2 Common PTMs: the change in mass on the substrate and their functions ...... 7

4.1 Protein and reagent amounts and final concentrations used for in vitro SUMO conjugation reactions ...... 54

4.2 Chemicals and amounts used for the preparation of 15% acrylamide SDS-PAGE gel ...... 55

5.1 List of the major linear peptides detected from the poly-SUMO-2 dual digest ...... 63

5.2 List of the SUMO isopeptides detected using ion mobility mass spectrometry ...... 68

6.1 Comparison of some biological and biochemical features of phosphorylation and sulfation as PTMs, together with their properties during MS analysis ...... 79

D.1 List of the major linear peptides detected matching the masses of an in silico list of α-casein digest peptides ...... 123

E.1 List of the major linear peptides detected matching the masses of an in silico list of BSA digest peptides ...... 127

List of Figures

2-1 Schematic of a typical mass spectrometer ...... 14

2-2 Formation of ions in ESI ...... 16

2-3 Schematics of linear and single-stage reflectron TOF analyzers ...... 21

2-4 Schematic of the Synapt HDMS system ...... 23

2-5 The peaks found in MS spectra provide both qualitative and quantitative information about the species being analyzed ...... 25

2-6 Schematic of two data representations used for IMMS: ion mobilogram and drift ion chromatogram ...... 26

3-1 Scheme of the formation during Ub or SUMO conjugation ...... 35

3-2 Crystal structures of human SUMO-1 and Ub ...... 37

3-3 Sequence alignment of the C-terminal region of Ub and several Ubls ...... 42

3-4 Sequence alignment of the human SUMO isoforms C-terminal regions ...... 49

3-5 Schematic of an isopeptide carrying the 6-residue SUMO tag obtained by dual digestion with trypsin and chymotrypsin ...... 50

3-6 Scheme of the N-terminal glutamine rearrangement into pyroglutamate ...... 51

5-1 Sequence of human SUMO-2 ...... 61

5-2 ESI-TOF MS mass spectrum of the poly-SUMO-2 dual digest ...... 62

5-3 Mass-mobility plot of the trypsin/chymotrypsin digest of poly-SUMO-2 ...... 64

5-4 Drift time total ion chromatogram of peptides detected from the trypsin/chymotrypsin proteolysis of poly-SUMO2 ...... 65

5-5 Reconstructed mass spectrum of the z ≥ +3 peptides from the poly-SUMO-2 dual digest ...... 67

5-6 Comparison of the experimental and simulated isotopic distributions for peptide 2 of Table 5.2 ...... 67

5-7 MS/MS spectrum of the isopeptide 731.74 m/z (+3) (3 in Table 5.2) ...... 69

5-8 Reconstructed mass spectrum of the z ≥ +3 peptides from the poly-SUMO-2 dual digest ...... 70

5-9 Sequences of the commercial RanGAP1 and Sp100 fragments ...... 71

5-10 SDS-PAGE of in vitro SUMOylation of Sp100 and RanGAP1 fragments in presence or absence of SUMO-1 and ATP ...... 72

6-1 ESI-TOF MS spectrum of an α-casein trypsin digest ...... 82

6-2 Tandem mass spectrum of the precursor ion at m/z 1660.70 corresponding to the singly phosphorylated peptide VPQLEIVPNpSAEER ...... 83

6-3 Mass-mobility plot of the α-casein trypsin digest ...... 84

6-4 IMMS analysis of a solution of Spp in denaturing solution ...... 87

6-5 MS/MS spectrum of the 1063.41 m/z (+1) peptide ...... 87

6-6 Positive ESI-TOF MS spectrum of a 5 µM BSA trypsin digest spiked with an equimolar ratio of Spp ...... 89

6-7 Positive IMMS of a 5 µM BSA digest spiked with a 1:1 molar ratio of Spp ...... 90

6-8 Analysis in positive and negative ionization mode of Spp solution ...... 91

6-9 Negative ESI-TOF MS spectrum of a 5 µM BSA trypsin digest spiked with an equimolar ratio of Spp ...... 92

6-10 MS/MS spectrum of the 1140.81 m/z (-1) peptide ...... 93

6-11 Negative IMMS of a 5 mM BSA digest spiked with a 1:1 molar ratio of Spp ..... 94

A-1 MS/MS spectrum of the linear SUMO-2 peptide 535.26 (+2) ...... 111

A-2 MS/MS spectrum of the linear SUMO-2 peptide 599.35 (+2) ...... 112

A-3 MS/MS spectrum of the linear SUMO-2 peptide 617.84 (+2) ...... 112

A-4 MS/MS spectrum of the linear SUMO-2 peptide 749.87 (+2) ...... 113

A-5 MS/MS spectrum of the linear SUMO-2 peptide 823.41 (+2) ...... 113

A-6 MS/MS spectrum of the linear SUMO-2 peptide 901.46 (+2) ...... 114

A-7 MS/MS spectrum of the linear SUMO-2 peptide 1106.58 (+1) ...... 114

A-8 MS/MS spectrum of the linear SUMO-2 peptide 1342.57 (+1) ...... 115

B-1 MS/MS spectrum of isopeptide 2 in Table 5.2 ...... 116

B-2 Comparison of the experimental and simulated isotopic distributions for peptide 1 of Table 5.2 ...... 117

B-3 Comparison of the experimental and simulated isotopic distributions for peptide 3 of Table 5.2 ...... 117

B-4 Comparison of the experimental and simulated isotopic distributions for peptide 4 of Table 5.2 ...... 118

C-1 ESI-TOF MS spectrum of the in vitro SUMOylation of Sp100 fragment by SUMO-1 ...... 119

C-2 MALDI-TOF MS of the trypsin digested Sp100 fragment ...... 120

C-3 ESI-TOF MS spectrum of the in vitro SUMOylation of RanGAP1 fragment by SUMO-1 ...... 121

C-4 MALDI-TOF MS of the trypsin digested RanGAP1 fragment ...... 122

D-1 MS/MS spectrum of the -casein peptide 615.32 (+1) ...... 123

D-2 MS/MS spectrum of the -casein peptide 748.37 (+1) ...... 123

D-3 MS/MS spectrum of the -casein peptide 1267.69 (+1) ...... 124

D-4 MS/MS spectrum of the -casein peptide 1384.73 (+1) ...... 124

D-5 MS/MS spectrum of the -casein peptide 1759.90 (+1) ...... 125

E-1 MS/MS spectrum of the BSA peptide 517.34 (+1) ...... 126

E-2 MS/MS spectrum of the BSA peptide 545.38 (+1) ...... 128

E-3 MS/MS spectrum of the BSA peptide 649.38 (+1) ...... 129

E-4 MS/MS spectrum of the BSA peptide 660.40 (+1) ...... 120

E-5 MS/MS spectrum of the BSA peptide 689.43 (+1) ...... 130

E-6 MS/MS spectrum of the BSA peptide 712.42 (+1) ...... 130

E-7 MS/MS spectrum of the BSA peptide 789.53 (+1) ...... 131

E-8 MS/MS spectrum of the BSA peptide 922.56 (+1) ...... 131

E-9 MS/MS spectrum of the BSA peptide 927.57 (+1) ...... 132

E-10 MS/MS spectrum of the BSA peptide 974.53 (+1) ...... 132

E-11 MS/MS spectrum of the BSA peptide 1014.71 (+1) ...... 133

E-12 MS/MS spectrum of the BSA peptide 1163.73 (+1) ...... 133

E-13 MS/MS spectrum of the BSA peptide 1305.82 (+1) ...... 134

E-14 MS/MS spectrum of the BSA peptide 1479.93 (+1) ...... 134

E-15 MS/MS spectrum of the BSA peptide 1567.89 (+1) ...... 135


List of Abbreviations

ACN ...... Acetonitrile
AmAc ...... Ammonium Acetate
AmBic ...... Ammonium Bicarbonate
AMP ...... Adenosine Monophosphate
AP ...... Atmospheric Pressure
ATG8 ...... Autophagy-Related Protein 8
ATG12 ...... Autophagy-Related Protein 12
ATP ...... Adenosine Triphosphate

BSA ...... Bovine Serum Albumin

CE ...... Collision Energy
CENP-C ...... Human Centromere Protein C
CHCA ...... α-Cyano-4-Hydroxycinnamic Acid
CI ...... Chemical Ionization
CID ...... Collision Induced Dissociation

DNA ...... Deoxyribonucleic Acid
DSpp ...... Desulfated Peptide
DTT ...... Dithiothreitol

E. coli ...... Escherichia coli
E1 ...... Ubiquitin-Activating Enzyme
E2 ...... Ubiquitin-Conjugating Enzyme
E3 ...... Ubiquitin Ligase
ECD ...... Electron Capture Dissociation
ESI ...... Electrospray Ionization
ETD ...... Electron Transfer Dissociation

FAB ...... Fast Atom Bombardment
FT-ICR ...... Fourier Transform Ion Cyclotron Resonance
FUBI ...... Fau and its Ubiquitin-Like Domain

GTP ...... Guanosine Triphosphate

HDMS ...... High-Definition Mass Spectrometry
HEPES ...... 4-(2-Hydroxyethyl)-1-Piperazineethanesulfonic Acid
HGP ...... Human Genome Project
HIV ...... Human Immunodeficiency Virus
HPLC ...... High Performance Liquid Chromatography
HUB1 ...... Histone Mono-Ubiquitination Protein 1

IKB ...... Nuclear Factor of Kappa Light Polypeptide Gene Enhancer in B-Cells Inhibitor, Alpha
IMMS ...... Ion Mobility Mass Spectrometry
ISG15 ...... Interferon-Stimulated Gene 15
IT ...... Ion Trap

LC ...... Liquid Chromatography
LDI ...... Laser Desorption/Ionization

MALDI ...... Matrix-Assisted Laser Desorption/Ionization
mRNA ...... Messenger Ribonucleic Acid
MS ...... Mass Spectrometry
MS/MS ...... Tandem Mass Spectrometry

NEDD8 ...... Neural Precursor Cell Expressed Protein, Developmentally Down-Regulated 8

PDB ...... Protein Data Bank
PTM ...... Post-translational Modification

Q ...... Quadrupole

RanGAP1 ...... Ran GTPase Activating Protein 1
RF ...... Radiofrequency
RNA ...... Ribonucleic Acid
RP ...... Reverse Phase

S. cerevisiae ...... Saccharomyces cerevisiae
SDS-PAGE ...... Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis
SILAC ...... Stable Isotope Labeling by Amino Acids in Cell Culture
SILIS ...... Stable Isotope-Labeled Internal Standard
SPITC ...... Sulfophenyl Isothiocyanate
Spp ...... Sulfated Peptide
SRM ...... Selected Reaction Monitoring
SUMO ...... Small Ubiquitin-Like Modifier

TEAA ...... Triethylammonium Acetate
TFA ...... Trifluoroacetic Acid
TOF ...... Time-of-Flight
TPST ...... Tyrosylprotein Sulfotransferase
T-wave ...... Traveling Wave
TWIMS ...... Traveling Wave Ion Mobility Spectrometry

Ub ...... Ubiquitin
Ubl ...... Ubiquitin-Like Modifier
UCRP ...... Ubiquitin Cross-Reactive Protein
UFM1 ...... Ubiquitin-Fold Modifier 1
URM1 ...... Ubiquitin-Related Modifier 1
UV ...... Ultraviolet

List of Symbols

cm ...... Centimeter
µm ...... Micrometer
Da ...... Dalton
eV ...... Electron Volt
mM ...... Millimolar
µM ...... Micromolar
m/z ...... Mass-to-Charge Ratio
min ...... Minute
Th ...... Thomson

Chapter 1: Introduction

1.1. From Genes to Proteins: a Multi-Step Process With Increased Complexity

1.1.1. Protein Synthesis: General Process

All living organisms are maintained and regulated by mechanisms involving proteins, the building blocks of any living species. Each protein carries out a specific function, and is encoded by genes present in deoxyribonucleic acid (DNA). As the complexity of an organism increases, the number of proteins required also increases.

Because the synthesis of so many proteins would demand a lot of energy, living organisms have developed mechanisms to change the function of proteins after synthesis: post-translational modifications (PTMs). Post-translational modification is the covalent modification of a protein by a chemical moiety in order to change its function; thus, a single protein can be involved in different mechanisms depending on its modification state. Most PTMs are reversible, which allows for proteome diversity. The understanding of all biochemical mechanisms and pathways in the body cannot be achieved without the study of PTMs. As the current methods available for the analysis of PTMs are not always optimal, the work presented in this thesis is aimed at the development of new mass spectrometry-based methods for the screening of select PTMs.

Proteins are linear polymers of amino acids, chemical subunits linked to each other by peptide bonds. The range of weights, structures and functions for proteins is extremely wide, and together with polysaccharides and nucleic acids, two other types of biological macromolecules, proteins are involved in every single process occurring in the cells. The wide variety of protein functions ranges from catalyzing biological reactions1 to helping fold other proteins.2, 3 Because proteins are present in every organism, their proper biosynthesis is essential to all life. Each protein has a specific amino acid sequence, which is defined by the nucleotide sequence of the gene encoding the protein.4

1.1.2. The Diversification of the Proteome

If one gene with a unique sequence had the capacity to encode only one protein of a given amino acid sequence, the number of genes in an organism would define the number of proteins and, consequently, the complexity of the proteome (the set of proteins expressed by an organism). Following the revolution of the Human Genome Project (HGP),5 the genomes of 46 organisms have been sequenced. Interestingly, D. melanogaster, a species of drosophila, was found to have about 26,000 genes, while humans possess a genome of approximately 30,000 genes. Moreover, it was discovered that the genes in humans are able to produce more than 1,000,000 different molecular protein species, representing a complexity about two orders of magnitude higher than if one gene encoded only one protein.6 The comparison of the complexity of both organisms was the first clue that led to the conclusion that proteome diversity is not directly related to the number of genes.

Indeed, it was later discovered that there are several mechanisms to generate high diversity in the proteome. Several processes occur before or during protein synthesis. They include mRNA splicing7 and editing,8 and alternative promoter usage.9 The last pathway that explains the much increased complexity of the proteome compared to the genome, and the focus of this thesis, is post-translational modification. PTM is the chemical modification of a protein that occurs shortly after biosynthesis (once the RNA has been translated) or at any point in its lifespan. PTMs are not directly encoded by genes but can modify the function, location and turnover of a single protein. The attachment of various biochemical functional groups (such as acetyl, lipid, carbohydrate, and sulfate) extends the range of functions of the protein by changing the chemical nature of the modified amino acid, or by inducing structural change (for instance, formation of disulfide bridges). The wide variety of PTM reactions is catalyzed by an equally large number of enzymes, and can occur at one or several places either on the side chain of amino acids constituting the primary sequence or on the backbone of the protein itself. Although PTMs have been identified in prokaryotes, the extent of modifications is much higher in eukaryotes, both in terms of diversity and occurrence.6 It has to be noted here that the modification of a protein is typically not homogeneous: because of alternative splicing and a combination of the various modifications described above, a single gene can lead to multiple gene products. Consequently, the amount of protein in a given modification state is minute compared to the total amount of gene products.

1.2. Post-Translational Modifications: Diversity and Functions

1.2.1. The Different Types of PTMs

PTMs can be divided into two main groups: the ones that covalently modify the substrate protein by addition of a chemical moiety or polypeptide chain, and others that cleave the backbone at a specific site determined by the primary sequence. For the purpose of this dissertation, most of the discussion is focused on the first kind of PTM, leading to covalent modification of the protein.

Of the 20 common amino acids produced by human cells, only 15 can undergo post-translational modification of their side chain. These amino acids and some of their known modifications are listed in Table 1.1. The combination of the 15 modifiable amino acids and the variety of modifications possible for each residue leads to an extremely broad range of possible PTMs for a single protein. Each covalent modification has a specific effect on the modified protein, whether it affects the formal charge or the overall protein structure. The extent of change can vary greatly: for example, N-methylation adds 14 Daltons (Da) to the protein mass, while addition of ubiquitin increases the total mass by more than 8,000 Da.

Table 1.1: Known protein post-translational modifications of amino acid side chains6

Residue   Mechanisms
Arg       N-methylation, N-ADP-ribosylation
Asn       N-methylation, N-ADP-ribosylation
Asp       Phosphorylation
Cys       S-, disulfide bond formation, phosphorylation, S-
Gln       Transglutamination
Glu       Methylation, , polyglycination, polyglutamination
Gly       C-hydroxylation
His       Phosphorylation
Lys       N-methylation, N-acetylation, Ubiquitination, SUMOylation
Met       Oxidation to sulfoxide
Pro       C-hydroxylation
Ser       Phosphorylation
Thr       Phosphorylation, O-glycosylation
Trp       C-mannosylation
Tyr       Phosphorylation, sulfation, ortho-nitration

1.2.2. The Effect of PTMs on the Protein Substrates

PTMs are of great importance in most cellular pathways and processes. When a protein gets modified, one or several of its features are changed. As an example, acetylation of a positively charged ε-amine group on a lysine residue side chain will neutralize its positive charge. Phosphorylation of a residue increases negative charge and hydrophilicity, thereby altering the protein’s properties and, in some cases, its structure. Intra-molecular interactions can be disrupted or modified. Those modifications ultimately have an effect on the function of the protein. Enzymes have very specific active sites due to the interactions created between the binding pocket and the substrate. If an amino acid gets modified and triggers a change in the structure of the active site, even a small one, the enzyme can undergo a complete loss of its catalytic activity, and the pathway in which it is involved is then down-regulated. A list of some common and important PTMs, mass changes and their functions is provided in Table 1.2.

Due to the variety of the possible modifications and the broad range of possible chemical and physical changes to a protein, PTMs are involved in the regulation of most cellular events, and play an important role in all steps of the protein's life cycle. For example, poly-ubiquitination of short-lived proteins results in their degradation by the 26S proteasome.10 Comprehensive knowledge of PTMs is a requirement for complete understanding of the mechanisms taking place in the cell. Some modifications are currently very well characterized (for example phosphorylation) and their analysis is carried out by routine experiments. However, understanding of mechanisms involving larger PTMs (SUMO, in particular) has been hindered by the lack of analytical methods providing fast and reliable results.

Table 1.2: Common PTMs: the change in mass on the substrate and their functions11

PTM type                               Mass change (Da)   Function and notes
Phosphorylation                                           Reversible, activation/inactivation of enzyme activity, modulation of molecular interactions, signaling
  pTyr                                 + 80
  pSer, pThr                           + 80
Acetylation                            + 42               Protein stability, protection of N-terminus. Regulation of protein-DNA interactions
Methylation                            + 14               Regulation of gene expression
Acylation, fatty acid modification                        Cellular localization and targeting signals, membrane tethering, mediator of protein-protein interactions
  Farnesyl                             + 204
  Myristoyl                            + 210
  Palmitoyl                            + 238
Glycosylation                                             Excreted proteins, cell-cell recognition/signaling, reversible, regulatory functions
  N-linked                             > 800
  O-linked                             203, > 800
GPI anchor                             > 1,000            Glycosylphosphatidylinositol (GPI) anchor. Membrane tethering of enzymes and receptors, mainly to outer leaflet of plasma membrane
Hydroxyproline                         + 16               Protein stability and protein-ligand interactions
Sulfation (sTyr)                       + 80               Modulator of protein-protein and receptor-ligand interactions
Disulfide bond formation               - 2                Intra- and intermolecular crosslink, protein stability
Deamidation                            + 1                Possible regulator of protein-ligand and protein-protein interactions, also a common chemical artifact
Pyroglutamic acid                      - 17               Protein stability, blocked N-terminus
Ubiquitination                         > 1,000            Destruction signal
Nitration of Tyr                       + 45               Oxidative damage during inflammation
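Because a mass spectrometer measures m/z rather than mass, the mass changes in Table 1.2 show up in a spectrum divided by the charge state of the ion. The short Python sketch below illustrates this arithmetic. It is only an illustration: the dictionary and function names are hypothetical, the values are the nominal single-valued mass changes from Table 1.2, and the modification is assumed not to change the charge state of the ion.

```python
# Nominal mass changes in Da, copied from the single-valued entries of Table 1.2.
# The dictionary and function names are illustrative, not part of any library.
PTM_MASS_SHIFT_DA = {
    "phosphorylation": 80,
    "acetylation": 42,
    "methylation": 14,
    "farnesyl": 204,
    "myristoyl": 210,
    "palmitoyl": 238,
    "hydroxyproline": 16,
    "sulfation": 80,
    "disulfide bond formation": -2,
    "nitration": 45,
}


def mz_shift(ptm: str, charge: int) -> float:
    """Return the shift in m/z (Th) caused by one copy of `ptm`.

    Assumes the modification leaves the charge state of the ion unchanged,
    so the observed shift is simply the mass change divided by |charge|.
    """
    if charge == 0:
        raise ValueError("charge must be non-zero")
    return PTM_MASS_SHIFT_DA[ptm] / abs(charge)


if __name__ == "__main__":
    # Shift produced by a single phosphate group at charge states +1 to +3.
    for z in (1, 2, 3):
        print(f"phosphorylation, z = {z}: {mz_shift('phosphorylation', z):.2f} Th")
```

At the nominal values listed in the table, sulfation and phosphorylation give the same shift at every charge state, so a nominal mass difference alone does not tell the two modifications apart.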

1.3. Project Goal

Post-translational modifications are crucial to the understanding of cells, as they trigger, regulate and terminate most of the mechanisms that take place. Studying the proteome without taking PTMs into account would give an incomplete, if not incorrect, picture of the function of each protein. There is consequently a real need for the development of fast and reliable methods for the detection and characterization of PTMs.

This goal has been hindered in the past by the fact that the occurrence of PTMs is usually low (for instance, SUMOylation happens for only 2% of substrate proteins)12 and a number of PTMs are labile. Both of these issues make their analysis more challenging and require instrumentation and methods with very low detection limits, high sensitivity, and the capability to preserve labile modifications.

SUMO is a ubiquitin-like modifier that has gained importance in the scientific community over the past few years. While ubiquitin function is quite well known, especially its role in protein degradation by the proteasome,13 SUMO has only recently become a subject of interest. The implication of SUMOylation in many nuclear pathways, together with the finding that the deregulation of SUMOylation leads to various diseases, has increased current interest in this modification. To gain a better understanding, effective techniques must be available for the analysis of SUMOylation. Although some research groups have developed reliable MS methods for the analysis of PTMs, and more specifically SUMOylation (see Chapter 3 for more details), most of them lack physiological relevance and/or are highly time-consuming.

One of the goals of the research carried out in the Griffith laboratory is to develop new mass spectrometry-based methods for the characterization of analytes that have proven too challenging for other currently available analytical methods. With that in mind, the goal of the project presented in this thesis is to develop and apply techniques using the existing MS instrumentation for the reliable, efficient, and accurate screening of PTMs, in particular SUMO. The ion mobility mass spectrometry capability of the Waters Synapt High Definition Mass Spectrometry (HDMS) instrument was used for all of the work presented here. Due to its ability to separate ions based on molecular shape, size and net charge in addition to m/z, IMMS is a convenient technique for the analysis of PTMs. Applications of IMMS to the analysis of SUMOylation and tyrosine O-sulfation are presented in this thesis.

1.4. Organization of the Thesis

The experiments carried out for the completion of this research involved the use of techniques including, but not limited to, in-solution enzymatic digests, in vitro modification of proteins, and desalting using ZipTips. The mass spectrometers that were used included the Bruker Esquire-LC (ESI-QIT MS), Bruker Daltonics UltrafleXtreme (MALDI-TOF/TOF MS), and the Waters Synapt HDMS (ESI Q-TOF MS) equipped with a nano-ESI source and traveling wave ion mobility MS.

This thesis has been divided into seven chapters, each addressing a specific aspect of the project. Firstly, an overview of the mechanisms that increase proteome complexity is provided, with a focus on the diversity and functions of post-translational modifications. In this research, PTMs were analyzed using mass spectrometry, a technique that has become the method of choice for the analysis of biological samples. Consequently, the second chapter provides insight into the field of mass spectrometry with its developments, the features of each part of a mass spectrometer and the intricate relation between MS and proteomics. Chapter 3 provides an overview of the SUMOylation reaction, together with the existing MS-based methods that are used to detect and identify SUMOylated proteins. The goal of this thesis research was to develop a new method that can rapidly analyze post-translationally modified proteins with fewer purification steps. The materials, methods and instrumentation used for these experiments are presented in Chapter 4. The first project was the development of a technique using ion mobility mass spectrometry as a tool to allow fast and simple screening for SUMOylated isopeptides. The proof-of-concept details and applications of this method are highlighted in Chapter 5. Chapter 6 focuses on similar IMMS approaches for screening and identification of tyrosine O-sulfation sites in proteins. The thesis concludes with an overview of the results and a presentation of some possible future directions.


Chapter 2: Mass Spectrometry and its Role in Proteomics

2.1. Principles of Mass Spectrometry

Of all the currently used analytical techniques, very few are as versatile as mass spectrometry (MS). Owing to the many developments in MS instrumentation over the last few decades, MS provides unequaled performance in terms of sensitivity, detection limits, ease of use and speed of analysis. The applications are very diverse and range from nanoparticle analysis to the study of intact ribosomes. The development of electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI) has made MS suitable for the analysis of biological samples, and MS experiments are now widely used for the analysis of proteins and other biomolecules.

2.1.1. Fundamentals

During a mass spectrometry experiment, ions are separated in the instrument based on their mass-to-charge ratios and detected in proportion to their abundance.14 The collected data are plotted in a mass spectrum, where the abscissa is a scale of mass-to-charge ratio (m/z, or Thomson, Th)15 and the ordinate is usually relative intensity. The y-axis may also display ion count in some data representations. In order to be detected, the analytes have to be gas-phase ions. In ESI-MS analysis, the solution containing the analyte is typically introduced into the instrument via an ionization source. As the name suggests, this part of the instrument is used to vaporize and ionize the analytes. After desolvation, ions are focused and sorted in the mass analyzer, which discriminates the ions depending on their m/z. They finally hit the detector, which counts the number of ions for each m/z value and sends an electrical signal interpreted by the computer.16

Because the ions formed are short-lived and unstable, the instrument is under high vacuum (usually from 10⁻³ to 10⁻⁶ torr) to prevent degradation of the ions. The source is sometimes at atmospheric pressure, and a gradient of pressures (called differential pumping) makes the transition between atmospheric pressure (AP) and high vacuum. Low pressure ensures a longer mean free path, which is the average distance an ion can travel without colliding with another particle. The instrument needs a long mean free path so that the ions can reach the detector without undergoing collisions, which can alter their path.
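To give a sense of scale for the pressures quoted above, the mean free path can be estimated from the kinetic theory of gases as λ = k_B·T / (√2·π·d²·p). The short Python sketch below is an illustration only and is not part of the thesis work. It treats the background gas as hard spheres, and both the collision diameter (about 3.7 × 10⁻¹⁰ m, an assumed value for N2) and the temperature (298 K) are assumptions. An ion moving through a neutral gas has a different collision cross section, so the results should be read as order-of-magnitude estimates.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23   # Boltzmann constant (exact SI value)
PA_PER_TORR = 101325.0 / 760.0     # definition of the torr


def mean_free_path_m(pressure_torr, temperature_k=298.0, diameter_m=3.7e-10):
    """Hard-sphere mean free path in metres.

    diameter_m is an assumed collision diameter for N2 and temperature_k an
    assumed room temperature; both are illustrative, not measured values.
    """
    pressure_pa = pressure_torr * PA_PER_TORR
    return (BOLTZMANN_J_PER_K * temperature_k) / (
        math.sqrt(2.0) * math.pi * diameter_m ** 2 * pressure_pa
    )


if __name__ == "__main__":
    # Atmospheric pressure, then the two ends of the vacuum range in the text.
    for pressure in (760.0, 1e-3, 1e-6):
        print(f"{pressure:g} torr: {mean_free_path_m(pressure):.3g} m")
```

Under these assumptions the estimate grows from tens of nanometres at atmospheric pressure to a few centimetres at 10⁻³ torr and tens of metres at 10⁻⁶ torr, which is consistent with the need for high vacuum in the analyzer region.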

2.1.2. The Development of MS for Proteomics

The first mass spectrometer, a “parabola spectrograph”, was built in 1913 by Thomson.17 However, it took most of a century of developments and breakthroughs in the field of MS for mass spectrometers to become suitable for the analysis of large biological samples. Although a wide variety of ionization techniques were developed quite early (electron impact, chemical ionization, etc.), none of them could keep the analyte intact, i.e. without fragmentation. In the mid-1980s, a true need for analysis of full-length proteins with high molecular masses emerged, and the techniques available at that time did not meet the criteria for the desired analysis.18 Although ionization by Fast-Atom Bombardment (FAB)19 could prevent most of the fragmentation of proteins, this technique was limited by the fact that it produced mainly singly-charged ions. Proteins with a molecular weight above 1,000 Da could not be measured by the mass analyzers available.

The invention of ESI by John Fenn in 1989,20 the first “soft” ionization method, overcame all of these problems. Briefly, ESI creates multiply charged species while keeping the analyte intact, which enables the analysis of large proteins. Indeed, because of the multiple charging of analytes, the m/z values fall in a range that can be measured by common mass analyzers. The importance of this discovery, which changed the field of MS, was recognized by a portion of the Nobel Prize in Chemistry in 2002. More details about ESI are provided in Section 2.2.1. Some other soft ionization techniques, notably MALDI, developed by Tanaka in 1988,21 have been used for the analysis of biological samples, but none can compete with the ease of use and efficiency of ESI.
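The effect of multiple charging on the measured m/z can be made concrete with a short calculation. For a protonated ion [M + zH]z+, m/z = (M + z·m_H)/z, where m_H is the mass of a proton. The Python sketch below is illustrative only; the 20 kDa protein mass is an assumed value and does not correspond to any sample analyzed in this thesis.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in Da


def mz_of_protonated_ion(neutral_mass_da, charge):
    """m/z of an [M + zH]z+ ion for a neutral mass M and charge z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


if __name__ == "__main__":
    hypothetical_protein_mass = 20000.0  # Da, illustrative value only
    for z in (1, 10, 20):
        print(z, round(mz_of_protonated_ion(hypothetical_protein_mass, z), 2))
```

For the assumed 20 kDa protein, the singly charged ion would appear above m/z 20,000, whereas the 10+ and 20+ ions fall near m/z 2,001 and 1,001, well within the range of common mass analyzers.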

Owing to these advances, MS has grown increasingly important in the field of proteomics and is currently one of the most versatile and efficient methods for gaining knowledge about proteins.

2.2. Instrument Parts

All MS instruments have the same basic “building blocks”: an ionization source, one or several mass analyzers, and a detector (Fig. 2-1). Although for most instruments the ionization source is found under vacuum, atmospheric pressure sources have also been developed; they include electrospray ionization, atmospheric pressure matrix-assisted laser desorption/ionization and atmospheric pressure chemical ionization (AP-CI). The properties of each part can be very different; thus the choice of each component of the instrument is application-driven.

Figure 2-1: Schematic of a typical Mass Spectrometer.

2.2.1. Ionization Sources

A large number of ionization sources are currently available, each one suitable for a range of applications. They can be classified into two groups: the organic ionization sources used to ionize organic and biological compounds; and the inorganic sources that are more suitable for studying metals, oxides, etc. For the purpose of this thesis, the focus will be on two organic ionization sources, widely applied to the analysis of biological samples: ESI and MALDI.

2.2.1.1. Electrospray Ionization

Electrospray ionization is one of the most widely used techniques in the field of proteomics due to its unique properties. ESI produces highly charged analytes while protecting them from fragmentation as they are being ionized. Although it is currently used for many different families of compounds, including small organic molecules, ESI was developed for proteins and it remains the best ionization source for the analysis of biological samples.

Typically, a solution of the analyte (millimolar, mM, concentration or lower) in a polar volatile solvent is injected via a syringe into a metal capillary at a low flow rate (1-20 µL/min). A high voltage, 2-6 kV, is applied to the tip of the capillary, creating an electric field between the capillary and the counter electrode located at the entrance of the instrument, 1 or 2 centimeters (cm) away from the tip. As the solution exits the capillary, the strong field induces a charge accumulation at the liquid surface at the end of the tip, resulting in the formation of a Taylor cone, followed by dispersion into an aerosol containing highly charged droplets. As the solvent evaporates, the droplet shrinks until the Coulombic repulsion between the surface charges equals the surface tension that holds the droplet together; this point is known as the Rayleigh limit. As the solvent continues to evaporate, the charges become too close to one another, and the droplet “explodes” into smaller ones. This process is repeated until a totally desolvated gas-phase ion is obtained. A schematic of the formation of charged ions in ESI is presented in Fig. 2-2. It is interesting to note that although ESI is so widely used, the exact mechanism is currently not fully understood. Nevertheless, some studies are leading the way to a better understanding of the physical and chemical processes involved in ESI.22, 23
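The Rayleigh limit mentioned above has a simple closed form for a spherical droplet, q_R = 8π·√(ε₀·γ·r³), where γ is the surface tension and r the droplet radius. The following Python sketch evaluates it as an illustration; the surface tension of pure water (about 0.072 N/m) is an assumed value, and real spray solvents containing organic modifiers and acid will differ.

```python
import math

VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m
ELEMENTARY_CHARGE = 1.602176634e-19     # C


def rayleigh_limit_charge_c(radius_m, surface_tension_n_per_m):
    """Maximum charge (coulombs) a spherical droplet can carry."""
    return 8.0 * math.pi * math.sqrt(
        VACUUM_PERMITTIVITY * surface_tension_n_per_m * radius_m ** 3
    )


def rayleigh_limit_charges(radius_m, surface_tension_n_per_m=0.072):
    """Same limit expressed as a number of elementary charges.

    The default surface tension is an assumed value for pure water near room
    temperature; it is illustrative only.
    """
    charge_c = rayleigh_limit_charge_c(radius_m, surface_tension_n_per_m)
    return charge_c / ELEMENTARY_CHARGE


if __name__ == "__main__":
    # A droplet of 200 nm diameter (100 nm radius).
    print(round(rayleigh_limit_charges(100e-9)))
```

Under these assumptions a water droplet of 100 nm radius can hold a few thousand elementary charges. Because the limit scales as r^(3/2) while the droplet volume scales as r³, evaporation at constant charge inevitably drives a droplet towards fission.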


Figure 2-2: Formation of ions in ESI. The blue dots represent the analyte molecules. a: solvent evaporation. b: “Coulombic explosion”. Figure adapted from24

The Synapt HDMS that was used in this work is equipped with a nano-ESI source, with a lower flow rate of about 0.5 to 1 µL/min. The spray needle is replaced by a borosilicate glass capillary of a few µL volume with a diameter at the tip of 1-4 µm. The initial droplets produced by nano-ESI are much smaller than those of conventional ESI (about 200 nm diameter vs. 20-200 µm). The lower flow rate greatly reduces sample consumption and facilitates desolvation, which in turn improves sensitivity.

As ESI produces multiply charged analytes, the charge state distributions in mass spectra provide much information about the studied compounds. For instance, it has been shown that denatured proteins give mass spectra displaying peaks at higher charges, with higher intensities and with a wider charge state distribution than when they are analyzed in their native, more folded conformation.25, 26 A denatured protein has more solvent-exposed surface area and is statistically more likely to accommodate a larger number of protons than more folded conformers. Due to all of the advantageous features mentioned above, the range of applications for ESI is very broad. ESI-MS-based experiments are further described in Section 2.4.
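A charge state distribution also allows the charge and the neutral mass of an analyte to be recovered from any two adjacent peaks. If the peak at higher m/z carries z protons and its neighbor at lower m/z carries z + 1, then z = (m_low - m_H)/(m_high - m_low) and M = z·(m_high - m_H). The Python sketch below illustrates this standard calculation; it assumes that both peaks are protonated forms of the same molecule, and the function names are hypothetical.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in Da


def charge_of_higher_mz_peak(lower_mz, higher_mz):
    """Charge z of the higher-m/z peak of an adjacent pair.

    Assumes both peaks are protonated forms of the same molecule and that
    the lower-m/z peak carries exactly one more proton (charge z + 1).
    """
    return (lower_mz - PROTON_MASS_DA) / (higher_mz - lower_mz)


def neutral_mass_from_pair(lower_mz, higher_mz):
    """Neutral mass M estimated from two adjacent charge-state peaks."""
    z = round(charge_of_higher_mz_peak(lower_mz, higher_mz))
    return z * (higher_mz - PROTON_MASS_DA)
```

With real data the computed charge is not exactly an integer because of measurement error, which is why it is rounded before the mass is calculated; averaging the mass over several peak pairs is a common refinement.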

2.2.1.2. MALDI

Matrix-Assisted Laser Desorption/Ionization was developed by Karas and Hillenkamp in 1987,27 and shortly thereafter was applied to the analysis of proteins with a molecular weight over 100 kDa by Tanaka.21 MALDI was developed as an improvement to the already existing Laser Desorption/Ionization (LDI) technique, where the analyte is fixed on a metal plate and irradiated with an ultraviolet (UV) laser. However, LDI had several limitations: extensive fragmentation due to the large amount of energy absorbed by the sample, low sensitivity, poor reproducibility and a very high dependence of the signal on the capacity of the analyte to absorb UV light. MALDI is based on the same principle, except that the analyte is co-crystallized with an excess of matrix, an organic molecule chosen to absorb light and enhance ionization of the sample. The matrix molecules are typically aromatic to absorb UV light, acidic to donate protons to the analyte, and easy to co-crystallize with the analyte of interest.28, 29 The addition of the matrix makes the ionization of the analyte indirect. Thus, for each laser pulse the sample is not fragmented as efficient desorption/ionization occurs.30 In addition, using a matrix enables the spot where the laser hits to be refreshed, greatly increasing the shot-to-shot reproducibility of the analysis.

MALDI has many interesting features that make it suitable for the analysis of large biopolymers and for the field of proteomics. The main one is that, unlike ESI, MALDI mainly produces singly charged ions. This is convenient for the analysis of complex peptide mixtures, as one peptide produces only one peak in the resultant mass spectrum, while in ESI each peptide produces multiple charged species, greatly increasing the number of peaks and the complexity of data analysis. For larger analytes like proteins, the +1 charge state peak is oftentimes the most intense, but it is not uncommon for higher charge states to be detected as well. Another advantage of MALDI is its relative tolerance for common ionization suppressors in protein and peptide samples, such as salt, glycerol and urea. Such compounds can be present at low concentrations in biological samples without greatly interfering with the analysis.31 This enables faster determination of the analytes, as a perfect purification is not required. MALDI MS is a very sensitive technique, requiring only nano- to femtomoles of sample.

It has to be pointed out that MALDI MS is not well suited to compounds with a low molecular mass. Matrix and fragment ions produce high-intensity peaks in the range m/z ≤ 500. Small analytes, which give peaks in the same area of the mass spectrum, are likely to go undetected or to overlap with the matrix peaks. In a typical MALDI analysis, the minimum m/z is often set at 400 in order to exclude most of the matrix peaks. Pulsed sources like MALDI work particularly well with time-of-flight mass analyzers.

2.2.2. Mass Analyzers

The improvements in the field of MS are not only due to the development of ionization sources; mass analyzers are just as important for obtaining reliable, high-quality data. Mass analyzers are the part of an MS instrument that separates ions based on their m/z. There are several types of mass analyzers, each one possessing a specific m/z range, resolving power and mechanism of ion transmission. Two main classes can be distinguished: the scanning mass analyzers, which allow only ions with a specific m/z to go through (e.g. quadrupole), and the simultaneous transmission mass analyzers, for which all of the ions pass through at the same time (e.g. time-of-flight).

2.2.2.1. Quadrupole

The single quadrupole (Q) analyzer is a device composed of four metallic rods perfectly parallel to each other. Two are connected by a constant voltage, and the other two rods by a radiofrequency voltage. The polarity of each rod switches from positive to negative in a cyclic fashion. Each pair of rods is 180° out of phase with the other. The principle of the quadrupole is that the rods create an oscillating electric field within which only ions with a given m/z are stable.32 When a positive ion enters the quadrupole, it is attracted by the negative rod due to electrostatic interactions. If the rod switches to positive before the ion reaches it, the ion is repelled from the rod and keeps moving towards the exit end of the mass analyzer. Otherwise, it crashes into the rod, becomes neutral and is not detected. The modulation of the applied voltages (amplitude, frequency of oscillation, etc.) enables scanning over a large range of m/z values. The advantage of using scanning analyzers, such as a quadrupole, is that only ions with very close m/z values are transmitted at the same time and reach the detector, which greatly improves the sensitivity. The detector has to scan only a small range of values, making the detection more efficient.

Quadrupoles are inexpensive, long-lived and commonly used. However, they are limited by an upper m/z of 4,000; they are not suitable for the analysis of large compounds (especially proteins), unless the ionization source produces multiply charged ions. Thus, ESI-Q is a widely used combination for the analysis of proteins, while MALDI-Q is seldom employed. Quadrupoles, when not used as mass analyzers, can also serve as ion guides at the entrance of mass spectrometers, in order to focus the ions and ensure a stable trajectory. The first commercial quadrupole mass analyzer was developed by Finnigan in 1994.33

2.2.2.2. Time-of-flight

Although the principles of time-of-flight (TOF) mass analyzers were described in 1946,34 the first design for a linear TOF analyzer was only published a decade later in 1955 by Wiley and McLaren,35 later becoming the first commercial TOF instrument.

The separation of ions in a TOF analyzer is based on the fact that ions with different m/z have different velocities, so they take different amounts of time to travel a given distance: ions are dispersed in time. By measuring the time needed for each ion to reach the detector, one can determine its m/z value and obtain a mass spectrum for the sample. In practice, the ions, produced as packets, are initially accelerated by an electric field. They then enter the flight tube (a field-free region), where they are separated based on their velocities before reaching a detector situated at the end of the tube (in linear mode). It is important to note that all the ions entering the drift tube have the same kinetic energy. There are two types of TOF mass analyzers: linear and reflectron.36 The design differences are shown in Fig. 2-3.
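The relationship between flight time and m/z follows directly from the equal kinetic energy of the accelerated ions: z·e·V = m·v²/2, so that t = L/v = L·√(m/(2·z·e·V)). The Python sketch below illustrates this for an idealized linear TOF analyzer. The tube length (1 m) and accelerating voltage (20 kV) are assumed values chosen for illustration and are not the settings of any instrument used in this work; the initial energy spread and the time spent in the acceleration region are ignored.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # C
DALTON_KG = 1.66053906660e-27        # kg per Da


def flight_time_s(mz, tube_length_m=1.0, accel_voltage_v=20000.0):
    """Field-free flight time for an ion of a given m/z (Da per charge).

    tube_length_m and accel_voltage_v are assumed, illustrative settings.
    From z*e*V = m*v**2/2, v = sqrt(2*e*V/(m/z)) and t = L/v.
    """
    mass_per_charge_kg = mz * DALTON_KG
    velocity = math.sqrt(
        2.0 * ELEMENTARY_CHARGE * accel_voltage_v / mass_per_charge_kg
    )
    return tube_length_m / velocity


def mz_from_flight_time(time_s, tube_length_m=1.0, accel_voltage_v=20000.0):
    """Invert flight_time_s: m/z from a measured arrival time."""
    velocity = tube_length_m / time_s
    return 2.0 * ELEMENTARY_CHARGE * accel_voltage_v / (velocity ** 2 * DALTON_KG)
```

Under these assumptions an ion of m/z 1,000 reaches the detector after roughly 16 µs, and because the flight time scales with the square root of m/z, an ion of m/z 4,000 takes twice as long.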


Figure 2-3: Schematics of linear (top) and single-stage reflectron (bottom) TOF analyzers. Adapted from Cotter et al. Anal. Chem. 199937

In reflectron mode, an electrostatic reflector creates a retarding field that deflects the incoming ions back into the drift tube. The reflectron is situated opposite the ion source (about the same place as the detector in linear mode) while the orthogonal detector is off-axis with respect to the initial ion beam so that it can be placed co-axially to the ion source. Ions with higher velocity go deeper into the reflectron field, while slower ions do not spend as much time in it. Hence, reflectron mode enables a larger flight distance, significantly improving the resolution. However, the main advantage of the reflectron is that it corrects the dispersion in the initial kinetic energies of ions with the same m/z leaving the source, so that they all reach the detector at the same time. This is illustrated in Fig. 2-3 (bottom), where three ions of the same m/z, regardless of whether their kinetic energy is lower (eV – U0) or higher (eV + U0), have their trajectories corrected by the reflectron and are detected simultaneously. This prevents peak broadening and improves the resolution of the data obtained. Despite these advantages and the dramatic improvement in resolving power, there are some drawbacks to using reflectron mode. The main drawback is the limitation in the m/z range that can be analyzed. While in linear mode there is virtually no limitation on the molecular weight of the analyte (masses above 300 kDa have been detected with MALDI-TOF MS),38, 39 the reflectron is limited to ≤ 10,000 m/z. TOF analyzers have a high transmission efficiency (i.e. the number of ions exiting the mass analyzer is almost the same as the number of ions entering it), leading to higher sensitivity. Due to the longer distance the ions have to travel, the sensitivity in reflectron mode is usually lower than in linear mode. TOF mass analyzers provide an excellent synergy with pulsed ionization techniques that produce packets of ions, such as MALDI.

2.3. Synapt HDMS Instrument: Components and Features

2.3.1. Instrument Parts

The instrument that was used for the majority of the work presented in this thesis is the Synapt HDMS system (Waters Corp.), a hybrid quadrupole ion mobility separation orthogonal acceleration time-of-flight mass spectrometer equipped with a nanospray source (Fig. 2-4).

Figure 2-4: Schematic of the Synapt HDMS (Waters Corp.)40

The nanospray ionization source (a), the so-called “Z-shaped” source, is engineered to effectively get rid of solvent molecules and neutral species before they can enter the instrument. Pushers repel the charged ions to the entrance while the unaffected neutral molecules are pumped away by the vacuum system. Once inside the instrument, the ion beam is guided by a traveling-wave (T-wave) ion guide (b), whose role is to transfer ions from the pressurized ion source region to high vacuum. Next, the quadrupole (c) acts as a mass filter, which stabilizes the trajectory of the ions. A focusing lens focuses the ions before they enter the tri-wave region (d) of the instrument. The tri-wave region comprises three T-waves connected in tandem. The T-wave cell consists of a stacked-ring radio frequency (RF) ion guide which incorporates a repeating sequence of transient voltages applied to the ring electrodes. These voltage pulses create a traveling electric field that propels ions through the background gas present in the mobility cell. For a given charge state, the ion-neutral collision frequency increases as the ion conformation becomes more extended, resulting in a higher propensity to roll back over the waves. The time it takes for an ion to drift through the cell depends on its mobility, the wave period and height, and the gas pressure. Ions with high mobility (compact shape) are better able to keep up with the traveling waves and are pushed more quickly through the cell. Ions with low mobility (extended structures) crest over the waves more often and have to wait for subsequent waves to push them forward, resulting in longer drift times.41

In the Synapt HDMS, each cell of the tri-wave region has a specific function. The trap T-wave ensures high efficiency by trapping and accumulating the ions. It can also be used as a collision cell to fragment ions of interest. The IMS T-wave cell is where the ion mobility separation occurs: ions are discriminated based on their charge, size and shape. Ion mobility MS is described more thoroughly in Section 2.3.2.3. Finally, the packets of ions exiting the transfer T-wave are repelled into the TOF mass analyzer (e). The TOF can be operated in either of two modes: “V-mode”, for which the ions are reflected twice before reaching the detector; or “W-mode”, where the resolution is increased dramatically as the ions travel almost twice the distance, but this improvement comes at the cost of sensitivity.

2.3.2. Experimental Approaches

2.3.2.1. MS Analysis

The most basic MS experiment that can be carried out is a survey scan. The sample is introduced into the mass spectrometer and ionized; the ions are separated by the mass analyzer and cataloged according to their m/z; and finally their intensities are measured and plotted in a mass spectrum. Although it appears to be a simple experimental scheme, a great deal of information can be extracted from a typical mass spectrum. The peaks displayed in mass spectra provide both qualitative (position on the m/z scale) and quantitative (peak height or area) information about the analyte, as seen in Fig. 2-5. However, it has to be pointed out that the peak height is not enough to achieve quantitation without proper standards, as different analytes can have different ionization efficiencies leading to varying intensities, even if they were introduced in equimolar amounts. Because any slight change in the analyte conformation, mass or environment triggers a change in the detected peaks, even a simple MS scan under suitable experimental conditions is powerful enough to analyze non-covalent complexes, for instance. Usually, MS scans are the first step of any MS-based experimental workflow as they provide an overview of the sample’s contents and complexity.

Figure 2-5: The peaks found in MS spectra provide both qualitative and quantitative information about the species being analyzed.


2.3.2.2. MS/MS Analysis

Tandem mass spectrometry (MS/MS) experiments involve at least two stages of mass analysis, in conjunction with a dissociation process. Commonly, two mass analyzers are required for this method: the first one selects an ion of given m/z (chosen by the user), which then undergoes spontaneous or activated fragmentation, leading to product ions and neutral fragments. Activation can be achieved in multiple ways, commonly using collisions with neutral gas molecules42 or a solid material,43 electrons44, 45 or photons.46 The product ions are separated by a second mass analyzer and detected, and the mass spectrum of the fragments is obtained. In the case of the Synapt HDMS system, the selection of the precursor ion m/z is achieved by the quadrupole. Selected ions are then fragmented by energetic collisions with inert gas atoms (argon) in the trap T-wave cell; the fragments are then mass analyzed by the TOF analyzer and detected.

The value of MS/MS lies in its ability to break down a complex molecule into its basic structural features. Some practical applications of MS/MS are structure elucidation (small organic molecules, differentiation of isomers and stereoisomers), selective detection of a target compound class, ion-molecule reaction studies, and protein sequencing.47

2.3.2.3. IMMS

Ion mobility spectrometry is a way of separating ions based on their interactions with an inert buffer gas (usually He or N2) as they fly through a drift tube.48 When coupled with mass spectrometry, this technique is referred to as ion mobility mass spectrometry49 and becomes a powerful analytical tool capable of separating isomers, isobars and conformers, reducing chemical noise and measuring ion sizes (cross sections). The strength of this approach relies on the ability of IMMS to differentiate between isomeric and isobaric analytes, which have the same molecular mass and nominal mass, respectively. IMS can be carried out in four different ways depending on the type of instrument: drift-time IMS, aspiration IMS, field-asymmetric waveform IMS, and traveling wave IMS (TWIMS, which is used in the Synapt HDMS). The mode of operation of TWIMS was described in Section 2.3.1. It is also operated at reduced pressure, which is a unique characteristic compared to the other IM spectrometers. TWIMS exhibits high transmission efficiency and separative power, although the resolution is not as good as in conventional drift time methods.50

IM is extremely efficient when coupled with TOF mass analyzers, as a TOF analyzer can acquire full mass spectra on timescales short enough to profile millisecond-wide ion mobility peaks.51 The data obtained when carrying out an IMMS experiment are presented as a mass-mobility plot (ion mobilogram), representing the mass-to-charge ratio against the drift time (or vice versa). A unique property of IMMS spectra is that in ion mobilograms, regardless of how they are plotted, mass-mobility correlations (commonly called “trend-lines”) are observed for classes of ions. Any statistically relevant deviation of an ion from these trend lines gives information regarding its structural compactness. Another way of presenting the data is to use a drift time ion chromatogram, where the intensity (or ion count) is plotted against the drift time; this representation is used to compare the drift times of all the ions detected. A schematic of the data obtained in IMMS is presented in Fig. 2-6.
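The idea of screening for ions that deviate from a trend line can be sketched in a few lines of code. The Python example below fits a straight line through the (m/z, drift time) pairs of a set of reference ions of the same charge state by ordinary least squares, and then flags candidate ions whose drift time differs from the fitted line by more than a chosen tolerance. It is a simplified illustration and not the data processing used in this thesis: the linear form of the trend line, the tolerance and the function names are assumptions, and real mass-mobility correlations may be curved.

```python
def fit_trend_line(reference_ions):
    """Least-squares line through (mz, drift_time_ms) pairs.

    Returns (slope, intercept). A straight line is an assumed, simplified
    model of a mass-mobility correlation for one charge state.
    """
    n = len(reference_ions)
    mean_mz = sum(mz for mz, _ in reference_ions) / n
    mean_dt = sum(dt for _, dt in reference_ions) / n
    sxx = sum((mz - mean_mz) ** 2 for mz, _ in reference_ions)
    sxy = sum((mz - mean_mz) * (dt - mean_dt) for mz, dt in reference_ions)
    slope = sxy / sxx
    intercept = mean_dt - slope * mean_mz
    return slope, intercept


def flag_deviations(candidate_ions, slope, intercept, tolerance_ms):
    """Return (mz, drift_time_ms, residual_ms) for ions off the trend line."""
    flagged = []
    for mz, drift_ms in candidate_ions:
        residual = drift_ms - (slope * mz + intercept)
        if abs(residual) > tolerance_ms:
            flagged.append((mz, drift_ms, residual))
    return flagged
```

In such a scheme a positive residual marks an ion that drifts more slowly than ions of similar m/z on the trend line, which would be consistent with a more extended structure, whereas a negative residual would indicate a more compact one.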

Figure 2-6: Two common data representations used for IMMS: ion mobilogram or mass-mobility plot (A) and drift time total ion chromatogram (B). The dashed lines represent the mass-mobility correlations and go through all the ions with the same charge state.

IMMS has a very wide range of applications, including inorganic chemistry (metabolic profiling of inorganic ions),52 gas-phase ion structure studies, isomer separation in complex mixtures, and the analysis of saccharides, peptides, proteins, nucleic acids, drugs and metabolites. The importance of IMMS has grown steadily in the field of proteomics due to the rapid and high-resolution separation it provides.53

2.4. Applications of MS to Proteomics

2.4.1. Definition of Proteomics

Characterization of proteins present in a biological system (proteome) provides insight into the function and complexity of that system. The proteome is not only complex, but also spatially, temporally and chemically dynamic. Proteomics is the systematic study of all of the proteins expressed by a genome, cell, tissue or organism at a given time point under defined conditions. The term appeared in print for the first time in 1995.54 Proteomics aims to answer questions such as: What proteins are expressed in a cell, and in which amounts? How do these expression levels relate to function? How do proteins interact with each other? How do PTMs help to regulate the function of proteins?

Two approaches have been developed: global proteomics, in which experiments are designed to characterize as many of the proteins expressed by a genome as possible; and targeted proteomics, for which the number of proteins analyzed is much lower (e.g. only proteins that are phosphorylated). Historically, proteomic experiments were mainly conducted using 2D gel electrophoresis for protein separation, but the progress made in analytical instrumentation has extended the range of techniques used, for example to cell imaging, array and genetic readout experiments. However, MS has grown increasingly important in proteomics as the development of the instrumentation enabled the analysis of complex protein samples,55 and currently most proteomics efforts are MS-based.

As previously explained in Chapter 1, the proteome is much more complex than the genome, which makes proteomics a more challenging field than genomics. Indeed, different cells express different proteins, and the same protein can be expressed by two different cells but not at the same level. Localization of the protein, interactions with other biomolecules, PTMs and conformational changes in protein structures are some of the obstacles that highlight the difficulty of proteomics. In practice, in order to overcome these challenges, proteomics experiments involve a step to reduce the complexity of the representative proteome sample, analysis by MS, and bioinformatics tools for fast data processing.

2.4.2. Protein Identity, Quantity and Structural Features

One aspect of proteomics, usually the first carried out on an unknown system, is protein characterization. Identifying the proteins comprising the proteome of a given cell or organism is a necessary step before conducting more advanced experiments. Two main experimental methods have been developed: the “bottom-up” and “top-down” approaches.

2.4.2.1. Bottom-up Approach

The most common MS-based proteomics technique to study proteins is the bottom-up approach. The workflow is as follows: a sample containing several proteins is isolated and enzymatically digested into peptides; trypsin is commonly used because it cleaves C-terminally to lysine and arginine residues, and the resulting basic C-terminal residues enhance ionization efficiency and detection. The resultant peptide mixture is fractionated using liquid chromatography (LC) or other methods (affinity purification among them). This step allows for enrichment of the sample and reduces the number of peptides entering the mass spectrometer at any given time, thereby increasing sensitivity. As the peptides elute from the chromatography column (typically reverse phase), they are infused directly into the MS and ionized by the ESI source. At that point, a complete scan is taken and the peptide masses are compared against in silico digested protein sequences from databases of fully-sequenced proteins (Mascot, Sequest, etc.).56 If more information is needed for a given peptide, tandem mass spectrometry can be carried out, and the MS/MS data lead to its amino acid sequence.
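The in silico digestion step of this workflow can be illustrated with a short Python sketch. It cleaves a sequence after every lysine (K) or arginine (R) that is not followed by proline (P), a commonly used simplification of trypsin specificity, and computes the monoisotopic mass of each resulting peptide from standard residue masses. The example sequence is hypothetical, missed cleavages and modifications are ignored, and the code is not a substitute for search engines such as Mascot or Sequest.

```python
# Monoisotopic residue masses in Da (standard values for the 20 amino acids).
RESIDUE_MASS_DA = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS_DA = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Split a sequence after K or R, except when the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide in Da."""
    return sum(RESIDUE_MASS_DA[residue] for residue in peptide) + WATER_MASS_DA


if __name__ == "__main__":
    hypothetical_sequence = "ACDKGHRPLMKQQQTGG"  # made-up example sequence
    for peptide in tryptic_digest(hypothetical_sequence):
        print(peptide, round(monoisotopic_mass(peptide), 4))
```

The list of theoretical masses produced in this way is what measured peptide masses are compared against, usually within a mass tolerance that depends on the accuracy of the instrument.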

The bottom-up approach is very popular because of its many advantages: this method is sensitive; samples can be analyzed with a high-throughput as some steps can be automated, and software for data interpretation is available. This approach is fast and reliable and has been applied to many systems: the shotgun method allowed for identification of 42 subunits of the mitochondrial complex I, which represents 95% of the complex.57 It was also used to analyze MCF-7 cell lines expressing the zinc-finger or the -rich domain of retinoblastoma-interacting-zinc-finger protein,58 and identification of a single mutation in Escherichia coli (E. coli) RNA polymerase was achieved following this workflow.59 In the case of PTMs, the bottom-up approach is interesting as it produces peptides that carry the modification and can be distinguished by mass from the unmodified peptides (see Thesis Section 2.4.3.). Despite these many advantages, the efficiency of this technique decreases dramatically as the complexity of the sample increases.

2.4.2.2. The Top-Down Approach

The top-down proteomics approach refers to the analysis of intact proteins, which are not enzymatically digested prior to the MS experiment. Bottom-up and top-down approaches each have specific advantages and drawbacks that have been thoroughly discussed in the past.60, 61 The top-down approach has the advantage of being less time-consuming and of providing the mass of the intact protein; it can consequently highlight modifications, cleavages or isoforms of a given protein. However, analyzing intact proteins is more challenging, as ionization is limited and the detection of high molecular weight compounds is reduced. Oftentimes, the full-length protein is fragmented and the masses of the obtained fragments can be compared against database values.

The top-down approach is very efficient for large proteins or proteins expressed at high levels. It has been applied to the analysis of intact membrane proteins.62 Soft ionization MS preserved the covalent structure of the proteins, which were then fragmented; the sequence of fragments defines the original native covalent state of the protein. Proteins with masses greater than 200 kDa were successfully identified by this technique using electrospray additives, heated vaporization, and separate non-covalent and covalent bond dissociation.63 For the characterization of PTMs, top-down is a more suitable approach, as the fragment ion mass data are much more specific than the masses of the peptides from a protein digest. This was demonstrated by Sze and co-workers in 2002, when they reported the characterization of PTMs within one residue for a 29 kDa protein.64

2.4.3. PTMs and Proteomics

Qualitative MS-based proteomics analyses of PTMs on purified proteins are usually achieved by peptide mapping. The protein is digested with one or more proteolytic enzymes in order to cover as much of its sequence as possible. Protein modifications are then identified by correlation of the measured masses and sequence information derived from MS/MS data. Peptides carrying a modification display a shift in mass corresponding to the mass of the attached moiety, and are usually easily identifiable. This can be carried out manually or with one of a number of available software packages, including Mascot and Sequest.65 This untargeted approach is commonly used because a single protein can be multiply modified, and it is generally faster to screen for all PTMs instead of targeting them one by one.66 The detection of modifications on histones, proteins that are heavily and variably modified, is an example of successful PTM analysis by proteomics.67

In addition to this general method, several techniques for targeted identification of modifications have been developed.68 The experimental design for large-scale analysis of PTMs always involves some sort of sample enrichment, usually tandem affinity purification.69 For instance, enrichment of phosphopeptides using titanium dioxide (TiO2) columns has benefited the analysis of phosphorylation in proteins. Phosphorylation is one of the modifications associated with biological regulation, as it is involved in many mechanisms.70, 71, 72

Small stable modifications such as methylation and acetylation are relatively easy to detect and characterize by MS, given that they change the molecular weight of the protein by only a few tens of daltons at most (about 14 Da for methylation and 42 Da for acetylation). However, some PTMs add much larger moieties to the substrate. Ubiquitin (Ub) is a protein that links to the ε-amino group of a lysine residue and adds 8.5 kDa to the protein. The Small Ubiquitin-Like Modifier (SUMO) is an 11 kDa protein that modifies its substrate in the same way as ubiquitin. The great diversity of these PTMs requires equally diverse analytical methods for their analysis.


Chapter 3: Characterization of SUMOylated proteins by MS

3.1. Ubiquitin and SUMO

3.1.1. Structure and Functions of Ub

Ubiquitin is a regulatory protein found in almost all tissues of eukaryotes. Interestingly, ubiquitin is one of the most highly conserved proteins among eukaryotic organisms, but it is totally absent from the superkingdoms Eubacteria and Archaea.13 Ubiquitin is a 76 amino acid protein with a molecular weight of 8564.84 Da. Human Ub was isolated and sequenced by Edman degradation for the first time in 1975.73 The protein is now very well known and fully characterized; ubiquitination is also one of the first PTMs to have been identified.74

Substrate proteins are covalently modified by Ub through an enzyme cascade requiring the concerted action of three enzymes referred to as E1, E2 and E3.75 Firstly, E1 (ubiquitin-activating enzyme) activates Ub through an ATP-dependent process by producing a Ub-adenylate intermediate. Ub is then transferred to the E1 active site residue with loss of adenosine monophosphate (AMP) and formation of a thioester bond between the Ub C-terminus and the E1 cysteine sulfhydryl group. Through a trans(thio)esterification reaction, Ub is transferred to E2 (ubiquitin-conjugating enzyme). Finally, with the assistance of one of hundreds of E3 enzymes (ubiquitin ligases), the last step of the process results in the formation of an isopeptide bond between the C-terminus of Ub and the ε-amino group of the target lysine (Fig. 3-1).76

[Scheme: the carboxyl group of the C-terminal di-glycine and the amino group (NH2) of the target lysine side chain condense, with loss of H2O, to form the isopeptide bond.]

Figure 3-1: Scheme of the isopeptide bond formation during Ub or SUMO conjugation to a target lysine.

One specific feature of this modification is that Ub can modify itself and form chain or branched structures of various lengths and linkages, providing a versatile means of cellular regulation. Interestingly, the various Ub branched and chain structures have different effects on the target protein; thus this single modification displays various functions. It has been shown that Ub is involved in the mechanisms regulating cell development, growth, apoptosis and signal transduction;77 however, it is mainly known for triggering degradation of proteins by the 26S proteasome through ATP-dependent attachment of a specific polymeric chain of Ub.78, 79

3.1.2. An Ubiquitin-Like Modifier: SUMO

While the Ub sequence is highly conserved phylogenetically, a number of proteins sharing a similar fold with Ub (but very little sequence homology) differ considerably from one organism to another. Despite their low sequence identity to Ub, these ubiquitin-like modifiers (Ubls) share, along with a similar fold, some general features, such as the chemistry of modification and the enzyme cascade leading to reversible and covalent binding. The currently known Ubls include: interferon-stimulated gene 15 (ISG15); neural precursor cell expressed protein, developmentally down-regulated 8 (NEDD8); autophagy-related protein 8 (ATG8); autophagy-related protein 12 (ATG12); fau and its ubiquitin-like domain (FUBI); ubiquitin-related modifier 1 (URM1); ubiquitin-fold modifier 1 (UFM1); histone mono-ubiquitination protein 1 (HUB1); ubiquitin cross-reactive protein (UCRP); and small ubiquitin-like modifier (SUMO).80 These proteins all display the diglycine motif at their C-terminus (in the mature form), a necessary feature for covalent binding to the substrate proteins.

One Ubl in particular, the small ubiquitin-like modifier, an approximately 11 kDa protein, has been shown to covalently modify a large number of proteins to regulate many cellular processes including gene expression, chromatin structure, signal transduction and maintenance of the genome.81, 82, 83 Contrary to what the name suggests, SUMO shares only about 18% sequence homology with Ub, but both proteins share a β-GRASP fold, as illustrated in Fig. 3-2 (adapted from literature):84 the same secondary motifs are found in both structures, although SUMO possesses an N-terminal extension absent from the Ub protein. The core of both proteins is formed by four β-sheets around an α-helix. A small helical structure is also present at the periphery of the proteins, with a high solvent exposure.

Figure 3-2: Human SUMO-1 (left, PDB: 1AR5) and Ub (right, PDB: 1UBQ) share a similar fold: they both display the same secondary motifs. SUMO-1 has a flexible N-terminal extension absent in the Ub structure.

The E1 and E2 enzymes involved in SUMOylation are closely related to those involved in ubiquitination. However, some differences have been identified between the two mechanisms: for example, while many E2 enzymes have been identified for the Ub process, only one conjugating enzyme, UbcH9, is currently known for SUMO. Moreover, E3 is crucial for the specificity of Ub attachment, but does not even appear to be necessary for SUMOylation to occur properly. The chemistry of modification and isopeptide bond formation is the same for both modifiers. Because the mechanisms of SUMOylation and ubiquitination are so similar, and because both modifications have been shown to target the same lysine residues in certain proteins, it has been hypothesized that SUMO and Ub act as antagonists.84 For instance, SUMOylation stabilizes the nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, alpha (IκBα), while ubiquitination leads to its degradation.

SUMOylation occurs on a consensus acceptor site consisting of the sequence ψKxE/D, where ψ is a large hydrophobic amino acid, x can be any amino acid residue, and K is the modified lysine.85 It should be noted that this sequence is not an absolute rule, as it has been reported that about 40% of proteins can be SUMOylated outside of this consensus sequence, demonstrating differences in substrate specificity.12 Other studies have identified two alternative consensus sequences for SUMOylation: one includes a phosphorylation site (ψKxExxpSP, where pS is a phosphoserine)86 and the other is a negatively charged amino acid-dependent extended motif that enhances SUMOylation.87
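The consensus acceptor site described above (a large hydrophobic residue, the modified lysine, any residue, then E or D) can be searched for with a regular expression. The following is only a minimal sketch: the set of residues counted as "large hydrophobic" (I, L, V, M, F) is an assumption made here, the function name is illustrative, and the test sequence is hypothetical. As noted in the text, a match indicates a candidate site only, since many SUMOylation sites fall outside the consensus.

```python
import re

# Assumption: the "large hydrophobic" position is taken here as I, L, V, M or F.
PSI = "ILVMF"
# A lookahead is used so that overlapping motifs are all reported.
CONSENSUS = re.compile(rf"(?=([{PSI}]K[A-Z][ED]))")

def find_consensus_lysines(sequence):
    """Return (1-based lysine position, 4-residue motif) for each consensus match."""
    return [(m.start() + 2, m.group(1)) for m in CONSENSUS.finditer(sequence)]

# Hypothetical sequence, for illustration only.
candidates = find_consensus_lysines("MAGLKSEDAKPEVKTE")
print(candidates)  # [(5, 'LKSE'), (14, 'VKTE')]
```

The lysine at position 10 of the hypothetical sequence is not reported because it is preceded by alanine, which is not in the assumed hydrophobic set.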

Four SUMO isoforms have been identified in humans: SUMO-1, -2, -3 and -4. Yeast cells express a simple SUMO analog, Smt3p, which was widely used in the early stages of SUMO research because it is easily expressed in large amounts. Moreover, the primary sequence of Smt3p allows for straightforward determination of modified peptides by MS (see Thesis section 3.2.2. for more details). SUMO-2 and SUMO-3 are closely related, sharing 95% sequence homology, but they share only 50% sequence identity with SUMO-1.88 SUMO-4 has a restricted expression pattern, with the highest levels found in the kidneys.89 Some substrates can be modified by both SUMO-1 and SUMO-2/3, while Ran GTPase-activating protein 1 (RanGAP1), for instance, the first substrate ever found for SUMOylation,90 is almost exclusively modified by SUMO-1. Topoisomerase II is predominantly a substrate for SUMO-2/3. As mentioned previously, the function of Ub depends on the chain or branched structures formed on the substrate protein. This feature is found to a lesser extent with SUMO: SUMO-2 and -3 can covalently bind to themselves via the lysine residue located in a consensus sequence near the N-terminus and form polymeric chains (not branched). SUMO-1 does not have a consensus acceptor site, though it has been reported to form polymers in vitro. SUMO-1 polymerization has not yet been demonstrated in vivo, leading to the idea that SUMO-1 acts as a chain terminator.12 The significance of SUMO polymer chains remains to be investigated for further understanding of the physiological functions of SUMOylation.

SUMO has become a protein of growing interest because of its implication in many cellular (especially nuclear) mechanisms. Data obtained after deregulation of the SUMOylation process suggest that diseases such as cancer, pathogenic infections and neurodegenerative disorders could be linked to malfunction of the SUMO machinery. It is also important to understand SUMOylation in order to generate reliable cell models to predict the toxicity of drugs in humans.91 Several approaches have been reported in the past to characterize SUMO-modified proteins.

3.2. Existing Methods for the Detection of Ubls by MS

3.2.1. Methods for Ubiquitin

Historically, ubiquitination was among the first PTMs to be reported,74 and since then its identification has become a routine procedure, especially with the development of proteomics experiments involving MS. Because some MS-based methods to characterize SUMOylated proteins are based on the principles used for Ub identification, an overview of the technique is provided.

The early stages of identification and characterization of Ub substrates were difficult due to low steady-state levels and the presence of a varying number of Ub molecules attached to different molecules of the same species. The first Ub identification method was reported in 1985 by Haas and Bright;92 they conducted sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) of whole cell extracts followed by immunoblot analysis with anti-Ub antibody. They detected a complex profile of poorly resolved high molecular mass proteins together with a few well-resolved low molecular mass proteins. This workflow was optimized and applied in 1992 by Beers and co-workers.93 Because Ub-modified proteins are present in low amounts in the sample and are therefore detected inefficiently, methods involving specific affinity purification were developed; enrichment of the sample in ubiquitinated proteins allowed for detection and identification of multiple Ub substrates. Oftentimes, studies of protein ubiquitination involve three major steps: affinity purification, proteolytic digestion and tandem MS. Proteins are expressed with an epitope tag genetically attached to the N-terminus of ubiquitin, which enables post-expression affinity purification.94 Peng et al. reported a study in which Ub conjugates were identified by expression of His6-tagged proteins, Ni-affinity capture, trypsin digestion, chromatographic separation and MS/MS analysis, leading to the discovery of 1075 Ub candidate substrates.95

Ubiquitination adds 8.5 kDa to the mass of a protein, making its direct analysis by MS challenging: the large size of the modification makes ionization of the substrate less efficient; increases the complexity of the MS/MS spectra; and prevents the use of bioinformatics tools and databases to process the data obtained. Thus, the use of enzymatic digestion arose, making the MS analysis more straightforward. Most frequently, trypsin is used: it cleaves the peptide bond C-terminal to lysine and arginine residues, unless they are followed by a proline. The advantage of using this protease is that the mature ubiquitin sequence terminates with an arginine residue followed by two glycines. Thus, after trypsin digestion, the modified lysine residue carries a diglycine tag (+114 Da). While the unmodified peptides display a linear structure, the Ub-modified ones carry a tag, making them branched. Because the amide bond between the substrate lysine side chain and the Ub C-terminal glycine is not part of the peptide backbone, they are referred to as isopeptides. The mass shift caused by the -GG tag is easily identifiable by MS or MS/MS and can be used as a diagnostic tool for the identification of modified peptides. Because the diglycine tag has a small molecular weight, the strength of this technique is the simplicity of the analysis once the tag is obtained by digestion.

Due to the success of this method, the idea of applying it to the Ubls has arisen. However, the primary sequence of many Ubls is such that the proteolytic cleavage site is located several residues from the C-terminus, making the tag relatively long. Figure 3-3 presents the sequence alignment of the C-terminus of the four SUMO isoforms; human Ub; the yeast analog of SUMO, Smt3p; and two other Ubls: UCRP and NEDD8. With the exception of UCRP and NEDD8, the tags resulting from trypsinolysis (shown in bold font) are much longer than that of Ub. MS/MS data collected for isopeptides from the human SUMO isoforms would mostly provide sequence information for the tag, which would not enable mapping of the modified lysine or identification of the substrate protein. For this reason, a number of other approaches have been developed.

SUMO-2_HUMAN   QIRFRFDGQP INETDTPAQL EMEDEDTIDV FQQQTGG
SUMO-3_HUMAN   QIRFRFDGQP INETDTPAQL EMEDEDTIDV FQQQTGG
SUMO-4_HUMAN   QIRFRFGGQP ISGTDKPAQL EMEDEDTIDV FQQPTGG
SUMO-1_HUMAN   SLRFLFEGQR IADNHTPKEL GMEEEDVIEV YQEQTGG
SMT3_YEAST     SLRFLYDGIR IQADQTPEDL DMEDNDIIEA HREQIGG
UBIQ_HUMAN     QQRLIFAGKQ LEDGRTLSDY NIQKESTLHL VLRLRGG
UCRP_HUMAN     LFWLTFEGKP LEDQLPLGEY GLKPLSTVFM NLRLRGG
NEDD8_HUMAN    QQRLIYSGKQ MNDEKTAADY KILGGSVLHL VLALRGG

Figure 3-3: Sequence alignment of the C-terminal region of Ub and several Ubls showing the low conservation of primary sequence. Tags obtained on modified peptides after trypsin digestion are shown in bold.

As mentioned previously, Ub is highly conserved across species, making the method described above applicable to most organisms. However, Ubls have little homology with Ub, so their C-terminal residues do not necessarily give a -GG tag when cleaved with trypsin. This is true in particular for the SUMO isoforms, which yield a 19-residue (SUMO-1) or 32-residue (all others) tag. The MS-based methods currently used for analysis of Ubl-modified proteins include the study of the SUMO yeast analog; creation of mutated SUMO proteins; high-end MS with FT-ICR instruments; the use of specifically designed software; and a tailored-proteolysis approach.
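The tag lengths quoted above can be checked directly from the C-terminal sequences of Figure 3-3. The sketch below is illustrative only: it applies the simple trypsin rule stated earlier (cleavage after K or R unless followed by P) to the 37-residue regions shown in the figure, and the function name is hypothetical. It ignores missed cleavages and any cleavage site lying outside the region shown.

```python
def tryptic_c_terminal_tag(sequence):
    """Return the C-terminal peptide left after the last trypsin site (K/R not before P)."""
    for i in range(len(sequence) - 2, -1, -1):
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            return sequence[i + 1:]
    return sequence  # no cleavage site within the supplied region

# C-terminal regions as listed in Figure 3-3 (spaces removed).
C_TERMINI = {
    "SUMO-1": "SLRFLFEGQRIADNHTPKELGMEEEDVIEVYQEQTGG",
    "SUMO-2": "QIRFRFDGQPINETDTPAQLEMEDEDTIDVFQQQTGG",
    "SUMO-3": "QIRFRFDGQPINETDTPAQLEMEDEDTIDVFQQQTGG",
    "SUMO-4": "QIRFRFGGQPISGTDKPAQLEMEDEDTIDVFQQPTGG",
    "Smt3p":  "SLRFLYDGIRIQADQTPEDLDMEDNDIIEAHREQIGG",
    "Ub":     "QQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG",
    "UCRP":   "LFWLTFEGKPLEDQLPLGEYGLKPLSTVFMNLRLRGG",
    "NEDD8":  "QQRLIYSGKQMNDEKTAADYKILGGSVLHLVLALRGG",
}

tag_lengths = {name: len(tryptic_c_terminal_tag(seq)) for name, seq in C_TERMINI.items()}
print(tag_lengths)
```

Under these assumptions the sketch gives 19 residues for SUMO-1, 32 for SUMO-2, -3 and -4 (the lysine in SUMO-4 is followed by proline and is therefore skipped), 5 for Smt3p, and 2 for Ub, UCRP and NEDD8.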

3.2.2. Yeast Analog

The SUMO analog protein in Saccharomyces cerevisiae (S. cerevisiae) possesses an arginine residue at the sixth position from the C-terminus. Proteolytic digestion with trypsin consequently leads to a five-residue tag (EQIGG), adding 484.2 Da to the mass of the substrate peptide.
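The mass added by a tag can be estimated as the sum of the monoisotopic residue masses of its amino acids, since the water gained on hydrolysis of the tag is lost again on formation of the isopeptide bond. The short sketch below, with an illustrative function name and standard monoisotopic residue masses, reproduces the 484.2 Da quoted for EQIGG and the roughly 114 Da of the diglycine tag left by Ub.

```python
# Standard monoisotopic residue masses (Da) for the residues needed here.
RESIDUE_MASS = {"G": 57.02146, "E": 129.04259, "Q": 128.05858, "I": 113.08406}

def tag_mass_shift(tag):
    """Mass added to a substrate peptide by an isopeptide-linked tag (sum of residue masses)."""
    return sum(RESIDUE_MASS[aa] for aa in tag)

print(round(tag_mass_shift("EQIGG"), 1))  # 484.2
print(round(tag_mass_shift("GG"), 2))     # 114.04
```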

The general procedure for analysis of SUMOylated substrates in S. cerevisiae follows the same key steps as for Ub. The proteins are expressed with both His6 and FLAG tags, and are purified by tandem affinity chromatography, with a nickel column followed by a FLAG-affinity purification.96 The tandem purification has the advantage of dramatically reducing the time spent sequencing non-SUMOylated proteins with MS/MS; the time saved can be redirected to analyzing the peptides of interest. Furthermore, using two columns increases the confidence that the proteins isolated are actual targets of the SUMO pathway. The purified and pre-concentrated proteins are then digested with trypsin, leading to two kinds of peptides: linear peptides (from SUMO and from the target protein), and isopeptides containing the modified lysine residue and the SUMO tag. Modified peptides feature a missed cleavage site at the modified lysine and an increased mass due to the tag. Database searching finally enables rapid interpretation of the tandem MS data obtained.97 In some studies, mutagenesis was used to obtain a -GG tag and apply the same method as for Ub.98

Yeast is an interesting model as it allows for straightforward genetic and biochemical studies. Despite this important feature, the relevance of this method is questionable when the goal is a better understanding of the SUMOylation of endogenous human proteins. Characterizing proteins modified by Smt3 can point to some interesting leads but is not sufficient by itself for the complete understanding of human SUMO physiological targets and specific functions.

3.2.3. Mutagenesis

Site-directed mutagenesis has been used to help reduce the size of the tag carried by SUMO-modified peptides: by mutating a residue close to the C-terminus to an arginine or a lysine, a trypsin cleavage site is created and the tag is reduced to only a few amino acids. In 2005 Knuesel and co-workers reported a SUMO-1 mutant yielding a diglycine tag after trypsinolysis.99 They observed that Ub and two other Ubls possess a trypsin cleavage site (arginine) at the third residue away from the C-terminus; however, SUMO-1, -2 and -3 have a threonine residue at the third position, preventing trypsin from cleaving at this site. To mimic Ub, the mutation T95R was implemented. As Ub and SUMO are so similar in their activity, it was assumed that the mutation would not induce any changes in reactivity. This hypothesis was confirmed by their experiments. In 2010, Blomster et al. reused the T95R SUMO-1 idea and added many other mutations (C52S, H75K, V87K, V90C, and Q92C) on the C-terminus of SUMO-1 in order to purify the protein on cysteine-affinity columns.100 Using this short-tag mutant in combination with MS/MS, they were able to identify the SUMOylation sites in twelve human proteins.

Although the T95R mutation provides a trypsin cleavage site that allows for a short tag after trypsin digestion, threonine and arginine do not share many similarities in terms of structure and polarity. Based on this observation, some groups have moved the mutation to an amino acid with properties closer to those of arginine. The sixth residue away from the C-terminus of all human SUMO isoforms is a glutamine. Mutating it to an arginine residue induces little change in the structure of the side chain, conserving the reactivity and functions of SUMO.101 The five amino acid tag obtained after trypsin digestion results in fewer fragment ions from the isopeptide tags and facilitates data interpretation with common databases and bioinformatics tools. Despite the advantages of using the mutagenic approach to SUMO analysis, the main issue with such procedures is the amount of time and effort needed to complete them.

3.2.4. Use of High-end MS Instrumentation

Digestion with trypsin produces extremely large isopeptides; the tag itself for SUMO-2, -3 and -4 consists of up to 32 amino acid residues. Two main issues arise with the MS analysis of such SUMOylated peptides. The first is that peaks are obtained from both the long SUMO tag and the substrate peptide, significantly increasing the complexity of the tandem mass spectra. Moreover, because the tag is oftentimes much larger than the substrate peptide, most of the fragments obtained originate from the SUMO isopeptide tag and not the substrate peptide. This is a key issue, as the sequence information of the substrate peptide is crucial for mapping the modified lysine and for identification of the modified protein. In order to solve these issues, high-energy fragmentation is required to increase the number of fragments and allow for detection of substrate peaks. However, even common instruments such as Q-TOF mass spectrometers, capable of high-energy fragmentation, do not produce enough fragments to obtain the information needed.

A solution to this challenge is to use high-resolution and high-sensitivity MS instruments such as the FT-ICR; despite the high cost, these mass spectrometers currently have the best performance in terms of limit of detection, sensitivity and resolving power. The use of non-ergodic (fast heating) fragmentation methods, especially electron capture dissociation (ECD) and electron transfer dissociation (ETD), allows for direct bond cleavage and dramatically increased fragmentation efficiency. Coupling FT-ICR instruments with ETD/ECD fragmentation mechanisms can provide substrate peptide information even when the peptide carries a long SUMO tag. The use of FT-ICR MS for the detection of ubiquitination sites was reported in 2004 by Cooper et al.102 They extended this approach one year later to SUMOylation,103 achieving measurement of a SUMO-1:RanGAP1(418-587) intact conjugate with a mass accuracy of 2.7 ppm. Moreover, the digestion with trypsin of in vitro modified wild type proteins led to the mapping of the substrate lysine. The efficiency of the FT-ICR also allowed for the characterization of the number of SUMO chains when the protein was poly-SUMOylated, which is of importance for understanding how SUMOylation regulates the mechanisms in which substrates are involved. This methodology was also successfully applied to the identification of the modification sites of the human centromere protein (CENP-C).104

Despite the quality of the data that can be obtained by FT-ICR MS, the instrumentation required is fairly expensive and not commonly found in many laboratories. Consequently, this is not a good method for routine analysis with commonly available MS instruments.

3.2.5. Software Tools

For the analysis of small PTMs, such as tyrosine phosphorylation or lysine methylation, the data collected with an LC/MS/MS method can be processed with the help of a database. Standard database searching algorithms compare the experimental MS/MS spectra to theoretical spectra generated in silico and identify any resulting mass matches. Thus, based on the m/z and pattern of the obtained peaks, the primary sequence of the peptide (modification included) can be identified. However, database searching is only applicable to small modifications that do not fragment during analysis and that alter one of the peptide fragment ions by a specific, indivisible mass. In the case of SUMOylation, fragmentation within the isopeptide tag produces peaks in the mass spectrum that obscure the pattern of the modified target peptide. Because several ion series overlap in the same spectrum, database searching is inefficient for SUMOylation data processing.

Some research groups have developed an automated approach for SUMO analysis using software to process MS/MS data. SUMmOn, developed by Pedrioli and co-workers, is an automated pattern recognition tool that detects diagnostic PTM fragment ion series within complex collision-induced dissociation (CID) mass spectra to identify modified peptides and modification sites within those peptides.105 SUMmOn is based on an algorithm that extracts intensities for any user-defined b- and y-ion series from every MS/MS scan in a specific analysis. Because SUMO is attached by its C-terminus, the y-ion series generated by the modification is dependent on the mass of the substrate peptide. Consequently, the singly charged yn ion series must be recalculated for each MS/MS scan by subtracting the masses of the singly charged bn (independent) fragments from the singly charged precursor ion mass, and adding the mass of one hydrogen atom. At the end of the procedure, SUMmOn calculates two scores: one for the modification and one for the target peptide. The use of this software in combination with alternative proteases (LysC among them) preserves the identity of the original Ub and Ubl conjugate.
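The recalculation of the y-ion series from the b-ion series and the precursor mass can be illustrated on a simple linear peptide. This is a sketch of the underlying mass relationship only, not of the SUMmOn code: the function names are illustrative, the peptide is the SUMO-2/3 tag sequence treated as a free linear peptide, and the proton mass is used where the text speaks of one hydrogen atom (the two differ by the electron mass, about 0.0005 Da).

```python
PROTON = 1.00728   # Da
WATER = 18.01056   # Da
RESIDUE_MASS = {"G": 57.02146, "T": 101.04768, "Q": 128.05858}  # monoisotopic, Da

def b_ions(seq):
    """Singly charged b1..b(n-1) m/z values of a linear peptide."""
    total, out = 0.0, []
    for aa in seq[:-1]:
        total += RESIDUE_MASS[aa]
        out.append(total + PROTON)
    return out

def y_ions(seq):
    """Singly charged y1..y(n-1) m/z values, computed directly from the sequence."""
    total, out = 0.0, []
    for aa in reversed(seq[1:]):
        total += RESIDUE_MASS[aa]
        out.append(total + WATER + PROTON)
    return out

def complementary_y(precursor_mh, b_mz):
    """y ion complementary to a b ion: singly charged precursor minus b, plus one proton."""
    return precursor_mh - b_mz + PROTON

seq = "QQQTGG"
precursor_mh = sum(RESIDUE_MASS[aa] for aa in seq) + WATER + PROTON
recalculated = [complementary_y(precursor_mh, b) for b in b_ions(seq)]
# b1 pairs with y5, b2 with y4, and so on, so the direct series is reversed for comparison.
direct = list(reversed(y_ions(seq)))
```

In a branched isopeptide the same relation holds, but the precursor mass then includes the substrate peptide, which is why the y-type series of the tag has to be recomputed for every scan.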

Indeed, if a target is modified by Ub or by one of the two other Ubls that produce a -GG tag after trypsinolysis, the identity of the modifier is lost after digestion. However, using LysC digestion in combination with SUMmOn analysis allows for comprehensive identification of Ub/Ubl-modified peptides. Furthermore, Jeram et al. identified putative NEDDylation sites, and previously unpublished SUMO and NEDD8 chain topologies.106 A few years after the development of SUMmOn, “ChopNSpice”, a simple and straightforward database tool, was presented.107 The software relies on the idea that MS/MS fragmentation of branched peptides is similar to the fragmentation of a linear peptide with a miscleaved lysine residue and the SUMO peptide at its N-terminus. However, databases do not include SUMO as a putative modification at lysine residues. To address this problem, ChopNSpice automatically generates SUMO-modified FASTA sequences of proteins in silico. These sequences are then used in a database search (SEQUEST, MASCOT, etc.) to identify acceptor sites for SUMO conjugation.
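The idea of representing a branched isopeptide as a linear sequence can be sketched as follows. This is not the ChopNSpice implementation; it is a minimal illustration, under the assumptions stated in the comments, of how a linear surrogate could be built for each lysine: the tryptic peptide spanning the lysine as a missed cleavage, with the SUMO tag placed at its N-terminus. The function names, the protein sequence and the choice of tag are hypothetical.

```python
def cleavage_ends(protein):
    """Indices after which trypsin cleaves (K/R not followed by P), plus the C-terminus."""
    ends = [i for i in range(len(protein) - 1)
            if protein[i] in "KR" and protein[i + 1] != "P"]
    return ends + [len(protein) - 1]

def sumo_surrogates(protein, tag):
    """For each lysine, return (1-based position, tag + peptide with that lysine miscleaved)."""
    ends = cleavage_ends(protein)
    surrogates = []
    for k, aa in enumerate(protein):
        if aa != "K":
            continue
        # Peptide runs from just after the previous cleavage site ...
        start = max((e + 1 for e in ends if e < k), default=0)
        # ... to the next cleavage site beyond the (assumed uncleaved) modified lysine.
        stop = min((e for e in ends if e > k), default=k)
        surrogates.append((k + 1, tag + protein[start:stop + 1]))
    return surrogates

# Hypothetical protein sequence and tag, for illustration only.
entries = sumo_surrogates("MAGLKSEDARPEVKTEFR", "QQQTGG")
for position, sequence in entries:
    print(position, sequence)
```

Each surrogate sequence could then be written out as a FASTA entry and searched with a standard engine; how the real tool handles missed cleavages, multiple proteases and sequence annotation is not covered by this sketch.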

This method is limited to fairly simple samples. Once the complexity of a sample increases, the SUMmOn database increases as well, leading to a higher number of eligible target peptides. To overcome this issue, the instrument must possess excellent mass resolution and a high duty cycle in order to avoid peak crowding. The development of software tools for PTM identification is of great value for data interpretation but does not bring anything new to protein preparation or analysis by MS.

3.2.6. Griffith/Cotter: Tailored-Proteolysis Approach

The Griffith/Cotter method is based on a dual enzyme digestion to reduce the size of the isopeptide tag, and takes advantage of the data-dependent MS/MS acquisition mode to provide an accurate and fast analysis.108 Data-dependent MS/MS is an automated procedure that consists of fragmenting the three peaks with the highest intensity and collecting their MS/MS spectra. Once this step is done, the fragmented parent ions are excluded from the list and the operation is repeated on the next three most intense ions, and so on, until all the ions have been fragmented. The technique relies on the use of chymotrypsin, an enzyme that cleaves proteins C-terminal to large hydrophobic amino acid residues (tyrosine, tryptophan, phenylalanine, and to a lesser extent leucine). In the case of SUMO-modified proteins, chymotrypsin is of high interest as it yields a six amino acid tag for all human SUMO isoforms (italicized in Fig. 3-4). Because the tag is much smaller, the modified peptides are more likely to produce quality MS and MS/MS data. Moreover, because mutagenesis is unnecessary to produce a short tag with this approach, endogenous proteins can be used and the relevance of the analysis is not in question.
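The data-dependent selection described above (fragment the three most intense ions, exclude them, repeat) can be sketched as a simple loop. This is a simplified illustration: it works on a single static peak list, whereas a real instrument acquires a fresh survey scan in each cycle and applies a timed exclusion window. The function name and the peak list are hypothetical.

```python
def data_dependent_schedule(peaks, top_n=3):
    """Group precursors into successive MS/MS rounds of the top_n most intense ions.

    peaks maps precursor m/z to intensity; selected ions are excluded from later rounds.
    """
    remaining = dict(peaks)
    rounds = []
    while remaining:
        selected = sorted(remaining, key=remaining.get, reverse=True)[:top_n]
        rounds.append(selected)
        for mz in selected:
            del remaining[mz]  # exclusion of already fragmented parent ions
    return rounds

# Hypothetical survey scan: precursor m/z -> intensity (arbitrary units).
survey = {500.3: 90, 612.8: 40, 433.2: 75, 701.4: 10, 388.9: 55, 845.1: 20, 920.6: 5}
print(data_dependent_schedule(survey))
# [[500.3, 433.2, 388.9], [612.8, 845.1, 701.4], [920.6]]
```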

SUMO-2_HUMAN   QIRFRFDGQP INETDTPAQL EMEDEDTIDV FQQQTGG
SUMO-3_HUMAN   QIRFRFDGQP INETDTPAQL EMEDEDTIDV FQQQTGG
SUMO-4_HUMAN   QIRFRFGGQP ISGTDKPAQL EMEDEDTIDV FQQPTGG
SUMO-1_HUMAN   SLRFLFEGQR IADNHTPKEL GMEEEDVIEV YQEQTGG

Figure 3-4: Sequence alignment of the human SUMO C-terminal regions. The tags obtained by digestion with trypsin and chymotrypsin are shown in bold and italic, respectively.
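The six-residue chymotryptic tags can be derived from the sequences in Figure 3-4 with a few lines of code. The sketch below is illustrative: it considers only cleavage C-terminal to tyrosine, tryptophan and phenylalanine (the weaker leucine sites are ignored as a simplifying assumption), and the function name is hypothetical.

```python
def chymotryptic_c_terminal_tag(sequence, sites="YWF"):
    """Return the C-terminal peptide left after the last chymotrypsin site (Y, W or F)."""
    for i in range(len(sequence) - 2, -1, -1):
        if sequence[i] in sites:
            return sequence[i + 1:]
    return sequence  # no cleavage site within the supplied region

# C-terminal regions as listed in Figure 3-4 (spaces removed).
SUMO_C_TERMINI = {
    "SUMO-1": "SLRFLFEGQRIADNHTPKELGMEEEDVIEVYQEQTGG",
    "SUMO-2": "QIRFRFDGQPINETDTPAQLEMEDEDTIDVFQQQTGG",
    "SUMO-3": "QIRFRFDGQPINETDTPAQLEMEDEDTIDVFQQQTGG",
    "SUMO-4": "QIRFRFGGQPISGTDKPAQLEMEDEDTIDVFQQPTGG",
}

chymo_tags = {name: chymotryptic_c_terminal_tag(seq) for name, seq in SUMO_C_TERMINI.items()}
print(chymo_tags)
```

Under these assumptions the tags are QEQTGG for SUMO-1, QQQTGG for SUMO-2 and SUMO-3, and QQPTGG for SUMO-4, each six residues long.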

The workflow of the Griffith/Cotter method is as follows: the sample containing several proteins is dually digested with chymotrypsin (for 3 or 4 hours), followed by trypsin (overnight). The resulting peptide mixture is separated by LC connected to an ESI-MS instrument. A survey scan is carried out on the eluted peptides, which are then fragmented in a data-dependent fashion. One of the features of this technique is the ability to extract and monitor given masses for the daughter ions. This is of interest because, when fragmented, the tag on modified peptides produces ions that are specific to the SUMO isopeptide tag. Extracted ion chromatograms from fragment ions resulting from bond cleavage within this QQQTGG- tag can be used to screen LC/MS/MS data for possible SUMO isopeptides. These ions, named b2’, b3’ and b4’, as shown in Fig. 3-5, are produced by fragmentation within the SUMO isopeptide tag. Interestingly, it has been reported that N-terminal glutamine residues can undergo an internal rearrangement leading to the neutral loss of ammonia (Fig. 3-6).109 Because all SUMO tags feature an N-terminal Q residue, simultaneous screening for b2’ − 17 (m/z 240.10), b3’ − 17 (m/z 368.15) and b4’ − 17 (m/z 469.19) can selectively point to SUMO substrate candidates. These ions are found in any SUMO-2 or -3 isopeptide regardless of the charge or extent of the proteolytic digest. Consequently, when all three diagnostic mass tag ions are present at the same elution time, there is a high likelihood that the peptide being analyzed is modified by SUMO.
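The diagnostic ions can be computed from the tag sequence, and a single MS/MS scan can be screened for the simultaneous presence of all three. The sketch below is illustrative rather than the procedure used in this work: the function names, the matching tolerance and the example fragment lists are assumptions. With standard monoisotopic masses the computed values agree with the m/z values quoted above to within about 0.02.

```python
PROTON = 1.00728    # Da
AMMONIA = 17.02655  # Da, neutral loss from the N-terminal glutamine
RESIDUE_MASS = {"Q": 128.05858, "T": 101.04768}  # monoisotopic, Da

def diagnostic_ions(tag="QQQTGG", sizes=(2, 3, 4)):
    """m/z of the singly charged b2', b3' and b4' tag ions after loss of NH3."""
    return [sum(RESIDUE_MASS[aa] for aa in tag[:n]) + PROTON - AMMONIA for n in sizes]

def has_all_diagnostic_ions(fragment_mzs, targets, tol=0.02):
    """True if every target m/z is matched by at least one fragment within tol."""
    return all(any(abs(mz - t) <= tol for mz in fragment_mzs) for t in targets)

targets = diagnostic_ions()
print([round(t, 2) for t in targets])

# Hypothetical fragment lists from two MS/MS scans.
print(has_all_diagnostic_ions([175.12, 240.10, 368.15, 469.19], targets))  # True
print(has_all_diagnostic_ions([175.12, 240.10, 368.15], targets))          # False
```

Applying the check to every MS/MS scan and recording the retention times of the positive scans gives a simple equivalent of overlaying the three extracted ion chromatograms.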

Figure 3-5: Schematic of an isopeptide carrying the six-residue SUMO tag obtained by dual digestion with trypsin and chymotrypsin. The fragment ion series obtained by CID are labeled using the Roepstorff and Fohlman nomenclature. The prime refers to fragments produced by the SUMO tag and not the peptide itself.

[Scheme: N-terminal glutamine cyclizes to pyroglutamate with loss of NH3 (17 Da).]

Figure 3-6: Scheme of the N-terminal glutamine rearrangement into pyroglutamate with neutral loss of ammonia.

There are many advantages to using this method. First, it allows faster analysis, as the total time needed for the experiment is on the order of minutes. Because most of the steps are automated, user-induced error is reduced. The procedure is also applicable to all human SUMO isoforms with minimal adjustment to account for differences in the sequences of SUMO-1 (QEQTGG) and SUMO-4 (QQPTGG). LC separation of peptides, however, is very time- and solvent-consuming, which has led to current interest in the development of pre-screening methods to complement the established methods.

3.3. Requirements for a New Method

All current methods for SUMO analysis require steps for separation and/or pre-concentration of the analytes prior to MS analysis. These techniques rely on a compromise: the generation of shorter tags through mutagenesis, which are more easily analyzed by MS, comes at the expense of physiological relevance. Also, higher-throughput experiments are effective for simple mixtures but become increasingly inefficient for more complex samples. New and effective methods for SUMOylation analysis must address a number of important factors. The first is the ability to carry out the analysis using common MS platforms. Secondly, the new method must be fast and efficient, requiring less time and solvent, and it has to be a routine procedure that can generate reliable results. Finally, it must have the ability to analyze SUMOylation in endogenous human proteins, with no mutagenesis or genetic tag. The work presented in this thesis attempts to address these goals and investigates a new method based on ion mobility mass spectrometry for the screening of SUMOylated proteins.

Chapter 4: Materials and Methods

4.1. Materials

All chemicals used in this study were of analytical grade or better. Solutions of 1.5 M Tris HCl pH 8.8, 1 M Tris HCl pH 6.5, Laemmli sample buffer, 10x Tris/Glycine/SDS running buffer and 30% acrylamide solutions were purchased from Bio-Rad Life Science (Hercules, CA). Sodium bicarbonate, ammonium bicarbonate (AmBic), trifluoroacetic acid (TFA), bovine serum albumin, sulfophenyl isothiocyanate (SPITC) powder and proteomics grade trypsin were purchased from Sigma (St. Louis, MO). Sequencing grade bovine pancreas chymotrypsin was purchased as a salt free lyophilizate from Roche Applied Science (Indianapolis, IN). Poly-SUMO2 and -3, SUMO protein set, SUMO-1 conjugation kit, SUMO conjugation substrate UBE2K (E2-25K), UBE2I (UbcH9) and SUMO activating enzyme (SAE1/SAE2) were purchased from Boston Biochem, Inc. (Cambridge, MA). RanGAP1 and Sp100 fragments (human recombinant, GST-tagged) were obtained from Enzo Life Sciences (Farmingdale, NY). C-terminal amidated CCK-8 desulfated peptide (DSpp) and its sulfated analog (Spp) were purchased from Research Plus (Barnegat, NJ). Lyophilized -casein was purchased from USB (Cleveland, OH). OMIX 10 μL C18 resin ziptips were purchased from Varian. All solvents were of HPLC grade.

4.2. In vitro SUMOylation of Proteins

In vitro SUMOylation of protein substrates was carried out in 0.6 mL low retention Eppendorf microcentrifuge tubes using a SUMOylation kit from Boston Biochem according to the manufacturer’s recommended protocol. The amounts added and final concentrations for each reagent are presented in Table 4.1. Proteins/reagents were introduced in the same order as they appear in Table 4.1. The conjugation reactions were incubated at 37 °C for 3 h.

Table 4.1: Protein and reagent amounts and final concentrations used for in vitro SUMO conjugation reactions.

Reagent                        Final concentration   Amount for a 20 μL reaction
E1 enzyme                      100 nM                4 μL of 0.5 μM solution
SUMO-1/-2/-3                   50 μM                 4 μL of 250 μM solution
Reaction buffer (10X: 500 mM   1X                    2 μL of 10X solution
  Hepes pH 8, 1000 mM NaCl,
  10 mM DTT)a
UbcH9                          5 μM                  2 μL of 50 μM solution
Substrate protein              As specified          5.2 μg
Distilled water                N/A                   Complete to 20 μL
Mg-ATP                         1 mM                  2 μL of 10 mM solution

a DTT = dithiothreitol
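The final concentrations in Table 4.1 follow from simple dilution into the 20 μL reaction (C1·V1 = C2·V2). The sketch below is a consistency check rather than part of the protocol. It assumes that the stock concentrations of E1, SUMO and UbcH9 are in μM (the μ prefix appears to have been lost in places), and the names used are hypothetical. Under that assumption the E1 enzyme works out to 0.1 μM, that is, 100 nM.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul=20.0):
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_volume_ul / total_volume_ul


# (reagent, assumed stock concentration in μM, volume added in μL)
REAGENTS = [
    ("E1 enzyme", 0.5, 4.0),
    ("SUMO", 250.0, 4.0),
    ("UbcH9", 50.0, 2.0),
    ("Mg-ATP", 10_000.0, 2.0),  # 10 mM stock expressed in μM
]


def table_concentrations():
    """Final concentration (μM) of each reagent in the 20 μL reaction."""
    return {name: final_concentration(conc, vol) for name, conc, vol in REAGENTS}


if __name__ == "__main__":
    for name, conc in table_concentrations().items():
        print(f"{name}: {conc:g} μM")
```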

4.2. SDS-PAGE Gels

To verify the extent of the modification after in vitro SUMOylation, one fourth of the volume of each reaction was run on an SDS-PAGE gel with a 15% acrylamide/bis-acrylamide separating gel and a 5% stacking gel. The gels were prepared as reported in the literature.110 Table 4.2 summarizes the volumes of the various reagents used in the preparation of the gels. The 10% ammonium persulfate solutions were freshly made prior to use.

Table 4.2: Chemicals and amounts used for the preparation of 15% acrylamide SDS-PAGE gel.

Component                 15% resolving gel (10 mL)   5% stacking gel (2 mL)
Distilled water           2.3 mL                      1.4 mL
30% acrylamide mix        5.0 mL                      0.33 mL
1.5 M Tris (pH 8.8)       2.5 mL                      -
1 M Tris (pH 6.5)         -                           0.25 mL
10% SDS                   0.1 mL                      0.02 mL
10% ammonium persulfate   0.1 mL                      0.02 mL
TEMED                     0.004 mL                    0.002 mL
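As a quick check of the recipe, the final acrylamide percentage of each gel can be recomputed from the volumes in Table 4.2. This is only a sketch of the arithmetic; the dictionary keys are shorthand chosen here and are not taken from the cited protocol.

```python
STOCK_ACRYLAMIDE_PERCENT = 30.0

# Volumes in mL, as listed in Table 4.2.
RESOLVING = {"water": 2.3, "acrylamide": 5.0, "tris": 2.5,
             "sds": 0.1, "aps": 0.1, "temed": 0.004}
STACKING = {"water": 1.4, "acrylamide": 0.33, "tris": 0.25,
            "sds": 0.02, "aps": 0.02, "temed": 0.002}


def acrylamide_percent(recipe):
    """Final acrylamide percentage after dilution of the 30% stock."""
    total_volume = sum(recipe.values())
    return STOCK_ACRYLAMIDE_PERCENT * recipe["acrylamide"] / total_volume


if __name__ == "__main__":
    print(f"resolving gel: {acrylamide_percent(RESOLVING):.1f}%")
    print(f"stacking gel:  {acrylamide_percent(STACKING):.1f}%")
```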

In vitro SUMOylation sample mixtures were mixed in a 1:1 ratio with Laemmli sample buffer and boiled for five minutes before introduction into the wells. The gels were run at 200 V for 80 min using 1x Tris/glycine/SDS buffer. The masses of the proteins were estimated by comparison with the migration of a standard protein ladder (Precision Plus Protein Dual Xtra Standard), ranging from 2 kDa to 250 kDa.

4.3. Enzymatic Digestion

A sample containing 5 L of a 2.2 g/L poly-SUMO-2 or -3 solution (provided in 50 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) pH 8.0, 100 mM

55 NaCl, 1 mM DTT), or the products of in vitro SUMOylation were evaporated to dryness by Speed-Vac, and reconstituted in 20 L of 50 mM AmBic pH 8.5 for digestion.

Sequencing grade chymotrypsin (1 μL of 1 mg/mL solution in 1 mM HCl) was added in a

1:50 enzyme/substrate ratio, and the solution was incubated for 4 h at room temperature

(25 °C). Proteomics grade trypsin was then added (1:25, 2 μL of 1 mg/mL solution in 1 mM HCl) and incubated overnight (approximately 15 h) at 37 °C. Digestion was quenched by the addition of formic acid to a concentration of 0.1% v/v. Aliquots (10 μL) of the digest sample were desalted using 10 μL C18 ziptip following the manufacturer’s recommended protocol (as described in section 4.4.) and subsequently evaporated to dryness by Speed-Vac for MS analysis.

4.4. Zip Tip Procedures

4.4.1. Manufacturer Protocol

SUMOylated, digested substrates and sulfated/desulfated peptides were desalted using a 10 μL C18 ziptip following the manufacturer’s recommended protocol and evaporated to dryness by Speed-Vac. In short, five solutions were prepared: wetting solution (5% acetonitrile (ACN) in water), equilibration and wash solutions (0.1% TFA in water), elution solution 1 (0.1% TFA in water containing 50% ACN) and elution solution 2 (0.1% TFA in water containing 50% ACN). Using a 10 μL Eppendorf pipette, the Ziptip media was wet with the wetting solution. Equilibration was then achieved by aspiration of the equilibration solution. Peptides were bound with 7 to 10 slow aspirate and dispense cycles of the sample solution. The tip was rinsed with the wash solution, which was discarded to waste. Peptides were recovered with 15 aspirate and dispense cycles with 20 μL of elution solution 1, followed by the same step with elution solution 2. The two solutions were finally combined and the sample was immediately dried by Speed-Vac.

4.4.2. Preparation of Oligonucleotides Procedure

N-terminal sulfonated peptides were desalted using a 10 μL C18 ziptip and a modified procedure optimized for better retention of oligonucleotides and negatively charged analytes. Five solutions were prepared for the procedure: wetting (50% acetonitrile in water), equilibration (0.1 M triethylammonium acetate (TEAA), pH 7.0), wash solution #1 (0.1 M TEAA, pH 7.0), wash solution #2 (water) and elution buffer (50% ACN in water). The sample was dissolved in 10 μL of the equilibration solution prior to desalting. The Ziptip was wet using the wetting solution, and then equilibrated with 10 μL of equilibration solution 3 times. Peptides were bound with 5 to 10 aspirate and dispense cycles of the sample solution. The tip was rinsed 3 times with 10 μL of each of wash solutions #1 and #2. Finally, peptides were recovered by at least 3 cycles of aspiration and dispensing into 10 μL of the elution buffer. Desalted samples were immediately dried by Speed-Vac.

4.5. Mass Spectrometry

4.5.1. Nano-ESI-TOF/IM MS

Desalted digested samples were reconstituted in 50 μL of a solution containing 0.05% formic acid and 50% ACN for MS analysis. All mass spectrometry data were collected on a Synapt HDMS quadrupole time-of-flight ion mobility mass spectrometer equipped with a nanospray source (Waters Corp.). The following instrumental parameters were used for all experiments: capillary, 2.5-2.8 kV; sampling cone, 40.0 V; extraction cone, 4.0 V; cone gas flow rate, 0 L/h; trap collision energy (CE), 6.0 V; transfer CE, 4.0 V; trap gas flow, 1.5 mL/min; IMS gas flow rate, 0 mL/min. For ion mobility experiments, the parameters were kept identical with the exception of the IMS gas flow (20-27 mL/min). For MS/MS experiments the trap CE was increased to 15.0-35.0 V. Mass spectra were calibrated externally in the positive ionization mode in the range 500 ≤ m/z ≤ 4000 using a solution of sodium cesium iodide, and processed using MassLynx 4.1 software (Waters). All mass spectra were averages of approximately 300 scans and are presented unprocessed (unsmoothed and without background subtraction). Ion mobility data were processed using the Driftscope 2.0 software.

4.5.3. MALDI MS

The progress of the BSA digestion was monitored using the UltrafleXtreme MALDI-TOF/TOF MS equipped with a smartbeam II laser (Bruker Daltonics). A supersaturated solution of α-cyano-4-hydroxycinnamic acid (CHCA) in 0.05% TFA containing 50% ACN was prepared. The sample was spotted onto a stainless steel plate using the sandwich method, in which 1 μL of matrix solution was spotted and dried, followed by 1 μL of sample solution, and then another 1 μL of matrix solution. Data were collected in the positive ionization and reflectron modes. External calibration was done in the range 400-4000 m/z using angiotensin II, 1056.5418 m/z; angiotensin I, 1296.6848 m/z; Substance P, 1347.7354 m/z; bombesin, 1619.8223 m/z; ACTH clip 1-17, 2093.0862 m/z; ACTH clip 18-39, 2465.1983 m/z; and somatostatin 28, 3147.4710 m/z (Peptide calibration standard #206195, Bruker Daltonics). Approximately 3000-8000 shots were acquired per MS spectrum using 1000 Hz acquisition speed. FlexAnalysis 3.3 software (Bruker Daltonics) was used for data processing. The peaks obtained were used to carry out peptide mass fingerprinting with the MASCOT software.

4.6. In silico Digests

The mass lists of the expected peptides from the trypsin/chymotrypsin digested substrates were generated in silico from the protein sequence using the “MS-digest” function of Protein Prospector, a free utility created by the University of California, San Francisco (prospector.ucsf.edu/prospector/mshome.htm). The calculated masses of peptides were compared against the experimental masses in order to match sequences.
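The idea behind an in silico digest can be illustrated with a few lines of code. The sketch below is not the Protein Prospector algorithm. It assumes a deliberately simplified cleavage rule (C-terminal to K or R for trypsin and to F, Y, W or L for chymotrypsin, never before proline, no missed cleavages), uses the SUMO-2 sequence given later in Figure 5-1, and all function names are hypothetical.

```python
# Monoisotopic residue masses (Da).
RESIDUE = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565

# Human SUMO-2 sequence as given in Figure 5-1.
SUMO2 = ("MADEKPKEGVKTENNDHINLKVAGQDGSVVQFKIKRHTPLSKLMKAYCER"
         "QGLSVRQIRFRFDGQPINETDTPAQLEMEDEDTIDVFQQQTGG")

# Assumed simplified specificity: trypsin (K, R) plus chymotrypsin (F, Y, W, L).
CLEAVE_AFTER = "KRFYWL"


def digest(sequence):
    """Return (start, end, peptide) tuples, 1-based, with no missed cleavages."""
    peptides, start = [], 0
    last = len(sequence) - 1
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i < last else ""
        if (aa in CLEAVE_AFTER and next_aa != "P") or i == last:
            peptides.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    return peptides


def mono_mass(peptide):
    """Neutral monoisotopic mass of a linear peptide."""
    return sum(RESIDUE[aa] for aa in peptide) + WATER


if __name__ == "__main__":
    for first, last_pos, pep in digest(SUMO2):
        print(f"{first}-{last_pos}\t{pep}\t{mono_mass(pep):.4f}")
```

Under this rule the final peptide is QQQTGG, the C-terminal stretch that remains attached to the substrate lysine as the isopeptide tag. Real digests also contain missed cleavages, which a full tool such as MS-Digest takes into account.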

4.7. Spiking of BSA Digests

Bovine serum albumin (BSA, 0.001 g) was digested in 21 μL of 50 mM AmBic pH 8.5 with 19 μL of 1 mg/mL trypsin solution (total volume 40 μL, final BSA concentration 500 μM). The solution was incubated at 37 °C for two days. Working samples were prepared by spiking the appropriate volume of an 87 μL solution of Spp to achieve a 5 μM solution of BSA digest containing a 1:1 molar ratio of Spp.

Chapter 5: Pre-Screening Method for SUMOylated Proteins

5.1. Proof-of-Concept for the Screening Method

5.1.1. Overview

As stated previously, one of the project goals was to remedy the lack of rapid and efficient methods to analyze SUMOylated proteins with confidence. Due to its separation properties, IMMS was chosen to distinguish between the modified and unmodified peptides, removing the need for any purification or pre-concentration step prior to the MS analysis. A facile method was developed for screening simple protein digests for possible modification by SUMO. With the use of poly-SUMO2 as a model

(sequence shown in Fig. 5-1), the method includes digestion with trypsin/chymotrypsin, followed by ion mobility mass spectrometry analysis.111 Using a two-enzyme system typically leads to small peptides. The presence of the QQQTGG-tag results in a significant increase in mass (+ 599 Da, the mass of the free hexapeptide less the water lost on isopeptide bond formation) and size relative to the unmodified peptide. We exploit denaturing solution conditions to promote higher charge states, and IMMS to separate charge states, in order to screen for and identify possible SUMO isopeptides, which are then confirmed by MS/MS. This method is very simple and much faster than LC-MS/MS approaches.

  1 MADEKPKEGV KTENNDHINL KVAGQDGSVV QFKIKRHTPL SKLMKAYCER
 51 QGLSVRQIRF RFDGQPINET DTPAQLEMED EDTIDVFQQQ TGG

Figure 5-1: Sequence of human SUMO-2. The sequence covered by MS is underlined. The SUMO-modified lysine residue is shown in bold red.

5.1.2. ESI-MS Analysis of the Poly-SUMO-2 Digest

SUMO-2 possesses the interesting feature of being able not only to modify its substrate, but also to be modified by itself at K11, which is found within a SUMOylation consensus sequence. This results in the formation of polymeric chains linked by isopeptide bonds. A solution of poly-SUMO-2 containing chains of various lengths was used as a model system. The mass spectrum of the trypsin/chymotrypsin digest of poly-SUMO-2 is provided in Figure 5-2. The vast majority of the peptides, including the most abundant ones, could be matched to theoretical masses for the trypsin/chymotrypsin digest calculated from the sequence of SUMO-2. The sequence covered by the identified peptides is shown as underlined in Fig. 5-1. It is interesting to note that peptides resulting from cleavage after the substrate lysine are obtained, although it is known that modified peptides exhibit a missed cleavage site at the modified lysine. This is because the SUMO molecule at the end of the polymeric chain is not modified. Thus, the lysine C-terminal amide bond can be cleaved by trypsin and peptides starting at T12 can be observed. The peptides detected range in charge state from +1 to +3. Under the solution conditions used (50% acetonitrile, 0.05% formic acid), relative to neutral aqueous solutions, it is expected that peptides will acquire a larger number of charges, depending on their size and, in some part, their sequence. All of the larger peptides detected had a charge of 3+. As expected for a two-enzyme digest with trypsin and chymotrypsin, the resultant peptides are smaller in mass than would be expected with a one-enzyme digest. A list of peptides detected and the sequence coverage are provided in Table 5.1. The identity of these peptides was confirmed by MS/MS and the data are presented in Appendix A-1 to A-7. It was observed that some peaks do not match the expected masses for the linear SUMO-2 peptides, such as the ones at 1033.04 and 1041.55 m/z.

Figure 5-2: Mass spectrum of the trypsin/chymotrypsin digest of poly-SUMO-2 showing some of the linear peptides identified. The numbers of the peptides refer to the number assigned to each peptide in Table 5.1 shown below.

Table 5.1: List of the major linear peptides detected from the poly-SUMO-2 dual digest. These peptides are indicated on the mass spectrum in Figure 5-2. m/z: experimentally measured m/z value of peptide ion; z: charge of peptide ion; Mcalculated: neutral mass of peptide calculated from the monoisotopic experimentally measured m/z value; Mtheoretical: neutral mass of peptide calculated from sequence; ΔM: monoisotopic mass error of measurement.

#    Sequence position   m/z         z    Mcalculated   Mtheoretical   ΔM
1    12-20               535.2824    +2   1068.5648     1068.4909      0.0739
2    22-32               553.8039    +2   1105.6078     1105.5477      0.0601
3    12-21               599.3212    +2   1196.6424     1196.5858      0.0566
4    21-32/22-33         617.8597    +2   1233.7194     1233.6426      0.0768
5    77-87               671.8137    +2   1341.6274     1341.5355      0.0919
6    63-76               749.8804    +2   1497.7608     1497.7020      0.0588
7    62-76               823.4124    +2   1644.8248     1644.7704      0.0544
8    61-76               901.4847    +2   1800.9694     1800.8715      0.0979
9    77-87               941.4407    +3   2821.3222     2821.2197      0.1025
10   22-32               1106.6132   +1   1105.6132     1105.5477      0.0655
11   44-53               1198.6930   +1   1197.6930     1197.5707      0.1223
12   77-87               1342.6472   +1   1341.6472     1341.5355      0.1117
13   63-87               1411.6888   +2   2821.3776     2821.2197      0.1579
14   62-87               1485.2484   +2   2968.4968     2968.2881      0.2087
15   61-87               1563.3177   +2   3124.6354     3124.3892      0.2462
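The ΔM column is simply the difference between the calculated and theoretical masses. The short sketch below recomputes it for three rows of Table 5.1 and also expresses it in ppm, a unit that is not used in the table and is added here only for orientation. The function name is hypothetical.

```python
def mass_error(m_calculated, m_theoretical):
    """Return (absolute error in Da, relative error in ppm)."""
    delta = m_calculated - m_theoretical
    return delta, delta / m_theoretical * 1e6


# (peptide number, Mcalculated, Mtheoretical) taken from Table 5.1.
ROWS = [
    (1, 1068.5648, 1068.4909),
    (6, 1497.7608, 1497.7020),
    (15, 3124.6354, 3124.3892),
]

if __name__ == "__main__":
    for number, m_calc, m_theo in ROWS:
        delta, ppm = mass_error(m_calc, m_theo)
        print(f"peptide {number}: dM = {delta:.4f} Da ({ppm:.0f} ppm)")
```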

5.1.3. Ion Mobility Analysis of the Poly-SUMO-2 Digest

A very practical advantage of IMMS is the ability to separate multiply charged ions generated by electrospray ionization into their respective charge states.50 Figure 5-3 shows an ion mobilogram (or mass-mobility plot) for the mass spectrum provided in Figure 5-2. The data are representative of previously reported mass-mobility plots for protein digests, in which the majority of ion signals exhibit a high correlation to diagonal lines.112 As indicated, the diagonal lines correspond to the various charge states of peptide ions. The lower trend line passes through peaks for peptides with only one charge. At the resolution of our IMMS instrument, it is possible to separate the +2 and +3 charge states. Negligible amounts of any higher charge states were observed. Only one linear peptide (no. 9 in Table 5.1) was detected with the +3 charge state.

Figure 5-3: Ion mobilogram/mass-mobility plot (drift time versus m/z diagram) of the trypsin/chymotrypsin digest of human poly-SUMO-2. The diagonal trend lines representative of the various charge states of ions are indicated.

Figure 5-4: Drift time total ion chromatogram (black solid trace) of peptides detected from the trypsin/chymotrypsin proteolysis of poly-SUMO2. Extracted drift time ion chromatogram of only the larger, more highly charged (z ≥ +3) ions (dashed trace).

The utility of IMMS can be seen in Figure 5-4. The drift time total ion chromatogram (solid black trace) shows the distribution of drift times of all ions detected in the poly-SUMO2 digest. The gray dashed trace represents the extracted ion chromatogram of only the larger ions with charge state +3. This grouping of ions, as expected, has a much narrower drift time distribution. On the basis of the intensity of the maximum at approximately 2 ms, relative to the total ion chromatogram, the +3 ions represent only a very small percentage of all ions in the mass spectrum. Despite the lower abundance (<35% of the intensity of the base peak), the extracted mass spectrum representing these +3 charge state ions (Figure 5-5) shows a signal-to-noise ratio sufficient for definitive determination of the monoisotopic masses of these peptides (Table 5.2). Using MS/MS, the identities of these peptides were confirmed (Fig. 5-7 and Appendix B-1). SUMO-2 isopeptides are labeled in Figure 5-5. Comparison of the experimental data with a computer simulation of the isotopic distribution for the isopeptides increased the confidence of the assignment (Fig. 5-6 and Appendix B-2 to B-4). As expected, these peptides correspond to a peptide containing lysine-11 of SUMO-2, which lies within the consensus sequence, modified by the QQQTGG- isopeptide tag. Substrate peptides all begin at a trypsin cleavage site, while termination at both trypsin and chymotrypsin cleavage sites was observed. The isopeptides terminating at a chymotrypsin cleavage site were in much higher abundance than the corresponding peptides terminating at a trypsin cleavage site. The sensitivity of the method was high enough to obtain the monoisotopic peak and several isotope peaks for all the isopeptides, allowing for unambiguous mass determination. This also explains the peak clusters observed for each isopeptide.

Figure 5-5: Mass spectrum reconstructed from the gray dashed trace in Figure 5-4 showing only the peptides with z ≥ +3. Confirmed SUMO-2 isopeptides are indicated.

Figure 5-6: Comparison of the experimental (blue solid trace) and simulated (red dashed trace) isotopic distributions for isopeptide 2 in Table 5.2

Table 5.2: List of the SUMO-2 isopeptides detected using ion mobility mass spectrometry. These peptides are indicated on the mass spectrum in Figure 5-5.

#   m/z        z    Mcalculated   Mtheoretical   ΔM
1   689.0397   +3   2064.0957     2063.9946      0.1011
2   694.6915   +3   2081.0511     2080.9773      0.0738
3   731.7465   +3   2192.2240     2192.0696      0.1544
4   737.4106   +3   2209.2084     2209.0723      0.1361
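The theoretical masses of the intact isopeptides can be reproduced from the sequence. The sketch below rests on an assumption that is inferred from the description above and not stated explicitly: isopeptides 2 and 4 are taken to be SUMO-2 residues 8-20 (EGVKTENNDHINL) and 8-21 (EGVKTENNDHINLK) carrying the QQQTGG tag on K11. Because the isopeptide bond is an amide, the tag contributes only its residue masses (about 599.27 Da, one water less than the free hexapeptide). The function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for the residues needed here.
RESIDUE = {
    "G": 57.021464, "V": 99.068414, "T": 101.047679, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "H": 137.058912,
}
WATER = 18.010565
TAG = "QQQTGG"


def residue_sum(peptide):
    """Sum of monoisotopic residue masses (no terminal water)."""
    return sum(RESIDUE[aa] for aa in peptide)


def isopeptide_mass(substrate_peptide, tag=TAG):
    """Neutral mass of a substrate peptide with the tag on one lysine."""
    return residue_sum(substrate_peptide) + WATER + residue_sum(tag)


# Assumed assignments (inferred, not given in the text).
PEPTIDE_8_20 = "EGVKTENNDHINL"
PEPTIDE_8_21 = "EGVKTENNDHINLK"

if __name__ == "__main__":
    print(f"tag contribution: {residue_sum(TAG):.4f} Da")
    print(f"8-20 + tag: {isopeptide_mass(PEPTIDE_8_20):.4f} Da")
    print(f"8-21 + tag: {isopeptide_mass(PEPTIDE_8_21):.4f} Da")
```

With these assignments the computed masses agree with the Mtheoretical entries for isopeptides 2 and 4 to better than 0.01 Da, which supports the inferred sequences but does not replace the MS/MS confirmation.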

For all species, both [M + 3H]³⁺ and [M + 3H − 17]³⁺ ions were observed. This 17 Da loss has been previously shown to occur in peptides containing an N-terminal glutamine residue.109 Under acidic solution conditions, peptides with N-terminal Gln can undergo internal nucleophilic attack with the expulsion of ammonia, resulting in N-terminal pyroglutamic acid (as shown previously in Fig. 3-6). The b’ − 17 fragment ions were also shown to be highly abundant in the tandem mass spectra of the N-terminal Q-containing peptides (Fig. 5-7). Only isopeptide no. 2 of Table 5.2 was also detected with a +2 charge state. Once potential candidates have been identified, these are confirmed by tandem mass spectrometry and the presence of diagnostic fragment mass tags for b2’ − 17, b3’ − 17, and b4’ − 17, which originate from the isopeptide tag.108
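The co-occurrence of an ion and its ammonia-loss partner suggests a simple screening step on the mobility-extracted +3 spectrum: look for peak pairs spaced by 17.0265/3 m/z. The sketch below illustrates that idea using the four m/z values of Table 5.2. It is not the procedure used in this work, and the 0.03 m/z tolerance is an assumption chosen to accommodate the mass errors reported in the table.

```python
AMMONIA = 17.026549  # monoisotopic mass of NH3 (Da)


def ammonia_loss_pairs(mz_values, charge=3, tolerance=0.03):
    """Return (low, high) m/z pairs spaced by NH3/charge within tolerance."""
    spacing = AMMONIA / charge
    peaks = sorted(mz_values)
    return [
        (low, high)
        for i, low in enumerate(peaks)
        for high in peaks[i + 1:]
        if abs((high - low) - spacing) <= tolerance
    ]


# m/z values of the +3 ions listed in Table 5.2.
TABLE_5_2 = [689.0397, 694.6915, 731.7465, 737.4106]

if __name__ == "__main__":
    for low, high in ammonia_loss_pairs(TABLE_5_2):
        print(f"{low:.4f} / {high:.4f}")
```

Candidate pairs found this way would still need confirmation by MS/MS through the b2’ − 17, b3’ − 17 and b4’ − 17 fragments.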

Figure 5-7: MS/MS spectrum of the isopeptide 731.74 m/z (+3), 3 in Table 5.2. The b’n fragments originating from the SUMO tag were clearly identified.

5.1.4. Other SUMO Isoforms

Because all SUMO isoforms possess a chymotrypsin cleavage site 6 amino acids from the C-terminus, the method can be applied to any human SUMO. Following the same experimental procedure as for SUMO-2, a poly-SUMO-3 sample was analyzed. The experiment resulted in the detection of 4 isopeptides, both [M + 3H]³⁺ and [M + 3H − 17]³⁺, as shown in Fig. 5-8. The satellite peaks observed (above 700 m/z) correspond to sodium adducts, and are present due to inefficient Ziptip desalting of the sample. Because SUMO-1 and -4 have not been reported to form polymeric chains in vivo, the method could not be tested against the polymers of those two isoforms. However, due to the high similarity of the tags obtained from SUMO-1 and -4, it is expected that this method will be suitable for all human isoforms.

Figure 5-8: Reconstructed mass spectrum showing only the peptides with z ≥ +3 for the dually digested poly-SUMO-3 sample. Confirmed SUMO-3 isopeptides are indicated.

5.2. Applications of the Method

5.2.1. In vitro SUMOylation: Sp100 and RanGAP1

In an attempt to increase the complexity of the sample analyzed, and get closer to the goal of studying a complete cell lysate, in vitro SUMOylated proteins were investigated. The reaction was conducted on two reported SUMO-1 substrates, Sp100 and RanGAP1 fragments, whose sequences are presented in Fig. 5-9.

  1 INLNDNTFTE KGAVAMAETL KTLRQVEVIN FGDCLVRSKG
 41 AVAIADAIRG GLPKLKELNL SFCEIKRDAA LAVAEAMADK
 81 AELEKLDLNG NTLGEEGCEQ LQEVLEGFNM AKVLASLSDD

  1 KAEPTESCEQ IAVQVNNGDA GREMPCPLPC DEESPEAELH
 41 NHGIQINSCS VRLVDIKKEK PFSNSKVECQ AQARTHHNQA
 81 SDIIVISSED SEGSTDVDEP LEVFISAPRS EPVINNDNP

Figure 5-9: Sequences of the commercial RanGAP1 (top) and Sp100 (bottom) fragments. In red are shown the reported SUMO-modified lysine residues found in the consensus sequence.

In contrast to the experiments described above, which involved only one protein and two enzymes, the sample here contained SUMO-1, the substrate protein, the E1 and UbcH9 enzymes required for the reaction, the small molecule Mg-ATP and the two proteases (trypsin and chymotrypsin). For the reaction to yield a sufficient amount of modified protein, SUMO-1 had to be introduced in large excess compared to the substrate (1:10 ratio substrate:SUMO-1). Because of this excess, and because of the many species present in the sample, the percentage of modified peptides relative to the total number of peptides in the sample is significantly decreased. Moreover, in vitro SUMOylation produces less modified protein than auto-SUMOylation of SUMO-2 and -3. Consequently, using in vitro SUMOylated proteins is a good way to test the sensitivity and robustness of the method. The gel obtained after the reaction for Sp100 and RanGAP1 is presented below in Figure 5-10. The first striking observation is that the commercial fragments, analyzed as received, displayed multiple bands on the gel. This is especially notable for the Sp100 sample, which seems to contain multiple species. There is no evidence on this gel for conjugation of SUMO-1 with either Sp100 or RanGAP1. The bands obtained for the complete reaction mixture and the controls are the same, showing that no new products are formed by the in vitro SUMOylation reaction.

Figure 5-10: SDS-PAGE of in vitro SUMOylation of Sp100 and RanGAP1 fragments in presence or absence of SUMO-1 and ATP.

MS experiments (MS scan and IMMS) on the dual enzyme digested and desalted samples did not lead to any definitive results. About one half of the peaks obtained in the TOF spectra were matched with SUMO-1 peptides; however, none of the masses matched the expected sequence for the Sp100 and RanGAP1 fragments, or any of the enzymes that were present in the solution (Appendix C-1 and C-3). Trypsin digestion of the commercial substrates followed by MALDI analysis (Appendix C-2 and C-4) and peptide mass fingerprinting revealed that the substrate sequence was not the one indicated by the manufacturer. The GST-tagged Sp100 fragment digested with trypsin led to a MASCOT hit for the GST-tag only. None of the other detected peaks matched the theoretical m/z list generated in silico, confirming that the wrong proteins were provided by the manufacturer. Moreover, the same experiment was carried out on the GST-tagged RanGAP1 fragment, which did not score any hit with the MASCOT search. Manual matching of the obtained peaks revealed that, once again, the sequence of the fragment is not likely to be what was indicated by the manufacturer. Because of this issue, experiments to apply the method were delayed.

Chapter 6: Analysis of Tyrosine O-Sulfation by IMMS

6.1. Physiological Functions of Small PTMs

6.1.1. Sulfation

Sulfation refers to chemical or enzymatic modification by covalent addition of a sulfate moiety (SO₄²⁻). Biologically, this reaction is an irreversible post-translational modification in vivo113 that targets the hydroxyl group on tyrosine residues in proteins (termed tyrosine O-sulfation), leading to the modified amino acid sulfotyrosine. The identification of sulfated proteins coupled to the mapping of their modification sites led to the conclusion that there is no consensus sequence for tyrosine O-sulfation.114 Although some have described consensus features (tyrosine residues exposed on the surface of the protein and typically surrounded by acidic residues), many protein sulfation sites do not fulfill those features.115 The first occurrence of protein sulfation was discovered in 1954 in a peptide derived from bovine fibrinogen.116 The first human proteins found to carry a sulfotyrosine (7 of them) were reported in 1985 by Liu et al.117 Sulfation is catalyzed in vivo by the tyrosylprotein sulfotransferase (TPST) enzymes in the trans-Golgi network.118 It was later discovered that two variants of this enzyme exist, TPST-1 and -2, which display high homology of both sequence and structure.119

More than 275 sulfated proteins have been discovered and identified, mostly secreted proteins and trans-membrane spanning proteins. However, the exact biological functions of tyrosine O-sulfation remain unclear. It is thought that this PTM is a key regulator of protein-protein interactions; it even seems that some interactions are made through recognition of the sulfate group itself.120 Some of the known mechanisms in which sulfation plays a role include the modification of the chemokine receptor; involvement in the entry of human immunodeficiency virus 1 (HIV-1) into target cells; leukocyte adhesion and inflammatory response; homeostasis (modification of factors forming a stable complex); and modification of neurologically bioactive peptides.121 The biological function and the significance of some of these modifications are currently still unknown.

Various methods for analysis of sulfated proteins have been reported previously.122 Historically, metabolic labeling with 35S-sulfate was used for the screening of sulfated substrates123 and is still currently used as a back-up method for unambiguous identification of sulfotyrosine residues. The main disadvantage is the non-specificity of this labeling technique; it does not distinguish between actual tyrosine sulfation and sulfation of protein-linked carbohydrates. Separation of sulfated peptides is typically carried out by RP-HPLC, in which the modified species elute more rapidly than their unsulfated counterparts.122 Some spectroscopic methods are used for characterization of sulfotyrosine peptides; UV spectroscopy is especially efficient for quantification, as the presence of a sulfate group dramatically changes the absorbance features of the aromatic ring: the absorbance maximum shifts from 275 nm to 260.5 nm in 0.01 M HCl or from 293 nm to 263 nm in 0.01 M NaOH.116 Fourier transform infrared spectroscopy is also a common technique for the analysis of sulfated proteins and peptides. Indeed, the symmetric SO3 stretching vibrations give rise to an intense band at 1050 cm⁻¹; supplementary bands corresponding to the asymmetric vibrations can also be observed at 1230 cm⁻¹ and 1270 cm⁻¹.124 Despite the wide range of available analytical instruments for the characterization of sulfation, the current technique of choice is MS. MALDI and ESI sources have made it possible to analyze labile modifications, and MS/MS data allow for accurate mapping of the modification sites. This is usually achieved by methods including enzymatic digestion and fragmentation methods such as CID. However, this workflow is time-consuming and requires relatively large quantities of protein. With the introduction of ECD, which preferentially breaks peptide backbones, better sequence coverage and better retention of PTMs were obtained. One of the main difficulties in MS analysis of sulfated proteins is that the masses added by a sulfate and a phosphate group are almost identical. Methods to distinguish between these two PTMs are presented in more detail in Section 6.1.3.

6.1.2. Phosphorylation

Phosphorylation is the covalent addition of a phosphate group (PO₄³⁻) to a substrate. Unlike sulfation, phosphorylation is reversible and can modify several amino acid residues. Serine, tyrosine, and threonine display the most abundant phosphorylation, but low stability modifications of glutamate, aspartate and histidine are also observed.72 The phosphorylation of proteins is regulated by a balance between the activity of kinases (enzymes that catalyze phosphorylation) and phosphatases (enzymes that remove the phosphate group from modified proteins). The human genome encodes 500 kinases and 100 phosphatases, revealing the importance of this modification. It is estimated that approximately 30% to 50% of proteins are phosphorylated at some point in their lifespans.125 However, at any given point in time, only 1% of the proteins in the cell are phosphorylated.126 Phosphorylation is involved in a large number of mechanisms, and is widely used in the cell for regulation of protein stability, localization, function and activity.127 Moreover, this PTM controls many cellular processes such as cell division, enzymatic activity, signal transduction, and metabolic pathway regulation.128 Deregulation of the phosphorylation machinery is involved in several diseases, including cancer.129

6.1.3. Sulfation, Phosphorylation: Similarities and Differences

A summary of some characteristics of phosphorylation and sulfation as PTMs is presented in Table 6.1. Sulfation and phosphorylation are often thought to be homologous modifications, especially from a MS perspective, as they both add 80 Da to the mass of their substrates and are highly acidic. Although sulfur and phosphorus have different atomic masses (32.065 and 30.974 respectively), a phosphate group carries 3 negative charges while a sulfate group only has 2. Thus, phosphorylation requires ion pairing with one proton to result in a single negative charge, which is not the case for sulfation.

Because of this additional proton, the mass of the phosphorylation modification is increased by 1.008 Da, making phosphorylation and sulfation nominally isobaric. However, the nature of the modification, the physiological functions and the behavior in MS instruments differ between sulfation and phosphorylation. Phosphopeptides can undergo several losses during CID: the PO3⁻ ion (− 80 Da), neutral HPO3 (− 80 Da), and H3PO4 (− 98 Da) or H3PO4 + H2O (− 116 Da).130 Loss of H3PO4 occurs readily and typically generates a high-intensity peak; note that the neutral loss of phosphoric acid occurs for pS and pT only. CID of sulfopeptides leads only to the loss of the sulfate ion SO3⁻ (− 80 Da).131
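The losses described in this section can be turned into expected fragment positions for a given precursor. The following is a minimal sketch, assuming that every loss is neutral (so the charge state is unchanged) and using the nominal loss values quoted above; the function name and the example precursor are illustrative choices and do not come from the instrument software. Loss of a charged PO3⁻ ion would also change the charge state and is not modeled here.

```python
# Nominal neutral losses (Da) quoted in this section for CID.
PHOSPHO_NEUTRAL_LOSSES = {"HPO3": 80.0, "H3PO4": 98.0, "H3PO4 + H2O": 116.0}
SULFO_NEUTRAL_LOSSES = {"SO3": 80.0}


def expected_loss_peaks(precursor_mz, charge, losses):
    """Return {loss name: expected m/z} after each neutral loss.

    A neutral loss leaves the charge unchanged, so the m/z shift is
    the loss mass divided by the (absolute) charge state.
    """
    z = abs(charge)
    return {name: precursor_mz - mass / z for name, mass in losses.items()}


# Example: the singly charged 1660.7 m/z phosphopeptide precursor.
peaks = expected_loss_peaks(1660.7, 1, PHOSPHO_NEUTRAL_LOSSES)
for name, mz in peaks.items():
    print(f"loss of {name}: {mz:.1f} m/z")
```

For a doubly charged precursor the same losses appear at half the spacing, which is why the charge state has to be known before a neutral loss is assigned.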

The sulfation and phosphorylation modifications actually have different monoisotopic masses (79.9568 Da and 79.9663 Da, respectively) and can be distinguished without ambiguity when the resolution of the instrument is high enough to attain mass accuracies of 5 ppm or better. This approach is often not applicable, as the FT-ICR instruments that provide enough resolution are not available in many MS facilities. An alternative is to use a combination of positive and negative ionization: peptides usually lose their sulfo moiety in positive mode, while intact phosphopeptide ions can be obtained in both ionization modes. Furthermore, in negative ionization mode phosphorylation produces a specific negative ion at 79 m/z, whereas sulfation forms an equivalent ion at 80 m/z.
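The mass accuracy needed to separate the two modifications depends on the size of the peptide. The short sketch below illustrates the arithmetic using the two monoisotopic shifts quoted above; the peptide masses in the loop are arbitrary illustrative values, not measurements from this work.

```python
SULFATION_SHIFT = 79.9568        # Da, monoisotopic (value quoted in the text)
PHOSPHORYLATION_SHIFT = 79.9663  # Da, monoisotopic (value quoted in the text)


def ppm_separation(unmodified_mass):
    """Relative mass difference (ppm) between the phosphorylated and
    sulfated forms of a peptide with the given unmodified mass (Da)."""
    delta = PHOSPHORYLATION_SHIFT - SULFATION_SHIFT
    return delta / (unmodified_mass + SULFATION_SHIFT) * 1e6


# Arbitrary example masses to show how the requirement tightens with size.
for mass in (1000.0, 2000.0, 3000.0):
    print(f"{mass:.0f} Da peptide: {ppm_separation(mass):.1f} ppm")
```

Because the absolute difference is fixed at about 0.0095 Da, the relative difference shrinks as the peptide gets larger, so larger peptides require better mass accuracy.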

Table 6.1: Comparison of some biological and biochemical features of phosphorylation and sulfation as PTMs, together with their properties during MS analysis.

                                            Phosphorylation                  Sulfation
Biology       Location                      Intracellular                    Extracellular
              Reversibility                 Reversible                       Irreversible
              Function                      Activation, inactivation,        Modulation of
                                            modulation of protein            protein interactions
                                            interactions
Biochemistry  Chemical stability            pY is stable                     sY is acid labile
                                            pS/pT are alkaline labile
                                            pE/pD/pH are acid labile
              Removal                       Phosphatases                     Arylsulfatases
MS            Property                      Acidic (2 −)                     Acidic (1 −)
              Monoisotopic mass change      + 79.9663 Da                     + 79.9568 Da
              Stability during ionization   pS/pT: good                      sY: easily lost under
                                            pY: stable                       standard conditions
              Characteristic fragment loss  PO3⁻ (− 79.9663 Da)              SO3⁻ (− 79.9568 Da)

6.2. Project Goal

As shown in the previous sections, sulfation and phosphorylation are involved in many biological mechanisms, and complete understanding of their functions requires identification of the modified proteins. The goal of this project was to apply IMMS to the analysis of sulfated sites in proteins, using approaches based on currently available methods for phosphorylation analysis.

The addition of a sulfate or phosphate group to a peptide changes its formal charge and general structure. These changes can be exploited by ion mobility MS, which separates ions based not only on their m/z but also on their formal charge, size and shape. Because of the increased negative charge and the change in structure of the modified peptides, it was hypothesized that the drift time of these species would be altered. Consequently, the band for these ions would be slightly shifted from the correlation lines on the ion mobilogram plot, making the screening for modified peptides fast and straightforward. It has been extensively demonstrated that phosphorylated peptides exhibit a more compact structure that shifts their drift times to lower values.112, 132 Because the behavior of phosphopeptides in IMMS is so well characterized, they were chosen as a model system for understanding the effects of each instrumental parameter on the shift of the modified species.
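The screening idea described above (fit a correlation line for one charge state, then look for bands that arrive earlier than the line predicts) can be sketched as follows. This is only an illustration under simplifying assumptions: the trend is treated as linear over the m/z range of interest, the data points are hypothetical, and the function names and the tolerance are not taken from any instrument software.

```python
def fit_line(points):
    """Ordinary least-squares fit of drift_time = slope * mz + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def left_shifted(candidates, reference, tolerance):
    """Return (mz, drift, shift) for ions whose drift time is shorter than
    the correlation line predicts by more than `tolerance`."""
    slope, intercept = fit_line(reference)
    flagged = []
    for mz, drift in candidates:
        shift = (slope * mz + intercept) - drift
        if shift > tolerance:
            flagged.append((mz, drift, shift))
    return flagged


# Hypothetical 2+ reference ions: (m/z, drift time in ms).
reference_2plus = [(400.0, 3.0), (500.0, 4.0), (600.0, 5.0), (700.0, 6.0)]
# Hypothetical candidates: the second one arrives earlier than predicted.
candidates = [(550.0, 4.5), (650.0, 4.9)]

hits = left_shifted(candidates, reference_2plus, tolerance=0.3)
for mz, drift, shift in hits:
    print(f"{mz:.1f} m/z arrives {shift:.2f} ms early (drift {drift:.1f} ms)")
```

The tolerance has to be larger than the normal scatter of unmodified peptides around the line; choosing it too small would flag ordinary peptides as candidates.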

Bovine -casein protein was chosen as a model for phosphorylation, and used in

IMMS experiments to optimize the experimental conditions. This established approach was then extended to sulfation with the use of an eight-residue peptide in combination with its sulfated analog.

6.3. Optimization of the IMMS Experiments: Phosphorylation

6.3.1. Model System

Casein is the major protein of bovine milk and is well known for being a phosphorylation substrate. Casein is found as a heterogeneous mixture of three isoforms: α-, β- and κ-casein, present at 75%, 22% and 3%, respectively. Alpha casein consists of 214 amino acid residues (24,529 Da) and can be phosphorylated at multiple serine positions: 56, 61, 63, 79, 81, 82, 83, 90, 130.133 It was used as a model system to study phosphorylation because the modified protein is commercially available. This protein forms hydrophobic micelles and is not very soluble in water and aqueous solutions.

6.3.2. Results

A 4.8 mg sample of multiply phosphorylated α-casein lyophilized powder was dissolved in 400 µL of ammonium bicarbonate. Because of the poor solubility of the protein, the pH was increased to about 9 and 40 µL of 50 mM magnesium acetate was added. The protein eventually dissolved, and the solution had a final volume of 696.4 µL with a pH of about 9. From the stock solution, 25 µL was digested overnight with trypsin and, after a 1:33 dilution in denaturing solvent, analyzed by ESI-TOF MS (Fig. 6-1).
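The approximate protein concentrations implied by this preparation can be estimated as sketched below. Several assumptions are made and are not stated in the text: the molecular mass of the unmodified protein (24,529 Da) is used although the sample is multiply phosphorylated, the powder is treated as pure protein, the 1:33 dilution is read as a 33-fold dilution, and the volume added during digestion is ignored.

```python
MASS_MG = 4.8             # mg of lyophilized powder
FINAL_VOLUME_UL = 696.4   # µL, final volume of the stock solution
MW_DA = 24529.0           # Da, unmodified protein (assumption, see lead-in)
DILUTION_FACTOR = 33      # assumed meaning of "1:33"


def concentration_um(mass_mg, volume_ul, mw_da):
    """Molar concentration in µM from mass (mg), volume (µL) and MW (Da)."""
    moles = mass_mg * 1e-3 / mw_da
    liters = volume_ul * 1e-6
    return moles / liters * 1e6


stock_um = concentration_um(MASS_MG, FINAL_VOLUME_UL, MW_DA)
diluted_um = stock_um / DILUTION_FACTOR
print(f"stock: {stock_um:.0f} µM, after dilution: {diluted_um:.1f} µM")
```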

The spectrum is presented after processing by the MaxEnt3 function, which converted all the peaks to the +1 charge state. Several unmodified and modified peptides were identified; interestingly, two pairs of peptides with identical sequences that differ by a single phosphorylation were found. The extensive list of α-casein peptides identified is presented in Appendix D-1. Tandem MS on the precursor peak at 1660.7 m/z confirmed the sequence of the phosphorylated peptide 121-134, as shown in Figure 6-2.
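The assignment of the 1660.7 m/z precursor can be checked against a theoretical value. The sketch below computes the singly protonated monoisotopic m/z of peptide 121-134 (sequence taken from the caption of Figure 6-2) carrying one phosphate group. The residue masses are standard monoisotopic values; the function name is an illustrative choice, and only the residues present in this peptide are tabulated.

```python
# Standard monoisotopic residue masses (Da) for the residues in this peptide.
RESIDUE_MASS = {
    "A": 71.03711, "E": 129.04259, "I": 113.08406, "L": 113.08406,
    "N": 114.04293, "P": 97.05276, "Q": 128.05858, "R": 156.10111,
    "S": 87.03203, "V": 99.06841,
}
WATER = 18.01056    # Da, added once for the free N- and C-termini
PROTON = 1.00728    # Da
HPO3 = 79.96633     # Da, monoisotopic mass added per phosphorylation


def peptide_mz(sequence, n_phospho=0, charge=1):
    """Monoisotopic m/z of a protonated peptide with n_phospho phosphate groups."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + n_phospho * HPO3
    return (neutral + charge * PROTON) / charge


mz = peptide_mz("VPQLEIVPNSAEER", n_phospho=1, charge=1)
print(f"[M+H]+ = {mz:.2f}")
```

Under these assumptions the calculated value lies within about 0.1 m/z of the observed precursor, and the unmodified peptide is expected 79.97 m/z lower.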


Figure 6-1: ESI-TOF MS spectrum of an α-casein trypsin digest in 50 mM AmAc pH 7.0. The displayed spectrum was processed using the MaxEnt3 algorithm.


Figure 6-2: MS/MS spectrum of the precursor ion at 1660.70 m/z, corresponding to the singly phosphorylated peptide VPQLEIVPNpSAEER. pS represents a phosphorylated serine residue.

As the TOF and MS/MS data confirmed the presence of phosphorylated peptides, IMMS experiments were carried out on the digest mixture (Fig. 6-3). Mass spectra were extracted for the bands that displayed a mobility shift from the charge correlation lines in order to identify the corresponding peptides. Two phosphorylated peptides were identified (106-119 and 43-58), the second one carrying two modifications. The identity of the peptides was confirmed by MS/MS. Complete sequence information was not obtained for most of these peptides.


Figure 6-3: Mass-mobility plot of the α-casein digest in AmAc pH 9. The bands indicated by the arrows correspond to phosphorylated peptides and display a shift from the correlation lines.

The two confirmed phosphorylated peptides displayed a shift from the 2+ correlation line, as shown in Fig. 6-3. Interestingly, the doubly modified peptide shows a much more marked shift in drift time than the mono-phosphorylated peptide. As stated previously, there is a structural difference between phosphorylated and non-phosphorylated peptides. This can be attributed to intra-molecular interactions between the partial negative charge carried by the modification and the positively charged basic sites (arginine, lysine and histidine side chains and the protonated N-terminus).132 The presence of two modifications on a single peptide is likely to increase the strength of these electrostatic interactions, thereby making the structure more compact. As the experiments with phosphopeptides produced the expected results, it was concluded that the instrument was suitably optimized, and the approach was then extended to the sulfated peptides.

6.4. Screening for Sulfated Peptides

6.4.1. Model System

To test the performance of IMMS for the screening and identification of sulfated peptides, the model system chosen consisted of a fragment of cholecystokinin, a protein secreted in the small intestine and a known sulfoprotein. The sequence of the peptide is DYMGWMDF-NH2 (the C-terminus is amidated), and two samples were purchased, differing only by a sulfate group on the tyrosine. The non-modified peptide is referred to as the desulfated peptide (DSpp), in contrast to the sulfated peptide (Spp). Their monoisotopic masses were 1063.4012 and 1142.3580 Da, respectively.

6.4.2. Analysis of Spp and DSpp

An equimolar mixture of Spp and DSpp was prepared, subjected to N-terminal sulfonation and analyzed in denaturing solution by ESI-TOF MS after desalting. The spectrum obtained showed not only salt contamination but also an extremely weak signal for the sulfated peptide. It was hypothesized that the peptides (especially Spp) carried a fairly high negative charge and did not bind efficiently to the reverse phase C18 column. Another approach was attempted, using a ZipTip protocol optimized for oligonucleotides, which are also negatively charged species.

IMMS analysis of the Spp solution alone (Fig. 6-4) showed several interesting features. The correlation line for the +2 charge state was obtained from the BSA experiment that was done under the same conditions (see Fig. 6-7). Firstly, the desulfated peptide is clearly present in the solution although the sample contained only Spp. The identity of the peptide was confirmed by MS/MS (Fig. 6-5). This demonstrated the lability of this modification in the mass spectrometer. MS/MS data could not be obtained for Spp, since any increase of the collision energy (required to fragment the analyte) led to the loss of the modification. It can also be observed that the non-modified peptide is on the charge correlation line, meaning that its drift time is consistent with its m/z value. However, the modified Spp peptide band was to the left of the correlation line, indicating that its drift time was shifted to smaller values. This observation is consistent with the behavior of phosphorylated species in IMMS: it is very likely that the negative charge of the sulfate group also interacts with the positively charged N-terminus, making the overall structure more compact. The other bands found on the mass-mobility plot are likely to be the same peptides (same m/z) in different, more compact conformations (much smaller drift times) that exist in equilibrium.


Figure 6-4: IMMS analysis of a solution of Spp in denaturing solution

Figure 6-5: MS/MS spectrum of the 1063.41 m/z (+1) peptide. The obtained sequence confirmed the identity of DSpp.

6.4.3. Spiking of a Bovine Serum Albumin Digest with Spp

As the IMMS analysis of a simple modified peptide showed interesting results, another experiment was designed to test the sensitivity of IMMS for detection of the sulfated peptide. Bovine serum albumin was digested with trypsin and mixed in a 1:1 ratio with Spp to a working solution concentration of 5 µM. Comparison of the experimental values with an in silico generated list of BSA peptides (Fig. 6-6 and Appendix E-1) allowed all the main peaks to be positively identified as BSA tryptic peptides. The Spp and DSpp peaks were not detected in TOF mode.
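An in silico list of tryptic peptides such as the one used for this comparison can be generated with a few lines of code. The sketch below is not the tool used in this work; it assumes the usual trypsin rule (cleavage C-terminal to lysine or arginine, except before proline) and uses a short hypothetical sequence rather than the BSA sequence.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return tryptic peptides of `sequence`.

    Cleavage occurs after K or R unless the next residue is P.
    Peptides containing up to `missed_cleavages` uncleaved sites are included.
    """
    # Collect cleavage positions, bracketed by the two sequence ends.
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end < len(sites):
                peptides.append(sequence[sites[start]:sites[end]])
    return peptides


# Hypothetical toy sequence (not BSA); the R-P bond is not cleaved.
toy_sequence = "AAKGGRPLLKDDR"
print(tryptic_digest(toy_sequence))
print(tryptic_digest(toy_sequence, missed_cleavages=1))
```

Each predicted peptide can then be converted to a theoretical m/z and compared with the observed peaks within a chosen mass tolerance.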

Figure 6-6: Positive ESI-TOF MS spectrum of a 5 µM BSA digest spiked with a 1:1 molar ratio of Spp.

IMMS experiments were carried out on the same sample. Although the 1142.36 peak (+1 charge state for Spp) was totally absent, a +2 peak at 571.75 m/z was detected by IMMS but not by TOF-MS. MS/MS could not confirm that the peptide was carrying a modification, since the instrumental parameters required to fragment the peptide led to its desulfation (i.e. only the desulfated mass was observed). On the mass-mobility plot, the 571.75 peak showed a shift in drift time to the left, as indicated in Figure 6-7. Another +2 band, at 674.2 m/z, displayed a large shift in drift time compared to the correlation line; the calculated mass matched a methionine-oxidized peptide from the BSA digest. The exact explanation for this behavior is not yet clear, and complementary experiments would be needed to fully understand this observation.


Figure 6-7: Mass-mobility plot of a 1:1 mixture of BSA digest:Spp. The band corresponding to Spp is shown. An ion displays a shift in the drift time, as indicated by the top white box. It matches a BSA peptide containing an oxidized methionine residue.

6.4.4. The Effect of Positive Versus Negative Ionization Mode

In the previous experiments, the data showed that the sulfate group was readily lost in the mass spectrometer: the DSpp peak was observed in all mass spectra although the sample only contained Spp. It was reported in the literature that sulfation is more stable in negative ionization mode for MS analyses.134 Figure 6-8 shows the comparison of positive and negative ionization mode for the same Spp sample. In positive ionization mode, the base peak corresponds to DSpp, which shows that the modification is not stable under these conditions. However, in negative mode, only Spp is observed, at two different charge states. It was concluded that sulfation is indeed more stable in negative ionization mode.

Figure 6-8: Analysis in positive and negative ionization mode of a Spp solution.

Negative ionization mode was a key improvement in the development of the analytical method. It enabled collection of MS/MS data for Spp, which was not possible in positive ionization mode, since the slightest increase in the collision energy resulted in the total loss of the Spp peak. Moreover, the sensitivity for Spp was dramatically increased, as shown in Fig. 6-9: in the BSA digest, both the -1 and -2 charge states of Spp were detected, which was not the case in positive ionization. The identity of the peptide was confirmed by MS/MS, which produced fragment peaks carrying the modification. The DSpp peptide was not observed.

Figure 6-9: Negative ESI-TOF MS spectrum of a 5 µM BSA digest spiked with a 1:1 molar ratio of Spp.


Figure 6-10: MS/MS spectrum of the 1140.81 m/z (-1) peak. Enough fragments are detected to confirm modification of a tyrosine residue by a sulfate group.

The sample was then analyzed by IMMS in negative mode (Fig. 6-11). Compared to the same experiment in positive mode, the separation between the charge states was more marked, and the Spp band was one of the most intense. The same behavior was observed in negative mode regarding the drift time of the sulfated peptide: the band was shifted to the left.


Figure 6-11: Negative IMMS experiment on a 5 µM BSA digest spiked with a 1:1 molar ratio of Spp.

6.5. State of the Project and Conclusion

Although not complete, the data for the analysis of tyrosine O-sulfation by IMMS are promising. Repetition of published experiments using IMMS for screening of phosphorylation was successful, as the expected shift in the drift time of the phosphorylated species was observed. Notably, the shift for a peptide containing two phosphate groups is larger than for a peptide with a single modification, due to increased intra-molecular electrostatic interactions making the peptide structure more compact. The sensitivity was high enough to fragment the peptides of interest with CID and collect MS/MS data for confirmation of the sequence. The sulfated peptide could not be identified in a BSA digest using TOF alone in positive ionization mode. However, the analysis of the IMMS data revealed the presence of a band shifted to the left of the correlation line, corresponding to the sulfated peptide. The desulfated analog was not shifted from the correlation line, showing the effect of sulfation on the drift time of peptides. It was hypothesized that, as for phosphorylation, the negative charge carried by the sulfate group interacts with the positively charged N-terminus of the peptide, making its structure more compact. Screening of the left-shifted bands on the ion mobilogram in negative ionization mode can lead to the identification of tyrosine O-sulfated candidates that can later be confirmed by tandem MS.

Further experiments are needed to complete this project. For tyrosine O-sulfation, the limit of detection can be tested by using decreasing ratios of Spp in the BSA digest. An optimized ZipTip procedure for desalting (for instance, using an ion exchange column instead of a reverse phase column) could enrich the sample in highly negatively charged peptides, giving better sensitivity for sulfated species.


Chapter 7: Conclusions and Future Directions

7.1. Conclusions

The work presented in this thesis demonstrates the potential of ion mobility mass spectrometry for quick and easy screening of two specific post-translational modifications: SUMOylation and tyrosine O-sulfation. Although current results were obtained from systems of low complexity, the developed methods show promising features, such as rapid analysis, good sensitivity and selectivity for the modified peptides. Because PTMs change the shape and charge of the target proteins, the peptides carrying the modification have different properties than the non-modified species. IMMS is ideal to take advantage of these features: the modified peptides can display a change in drift time, shifting the corresponding band from the correlation lines observed in a mass-mobility plot. For larger modifications, such as SUMOylation, the branched isopeptides can even carry an extra charge compared to unmodified peptides, which facilitates screening for them.

Experimentally, evidence that the SUMO isopeptides carry a higher charge was obtained with the poly-SUMO-2 and -3 systems: all of the linear peptides (but one) were found in the 1+ and 2+ charge states, while all the isopeptides carried a 3+ charge. Extraction of the MS spectra for z ≥ +3 allowed for the unambiguous identification of four predicted isopeptides (two produced by either trypsin or chymotrypsin cleavage, and the −17 Da analog of each). With simple model systems, the sensitivity was sufficient to obtain data with enough resolution to accurately determine the molecular mass of the modified species, even when the intensity of their signals was less than 35% of the base peak. The combination of MS and MS/MS also allowed for precise mapping of the modified lysine residues. Mass fingerprinting of the purchased substrates for in vitro SUMOylation revealed that the sequences of the actual molecules did not match those of the expected fragments. Applying the method to more complex samples was hindered by this issue.

Phosphorylated casein was successfully used as a model system for generating shifts in the drift time of modified peptides. It was also demonstrated that multiple modifications on a single peptide increase the magnitude of the shift from the correlation line. Preliminary and promising results were obtained from the extension of the method to tyrosine O-sulfation. IMMS analysis of a solution of Spp revealed that, although the unmodified analog was found on the charge state correlation line as expected, Spp displayed a shift to smaller drift times. This indicates that sulfated peptides behave similarly to phosphopeptides in IMMS experiments. Tyrosine O-sulfation is also likely to trigger intra-molecular electrostatic interactions due to its partial negative charge and make the overall structure of the peptide more compact. A tyrosine O-sulfated peptide, CCK-8 (sequence DsYMGWMDF-NH2, Spp), spiked at a 1:1 molar ratio into a bovine serum albumin digest was detected on the mass-mobility plot, and the band corresponding to this ion was shifted from the trend line to smaller drift time. Experiments in negative ionization mode allowed for collection of MS/MS data for Spp, due to the increased stability of the modification in this mode. Moreover, no product resulting from the loss of the sulfate group was observed. The sensitivity is increased in this mode, as the Spp peak could be observed in an MS scan, which was not the case in positive ionization. Negative ionization IMMS data showed a good separation of the different charge states, and the characteristic shift to smaller drift time for Spp was still observed.

7.2. Future Directions

Further experiments should be conducted to complete the optimization of the methods described in this thesis. The influence of the ionization mode (positive or negative) should be further investigated. Negative ionization is known to generate better signal-to-noise ratios for negatively charged species, which could lower the limit of detection of the method. Once these steps are achieved, the complexity of the sample will be increased to demonstrate the utility of IMMS and to determine how low a limit of detection can be achieved.

The SUMO project, which is already at a more advanced stage, can also be improved by additional experiments. First, the method can be tested against a pool of digested proteins containing only one SUMO substrate. Next, whole cell lysate from human cells could be used to determine the robustness of the procedure. The current results appear to indicate that increased sample complexity could lead to an inability to detect the SUMO isopeptides, especially if their abundance is extremely low. In this case, an LC separation step might be necessary prior to MS analysis of the peptides. Another approach would take advantage of the second N-terminus that the tag adds to the isopeptides: chemical modification targeting the N-terminal regions of peptides could be carried out and provide a more significant difference in the gas phase behavior of modified peptides compared to the linear ones.

Furthermore, the specificity of the neutral 17 Da loss due to the N-terminal glutamine residue that takes place for all isopeptides should be investigated as an additional screening method for SUMO-modified isopeptides.
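One way such a screen could be prototyped is to search a peak list for pairs of ions separated by the loss in question. The sketch below is only an illustration: it assumes that the 17 Da loss corresponds to ammonia (17.02655 Da monoisotopic), that both members of a pair share the same charge state, and it uses a hypothetical peak list; the function name and tolerance are arbitrary choices.

```python
AMMONIA = 17.02655  # Da, monoisotopic NH3 (assumed identity of the 17 Da loss)


def find_loss_pairs(peaks, charge, loss=AMMONIA, tol=0.01):
    """Return (intact m/z, loss-product m/z) pairs whose spacing equals
    loss / charge within `tol` (in m/z units)."""
    spacing = loss / charge
    ordered = sorted(peaks)
    pairs = []
    for i, low in enumerate(ordered):
        for high in ordered[i + 1:]:
            if abs((high - low) - spacing) <= tol:
                pairs.append((high, low))
    return pairs


# Hypothetical m/z values extracted from the z = 3+ region of a mobilogram.
peaks_3plus = [612.310, 617.985, 640.120, 655.402]
pairs = find_loss_pairs(peaks_3plus, charge=3)
for intact, product in pairs:
    print(f"{intact:.3f} m/z pairs with {product:.3f} m/z")
```

Restricting the search to the higher charge state region of the mass-mobility plot, where the isopeptides were found, would reduce the number of chance matches.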


References

1. Gutteridge, A.; Thornton, J. M., Understanding nature's catalytic toolkit. Trends Biochem. Sci. 2005, 30 (11), 622-9.

2. De Los Rios, P.; Goloubinoff, P., Protein folding: Chaperoning protein evolution. Nat. Chem. Biol. 2012, 8 (3), 226-8.

3. Makhnevych, T.; Houry, W. A., The role of Hsp90 in protein complex assembly. Biochim. Biophys. Acta 2012, 1823 (3), 674-82.

4. Pesole, G., What is a gene? An updated operational definition. Gene 2008, 417 (1-2), 1-4.

5. International Human Genome Sequencing Consortium, Finishing the euchromatic sequence of the human genome. Nature 2004, 431 (7011), 931-45.

6. Walsh, C. T.; Garneau-Tsodikova, S.; Gatto, G. J., Jr., Protein posttranslational modifications: the chemistry of proteome diversifications. Angew. Chem. Int. Ed. 2005, 44 (45), 7342-72.

7. Black, D. L., Mechanisms of alternative pre-messenger RNA splicing. Annu. Rev. Biochem. 2003, 72, 291-336.

8. Hajduk, S.; Ochsenreiter, T., RNA editing in kinetoplastids. RNA Biol. 2010, 7 (2), 229-36.

9. Ayoubi, T. A.; Van De Ven, W. J., Regulation of gene expression by alternative promoters. FASEB J. 1996, 10 (4), 453-60.

10. Finley, D.; Sadis, S.; Monia, B. P.; Boucher, P.; Ecker, D. J.; Crooke, S. T.; Chau, V., Inhibition of proteolysis and cell cycle progression in a multiubiquitination-deficient yeast mutant. Mol. Cell Biol. 1994, 14 (8), 5501-9.

11. Mann, M.; Jensen, O. N., Proteomic analysis of post-translational modifications. Nat. Biotechnol. 2003, 21 (3), 255-61.

12. Ulrich, H. D., The SUMO system: an overview. Method. Mol. Biol. 2009, 497, 3-16.

13. Hochstrasser, M., Ubiquitin-dependent protein degradation. Annu. Rev. Gen. 1996, 30, 405-39.

14. Glish, G. L.; Vachet, R. W., The basics of mass spectrometry in the twenty-first century. Nat. Rev. Drug Discov. 2003, 2 (2), 140-50.

15. Cooks, R. G.; Rockwood, A. L., The 'Thomson'. A suggested unit for mass spectroscopists. Rapid Comm. Mass Spectrom. 1991, 5, 93.

16. de Hoffmann, E.; Stroobant, V.; Editors, Mass Spectrometry: Principles and Applications, Third Edition. John Wiley & Sons, Ltd.: 2007; p 489 pp.

17. Millikan, R. A., Rays of Positive Electricity and their Application to Chemical Analysis. By Sir J. J. Thomson. Longmans, Green & Co. 1913. Science 1914, 40 (1022), 174.

18. Grayson, M. A., John Bennett Fenn: a curious road to the prize. J. Am. Soc. Mass Spectrom. 2011, 22 (8), 1301-8.

19. Barber, M.; Bordoli, R. S.; Sedgwick, R. D.; Tyler, A. N.; Bycroft, B. W., Fast atom bombardment mass spectrometry of bleomycin A2 and B2 and their metal complexes. Biochem. Biophys. Res. Commun. 1981, 101 (2), 632-8.

20. Fenn, J. B.; Mann, M.; Meng, C. K.; Wong, S. F.; Whitehouse, C. M., Electrospray ionization for mass spectrometry of large biomolecules. Science 1989, 246 (4926), 64-71.

21. Tanaka, K.; Waki, H.; Ido, Y.; Akita, S.; Yoshida, Y.; Yoshida, T.; Matsuo, T., Protein and polymer analyses up to m/z 100 000 by laser ionization time-of-flight mass spectrometry. Rapid Comm. Mass Spectrom. 1988, 2 (8), 151-153.

22. Ahadi, E.; Konermann, L., Ejection of solvated ions from electrosprayed methanol/water nanodroplets studied by molecular dynamics simulations. J. Am. Chem. Soc. 2011, 133 (24), 9354-63.

23. Ahadi, E.; Konermann, L., Modeling the behavior of coarse-grained polymer chains in charged water droplets: implications for the mechanism of electrospray ionization. J. Phys. Chem. B 2012, 116 (1), 104-12.

24. Banerjee, S.; Mazumdar, S., Electrospray Ionization Mass Spectrometry: A Technique to Access the Information beyond the Molecular Weight of the Analyte. Int. J. Anal. Chem. 2012, 2012.

25. Grandori, R., Origin of the conformation dependence of protein charge-state distributions in electrospray ionization mass spectrometry. J. Mass Spectrom. 2003, 38 (1), 11-5.

26. Kaltashov, I. A.; Abzalimov, R. R., Do ionic charges in ESI MS provide useful information on macromolecular structure? J. Am. Soc. Mass Spectrom. 2008, 19 (9), 1239-46.

27. Karas, M.; Bachmann, D.; Bahr, U.; Hillenkamp, F., Matrix-assisted ultraviolet laser desorption of non-volatile compounds. Int. J. Mass Spectrom. 1987, 78 (0), 53-68.

28. Beavis, R. C.; Chait, B. T., Cinnamic acid derivatives as matrices for ultraviolet laser desorption mass spectrometry of proteins. Rapid Comm. Mass Spectrom. 1989, 3 (12), 432-5.

29. Karas, M.; Ehring, H.; Nordhoff, E.; Stahl, B.; Strupat, K.; Hillenkamp, F.; Grehl, M.; Krebs, B., Matrix-assisted laser desorption/ionization mass spectrometry with additives to 2,5-dihydroxybenzoic acid. Org. Mass Spectrom. 1993, 28, 1476-81.

30. Knochenmuss, R., Ion formation mechanisms in UV-MALDI. The Analyst 2006, 131 (9), 966-86.

31. Patterson, S. D.; Aebersold, R., Mass spectrometric approaches for the identification of gel-separated proteins. Electrophoresis 1995, 16 (10), 1791-814.

32. Paul, W.; Steinwedel, H., A new mass spectrometer without magnetic field. Z. Naturforsch. 1953, 8a, 448-50.

33. Finnigan, R. E., Quadrupole mass spectrometers. Anal. Chem. 1994, 66, 969A-975A.

34. Stephens, W. E., A pulsed mass spectrometer with time dispersion. Phys. Rev. 1946, 69, 691.

35. Wiley, W. C.; McLaren, I. H., Time-of-flight mass spectrometer with improved resolution. Rev. Sci. Instrum. 1955, 26, 1150-7.

36. Mamyrin, B. A.; Karataev, V. I.; Shmikk, D. V.; Zagulin, V. A., Mass reflectron. New nonmagnetic time-of-flight high-resolution mass spectrometer. Zh. Eksp. Teor. Fiz. 1973, 64, 82-9.

37. Cotter, R. J., Peer Reviewed: The New Time-of-Flight Mass Spectrometry. Anal. Chem. 1999, 71 (13), 445A-51A.

38. Imrie, D. C.; Pentney, J. M.; Cottrell, J. S., A Faraday cup detector for high-mass ions in matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Rapid Comm. Mass Spectrom. 1995, 9, 1293-6.

39. Moniatte, M.; van der Goot, F. G.; Buckley, J. T.; Pattus, F.; van Dorsselaer, A., Characterisation of the heptameric pore-forming complex of the Aeromonas toxin aerolysin using MALDI-TOF mass spectrometry. FEBS Lett. 1996, 384 (3), 269-72.

40. Waters Schematic of the Synapt HDMS G1 system. http://www.waters.com/webassets/cms/category/media/detail_page_images/Synapt_HDMS_detail_4.jpg.

41. Giles, K.; Pringle, S. D.; Worthington, K. R.; Little, D.; Wildgoose, J. L.; Bateman, R. H., Applications of a travelling wave-based radio-frequency-only stacked ring ion guide. Rapid Comm. Mass Spectrom. 2004, 18 (20), 2401-14.

42. Shukla, A. K.; Futrell, J. H., Tandem mass spectrometry: dissociation of ions by collisional activation. J. Mass Spectrom. 2000, 35 (9), 1069-90.

43. Cooks, R. G.; Ast, T.; Pradeep, T.; Wysocki, V., Reactions of ions with organic surfaces. Acc. Chem. Res. 1994, 27 (11), 316-323.

44. Zubarev, R. A.; Kelleher, N. L.; McLafferty, F. W., Electron Capture Dissociation of Multiply Charged Protein Cations. A Nonergodic Process. J. Am. Chem. Soc. 1998, 120, 3265-3266.

45. Syka, J. E. P.; Coon, J. J.; Schroeder, M. J.; Shabanowitz, J.; Hunt, D. F., Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 9528-9533.

46. Avouris, P.; Chan, I. Y.; Loy, M. M. T., The infrared-laser multiple-photon ionization of nitromethane. J. Chem. Phys. 1979, 70, 5315-17.

47. Ishikawa, K.; Koga, Y.; Niwa, Y., Sequencing of peptides by means of collision-induced dissociation of multiply charged ions. Nippon Iyo Masu Supekutoru Gakkai Koenshu 1993, 18, 203-6.

48. McKnight, L. G.; McAfee, K. B., Jr.; Sipler, D. P., Low-field drift velocities and reactions of nitrogen ions in nitrogen. Phys. Rev. 1967, 164, 62-70.

49. Collins, D. C.; Lee, M. L., Developments in ion mobility spectrometry-mass spectrometry. Anal. Bioanal. Chem. 2002, 372 (1), 66-73.

50. Kanu, A. B.; Dwivedi, P.; Tam, M.; Matz, L.; Hill, H. H., Jr., Ion mobility-mass spectrometry. J. Mass Spectrom. 2008, 43 (1), 1-22.

51. Pringle, S. D.; Giles, K.; Wildgoose, J. L.; Williams, J. P.; Slade, S. E.; Thalassinos, K.; Bateman, R. H.; Bowers, M. T.; Scrivens, J. H., An investigation of the mobility separation of some peptide and protein ions using a new hybrid quadrupole/travelling wave IMS/oa-ToF instrument. Int. J. Mass Spectrom. 2007, 261, 1-12.

52. Dwivedi, P.; Wu, P.; Klopsch, S.; Puzon, G.; Xun, L.; Hill, H., Metabolic profiling by ion mobility mass spectrometry (IMMS). Metabolomics 2008, 4 (1), 63-80.

53. McLean, J. A.; Ruotolo, B. T.; Gillig, K. J.; Russell, D. H., Ion mobility-mass spectrometry: a new paradigm for proteomics. Int. J. Mass Spectrom. 2005, 240, 301-315.

54. Wasinger, V. C.; Cordwell, S. J.; Cerpa-Poljak, A.; Yan, J. X.; Gooley, A. A.; Wilkins, M. R.; Duncan, M. W.; Harris, R.; Williams, K. L.; Humphery-Smith, I., Progress with gene-product mapping of the Mollicutes: Mycoplasma genitalium. Electrophoresis 1995, 16, 1090-4.

55. Aebersold, R.; Mann, M., Mass spectrometry-based proteomics. Nature 2003, 422 (6928), 198-207.

56. Pappin, D. J.; Hojrup, P.; Bleasby, A. J., Rapid identification of proteins by peptide-mass fingerprinting. Curr. Biol. 1993, 3 (6), 327-32.

57. Pocsfalvi, G.; Cuccurullo, M.; Schlosser, G.; Cacace, G.; Siciliano, R. A.; Mazzeo, M. F.; Scacco, S.; Cocco, T.; Gnoni, A.; Malorni, A.; Papa, S., Shotgun proteomics for the characterization of subunit composition of mitochondrial complex I. BBA-Bioenergetics 2006, 1757, 1438-1450.

58. Chambery, A.; Farina, A.; Di, M. A.; Rossi, M.; Abbondanza, C.; Moncharmont, B.; Malorni, L.; Cacace, G.; Pocsfalvi, G.; Malorni, A.; Parente, A., Proteomic Analysis of MCF-7 Cell Lines Expressing the Zinc-Finger or the Proline-Rich Domain of Retinoblastoma-Interacting-Zinc-Finger Protein. J. Proteom. Res. 2006, 5, 1176-1185.

59. Sabareesh, V.; Sarkar, P.; Sardesai, A. A.; Chatterji, D., Identifying N60D mutation in ω subunit of Escherichia coli RNA polymerase by bottom-up proteomic approach. The Analyst 2010, 135, 2723-2729.

60. Kettman, J. R.; Frey, J. R.; Lefkovits, I., Proteome, transcriptome and genome: top down or bottom up analysis? Biomol. Eng. 2001, 18, 207-212.

61. Boyne, M.; Bose, R. In Target proteins: bottom-up and top-down proteomics, John Wiley & Sons, Inc.: 2012; pp 89-100.

62. Whitelegge, J.; Halgand, F.; Souda, P.; Zabrouskov, V., Top-down mass spectrometry of integral membrane proteins. Exp. Rev. Proteom. 2006, 3, 585-596.

104 63. Han, X.; Jin, M.; Breuker, K.; McLafferty, F. W., Extending Top-Down Mass Spectrometry to Proteins with Masses Greater Than 200 Kilodaltons. Science 2006, 314 (5796), 109-112.

64. Sze, S. K.; Ge, Y.; Oh, H.; McLafferty, F. W., Top-down mass spectrometry of a 29-kDa protein for characterization of any posttranslational modification to within one residue. Proc. Natl. Acad. Sci. U. S. A. 2002, 99 (4), 1774-1779.

65. Zhang, L.; Eugeni, E. E.; Parthun, M. R.; Freitas, M. A., Identification of novel histone post-translational modifications by peptide mass fingerprinting. Chromosoma 2003, 112 (2), 77-86.

66. Baliban, R. C.; Di, M. P. A.; Plazas-Mayorca, M. D.; Young, N. L.; Garcia, B. A.; Floudas, C. A., A novel approach for untargeted post-translational modification identification using integer linear optimization and tandem mass spectrometry. Mol. Cell Proteom. 2010, 9, 764-779.

67. Chen, Y.; Zhang, K.; He, X.; Zhang, Y., Development of identification for post- translational modifications in histones by mass spectrometry based proteomics. Huaxue Jinzhan 2010, 22, 713-719.

68. Jensen, O. N., Modification-specific proteomics: systematic strategies for analysing post-translationally modified proteins. Trends Biotechnol. 2000, 18, Supplement 1 (0), 36-42.

69. Tagwerker, C.; Flick, K.; Cui, M.; Guerrero, C.; Dou, Y.; Auer, B.; Baldi, P.; Huang, L.; Kaiser, P., A tandem affinity tag for two-step purification under fully denaturing conditions: application in ubiquitin profiling and protein complex identification combined with in vivo cross-linking. Mol. Cell Proteom. 2006, 5, 737-748.

70. Mann, M.; Ong, S. E.; Gronborg, M.; Steen, H.; Jensen, O. N.; Pandey, A., Analysis of protein phosphorylation using mass spectrometry: deciphering the phosphoproteome. Trends Biotechnol. 2002, 20 (6), 261-8.

71. McLachlin, D. T.; Chait, B. T., Analysis of phosphorylated proteins and peptides by mass spectrometry. Curr. Opin. Chem. Biol. 2001, 5 (5), 591-602.

72. Sickmann, A.; Meyer, H. E., Phosphoamino acid analysis. Proteomics 2001, 1, 200-206.

73. Schlesinger, D. H.; Goldstein, G., Molecular conservation of 74 amino acid sequence of ubiquitin between cattle and man. Nature 1975, 255 (5507), 42304.

74. Schlesinger, D. H.; Goldstein, G.; Niall, H. D., Complete amino acid sequence of ubiquitin, an adenylate cyclase stimulating polypeptide probably universal in living cells. Biochemistry 1975, 14, 2214-18.

105 75. Hershko, A.; Ciechanover, A., The ubiquitin system. Annu. Rev. Biochem. 1998, 67, 425-79.

76. Scheffner, M.; Nuber, U.; Huibrgtse, J. M., Protein ubiquitination involving an E1-E2-E3 enzyme ubiquitin thioester cascade. Nature 1995, 373, 81-3.

77. Hershko, A., The ubiquitin system for protein degradation and some of its roles in the control of the cell division cycle. Cell Death Differ. 2005, 12 (9), 1191-7.

78. Hershko, A., The ubiquitin pathway for protein degradation. Trends Biochem. Sci. 1991, 16 (7), 265-8.

79. Thrower, J. S.; Hoffman, L.; Rechsteiner, M.; Pickart, C. M., Recognition of the polyubiquitin proteolytic signal. EMBO J. 2000, 19 (1), 94-102.

80. Herrmann, J.; Lerman, L. O.; Lerman, A., Ubiquitin and ubiquitin-like proteins in protein regulation. Circ. Res. 2007, 100 (9), 1276-91.

81. Muller, S.; Hoege, C.; Pyrowolakis, G.; Jentsch, S., SUMO, ubiquitin's mysterious cousin. Nat. Rev. Mol. Cell Biol. 2001, 2 (3), 202-10.

82. Seeler, J. S.; Dejean, A., Nuclear and unclear functions of SUMO. Nat. Rev. Mol. Cell Biol. 2003, 4 (9), 690-9.

83. Verger, A.; Perdomo, J.; Crossley, M., Modification with SUMO. A role in transcriptional regulation. EMBO Rep. 2003, 4 (2), 137-42.

84. Gill, G., SUMO and ubiquitin in the nucleus: different functions, similar mechanisms? Gene. Dev. 2004, 18 (17), 2046-59.

85. Rodriguez, M. S.; Dargemont, C.; Hay, R. T., SUMO-1 conjugation in vivo requires both a consensus modification motif and nuclear targeting. J. Biol. Chem. 2001, 276 (16), 12654-9.

86. Hietakangas, V.; Anckar, J.; Blomster, H. A.; Fujimoto, M.; Palvimo, J. J.; Nakai, A.; Sistonen, L., PDSM, a motif for phosphorylation-dependent SUMO modification. Proc. Natl. Acad. Sci. U. S. A. 2006, 103 (1), 45-50.

87. Yang, S. H.; Galanis, A.; Witty, J.; Sharrocks, A. D., An extended consensus motif enhances the specificity of substrate modification by SUMO. EMBO J. 2006, 25 (21), 5083-93.

88. Johnson, E. S., Protein modification by SUMO. Annu. Rev. Biochem. 2004, 73, 355-82.

106 89. Bohren, K. M.; Nadkarni, V.; Song, J. H.; Gabbay, K. H.; Owerbach, D., A M55V polymorphism in a novel SUMO gene (SUMO-4) differentially activates heat shock transcription factors and is associated with susceptibility to type I diabetes mellitus. J. Biol. Chem. 2004, 279 (26), 27233-8.

90. Matunis, M. J.; Coutavas, E.; Blobel, G., A novel ubiquitin-like modification modulates the partitioning of the Ran-GTPase-activating protein RanGAP1 between the cytosol and the nuclear pore complex. J. Cell Biol. 1996, 135, 1457-1470.

91. Hannoun, Z.; Greenhough, S.; Jaffray, E.; Hay, R. T.; Hay, D. C., Post- translational modification by SUMO. Toxicology 2010, 278 (3), 288-93.

92. Haas, A. L.; Bright, P. M., The immunochemical detection and quantitation of intracellular ubiquitin-protein conjugates. J. Biol. Chem. 1985, 260, 12464-73.

93. Beers, E. P.; Moreno, T. N.; Callis, J., Subcellular localization of ubiquitin and ubiquitinated proteins in Arabidopsis thaliana. J. Biol. Chem. 1992, 267, 15432-9.

94. Beers, E. P.; Callis, J., Utility of polyhistidine-tagged ubiquitin in the purification of ubiquitin-protein conjugates and as an affinity ligand for the purification of ubiquitin- specific hydrolases. J. Biol. Chem. 1993, 268 (29), 21645-9.

95. Peng, J.; Schwartz, D.; Elias, J. E.; Thoreen, C. C.; Cheng, D.; Marsischky, G.; Roelofs, J.; Finley, D.; Gygi, S. P., A proteomics approach to understanding protein ubiquitination. Nat. Biotechnol. 2003, 21, 921-926.

96. Zhou, W.; Ryan, J. J.; Zhou, H., Global Analyses of Sumoylated Proteins in Saccharomyces cerevisiae: induction of protein sumoylation by cellular stresses. J. Biol. Chem. 2004, 279, 32262-32268.

97. Denison, C.; Rudner, A. D.; Gerber, S. A.; Bakalarski, C. E.; Moazed, D.; Gygi, S. P., A proteomic strategy for gaining insights into protein sumoylation in yeast. Mol. Cell Proteom. 2005, 4 (3), 246-54.

98. Wohlschlegel, J. A.; Johnson, E. S.; Reed, S. I.; Yates, J. R., 3rd, Improved identification of SUMO attachment sites using C-terminal SUMO mutants and tailored protease digestion strategies. J. Proteom. Res. 2006, 5 (4), 761-70.

99. Knuesel, M.; Cheung, H. T.; Hamady, M.; Barthel, K. K. B.; Liu, X., A method of mapping protein sumoylation sites by mass spectrometry using a modified small ubiquitin-like modifier 1 (SUMO-1) and a computational program. Mol. Cell Proteom. 2005, 4, 1626-1636.

100. Blomster, H. A.; Imanishi, S. Y.; Siimes, J.; Kastu, J.; Morrice, N. A.; Eriksson, J. E.; Sistonen, L., In vivo identification of sumoylation sites by a signature tag and cysteine-targeted affinity purification. J. Biol. Chem. 2010, 285 (25), 19324-9.

107 101. Galisson, F.; Mahrouche, L.; Courcelles, M.; Bonneil, E.; Meloche, S.; Chelbi- Alix, M. K.; Thibault, P., A novel proteomics approach to identify SUMOylated proteins and their modification sites in human cells. Mol. Cell Proteom. 2011, 10 (2), M110 004796.

102. Cooper, H. J.; Heath, J. K.; Jaffray, E.; Hay, R. T.; Lam, T. T.; Marshall, A. G., Identification of sites of ubiquitination in proteins: a fourier transform ion cyclotron resonance mass spectrometry approach. Anal. Chem. 2004, 76 (23), 6982-8.

103. Cooper, H. J.; Tatham, M. H.; Jaffray, E.; Heath, J. K.; Lam, T. T.; Marshall, A. G.; Hay, R. T., Fourier transform ion cyclotron resonance mass spectrometry for the analysis of small ubiquitin-like modifier (SUMO) modification: identification of lysines in RanBP2 and SUMO targeted for modification during the E3 autoSUMOylation reaction. Anal. Chem. 2005, 77 (19), 6310-9.

104. Chung, T. L.; Hsiao, H. H.; Yeh, Y. Y.; Shia, H. L.; Chen, Y. L.; Liang, P. H.; Wang, A. H.; Khoo, K. H.; Shoei-Lung Li, S., In vitro modification of human centromere protein CENP-C fragments by small ubiquitin-like modifier (SUMO) protein: definitive identification of the modification sites by tandem mass spectrometry analysis of the isopeptides. J. Biol. Chem. 2004, 279 (38), 39653-62.

105. Pedrioli, P. G. A.; Raught, B.; Zhang, X.-D.; Rogers, R.; Aitchison, J.; Matunis, M.; Aebersold, R., Automated identification of SUMOylation sites using mass spectrometry and SUMmOn pattern recognition software. Nat. Methods 2006, 3, 533-539.

106. Jeram, S. M.; Srikumar, T.; Zhang, X. D.; Anne Eisenhauer, H.; Rogers, R.; Pedrioli, P. G.; Matunis, M.; Raught, B., An improved SUMmOn-based methodology for the identification of ubiquitin and ubiquitin-like protein conjugation sites identifies novel ubiquitin-like protein chain linkages. Proteomics 2010, 10 (2), 254-65.

107. Hsiao, H. H.; Meulmeester, E.; Frank, B. T.; Melchior, F.; Urlaub, H., "ChopNSpice," a mass spectrometric approach that allows identification of endogenous small ubiquitin-like modifier-conjugated peptides. Mol. Cell Proteom. 2009, 8 (12), 2664- 75.

108. Griffith, W. P.; Cotter, R. J. In A Universal Method for the Determination of Posttranslational Modification by the Small Ubiquitin-related Modifier, SUMO, 54th ASMS Conference on Mass Spectrometry and Applied Topics, Seattle, Washington, Seattle, Washington, 2006.

109. Baldwin, M. A.; Falick, A. M.; Gibson, B. W.; Prusiner, S. B.; Stahl, N.; Burlingame, A. L., Tandem mass spectrometry of peptides with N-terminal glutamine. Studies on a prion protein peptide. J. Am. Chem. Soc. Mass Spectrom. 1990, 1, 258-64.

110. Bollag, D. M.; Rozycki, M. D.; Edelstein, S. J., Protein Methods. 2, illustrated ed.; Wiley-Liss: New York, NY, 1996.

108 111. Dumont, Q.; Donaldson, D. L.; Griffith, W. P., Screening method for isopeptides from small ubiquitin-related modifier-conjugated proteins by ion mobility mass spectrometry. Anal. Chem. 2011, 83, 9638-42.

112. Ruotolo, B. T.; Verbeck, G. F. t.; Thomson, L. M.; Woods, A. S.; Gillig, K. J.; Russell, D. H., Distinguishing between phosphorylated and nonphosphorylated peptides with ion mobility-mass spectrometry. J. Proteom. Res. 2002, 1 (4), 303-6.

113. Huttner, W. B., Tyrosine sulfation and the secretory pathway. Annu. Rev. Phys. 1988, 50, 363-76.

114. Nicholas, H. B., Jr.; Chan, S. S.; Rosenquist, G. L., Reevaluation of the determinants of tyrosine sulfation. Endocrine 1999, 11, 285-292.

115. Huttner, W. B.; Baeuerle, P. A., Protein sulfation on tyrosine. Mod. Cell Biol. 1988, 6, 97-140.

116. Bettelheim, F. R., Tyrosine-O-sulfate in a peptide from fibrinogen. J. Am. Chem. Soc. 1954, 76, 2838-9.

117. Liu, M.-C.; Yu, S.; Sy, J.; Redman, C. M.; Lipmann, F., Tyrosine sulfation of proteins from the human hepatoma cell line HepG2. Proc. Natl. Acad. Sci. U. S. A. 1985, 82, 7160-4.

118. Stone, M. J.; Chuang, S.; Hou, X.; Shoham, M.; Zhu, J. Z., Tyrosine sulfation: an increasingly recognised post-translational modification of secreted proteins. New Biotechnol. 2009, 25 (5), 299-317.

119. Moore, K. L., The Biology and Enzymology of Protein Tyrosine O-Sulfation. J. Biol. Chem. 2003, 278 (27), 24243-24246.

120. Kehoe, J. W.; Bertozzi, C. R., Tyrosine sulfation: a modulator of extracellular protein-protein interactions. Chem. Biol. 2000, 7, R57-R61.

121. Monigatti, F.; Hekking, B.; Steen, H., Protein sulfation analysis—A primer. Biochim. Biophys. Acta 2006, 1764 (12), 1904-1913.

122. Seibert, C.; Sakmar, T. P., Toward a framework for sulfoproteomics: synthesis and characterization of sulfotyrosine-containing peptides. Biopolymers 2008, 90, 459- 477.

123. Huttner, W. B., Determination and occurrence of tyrosine O-sulfate in proteins. Method. Enzymol. 1984, 107, 200-23.

109 124. Yagami, T.; Kitagawa, K.; Futaki, S., Liquid secondary-ion mass spectrometry of peptides containing multiple tyrosine-O-sulfates. Rapid Comm. Mass Spectrom. 1995, 9 (14), 1335-41.

125. Reinders, J.; Sickmann, A., State-of-the-art in phosphoproteomics. Proteomics 2005, 5, 4052-4061.

126. Marcus, K.; Moebius, J.; Meyer, H. E., Differential analysis of phosphorylated proteins in resting and -stimulated human . Anal. Bioanal. Chem. 2003, 376, 973-993.

127. Hunter, T., Signaling - 2000 and beyond. Cell 2000, 100, 113-127.

128. Audette, G. F.; Engelmann, R.; Hengstenberg, W.; Deutscher, J.; Hayakawa, K.; Quail, J. W.; Delbaere, L. T. J., The 1.9 Å resolution structure of phospho-serine 46 HPr from Enterococcus faecalis. J. Mol. Biol. 2000, 303, 545-553.

129. Hunter, T., Oncoprotein networks. Cell 1997, 88, 333-46.

130. Boersema, P. J.; Mohammed, S.; Heck, A. J. R., Phosphopeptide fragmentation and analysis by mass spectrometry. J. Mass Spectrom. 2009, 44 (6), 861-878.

131. Medzihradszky, K. F.; Darula, Z.; Perlson, E.; Fainzilber, M.; Chalkley, R. J.; Ball, H.; Greenbaum, D.; Bogyo, M.; Tyson, D. R.; Bradshaw, R. A.; Burlingame, A. L., O-sulfonation of serine and threonine: Mass spectrometric detection and characterization of a new posttranslational modification in diverse proteins throughout the eukaryotes. Mol. Cell Proteom. 2004, 3, 429-440.

132. Ruotolo, B. T.; Gillig, K. J.; Woods, A. S.; Egan, T. F.; Ugarov, M. V.; Schultz, J. A.; Russell, D. H., Analysis of phosphorylated peptides by ion mobility-mass spectrometry. Anal. Chem. 2004, 76, 6727-6733.

133. Larsen, M. R.; Thingholm, T. E.; Jensen, O. N.; Roepstorff, P.; Jorgensen, T. J. D., Highly selective enrichment of phosphorylated peptides from peptide mixtures using titanium dioxide microcolumns. Mol. Cell Proteom. 2005, 4, 873-886.

134. Drake, S. K.; Hortin, G. L., Improved detection of intact tyrosine sulfate- containing peptides by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry in linear negative ion mode. Int. J. Biochem. Cell Biol. 2010, 42, 174-179.

Appendix A

Tandem MS of the linear SUMO-2 peptides

Figure A-1: MS/MS of the linear SUMO-2 peptide 535.26 (+2). The confirmed peptide is TENNDHINL. The peaks labeled with a red star (*) correspond to the neutral loss of NH3 (− 17 Da) from a b fragment.

Figure A-2: MS/MS of the linear SUMO-2 peptide 599.35 (+2). The confirmed peptide is TENNDHINLK.

Figure A-3: MS/MS of the linear SUMO-2 peptide 617.84 (+2). The confirmed peptide is VAGQDGSVVQFK. The peaks with blue labels correspond to fragments from the isobaric peptide KVAGQDGSVVQF.

Figure A-4: MS/MS of the linear SUMO-2 peptide 749.87 (+2). The only possible peptide for the partial sequence obtained and precursor mass is DGQPINETDTPAQL.

Figure A-5: MS/MS of the linear SUMO-2 peptide 823.41 (+2). The only possible peptide for the partial sequence obtained and precursor mass is FDGQPINETDTPAQL.

Figure A-6: MS/MS of the linear SUMO-2 peptide 901.46 (+2). The only possible peptide for the partial sequence obtained and precursor mass is RFDGQPINETDTPAQL.

Figure A-7: MS/MS of the linear SUMO-2 peptide 1106.58 (+1). The only possible peptide for the partial sequence obtained and precursor mass is VAGQDGSVVQF. The peaks labeled with a red star (*) or a green circle (°) correspond to the neutral loss of NH3 (− 17 Da) or H2O (− 18 Da), respectively, from a b fragment.

Figure A-8: MS/MS of the linear SUMO-2 peptide 1342.57 (+1). The only possible peptide for the partial sequence obtained and precursor mass is EMEDEDTIDVF.

Appendix B

Tandem MS and simulated isotopic distributions of SUMO-2 isopeptides

Figure B-1: MS/MS spectrum of the – 17 Da trypsin SUMO-2 isopeptide (peptide 2 in Table 5.2). The b’n fragments originating from the SUMO tag were clearly identified.

Figure B-2: Comparison of the experimental (blue solid trace) and simulated (red dashed trace) isotopic distributions for isopeptide 1 in Table 5.2

Figure B-3: Comparison of the experimental (blue solid trace) and simulated (red dashed trace) isotopic distributions for peptide 3 in Table 5.2

Figure B-4: Comparison of the actual (blue solid trace) and simulated (red dashed trace) isotopic distributions for peptide 4 in Table 5.2

Appendix C

ESI-MS and MALDI-MS data for Sp100 and RanGAP1 fragments

Figure C-1: MS scan of the in vitro SUMOylation of the GST-tagged Sp100 fragment by SUMO-1 in denaturing solution. No peak was detected for the Sp100 peptides or modified peptides.

Figure C-2: MALDI-MS of the trypsin digested GST-tagged Sp100 commercial fragment. MASCOT search gave a significant hit for the GST tag. Manual assignment of the experimental peaks did not lead to matching of any Sp100 peptides.

Figure C-3: MS scan of the in vitro SUMOylation of the GST-tagged RanGAP1 fragment by SUMO-1 in denaturing solution. No peak was detected for the RanGAP1 peptides, modified peptides or GST tag peptides.

Figure C-4: MALDI-MS of the trypsin digested GST-tagged RanGAP1 commercial fragment. MASCOT search did not give any significant hit. Manual assignment of the experimental peaks did not lead to matching of any RanGAP1 peptides.

Appendix D

α-casein peptides identified

Table D.1: List of the major peptides detected matching the masses of an in silico generated list of α-casein peptides. These peptides are indicated on the mass spectrum in Figure 6-1. m/z: experimentally measured m/z value of peptide ion; z: charge of peptide ion; Mcalculated: neutral mass of peptide calculated from the monoisotopic experimentally measured m/z value; Mtheoretical: neutral mass of peptide calculated from sequence; ΔM: monoisotopic mass error of measurement. MS/MS data for the confirmed peptides are presented in Figures D-1 to D-5.

#    Sequence position   m/z         Mcalculated   Mtheoretical   ΔM       Modification site(s)
1    80-83               525.3178    525.3178      525.3144       0.0034
2    135-139             615.3276    615.3276      615.3283       0.0007
3    52-57               689.3938    689.3938      689.3828       0.011
4    209-214             748.3659    748.3659      748.3698       0.0039
5    99-105              831.3517    831.3517      831.3843       0.0326
6    140-147             910.4473    910.4473      910.4741       0.0268
7    106-115             1267.6862   1267.6862     1267.7045      0.0183
8    95-105              1337.6675   1337.6675     1337.6808      0.0133
9    38-49               1384.713    1384.713      1384.73        0.017
10   121-134             1580.8058   1580.8058     1580.8279      0.0221
11   121-134             1660.7444   1660.7444     1660.7942      0.0498   S130
12   23-37               1759.9032   1759.9032     1759.945       0.0418
13   58-73               1927.6322   1927.6322     1927.6916      0.0594   S61, S63
14   119-134             1951.8817   1951.8817     1951.9525      0.0708   S130
15   148-166             2316.0415   2316.0415     2316.1369      0.0954
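
The Mtheoretical and ΔM columns can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the procedure used in this work: it assumes standard monoisotopic residue masses, and the function names (neutral_mass, mz, mass_error) are hypothetical. It uses the confirmed peptide LHSMK (row 2, Figure D-1) as the example. The tabulated theoretical value for this peptide appears to be reproduced when one proton is added to the neutral mass, so the example is computed for the singly protonated species.

```python
# Standard monoisotopic residue masses in Da (assumed values, not taken from the thesis).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of one charge-carrying proton


def neutral_mass(sequence):
    """Monoisotopic neutral mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence, z=1):
    """m/z of the [M + zH]z+ ion of an unmodified linear peptide."""
    return (neutral_mass(sequence) + z * PROTON) / z


def mass_error(measured, theoretical):
    """Absolute difference between measured and theoretical values."""
    return abs(measured - theoretical)


theoretical = mz("LHSMK", z=1)
print(round(theoretical, 4))                        # 615.3283
print(round(mass_error(615.3276, theoretical), 4))  # 0.0007
```

Phosphorylated peptides such as rows 11, 13 and 14 would need the mass of each modification added to the residue sum; that case is not handled in this sketch.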

Figure D-1: MS/MS of the -casein peptide 615.32 (+1). The confirmed peptide is LHSMK

Figure D-2: MS/MS of the -casein peptide 748.37 (+1). The confirmed peptide is TTMPLW

Figure D-3: MS/MS of the -casein peptide 1267.69 (+1). The confirmed peptide is YLGYLEQLLR

Figure D-4: MS/MS of the -casein peptide 1384.73 (+1). The confirmed peptide is FFVAPFPEVFGK

Figure D-5: MS/MS of the -casein peptide 1759.90 (+1). The confirmed peptide is HQGLPQEVLNENLLR

Appendix E

BSA peptides identified

Table E.1: List of the major peptides detected matching the masses of an in silico generated list of bovine serum albumin peptides. These peptides are indicated on the mass spectrum in Figure 6-4. MS/MS data for the confirmed peptides are presented in Figures E-1 to E-15.

#    Sequence position   m/z         z   Mcalculated   Mtheoretical   ΔM
1    281-285             517.1953    1   517.1953      517.298        0.1027
2    101-105             545.235     1   545.235       545.3406       0.1056
3    205-209             649.2125    1   649.2125      649.3338       0.1213
4    490-495             660.2311    1   660.2311      660.3563       0.1252
5    236-241             689.2383    1   689.2383      689.3729       0.1346
6    29-34               712.2415    1   712.2415      712.3737       0.1322
7    257-263             789.3269    1   789.3269      789.4716       0.1447
8    249-256             922.3131    1   922.3131      922.488        0.1749
9    161-167             927.325     1   927.325       927.4934       0.1684
10   37-44               974.2853    1   974.2853      974.4578       0.1725
11   549-557             1014.4284   1   1014.4284     1014.6194      0.191
12   548-557             1142.4969   1   1142.4969     1142.7143      0.2174
13   66-75               1163.416    1   1163.416      1163.6307      0.2147
14   35-44               1249.3939   1   1249.3939     1249.6121      0.2182
15   402-412             1305.4741   1   1305.4741     1305.7161      0.242
16   421-433             1479.5266   1   1479.5266     1479.7954      0.2688
17   438-451             1511.5498   1   1511.5498     1511.8428      0.293
18   347-359             1567.4448   1   1567.4448     1567.7427      0.2979
19   437-451             1639.6365   1   1639.6365     1639.9377      0.3012

Figure E-1: MS/MS of the BSA peptide 517.34 (+1). The confirmed peptide is ADLAK

Figure E-2: MS/MS of the BSA peptide 545.38 (+1). The confirmed peptide is VASLR

Figure E-3: MS/MS of the BSA peptide 649.38 (+1). The confirmed peptide is IETMR

Figure E-4: MS/MS of the BSA peptide 660.40 (+1). The confirmed peptide is TPVSEK

Figure E-5: MS/MS of the BSA peptide 689.43 (+1). The confirmed peptide is AWSVAR

Figure E-6: MS/MS of the BSA peptide 712.42 (+1). The confirmed peptide is SEIAHR

Figure E-7: MS/MS of the BSA peptide 789.53 (+1). The confirmed peptide is LVTDLTK

Figure E-8: MS/MS of the BSA peptide 922.56 (+1). The confirmed peptide is AEFVEVTK

Figure E-9: MS/MS of the BSA peptide 927.57 (+1). The confirmed sequence is …YEIAR. The only possible peptide for the precursor mass is YLYEIAR

Figure E-10: MS/MS of the BSA peptide 974.53 (+1). The confirmed peptide is DLGEEHFK

Figure E-11: MS/MS of the BSA peptide 1014.71 (+1). The confirmed peptide is QTALVELLK. The peaks labeled with a red star (*) correspond to the neutral loss of NH3 (− 17 Da) from a b fragment.

Figure E-12: MS/MS of the BSA peptide 1163.73 (+1). The confirmed peptide is LVNELTEFAK

Figure E-13: MS/MS of the BSA peptide 1305.82 (+1). The confirmed peptide is HLVDEPQNLIK

Figure E-14: MS/MS of the BSA peptide 1479.93 (+1). The precursor ion mass and the y fragments obtained point to the peptide LGEYGFQNALIVR

Figure E-15: MS/MS of the BSA peptide 1567.89 (+1). The confirmed peptide is DAFLGSFLYEYSR
