UCLA Electronic Theses and Dissertations

Title Developing Top-Down Mass Spectrometry for Intact Protein Identification in the Chromatographic Timescale

Permalink https://escholarship.org/uc/item/16f1599t

Author Lakshmanan, Rajeswari

Publication Date 2012

Peer reviewed|Thesis/dissertation

UNIVERSITY OF CALIFORNIA

Los Angeles

Developing Top-Down Mass Spectrometry for Intact Protein

Identification in the Chromatographic Timescale

A dissertation submitted in partial satisfaction of the

requirements for the degree Doctor of Philosophy

in Biochemistry and Molecular Biology

by

Rajeswari Lakshmanan

2012

ABSTRACT OF THE DISSERTATION

Developing Top-Down Mass Spectrometry for Intact Protein

Identification in the Chromatographic Timescale

by

Rajeswari Lakshmanan

Doctor of Philosophy in Biochemistry and Molecular Biology

University of California, Los Angeles, 2012

Professor Joseph A. Loo, Chair

Protein identification by top-down mass spectrometry-based methods yields the intact mass of the proteins and indicates the presence of post-translational modifications (PTMs) and/or isoforms. Currently, the methods employed for top-down protein identification are performed using instruments with dual mass analyzers and are based on fragmenting isolated charge states, which greatly reduces the duty cycle of the instrument. High-throughput top-down methods are required for protein identification in complex sample mixtures. We demonstrate the capability to perform intact protein identifications in a single-stage time-of-flight mass spectrometer during protein elution from a liquid chromatography (LC) column. In addition, we have developed a new data-independent fragmentation method known as ‘Continuous Accumulation of Selected Ions–Collisionally Activated Dissociation’ (CASI-CAD) to fragment multiple charge states of the protein simultaneously for the purpose of identification in the LC timescale. CASI-CAD is performed without any precursor selection and thus the duty cycle of the instrument is not lowered. Both methods unambiguously identified all the proteins in the human proteasome complex used for method development. PTMs and N-terminal modifications were also characterized for the proteins in this complex.

Supercharging reagents are known for their ability to enhance the multiple charging of proteins during electrospray ionization (ESI). This improves the mass measurement accuracy and fragmentation efficiency of proteins during ESI-MS. Currently, the mechanism behind supercharging is unknown. We have analyzed different supercharging reagents under a variety of solvent conditions to probe the mechanisms behind supercharging. In addition, the supercharging ability of sulfolane was utilized for proteins eluting from a column by adding the reagent to the LC solvents. Furthermore, reagent introduction in the vapor phase increased the signal intensity for intact proteins eluting from a column when compared to experiments performed without the reagent. The methods presented here are efficient top-down means to address complex samples in the chromatographic timescale.


The dissertation of Rajeswari Lakshmanan is approved.

Catherine F. Clarke

Julian P. Whitelegge

Steven G. Clarke

Thomas G. Graeber

Joseph A. Loo, Committee Chair

University of California, Los Angeles

2012


DEDICATION

To my wonderful parents and loving husband


TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

LIST OF SUPPLEMENTAL FIGURES

ACKNOWLEDGMENTS

VITA

CHAPTER 1: Introduction

1.1. Mass spectrometry based proteomics

1.2. Top-Down and Bottom-Up MS

1.3. Protein identification and characterization by TD

1.4. A short chronological history on the development of TD methods

1.5. Mass analyzers for TD proteomics

1.5.1. FT-ICR MS

1.5.2.

1.5.3. Quadrupole time-of-flight

1.5.4. Ion trap

1.6. Intact protein fragmentation schemes

1.6.1. Collisionally activated dissociation

1.6.2. Higher energy collision dissociation

1.6.3. Electron capture dissociation

1.6.4. Electron transfer dissociation

1.6.5. Other miscellaneous fragmentation methods

1.7. Intact protein separation techniques

1.7.1. Electrophoresis

1.7.1.1. Gel-based techniques

1.7.1.2. Isoelectric focusing

1.7.1.3. Capillary electrophoresis

1.7.2. Liquid chromatography

1.7.2.1. Reversed phase chromatography

1.7.2.2. Ion-exchange chromatography

1.7.2.3. Size-exclusion chromatography

1.7.2.4. Multi-dimensional separation

1.8. TD data analyses

1.8.1. ProSight

1.8.2. BIG Mascot

1.8.3. MS-Align+

1.8.4. Precursor ion independent top-down algorithm

1.9. Supercharging

1.10. Current limitations and future directions

1.11. Summary

1.12. References

CHAPTER 2: Top-Down Protein Identification using a Single-Stage LC-ESI-TOF Mass Spectrometer

2.1. Abstract

2.2. Introduction

2.3. Experimental section

2.3.1. Sample preparation

2.3.2. LC conditions

2.3.3. MS parameters

2.3.4. Data analysis

2.4. Results and discussion

2.4.1. Optimization of fragmentor voltage

2.4.2. LC-MS of standard protein mix

2.4.3. LC-MS of h20S proteasome sample

2.4.4. Characterization of the N-terminus and PTMs in h20S proteasome subunits

2.5. Conclusions

2.6. References

CHAPTER 3: Top-Down Protein Identification of Proteasome Complex Proteins with nano LC FT-ICR MS Employing Data-Independent Fragmentation Methods

3.1. Abstract

3.2. Introduction

3.3. Experimental section

3.3.1. Sample preparation

3.3.2. LC-MS on Dionex nano LC-Bruker FT-ICR MS

3.3.3. Data analysis

3.4. Results and discussion

3.4.1. LC-MS of h20S proteasome using nLC-FT-ICR MS

3.4.2. LC-MS/MS using data-independent fragmentation methods

3.4.2.1. Protein identification using FS-CAD

3.4.2.2. Protein identification using CASI-CAD

3.4.3. Identification of co-eluting proteins

3.4.4. N-terminal characterization and PTM analyses

3.4.5. FS-CAD followed by ion enrichment using CASI

3.4.6. LC-MS of h19S complex

3.5. Conclusions

3.6. References

CHAPTER 4: Reagents for Enhancing Protein Charging for Electrospray Ionization Mass Spectrometry from Organic/Aqueous Solvent Systems

4.1. Abstract

4.2. Introduction

4.3. Experimental section

4.3.1. Sample preparation

4.3.2. Mass spectrometry

4.3.3. Calculation of average and maximum charges

4.4. Results and discussion

4.4.1. Reagents for supercharging under denaturing conditions

4.4.2. Compounds effective only in denaturing conditions

4.4.3. Effect of different solvents in the supercharging efficiency of m-nitrophenol

4.4.4. Supercharging disulfide bonded proteins

4.4.5. Supercharging high MW proteins

4.4.6. Supercharging membrane proteins

4.4.7. Supercharging peptides

4.4.8. Reagent concentration versus supercharging

4.4.9. Protein conformation during supercharging

4.5. Conclusions

4.6. References

CHAPTER 5: Enhanced Multiple Charging for Protein LC-MS in the Chromatographic Timescale Utilizing Supercharging Reagents

5.1. Abstract

5.2. Introduction

5.3. Experimental section

5.3.1. Sample preparation

5.3.2. LC conditions and MS parameters

5.3.2.1. Supercharging during LC-MS by adding reagents directly to LC solvents

5.3.2.2. Supercharging during LC-MS by delivering reagent vapors

5.3.3. Calculation of average and maximum charges

5.4. Results and discussion

5.4.1. Supercharging proteins during LC-MS with sulfolane in the LC solvent

5.4.2. Supercharging proteins by spotting reagents in the CaptiveSpray air filter

5.4.3. Supercharging during LC timescale using reagent vapors

5.5. Conclusions

5.6. References

CHAPTER 6: Conclusions and Future Directions

6.1. Overview

6.2. Top-Down protein identification using a single-stage LC-ESI-TOF MS

6.3. Top-Down protein identification using data-independent fragmentation in FT-ICR MS

6.4. Supercharging reagents for enhancing charge during protein elution from HPLC

6.5. Concluding remarks

APPENDIX TO THE DISSERTATION


LIST OF FIGURES

Figure 2.1. Schematic representation of the top-down identification of proteasome complex proteins using LC-ESI-TOF MS

Figure 2.2. Mass spectra of BSA samples at different fragmentor voltages to compare the efficiency of fragmentation

Figure 2.3. Homology between the N-terminal 40 amino acid residues in albumins from different species

Figure 2.4. Nozzle-skimmer dissociation of albumins from different species

Figure 2.5. Fragment ion assignments in mass spectra of albumins from different species

Figure 2.6. TIC of a standard 7 protein mix run through C3 column

Figure 2.7. Fragmentation of proteins in the standard 7 protein mix at high NS voltage during elution from the column

Figure 2.8. Nozzle-skimmer dissociation of superoxide dismutase C during its elution from the LC column

Figure 2.9. TIC of h20S sample run through C3 column

Figure 2.10. Nozzle-skimmer dissociation of β6 subunit during its elution from the LC column

Figure 2.11. TIC of h20S sample with the peaks labeled with the corresponding h20S subunits eluting in it

Figure 2.12. Intact mass spectra of all the proteins in the h20S complex identified by LC separation followed by NSD

Figure 2.13. Deconvoluted mass spectrum of the α3 subunit of h20S proteasome

Figure 3.1. Schematic representation of the top-down protein identification using nano LC FT-ICR MS

Figure 3.2. TIC of h20S complex run through RP monolithic column

Figure 3.3. Intact mass spectra of h20S proteins averaged from peaks in the TIC in Figure 3.2

Figure 3.4. Schematic representation of the Bruker 15T FT-ICR MS

Figure 3.5. FS-CAD of α4 protein

Figure 3.6. CASI-CAD of ubiquitin

Figure 3.7. Fragmentation map of α5 protein identified with CASI-CAD using different ion accumulation times

Figure 3.8. CASI-CAD mass spectra of α5 subunit obtained with different ion accumulation times

Figure 3.9. CASI-CAD of α4 protein

Figure 3.10. MaxEnt deconvoluted zero-charge mass spectrum of α3 protein

Figure 3.11. Comparison of sequence coverage for α4 protein by FS-CAD and FS-CAD with CASI

Figure 3.12. Intact mass spectrum of Rpn1, a 100.2 kDa protein in the h19S complex

Figure 4.1. Supercharging insulin with different reagents in denaturing conditions

Figure 4.2. Supercharging disulfide-intact lysozyme with different reagents in denaturing conditions

Figure 4.3. Supercharging disulfide-reduced lysozyme with sulfolane in denaturing conditions

Figure 4.4. Supercharging disulfide-intact lysozyme with m-nitrophenol in denaturing and native solvent conditions

Figure 4.5. Supercharging disulfide-intact lysozyme with m-nitrophenol in different solvent systems

Figure 4.6. Supercharging β-lactoglobulin A and α-lactalbumin with sulfolane

Figure 4.7. Supercharging RNase A and RNase B in positive and negative modes with sulfolane

Figure 4.8. Supercharging transferrin with different concentrations of sulfolane

Figure 4.9. Comparing nozzle-skimmer dissociation of transferrin with and without sulfolane

Figure 4.10. Supercharging BSA with sulfolane

Figure 4.11. Supercharging IgG1 with m-nitrophenyl ethanol

Figure 4.12. Supercharging bacterioopsin with sulfolane in aqueous conditions

Figure 4.13. Supercharging bacterioopsin with sulfolane in denaturing conditions

Figure 4.14. Supercharging bacterioopsin with sulfolane in chloroform/methanol/1% FA

Figure 4.15. Supercharging bradykinin with sulfolane in denaturing conditions

Figure 4.16. Plots of reagent concentration versus average and maximum charge

Figure 4.17. Ion mobility spectrometry analysis of transferrin in denaturing conditions

Figure 4.18. Ion mobility spectrometry analysis of lysozyme in denaturing conditions

Figure 4.19. Ion mobility spectrometry analysis of lysozyme in native solvent conditions

Figure 4.20. Mass spectra of lysozyme acquired during drift time analysis

Figure 4.21. Ion mobility arrival time distributions for selected charge states of lysozyme

Figure 5.1. Supercharging reagent delivery through CaptiveSpray air filter

Figure 5.2. Supercharging reagent delivery in the vapor phase using 100% ACN

Figure 5.3. Supercharging during LC-MS with sulfolane in the LC solvent

Figure 5.4. Mass spectra of RNase A with sulfolane in the LC solvent

Figure 5.5. Mass spectra of BSA with sulfolane in the LC solvent

Figure 5.6. Mass spectra of lysozyme with supercharging reagents spotted in the air filters

Figure 5.7. Supercharging during LC-MS with sulfolane delivery in the vapor phase

Figure 5.8. Mass spectra of RNase A with sulfolane delivery in the vapor phase

Figure 5.9. Mass spectra of ubiquitin with sulfolane delivery in the vapor phase

Figure 5.10. Mass spectra of myoglobin with sulfolane delivery in the vapor phase


LIST OF TABLES

Table 2.1. ESI bn product ion m/z for sheep albumin

Table 2.2. Summary of h20S proteasome subunits identified by LC separation followed by NSD

Table 3.1. Summary of h20S subunits identified by FS-CAD, CASI-CAD, and FS-CAD with CASI techniques

Table 3.2. Summary of mass accuracies for the proteins in the h20S complex

Table 4.1. Summary of average and maximum charges acquired by supercharging different proteins with various reagents

Table 4.2. Summary of average and maximum charges acquired by supercharging high MW proteins and peptides with different reagents


LIST OF SUPPLEMENTAL FIGURES

Supplemental Figure 4.1. Supercharging calmodulin with sulfolane

Supplemental Figure 4.2. Supercharging actin with sulfolane

Supplemental Figure 4.3. Supercharging lysozyme with increasing concentrations of sulfolane in denaturing conditions

Supplemental Figure 4.4. Supercharging lysozyme with increasing concentrations of m-NBA in denaturing conditions

Supplemental Figure 4.5. Supercharging lysozyme with increasing concentrations of m-nitrophenyl ethanol in denaturing conditions

Supplemental Figure 4.6. Supercharging lysozyme with increasing concentrations of m-nitrophenol in denaturing conditions


ACKNOWLEDGMENTS

I would first like to thank my mentor, Dr. Joe Loo, for the support and encouragement through these years. Joe – The passion I have for mass spectrometry today is truly because of the opportunity to do graduate research under your mentorship. I have learnt a lot from the one-on-one meetings with you. I have always felt more focused, confident, and encouraged after those meetings. Thank you for giving me an opportunity to maintain the Agilent ESI-TOF instrument. It gave me a wonderful opportunity to network with many Agilent scientists and learn from working with them. I thoroughly enjoyed working with you and Rachel on the top-down and supercharging projects.

Many thanks to Dr. Rachel Loo – It was a great learning experience when I worked on the supercharging project with you. Thank you for all the time you spent going over (and over) the theories behind supercharging and explaining the chemistry behind everything. I’m really looking forward to all your supercharging theories and their impact in this field.

I would like to thank my former and current committee members: Dr. James Gober, Dr. Jon Fukuto, Dr. Catherine Clarke, and Dr. Thomas Graeber. Thank you for providing wonderful comments during my orals and midstream. Thank you to Dr. Steve Clarke for giving me an opportunity to rotate in his lab and also for kindly agreeing to serve on my committee.

To Dr. Julian Whitelegge – Thank you for providing guidance when I was working with membrane proteins and, more recently, on the top-down project. All your suggestions were very valuable and really helpful. And thank you for giving me access to the ProSight server and also for gladly agreeing to serve on my committee when I approached you recently.


Many thanks to Agilent scientist Dr. Andy Gieschen for help with ESI-TOF and chromatography. Thank you to Dr. Jeremy Wolff and Dr. Guillaume Tremintin at Bruker for answering my questions during method development on the FT-ICR instrument.

Thank you to Dr. Tom Drake and Dr. James LeBlanc for all the help and support when we collaborated on the brain injury project.

Thank you to Dr. Jane Strouse for all the help during Agilent instrument installation and when I was teaching Chem184.

‘MIC’ family: To Dr. Melissa Sondej – Thank you for both scientific and non-scientific advice, for listening to all my frustrations and making me feel better in the end, for helping me network with people in the industry, and for encouraging me constantly. To Rudy Alvarado – Working with you was always fun and enjoyable, and thank you for spending your time troubleshooting the Dionex nanoLC with me. To Dr. Greg Khitrov – Thank you for answering all the quick questions about the LC columns and for general instrument-related help.

‘Loo Lab’ family: To Dr. Jimmy Ytterberg – Thank you for teaching me the basics of mass spectrometry. The brain injury project went up to the point of publication just because of the sample prep techniques you had taught me. To Dr. Sabrina Benchaar – Top-down and supercharging were the passions we shared when we worked together. I can’t thank you enough for the time you spent during ASMS when I was practicing for my oral presentation. To Dr. Jonathan Erde – Thank you for answering my questions related to bio-informatics and databases.


Heartfelt thanks to all the great women scientists I made friends with at UCLA: Dr. Cecilia Zurita Lopez, Dr. Deborah Francoleon, Dr. Cathy Kaddis, Dr. Shirley Lomeli, and Hong Nguyen. You gals are wonderful in the perspective you have for life and science.

To my lifetime friends Charumathi Anbalagan and Porkodi Muthu – We made a great ‘trio’ and shared a wonderful friendship during our Master’s degree. You gals were the sole reason for my decision to go into research. I can’t forget our group studies and the prep for GRE that we did in rain and shine.

Thank you to Kay Andiappan and Jeya Sridharan for being very supportive and encouraging. Thanks for calling/emailing and telling me you were thinking about me. You both have inspired me in various ways.

All that I am today is because of my mom and dad’s love and support. Amma – You are my greatest inspiration. The interest I have towards learning is because of the seeds that you sowed during my childhood. Daddy – Thank you for all the help and for supporting me financially in everything until my Master’s degree. You never said ‘no’ to anything I asked of you. You both have always celebrated my academic achievements so much that it made me progress more. To my sister, Seva Lakshmanan – Thank you so much for your love and the ways you show it.

To my husband, Athappan – Thanks for all the love, care, support, and most of all, having the same amount of interest as I have towards my Ph.D. I’m indebted to you for all the things that you did during these 6 years, especially for taking care of the kids single-handedly so many times during ASMS and recently, when I wrote my thesis.


To my kids – Though you guys made this Ph.D. an uphill journey, I can’t imagine this without you two. Thank you for making me feel that being loved is the most wonderful thing in the world!

Finally, I would like to formally acknowledge Springer Publishing Group for providing copyright permission to reprint the Neurocritical Care article, “Metabolic Crisis After Traumatic Brain Injury is Associated with a Novel Microdialysis Proteome”, published in Jan. 2010, in the Appendix to my dissertation.


VITA

1999    Bachelor of Science in Biochemistry, University of Madras, Chennai, India

2001    Master of Science in Biochemistry, University of Madras, Chennai, India

2007-2011    Teaching Assistant, Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, CA

2007-2012    Graduate Research Assistant, Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, CA

SCHOLARSHIPS AND AWARDS

2001 Received two national level merit scholarships (Graduate Aptitude Test in Engineering and Indian Council of Medical Research – Junior Research Fellowship)

1999 Received award for academic excellence in Bachelor’s degree in Biochemistry

PUBLICATIONS AND PRESENTATIONS

Journal Articles

1. Lakshmanan, R.; Loo, J. A.; Drake, T.; Leblanc, J.; Ytterberg, A. J.; McArthur, D. L.; Etchepare, M.; Vespa, P. M. Metabolic Crisis after Traumatic Brain Injury is Associated with a Novel Microdialysis Proteome. Neurocritical Care 2010, 12(3), 324-36.

2. Prabhudesai, S.; Sinha, S.; Attar, A.; Kotagiri, A.; Fitzmaurice, A. G.; Lakshmanan, R.; Ivanova, M. I.; Loo, J. A.; Klärner, F. G.; Schrader, T.; Stahl, M.; Bitan, G.; Bronstein, J. M. A Novel "Molecular Tweezer" Inhibitor of α-Synuclein Neurotoxicity in vitro and in vivo. Neurotherapeutics 2012, 9(2), 464-476.

3. Liu, Y.; Miao, Z.; Lakshmanan, R.; Loo, R. O.; Loo, J. A.; Chen, H. Signal and Charge Enhancement for Protein Analysis by Liquid Chromatography-Mass Spectrometry with Desorption Electrospray Ionization. International Journal of Mass Spectrometry 2012, 325-327, 161-166.


Oral Presentations

1. Top-Down Mass Spectrometry of Proteasome Complex Proteins with nLC FT-ICR MS. Lakshmanan, R.*; Alvarado, R.; Wolff, J.; Loo, J. A. Intact Proteins Session at the 60th Annual ASMS Conference on Mass Spectrometry and Allied Topics, Vancouver, BC, Canada, May 2012.

2. Factors that Promote ESI Multiple Charging for Proteins. Loo, J. A.*; Lomeli, S. H.; Benchaar, S. A.; Peng, I. X.; Lakshmanan, R.; Ferguson, C. N.; Loo, R. O. Multiple Charging in Mass Spectrometry Session at the 58th Annual ASMS Conference on Mass Spectrometry and Allied Topics, Salt Lake City, UT, May 2010.

3. Top-Down Mass Spectrometry of Charge-Enhanced Proteins and Protein-Ligand Complexes. Benchaar, S. A.; Lakshmanan, R.; Yin, S.; Loo, R. O.; Loo, J. A.* Pacifichem Conference, Honolulu, HI, Dec. 2010.

Poster Presentations

1. Supercharging and Subcharging Proteins and Protein Complexes to Explore Electrospray Ionization. Loo, R. O.; Lakshmanan, R.; Zhang, J.; Loo, J. A. 60th Annual ASMS Conference on Mass Spectrometry and Allied Topics, Vancouver, BC, Canada, May 2012.

2. Reagents for Supercharging Proteins in ESI-MS from Organic/Aqueous Solvent Systems. Lakshmanan, R.; Benchaar, S. A.; Peng, I. X.; Ferguson, C. N.; Lomeli, S. H.; Zhang, J.; Loo, R. O.; Loo, J. A. 59th Annual ASMS Conference on Mass Spectrometry and Allied Topics, Denver, CO, June 2011.

3. Effects of Polypeptide Sequence on Supercharging for ESI-MS and MS/MS. Alvarado, R.; Lakshmanan, R.; Loo, R. O.; Loo, J. A. 59th Annual ASMS Conference on Mass Spectrometry and Allied Topics, Denver, CO, June 2011.

4. Top-Down MS of Protein Complexes using In-Source Fragmentation and Supercharging: LC-MSE Approach using TOF MS. Lakshmanan, R.; Peng, I. X.; Ferguson, C. N.; Loo, R. O.; Loo, J. A. 58th Annual ASMS Conference on Mass Spectrometry and Allied Topics, Salt Lake City, UT, May 2010.

5. Top-Down and Bottom-Up based Proteomic Techniques Utilizing Mass Spectrometry for Identification of Candidate Biomarkers for Brain Trauma. Lakshmanan, R.; Ytterberg, Y. J.; Drake, T. A.; Vespa, P. M.; Loo, J. A.; Leblanc, J. F. 2nd Annual Mass Spectrometry Applications to the Clinical Lab Conference, San Diego, CA, Feb. 2009.

6. Proteomic Biomarker Candidates of Traumatic Brain Injury. Lakshmanan, R.; Leblanc, J. F.; Loo, J. A.; Drake, T. A.; Vespa, P. M. Stein/Oppenheimer Endowment Award Annual Poster Presentation Event, Los Angeles, CA, Sept. 2008.


CHAPTER 1

Introduction

1.1. Mass spectrometry based proteomics

Proteins play a crucial role in defining cellular structure, assembly, and function in all organisms. Complexity at the protein level is increased by co- and post-translational events,1 and the chemical modifications that take place in proteins are not directly evident at the gene level.2 Additionally, the protein milieu varies between cell types and is dynamic, since it responds to environmental changes.3, 4 As a result, the analysis of proteins is challenging and necessitates technologies with high throughput, specificity, and dynamic range. Mass spectrometry (MS) is one such tool, and its tremendous growth in the field of proteomics over the last decade confirms its significance. The breadth of MS applications, from the well-known biomarker field to more remote areas such as bio-defense5 and phylogenetic characterization of microbes,6 reflects the versatility and maturity of the technique. Compared with other protein analysis techniques, the resolution of MS-based analysis, which extends down to the amino acid level, makes it an efficient tool for protein characterization.7

1.2. Top-Down and Bottom-Up MS

Depending on whether intact proteins or their constituent peptides are analyzed, MS-centered proteomic approaches can be categorized as Top-Down (TD) or Bottom-Up (BU) methodologies. In the TD technique, proteins are analyzed directly by MS without any experimental proteolytic processing, and this workflow ensures that the intact mass of the molecule being studied is obtained.8 BU methods incorporate a highly specific enzymatic cleavage step prior to MS analysis to break the proteins into their peptide components.9 The peptides generated can be analyzed in two ways. In ‘Peptide Mass Fingerprinting’ (PMF), only the masses of the peptides are measured and matched to in-silico peptide masses generated with the same cleavage specificity. In ‘shotgun proteomics’, both the masses of the peptides and the fragments generated by dissociating the peptides in the gas phase are measured and searched against the database to find the proteins.10 While PMF is suitable for targeted samples and is performed in conventional instruments, the shotgun methodology is more widely applied to complex mixtures and is performed using instruments with high sensitivity and good mass accuracy.11
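As an illustration of the PMF matching step described above, the following Python sketch performs an in-silico digestion and compares the resulting peptide masses with measured masses. It is a minimal sketch rather than the workflow used in this work: it assumes trypsin-like specificity (cleavage C-terminal to K or R, except before P), monoisotopic masses, no missed cleavages, and no modifications. The function names and the 20 ppm tolerance are hypothetical choices made for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def match_masses(observed_masses, sequence, tol_ppm=20.0):
    """Pair each in-silico peptide with the observed masses within tol_ppm."""
    hits = []
    for peptide in tryptic_peptides(sequence):
        theoretical = peptide_mass(peptide)
        for observed in observed_masses:
            error_ppm = abs(observed - theoretical) / theoretical * 1e6
            if error_ppm <= tol_ppm:
                hits.append((peptide, observed))
    return hits
```

For example, `match_masses([217.1426], "AKRPGK")` pairs the measured mass with the peptide AK. Real search engines additionally score such matches and allow for missed cleavages and modifications, which this sketch omits.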

A more popular version of shotgun proteomics is ‘MudPIT’ (Multi-Dimensional Protein Identification Technology), where peptides are fractionated according to their selectivity for two orthogonal stationary phases.12, 13 This is currently the method of choice and is replacing the traditional ‘GeLC-MS’, in which intact proteins are separated by gel electrophoresis and the peptides recovered from the gel spots by in-gel digestion are analyzed by tandem MS.11, 14 The general limitations of GeLC-MS are low recovery of peptides from the gel, the presence of multiple proteins in one spot, and limited dynamic range.15

Ultimately, the proteotypic peptides detected in BU experiments are searched against proteomic databases to identify the corresponding proteins. The sequence coverage is seldom 100%, since not all peptides are detected equally well, and information about labile post-translational modifications (PTMs) is not preserved. Moreover, isoforms that share the same peptides cannot be differentiated by BU methods.10 Though the term ‘isoforms’ has at times been used inadvertently in the literature to refer to PTMs, a recent review of the nomenclature has clearly defined isoforms as similar proteins originating from different genes, such as allelic genes giving rise to similar proteins.16 The terms ‘proteoforms’ or ‘protein species’ have been proposed to indicate chemical changes resulting from alternative splicing, proteolytic processing, and PTMs.17 Single nucleotide polymorphisms (SNPs) are also categorized as proteoforms, since each is technically a chemical change that takes place within the same gene.16

In spite of the drawbacks of BU proteomics, it is well-suited for cataloguing all the proteins in an organism,18, 19 and quantitative proteomics approaches are mostly based on the BU workflow.20-23 In addition, BU is heavily used in the biomarker field, where it is more important to compare disease and healthy states and to identify the differentially expressed proteins in order to trace back the gene that caused a change than to learn more about the protein itself.24 Shotgun proteomics is commonly applied in such biomarker discovery pipelines but suffers from the inherent disadvantage that isoforms and proteoforms are not efficiently identified.25 Even if PTMs are identified by using enrichment techniques in BU experiments, the localization of combinatorial modifications and the assignment of PTMs to particular isoforms cannot be determined using BU.26

Detecting the presence of the above-mentioned altered forms of proteins is critical, since there is mounting evidence for their roles as biomarkers and therapeutic targets.27 Switching to TD techniques overcomes these problems, since a disagreement between the observed and theoretical intact masses obtained in the TD workflow indicates the presence of an altered form. Moreover, TD methods also incorporate a gas phase fragmentation technique to interrogate the protein sequence, and the information acquired helps to identify and characterize the protein.
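The comparison between observed and theoretical intact masses can be sketched in Python as follows. This is an illustrative example under stated assumptions, not the analysis procedure of this work: the table of mass shifts is a small, hypothetical selection of common modifications given as monoisotopic values, the 0.02 Da tolerance is arbitrary, and combinations of modifications are not considered.

```python
# Monoisotopic mass shifts (Da) for a few common modifications.
# Illustrative selection only; a real search would use a curated list.
PTM_SHIFTS = {
    "acetylation": 42.010565,
    "phosphorylation": 79.966331,
    "methylation": 14.015650,
    "oxidation": 15.994915,
    "N-terminal Met loss": -131.040485,
}


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def explain_mass_shift(observed, theoretical, tol_da=0.02):
    """Return candidate explanations for an intact mass discrepancy."""
    delta = observed - theoretical
    if abs(delta) <= tol_da:
        return ["unmodified"]
    # Keep every single modification whose shift matches the discrepancy.
    return [name for name, shift in PTM_SHIFTS.items()
            if abs(delta - shift) <= tol_da]
```

A protein observed 42.01 Da above its theoretical mass would be flagged as a candidate for acetylation, whereas an empty list means that no single modification in the table accounts for the difference. Confirming and localizing the modification still requires the fragmentation data discussed above.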

1.3. Protein identification and characterization by TD

Though TD methods are generally regarded as a specialized workflow for studying pure proteins for the purpose of characterization, they are being increasingly used for large-scale protein identification.28 ‘Identification’ and ‘characterization’ are different types of analysis that can be performed by TD MS. The former refers to finding the comprehensive set of proteins present in a sample, although the sequence coverage obtained for each protein identified remains low, while the latter represents in-depth analysis yielding more sequence coverage.

This review will outline the development of the TD technique as a tool for targeted protein characterization and also highlight the recent improvements in the area of protein identification for samples of higher complexity. Emphasis will be given to the instrumentation, fragmentation, and intact protein separation methods that are compatible with electrospray ionization (ESI) MS. Finally, the current limitations are stated and future directions are discussed.

1.4. A short chronological history on the development of TD methods

With the development of ESI, a soft ionization technique that can ionize large, complex, and fragile entities, or simply “flying elephants” in John Fenn’s words, intact protein measurement was made possible, opening the door for TD mass spectrometry.29, 30 Since then, numerous accurate mass measurements have been reported for species as small as tripeptides to macromolecules as large as bovine albumin dimer (133 kDa), using ESI coupled to different types of mass analyzers.31 Utilization of MS for examining intact proteins advanced to the next stage when polypeptides were shown to fragment inside the mass spectrometer, so that primary sequence information could be deduced by measuring the resulting fragments.32 In this first “top-down” experiment in 1990 by Loo et al., the multiply charged ions from ribonuclease A (14 kDa) were fragmented by Collisionally Activated Dissociation (CAD) in a triple quadrupole instrument, and this approach was later extended to high molecular weight (MW) proteins such as bovine serum albumin (66 kDa).33 Feng and Konishi showed that even larger proteins such as antibodies (150 kDa) and complement component (200 kDa) can be analyzed by ESI MS,34 and they also demonstrated that these multiply charged macrosized antibody ions can be fragmented by CAD in a triple quadrupole instrument.35


Following the TD demonstrations in triple quadrupole instruments, a significant improvement in resolution and mass accuracy was achieved when Fourier Transform Ion Cyclotron Resonance (FT-ICR) instruments were employed.36, 37 High resolution is important for the accurate determination of MW of intact proteins from the ESI generated multiple charge states and also for unambiguous assignment of fragment ions during protein characterization.
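The determination of MW from ESI generated multiple charge states can be illustrated with a minimal sketch. It assumes positive-mode ions formed purely by protonation and two adjacent peaks from the same protein; the function names and the example mass are hypothetical and chosen only for illustration.

```python
PROTON = 1.007276  # mass of a proton in Da

def mz(mass, z):
    """m/z of a protein of neutral mass `mass` carrying z protons."""
    return (mass + z * PROTON) / z

def deconvolve_pair(mz_low_charge, mz_high_charge):
    """Estimate charge and neutral mass from two adjacent charge states.

    mz_low_charge  : peak carrying z protons (higher m/z)
    mz_high_charge : neighbouring peak carrying z + 1 protons (lower m/z)
    """
    # From m1 - H = M/z and m2 - H = M/(z + 1) it follows that
    # z = (m2 - H) / (m1 - m2).
    z = round((mz_high_charge - PROTON) / (mz_low_charge - mz_high_charge))
    mass = z * (mz_low_charge - PROTON)
    return z, mass

# Illustrative input: a hypothetical 16951.5 Da protein observed at 10+ and 11+.
example_mass = 16951.5
charge, recovered = deconvolve_pair(mz(example_mass, 10), mz(example_mass, 11))
print(charge, round(recovered, 3))
```

In practice, the error of the recovered mass scales with the charge, so a small m/z error on a highly charged ion becomes a large mass error, which is one reason high resolution and mass accuracy matter for intact proteins.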

The initial work utilizing a 2.8 Tesla (T) magnet FT instrument yielded unit mass resolution for myoglobin (17 kDa),38 followed by a 6.2 T magnet which extended the isotopic resolution to carbonic anhydrase (29 kDa).39, 40 In addition to the ultra-high resolving power, the multi-stage tandem MS capability available in the single mass analyzer FT-ICR instrument allowed temporal mass selection41 and detected sequence variations as low as single amino acid replacement.42

Subsequently, it was shown that the intact mass and sequence tags obtained from tandem MS experiments can be searched against the sequence databases and protein identifications were made possible.43 With further advancements in the magnetic field strength and data sampling techniques, unit mass resolution for chondroitinase I (112 kDa) enzyme was obtained with a 9.4 T FT-ICR instrument.44

The term “top-down” was put forth by the McLafferty group in 1999 (though it appears a few times in earlier publications by the McLafferty group45, 46) and they demonstrated that 100% sequence coverage can be attained by TD characterization.47 Other milestones include development of dissociation techniques such as IRMPD,46 ECD,48 and ETD49 (discussed more under fragmentation techniques section) and these methods prove to be valuable for characterizing sequence conflicts, disulfide bonds, and PTMs at the intact protein level. 7, 50-52

Other applications of TD MS include, but are not limited to, studies to examine non-covalent protein complexes, protein-ligand complexes, and protein conformation.53, 54 In addition, TD characterization of membrane proteins has proved to be highly beneficial since the hydrophobic regions in the membrane proteins do not ionize efficiently by BU methods.55 Furthermore, TD quantitation of proteins provides information about the relative amounts of different isoforms/proteoforms present and protocols are emerging in this area.56

1.5. Mass analyzers for TD proteomics

This section will give a brief overview of the different mass analyzers that are being used in the TD field.

1.5.1. FT-ICR MS

The instruments based on FT-ICR technology are highly sought after for TD work since the current capabilities include resolving power greater than 10^6, subattomole sensitivity, broad m/z range, and accurate mass measurement with less than 1 ppm deviation.57 In addition, a variety of tandem MS techniques are also available and a combination of different fragmentation methods can be applied in series.58 High MW proteins often tend to have more charges and as the spacing between the isotopes decreases with increasing charge, high resolution is required to unambiguously resolve these high charge states. This is also true for fragment ions at higher charge states and assignment of overlapping fragment ions. The combined benefit of high resolution and mass accuracy makes FT-ICR a popular choice for examining high MW proteins and TD analyses of proteins over 200 kDa using FT instruments have been reported.59

Moreover, accurate mass measurements are required to distinguish PTMs (trimethylation versus acetylation) and amino acid residues (lysine versus glutamine) that have small mass differences which becomes significant in high MW proteins.26
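The two requirements described above can be made concrete with a short calculation. The sketch below assumes standard monoisotopic atomic masses and treats the resolving power needed to separate adjacent isotope peaks as roughly (m/z) divided by the peak spacing, which is a simplification; the helper names are hypothetical.

```python
# Monoisotopic atomic masses in Da (standard values).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
C13_SPACING = 1.003355  # mass difference between 13C and 12C in Da

def formula_mass(counts):
    """Monoisotopic mass of an elemental composition, e.g. {"C": 2, "H": 2, "O": 1}."""
    return sum(MONO[element] * n for element, n in counts.items())

def isotope_spacing(z):
    """Spacing between neighbouring isotope peaks on the m/z axis for charge z."""
    return C13_SPACING / z

def resolving_power_needed(mz_value, z):
    """Rough m/dm needed to separate adjacent isotope peaks (simplified estimate)."""
    return mz_value / isotope_spacing(z)

def ppm_required(delta_da, protein_mass):
    """Mass accuracy in ppm at which a shift of delta_da equals the measurement error."""
    return delta_da / protein_mass * 1e6

# Mass added by each modification: trimethylation adds C3H6, acetylation adds C2H2O.
TRIMETHYL = formula_mass({"C": 3, "H": 6})
ACETYL = formula_mass({"C": 2, "H": 2, "O": 1})
DELTA_PTM = TRIMETHYL - ACETYL

# Residue masses: lysine is C6H12N2O, glutamine is C5H8N2O2.
LYS = formula_mass({"C": 6, "H": 12, "N": 2, "O": 1})
GLN = formula_mass({"C": 5, "H": 8, "N": 2, "O": 2})
DELTA_RESIDUE = LYS - GLN

print(round(DELTA_PTM, 5), round(DELTA_RESIDUE, 5))
```

Both pairs differ by the same composition (CH4 versus O), about 0.0364 Da, so the ppm tolerance returned by `ppm_required` shrinks in direct proportion to the protein mass.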

Early studies for protein characterization relied heavily on custom FT instruments and even to date, the highest mass protein to be isotopically resolved is IgG1 (148 kDa) by the Marshall group using a custom-built 9.4 T FT-ICR instrument.60 With a new customized ion trapping cell in a 12 T FT-ICR instrument, Smith’s group demonstrated sub-ppm mass accuracies for intact proteins.61 Agar’s research group has demonstrated that proteins up to 669 kDa (thyroglobulin dimer) can be fragmented by in-source dissociation in a hybrid quadrupole FT-ICR instrument and identified by database searching.62 This instrument has a quadrupole mass filter which can be used to select the ions before CAD in the collision cell.

Another widely used type of FT-ICR is the linear ion trap-FT-ICR (LTQ-FT-ICR) hybrid instrument which has the LTQ mass analyzer in addition to the FT-ICR. Ions pass from the ESI source into the LTQ where they are stored, then passed down to the FT-ICR cell for analysis.

With the LTQ, ion accumulation, mass selection, and dissociation can be performed within the LTQ instead of the FT-ICR cell. CAD is typically performed within the LTQ, which can also be used for ion detection in rapid, low resolution analyses. The Kelleher group first utilized this type of instrument to perform intact protein identifications during protein elution from a liquid chromatography (LC) column and identified 22 yeast proteins ranging from 14 to 35 kDa in a single LC-MS/MS run.63

To the biomedical research community, it is crucial to identify the isoforms and/or proteoforms in proteins related to health and disease. Ge’s group has utilized the FT-ICR instruments to characterize phosphorylations in various cardiac proteins64 and Whitelegge’s group detected various isoforms and SNPs in salivary proteins using a similar instrument.65

Furthermore, a single amino acid polymorphism in the amyloidosis-causing protein transthyretin was accurately identified using the monoisotopic mass obtained from the FT-ICR instrument.66

The single amino acid changes in proteins can be identified only if the experimental and theoretical monoisotopic masses are compared, since the average mass calculated from the most abundant isotopes can vary depending on the experimental ion abundances.

In spite of these exceptional capabilities, the FT-ICR instruments are very expensive, slow, and not very user-friendly. Due to these drawbacks, instruments with other types of mass analyzers are also being actively explored for TD studies.


1.5.2. Orbitrap

The orbitrap works under the same principle as FT-ICR, but the ions are oscillating back and forth within an electrostatic field which eliminates the need for superconducting magnets. 67

Hence, the cost and maintenance of the orbitrap are lower than those of FT-ICR, which makes it more affordable. The hybrid version of the orbitrap has an LTQ in the frontend and combines the resolving power and mass accuracy of the orbitrap with the speed and sensitivity of the LTQ, which makes it more amenable to TD protein analysis. In addition, the resolving power in the orbitrap decreases as the square root of m/z, as opposed to FT-ICR, where the resolving power is very sensitive to m/z and decreases as its first power.68
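The difference between the two scaling laws can be sketched numerically. This is a minimal illustration that assumes a fixed acquisition time and applies the same reference resolving power to both analyzer types purely for comparison; the reference values are placeholders, not instrument specifications.

```python
import math

def orbitrap_like(r_ref, mz_ref, mz_value):
    """Resolving power falling as the square root of m/z."""
    return r_ref * math.sqrt(mz_ref / mz_value)

def fticr_like(r_ref, mz_ref, mz_value):
    """Resolving power falling as the first power of m/z."""
    return r_ref * (mz_ref / mz_value)

# Illustrative reference point: resolving power 100,000 at m/z 400 for both.
R_REF, MZ_REF = 100_000, 400.0
for mz_value in (400.0, 800.0, 1600.0, 3200.0):
    print(mz_value,
          round(orbitrap_like(R_REF, MZ_REF, mz_value)),
          round(fticr_like(R_REF, MZ_REF, mz_value)))
```

Under these assumptions a fourfold increase in m/z halves the square-root-scaled resolving power but cuts the first-power-scaled value to a quarter, which matters for intact proteins whose charge states often appear well above m/z 1000.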

In 2006, Mann and coworkers first reported the utility of LTQ-orbitrap for TD analysis by demonstrating that proteins up to 25 kDa can be isotopically resolved and measured within 3 ppm deviation using the “lock-mass” approach.69 In their study, standard proteins were fragmented in the LTQ and mass analyzed in the orbitrap, followed by database searching to identify them. These authors also showed the ability to perform MS3 in the ion trap for detailed structural analysis of intact proteins. The mass limit for analyzing intact proteins in LTQ-orbitrap was greatly extended by Bondarenko et al. when they showed that intact monoclonal antibodies (150 kDa) can be analyzed in it.70 In addition, they characterized isoforms differing in glycosylation and disulfide bonding, but the isotopic resolution could be obtained only up to 23 kDa for the light chain component after disulfide reduction of the antibody. Following this demonstration, a few other reports have described the feasibility of using LTQ-orbitrap for TD analysis of high MW standard proteins and antibody components.71, 72 Huber’s group investigated a real complex sample mixture from wheat extract and showed isotopic resolution for proteins up to 42 kDa but tandem MS was not performed in their study.73

The Costello group has utilized the potential of LTQ-orbitrap for the biomedical research field by developing a clinical assay for detecting variants in transthyretin.74 In their approach, both intact mass and the fragments from in-source dissociation were used for initial screening of the mass shift corresponding to the variant. This was followed by selection of the mass shifted fragment ion (from in-source dissociation) in the ion trap for further MS/MS and detection by the orbitrap.

Currently, the two newer versions of the LTQ-orbitrap, the hybrid orbitrap Velos Pro and the orbitrap Elite, are beginning to be explored for TD proteomics. For the TD analysis of histone proteins, which are heavily modified low MW proteins, the fast scan speed and efficient fragmentation available in the Velos LTQ-orbitrap show promising results in characterizing isoforms and proteoforms.75 The orbitrap Elite combines the features of the Velos with high-field orbitrap technology and offers a resolution of 240,000 at an m/z of 400, a fourfold increase over the earlier version. Makarov’s group has demonstrated that proteins up to 47 kDa can be isotopically resolved and identified by combining a high resolution MS scan of a selected charge state and 5-6 high resolution MS/MS scans.76 Furthermore, Kelleher’s group performed a pilot study with HeLa cell lysates and showed that high throughput protein identifications can be effectively carried out in the chromatographic timescale using the various fragmentation techniques available in the orbitrap Elite.77

1.5.3. Quadrupole time-of-flight

The time-of-flight (TOF) mass analyzer measures the masses simultaneously as opposed to scanning instruments which sequentially focus only one mass at a time.67 This makes the TOF inherently fast and sensitive, and when combined with a quadrupole mass filter, the hybrid QTOF instrument also provides the benefits of MS/MS in space. The deviation in mass measurement is usually 10 ppm; however, instruments with ‘on-the-fly’ calibration mode routinely achieve deviations within 2-5 ppm. In addition, even though the current resolution is limited to 10,000, newer generation instruments hold much promise with resolution up to 50,000, which makes them ideal for high throughput TD analysis of proteins.27, 78

Almost a decade ago, Nemeth-Cawley et al. first demonstrated the utility of QTOF instruments for TD protein analysis by investigating 8 standard proteins of a wide mass range (5 to 66 kDa).79 They isolated individual charge states from the ESI generated multiple charge state distribution of the protein and fragmented them in the collision cell, followed by mass analysis using the TOF. The tandem MS data correctly identified all the proteins during database searches and the same approach was employed by the authors again for the characterization of recombinant IgG1 fusion protein products.80 Similar TD experiments in QTOF instruments were reported in a few other studies, but the maximum resolution of 10,000 at an m/z of 1000 limited the charge state assignment of ions above 5+.81

Following the early reports on in-source fragmentation mediated TD analysis on a variety of proteins in quadrupole mass analyzers, Thevis et al. employed a similar approach in a hybrid QTOF instrument to characterize the high MW intact protein transferrin (78 kDa).82 These authors employed both in-source and conventional MS/MS, and obtained comparable results. In addition, they found a novel C-terminal fragment linked by disulfide bonds in mammalian transferrins using a combination of protein chemistry and MS/MS techniques.

Another strategy that can be employed for intact protein analysis in the QTOF instruments is a pseudo MS3 approach by combining in-source fragmentation with conventional dissociation in a dedicated collision cell. Ginter et al. applied this dual fragmentation approach to characterize standard proteins up to 66 kDa in a hybrid TOF instrument.83 First they fragmented intact proteins in the source region by raising the cone voltage which was followed by isolating one of the fragments for tandem MS in the collision cell and this sequential dissociation yielded sequence tags for unambiguous protein identification.


Johnson et al. reported the feasibility of identifying protein fragments in recombinant samples by increasing only the cone voltage in a hybrid QTOF instrument.84 However, for such an experimental strategy, a stand-alone TOF analyzer would be sufficient to measure the fragment ion masses generated by in-source dissociation. Recently, Ren et al. demonstrated this possibility by characterizing light and heavy chains in antibodies using a single-stage TOF mass spectrometer by inducing fragmentation in the source region.85

The ion mobility separation available in QTOF instrument designs from Waters separates ions based on differences in shape and this adds another dimension to the existing m/z based separation. This has already shown promise in the BU area and is being explored for TD research. The Clemmer lab has incorporated ion mobility based separation for intact proteins eluting from a column, and the charge states of each intact protein are separated based on the drift time.86 This helps with separating partially folded and unfolded structures of a protein, which facilitates protein characterization to some extent during the chromatographic timescale. Fragmentation of each of the separated intact protein charge states yielded sequence information by measuring the m/z ratio of the fragments. However, the dissociation data from different charge states of the same protein were not compared to find the value of separating the intact protein charge states based on drift time. Nevertheless, this technique resolves proteins that co-elute with other contaminants, and hence, could potentially resolve co-eluting proteins by generating different ion mobility profiles. Recently, Zinner et al. have further enhanced this technique for TD studies by first isolating and dissociating a charge state, and then recording both the drift time and m/z of the fragment ions.87 By recording the drift time of the fragment ions instead of the intact protein charge states, they were able to add another layer of separation in the tandem MS spectrum. A similar strategy was used by Halgand et al. to analyze a recombinant measles virus phosphoprotein, and the ion mobility based separation followed by TD MS/MS revealed novel microheterogeneities in this protein.88


1.5.4. Ion trap

Though high scanning speed and sensitivity are the inherent advantages that make ion traps high throughput workhorses in BU proteomics, the low resolution and deviation up to 100 ppm in mass accuracy are the major factors that limit their usage in TD analysis.10 In spite of these shortcomings, the efficient ion dissociation characteristics and multistage tandem MS in time available in ion trap instruments are advantageous for intact protein characterization.89

McLuckey’s group pioneered the adaptation of ion trap instruments for targeted TD protein analysis by developing methods to reduce the charge states of multiply charged fragment ions, which eliminates the necessity for high resolution.90 Intact protein charge states are isolated and fragmented in the ion trap, followed by a gas-phase ion/ion proton transfer reaction (PTR) using an anionic reagent to reduce the multiply charged product ions to mostly the 1+ charge state.

This charge state reduction results in products with higher m/z which necessitates mass analyzers with a broad mass range and so, it is mostly performed in custom-built ion traps with extended mass analyzing capabilities.
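The shift to higher m/z upon charge reduction follows directly from the definition of m/z. The sketch below assumes protonated fragment ions and that each PTR step removes one proton; the 12 kDa fragment is a hypothetical example.

```python
PROTON = 1.007276  # mass of a proton in Da

def mz(mass, z):
    """m/z of a fragment of neutral mass `mass` carrying z protons."""
    return (mass + z * PROTON) / z

def charge_reduction_series(mass, z_start, z_end=1):
    """(charge, m/z) pairs as successive proton transfers lower the charge."""
    return [(z, mz(mass, z)) for z in range(z_start, z_end - 1, -1)]

# Hypothetical 12 kDa product ion that initially carries 10 charges.
for z, value in charge_reduction_series(12000.0, 10):
    print(z, round(value, 2))
```

For this example, the ion moves from roughly m/z 1201 at 10+ to roughly m/z 12001 at 1+, which illustrates why an extended m/z range is needed once product ions are reduced to a single charge.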

Quadrupole ion trap (QIT) instruments with charge state manipulation using PTR have been mostly used for targeted TD analysis of a variety of proteins.91-95 These studies provide a wealth of information to understand the fragmentation mechanisms of intact proteins and also demonstrate that different intact protein charge states yield varied amounts of sequence coverage. The dynamic range in QITs is limited due to space charge effects, but this has been overcome by employing linear ion traps (LTQ, also known as LIT) which have greater ion storage capacity and improved dynamic range.96 The tandem MS in space capability in the LIT can be used for TD characterization and Scherperel et al. demonstrated this by characterizing recombinant proteins with site-directed mutagenesis in this instrument type.97


1.6. Intact protein fragmentation schemes

This section will briefly cover the fragmentation methods for TD sequence analysis with emphasis on the mechanism, sequence coverage obtained, availability in commercially made instruments, and feasibility of combining it with on-line protein separations.

1.6.1. Collisionally activated dissociation

Collisionally activated dissociation (CAD) (also known as collision induced dissociation) is one of the earliest fragmentation methods to be developed and it was primarily employed for characterizing small molecules and peptides.98-100 After the introduction of ESI to ionize proteins, CAD was extended to fragment isolated intact protein charge states32 and is still widely used for characterizing proteins. Intact protein ions are collided with inert gas molecules (helium, argon, xenon, or ), which raises the internal energy above the dissociation threshold for various bonds and leads to fragmentation of the C-N bonds in the peptide backbone. This produces b- and y- series fragment ions and the fragmentation is dependent on the amino acid sequence and location, as explained by the mobile proton model of fragmentation.101
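As an illustration of the b- and y- series, the following sketch computes theoretical fragment m/z values for a short unmodified sequence. It assumes standard monoisotopic residue masses, protonated ions, and no PTMs; the function name and the test peptide are hypothetical and are not taken from the studies cited above.

```python
PROTON = 1.007276   # Da
WATER = 18.010565   # Da, monoisotopic H2O

# Monoisotopic residue masses in Da (standard values).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

def fragment_ions(sequence, z=1):
    """Return (b label, b m/z, y label, y m/z) for every backbone cleavage site."""
    total = sum(RESIDUE[aa] for aa in sequence)
    n = len(sequence)
    ions = []
    prefix = 0.0
    for i in range(1, n):
        prefix += RESIDUE[sequence[i - 1]]
        b_mz = (prefix + z * PROTON) / z                    # N-terminal piece
        y_mz = (total - prefix + WATER + z * PROTON) / z    # C-terminal piece keeps H2O
        ions.append((f"b{i}", b_mz, f"y{n - i}", y_mz))
    return ions

for b_label, b_mz, y_label, y_mz in fragment_ions("PEPTIDE"):
    print(b_label, round(b_mz, 4), y_label, round(y_mz, 4))
```

For singly charged ions, each complementary b/y pair sums to the neutral mass plus two protons, which provides a simple consistency check when assigning fragments.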

CAD in quadrupole and ion trap mass spectrometers involves low-energy collisions, in contrast to the high-energy type that is associated with TOF/TOF analyzers. Due to the low energy associated, these collisions lead to successive fragmentations resulting in internal fragmentation and losses of neutral molecules such as water and ammonia, and they do not preserve labile PTMs such as glycosylations and phosphorylations. However, for intact protein separations combined with mass spectrometry based analysis, CAD is compatible with the protein separation timescale and generally results in 7-10 long stretches of amino acids (commonly known as sequence tags) which are then used to identify the protein by database searches.


CAD can also be carried out in the source region of the mass spectrometer by varying the voltage of the sampling cone or skimmer resulting in a similar type of fragmentation known as ‘Nozzle-Skimmer Dissociation’ (NSD).102 NSD or in-source dissociation eliminates the necessity of a dedicated collision cell to perform tandem MS and hence, isolation of intact protein charge states is not a prerequisite.103 NSD is regarded as a cost-effective way to induce fragmentation and due to its data-independent nature, current on-line protein identification methods utilize it as a way to fragment multiple protein charge states within the duration of protein separation.104, 105

1.6.2. Higher energy collision dissociation

Higher energy collision dissociation (HCD) is similar to CAD except that it is a more energetic type of fragmentation. This was introduced by Makarov’s group when they upgraded to a curved linear trap (C-trap) to eliminate the low mass cutoff loss of fragment ions that is prevalent in the ion trap.106 The product ions generated by HCD are similar to those from CAD in triple quadrupole instruments, with a prevalence of low mass ions such as immonium ions, and y1 and y2 ions. HCD is currently available only in orbitrap instruments and the product ions are detected by the high resolution orbitrap mass analyzer, which makes it suitable for intact protein analysis.107 More recently, Kelleher’s group has applied HCD in high throughput protein identifications by dissociating intact proteins eluting from a column, and their results show that HCD gave the greatest number of unique identifications among all the fragmentation types compared.77

1.6.3. Electron capture dissociation

Electron capture dissociation (ECD), a serendipitous achievement by McLafferty and colleagues,48 is a complementary dissociation technique to CAD. Currently, it is performed only in FT-ICR mass spectrometers and it involves capture of a thermal electron by multiply charged protein cations, which predominantly results in dissociation of covalent bonds in the polypeptide backbone. ECD is a non-ergodic process, meaning that fragmentation is independent of bond dissociation energies and thus, it is sequence independent.108 Due to this much randomized dissociation, ECD is a preferred method for de novo protein sequencing and for structural analysis of protein-protein and protein-ligand non-covalent associations in native pH conditions.109 The products of ECD are mostly c- and z- series ions, and labile PTMs are well-preserved during this non-ergodic ECD process.110

Though a few reports demonstrate the applicability of ECD for on-line peptide separations with MS,111 it has not been demonstrated for on-line intact protein analysis due to two reasons. Firstly, the product ions formed from ECD are very low in intensity due to the random fragmentation scheme it follows and data have to be averaged for longer durations, which prevents on-line usage.112 Secondly, for proteins greater than 20 kDa, fragments dissociated by ECD can still be held together by the intramolecular non-covalent bonds existing between amino acid residues and this leads to less sequence coverage. But this has been overcome with the development of ‘activated-ion ECD’ (AI-ECD), where protein ions are preheated by collisions with inert gas to break the intramolecular interactions and then ECD is applied.58, 113 AI-ECD has been shown to separate the ECD fragments effectively for proteins up to 200 kDa.59

1.6.4. Electron transfer dissociation

Electron transfer dissociation (ETD), an off-shoot of ECD developed by the Hunt lab, made it possible to perform ECD type fragmentation in bench top mass spectrometers.49 ETD is mediated by transfer of an electron from a low electron affinity anionic reagent to multiply charged protein cations, resulting in formation of a radical which induces fragmentation.114 The outcome of ETD is similar to ECD, in that the product ions are of c- and z- series and it preserves PTMs. Due to its availability in ion trap instruments, ETD can be performed in addition to CAD since the fragmentation data derived from these two methods are complementary. This is advantageous in BU protein identifications for peptides eluting from a column, since both CAD and ETD can be performed, or a decision tree based approach can be used in which low charge states are selected for CAD and higher charge states are chosen for ETD.115, 116

ETD can be combined with other gas-phase reactions such as PTR to reduce the charge state of the product ions, which enables intact protein analysis in bench top instruments.117 Coon et al. demonstrated this applicability in an ion trap instrument by sequentially doing ETD and PTR, which allowed them to analyze intact proteins up to 66 kDa (bovine serum albumin) and also characterize variations in histone proteins.118 In addition, Chi et al. showed that ETD and PTR are fast and efficient reactions, and that they can be combined with on-line chromatographic separations for intact proteins under 20 kDa in the 70S ribosomal protein complex.119

High resolution instruments such as also have ETD capability which makes it ideal for intact protein analysis. ETD capability in combination with higher mass accuracy and resolution is advantageous in unambiguous assignment of PTMs such as trimethylation versus acetylation in small histone proteins. Though ETD has been mostly demonstrated on low MW proteins under 20 kDa, its availability in high resolution instruments is promising for future applications. Recently, Tsybin’s group has characterized IgG1 using ETD in orbitrap by signal averaging time-domain transients from different on-line chromatographic experiments before FT processing to increase the signal intensity and this has almost doubled the sequence coverage from their earlier analysis of the same antibody in the QTOF instrument by signal averaging different ETD experiments.120, 121 Another important advantage of using ETD is that the sequence coverage it yields is greater than other fragmentation types for high throughput protein identifications performed by combining protein separation with MS analysis.77


1.6.5. Other miscellaneous fragmentation methods

In addition to the above discussed fragmentation methods, other techniques such as IRMPD, activated ion electron detachment dissociation (AI-EDD), sustained off-resonance-CAD (SORI-CAD), negative ion IRMPD, and blackbody infrared radiative dissociation (BIRD) have been applied either solely or in addition to other widely used fragmentation methods for intact protein analysis.68 Most of these methods are still at initial stages of development and further explorations will reveal future applications in the TD area.

1.7. Intact protein separation techniques

This section briefly reviews the intact protein separation and fractionation techniques available for analyzing large number of proteins that constitute complex sample mixtures, with the ultimate goal of identifying the proteins by ESI-MS. Protein purification by classical biochemical methods and immunoprecipitation protocols are targeted approaches, and so these are not discussed here.

1.7.1. Electrophoresis

Electrophoretic methods separate proteins based on properties such as size, charge, or both of the molecule and the order of separation in these methods is to some extent predictable.

1.7.1.1. Gel-based techniques

One dimensional SDS-PAGE is the most convenient and widely adopted MW based sieving technique to separate proteins. It is based on negatively charged SDS molecules binding proteins to yield a uniform charge/mass ratio and a generic unfolded shape when bound by SDS.122 This causes the electrophoretic migration to be dependent only on the protein’s MW.

Combining SDS-PAGE with isoelectric focusing (IEF) results in a two dimensional separation (2D PAGE) technique. The separated proteins can be recovered as peptides by digestion with a proteolytic enzyme and this forms the basis of BU proteomic experiments. Though the orthogonal nature of separation of 2D PAGE results in very high resolution, it is seldom utilized in TD proteomics because the proteins are fixed inside the gel matrix and elution of intact proteins from gels is cumbersome. Methods based on passive elution123 and electroelution are available but these are time-consuming and are ideal only for targeted proteins.

Continuous elution of intact proteins from gels was made possible by applying an electric field to elute proteins at the end of the gel column.124 A modern and much improved version of continuous elution tube gel electrophoresis is the ‘GELFrEE’ (Gel-Eluted Liquid Fraction Entrapment Electrophoresis) system by Tran and Doucette.125 The GELFrEE device consists of tube gels to separate the proteins based on MW followed by liquid phase recovery of the separated proteins. Though liquid phase recovery is highly attractive, the high concentration of SDS in the fractions has to be removed before MS analysis. Protein precipitation is usually performed to remove SDS followed by resolubilization, but it comes with additional steps during which sample loss is inevitable. Moreover, this technique also requires high amounts of sample to be loaded due to inherent losses from protein adsorption to the walls of the chamber, and buffer washes have to be performed to minimize overlapping elutions.

Alternatively, SDS can be replaced with an acid-labile surfactant (ALS) which degrades at low pH and thus, the interference of the surfactant during mass spectrometric analysis can be minimized.126 Meng et al. utilized this approach and combined continuous-elution gel electrophoresis using ALS with LC to fractionate yeast cell lysate followed by high resolution mass spectrometry to detect intact proteins.127


1.7.1.2. Isoelectric focusing

IEF is based on migration of proteins through a continuous pH gradient until each protein reaches the pH at which its net charge is zero and there is no further movement. It can be performed either using carrier ampholytes in tube gel formats or immobilized pH gradient (IPG) in strips. While IPG strips are the preferred method of first dimension separation in 2D PAGE for BU experiments, they can also be directly visualized by MS based methods as reported by Loo et al.128, 129
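The pH at which a protein stops migrating can be approximated from its sequence. The sketch below is a simplified model that assumes independent ionizable groups obeying the Henderson-Hasselbalch relation; the pKa values are illustrative textbook-style approximations (an assumption, since published pKa sets differ), and the function names are hypothetical.

```python
# Illustrative pKa values (assumed; published sets differ slightly).
PKA_NTERM = 9.0
PKA_CTERM = 2.0
PKA_BASIC = {"K": 10.5, "R": 12.5, "H": 6.0}              # groups that carry +1 when protonated
PKA_ACIDIC = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}    # groups that carry -1 when deprotonated

def net_charge(sequence, ph):
    """Net charge of an unmodified linear polypeptide at a given pH."""
    basic = [PKA_NTERM] + [PKA_BASIC[aa] for aa in sequence if aa in PKA_BASIC]
    acidic = [PKA_CTERM] + [PKA_ACIDIC[aa] for aa in sequence if aa in PKA_ACIDIC]
    positive = sum(1.0 / (1.0 + 10 ** (ph - pka)) for pka in basic)
    negative = sum(1.0 / (1.0 + 10 ** (pka - ph)) for pka in acidic)
    return positive - negative

def isoelectric_point(sequence, lo=0.0, hi=14.0, tol=1e-4):
    """Bisection search for the pH at which the net charge crosses zero."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0:
            lo = mid   # still positive, so the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0

print(round(isoelectric_point("ACDEFGHIKLMNPQRSTVWY"), 2))
```

Because the net charge decreases monotonically with pH in this model, bisection is sufficient; real proteins deviate from it through PTMs and local environment effects on pKa.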

IEF can also be performed in solution (sIEF) and it is a popular orthogonal strategy to MW based separation for TD workflows. The custom sIEF design by Tran et al. has overcome some of the problems that existed in earlier commercially available sIEF devices.125 This in conjunction with GELFrEE is a popular method to fractionate complex samples in TD applications.130

Capillary IEF (cIEF) is based on focusing proteins at their isoelectric point by using a polyampholyte mixture in solution inside a capillary and this can be directly coupled to ESI-MS for further analysis of the separated proteins.131, 132 Though the on-line coupling of cIEF and ESI-MS proved to be high throughput, it has limited capability due to the adverse effects of ampholytes and detergents used for increasing protein solubility during cIEF. Alternatively, reversed-phase liquid chromatography (RPLC) was used as an intermediate step to improve the cIEF-ESI-MS setup.133, 134

Free-flow electrophoresis (FFE) is another liquid-based prefractionation technique in a matrix-free environment and can be performed in preparative scale.135 Samples can be continuously injected into a carrier ampholyte solution followed by IEF and fraction collection for downstream ESI-MS either directly136, 137 or in combination with RPLC.138


1.7.1.3. Capillary electrophoresis

Capillary electrophoresis (CE) is based on the migration of solutes inside narrow-bore fused-silica capillaries under the influence of high electric fields. CE includes a wide variety of ancillary methods which are based on the properties of proteins (charge, size, conformation, and hydrophobicity) during electrophoretic separation.139 The most common method is capillary zone electrophoresis (CZE) which is based on the differences in the electrophoretic mobilities of the charged species. In addition, electro-osmotic flow (movement of fluid in the capillary under the applied electric field) also contributes to the direction of net flow during CZE.
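The combined effect of electrophoretic mobility and electro-osmotic flow on migration in CZE can be sketched with the standard relation v = (mu_ep + mu_eo) * E. All numerical values below are hypothetical and serve only to illustrate the arithmetic; the function names are not from any library.

```python
def apparent_velocity(mu_ep, mu_eo, voltage, total_length_cm):
    """Net velocity in cm/s from mobilities in cm^2/(V*s) and the field V / L."""
    field = voltage / total_length_cm   # electric field strength in V/cm
    return (mu_ep + mu_eo) * field

def migration_time(mu_ep, mu_eo, voltage, total_length_cm, length_to_detector_cm):
    """Time in seconds for a species to reach the detection point."""
    velocity = apparent_velocity(mu_ep, mu_eo, voltage, total_length_cm)
    if velocity <= 0:
        raise ValueError("species does not migrate toward the detector")
    return length_to_detector_cm / velocity

# Hypothetical run: 30 kV across a 60 cm capillary with detection at the outlet
# (as with an ESI-MS interface) and an electro-osmotic mobility of 5e-4 cm^2/(V*s).
for label, mu_ep in (("cationic", 2e-4), ("neutral", 0.0), ("anionic", -2e-4)):
    print(label, round(migration_time(mu_ep, 5e-4, 30000.0, 60.0, 60.0), 1))
```

In this example, a species with negative electrophoretic mobility still reaches the outlet because the electro-osmotic flow dominates, only later than neutral or positively charged species.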

Though CE was a prevalent separation technique for more than two decades,140 it was only in the late 1980s that the Smith group first demonstrated that CE effluent can be interfaced to ESI-MS by using a liquid sheath junction.141 Subsequently, Loo et al. showed that mixtures of intact proteins can be separated and mass analyzed by CE-ESI-MS.142 Recent advancements include sheathless interfaces, where the polyimide coating at the capillary outlet is removed to create a porous tip and this is inserted into the ESI needle and filled with background electrolyte to establish electrical connection.143 Using this porous tip sprayer design and nanoESI-MS, Haselberg et al. demonstrated low nM limit of detection for intact proteins in comparison to using sheath liquid.144 The droplets formed during the electrospray process in the sheathless design are small and this results in more efficient ionization.145

Narrow peak widths, high resolution and sensitivity are the inherent benefits of CE-MS, and it has mostly been used for analyzing glycosylation in pharmaceutical proteins.145, 146

Though high speed separations reduce the analysis time, they also necessitate the use of instruments such as ion traps and TOF mass analyzers to preserve the separation efficiency. In addition, only very low sample volumes can be injected, and preconcentration has to be performed before the sample is injected.147 Due to these disadvantages, CE is not yet widely used for large-scale intact protein separations.

1.7.2. Liquid chromatography

Liquid chromatography (LC) is currently the method of choice for separating complex samples before mass spectrometric analysis. The different types of phases commonly used for protein separations are discussed here. In addition, some of the methods discussed below can be directly coupled to mass spectrometric analysis (on-line) while others require fraction collection and further purification (off-line) before MS.

1.7.2.1. Reversed phase chromatography

Reversed phase (RP) is the dominant stationary phase, mainly because its salt-free aqueous/organic solvents are compatible with ESI-MS. Shorter alkyl chains such as C3 and C4 attached to silica based supports and polymeric stationary phases (PLRP material) are the most common types used for intact protein separation. Although Wang et al. have demonstrated that resolution is better with longer alkyl chains, like C18, it is better to use C3/C4 chain lengths to prevent irreversible adsorption of proteins to the column.148 Vellaichamy et al. have shown that PLRP columns yield better separation efficiency for high MW proteins and that more proteins were detected using PLRP than with silica based columns.149

In addition to these conventional bead based columns, RP monolithic columns that contain a porous polymeric stationary phase are also being explored for intact protein separations. Eeltink et al. have shown that for intact proteins, peak capacities greater than 600 can be achieved using RP monolithic capillary columns with two-hour linear gradients and formic acid as a modifier.150 In addition, these columns tend to be very sensitive since they lack the stagnant zones inside the mesopores and hence, there is no intra-particle mass transfer of solutes.

1.7.2.2. Ion-exchange chromatography

In ion-exchange based chromatography, the acidic or basic functional groups in the stationary phase interact with the corresponding oppositely charged groups in proteins. Increasing concentrations of salt are then used to elute the protein from the column, which makes the method incompatible with direct MS analysis. However, it can be combined with a RP column and used as an orthogonal setup for the identification of proteins. Sharma et al. demonstrated this strategy and detected 715 intact proteins in a microbial cell lysate by fractionating the sample using weak anion exchange (WAX) in the first dimension, followed by analyzing the individual fractions using on-line RP separation and FT-ICR MS.151

1.7.2.3. Size-exclusion chromatography

Size-exclusion chromatography (SEC) has been combined with MS for the TD analysis of membrane proteins. Whitelegge’s group has reported the use of SEC columns followed by ESI-MS for the analysis of membrane proteins such as bacteriorhodopsin, proteins in the photosystem 1 complex, and the cytochrome b6f complex.152-154 SEC was used since the elution of these integral membrane proteins from a RP column is less efficient and RP columns often have to be regenerated after analyzing these proteins.

1.7.2.4. Multi-dimensional separation

Following the success of MudPIT technology for peptide separations, multi-dimensional separations are gaining popularity for intact proteins. However, they require special plumbing to connect the two different stationary phases in-line, and the fractions separated on one column have to be stored in loops or capillaries until they can be analyzed in the second dimension. Two recent reviews have outlined the various multi-dimensional separations for intact proteins followed by MS.155, 156

1.8. TD data analyses

The data analysis tools available for TD protein analysis are only a handful in contrast to the varied options that exist for BU approaches. This is, in part, because early TD analyses were confined to targeted protein samples and did not demand any database searches. With the advent of high throughput instrumentation and protein identification type experiments in samples of higher complexity, it is necessary to build efficient data processing platforms.

TD data contain information about the intact mass of the protein and the products obtained by fragmenting the protein. The N-terminus of eukaryotic proteins can be acetylated with or without removal of the N-terminal methionine. In addition, in some proteins a signal peptide is cleaved, which makes a substantial difference in mass if it is not annotated in the database. So, when complex samples with a wide variety of proteins are searched, an appropriate intact mass tolerance window has to be chosen to accommodate these variations. BU analyses are restricted to peptides, which fall within a narrow MW range and are not susceptible to these limitations.
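The effect of the N-terminal variations on the intact mass can be sketched as follows. This is a minimal illustration, not a procedure from the text: the function names and the example mass are hypothetical, and the mass shifts are the standard monoisotopic values for a methionine residue and an acetyl group. Signal peptide cleavage is not modelled because its mass depends on the sequence.

```python
# Standard monoisotopic mass shifts in Da (assumed constants, not from the text).
MET_RESIDUE = 131.04049   # loss of the initiator methionine residue
ACETYL = 42.01057         # N-terminal acetylation (C2H2O)


def n_terminal_variants(database_mass):
    """Return candidate intact masses for the four common N-terminal forms."""
    return {
        "unmodified": database_mass,
        "acetylated": database_mass + ACETYL,
        "Met removed": database_mass - MET_RESIDUE,
        "Met removed, acetylated": database_mass - MET_RESIDUE + ACETYL,
    }


def matching_variants(observed_mass, database_mass, tol_da=1.0):
    """List the variants whose mass lies within tol_da of the observed mass."""
    variants = n_terminal_variants(database_mass)
    return [name for name, mass in variants.items()
            if abs(mass - observed_mass) <= tol_da]


def window_to_cover_all(database_mass):
    """Smallest symmetric window (Da) around the database mass that
    includes every N-terminal variant."""
    variants = n_terminal_variants(database_mass)
    return max(abs(mass - database_mass) for mass in variants.values())
```

For a hypothetical 20 kDa database entry, `window_to_cover_all(20000.0)` is set by the methionine loss (about 131 Da), which shows why a narrow intact mass tolerance can exclude the correct protein when these forms are not annotated.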

TD analysis is further confounded by the presence of isoforms and/or proteoforms for each protein. These can have a high degree of sequence similarity, and unambiguous assignments can be made only if fragment ions that represent the unique region or PTMs are present in the data. This is especially true when a protein is reported to be modified by the same modifying group at different sites. Matching the modified intact mass as reported in the database to the experimental mass is sufficient to bring up the protein as a possible hit, but the fragment ions mapped to the protein have to be examined to determine whether the experimental data can explain the modification identified.

Two different fragmentation types are generally performed in high throughput TD experiments. Either all the charge states of the intact protein are fragmented by NSD, or the three most abundant charge states are isolated separately and fragmented by CAD, followed by merging the data from these three fragmentations during post-processing. In CAD, when the charge states are isolated, there is a precursor mass that represents the mass of the protein. However, when all the charge states are dissociated as in NSD, the information about the intact mass of the protein is not present and has to be manually added to the file. This can be overcome by alternating between high and low voltage during data acquisition to collect both intact and fragment masses in the same LC-MS run, and determining both masses by deconvolution during data processing. Vellaichamy et al. followed this scheme to identify proteins in complex samples, but it is currently limited to instruments from Thermo Scientific.149
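The deconvolution step mentioned above can be illustrated with the simplest case, recovering the neutral mass from a series of adjacent charge states of a protonated protein. This is a sketch under stated assumptions: it uses only peak positions, assumes every charge is carried by a proton, and is not the deconvolution algorithm of any particular vendor software. The function names are hypothetical.

```python
PROTON = 1.007276  # proton mass in Da (standard value)


def charge_from_adjacent_peaks(mz_high, mz_low):
    """Charge of the ion at mz_high, given the neighbouring peak at mz_low
    that carries one additional proton (so mz_low < mz_high)."""
    return round((mz_low - PROTON) / (mz_high - mz_low))


def neutral_mass(mz, z):
    """Neutral mass of an [M + zH]z+ ion observed at the given m/z."""
    return z * (mz - PROTON)


def deconvolute(mz_peaks):
    """Average the neutral mass over consecutive pairs of a charge state
    series. Requires at least two peaks from the same protein."""
    peaks = sorted(mz_peaks, reverse=True)  # highest m/z = lowest charge
    masses = []
    for mz_high, mz_low in zip(peaks, peaks[1:]):
        z = charge_from_adjacent_peaks(mz_high, mz_low)
        masses.append(neutral_mass(mz_high, z))
    return sum(masses) / len(masses)
```

Applied to the low-voltage scans, a routine of this kind yields the intact mass, while the same calculation on the high-voltage scans yields fragment masses, so both can be written to the search file without manual entry.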

Following is a brief description of TD data analysis options. Each tool was developed specifically for the instrument platform used for data acquisition, and file format conversions are necessary for working across platforms.

1.8.1. ProSight

ProSight, currently the most widely used tool for TD analysis, searches the data against shotgun annotated protein databases.157 Three search modes are available: absolute mass, biomarker, and sequence tag. The absolute mass search matches the experimental intact mass to the theoretical intact masses in the database, creates a list of theoretically possible fragment ions for each candidate, and compares it to the experimental fragment mass list. The biomarker mode is designed to search endogenously formed peptides; it searches every protein subsequence in the database to match the given experimental intact mass, followed by matching the fragment ions. Finally, if these two modes fail to identify the protein, the sequence tag approach is taken, where sequence tags in the fragment ion data are compiled and queried against the database.
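The logic of an absolute mass search can be sketched in a few lines. This is an illustration of the general idea only and does not reproduce ProSight's implementation, scoring, or database format: the function names, tolerances, and the toy database are assumptions, only unmodified b- and y-type fragments are considered, and all masses are neutral monoisotopic values as obtained after deconvolution.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def intact_mass(seq):
    """Neutral monoisotopic mass of an unmodified sequence."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def theoretical_fragments(seq):
    """Neutral masses of all b- and y-type fragments of a sequence."""
    frags = []
    for i in range(1, len(seq)):
        frags.append(sum(RESIDUE_MASS[aa] for aa in seq[:i]))          # b_i
        frags.append(sum(RESIDUE_MASS[aa] for aa in seq[i:]) + WATER)  # y_(n-i)
    return frags


def absolute_mass_search(observed_intact, observed_fragments, database,
                         intact_tol=2.0, frag_tol=0.02):
    """Filter database entries by intact mass, then rank the survivors by
    the number of observed fragments they explain."""
    hits = []
    for name, seq in database.items():
        if abs(intact_mass(seq) - observed_intact) > intact_tol:
            continue  # intact mass does not match this entry
        theo = theoretical_fragments(seq)
        matched = sum(
            any(abs(obs - t) <= frag_tol for t in theo)
            for obs in observed_fragments
        )
        hits.append((name, matched))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

In a toy database with two entries of identical composition, such as `{"A": "PEPTIDEK", "B": "PEPTIDKE"}`, both pass the intact mass filter and only the fragment comparison separates them, which mirrors the point made earlier that similar proteoforms need fragment ions from the distinguishing region.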

The disadvantage of ProSight is that it searches against a virtual database and considers far too many variants during a search, which introduces false positives; a well-defined scoring system is necessary to address this disadvantage.158 In addition, it does not identify truncated proteins with unexpected PTMs that are not in the shotgun annotated databases. Also, internal fragments and neutral losses in the data are not taken into account.

1.8.2. BIG Mascot

BIG Mascot is the TD version of the well-known Mascot search engine for BU analyses. TD searches are enabled by increasing the 16 kDa limit for precursor mass to 110 kDa. Intact and fragment ion data are deconvoluted to generate monoisotopic masses, and Mascot generic files are created which are then searched against the available databases.62 Searches can be performed in MS only, MS/MS only, or using both. Internal fragments in the data can also be searched by adding user-defined instrument definitions.159 Its use in the TD field is currently limited due to the various programming tasks that need to be performed to enable these functionalities.

1.8.3. MS-Align+

MS-Align+ is based on spectral alignment between the experimental spectra and theoretically generated spectra for proteins in the database.160 It uses the deconvolution tool MS-Deconv161 to group the isotopomer envelopes from the tandem MS data, followed by database searching using MS-Align+. It can also search for unexpected PTMs, and its authors also showed that it can outperform ProSight on complex datasets.160

1.8.4. Precursor ion independent top-down algorithm

Precursor ion independent top-down algorithm (PIITA) uses only the fragment ion data to match the genomic sequences and makes initial identifications from these searches.162 This is followed by matching the intact mass and mapping the observed mass shifts due to additions of PTMs or deletions of N-terminal signal sequences. The main advantage of this method is that the precursor ion mass and PTMs are not searched in the beginning, which shortens the search time and gives a wider search space, respectively, so that as many protein identifications as possible can be found. As in BU analysis, PIITA also considers neutral losses of water and ammonia.

1.9. Supercharging

The efficiency of fragmentation during ECD and ETD depends on the charge state of the precursor selected for dissociation. Zubarev and coworkers demonstrated that the electron capture cross section is proportional to the square of the charge.163 In addition, several studies have shown that the sequence coverage obtained from higher charge state precursors is greater than that from lower charge states during ETD and ECD.164 Recently, reagents that enhance ESI multiple charging in both aqueous/organic and native pH solvent systems were reported by different groups including ours. These ‘supercharging’ reagents are beneficial for ETD/ECD based tandem MS studies and are reviewed here briefly.
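The practical consequence of the square dependence on charge can be made explicit with a short worked relation. The symbols are notation introduced here (capture cross section and precursor charge states), and the charge states in the example are illustrative, not values from the cited study.

```latex
\sigma_{\mathrm{capture}} \propto z^{2}
\quad\Longrightarrow\quad
\frac{\sigma(z_2)}{\sigma(z_1)} = \left(\frac{z_2}{z_1}\right)^{2},
\qquad \text{e.g. } \frac{\sigma(20+)}{\sigma(10+)} = \left(\frac{20}{10}\right)^{2} = 4
```

Under this proportionality, doubling the precursor charge is expected to quadruple the capture cross section, all else being equal.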

The multiple charging phenomenon observed during ESI-MS generates extensively charged gas-phase ions which yield a bell-shaped charge state envelope, and this greatly reduces the mass-to-charge range that the mass analyzer must cover for analyzing intact proteins. Multiple charging enables the measurement of intact proteins and protein complexes whose masses significantly exceed the upper limit of the mass analyzer (2000 m/z).165 Various factors influence multiple charging, such as molecular conformation, pH, and instrument parameters. In general, when proteins are sprayed in aqueous buffers (native conditions, pH 7), the protein acquires few charges and its ions appear in the high m/z region of the mass spectrum. On the other hand, when proteins are in organic/aqueous solvents or in the presence of acids (denaturing conditions, pH 4 or less), the protein is more unfolded and can acquire more charges. Consequently, the charge state envelope of the protein is in the low m/z region of the mass spectrum.
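The relationship between protein mass, charge, and the m/z limit of the analyzer can be illustrated with a short calculation. This is a sketch that assumes every charge is a proton; the function names and the 66 kDa example mass are hypothetical, and the 2000 m/z default is the upper limit quoted in the text.

```python
import math

PROTON = 1.007276  # proton mass in Da (standard value)


def mz(mass, z):
    """m/z of an [M + zH]z+ ion for a neutral mass in Da."""
    return (mass + z * PROTON) / z


def min_charge_for_limit(mass, mz_limit=2000.0):
    """Smallest charge at which a protein of the given mass appears at or
    below mz_limit, obtained by solving mass / z + PROTON <= mz_limit."""
    return math.ceil(mass / (mz_limit - PROTON))
```

For a hypothetical 66 kDa protein, `min_charge_for_limit(66000.0)` gives 34, so charge states of 34+ and above fall within a 2000 m/z window while 33+ and below do not.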

In 2001, Iavarone and Williams demonstrated that the multiple charging of proteins can be further increased by adding small molecules such as glycerol and m-nitrobenzyl alcohol (m-NBA).166 They showed that these reagents can increase the maximum charge and the most abundant charge gained by cytochrome c in solutions containing water/methanol/acetic acid (47/50/3, v/v/v). They proposed that in the absence of other factors that contribute to ESI charging, the increase in charge upon adding m-NBA was due to the increased surface tension of the electrospray droplets.167 To support this theory, they suggest that when evaporation occurs during ESI droplet formation, supercharging reagents become concentrated in the droplet due to their high boiling points (the boiling point of m-NBA is 177˚C at 3 Torr) relative to water/organic solvents. This causes the surface tension of the droplet to increase when the more volatile components evaporate, since m-NBA has a higher surface tension than methanol (50 mN/m versus 25 mN/m), and this leads to droplet heating and unfolding of the protein. In summary, their theory supports the idea that ‘chemical/thermal’ unfolding, or simply denaturation, is the mechanism behind supercharging.

In contrast to this ‘denaturing’ mechanism, the Loo group demonstrated that proteins can be supercharged in the native state.168 They demonstrated that in the presence of 1% m-NBA, Zn-bound carbonic anhydrase II can be charge enhanced, which in turn showed that protein-ligand interactions can be maintained during supercharging. They also suggested that aqueous solvents have higher surface tension (the surface tension of water is 72 mN/m) and that adding m-NBA to aqueous solvents decreased the surface tension in the droplet after the evaporation of water. In addition, they suggested ‘reagent basicity’ as a possible mechanism behind supercharging. Furthermore, the new supercharging reagent introduced by the Loo group, sulfolane (a clear colorless liquid, also known as tetramethylene sulfone, IUPAC name: 2,3,4,5-tetrahydrothiophene-1,1-dioxide), was more weakly basic than water and had a low gas-phase basicity.169 They also noticed the presence of ‘protein-sulfolane’ adducts at higher charge states, in contrast to the salt adducts present at lower charge states that are formed due to incomplete desolvation. Recently, Douglass and Venter also reported the presence of reagent adducts at higher charge states, and a direct interaction between the analyte and reagent is considered one of the mechanisms behind supercharging.170

On the contrary, the Williams group provided evidence from ion mobility experiments to show that protein unfolding occurs during reagent addition.171 An ion in a compact form has a higher mobility, and hence a shorter drift time, than the same ion in a more open conformation as it travels through the ion mobility separation device. The drift times of the higher charge state ions of myoglobin in the presence of m-NBA were longer (i.e., the mobility was decreased), suggesting that these ions are in the unfolded state and have a greater collision cross section.171 However, the same charge states with and without supercharging were not compared when they reported this arrival time distribution data. For the 7+, 8+, and 9+ charge states, which were present with and without supercharging, the overlaid arrival time distributions do not show a difference in the mobility plot large enough to conclude that a conformational change occurs. For the higher charge states, such as 10+ to 19+, which are present only when m-NBA is added, they report that the relative increase in the collision cross section indicates unfolding of the protein. To make the argument more compelling, the collision cross section has to be calculated for the same charge state with and without supercharging.
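The need to compare identical charge states follows from how mobility depends on both charge and collision cross section. The relation below is the Mason-Schamp equation for a drift-tube measurement in the low-field limit, given here as a textbook illustration; whether it applies directly to the mobility device used in the cited work is an assumption. K is the mobility, z the charge state, e the elementary charge, N the buffer gas number density, μ the reduced mass of the ion and buffer gas, T the gas temperature, Ω the collision cross section, L the drift length, and E the field strength.

```latex
K = \frac{3\,z\,e}{16\,N}\,\sqrt{\frac{2\pi}{\mu\,k_{B}\,T}}\;\frac{1}{\Omega},
\qquad
t_d = \frac{L}{K\,E} \;\propto\; \frac{\Omega}{z}
```

Because the drift time scales with Ω/z under these assumptions, a change in arrival time between different charge states reflects the change in charge as well as any change in conformation, whereas a comparison at fixed z isolates the change in Ω.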

Hogan et al. also employed ion mobility MS using a differential mobility analyzer to probe the conformation of supercharged phosphorylase b ions.172 Immediately after solvent evaporation, the supercharged protein ions showed a small decrease (6-10%) in mobility when compared to those without any reagent added. But this decrease in mobility was not large enough to indicate a transition from a folded to an unfolded conformation (~60% decrease). The authors attributed this to the presence of sulfolane adducts on the protein.

In addition to the evidence above, the Williams group employed circular dichroism (CD) to monitor the change in protein conformation upon adding supercharging reagents. Results from CD experiments showed that sulfolane unfolded proteins only at high concentrations, and they extended these results to the situation present during ESI droplet evaporation.171

Furthermore, dimethyl sulfoxide (DMSO) was also shown to charge enhance by the same mechanism, i.e., the high boiling point of DMSO caused it to enrich in the droplet and denature the protein.173 However, DMSO is highly basic; it decreases charge at low concentrations (1%) and induces denaturation at high concentrations (> 5%). Marshall’s group also noticed charge enhancements with 5% DMSO174 but these results cannot be categorized as supercharging, since DMSO acts as a denaturant and the charge enhancing effects are not independent of the conformational changes it can cause.

Finally, the recent paper from the Williams group discusses the effects of conformation on supercharging in disulfide bonded proteins.175 First, they show that chemical crosslinking in proteins decreases their ability to be supercharged with m-NBA in aqueous solution conditions. These results have to be explored with other supercharging reagents using proteins that have naturally occurring disulfide bonds in aqueous conditions. In denaturing conditions, disulfide bonded proteins can be charge enhanced greatly and the only limitation is the number of basic groups in the protein that can be protonated.176 Chemical crosslinking might create an artificial constraint in the protein that causes an increase in the charge density during supercharging, and so it could lead to a decrease in the number of charges observed. Next, supercharging the reduced/alkylated RNase A did not show any further supercharging in aqueous solutions. This effect is also questionable, since reducing the disulfide bonds in the protein creates a more unfolded conformation and increases the ESI multiple charging regardless of supercharging. Loo et al. have reported that the number of charges for disulfide-reduced proteins increases above the number of basic sites177 and Smith et al. have tabulated the same observation in different disulfide-reduced proteins.31 Supercharging the reduced form might not be expected to generate a substantial gain, since the sites where a charge can potentially reside are already occupied.

In light of the above discussion about the mechanisms behind supercharging, the following observations should be taken into consideration:

1. Recent evidence from different groups has illustrated that in the negative mode, the increase in charge is not equivalent to the one observed in positive mode.170, 176

2. Peptides which lack secondary structures can be supercharged178, 179 and the number of charges acquired exceeds the basic sites.176

3. Non-ionic saccharide detergents known for their ability to solubilize proteins by reducing the surface tension can increase charge for peptides in both aqueous and denaturing conditions.180

4. Not all proteins can be equally charge enhanced. Size and pI of the protein play an important role in the ability to be supercharged.176

No single mechanism can account for the supercharging effect observed in all of the above-mentioned situations. Though it is currently not evident whether unfolding occurs during supercharging, other factors such as the basicity of the reagent, direct interaction between reagent and protein, the dipole moment of the reagent, and the role of non-basic amino acids in proteins can be explored in the future to find the mechanism(s) behind supercharging.

Despite the different theories behind supercharging, its benefits to the field of TD protein characterization are remarkable. For targeted protein characterization, selecting the higher charge states obtained during supercharging has proved to be more efficient for ETD and ECD. Sterling et al. compared the ETD of ubiquitin by selecting the most abundant charge state with and without supercharging, and found a 61% increase in the sequence coverage with supercharging.181 Madsen and Brodbelt demonstrated that infrared multiphoton dissociation (IRMPD) of highly charged peptides obtained by supercharging resulted in increased sequence coverage and yielded more diagnostic ions that help with characterizing the modifications present.182 For characterizing protein-ligand complexes, Yin and Loo reported sequence-specific product ions originating from the ligand-bound form of the protein by performing ECD on the higher charge states obtained only during supercharging.183

With the targeted studies highlighting the importance of supercharging, these advantages have been extended to large-scale protein/peptide analysis by adding the supercharging reagents to the solvents used in LC to facilitate increased charging during elution from the column. For peptide analysis, Jensen’s group has demonstrated that ETD fragmentation efficiency, and hence sequence coverage, of phosphopeptides and tryptic peptides increased when m-NBA was added to the LC solvent.178 Valeja et al. extended this approach to intact proteins eluting from a column and demonstrated charge increment for transferrin (78 kDa) with reagents such as DMSO and DMF.174 However, these reagents are highly basic and there are other reports on the ‘subcharging’ effects of DMSO. For instance, Meyer and Komives reported that addition of 5% DMSO to the LC solvent for peptide identification narrowed the charge state distribution to a few intense charge states.179 They attributed this ‘charge state coalescence’ effect to the improved spray stability and desolvation efficiency, since more uniform droplet sizes resulting from spray stability could result in uniform charge states. In addition, Tsybin’s group noticed a bimodal charge distribution for cytochrome c when 5% DMSO was used as a supercharging reagent in a dual electrospray tip to deliver the protein solution and supercharging reagent separately into the ESI plume.184 In the same report, 5% DMSO did not efficiently supercharge substance P (a peptide with MW 1347 Da) when compared to the other supercharging reagents used.184

In addition to the charge enhancing effect, supercharging reagents have been used to promote signal intensity under circumstances where other LC modifiers such as trifluoroacetic acid (TFA) suppress the signal.185 TFA is the most common organic modifier used for reversed phase LC separations utilizing UV based detectors. TFA works as an ion-pairing agent: it complexes with the basic groups present in the proteins/peptides, decreasing their hydrophilicity and thereby increasing the retention of analytes in a reversed phase column. In addition, TFA maintains the pH of the buffer below 2, which is essential to keep the free silanol groups in the column from binding to the basic groups of the protein, thereby avoiding the second mode of retention that causes peak tailing. Though TFA has many properties that yield good resolution, increase peak capacity and lower peak tailing, it is not used for MS because of the signal suppression that occurs even at very low concentrations.186 TFA is a strong acid and the trifluoroacetate anions ion-pair with the basic groups of the protein, preventing the ionization of the analyte in the gas phase. Chen’s group has reported an increase in the signal intensity in the presence of sulfolane and m-NBA when 0.1% TFA was used as a LC modifier.185 In their experimental setup, supercharging reagents were introduced by desorption electrospray ionization (DESI) and mixed with the LC effluent containing TFA. They postulate that the supercharging reagents might interact with TFA anions and limit the association of TFA anions with the protein ions, which might enhance the signal in the presence of TFA.

1.10. Current limitations and future directions

The field of TD mass spectrometry has grown tremendously in the last decade with major improvements in mass measurement accuracy, resolving power, and fragmentation techniques for intact proteins. However, these improvements are currently utilized mostly for targeted studies of intact proteins. Large-scale TD proteomics is still under development. Most high throughput analyses of complex samples by TD methods are aimed merely at finding high numbers of proteins, and information about PTMs/isoforms is seldom obtained. Given the distinct biological roles of different isoforms and/or proteoforms in biomedical and clinical settings,27 it is important to move towards experimental strategies that are aimed at identifying these species.

Though RPLC is the most conveniently interfaced protein separation method with MS, the resolution obtained is not sufficient to differentiate the modified forms of proteins. When modified and unmodified forms of the protein co-elute, isolation/fragmentation of either form is difficult. Alternative strategies such as pI or charge based separation can be combined with MS since PTMs often change the pI of the protein.68 In addition, methods such as CE are promising, and with further improvements in sample loading capacity and by integrating with the high speed TOF instruments, it is possible to utilize the high resolution offered by this technique.

While improving the front-end separation technique helps with sorting the proteins for MS analysis, identification and characterization of the intact proteins are solely dependent on the developments in MS technology. Advancements in instrumentation have to be made to surpass the current 30 kDa limit for high throughput intact protein identification. High resolution FT-ICR instruments are ideal for studying high MW proteins and, with improved ICR cell design to prolong the transients, the resolving power can be doubled.187, 188 In addition, with the increase in the magnetic field strength, such as the 21 T instrument that is now under construction, routine characterization of intact proteins as large as 150 kDa becomes possible.78

Though FT-ICR instruments yield unparalleled resolution, they are expensive and not commonly used. Orbitrap MS is comparatively less expensive, and recent improvements such as shorter data acquisition time and extended m/z range are attractive features for intact protein analysis.77, 189 The current scan speed of the orbitrap instrument is 1 scan/second and if this is doubled, the resolution decreases. In situations where higher scanning speeds are required to analyze large numbers of samples, the high scan speed and resolution of TOF instruments (currently 50,000 on the new design instruments) can be utilized.190

In addition to the complementary TD and BU methods, an intermediate area known as the ‘Middle-Down approach’ is slowly being explored as an alternative to overcome the issues encountered with the former technologies. In middle-down, enzymatic or chemical cleavage is performed to generate peptides which are much higher in MW than tryptic peptides.191 For example, ompT cleaves at dibasic amino acids and generates peptides of around 10 kDa. Akin to the BU technique, the sensitivity, chromatographic performance, and ionization/fragmentation efficiency are better in the middle-down approach.28 More importantly, it preserves the combined modifications which are present in relatively close regions of the protein, and the issue of common peptides being identified (as in BU) is not present.191 Wu et al. explored this approach for studying EGFR and obtained 96% sequence coverage along with characterizing 13 phosphorylations and 10 N-linked glycan modifications.192 Recently, Kelleher’s group also followed the same approach and demonstrated that closely related isoforms/PTMs can be differentiated in a HeLa cell lysate with the ompT generated peptides.193
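The reason a dibasic-specific protease yields longer peptides than trypsin can be illustrated with a simple in silico digestion. This is a sketch under a simplifying assumption, namely that cleavage occurs between every pair of adjacent basic residues (K or R); the real sequence preferences of ompT are more nuanced, and the function name and test sequence are hypothetical.

```python
BASIC = {"K", "R"}


def dibasic_cleave(seq):
    """Cleave between every pair of adjacent basic residues (K or R).

    Single basic residues, where trypsin would cut, are left intact, so
    the resulting peptides are longer on average than tryptic peptides.
    """
    peptides = []
    start = 0
    for i in range(len(seq) - 1):
        if seq[i] in BASIC and seq[i + 1] in BASIC:
            peptides.append(seq[start:i + 1])  # cut after residue i
            start = i + 1
    peptides.append(seq[start:])  # trailing peptide
    return peptides
```

For the toy sequence `"AAKRGGGKLLRKPP"`, the function returns `["AAK", "RGGGKLLR", "KPP"]`: the isolated lysine in the middle peptide is not cleaved, which is how nearby modifications can stay on the same peptide.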

With continuing developments in separation strategies and MS instrumentation, TD will be the method of choice for protein identification and characterization in complex samples. In addition, with the high throughput characterization of isoforms/PTMs in complex samples, many breakthrough discoveries in the areas of biomarker discovery and protein regulation by PTMs will also be possible by the TD technique.

1.11. Summary

Mass spectrometry based proteomics is a powerful tool to identify the comprehensive set of proteins in simple to complex sample mixtures and can be performed using either top-down (TD) or bottom-up (BU) approaches. While proteins are analyzed after cleavage into peptides in BU methods, TD approaches are aimed at analyzing intact proteins. TD methods provide the intact mass of the protein, short stretches of sequence information to identify the protein, and information about the presence of isoforms and/or proteoforms.

Current TD methods for protein identification are performed on instruments with dual mass analyzers and employ data-dependent dissociation schemes to fragment the protein. In these methods, the three most abundant charge states of the protein ions are selected and fragmented one at a time, followed by post-processing to merge the data. Consequently, these methods are time-consuming and reduce the duty cycle of the instrument. Therefore, it is crucial to develop robust methods for high throughput TD protein identification. The goal of my dissertation is to develop top-down methods for high throughput analysis of proteins in complex samples.

In Chapter 2, I describe a high throughput platform to perform intact protein identification in complex mixtures using a single-stage TOF mass spectrometer. This method obviates the need for expensive instruments with dual mass analyzers to perform TD protein identification.

In Chapter 3, I present a new data-independent fragmentation technique to identify intact proteins in complex samples. In this method, multiple charge states are dissociated all at once inside the collision cell, which increases sequence coverage without lowering the duty cycle of the instrument.

In Chapter 4, I investigate the efficiency of various supercharging reagents in different solvent conditions to enhance the multiple charging of proteins. This is performed in order to utilize these reagents to increase the efficiency of mass measurement and fragmentation for proteins eluting from a column.

In Chapter 5, I employ the supercharging reagents to enhance the multiple charging of proteins as they elute from a column. Enhancing charge while the proteins elute on the chromatographic timescale is beneficial for high throughput analysis of complex samples.

Finally, in Chapter 6, I discuss the major findings from my dissertation and provide future directions for the field.

In the Appendix, I have examined the metabolic distress after traumatic brain injury (TBI) by studying the differential proteome in patients who were critically injured and patients who recovered after TBI. I employed top-down and bottom-up MS based methods to profile both groups of samples. Included here is a reprint of the manuscript: Lakshmanan, R., Loo, J.A., Drake, T., LeBlanc, J., Ytterberg, A.J., McArthur, D.L., Etchepare, M., Vespa, P.M. Metabolic Crisis After Traumatic Brain Injury is Associated with a Novel Microdialysis Proteome. Neurocritical Care 2010, 12(3), 324-336.

1.12. References

1. Domon, B.; Aebersold, R., Mass spectrometry and protein analysis. Science 2006, 312, (5771), 212-217.

2. Bogdanov, B.; Smith, R. D., Proteomics by FTICR mass spectrometry: top down and bottom up. Mass Spectrom Rev 2005, 24, (2), 168-200.

3. Aebersold, R.; Goodlett, D. R., Mass spectrometry in proteomics. Chem Rev 2001, 101, (2), 269-295.

4. Han, X.; Aslanian, A.; Yates, J. R., 3rd, Mass spectrometry for proteomics. Curr Opin Chem Biol 2008, 12, (5), 483-490.

5. Lee, M. S., Methods of Mass Spectrometry in Homeland Security Applications. Mass Spectrometry Handbook 2012, 419-440.

6. Wynne, C.; Edwards, N. J.; Fenselau, C., Phyloproteomic classification of unsequenced organisms by top-down identification of bacterial proteins using capLC-MS/MS on an Orbitrap. Proteomics 2010, 10, (20), 3631-3643.

7. Sze, S. K.; Ge, Y.; Oh, H.; McLafferty, F. W., Top-down mass spectrometry of a 29-kDa protein for characterization of any posttranslational modification to within one residue. Proc Natl Acad Sci U S A 2002, 99, (4), 1774-1779.

8. Kelleher, N. L., Top-down proteomics. Anal Chem 2004, 76, (11), 197A-203A.

9. Chait, B. T., Chemistry. Mass spectrometry: bottom-up or top-down? Science 2006, 314, (5796), 65-66.

10. Yates, J. R.; Ruse, C. I.; Nakorchevsky, A., Proteomics by mass spectrometry: approaches, advances, and applications. Annu Rev Biomed Eng 2009, 11, 49-79.

11. Boyne, M.; Bose, R., Target Proteins: Bottom-up and Top-down Proteomics. Protein and Peptide Mass Spectrometry in Drug Discovery 2012, 89-100.

12. Delahunty, C. M.; Yates, J. R., 3rd, MudPIT: multidimensional protein identification technology. Biotechniques 2007, 43, (5), 563-569.

13. Kislinger, T.; Gramolini, A. O.; MacLennan, D. H.; Emili, A., Multidimensional protein identification technology (MudPIT): technical overview of a profiling method optimized for the comprehensive proteomic investigation of normal and diseased heart tissue. J Am Soc Mass Spectrom 2005, 16, (8), 1207-1220.

14. Cunningham, R.; Ma, D.; Li, L., Mass spectrometry-based proteomics and peptidomics for systems biology and biomarker discovery. Front Biol 2012, 7, (4), 313-335.

15. Hart, S. R.; Gaskell, S. J., Mass spectrometry-based proteomics and peptidomics for systems biology and biomarker discovery. Trends Anal Chem 2005, 24, (7), 566-574.

16. Schluter, H.; Apweiler, R.; Holzhutter, H. G.; Jungblut, P. R., Finding one's way in proteomics: a protein species nomenclature. Chem Cent J 2009, 3, 11.

17. Jungblut, P. R.; Holzhutter, H. G.; Apweiler, R.; Schluter, H., The speciation of the proteome. Chem Cent J 2008, 2, 16.

18. Brill, L. M.; Motamedchaboki, K.; Wu, S.; Wolf, D. A., Comprehensive proteomic analysis of Schizosaccharomyces pombe by two-dimensional HPLC-tandem mass spectrometry. Methods 2009, 48, (3), 311-319.

19. Wisniewski, J. R.; Zougman, A.; Nagaraj, N.; Mann, M., Universal sample preparation method for proteome analysis. Nat Methods 2009, 6, (5), 359-362.

20. Barnidge, D. R.; Hall, G. D.; Stocker, J. L.; Muddiman, D. C., Evaluation of a cleavable stable isotope labeled synthetic peptide for absolute protein quantification using LC-MS/MS. J Proteome Res 2004, 3, (3), 658-661.

21. Gruhler, S.; Kratchmarova, I., Stable isotope labeling by amino acids in cell culture (SILAC). Methods Mol Biol 2008, 424, 101-111.

22. Perkel, J. M., iTRAQ gets put to the test. J Proteome Res 2009, 8, (11), 4885.

23. Ramus, C.; Gonzalez de Peredo, A.; Dahout, C.; Gallagher, M.; Garin, J., An optimized strategy for ICAT quantification of membrane proteins. Mol Cell Proteomics 2006, 5, (1), 68-78.

24. Armirotti, A.; Damonte, G., Achievements and perspectives of top-down proteomics. Proteomics 2010, 10, (20), 3566-3576.

25. Calligaris, D.; Villard, C.; Lafitte, D., Advances in top-down proteomics for disease biomarker discovery. J Proteomics 2011, 74, (7), 920-934.

26. Angel, T. E.; Aryal, U. K.; Hengel, S. M.; Baker, E. S.; Kelly, R. T.; Robinson, E. W.; Smith, R. D., Mass spectrometry-based proteomics: existing capabilities and future directions. Chem Soc Rev 2012, 41, (10), 3912-3928.

27. Stastna, M.; Van Eyk, J. E., Analysis of protein isoforms: can we do it better? Proteomics 2012, 12, (19-20), 2937-2948.

28. Garcia, B. A., What Does the Future Hold for Top Down Mass Spectrometry? J Am Soc Mass Spectrom 2010, 21, 193-202.

29. Fenn, J. B., Electrospray wings for molecular elephants (Nobel lecture). Angew Chem Int Ed Engl 2003, 42, (33), 3871-3894.

30. Fenn, J. B.; Mann, M.; Meng, C. K.; Wong, S. F.; Whitehouse, C. M., Electrospray ionization for mass spectrometry of large biomolecules. Science 1989, 246, (4926), 64-71.

31. Smith, R. D.; Loo, J. A.; Edmonds, C. G.; Barinaga, C. J.; Udseth, H. R., New developments in biochemical mass spectrometry: electrospray ionization. Anal Chem 1990, 62, (9), 882-899.

32. Loo, J. A.; Edmonds, C. G.; Smith, R. D., Primary sequence information from intact proteins by electrospray ionization tandem mass spectrometry. Science 1990, 248, (4952), 201-204.

33. Loo, J. A.; Edmonds, C. G.; Smith, R. D., Tandem mass spectrometry of very large molecules: serum albumin sequence information from multiply charged ions formed by electrospray ionization. Anal Chem 1991, 63, (21), 2488-2499.

34. Feng, R.; Konishi, Y., Analysis of antibodies and other large glycoproteins in the mass range of 150,000-200,000 Da by electrospray ionization mass spectrometry. Anal Chem 1992, 64, (18), 2090-2095.

35. Feng, R.; Konishi, Y., Collisionally-Activated Dissociation of Multiply Charged 150-kDa Antibody Ions. Anal Chem 1993, 65, 645-649.

36. Henry, K. D.; Williams, E. R.; Wang, B. H.; McLafferty, F. W.; Shabanowitz, J.; Hunt, D. F., Fourier-transform mass spectrometry of large molecules by electrospray ionization. Proc Natl Acad Sci U S A 1989, 86, (23), 9075-9078.

37. Horn, D. M.; Zubarev, R. A.; McLafferty, F. W., Automated reduction and interpretation of high resolution electrospray mass spectra of large molecules. J Am Soc Mass Spectrom 2000, 11, (4), 320-332.

38. Loo, J. A.; Quinn, J. P.; Ryu, S. I.; Henry, K. D.; Senko, M. W.; McLafferty, F. W., High-resolution tandem mass spectrometry of large biomolecules. Proc Natl Acad Sci U S A 1992, 89, (1), 286-289.

39. Beu, S. C.; Senko, M. W.; Quinn, J. P.; McLafferty, F. W., Improved Fourier-Transform Ion-Cyclotron-Resonance Mass Spectrometry of Large Biomolecules. J Am Soc Mass Spectrom 1993, 4, 190-192.

40. Beu, S. C.; Senko, M. W.; Quinn, J. P.; Wampler, F. M.; McLafferty, F. W., Fourier-Transform Electrospray Instrumentation for Tandem High-Resolution Mass Spectrometry of Large Molecules. J Am Soc Mass Spectrom 1993, 4, 557-565.

41. Guan, S.; Marshall, A. G., Stored waveform inverse Fourier transform axial excitation/ejection for quadrupole ion trap mass spectrometry. Anal Chem 1993, 65, (9), 1288-1294.

42. Senko, M. W.; Beu, S. C.; McLafferty, F. W., High-resolution tandem mass spectrometry of carbonic anhydrase. Anal Chem 1994, 66, (3), 415-418.

43. Mortz, E.; O'Connor, P. B.; Roepstorff, P.; Kelleher, N. L.; Wood, T. D.; McLafferty, F. W.; Mann, M., Sequence tag identification of intact proteins by matching tandem mass spectral data against sequence data bases. Proc Natl Acad Sci U S A 1996, 93, (16), 8264-8267.

44. Kelleher, N. L.; Senko, M. W.; Siegel, M. M.; McLafferty, F. W., Unit Resolution Mass Spectra of 112 kDa Molecules with 3 Da Accuracy. J Am Soc Mass Spectrom 1997, 8, 380-383.

45. Guan, Z.; Kelleher, N. L.; O'Connor, P. B.; Aaserud, D. J.; Little, D. P.; McLafferty, F. W., 193 nm photodissociation of larger multiply-charged biomolecules. Int J Mass Spectrom and Ion Proc 1996, 157/158, 357-364.

46. Little, D. P.; Speir, J. P.; Senko, M. W.; O'Connor, P. B.; McLafferty, F. W., Infrared multiphoton dissociation of large multiply charged ions for biomolecule sequencing. Anal Chem 1994, 66, (18), 2809-2815.

47. Kelleher, N. L.; Lin, H. Y.; Valaskovic, G. A.; Aaserud, D. J.; Fridriksson, E. K.; McLafferty, F. W., Top Down versus Bottom Up Protein Characterization by Tandem High-Resolution Mass Spectrometry. J Am Chem Soc 1999, 121, 806-812.

48. Zubarev, R. A.; Kelleher, N. L.; McLafferty, F. W., Electron Capture Dissociation of Multiply Charged Protein Cations. A Nonergodic Process. J Am Chem Soc 1998, 120, 3265-3266.

49. Syka, J. E.; Coon, J. J.; Schroeder, M. J.; Shabanowitz, J.; Hunt, D. F., Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry. Proc Natl Acad Sci U S A 2004, 101, (26), 9528-9533.

50. Aaserud, D. J.; Little, D. P.; O'Connor, P. B.; McLafferty, F. W., Distinguishing N- and C-Terminus Ions for Mass Spectrometry Sequencing of Proteins without Prior Degradation. Rapid Commun Mass Spectrom 1995, 9, 871-876.

51. Kelleher, N. L.; Zubarev, R. A.; Bush, K.; Furie, B.; Furie, B. C.; McLafferty, F. W.; Walsh, C. T., Localization of labile posttranslational modifications by electron capture dissociation: the case of gamma-carboxyglutamic acid. Anal Chem 1999, 71, (19), 4250-4253.

52. Zabrouskov, V.; Han, X.; Welker, E.; Zhai, H.; Lin, C.; van Wijk, K. J.; Scheraga, H. A.; McLafferty, F. W., Stepwise deamidation of ribonuclease A at five sites determined by top down mass spectrometry. Biochemistry 2006, 45, (3), 987-992.

53. Sharon, M.; Robinson, C. V., The role of mass spectrometry in structure elucidation of dynamic protein complexes. Annu Rev Biochem 2007, 76, 167-193.

54. Smith, R. D.; Light-Wahl, K. J.; Winger, B. E.; Loo, J. A., Preservation of non-covalent associations in electrospray ionization mass spectrometry: Multiply charged polypeptide and protein dimers. Organic Mass Spec 2005, 27, (7), 811-821.

55. Souda, P.; Ryan, C. M.; Cramer, W. A.; Whitelegge, J., Profiling of integral membrane proteins and their post translational modifications using high-resolution mass spectrometry. Methods 2011, 55, (4), 330-336.

56. Mazur, M. T.; Cardasis, H. L.; Spellman, D. S.; Liaw, A.; Yates, N. A.; Hendrickson, R. C., Quantitative analysis of intact apolipoproteins in human HDL by top-down differential mass spectrometry. Proc Natl Acad Sci U S A 2010, 107, (17), 7728-7733.

57. McLafferty, F. W., A century of progress in molecular mass spectrometry. Annu Rev Anal Chem 2011, 4, 1-22.

58. Ge, Y.; Lawhorn, B. G.; ElNaggar, M.; Strauss, E.; Park, J. H.; Begley, T. P.; McLafferty, F. W., Top down characterization of larger proteins (45 kDa) by electron capture dissociation mass spectrometry. J Am Chem Soc 2002, 124, (4), 672-678.

59. Han, X.; Jin, M.; Breuker, K.; McLafferty, F. W., Extending top-down mass spectrometry to proteins with masses greater than 200 kilodaltons. Science 2006, 314, (5796), 109-112.

60. Valeja, S. G.; Kaiser, N. K.; Xian, F.; Hendrickson, C. L.; Rouse, J. C.; Marshall, A. G., Unit mass baseline resolution for an intact 148 kDa therapeutic monoclonal antibody by Fourier transform ion cyclotron resonance mass spectrometry. Anal Chem 2011, 83, (22), 8391-8395.

61. Tolmachev, A. V.; Robinson, E. W.; Wu, S.; Pasa-Tolic, L.; Smith, R. D., FT-ICR MS optimization for the analysis of intact proteins. Int J Mass Spectrom 2009, 281, (1-3), 32-38.

62. Karabacak, N. M.; Li, L.; Tiwari, A.; Hayward, L. J.; Hong, P.; Easterling, M. L.; Agar, J. N., Sensitive and specific identification of wild type and variant proteins from 8 to 669 kDa using top-down mass spectrometry. Mol Cell Proteomics 2009, 8, (4), 846-856.

63. Parks, B. A.; Jiang, L.; Thomas, P. M.; Wenger, C. D.; Roth, M. J.; Boyne, M. T., 2nd; Burke, P. V.; Kwast, K. E.; Kelleher, N. L., Top-down proteomics on a chromatographic time scale using linear ion trap fourier transform hybrid mass spectrometers. Anal Chem 2007, 79, (21), 7984-7991.

64. Zhang, J.; Guy, M. J.; Norman, H. S.; Chen, Y. C.; Xu, Q.; Dong, X.; Guner, H.; Wang, S.; Kohmoto, T.; Young, K. H.; Moss, R. L.; Ge, Y., Top-down quantitative proteomics identified phosphorylation of cardiac troponin I as a candidate biomarker for chronic heart failure. J Proteome Res 2011, 10, (9), 4054-4065.

65. Ryan, C. M.; Souda, P.; Halgand, F.; Wong, D. T.; Loo, J. A.; Faull, K. F.; Whitelegge, J. P., Confident assignment of intact mass tags to human salivary cystatins using top-down Fourier-transform ion cyclotron resonance mass spectrometry. J Am Soc Mass Spectrom 2010, 21, (6), 908-917.

66. Nepomuceno, A. I.; Mason, C. J.; Muddiman, D. C.; Bergen, H. R., 3rd; Zeldenrust, S. R., Detection of genetic variants of transthyretin by liquid chromatography-dual electrospray ionization fourier-transform ion-cyclotron-resonance mass spectrometry. Clin Chem 2004, 50, (9), 1535-1543.

67. Marshall, A. G.; Hendrickson, C. L., High-resolution mass spectrometers. Annu Rev Anal Chem 2008, 1, 579-599.

68. Cui, W.; Rohrs, H. W.; Gross, M. L., Top-down mass spectrometry: recent developments, applications and perspectives. Analyst 2011, 136, (19), 3854-3864.

69. Macek, B.; Waanders, L. F.; Olsen, J. V.; Mann, M., Top-down protein sequencing and MS3 on a hybrid linear quadrupole ion trap-orbitrap mass spectrometer. Mol Cell Proteomics 2006, 5, (5), 949-958.

70. Bondarenko, P. V.; Second, T. P.; Zabrouskov, V.; Makarov, A. A.; Zhang, Z., Mass measurement and top-down HPLC/MS analysis of intact monoclonal antibodies on a hybrid linear quadrupole ion trap-Orbitrap mass spectrometer. J Am Soc Mass Spectrom 2009, 20, (8), 1415-1424.

71. Mazur, M. T.; Seipert, R. S.; Mahon, D.; Zhou, Q.; Liu, T., A platform for characterizing therapeutic monoclonal antibody breakdown products by 2D chromatography and top-down mass spectrometry. The AAPS J 2012, 14, (3), 530-541.

72. Second, T. P.; Zabrouskov, V., Utilizing a Hybrid Mass Spectrometer to Enable Fundamental Protein Characterization: Intact Mass Analysis and Top-Down Fragmentation with the LTQ Orbitrap MS. Thermo Application Note: 498 2010.

73. Mohr, J.; Swart, R.; Samonig, M.; Bohm, G.; Huber, C. G., High-efficiency nano- and micro-HPLC--high-resolution Orbitrap-MS platform for top-down proteomics. Proteomics 2010, 10, (20), 3598-3609.

74. Theberge, R.; Infusini, G.; Tong, W.; McComb, M. E.; Costello, C. E., Top-Down Analysis of Small Plasma Proteins Using an LTQ-Orbitrap. Potential for Mass Spectrometry-Based Clinical Assays for Transthyretin and Hemoglobin. Int J Mass Spectrom 2011, 300, (2-3), 130-142.

75. Tian, Z.; Zhao, R.; Tolic, N.; Moore, R. J.; Stenoien, D. L.; Robinson, E. W.; Smith, R. D.; Pasa-Tolic, L., Two-dimensional liquid chromatography system for online top-down mass spectrometry. Proteomics 2010, 10, (20), 3610-3620.

76. Michalski, A.; Damoc, E.; Lange, O.; Denisov, E.; Nolting, D.; Muller, M.; Viner, R.; Schwartz, J.; Remes, P.; Belford, M.; Dunyach, J. J.; Cox, J.; Horning, S.; Mann, M.; Makarov, A., Ultra high resolution linear ion trap Orbitrap mass spectrometer (Orbitrap Elite) facilitates top down LC MS/MS and versatile peptide fragmentation modes. Mol Cell Proteomics 2011, 11, (3), O111.013698.

77. Ahlf, D. R.; Compton, P. D.; Tran, J. C.; Early, B. P.; Thomas, P. M.; Kelleher, N. L., Evaluation of the compact high-field orbitrap for top-down proteomics of human cells. J Proteome Res 2012, 11, (8), 4308-4314.

78. Arnaud, C. H., High-Res Mass Spec. Chem Eng News 2010, 88, (25), 10-15.

79. Nemeth-Cawley, J. F.; Rouse, J. C., Identification and sequencing analysis of intact proteins via collision-induced dissociation and quadrupole time-of-flight mass spectrometry. J Mass Spectrom 2002, 37, (3), 270-282.

80. Nemeth-Cawley, J. F.; Tangarone, B. S.; Rouse, J. C., "Top Down" characterization is a complementary technique to peptide sequencing for identifying protein species in complex mixtures. J Proteome Res 2003, 2, (5), 495-505.

81. Armirotti, A.; Benatti, U.; Damonte, G., Top-down proteomics with a quadrupole time-of-flight mass spectrometer and collision-induced dissociation. Rapid Commun Mass Spectrom 2009, 23, (5), 661-666.

82. Thevis, M.; Loo, R. R.; Loo, J. A., Mass spectrometric characterization of transferrins and their fragments derived by reduction of disulfide bonds. J Am Soc Mass Spectrom 2003, 14, (6), 635-647.

83. Ginter, J. M.; Zhou, F.; Johnston, M. V., Generating protein sequence tags by combining cone and conventional collision induced dissociation in a quadrupole time-of-flight mass spectrometer. J Am Soc Mass Spectrom 2004, 15, (10), 1478-1486.

84. Johnson, R. W., Jr.; Ahmed, T. F.; Miesbauer, L. J.; Edalji, R.; Smith, R.; Harlan, J.; Dorwin, S.; Walter, K.; Holzman, T., Protein fragmentation via liquid chromatography-quadrupole time-of-flight mass spectrometry: the use of limited sequence information in structural characterization. Anal Biochem 2005, 341, (1), 22-32.

85. Ren, D.; Pipes, G. D.; Hambly, D.; Bondarenko, P. V.; Treuheit, M. J.; Gadgil, H. S., Top-down N-terminal sequencing of Immunoglobulin subunits with electrospray ionization time of flight mass spectrometry. Anal Biochem 2009, 384, (1), 42-48.

86. Sowell, R. A.; Koeniger, S. L.; Valentine, S. J.; Moon, M. H.; Clemmer, D. E., Nanoflow LC/IMS-MS and LC/IMS-CID/MS of protein mixtures. J Am Soc Mass Spectrom 2004, 15, (9), 1341-1353.

87. Zinnel, N. F.; Pai, P. J.; Russell, D. H., Ion mobility-mass spectrometry (IM-MS) for top-down proteomics: increased dynamic range affords increased sequence coverage. Anal Chem 2012, 84, (7), 3390-3397.

88. Halgand, F.; Habchi, J.; Cravello, L.; Martinho, M.; Guigliarelli, B.; Longhi, S., Dividing to unveil protein microheterogeneities: traveling wave ion mobility study. Anal Chem 2011, 83, (19), 7306-7315.

89. Scherperel, G.; Reid, G. E., Emerging methods in proteomics: top-down protein characterization by multistage tandem mass spectrometry. Analyst 2007, 132, (6), 500-506.

90. Huang, T. Y.; McLuckey, S. A., Top-down protein characterization facilitated by ion/ion reactions on a quadrupole/time of flight platform. Proteomics 2010, 10, (20), 3577-3588.

91. Chanthamontri, C.; Liu, J.; McLuckey, S. A., Charge State Dependent Fragmentation of Gaseous alpha-Synuclein Cations via Ion Trap and Beam-Type Collisional Activation. Int J Mass Spectrom 2009, 283, (1-3), 9-16.

92. Hogan, J. M.; Pitteri, S. J.; McLuckey, S. A., Phosphorylation site identification via ion trap tandem mass spectrometry of whole protein and peptide ions: bovine alpha-crystallin A chain. Anal Chem 2003, 75, (23), 6509-6516.

93. Mekecha, T. T.; Amunugama, R.; McLuckey, S. A., Ion trap collision-induced dissociation of human hemoglobin alpha-chain cations. J Am Soc Mass Spectrom 2006, 17, (7), 923-931.

94. Reid, G. E.; Stephenson, J. L., Jr.; McLuckey, S. A., Tandem mass spectrometry of ribonuclease A and B: N-linked glycosylation site analysis of whole protein ions. Anal Chem 2002, 74, (3), 577-583.

95. Reid, G. E.; Wu, J.; Chrisman, P. A.; Wells, J. M.; McLuckey, S. A., Charge-state-dependent sequence analysis of protonated ubiquitin ions via ion trap tandem mass spectrometry. Anal Chem 2001, 73, (14), 3274-3281.

96. Gorshkov, M. V.; Zubarev, R. A., On the accuracy of polypeptide masses measured in a linear ion trap. Rapid Commun Mass Spectrom 2005, 19, (24), 3755-3758.

97. Scherperel, G.; Yan, H.; Wang, Y.; Reid, G. E., 'Top-down' characterization of site-directed mutagenesis products of Staphylococcus aureus dihydroneopterin aldolase by multistage tandem mass spectrometry in a linear quadrupole ion trap. Analyst 2006, 131, (2), 291-302.

98. Cody, R. B., Jr.; Amster, I. J.; McLafferty, F. W., Peptide mixture sequencing by tandem Fourier-transform mass spectrometry. Proc Natl Acad Sci U S A 1985, 82, (19), 6367-6370.

99. Hunt, D. F.; Yates, J. R., 3rd; Shabanowitz, J.; Winston, S.; Hauer, C. R., Protein sequencing by tandem mass spectrometry. Proc Natl Acad Sci U S A 1986, 83, (17), 6233-6237.

100. McLafferty, F. W., Tandem Mass Spectrometry (MS/MS): A Promising New Analytical Technique for Specific Component Determination in Complex Mixtures. Acc Chem Res 1980, 13, (2), 33-39.

101. Wysocki, V. H.; Tsaprailis, G.; Smith, L. L.; Breci, L. A., Mobile and localized protons: a framework for understanding peptide dissociation. J Mass Spectrom 2000, 35, (12), 1399-1406.

102. Loo, J. A.; Udseth, H. R.; Smith, R. D., Collisional Effects on the Charge Distribution of Ions from Large Molecules, Formed by Electrospray-ionization Mass Spectrometry. Rapid Commun Mass Spectrom 1988, 2, (10), 207-210.

103. Bure, C.; Lange, C., Comparison of Dissociation of Ions in an Electrospray Source, or a Collision Cell in Tandem Mass Spectrometry. Curr Organic Chem 2003, 7, 1613-1624.

104. Kellie, J. F.; Catherman, A. D.; Durbin, K. R.; Tran, J. C.; Tipton, J. D.; Norris, J. L.; Witkowski, C. E., 2nd; Thomas, P. M.; Kelleher, N. L., Robust analysis of the yeast proteome under 50 kDa by molecular-mass-based fractionation and top-down mass spectrometry. Anal Chem 2012, 84, (1), 209-215.

105. Tipton, J. D.; Tran, J. C.; Catherman, A. D.; Ahlf, D. R.; Durbin, K. R.; Lee, J. E.; Kellie, J. F.; Kelleher, N. L.; Hendrickson, C. L.; Marshall, A. G., Nano-LC FTICR tandem mass spectrometry for top-down proteomics: routine baseline unit mass resolution of whole cell lysate proteins up to 72 kDa. Anal Chem 2012, 84, (5), 2111-2117.

106. Olsen, J. V.; Macek, B.; Lange, O.; Makarov, A.; Horning, S.; Mann, M., Higher-energy C-trap dissociation for peptide modification analysis. Nat Methods 2007, 4, (9), 709-712.

107. Olsen, J. V.; Schwartz, J. C.; Griep-Raming, J.; Nielsen, M. L.; Damoc, E.; Denisov, E.; Lange, O.; Remes, P.; Taylor, D.; Splendore, M.; Wouters, E. R.; Senko, M.; Makarov, A.; Mann, M.; Horning, S., A dual pressure linear ion trap Orbitrap instrument with very high sequencing speed. Mol Cell Proteomics 2009, 8, (12), 2759-2769.

108. Breuker, K.; Jin, M.; Han, X.; Jiang, H.; McLafferty, F. W., Top-down identification and characterization of biomolecules by mass spectrometry. J Am Soc Mass Spectrom 2008, 19, (8), 1045-1053.

109. Zhang, H.; Cui, W.; Wen, J.; Blankenship, R. E.; Gross, M. L., Native electrospray and electron-capture dissociation FTICR mass spectrometry for top-down studies of protein assemblies. Anal Chem 2011, 83, (14), 5598-5606.

110. Coon, J. J., Collisions or electrons? Protein sequence analysis in the 21st century. Anal Chem 2009, 81, (9), 3208-3215.

111. Cooper, H. J.; Akbarzadeh, S.; Heath, J. K.; Zeller, M., Data-dependent electron capture dissociation FT-ICR mass spectrometry for proteomic analyses. J Proteome Res 2005, 4, (5), 1538-1544.

112. McLafferty, F. W.; Breuker, K.; Jin, M.; Han, X.; Infusini, G.; Jiang, H.; Kong, X.; Begley, T. P., Top-down MS, a powerful complement to the high capabilities of proteolysis proteomics. FEBS J 2007, 274, (24), 6256-6268.

113. Horn, D. M.; Ge, Y.; McLafferty, F. W., Activated ion electron capture dissociation for mass spectral sequencing of larger (42 kDa) proteins. Anal Chem 2000, 72, (20), 4778-4784.

114. Kim, M. S.; Pandey, A., Electron transfer dissociation mass spectrometry in proteomics. Proteomics 2012, 12, (4-5), 530-542.

115. Kim, M. S.; Zhong, J.; Kandasamy, K.; Delanghe, B.; Pandey, A., Systematic evaluation of alternating CID and ETD fragmentation for phosphorylated peptides. Proteomics 2011, 11, (12), 2568-2572.

116. Swaney, D. L.; McAlister, G. C.; Coon, J. J., Decision tree-driven tandem mass spectrometry for shotgun proteomics. Nat Methods 2008, 5, (11), 959-964.

117. Hart, S. R., A Protocol for top-down proteomics using HPLC and ETD/PTR-MS. Methods Mol Biol 2010, 658, 339-353.

118. Mikesh, L. M.; Ueberheide, B.; Chi, A.; Coon, J. J.; Syka, J. E.; Shabanowitz, J.; Hunt, D. F., The utility of ETD mass spectrometry in proteomic analysis. Biochim Biophys Acta 2006, 1764, (12), 1811-1822.

119. Chi, A.; Bai, D. L.; Geer, L. Y.; Shabanowitz, J.; Hunt, D. F., Analysis of intact proteins on a chromatographic time scale by electron transfer dissociation tandem mass spectrometry. Int J Mass Spectrom 2007, 259, (1-3), 197-203.

120. Fornelli, L.; Damoc, E.; Thomas, P. M.; Kelleher, N. L.; Aizikov, K.; Denisov, E.; Makarov, A.; Tsybin, Y. O., Top-down analysis of monoclonal antibody IgG1 by electron transfer dissociation Orbitrap FTMS. Mol Cell Proteomics 2012, M112.019620.

121. Tsybin, Y. O.; Fornelli, L.; Stoermer, C.; Luebeck, M.; Parra, J.; Nallet, S.; Wurm, F. M.; Hartmer, R., Structural analysis of intact monoclonal antibodies by electron transfer dissociation mass spectrometry. Anal Chem 2011, 83, (23), 8919-8927.

122. Righetti, P. G., Electrophoresis: the march of pennies, the march of dimes. J Chromatogr A 2005, 1079, (1-2), 24-40.

123. Claverol, S.; Burlet-Schiltz, O.; Gairin, J. E.; Monsarrat, B., Characterization of protein variants and post-translational modifications: ESI-MSn analyses of intact proteins eluted from polyacrylamide gels. Mol Cell Proteomics 2003, 2, (8), 483-493.

124. Rose, D. J.; Opiteck, G. J., Two-dimensional gel electrophoresis/liquid chromatography for the micropreparative isolation of proteins. Anal Chem 1994, 66, (15), 2529-2536.

125. Tran, J. C.; Doucette, A. A., Gel-eluted liquid fraction entrapment electrophoresis: an electrophoretic method for broad molecular weight range proteome separation. Anal Chem 2008, 80, (5), 1568-1573.

126. Root, B. E.; Zhang, B.; Barron, A. E., Size-based protein separations by microchip electrophoresis using an acid-labile surfactant as a replacement for SDS. Electrophoresis 2009, 30, (12), 2117-2122.

127. Meng, F.; Cargile, B. J.; Patrie, S. M.; Johnson, J. R.; McLoughlin, S. M.; Kelleher, N. L., Processing complex mixtures of intact proteins for direct analysis by mass spectrometry. Anal Chem 2002, 74, (13), 2923-2929.

128. Loo, J. A.; Brown, J.; Critchley, G.; Mitchell, C.; Andrews, P. C.; Ogorzalek Loo, R. R., High sensitivity mass spectrometric methods for obtaining intact molecular weights from gel-separated proteins. Electrophoresis 1999, 20, (4-5), 743-748.

129. Loo, R. R.; Cavalcoli, J. D.; VanBogelen, R. A.; Mitchell, C.; Loo, J. A.; Moldover, B.; Andrews, P. C., Virtual 2-D gel electrophoresis: visualization and analysis of the E. coli proteome by mass spectrometry. Anal Chem 2001, 73, (17), 4063-4070.

130. Tran, J. C.; Zamdborg, L.; Ahlf, D. R.; Lee, J. E.; Catherman, A. D.; Durbin, K. R.; Tipton, J. D.; Vellaichamy, A.; Kellie, J. F.; Li, M.; Wu, C.; Sweet, S. M.; Early, B. P.; Siuti, N.; LeDuc, R. D.; Compton, P. D.; Thomas, P. M.; Kelleher, N. L., Mapping intact protein isoforms in discovery mode using top-down proteomics. Nature 2011, 480, (7376), 254-258.

131. Lee, C. S.; Balgley, B. M., Capillary isoelectric focusing/reversed phase liquid chromatography/mass spectrometry. Methods Mol Biol 2009, 492, 233-240.

132. Martinovic, S.; Pasa-Tolic, L.; Smith, R. D., Capillary isoelectric focusing--mass spectrometry of proteins and protein complexes. Methods Mol Biol 2004, 276, 291-304.

133. Zhou, F.; Johnston, M. V., Protein characterization by on-line capillary isoelectric focusing, reversed-phase liquid chromatography, and mass spectrometry. Anal Chem 2004, 76, (10), 2734-2740.

134. Zhou, F.; Johnston, M. V., Protein profiling by capillary isoelectric focusing, reversed-phase liquid chromatography, and mass spectrometry. Electrophoresis 2005, 26, (7-8), 1383-1388.

135. Weber, G.; Bocek, P., Recent developments in preparative free flow isoelectric focusing. Electrophoresis 1998, 19, (10), 1649-1653.

136. Ouvry-Patat, S. A.; Torres, M. P.; Gelfand, C. A.; Quek, H. H.; Easterling, M.; Speir, J. P.; Borchers, C. H., Top-down proteomics on a high-field Fourier transform ion cyclotron resonance mass spectrometer. Methods Mol Biol 2009, 492, 215-231.

137. Ouvry-Patat, S. A.; Torres, M. P.; Quek, H. H.; Gelfand, C. A.; O'Mullan, P.; Nissum, M.; Schroeder, G. K.; Han, J.; Elliott, M.; Dryhurst, D.; Ausio, J.; Wolfenden, R.; Borchers, C. H., Free-flow electrophoresis for top-down proteomics by Fourier transform ion cyclotron resonance mass spectrometry. Proteomics 2008, 8, (14), 2798-2808.

138. Moritz, R. L.; Ji, H.; Schutz, F.; Connolly, L. M.; Kapp, E. A.; Speed, T. P.; Simpson, R. J., A proteome strategy for fractionating proteins and peptides using continuous free-flow electrophoresis coupled off-line to reversed-phase high-performance liquid chromatography. Anal Chem 2004, 76, (16), 4811-4824.

139. Huck, C. W.; Bonn, G. K., Analysis of proteins by capillary electrophoresis. Methods Mol Biol 2008, 384, 507-540.

140. Kuhr, W. G., Capillary electrophoresis. Anal Chem 1990, 62, (12), 403R-414R.

141. Smith, R. D.; Udseth, H. R., Capillary zone electrophoresis-MS. Nature 1988, 331, (6157), 639-640.

142. Loo, J. A.; Udseth, H. R.; Smith, R. D., Peptide and protein analysis by electrospray ionization-mass spectrometry and capillary electrophoresis-mass spectrometry. Anal Biochem 1989, 179, (2), 404-412.

143. Moini, M., Simplifying CE-MS operation. 2. Interfacing low-flow separation techniques to mass spectrometry using a porous tip. Anal Chem 2007, 79, (11), 4241-4246.

144. Haselberg, R.; Ratnayake, C. K.; de Jong, G. J.; Somsen, G. W., Performance of a sheathless porous tip sprayer for capillary electrophoresis-electrospray ionization-mass spectrometry of intact proteins. J Chromatogr A 2010, 1217, (48), 7605-7611.

145. Haselberg, R.; de Jong, G. J.; Somsen, G. W., Capillary electrophoresis-mass spectrometry for the analysis of intact proteins 2007-2010. Electrophoresis 2011, 32, (1), 66-82.

146. El Rassi, Z., Electrophoretic and electrochromatographic separation of proteins in capillaries: an update covering 2007-2009. Electrophoresis 2010, 31, (1), 174-191.

147. Pioch, M.; Bunz, S. C.; Neususs, C., Capillary electrophoresis/mass spectrometry relevant to pharmaceutical and biotechnological applications. Electrophoresis 2012, 33, (11), 1517-1530.

148. Wang, Y.; Balgley, B. M.; Rudnick, P. A.; Lee, C. S., Effects of chromatography conditions on intact protein separations for top-down proteomics. J Chromatogr A 2005, 1073, (1-2), 35-41.

149. Vellaichamy, A.; Tran, J. C.; Catherman, A. D.; Lee, J. E.; Kellie, J. F.; Sweet, S. M.; Zamdborg, L.; Thomas, P. M.; Ahlf, D. R.; Durbin, K. R.; Valaskovic, G. A.; Kelleher, N. L., Size-sorting combined with improved nanocapillary liquid chromatography-mass spectrometry for identification of intact proteins up to 80 kDa. Anal Chem 2010, 82, (4), 1234-1244.

150. Eeltink, S.; Wouters, B.; Desmet, G.; Ursem, M.; Blinco, D.; Kemp, G. D.; Treumann, A., High-resolution separations of protein isoforms with liquid chromatography time-of-flight mass spectrometry using polymer monolithic capillary columns. J Chromatogr A 2011, 1218, (32), 5504-5511.

151. Sharma, S.; Simpson, D. C.; Tolic, N.; Jaitly, N.; Mayampurath, A. M.; Smith, R. D.; Pasa-Tolic, L., Proteomic Profiling of Intact Proteins Using WAX-RPLC 2-D Separations and FTICR Mass Spectrometry. J Proteome Res 2006, 6, 602-610.

152. Whitelegge, J. P., HPLC and mass spectrometry of intrinsic membrane proteins. Methods Mol Biol 2004, 251, 323-340.

153. Whitelegge, J. P., Tandem mass spectrometry of integral membrane proteins for top-down proteomics. Trends Anal Chem 2005, 24, (7), 576-582.

154. Whitelegge, J. P.; Gundersen, C. B.; Faull, K. F., Electrospray-ionization mass spectrometry of intact intrinsic membrane proteins. Protein Sci 1998, 7, (6), 1423-1430.

155. Capriotti, A. L.; Cavaliere, C.; Foglia, P.; Samperi, R.; Lagana, A., Intact protein separation by chromatographic and/or electrophoretic techniques for top-down proteomics. J Chromatogr A 2011, 1218, (49), 8760-8776.

156. Wu, Q.; Yuan, H.; Zhang, L.; Zhang, Y., Recent advances on multidimensional liquid chromatography-mass spectrometry for proteomics: from qualitative to quantitative analysis--a review. Anal Chim Acta 2012, 731, 1-10.

157. Leduc, R. D.; Kelleher, N. L., Using ProSight PTM and related tools for targeted protein identification and characterization with high mass accuracy tandem MS data. Curr Protoc 2007, 19, 13.6.1-13.6.28.

158. Perkel, J. M., Tearing the top off 'Top-Down' Proteomics. Biotechniques 2012, 53, (2), 75-78.

159. Cobb, J. S.; Easterling, M. L.; Agar, J. N., Structural characterization of intact proteins is enhanced by prevalent fragmentation pathways rarely observed for peptides. J Am Soc Mass Spectrom 2010, 21, (6), 949-959.

160. Liu, X.; Sirotkin, Y.; Shen, Y.; Anderson, G.; Tsai, Y. S.; Ting, Y. S.; Goodlett, D. R.; Smith, R. D.; Bafna, V.; Pevzner, P. A., Protein identification using top-down. Mol Cell Proteomics 2012, 11, (6), M111.008524.

161. Liu, X.; Inbar, Y.; Dorrestein, P. C.; Wynne, C.; Edwards, N.; Souda, P.; Whitelegge, J. P.; Bafna, V.; Pevzner, P. A., Deconvolution and database search of complex tandem mass spectra of intact proteins: a combinatorial approach. Mol Cell Proteomics 2010, 9, (12), 2772-2782.

162. Tsai, Y. S.; Scherl, A.; Shaw, J. L.; MacKay, C. L.; Shaffer, S. A.; Langridge-Smith, P. R.; Goodlett, D. R., Precursor ion independent algorithm for top-down shotgun proteomics. J Am Soc Mass Spectrom 2009, 20, (11), 2154-2166.

163. Zubarev, R. A.; Horn, D. M.; Fridriksson, E. K.; Kelleher, N. L.; Kruger, N. A.; Lewis, M. A.; Carpenter, B. K.; McLafferty, F. W., Electron capture dissociation for structural characterization of multiply charged protein cations. Anal Chem 2000, 72, (3), 563-573.

164. Rozman, M.; Gaskell, S. J., Charge state dependent top-down characterisation using electron transfer dissociation. Rapid Commun Mass Spectrom 2012, 26, (3), 282-286.

165. Kaddis, C. S.; Loo, J. A., Native protein MS and ion mobility: large flying proteins with ESI. Anal Chem 2007, 79, (5), 1778-1784.

166. Iavarone, A. T.; Jurchen, J. C.; Williams, E. R., Supercharged protein and peptide ions formed by electrospray ionization. Anal Chem 2001, 73, (7), 1455-1460.

167. Iavarone, A. T.; Williams, E. R., Mechanism of charging and supercharging molecules in electrospray ionization. J Am Chem Soc 2003, 125, (8), 2319-2327.

168. Lomeli, S. H.; Yin, S.; Ogorzalek Loo, R. R.; Loo, J. A., Increasing charge while preserving noncovalent protein complexes for ESI-MS. J Am Soc Mass Spectrom 2009, 20, (4), 593-596.

169. Lomeli, S. H.; Peng, I. X.; Yin, S.; Loo, R. R.; Loo, J. A., New reagents for increasing ESI multiple charging of proteins and protein complexes. J Am Soc Mass Spectrom 2010, 21, (1), 127-131.

170. Douglass, K. A.; Venter, A. R., Investigating the role of adducts in protein supercharging with sulfolane. J Am Soc Mass Spectrom 2012, 23, (3), 489-497.

171. Sterling, H. J.; Daly, M. P.; Feld, G. K.; Thoren, K. L.; Kintzer, A. F.; Krantz, B. A.; Williams, E. R., Effects of supercharging reagents on noncovalent complex structure in electrospray ionization from aqueous solutions. J Am Soc Mass Spectrom 2010, 21, (10), 1762-1774.

172. Hogan, C. J., Jr.; Ogorzalek Loo, R. R.; Loo, J. A.; de la Mora, J. F., Ion mobility-mass spectrometry of phosphorylase B ions generated with supercharging reagents but in charge-reducing buffer. Phys Chem Chem Phys 2010, 12, (41), 13476-13483.

173. Sterling, H. J.; Prell, J. S.; Cassou, C. A.; Williams, E. R., Protein conformation and supercharging with DMSO from aqueous solution. J Am Soc Mass Spectrom 2011, 22, (7), 1178-1186.

174. Valeja, S. G.; Tipton, J. D.; Emmett, M. R.; Marshall, A. G., New reagents for enhanced liquid chromatographic separation and charging of intact protein ions for electrospray ionization mass spectrometry. Anal Chem 2010, 82, (17), 7515-7519.

175. Sterling, H. J.; Cassou, C. A.; Trnka, M. J.; Burlingame, A. L.; Krantz, B. A.; Williams, E. R., The role of conformational flexibility on protein supercharging in native electrospray ionization. Phys Chem Chem Phys 2011, 13, (41), 18288-18296.

176. Lakshmanan, R.; Loo, R. R.; Loo, J. A., Reagents for Enhancing Protein Charging for Electrospray Ionization Mass Spectrometry from Organic/Aqueous Solvent Systems. (Manuscript under preparation).

177. Loo, J. A.; Edmonds, C. G.; Udseth, H. R.; Smith, R. D., Effect of reducing disulfide-containing proteins on electrospray ionization mass spectra. Anal Chem 1990, 62, (7), 693-698.

178. Kjeldsen, F.; Giessing, A. M.; Ingrell, C. R.; Jensen, O. N., Peptide sequencing and characterization of post-translational modifications by enhanced ion-charging and liquid chromatography electron-transfer dissociation tandem mass spectrometry. Anal Chem 2007, 79, (24), 9243-9252.

179. Meyer, J. G.; Komives, E. A., Charge state coalescence during electrospray ionization improves peptide identification by tandem mass spectrometry. J Am Soc Mass Spectrom 2012, 23, (8), 1390-1399.

180. Loo, R. R.; Dales, N.; Andrews, P. C., Surfactant effects on protein structure examined by electrospray ionization mass spectrometry. Protein Sci 1994, 3, (11), 1975-1983.

181. Sterling, H. J.; Williams, E. R., Real-Time Hydrogen/Deuterium Exchange Kinetics via Supercharged Electrospray Ionization Tandem Mass Spectrometry. Anal Chem 2010, 82, (21), 9050-9057.

182. Madsen, J. A.; Brodbelt, J. S., Comparison of infrared multiphoton dissociation and collision-induced dissociation of supercharged peptides in ion traps. J Am Soc Mass Spectrom 2009, 20, (3), 349-358.

183. Yin, S.; Loo, J. A., Top-Down Mass Spectrometry of Supercharged Native Protein-Ligand Complexes. Int J Mass Spectrom 2010, 300, (2-3), 118-122.

184. Miladinovic, S. M.; Fornelli, L.; Lu, Y.; Piech, K. M. S.; Girault, H. H.; Tsybin, Y. O., In-spray supercharging of peptides and proteins in electrospray ionization mass spectrometry. Anal Chem 2012, 84, (11), 4647-4651.

185. Liu, Y.; Miao, Z.; Lakshmanan, R.; Loo, R. R.; Loo, J. A.; Chen, H., Signal and charge enhancement for protein analysis by liquid chromatography–mass spectrometry with desorption electrospray ionization. Int J Mass Spectrom 2012, 325-327, 161-166.

186. Wang, N. H.; Lee, W. L.; Her, G. R., Signal enhancement for peptide analysis in liquid chromatography-electrospray ionization mass spectrometry with trifluoroacetic acid containing mobile phase by postcolumn electrophoretic mobility control. Anal Chem 2011, 83, (16), 6163-6168.

187. Nikolaev, E. N.; Boldin, I. A.; Jertz, R.; Baykut, G., Initial experimental characterization of a new ultra-high resolution FTICR cell with dynamic harmonization. J Am Soc Mass Spectrom 2011, 22, (7), 1125-1133.

188. Qi, Y.; Witt, M.; Jertz, R.; Baykut, G.; Barrow, M. P.; Nikolaev, E. N.; O'Connor, P. B., Absorption-mode spectra on the dynamically harmonized Fourier transform ion cyclotron resonance cell. Rapid Commun Mass Spectrom 2012, 26, (17), 2021-2026.

189. Rose, R. J.; Damoc, E.; Denisov, E.; Makarov, A.; Heck, A. J., High-sensitivity Orbitrap mass analysis of intact macromolecular assemblies. Nat Methods 2012, 9, 1084-1086.

190. Lanucara, F.; Eyers, C. E., Top-down mass spectrometry for the analysis of combinatorial post-translational modifications. Mass Spectrom Rev 2012, doi: 10.1002/mas.21348.

191. Zhou, H.; Ning, Z.; Starr, A. E.; Abu-Farha, M.; Figeys, D., Advancements in top-down proteomics. Anal Chem 2011, 84, (2), 720-734.

192. Wu, S. L.; Kim, J.; Bandle, R. W.; Liotta, L.; Petricoin, E.; Karger, B. L., Dynamic profiling of the post-translational modifications and interaction partners of epidermal growth factor receptor signaling after stimulation by epidermal growth factor using Extended Range Proteomic Analysis (ERPA). Mol Cell Proteomics 2006, 5, (9), 1610-1627.

193. Wu, C.; Tran, J. C.; Zamdborg, L.; Durbin, K. R.; Li, M.; Ahlf, D. R.; Early, B. P.; Thomas, P. M.; Sweedler, J. V.; Kelleher, N. L., A protease for 'middle-down' proteomics. Nat Methods 2012, 9, (8), 822-824.

CHAPTER 2

Top-Down Protein Identification using a Single-Stage LC-ESI-TOF Mass Spectrometer

2.1. Abstract

Direct dissociation of gas phase intact proteins has the potential for identifying proteins from complex mixtures and elucidating post-translational modifications. Fragmentation of proteins in the nozzle-skimmer (NS) region of the mass spectrometer is an effective dissociation technique that can be utilized for top-down analysis. We demonstrate the capability to perform intact protein identifications in a single-stage time-of-flight (TOF) mass spectrometer by rapidly switching the voltage in the NS region during protein elution from a liquid chromatography (LC) column. The intact protein mass and the fragment ion masses, obtained at low and high NS voltages respectively, were both measured using a stand-alone TOF mass analyzer. With the higher resolution TOF mass analyzer, sequence determination from multiply charged products is facilitated. By coupling on-line protein separations to dissociation in the atmospheric pressure/vacuum interface region of the mass spectrometer, we have identified the proteins in a standard mixture. Furthermore, this technique unambiguously identified all the proteins in the human 20S proteasome complex and characterized their N-terminal modifications. This data-independent pseudo-MS/MS top-down method is attractive for high throughput top-down protein identification since the relatively inexpensive TOF instrument can provide highly accurate intact mass measurements and generate sequence tags that are typically 6-7 amino acids long.

2.2. Introduction

Mass spectrometry (MS) based proteomics is one of the powerful tools available to analyze the comprehensive set of proteins in simple to complex sample mixtures. Two major routes that are commonly used in MS based analysis are top-down and bottom-up methodologies. While the proteins are analyzed intact in the top-down methods, the bottom-up methods utilize proteases that cleave the protein at specific sites to generate smaller fragments that are then searched against the database to identify the protein.

Though the choice of method depends on the application and each method has its own advantages and disadvantages,1 top-down methods provide the intact mass of the molecule being studied and indicate the presence of any isoforms and/or proteoforms.2 Bottom-up methods, on the other hand, are based on analyzing the proteotypic peptides (peptides that represent the protein being identified) and generally do not yield this information.3 Given the increasing numbers of isoforms/proteoforms being identified in many biological processes and pathologies, and their utilization as biomarkers/therapy targets, it is important to use top-down methods in order to identify them.4

Until recently, top-down methods have mostly been used for ‘characterization’ studies of relatively pure protein samples,5 which typically yield more data and high sequence coverage. With the major improvements in intact protein separation methods6 and mass spectrometry instrumentation,7 the top-down technique is starting to be utilized for ‘protein identification’, which serves to find the intact proteins present in the sample, although the sequence coverage is low. The Kelleher group has pioneered high throughput intact protein identification by combining on-line protein separation with mass spectrometry, and they identified 22 intact proteins ranging from 14 to 35 kDa from yeast cell lysates in a single liquid chromatography (LC)-tandem mass spectrometry run.8 In their study, collisionally activated dissociation (CAD) was used to fragment the proteins eluting from an LC column, and the intact and fragment masses obtained were searched against the database to identify them.

Though many on-line protein identification methods use CAD because its data averaging time is short enough to fit within the elution of a protein from the column,9 problems arise during precursor selection for CAD. During electrospray ionization (ESI), proteins are multiply charged and the signal is split among the multiple charge states generated. Often the three most abundant charge states are chosen by the automated charge state selection algorithm for CAD, since there is no rigorous rule for choosing the precursor ion. But this strategy might not always be effective, since different charge states of the same protein have been shown to yield different amounts of sequence coverage.10, 11 Also, the instrument time spent on this charge state selection based CAD lowers the duty cycle of the instrument. Moreover, the collision energy applied for dissociation has to be optimized for each precursor charge state to obtain good sequence coverage. Owing to these problems, CAD is often limited to low MW proteins during on-line protein identification, and so one solution is to choose a dissociation technique that does not require any precursor selection prior to fragmentation. One such technique that is beginning to be employed in adapting top-down methods to high throughput usage is nozzle-skimmer dissociation (NSD).

NSD was first reported by Loo et al. in 1988,12 when they found that a bias in the nozzle voltage (in the source region of the instrument) dramatically changed the charge distribution of multiply charged proteins. Later this effect was found to be advantageous when it turned out that primary structure of proteins could be obtained by altering this voltage.13 Since then many research groups have used this technique for characterizing intact proteins by direct infusion.

However, it has not been used for proteins eluting from a LC column. The fragmentation obtained by NSD is comparable to CAD and the dissociation in the source region could also be used as an extra level of collision to perform MS/MS/MS in triple quadrupole instruments.14, 15

In 2009, Vellaichamy et al. employed NSD in a hybrid linear ion trap (LIT) – Fourier transform (FT) [also known as LTQ-FT] instrument for on-line top-down identification of proteins in the yeast cell lysates.16 In their method, when intact proteins elute from a polymeric column, the voltage in the nozzle-skimmer (NS) region was switched to aid the detection of intact mass using the ion trap at low voltage and fragment mass using the FT analyzer at high voltage.

Since then instruments with dual mass analyzers such as LTQ-FTs17 and orbitraps18-20 have been used for on-line top-down protein identification studies, where typically two mass analyzers are utilized to collect the intact and fragment ion data, and in these experiments, CAD in the ion trap or NSD in the source region was used to fragment the protein.

Though the mass accuracy and the resolution obtained during LC-MS/MS runs in LTQ-FT and orbitrap instruments are extremely high, these instruments are expensive and therefore less affordable. Apart from these two platforms, top-down protein identification and characterization by infusing almost pure protein samples have been demonstrated in triple quadrupole,12 quadrupole ion trap,21 quadrupole time-of-flight (QTOF),22 and quadrupole FT-ICR23 instruments, which have tandem mass analyzers for selecting the precursor ion during dissociation, and also in single quadrupole instruments using NSD. TOF instruments have only one mass analyzer, so tandem MS experiments that isolate intact protein charge states cannot be performed. However, the resolution and mass accuracy of this instrument are good enough to warrant investigation for top-down studies, and combining it with a data-independent dissociation method such as NSD enables high throughput protein identification. When combined with a front-end separation technique such as LC, it makes a cost-effective platform to analyze intact protein samples of medium complexity and minimizes the need for tandem MS instruments to obtain some basic structural information for protein identification purposes. So far, the utility of ESI-TOF has been described by only one report, in which the authors characterized light and heavy chain subunits in antibodies using NSD in a TOF instrument.24 More detailed studies on the applicability of the ESI-NSD-TOF platform to protein mixtures of higher complexity remain to be carried out. To transform it into a high throughput method, switching the NS voltage during the LC-MS run can be implemented.

We demonstrate that when intact proteins separated by LC elute from the column, they can be subjected to NSD in a single-stage TOF mass spectrometer by switching the NS voltage from low to high every second, and both the intact and fragment ion masses, obtained at low and high voltages respectively, can be measured using the TOF analyzer. This approach is similar to the label-free quantification strategy commonly known as ‘MSE’ in QTOF instruments for quantifying absolute amounts of peptides in shotgun proteomics experiments.25 MSE is performed without any precursor ion selection, and the energy in the collision cell continuously alternates between a low-energy mode for peptide precursor identification and an elevated-energy mode for generation of peptide fragmentation data. By applying the MSE approach in the source region during on-line protein elution, we have shown that all the intact proteins in a standard protein cocktail can be identified. In addition, we have used the human 20S (h20S) proteasome as a model to demonstrate that the proteins in a complex mixture can be identified using this top-down approach (Figure 2.1). This is a proof-of-principle study to demonstrate that mass spectrometers with single mass analyzers can be used for top-down studies of moderately complex mixtures.

2.3. Experimental section

2.3.1. Sample preparation

All protein standards were purchased from Sigma-Aldrich (St. Louis, MO) unless otherwise stated. Albumins from bovine, porcine, rabbit, and sheep sources were prepared at 1 mg/mL concentration in MS grade water and diluted to 1.5 µM with 0.1% formic acid (FA) before infusion. 10 µL of the sample was injected into the Agilent dual ESI source.

Standard proteins (bradykinin, neurotensin, ubiquitin, ribonuclease A, transferrin, BSA, and carbonic anhydrase) were made at 1 mg/mL concentration in MS grade water. An equimolar mixture of these 7 proteins (10 pmoles each) was prepared by diluting in 0.1% FA.

The h20S proteasome complex was purchased from Boston Biochem (Cambridge, MA). 10 pmol of h20S sample was denatured with 0.5% FA and loaded on the column. The LC and MS parameters were the same for the analyses of standard proteins and h20S sample.

2.3.2. LC conditions

5 µL of the standard protein mix was injected into the Agilent C3 column (pore size 300 Å) with dimensions 2.1 mm × 150 mm. LC-MS was performed with an Agilent 1200 LC system connected on-line with an Agilent 6220 ESI-TOF mass spectrometer (Santa Clara, CA). A resolving power of 10,000 at an m/z of 1000 is routinely achieved in this instrument and the mass window can go as high as 3200 m/z in the extended dynamic range mode. Solvent A was 0.1% FA and solvent B was 90% acetonitrile (ACN) / 10% H2O containing 0.1% FA. The flow rate was 0.2 mL/min and the following gradient was used: 5% B for 3 min, 5%-25% B in 1 min, 25%-55% B in 30 min, 55%-80% B in 2 min, and held at 80% B for 9 min. The column was finally equilibrated with 5% B for 10 min prior to the next run. The column temperature was maintained at 60 °C throughout the run.
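The gradient program can be expressed as a piecewise-linear function of time, which is convenient for estimating the solvent composition at which a given peak elutes. The following Python sketch is illustrative only: the breakpoints are transcribed from the gradient described above, while the function names, the assumption of strictly linear segments, and the neglect of system dwell volume are assumptions made here and are not part of the original method.

```python
from bisect import bisect_right

# (time in min, %B) breakpoints transcribed from the gradient in this section.
GRADIENT = [(0.0, 5.0), (3.0, 5.0), (4.0, 25.0),
            (34.0, 55.0), (36.0, 80.0), (45.0, 80.0)]


def percent_b(t_min):
    """Programmed %B at time t_min, assuming linear ramps between breakpoints."""
    times = [t for t, _ in GRADIENT]
    if t_min <= times[0]:
        return GRADIENT[0][1]
    if t_min >= times[-1]:
        return GRADIENT[-1][1]
    i = bisect_right(times, t_min)          # index of the next breakpoint
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_acn(t_min):
    """Nominal %ACN by volume, given that solvent B is 90% ACN."""
    return 0.9 * percent_b(t_min)


if __name__ == "__main__":
    for t in (20.0, 32.0):
        print(f"{t:4.1f} min: {percent_b(t):.1f}% B, {percent_acn(t):.1f}% ACN")
```

Under these assumptions, a retention window of 20 to 32 min corresponds to a programmed composition of 41% to 53% B.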

2.3.3. MS parameters

The source parameters in the MS were as follows: ESI capillary voltage 4300 V, drying gas temperature 350 °C, drying gas flow rate 10 L/min, nebulizer pressure 25 psig, and fragmentor voltage 250 V. To induce NSD during the LC run, the fragmentor voltage was alternated between 250 V and 350 V every second during the elution window. The data were acquired at a rate of 1 spectrum/second and the acquisition window was set from 100 to 3000 m/z.
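Because the voltage toggles once per second and one spectrum is acquired per second, consecutive spectra alternate between the intact (250 V) and fragment (350 V) channels. The sketch below shows one way such an acquisition could be split into the two channels after the run. It assumes that the first spectrum in the elution window is acquired at the low voltage and that spectra are held in a simple Python list; both are assumptions for illustration, and the function names are hypothetical rather than part of any vendor software.

```python
LOW_V, HIGH_V = 250, 350  # fragmentor voltages used in this section


def fragmentor_voltage(scan_index, first=LOW_V):
    """Voltage assumed for a 0-based scan index when the voltage toggles every scan."""
    other = HIGH_V if first == LOW_V else LOW_V
    return first if scan_index % 2 == 0 else other


def split_scans(scans, first=LOW_V):
    """Split consecutive scans into (intact, fragment) lists by assumed voltage."""
    intact, fragment = [], []
    for i, scan in enumerate(scans):
        if fragmentor_voltage(i, first) == LOW_V:
            intact.append(scan)
        else:
            fragment.append(scan)
    return intact, fragment


if __name__ == "__main__":
    # A 12 s wide chromatographic peak yields six spectra in each channel.
    intact, fragment = split_scans(list(range(12)))
    print(len(intact), len(fragment))
```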

2.3.4. Data analysis

The peaks in the total ion chromatogram (TIC) were manually integrated and the mass spectra at different fragmentor voltages were obtained using the Agilent MassHunter Qualitative Analysis Program. The multiple charge state distributions of intact proteins were deconvoluted using the maximum entropy deconvolution algorithm. The fragment ion spectra were manually sequenced to find sequence tags generated from the protein.
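Sequencing was performed manually in this work. As an illustration of the underlying idea only, the Python sketch below reads a sequence tag from the mass differences between consecutive fragment ions of one series. It assumes the ladder has already been converted to neutral masses in ascending order, uses standard monoisotopic residue masses, and applies a hypothetical tolerance of 0.02 Da; the function names are introduced here for illustration. Leucine and isoleucine have identical masses and are both reported as "L".

```python
# Standard monoisotopic residue masses in Da ("L" stands for Leu or Ile).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_tag(ladder, tol_da=0.02):
    """Read a tag from ascending neutral masses of consecutive fragment ions.

    Each mass difference is matched to a residue mass; ambiguous or
    unmatched differences are reported as '?'.
    """
    tag = []
    for lower, upper in zip(ladder, ladder[1:]):
        delta = upper - lower
        hits = [aa for aa, mass in RESIDUE_MASS.items()
                if abs(delta - mass) <= tol_da]
        tag.append(hits[0] if len(hits) == 1 else "?")
    return "".join(tag)


def build_ladder(start_mass, residues):
    """Hypothetical helper: synthesize a mass ladder for exercising read_tag."""
    ladder = [start_mass]
    for aa in residues:
        ladder.append(ladder[-1] + RESIDUE_MASS[aa])
    return ladder


if __name__ == "__main__":
    # Synthetic ladder; the starting mass of 200.0 is arbitrary.
    print(read_tag(build_ladder(200.0, "TKAVCVL")))
```

With a tolerance this tight, a difference that falls between the lysine and glutamine residue masses matches both and is flagged as ambiguous rather than assigned.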

2.4. Results and discussion

2.4.1. Optimization of fragmentor voltage

NSD in the Agilent ESI-TOF instrument is the fragmentation that occurs when the ions are transported from the back end of the capillary into the skimmer. Generally, small molecules and peptides require low fragmentor voltages (175 V) and proteins need higher voltages (250 V) for intact mass analysis. First, we optimized the fragmentor voltage for NSD by comparing the efficiency of fragmentation at voltages ranging from 250 to 375 V using a BSA sample (Figure 2.2). We found that the signal intensity and the number of fragment ions observed increase as the voltage is raised up to 350 V, and that any further increase in the voltage does not generate new fragment ions. Therefore, we decided to use 250 V for intact protein analysis and 350 V for fragmentation.

As a next step, we evaluated the optimized voltage (350 V) by comparing the fragmentation efficiencies of albumin from different species. Sequence alignment of the N-terminal 40 amino acid residues of albumins from bovine, sheep, porcine, and rabbit shows that 26 out of 40 residues are conserved among all four species (Figure 2.3). The mass spectra obtained at 350 V from these different albumin species show differences in the fragment ions obtained (Figure 2.4).

Further mapping of these fragments generates sequence tags that are specific for each albumin species (Figure 2.5). Loo et al. have extensively analyzed albumins from a variety of species using NSD and CAD in a triple quadrupole instrument and reported sequence-specific differences for each species.13 We observed many of the features they reported, such as abundant b-series ions (at 2+ to 6+ charge states) originating from the N-terminal region (amino acid positions 20-25) of all albumin species and the occurrence of y-series ions only for porcine albumin. Moreover, Loo et al. had noticed an error in the published rabbit albumin sequence at position 20, and the fragment ions they observed suggested the presence of isoleucine instead of lysine.13 The b19³⁺ and b20³⁺ ions that we observed in our data for rabbit albumin confirm their finding, and the sequence conflict is now resolved in the database.

The mass accuracy of the fragment ions is sufficient to distinguish lysine (‘K’, incremental mass = 128.094) from glutamine (‘Q’, incremental mass = 128.058). This is illustrated in Table 2.1 for sheep albumin, which has a ‘Q’ at position 20 from the N-terminus. The experimental m/z of the fragment ions obtained is compared with the m/z calculated when ‘Q’ is present and when ‘Q’ is replaced by ‘K’. To distinguish K from Q in fragments with a mass of about 2400 Da (the approximate mass of the fragments that originate from residue 20), the mass error should be less than 16 ppm. The actual errors are within 5-6 ppm, which suggests that K and Q can be reliably distinguished in this instrument. We found that source-induced fragmentation is sufficiently efficient to obtain sequence information from the N-terminal 40 residues of all albumins and can distinguish sequence variations among albumins from different species.
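The ppm threshold follows directly from the residue masses quoted in the text. The short Python sketch below reproduces the arithmetic; the function names are hypothetical, and the fragment mass of 2400 Da is the approximate value given above. Since the K/Q mass difference and the fragment m/z both scale with 1/z, the relative separation is approximately independent of charge state.

```python
K_RESIDUE = 128.094  # lysine incremental mass as quoted in the text, Da
Q_RESIDUE = 128.058  # glutamine incremental mass as quoted in the text, Da


def kq_separation_ppm(fragment_mass):
    """Relative mass difference (ppm) between K- and Q-containing fragments."""
    return (K_RESIDUE - Q_RESIDUE) / fragment_mass * 1e6


def can_distinguish(error_ppm, fragment_mass):
    """True if the measurement error is smaller than the K/Q separation."""
    return abs(error_ppm) < kq_separation_ppm(fragment_mass)


if __name__ == "__main__":
    sep = kq_separation_ppm(2400.0)
    print(f"K/Q separation at 2400 Da: {sep:.1f} ppm")
    print("6 ppm error distinguishes K from Q:", can_distinguish(6.0, 2400.0))
```

With these rounded masses the separation comes out at about 15 ppm, in line with the threshold of roughly 16 ppm stated above, and an observed error of 5-6 ppm is comfortably below it.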

2.4.2. LC-MS of standard protein mix

NSD can be used as a protein identification technique to find the proteins in samples of medium complexity. The upper limit of the number of proteins that can be identified depends on the up-front separation technique and the duty cycle of the mass spectrometer. To test this possibility, we separated a mix of standard proteins on the C3 column, and as each protein eluted, the fragmentor voltage was alternated from low to high every second. The intact mass spectrum obtained at 250 V and the fragmentation spectrum generated at 350 V were analyzed to identify the proteins, and each peak in the TIC was mapped to the corresponding protein eluting in it (Figure 2.6). For small peptides like bradykinin, 100% sequence coverage was obtained (Figure 2.7A). For high MW proteins such as BSA and transferrin, 7-8 amino acid sequence tags, mostly from the ends of the protein, were identified (Figure 2.7B and C). When the sequence coverage of BSA obtained during LC-MS is compared with that from the direct infusion experiment, the efficiency of fragmentation is similar, which demonstrates that there is no loss in fragmentation efficiency at the comparatively higher flow rates (100 fold higher) used during the LC run.

The bovine carbonic anhydrase 2 sample contained bovine superoxide dismutase C (SOD C) as a contaminant, which we identified using both the intact mass and the masses of fragment ions obtained by NSD (Figure 2.8). This protein is interesting because it has three cysteine residues, one of which is in the reduced state (Cys6) while the remaining two are disulfide bonded (Cys56 ↔ Cys145). The experimental average mass of 15591.34 matches, with a deviation of 3 ppm, the theoretical mass of 15591.39 calculated with two cysteines disulfide bonded and one free cysteine. Furthermore, the sequence tag 2TKAVCVL8 was obtained from singly charged fragment ions from the amino-terminus of SOD C, and the amino acid incremental mass observed at position 6 confirms the presence of a free cysteine at this position. Since sequence information within disulfide bonded regions is seldom obtained, the dissociation of backbone amide bonds at position 6 supports the assignment of a free cysteine. Although the nature of the cysteine residues has already been reported in the UniProt database, the intact and fragment ion data from SOD C show that the state of cysteines can be determined by this method.
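The intact mass comparison can be checked numerically. In the Python sketch below, the observed and theoretical masses are the values quoted above; the alternative mass for a fully reduced protein is derived here by adding two hydrogen atoms (average atomic mass 1.00794 Da each) and is an illustrative calculation, not a value reported in this work. The function names are hypothetical.

```python
H_AVG = 1.00794  # average atomic mass of hydrogen, Da


def ppm_error(observed, theoretical):
    """Signed mass error of an observed mass relative to a theoretical mass."""
    return (observed - theoretical) / theoretical * 1e6


def mass_without_disulfide(oxidized_mass, n_disulfides=1):
    """Mass after reducing n disulfide bonds (each reduction adds two H)."""
    return oxidized_mass + 2 * H_AVG * n_disulfides


if __name__ == "__main__":
    observed = 15591.34        # experimental average mass from the text
    one_disulfide = 15591.39   # theoretical mass with one disulfide bond
    fully_reduced = mass_without_disulfide(one_disulfide)
    print(f"vs one disulfide: {ppm_error(observed, one_disulfide):.1f} ppm")
    print(f"vs fully reduced: {ppm_error(observed, fully_reduced):.1f} ppm")
```

The deviation from the one-disulfide form is about 3 ppm in magnitude, whereas the fully reduced form would be off by more than 100 ppm, so the intact mass alone already favors a single disulfide bond.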

2.4.3. LC-MS of h20S proteasome sample

The top-down method that we developed was applied to a physiological protein complex such as the human proteasome complex with the purpose of protein identification in the LC timescale. The goal was to measure the intact mass of the proteins followed by fragmentation to determine short stretches of amino acid sequence for protein identification. In addition, the observed and theoretical intact masses were compared to identify post-translational modifications (PTMs) if present.

The 26S proteasome complex is a supra-macromolecular assembly of different polypeptides that function to degrade a majority of intracellular proteins.26 The two main components of the proteasome complex are the core particle or the 20S proteasome complex where the proteolysis occurs and the 19S regulatory particle which controls protein entry and has ATPase subunits that power the protein degradation. These two components function together to perform nonspecific protein degradation by allowing the entry of the substrate protein, proteolytic cleavage of the protein at specific sites followed by exit of the short peptide chains which are then cleaved into constituent amino acids and finally salvaged into various biosynthetic pathways. This mechanism along with ubiquitin as the intra-cellular molecular tag to label proteins for degradation is known as the ‘Ubiquitin-Proteasomal System’ of protein degradation.27

The proteins that constitute the 20S complex form a hollow cylinder shaped structure that can be composed of as few as two distinct subunits, as in prokaryotes, or as many as 14 different subunits, as in eukaryotes.28 In humans, the 20S proteasome complex is made up of four stacked rings, and each ring is a heptamer of 7 different α subunits or 7 different β subunits. The α subunits form the two outer rings and the β subunits form the two inner rings such that the whole 20S proteasome is arranged as an α7β7β7α7 barrel.

The first detailed study on proteomic characterization of the human 26S proteasome complex was done by Wang et al. using bottom-up proteomic approaches.29 Though they listed the subunits in the 19S and 20S particles along with the N-terminal modifications on all subunits and post-translational modifications on a few of them, the intact masses of these subunits could not be obtained from this experimental approach. Later, in 2008, Uttenweiler-Joseph et al. used a combination of proteomic approaches and measured the intact mass of all the subunits in the h20S complex by eluting proteins intact from native two-dimensional gels, followed by infusing the subunits and analysis by high resolution FT-ICR MS.30 However, this method is time consuming and requires that intact proteins be eluted from the gel in amounts sufficient for analysis by top-down methods.

We tried our top-down approach on the proteasome sample by separating the proteins based on their retention on a C3 column and dissociating them in the NS region by applying high voltage. All the proteins eluted between 20 and 32 min as shown in the TIC (Figure 2.9). Alternating the NS voltage during this elution window generated both intact and fragment mass spectra for all the proteins eluting within this timeframe. With the intact mass and the sequence tags generated from fragment mass spectra, the proteins in the h20S complex were identified, and one such representative identification is shown for the β6 subunit (Figure 2.10). The list of all the proteins identified, along with the sequence tag that was obtained by NSD, is shown in Table 2.2.

We found that 6 proteins in the h20S complex elute singly and the remaining 8 proteins co-elute in pairs (Figure 2.11). Overlapping charge state distributions in the intact mass spectra can be found for proteins that co-elute (Figure 2.12). By comparing the number of sequence tags obtained for each protein, we can see that the proteins that eluted singly fragmented better than the co-eluting proteins. Except for the co-eluting proteins and the β2 subunit, all the other proteins were identified by sequence tags of 6 to 7 amino acids (Table 2.2). The resolution (around 10,000 for an ion at 1000 m/z) available in the TOF instrument is sufficient to assign the multiply charged fragment ions (up to 6+) during sequencing. The mass accuracy, within a deviation of 10 ppm on average for intact and 7 ppm for fragment ions, was sufficient to identify the proteins.

2.4.4. Characterization of the N-terminus and PTMs in h20S proteasome subunits

The fragment ion data acquired at high NS voltage showed the presence of fragment ions that indicate the nature of the N-terminus for all the subunits in the complex. Out of the 14 proteins, 9 were found to be acetylated at the N-terminus. The remaining 5 subunits, which were found to be unmodified, belonged to the β heptameric rings. For the co-eluting proteins, the N-terminal modification was assigned by comparing the experimental intact mass with the mass calculated for the N-terminal modification listed in the UniProt database, which was identified from top-down and bottom-up studies of the h20S complex.

Even though many of the subunits in the 20S proteasomal core have been reported to be phosphorylated,31 phosphorylation was seldom detected here, since the phosphorylated subunits were not enriched and phosphorylated species may ionize much less efficiently than their non-phosphorylated equivalents in positive ESI. Only the α3 subunit was found to be phosphorylated; this was identified by the presence of the unmodified and singly phosphorylated forms of this subunit, differing by 80 Da (the mass added by one phosphate group, HPO3), in the deconvoluted mass spectrum (Figure 2.13). The signal from the phosphorylated form was more intense than that of the non-phosphorylated counterpart, as was also visualized by two-dimensional gel electrophoresis by the Monsarrat group.32 However, the site of modification was not identified, since this protein co-elutes with the β2 subunit and the fragmentation efficiency is usually low during co-elution.
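A mass shift of this kind can be searched for systematically in a list of deconvoluted masses. The Python sketch below is a minimal illustration, not the procedure used in this work: it assumes average masses, takes the HPO3 increment as 79.98 Da, and uses a hypothetical tolerance of 0.5 Da. The masses in the usage example are invented placeholders and are not measured values for any proteasome subunit.

```python
HPO3_AVG = 79.98  # average mass added by one phosphorylation (HPO3), Da


def phospho_pairs(masses, tol_da=0.5):
    """Return (unmodified, phosphorylated) mass pairs differing by ~HPO3."""
    ordered = sorted(masses)
    pairs = []
    for i, lighter in enumerate(ordered):
        for heavier in ordered[i + 1:]:
            if abs((heavier - lighter) - HPO3_AVG) <= tol_da:
                pairs.append((lighter, heavier))
    return pairs


if __name__ == "__main__":
    # Placeholder deconvoluted masses, for illustration only.
    example = [25000.0, 28379.98, 28300.0]
    print(phospho_pairs(example))
```

A pair found this way only suggests a phosphorylated form; confirming the modification and its site would still require fragment ion evidence.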

Many research groups have infused targeted proteins, performed NS dissociation in single mass analyzer instruments, and employed it as a pseudo tandem MS method. When combined with an up-front protein separation technique such as LC, however, it becomes an inexpensive solution for identifying proteins in complex samples. The method was successful in that it identified all the proteins in the complex and characterized the N-terminal modifications. With the resolving power available in this instrument, fragment ions with up to 6+ charges were unambiguously assigned. While this is sufficient for identifying non-homologous proteins, for proteins that have a relatively high degree of sequence similarity at one or both of the N- and C-termini, higher resolution instruments such as FT-ICR MS or orbitrap are necessary. With higher resolution, the higher m/z fragments originating from cleavages in the middle of the protein can be identified, and this yields more sequence coverage to identify homologous proteins.

2.5. Conclusions

The top down strategy that we have described above is well-suited for analyzing samples of medium complexity like physiological protein complexes, proteins eluted off one dimensional gel bands, and protein complexes pulled down by co-immunoprecipitation methods.

The factor that limits the number of proteins identified is the separation of proteins by LC, which can be improved by making use of up-front separation techniques like isoelectric focusing by free-flow electrophoresis,33 GELFrEE,34, 35 and multidimensional separation using columns connected end-to-end.36 As shown by the Kelleher group, coupling top-down MS to additional orthogonal separation techniques makes it possible to characterize the proteome from cell lysates.37

In the bottom-up protein identification methods, the identification is based on matching the precursor and fragment ion spectra of the peptides to those in the database. In our approach, both the intact mass of the protein and the masses of the fragment ions generated by dissociation are used for protein identification. This narrows down the possible protein hits. In bottom-up methods, by contrast, when few peptides match an identified protein, the number of proteins it could potentially match increases, and proteins that share the same peptides cannot be distinguished. Another important advantage of this method is that it is rapid: it depends only on the chromatographic time and does not require any precursor ion selection by the acquisition software.
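How the intact mass and the fragment-derived sequence tags jointly narrow the candidate list can be sketched as a two-stage filter. This is a minimal sketch under stated assumptions: the function `identify`, the 30 ppm tolerance, and the candidate dictionary are hypothetical, and the sequence strings are toy placeholders (with `X` for unspecified residues) that contain only the tags listed in Table 2.2, not the real subunit sequences. The masses are the average theoretical and experimental MW values from Table 2.2.

```python
def identify(candidates, observed_mw, tags, ppm_tol=30.0):
    """Keep candidates that match the intact mass and contain every sequence tag.

    candidates maps a name to (theoretical_mw, sequence).
    Returns (name, ppm_error) pairs, best mass match first.
    """
    hits = []
    for name, (theoretical_mw, sequence) in candidates.items():
        ppm = abs(observed_mw - theoretical_mw) / theoretical_mw * 1e6
        if ppm <= ppm_tol and all(tag in sequence for tag in tags):
            hits.append((name, round(ppm, 2)))
    return sorted(hits, key=lambda hit: hit[1])


# Theoretical MW values from Table 2.2; sequences are toy placeholders.
candidates = {
    "beta3": (22859.8, "XXIVHIIEKDKXX"),
    "beta4": (24391.8, "XQNPMVTXX"),
    "beta6": (21903.9, "XTIMAVQFXX"),
}

# Experimental MW and sequence tag reported for the beta6 subunit.
print(identify(candidates, 21903.7, ["TIMAVQF"]))
```

Either criterion alone can leave several candidates in a real database; requiring both is what reduces the list, which is the point made in the paragraph above.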

Data analysis could be improved by automating the identification of proteins in each run. The rate-limiting step in automation is deconvolution of the fragment ion spectra. A typical fragment ion spectrum generated by this method does not contain a precursor ion and cannot be deconvoluted by the available deconvolution algorithms, since they require one. By utilizing deconvolution methods that do not require specifying the precursor ion and by obtaining the monoisotopic masses of the observed fragments, it should be possible to search the databases using top-down search engines such as ProSight38 and BIG Mascot,39 and thereby progress towards whole-proteome applications.
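Once fragment charge states are known, converting m/z values to neutral monoisotopic masses and reading a sequence tag from consecutive mass differences requires no precursor ion. The sketch below illustrates this with the experimental b3+ values for sheep albumin from Table 2.1. It is not the deconvolution algorithm referred to above: the function names and the 0.01 Da tolerance are assumptions, the residue masses are standard monoisotopic values, and leucine and isoleucine are isobaric, so the letter L stands for either.

```python
PROTON = 1.007276  # Da

# Standard monoisotopic residue masses in Da (I omitted: isobaric with L).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def neutral_mass(mz, z):
    """Convert an m/z value with known charge z to a neutral mass."""
    return z * (mz - PROTON)


def read_tag(neutral_masses, tol_da=0.01):
    """Read residues from differences between consecutive fragment masses."""
    ordered = sorted(neutral_masses)
    tag = ""
    for lighter, heavier in zip(ordered, ordered[1:]):
        diff = heavier - lighter
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(diff - m) <= tol_da]
        tag += matches[0] if matches else "?"
    return tag


# Experimental b3+ m/z values for b20 to b24 of sheep albumin (Table 2.1).
b3_mz = [790.3648, 809.3720, 847.0655, 880.0877, 917.7826]
print(read_tag([neutral_mass(mz, 3) for mz in b3_mz]))  # GLVL
```

The tag read this way matches residues 21 to 24 (G, L, V, L) listed in Table 2.1.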


2.6. References

1. Armirotti, A.; Damonte, G., Achievements and perspectives of top-down proteomics. Proteomics 2010, 10, (20), 3566-3576.

2. Casado-Vela, J.; Cebrian, A.; Gomez del Pulgar, M. T.; Sanchez-Lopez, E.; Vilaseca, M.; Menchen, L.; Diema, C.; Selles-Marchart, S.; Martinez-Esteso, M. J.; Yubero, N.; Bru-Martinez, R.; Lacal, J. C., Lights and shadows of proteomic technologies for the study of protein species including isoforms, splicing variants and protein post-translational modifications. Proteomics 2011, 11, (4), 590-603.

3. Cui, W.; Rohrs, H. W.; Gross, M. L., Top-down mass spectrometry: recent developments, applications and perspectives. Analyst 2011, 136, (19), 3854-3864.

4. Stastna, M.; Van Eyk, J. E., Analysis of protein isoforms: can we do it better? Proteomics 2012, 12, (19-20), 2937-2948.

5. McLafferty, F. W.; Breuker, K.; Jin, M.; Han, X.; Infusini, G.; Jiang, H.; Kong, X.; Begley, T. P., Top-down MS, a powerful complement to the high capabilities of proteolysis proteomics. FEBS J 2007, 274, (24), 6256-6268.

6. Capriotti, A. L.; Cavaliere, C.; Foglia, P.; Samperi, R.; Lagana, A., Intact protein separation by chromatographic and/or electrophoretic techniques for top-down proteomics. J Chromatogr A 2011, 1218, (49), 8760-8776.

7. Zhou, H.; Ning, Z.; Starr, A. E.; Abu-Farha, M.; Figeys, D., Advancements in top-down proteomics. Anal Chem 2011, 84, (2), 720-734.

8. Parks, B. A.; Jiang, L.; Thomas, P. M.; Wenger, C. D.; Roth, M. J.; Boyne, M. T., 2nd; Burke, P. V.; Kwast, K. E.; Kelleher, N. L., Top-down proteomics on a chromatographic time scale using linear ion trap fourier transform hybrid mass spectrometers. Anal Chem 2007, 79, (21), 7984-7991.

9. Lee, J. E.; Kellie, J. F.; Tran, J. C.; Tipton, J. D.; Catherman, A. D.; Thomas, H. M.; Ahlf, D. R.; Durbin, K. R.; Vellaichamy, A.; Ntai, I.; Marshall, A. G.; Kelleher, N. L., A robust two-dimensional separation for top-down tandem mass spectrometry of the low-mass proteome. J Am Soc Mass Spectrom 2009, 20, (12), 2183-2191.

10. Scherperel, G.; Reid, G. E., Emerging methods in proteomics: top-down protein characterization by multistage tandem mass spectrometry. Analyst 2007, 132, (6), 500-506.

11. Chanthamontri, C.; Liu, J.; McLuckey, S. A., Charge State Dependent Fragmentation of Gaseous alpha-Synuclein Cations via Ion Trap and Beam-Type Collisional Activation. Int J Mass Spectrom 2009, 283, (1-3), 9-16.

12. Loo, J. A.; Udseth, H. R.; Smith, R. D., Collisional Effects on the Charge Distribution of Ions from Large Molecules, Formed by Electrospray-ionization Mass Spectrometry. Rapid Commun Mass Spectrom 1988, 2, (10), 207-210.


13. Loo, J. A.; Edmonds, C. G.; Smith, R. D., Tandem mass spectrometry of very large molecules: serum albumin sequence information from multiply charged ions formed by electrospray ionization. Anal Chem 1991, 63, (21), 2488-2499.

14. Loo, J. A.; Edmonds, C. G.; Smith, R. D., Primary sequence information from intact proteins by electrospray ionization tandem mass spectrometry. Science 1990, 248, (4952), 201-204.

15. Bure, C.; Lange, C., Comparison of Dissociation of Ions in an Electrospray Source, or a Collision Cell in Tandem Mass Spectrometry. Curr Organic Chem 2003, 7, 1613-1624.

16. Vellaichamy, A.; Tran, J. C.; Catherman, A. D.; Lee, J. E.; Kellie, J. F.; Sweet, S. M.; Zamdborg, L.; Thomas, P. M.; Ahlf, D. R.; Durbin, K. R.; Valaskovic, G. A.; Kelleher, N. L., Size-sorting combined with improved nanocapillary liquid chromatography-mass spectrometry for identification of intact proteins up to 80 kDa. Anal Chem 2010, 82, (4), 1234-1244.

17. Tran, J. C.; Zamdborg, L.; Ahlf, D. R.; Lee, J. E.; Catherman, A. D.; Durbin, K. R.; Tipton, J. D.; Vellaichamy, A.; Kellie, J. F.; Li, M.; Wu, C.; Sweet, S. M.; Early, B. P.; Siuti, N.; LeDuc, R. D.; Compton, P. D.; Thomas, P. M.; Kelleher, N. L., Mapping intact protein isoforms in discovery mode using top-down proteomics. Nature 2011, 480, (7376), 254-258.

18. Ahlf, D. R.; Compton, P. D.; Tran, J. C.; Early, B. P.; Thomas, P. M.; Kelleher, N. L., Evaluation of the compact high-field orbitrap for top-down proteomics of human cells. J Proteome Res 2012, 11, (8), 4308-4314.

19. Mohr, J.; Swart, R.; Samonig, M.; Bohm, G.; Huber, C. G., High-efficiency nano- and micro-HPLC-high-resolution Orbitrap-MS platform for top-down proteomics. Proteomics 2010, 10, (20), 3598-3609.

20. Michalski, A.; Damoc, E.; Lange, O.; Denisov, E.; Nolting, D.; Muller, M.; Viner, R.; Schwartz, J.; Remes, P.; Belford, M.; Dunyach, J. J.; Cox, J.; Horning, S.; Mann, M.; Makarov, A., Ultra high resolution linear ion trap Orbitrap mass spectrometer (Orbitrap Elite) facilitates top down LC MS/MS and versatile peptide fragmentation modes. Mol Cell Proteomics 2011, 11, (3), O111.013698.

21. Mekecha, T. T.; Amunugama, R.; McLuckey, S. A., Ion trap collision-induced dissociation of human hemoglobin alpha-chain cations. J Am Soc Mass Spectrom 2006, 17, (7), 923-931.

22. Armirotti, A.; Benatti, U.; Damonte, G., Top-down proteomics with a quadrupole time-of-flight mass spectrometer and collision-induced dissociation. Rapid Commun Mass Spectrom 2009, 23, (5), 661-666.

23. Cobb, J. S.; Easterling, M. L.; Agar, J. N., Structural characterization of intact proteins is enhanced by prevalent fragmentation pathways rarely observed for peptides. J Am Soc Mass Spectrom 2010, 21, (6), 949-959.

24. Ren, D.; Pipes, G. D.; Hambly, D.; Bondarenko, P. V.; Treuheit, M. J.; Gadgil, H. S., Top-down N-terminal sequencing of Immunoglobulin subunits with electrospray ionization time of flight mass spectrometry. Anal Biochem 2009, 384, (1), 42-48.


25. Geromanos, S. J.; Vissers, J. P.; Silva, J. C.; Dorschel, C. A.; Li, G. Z.; Gorenstein, M. V.; Bateman, R. H.; Langridge, J. I., The detection, correlation, and comparison of peptide precursor and product ions from data independent LC-MS with data dependant LC-MS/MS. Proteomics 2009, 9, (6), 1683-1695.

26. Bedford, L.; Paine, S.; Sheppard, P. W.; Mayer, R. J.; Roelofs, J., Assembly, structure, and function of the 26S proteasome. Trends Cell Biol 2010, 20, (7), 391-401.

27. Wong, E.; Cuervo, A. M., Integration of clearance mechanisms: the proteasome and autophagy. Cold Spring Harb Perspect Biol 2010, 2, (12), a006734.

28. Gallastegui, N.; Groll, M., The 26S proteasome: assembly and function of a destructive machine. Trends Biochem Sci 2010, 35, (11), 634-642.

29. Wang, X.; Chen, C. F.; Baker, P. R.; Chen, P. L.; Kaiser, P.; Huang, L., Mass spectrometric characterization of the affinity-purified human 26S proteasome complex. Biochemistry 2007, 46, (11), 3553-3565.

30. Uttenweiler-Joseph, S.; Claverol, S.; Sylvius, L.; Bousquet-Dubouch, M. P.; Burlet-Schiltz, O.; Monsarrat, B., Toward a full characterization of the human 20S proteasome subunits and their isoforms by a combination of proteomic approaches. Methods Mol Biol 2008, 484, 111-130.

31. Lu, H.; Zong, C.; Wang, Y.; Young, G. W.; Deng, N.; Souda, P.; Li, X.; Whitelegge, J.; Drews, O.; Yang, P. Y.; Ping, P., Revealing the dynamics of the 20 S proteasome phosphoproteome: a combined CID and electron transfer dissociation approach. Mol Cell Proteomics 2008, 7, (11), 2073-2089.

32. Claverol, S.; Burlet-Schiltz, O.; Girbal-Neuhauser, E.; Gairin, J. E.; Monsarrat, B., Mapping and structural dissection of human 20 S proteasome using proteomic approaches. Mol Cell Proteomics 2002, 1, (8), 567-578.

33. Ouvry-Patat, S. A.; Torres, M. P.; Quek, H. H.; Gelfand, C. A.; O'Mullan, P.; Nissum, M.; Schroeder, G. K.; Han, J.; Elliott, M.; Dryhurst, D.; Ausio, J.; Wolfenden, R.; Borchers, C. H., Free-flow electrophoresis for top-down proteomics by Fourier transform ion cyclotron resonance mass spectrometry. Proteomics 2008, 8, (14), 2798-2808.

34. Tran, J. C.; Doucette, A. A., Gel-eluted liquid fraction entrapment electrophoresis: an electrophoretic method for broad molecular weight range proteome separation. Anal Chem 2008, 80, (5), 1568-1573.

35. Witkowski, C.; Harkins, J., Using the GELFREE 8100 Fractionation System for molecular weight-based fractionation with liquid phase recovery. J Vis Exp 2009, (34), 1842.

36. Wu, Q.; Yuan, H.; Zhang, L.; Zhang, Y., Recent advances on multidimensional liquid chromatography-mass spectrometry for proteomics: from qualitative to quantitative analysis-A review. Anal Chim Acta 2012, 731, 1-10.

37. Kellie, J. F.; Tran, J. C.; Lee, J. E.; Ahlf, D. R.; Thomas, H. M.; Ntai, I.; Catherman, A. D.; Durbin, K. R.; Zamdborg, L.; Vellaichamy, A.; Thomas, P. M.; Kelleher, N. L., The emerging process of Top Down mass spectrometry for protein analysis: biomarkers, protein-therapeutics, and achieving high throughput. Mol Biosyst 2010, 6, (9), 1532-1539.

38. Leduc, R. D.; Kelleher, N. L., Using ProSight PTM and related tools for targeted protein identification and characterization with high mass accuracy tandem MS data. Curr Protoc Bioinformatics 2007, 19, 13.6.1-13.6.28.

39. Karabacak, N. M.; Li, L.; Tiwari, A.; Hayward, L. J.; Hong, P.; Easterling, M. L.; Agar, J. N., Sensitive and specific identification of wild type and variant proteins from 8 to 669 kDa using top-down mass spectrometry. Mol Cell Proteomics 2009, 8, (4), 846-856.


Figure 2.1. Schematic representation of the top-down identification of proteasome complex proteins using LC-ESI-TOF MS. When each protein elutes from the column, the voltage in the source region is switched between low and high to obtain intact and fragment ion masses respectively, and both masses were measured using the single-stage TOF mass analyzer.

[Figure image: stacked mass spectra acquired at fragmentor voltages of 250, 270, 300, 325, 350, and 375 V, with the fragment and intact-molecule regions marked and b22, b23, b24, and b26 ions (3+ and 4+) labeled on one spectrum; x-axis: m/z, 500-2100.]

Figure 2.2. Mass spectra of BSA samples at different fragmentor voltages to compare the efficiency of fragmentation. Both the intensity and number of fragment ions increase with increasing voltage until 350 V after which there is no further increase. The b-ions pertaining to the N-terminal 30 amino acids obtained at different charge states (4+ in green and 3+ in blue) are shown for one representative spectrum.

Bovine   1DTHKSEIAHRFKDLGEEHFKGLVLIAFSQYLQQCPFDEHV40
Porcine   DTYKSEIAHRFKDLGEQYFKGLVLIAFSQHLQQCPYEEHV
Sheep     DTHKSEIAHRFNDLGEENFQGLVLIAFSQYLQQCPFDEHV
Rabbit    EAHKSEIAHRFNDVGEEHFIGLVLITFSQYLQKCPYEEHA

Figure 2.3. Homology between the N-terminal 40 amino acid residues in albumins from different species. Shown in red are the conserved residues and the residues that are similar in two or three species are in blue. The residues labeled in black are not conserved.

[Figure image: four stacked mass spectra, panels A (bovine), B (porcine), C (sheep), and D (rabbit); x-axis: m/z, 500-2100.]

Figure 2.4. Nozzle-skimmer dissociation of albumins from different species. Mass spectra of albumins from (A) bovine, (B) porcine, (C) sheep, and (D) rabbit acquired at high NS voltage. The amino-terminal sequence variations among these albumins were determined using NSD.

[Figure image: four stacked mass spectra, panels A (bovine), B (porcine), C (sheep), and D (rabbit), with b- and y-ion peak labels and contaminant peaks marked with *; x-axis: m/z, 520-920.]

Figure 2.5. Fragment ion assignments in mass spectra of albumins from different species. Mass spectra of albumins from (A) bovine, (B) porcine, (C) sheep, and (D) rabbit acquired at high NS voltage. The 500-900 m/z region is zoomed-in to show the b- and y-series fragment ion assignments. The b-ions pertaining to the N-terminal 30 amino acids obtained at different charge states (5+ in red, 4+ in green, 3+ in blue, and 2+ in purple) are shown. For porcine albumin, a y-ion sequence tag (shown in black) was also observed. The peaks labeled with an * are contaminant peaks that originated from albumin samples.

[Figure image: total ion chromatogram with peaks labeled carbonic anhydrase, BSA, ubiquitin, ribonuclease A, superoxide dismutase C, neurotensin, transferrin, and bradykinin; x-axis: time (min), 6-26; y-axis: relative intensity.]

Figure 2.6. TIC of a standard 7-protein mix run through a C3 column. Each peak is labeled with the protein eluting in it, as determined from the intact and fragment ions generated from it. Superoxide dismutase C was a contaminant present in the carbonic anhydrase 2 sample.

[Figure image: three mass spectra. Panel A: y1-y8 and a5-a6 ion labels with a sequence tag, m/z 150-900. Panel B: b21-b31 ion labels at 3+, 4+, and 5+ charge states, m/z 520-940. Panel C: b56-b61 ion labels at 6+ and 7+ charge states, m/z 840-1120.]

Figure 2.7. Fragmentation of proteins in the standard 7 protein mix at high NS voltage during elution from the column. Mass spectrum of (A) bradykinin [1RPPGFSPFR9] showing the y-series sequence tag, (B) BSA showing the b-series ions obtained at different charge states 5+ (orange), 4+ (blue), and 3+ (red) from the N-terminal 21-31 amino acids, and (C) human transferrin showing the b-series ions obtained at 6+ (black) and 7+ (purple) charge states from the amino-terminal 56-61 amino acids.

[Figure image: Panel A: intact mass spectrum showing charge states 10+ to 18+, m/z 850-1550, with an inset showing a deconvoluted mass of 15591.42 Da. Panel B: fragment ion spectrum with b-ion labels and the sequence tag 2TKAVCVL8, m/z 250-950.]

Figure 2.8. Nozzle-skimmer dissociation of superoxide dismutase C during its elution from the LC column. (A) Intact mass spectrum of superoxide dismutase C with the inset showing the deconvoluted intact mass of the same and (B) fragment ion spectrum of superoxide dismutase C showing the fragment ion assignments and the sequence tag obtained.

[Figure image: total ion chromatogram with peaks numbered 1 through 10; x-axis: acquisition time (min), 17-35; y-axis: relative intensity.]

Figure 2.9. TIC of h20S sample run through C3 column. The peaks obtained are labeled from 1 through 10. The NS voltage was altered from low to high every second during the elution window from 20 to 33 minutes.

[Figure image: Panel A: intact mass spectrum showing charge states 10+ to 26+, m/z 800-2300, with an inset showing a deconvoluted mass of 21903.7 Da. Panel B: fragment ion spectrum with b2-b7 and y3 ion labels and the sequence tag 2TIMAVQ7, m/z 250-1050.]

Figure 2.10. Nozzle-skimmer dissociation of β6 subunit during its elution from the LC column. (A) Intact mass spectrum of protein eluting in peak 6 in the TIC obtained at low NS voltage and (B) fragment mass spectrum obtained at high NS voltage from peak 6 in the TIC. The deconvoluted mass is shown in the inset of the intact mass spectrum and amino acid sequence tag deduced by mapping the fragment ions is given in the fragment ion spectrum. From the intact and fragment masses, this protein was identified as β6 subunit.

[Figure image: total ion chromatogram with peaks labeled α1/β5, α2/β1, α7/β7, and α3/β2 (co-eluting pairs), α6, β4, α4, β3, α5, and β6; x-axis: acquisition time (min), 17-35; y-axis: relative intensity.]

Figure 2.11. TIC of h20S sample with the peaks labeled with the corresponding h20S subunits eluting in it. α and β subunits are labeled in red and green respectively.

[Figure image: ten intact mass spectra in two columns, labeled α1/β5, α6, α2/β1, α7/β7, α3/β2, β3, α4, β4, α5, and β6; x-axis: m/z, 700-2000.]

Figure 2.12. Intact mass spectra of all the proteins in the h20S complex identified by LC separation followed by NSD. Co-eluted proteins are shown together in the same spectrum.

[Figure image: deconvoluted mass spectrum with peaks at 28344.50 Da (α3 protein) and 28423.98 Da (α3 protein + [PO3]), separated by 79.48 Da; x-axis: deconvoluted mass (Da), 28200-28850; y-axis: relative intensity (×10^5).]

Figure 2.13. Deconvoluted mass spectrum of the α3 subunit of h20S proteasome. The singly phosphorylated form is more intense than the unmodified form of the protein.

Residue position   Amino   Experimental   Calculated   Calculated   Error in ppm,        Error in ppm,
from N-terminus    acid    b3+            b3+ (a)      b3+ (b)      calculated (a) vs.   calculated (b) vs.
                                                                    experimental b3+     experimental b3+
20                 Q       790.3648       790.3686     790.3807     4.7488               20.0578
21                 G       809.3720       809.3757     809.3878     4.5550               19.5456
22                 L       847.0655       847.0704     847.0825     5.7689               20.0925
23                 V       880.0877       880.0932     880.1053     6.2342               20.0203
24                 L       917.7826       917.7879     917.8000     5.7602               18.9439

(a) b3+ if ‘Q’ is present
(b) b3+ if ‘K’ is present

Table 2.1. ESI bn product ion m/z values for sheep albumin. The error between calculated and experimental m/z when ‘Q’ is present at position 20 in sheep albumin is compared with the error when ‘K’ is assumed to replace ‘Q’. The deviation in mass at an m/z of 800 is within 6 ppm, which is sufficient to differentiate ‘K’ from ‘Q’.
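The comparison in Table 2.1 rests on a small piece of arithmetic: substituting K for Q raises the monoisotopic mass by about 0.036 Da, which for a 3+ ion near m/z 790 is a shift of roughly 15 ppm. The following sketch reproduces the first table row from its rounded values. The helper name `ppm_error` is an assumption, and because the inputs are the rounded m/z values printed in the table, the results agree with the tabulated errors only to within a few tenths of a ppm.

```python
K_MINUS_Q = 128.09496 - 128.05858  # monoisotopic residue mass difference, Da


def ppm_error(calculated, experimental):
    """Relative deviation of the experimental value from the calculated one, in ppm."""
    return (calculated - experimental) / calculated * 1e6


experimental = 790.3648            # experimental b20 3+ (Table 2.1)
calc_q = 790.3686                  # calculated b20 3+ with Q at position 20
calc_k = calc_q + K_MINUS_Q / 3    # same ion with K substituted for Q

print(round(calc_k, 4))
print(round(ppm_error(calc_q, experimental), 2))
print(round(ppm_error(calc_k, experimental), 2))
```

The Q assignment falls within the 6 ppm accuracy quoted in the caption, whereas the K assignment would require an error of about 20 ppm, so the two residues are distinguishable at this mass accuracy.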

UniProt ID  Subunit  Average theoretical  Average experimental  Error in mass   Modification                        Sequence tags obtained from NSD
                     MW (Da)              MW (Da)               accuracy (ppm)
P25786      α1       29597.7              29597.8               3.38            +N-acetyl Met(ini)                  249PADEPA254, 260PME262, 3RNQ5
P25787      α2       25809.4              25809.39              0.39            -Met(ini), +N-acetyl Ala1           5YSFSLTT11, 8SLTTFS13
P25788      α3       28424.1              28423.9               7.04            -Met(ini), +N-acetyl Ser1, +[PO3]   Co-eluting protein
P25789      α4       29394.7              29395.04              11.57           -Met(ini), +N-acetyl Ser1           33CLGI36, 239HEEEEA244
P28066      α5       26453.1              26452.8               11.34           +N-acetyl Met(ini)                  13NTFSPEGR20, 23QVEYAIEAI31, 36TAIGIQT42, 39GIQT42, 218ATVQP222
P60900      α6       27310.3              27310.32              0.73            -Met(ini), +N-acetyl Ser1           2RGSSAG7, 10RHITIFS16, 12ITIFS16, 219VTVENP224, 235DAH237, 238LVAL241
O14818      α7       27797.7              27798.1               14.39           -Met(ini), +N-acetyl Ser1           Co-eluting protein
P20618      β1       23549.0              23548.3               29.73           No modification                     Co-eluting protein
P49721      β2       22878.3              22878.32              0.87            +N-acetyl Met(ini)                  2EYLIGI7, 195SFPK198
P49720      β3       22859.8              22859.81              0.44            -Met(ini), +N-acetyl Ser1           146YGMCESLWEP155, 185IVH187, 186VHIIEKDK193
P28070      β4       24391.8              24391.6               8.20            No modification                     2QNPMVT7
P28074      β5       22458.4              22458.3               4.45            No modification                     Co-eluting protein
P28072      β6       21903.9              21903.7               9.13            No modification                     2TIMAVQF8
Q99436      β7       25295.04             25295.05              0.4             No modification                     229QTM231

Met(ini): initiator methionine.

Table 2.2. Summary of h20S proteasome subunits identified by LC separation followed by NSD. The average theoretical and experimental MW, deviation in the measured mass, sequence tags obtained, and the modifications observed for each protein are tabulated.


CHAPTER 3

Top-Down Protein Identification of Proteasome Complex Proteins with nano LC FT-ICR MS Employing Data-Independent Fragmentation Methods

3.1. Abstract

Intact protein identification that combines various up-front fractionation techniques with LC separation and gas-phase fragmentation has been presented by different groups to identify proteins in samples ranging from simple standard mixtures to complex samples. We have developed a similar approach using a 15 T FT-ICR mass spectrometer to identify the proteasome complex proteins without any pre-fractionation. For identification on the LC timescale, we employed funnel-skimmer CAD. In addition, we developed a new data-independent fragmentation method called ‘CASI (Continuous Accumulation of Selected Ions)-CAD’. During CASI-CAD, several precursor charge states of the multiply charged protein were selected by choosing a mass range in the external quadrupole, and the preselected ions were transferred to the collision cell and fragmented at once by CAD. Both dissociation methods identified 13 of the 14 proteins in the 20S complex, each with several sequence tags 4-5 amino acids long. From the product ion data, the nature of the N-terminus was characterized for 9 proteins and the α3 protein was found to be phosphorylated. The deviation in the measured monoisotopic mass for these proteins was 1 ppm on average. Furthermore, for the 19S complex, whose proteins extend over a wide MW range, the intact masses of the proteins were obtained, including that of the 100.2 kDa protein, which was measured with a deviation of 17 ppm. Efforts to fragment and identify these proteins are continuing. The CASI-CAD technique for fragmenting proteins presented here is an efficient addition to the growing inventory of dissociation methods compatible with the on-line separation of proteins. With further up-front separation techniques, this method could be an efficient top-down means to address more complex samples.


3.2. Introduction

Protein identification by top-down MS can be valuable for analyzing samples that range from complex mixtures (e.g., cell lysates) to those containing relatively few proteins, such as endogenous protein complexes. Though bottom-up methodologies have advantages, such as well-developed protocols,1 efficient peptide chromatography,2 and well-developed bioinformatics support,3 the concept of analyzing proteins intact is still unmatched in the extent of information it can potentially yield. Protein identification by top-down MS yields the intact mass of the protein, which readily reveals whether the molecular mass agrees with that predicted from sequence databases. Any discrepancy in the mass hints at the presence of ‘protein species’ or ‘proteoforms’ (proteins modified during co- and post-translational events) and ‘isoforms’ (variations at the gene level, such as related genes encoding similar protein sequences),4 which can be further characterized by targeted studies. In-depth characterization of the protein by top-down methods using an assortment of tandem MS techniques can uncover sequence conflicts,5 incorrect N-terminal modifications,6 disulfide bridges,7 combinatorial PTMs,8 and positional isomers.9

To carry out intact protein identification on a high-throughput scale with the required sensitivity and resolution, a diverse array of separation techniques has been developed10 with the goal of keeping the proteins intact and soluble at the end of every step. Ultimately, the proteins are separated by LC (mostly reversed phase) before entering the mass spectrometer, where a choice of fragmentation techniques is again available for identifying and, to some extent, characterizing the protein. The first demonstration of identifying a large sample set of proteins on the LC separation timescale was accomplished by Kelleher’s group.11 Since then, separation techniques and mass spectrometry instrumentation have been modified and improved for identifying intact proteins in dense sample populations.12, 13 Furthermore, the recent addition of orbitrap instruments to the existing family of high-resolution mass spectrometers is moving the top-down protein identification field toward much shorter data acquisition times (1.5 seconds on the orbitrap Elite versus 4-8 seconds on the 12 T FT-ICR MS).14, 15

Another significant achievement was the incorporation of electron transfer dissociation (ETD) during on-line protein chromatographic separation. For example, ribosomal proteins were characterized by LC-ETD-MS/MS on a modified linear ion trap, and the proteins were identified from the c- and z-series ions obtained from the ETD reaction.16 More recently, applying ETD and higher energy collision dissociation (HCD) during on-line protein elution in the orbitrap instrument showed that, among all the methods compared for top-down identification of proteins in a cancer cell line, these two fragmentation methods yield the best sequence coverage and the greatest number of unique identifications, respectively.15

Though efforts have been made to employ a variety of fragmentation methods for top-down protein identification on the LC timescale, nozzle-skimmer dissociation (NSD) and collisionally activated dissociation (CAD) are the two predominant methods, owing to their shorter data averaging times and to the feasibility of fragmenting proteins, including high MW proteins, by NSD during on-line elution.17 ETD requires a longer time to average data from the far larger number of product ions produced by its sequence-independent fragmentation scheme18 and is currently limited to proteins under 20 kDa when they are eluted from a column.16 HCD and electron capture dissociation (ECD) require dedicated instrument setups and can be performed only in orbitrap and FT-ICR instruments, respectively. Akin to ETD, ECD is not used for on-line protein identification experiments.19

Most high-throughput top-down protein identification platforms use NSD and CAD to cover the high and low MW proteins, respectively, in the proteome being studied. NSD is performed without any precursor ion selection20 and can be programmed to switch between high and low nozzle-skimmer voltages during on-line protein elution.21 In contrast, CAD is performed in a dedicated collision cell after isolation of the precursor charge state. During on-line protein identification, low MW proteins are mostly fragmented by isolating the top ‘n’ charge states (n = 3 to 5, which determines the instrument duty cycle) and fragmenting each separately by CAD.22 The dissociation products obtained from different precursor charge states of the same protein are merged during post-processing to identify it by database searching. Most high-throughput protein identification platforms developed so far work on this scheme. Fragmenting multiple charge states of a protein is preferred over relying on the dissociation of one predominant charge state,23 since many studies have shown that the fragment ions obtained from different charge states vary and that analyzing different charge states increases the sequence coverage.24, 25 Therefore, there is a need to develop high-throughput top-down workflows that operate in data-independent modes and fragment multiple charge states all at once in the collision cell by CAD-type fragmentation, since this eliminates the need to spend instrument time on charge state isolation based on isotope spacing and also reduces post-processing time.

We have developed a top-down technique based on nanoscale LC separation followed by on-line ultra-high resolution FT-ICR MS to examine complex samples. Funnel-skimmer dissociation (FS-CAD), which is the FT instrument equivalent of the more widely known NSD, was employed to fragment proteins on the LC timescale. FS-CAD has been successfully used by Agar’s group to fragment proteins up to 669 kDa in infusion experiments.26 In addition to using FS-CAD, we have adapted a CAD-type fragmentation known as Continuous Accumulation of Selected Ions (CASI)-CAD for protein identification. CASI is predominantly known in the small molecule field for its ability to selectively enrich molecules in complex matrices27 and for tissue imaging.28 Because of the high background when imaging tissue sections to probe for the presence of drugs and their metabolites, isolating and accumulating the compound enriches it over other molecules and yields an accurate mass, which eliminates the need to do MS/MS for validation.29 In 2011, Zhong et al. used this technology to determine the structure of an unknown degradation product of an antifungal agent, posaconazole.30 LC-MS of the sample showed the degradant at m/z 685, with an accurate mass obtained by CASI. This was followed by infusion experiments in which the degradant was enriched by choosing a mass window and accumulating for a preset time, followed by transfer to the ICR cell, where further fragmentation led to identification of the molecular composition of the degradant.

The initiative to selectively accumulate ions in an external quadrupole prior to transferring them to the ICR cell was originally developed by Marshall and coworkers in the late 1990s.31 They used it to selectively zoom in on regions across the entire m/z window to obtain high-resolution data for identifying the compounds in crude oil samples.32 Since then, this technique has been applied in both targeted33 and large-scale top-down studies,34 to accumulate ions to increase the signal-to-noise ratio and to selectively scan various regions of the mass spectrum, respectively. Recently, Tipton et al. used this ‘zoom mapping’ approach for on-line top-down protein identification by selecting ions spanning two sequential 80 to 100 m/z windows for high-resolution mass measurement of precursor ions, followed by a broad-range m/z scan to measure the fragment ions generated by NSD; this resulted in baseline resolution and identification of proteins up to 72 kDa.35

Here, we present the CASI-CAD technique for top-down protein identification, which simultaneously accumulates 6-7 precursor charge states and dissociates them all at once during on-line elution. The multiple charge states of an intact protein to be fragmented are selected by choosing a mass window (typically 800-1200 m/z) in an external quadrupole; the ions are then transferred to a collision cell, where they are accumulated for a preset time and collision energy is applied to fragment them. The fragment ions are then transferred downstream to the ICR cell for mass measurement. Human 19S and 20S proteasomes were used as model complexes to develop our top-down technique. These two complexes form an integral part of the ‘Ubiquitin-Proteasomal System’ and degrade a majority of intracellular proteins.36 The 20S complex (700 kDa), also known as the core particle, consists of two alpha and two beta rings, which are stacked to form a hollow ring-shaped structure (Figure 3.1), and contains the active sites responsible for proteolysis.37 Each alpha ring is made up of 7 different alpha proteins and each beta ring is made up of 7 different beta proteins in eukaryotes. The 19S complex, also called the regulatory particle, contains more than 19 different subunits ranging from 8.2 to 105.8 kDa; it controls protein import and powers the complex for protein degradation.38

Prokaryotic and eukaryotic 20S complexes have so far been analyzed by both top-down39, 40 and bottom-up approaches,41 and the N-terminus/PTMs of the subunits have already been characterized.42, 43 The intact masses of the human 20S subunits have been determined by 2D gel electrophoretic separation of the 20S subunits followed by eluting the proteins intact from the gel for further analysis with FT-ICR MS.40 The 19S complex subunits have also been characterized to some extent by bottom-up methods,44 but the intact masses of these proteins are yet to be determined. We have used FS-CAD (Figure 3.1a) and CASI-CAD (Figure 3.1b) methods for identifying proteins in the human 20S (h20S) complex, and we also measured the intact masses of proteins in the human 19S (h19S) complex, including a 100.2 kDa protein. In addition to these two dissociation methods, we have also explored the possibility of using the CASI principle to enrich for fragment ions after dissociation in the funnel-skimmer region (Figure 3.1c). Intact and product ions obtained from the tandem MS experiments were then searched against the database for protein identification.

3.3. Experimental section

3.3.1. Sample preparation

Bovine ubiquitin was purchased from Sigma-Aldrich (St. Louis, MO). LC-MS grade solvents were from the following vendors: water and isopropanol from Baker (VWR, Radnor, PA) and acetonitrile (ACN) from EMD (Billerica, MA). High purity formic acid (FA) and trifluoroacetic acid (TFA) were from Thermo Scientific Pierce (Waltham, MA). Human 20S and 19S proteasome complexes were purchased from Boston Biochem (Cambridge, MA). Prior to loading on the column, the proteasome samples were denatured by diluting in 95/5/0.1 H2O/ACN/FA. Ubiquitin was prepared at a concentration of 1 mg/mL in MS grade water and then diluted to 500 fmol/µL using 50/50/0.1 H2O/ACN/FA for calibrating the instrument.

3.3.2. LC-MS on Dionex nano LC-Bruker FT-ICR MS

Chromatography was performed using a Dionex Ultimate 3000 rapid separation LC system (Thermo Scientific, Waltham, MA), which consists of a binary high-pressure gradient pump and a ternary low-pressure loading pump. 100 fmol of the h20S proteasome sample was injected into a Dionex Pepswift reversed phase monolithic trap column (200 µm x 5 cm) and washed for 10 min with 100% water 0.05% TFA at a flow rate of 5 µL/min using the loading pump. This was followed by gradient elution into the Dionex Pepswift reversed phase monolithic analytical column (100 µm x 25 cm). 100% water 0.1% FA (solvent A) and 100% ACN 0.1% FA (solvent B) were delivered at a flow rate of 1 µL/min and the following gradient was used: 10% B for 10 min, increased to 25% B at 15 min, 50% B at 49 min, 80% B at 54 min, held at 80% B for 5 min, and finally equilibrated at 10% B for 10 min. Both the analytical and trap columns were held at 60 °C throughout the run to aid protein separations. For separating the proteins in the 19S complex, solvent B in the gradient pump was changed to 90/10/0.1 ACN/isopropanol/FA and the gradient used was the same as above.
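The gradient program above can be written down as a small lookup, which is convenient for estimating the solvent composition at which a given peak elutes. The following is only a minimal sketch: it assumes linear ramps between the stated breakpoints, ignores the gradient delay volume of the system, and omits the final re-equilibration step because the text does not state how quickly the composition was returned to 10% B. The names `GRADIENT` and `percent_b` are illustrative and are not part of any vendor software.

```python
# Gradient breakpoints as (time in min, %B), taken from the text above.
# Assumption: the composition changes linearly between breakpoints.
GRADIENT = [
    (0.0, 10.0),
    (10.0, 10.0),
    (15.0, 25.0),
    (49.0, 50.0),
    (54.0, 80.0),
    (59.0, 80.0),
]


def percent_b(t_min, program=GRADIENT):
    """Return the programmed %B at time t_min by linear interpolation."""
    if t_min < program[0][0] or t_min > program[-1][0]:
        raise ValueError("time outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            if t1 == t0:
                return b1
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("gradient program is not sorted by time")


if __name__ == "__main__":
    # Programmed composition at a few time points during the 25-50% B ramp.
    for t in (27, 32, 42):
        print(f"{t} min: {percent_b(t):.1f}% B")
```

Under these assumptions, the midpoint of the 25-50% B ramp (32 min) corresponds to 37.5% B. The actual composition reaching the column at a given time will lag the programmed value by the system delay volume.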

The Dionex nanoLC system was connected in-line with a Bruker solariX hybrid Qq-15 T FT-ICR mass spectrometer. The CaptiveSpray source (Bruker Daltonics, Billerica, MA) was used to couple the LC flow directly into the inlet of the mass spectrometer. It consists of a non-tapered 20 µm i.d. silica tip which produces a stable spray with minimized clogging. The MS source parameters were optimized for the LC flow rate of 1 µL/min and were as follows: a drying gas flow of 2.5 L/min at a temperature of 180 °C and a capillary voltage of -1.5 kV. The capillary exit and deflector plate were at 250 V and 230 V respectively. For intact measurements of the proteins in the h20S complex, funnel 1 and skimmer 1 were held at 140 V and 30 V respectively, whereas for FS-CAD, the skimmer 1 potential was increased to 100 V. The potential of funnel 2, which is downstream of skimmer 1, remained at 6 V in all the experiments. For intact measurements of the h19S complex proteins, the skimmer 1 voltage was increased to 60 V. The data were acquired in the broadband detection mode, using 1 MHz data points, with a mass range of 150-3000 m/z. The total duration of an MS scan event was 0.63 second, which includes a transient length of 0.34 second. The number of scans averaged for one spectrum was set to one.

For CASI-CAD, the ions between 800-1200 m/z were selected in the quadrupole by setting the Q1 mass to 1000 m/z and using an isolation window of 400 m/z. For proteins in the h20S complex, the ions were accumulated for 0.2 second in the collision cell and an optimized collision energy of 14.5 V was applied for fragmentation.

The instrument was calibrated externally by infusing 500 fmol/µL of ubiquitin at a flow rate of 1 µL/min using a syringe pump through the CaptiveSpray source. The most abundant peaks from the 13+ to 6+ charge states of ubiquitin were used for calibration and a mass accuracy of 0.1-0.3 ppm was routinely obtained for the ubiquitin peaks.

3.3.3. Data analysis

Total ion chromatograms (TICs) obtained from LC-MS and LC-MS/MS (using FS-CAD and CASI-CAD) runs were smoothed using the Gaussian smoothing algorithm (0.8 second width for 1 cycle) in DataAnalysis 4.0 (Bruker Daltonics). Mass spectra were obtained by manually integrating at the apex of each peak in the TIC. Charge state distributions of the intact proteins and fragment ions from the tandem MS experiments were deconvoluted using the Sophisticated Numerical Annotation Procedure (SNAP) II algorithm in DataAnalysis, which determines the monoisotopic MH+ for both intact and fragment ions. The mass lists obtained were exported and converted to neutral masses by subtracting the mass of the proton. Since both FS-CAD and CASI-CAD are data-independent fragmentation methods, the MS/MS data files do not contain a specific precursor ion, and so the mass of the protein being fragmented was manually added to each file as the precursor mass. ProSight 2.0 was used to search the human top-down complex database,45 which is built against UniProt release 2012_06 and can be downloaded as a ProSight warehouse format file from the following site: ftp://prosightpc.northwestern.edu/2012_06/Eukaryotes/Homo%20sapiens/

This database contains 1,496,086 basic sequences and supports top-down searches with known N-terminal modifications, PTMs, alternative splice forms, and endogenous cleavage events.17 The database was searched using the ‘Absolute Mass’ search type,46 which matches the observed precursor mass to those in the database within the specified mass window and then compares the theoretical fragments from all the proteins within the tolerance limit to the observed fragments. The following parameters were used: precursor mass tolerance of ± 2 Da [to account for mass discrepancies from biological events (amidation or deamidation) and/or errors during deconvolution], fragment mass tolerance of ± 10 ppm, and N-terminal acetylation and phosphorylation as modifications.
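The data-processing steps described above (MH+ to neutral mass conversion, a ± 2 Da precursor filter, and ± 10 ppm fragment matching) can be illustrated with a simplified sketch. This is not the ProSight implementation: it ranks candidates only by the number of matched fragments and does not compute e values. The proton mass is the standard value and is not quoted in the text, and the database entries and masses in `TOY_DB` are hypothetical values chosen only to exercise the functions.

```python
PROTON_MASS = 1.007276  # Da; standard value, assumed here (not given in the text)


def mh_to_neutral(mh_plus):
    """Convert a monoisotopic MH+ value to a neutral monoisotopic mass."""
    return mh_plus - PROTON_MASS


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def count_matches(observed, theoretical, tol_ppm=10.0):
    """Count observed masses lying within tol_ppm of any theoretical mass."""
    return sum(
        any(abs(ppm_error(obs, theo)) <= tol_ppm for theo in theoretical)
        for obs in observed
    )


def absolute_mass_search(precursor, fragments, database,
                         precursor_tol_da=2.0, fragment_tol_ppm=10.0):
    """Filter candidates by intact mass, then rank by matched fragment count.

    database: iterable of (name, intact_mass, theoretical_fragment_masses).
    """
    hits = []
    for name, intact_mass, theo_fragments in database:
        if abs(precursor - intact_mass) <= precursor_tol_da:
            n = count_matches(fragments, theo_fragments, fragment_tol_ppm)
            hits.append((name, n))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical database entries: (name, intact mass, fragment masses), all in Da.
TOY_DB = [
    ("protein_A", 25000.0, [1000.0, 2000.0, 3000.0]),
    ("protein_B", 25001.5, [1000.5, 2500.0]),
    ("protein_C", 26000.0, [1000.0]),
]

if __name__ == "__main__":
    observed_mh = [1001.007276, 2001.017276, 3001.107276]
    neutral = [mh_to_neutral(m) for m in observed_mh]
    print(absolute_mass_search(25000.4, neutral, TOY_DB))
```

In this toy example, protein_C is excluded by the precursor window, and only two of the three observed fragments fall within 10 ppm of a protein_A fragment. Because the precursor mass is supplied manually for data-independent spectra, the same fragment list can be searched against more than one precursor mass, which is relevant for co-eluting proteins.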

3.4. Results and discussion

3.4.1. LC-MS of h20S proteasome using nLC-FT-ICR MS

The h20S complex was separated using a reversed phase monolithic column and all the protein subunits eluted between 27 and 42 min, as illustrated in the TIC (Figure 3.2, bottom panel). The intact masses of all the proteins in the complex were obtained from 100 fmol of the complex loaded on the monolithic column. The polymeric (PS-DVB) material in this column forms a continuous rod of interconnected microglobules and contains flow-through pores, but lacks interstitial volume inside the microglobules, and hence there is no intra-particle mass transfer.47 Consequently, we find these columns to be more sensitive than conventional bead based columns (C4/C8 group bonded silica based columns and the polymeric PLRP columns), and this has been reported by other groups as well.48, 49

In the top panel of Figure 3.2, a simulated gel-view of the same elution window as in the TIC is shown, with the x and y axes representing the time of elution and m/z respectively. The protein(s) eluting in each peak in the TIC have a corresponding charge state envelope as depicted in the gel-view. The intact mass spectra of four h20S subunits eluting in the TIC are shown in Figure 3.3, with the inset in each panel displaying the resolution obtained for the charge state highlighted. A resolution of 45,000-50,000 was routinely obtained for all the proteins, with the data points averaged over 15-20 second windows at the peak apex on the LC timescale.

The peak widths at half maximum (FWHM) were found to be within 15-30 seconds for all the peaks, except peaks 1 and 9, which have peak widths of 1 min and comprise proteins approaching 30 kDa in MW. Increasing peak width with protein MW is a general trend; peak widths in the range of 1-2 min were obtained for proteins of 35-50 kDa MW when PLRP columns were used by the Kelleher group.35, 50 For intact protein analysis, when compared to PLRP and silica based columns, monoliths generally give rise to smaller peak widths and shorter retention times,51 and so rigorous comparisons could be carried out to evaluate the efficiency of these columns in top-down LC-MS protein analysis. The retention time reproducibility of the peaks was found to be within ± 20 seconds, for both intra- and inter-day runs.
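As a rough check on how many spectra contribute to each averaged mass spectrum, the peak widths above can be combined with the 0.63 second scan event reported in the experimental section. This is a back-of-the-envelope sketch that assumes scan events follow one another with no additional overhead; the function name `spectra_across` is illustrative.

```python
SCAN_EVENT_S = 0.63  # duration of one MS scan event (s), from the experimental section


def spectra_across(window_s, scan_event_s=SCAN_EVENT_S):
    """Number of complete scan events that fit within a time window."""
    return int(window_s // scan_event_s)


if __name__ == "__main__":
    # Averaging windows (15-20 s) and peak widths (15-30 s, 60 s) from the text.
    for window in (15, 20, 30, 60):
        print(f"{window} s window: {spectra_across(window)} spectra")
```

Under this assumption, a 15-20 second averaging window corresponds to roughly 23-31 individual spectra.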

3.4.2. LC-MS/MS using data-independent fragmentation methods

To identify the proteins in the h20S complex as each protein eluted, two different data-independent fragmentation methods, funnel-skimmer CAD (FS-CAD) and CASI-CAD, were employed.

3.4.2.1. Protein identification using FS-CAD: FS-CAD occurs in the wide ion funnel inter-stage region between the first and second funnel, which provides higher transfer efficiency for large, multiply charged ions. Fragmentation in the funnel-skimmer region was facilitated by increasing the declustering potential (skimmer 1 voltage) in the source region (Figure 3.4). The voltage of skimmer 1 was optimized by setting up several LC-MS/MS runs of the h20S complex with increasing voltages, and the optimum voltage was chosen by examining the intensity and number of fragment ions at every voltage (data not shown). Optimizing the voltage also ensures that secondary fragmentation, which tends to be uninformative, is avoided. A skimmer 1 voltage of 100 V, corresponding to a ΔSF (skimmer 1 voltage – funnel 2 voltage) of 94 V, yielded the best coverage and was used for fragmenting the h20S subunits during on-line elution. This is a 70 V increase over the skimmer 1 potential used for intact mass measurements. A fragment ion mass list, generated for each subunit by averaging the data points from each peak in the TIC, was used for database searching. Table 3.1 shows the experimental MW, number of fragment ions matched (both b- and y-ions), expectation value (e value) obtained from ProSight, and the nature of the N-terminus characterized from the fragment ion data for each protein in the h20S complex. 13 out of 14 h20S proteins were identified with b- and y-series fragment ions with mass errors of less than 10 ppm (Table 3.1). For the intact mass measurements (which were acquired by keeping the skimmer 1 potential at 30 V), the errors in the measured mass were much lower, and we were able to measure the intact masses of all the proteins within a deviation of 1 ppm on average during the LC-MS runs (Table 3.2). The higher accuracy of the intact masses relative to the fragment masses could be due to the larger number of ions being measured after fragmentation, compared with the fewer ions measured in the intact mass spectrum.

Illustrated in Figure 3.5 is the FS-CAD of a 29 kDa protein. FS-CAD of this protein yields a fragment ion rich spectrum (Figure 3.5A) with several sequence tags of 6-7 amino acids in length, one of which is zoomed in Figure 3.5B. Database searching identified this protein as the α4 subunit of the h20S complex, with an e value of 4.94E-47. Low e values represent better matches that are less likely to be false positives. Of the total 135 fragment ions obtained, 50 (32 b-ions and 18 y-ions) were mapped (Figure 3.5C), and these arose from 48 of the possible 259 distinct inter-residue amide bond cleavages in the α4 protein. This translates to 19% sequence coverage, with 37% of the total fragments matched by the search. The sequence coverage and number of fragment ions matched could be low owing to the presence of a large number of internal fragments that were not interpreted. Recently, Agar’s group deciphered the fragmentation pathways observed in intact proteins during FS-CAD at a similar declustering potential in an instrument like the one used in our study; they found that intermediate charge states of proteins preferentially cleave through proline and aspartic acid channels and generate internal fragments that account for around half of the total fragments obtained.52 Current top-down search engines are limited in their ability to search for internal fragments, which are formed by two backbone amide bond cleavages in the protein and contain neither terminus. Further improvement in the search tools to look for internal fragment ions after generating a list of potential protein hits would be very valuable for interpreting more fragment ions and thereby increasing the sequence coverage.
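The two percentages quoted for the α4 subunit are different ratios, and the distinction is easy to lose. Sequence coverage is computed over backbone cleavage sites, whereas the matched fraction is computed over exported fragment ions. The short sketch below reproduces both from the counts given in the text; the function names are illustrative.

```python
def sequence_coverage(cleaved_sites, total_sites):
    """Percentage of inter-residue bonds supported by at least one matched fragment."""
    return 100.0 * cleaved_sites / total_sites


def matched_fraction(matched_ions, total_ions):
    """Percentage of exported fragment ions that were assigned by the search."""
    return 100.0 * matched_ions / total_ions


if __name__ == "__main__":
    # Counts for the α4 subunit by FS-CAD, taken from the text.
    print(f"sequence coverage: {sequence_coverage(48, 259):.1f}%")
    print(f"matched fraction:  {matched_fraction(50, 135):.1f}%")
```

With the counts above, these evaluate to about 18.5% and 37.0%, in agreement with the rounded values of 19% and 37%.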

3.4.2.2. Protein identification using CASI-CAD: CASI-CAD, which has not yet been explored for proteins, was performed by choosing a mass range such as 800-1200 m/z in the external quadrupole; all the ions within this range were accumulated for a set amount of time in the collision cell and fragmented by CAD (Figure 3.4). This is illustrated by infusing a sample of bovine ubiquitin (Figure 3.6). The full m/z range ESI-MS of ubiquitin is shown in Figure 3.6A. Only the ions between 800-1200 m/z were selected in the external quadrupole (Figure 3.6B) by setting a mass window of 1000 ± 200 m/z, which predominantly isolates the 10+ to 8+ charge states from the charge state envelope of the protein. These selected ions were then transferred to the collision cell and fragmented by CAD (Figure 3.6C).
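Which charge states fall inside a given quadrupole window follows directly from the protein mass. The sketch below lists the charge states of a protein whose m/z values lie within the 800-1200 m/z window. It assumes simple protonation and uses an approximate average mass of 8565 Da for bovine ubiquitin, a value that is not given in the text; the function names are illustrative.

```python
PROTON_MASS = 1.007276  # Da; standard value, assumed here

# Approximate average mass of bovine ubiquitin (Da). Assumption, not from the text.
UBIQUITIN_MASS = 8565.0


def mz(neutral_mass, z):
    """m/z of an [M + zH]z+ ion."""
    return (neutral_mass + z * PROTON_MASS) / z


def charge_states_in_window(neutral_mass, mz_low=800.0, mz_high=1200.0, z_max=100):
    """Charge states whose m/z falls inside the isolation window."""
    return [
        z for z in range(1, z_max + 1)
        if mz_low <= mz(neutral_mass, z) <= mz_high
    ]


if __name__ == "__main__":
    for z in charge_states_in_window(UBIQUITIN_MASS):
        print(f"{z}+ at m/z {mz(UBIQUITIN_MASS, z):.1f}")
```

For ubiquitin this gives the 8+, 9+, and 10+ charge states, consistent with the isolation shown in Figure 3.6B. Which of the in-window charge states are actually populated for a larger protein depends on its ESI charge state distribution.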

We chose to carry out the CASI-CAD type of dissociation for proteins eluting from the column since it can be performed rapidly during the course of the experiment. Generally, in data-dependent acquisitions, the time it takes to acquire the fragment ion data from the isolated charge states determines the duty cycle of the instrument. The duty cycle is lowered when the program has to automatically select the precursor based on the isotopic distribution and then acquire the high resolution fragment ion data for that charge state.22 In data-dependent top-down protein identification experiments, this process has to be repeated for at least the three most abundant charge states within the time the protein elutes from the column. CASI-CAD is a data-independent method and so it eliminates the need to spend instrument time on precursor selection. Initial survey experiments with the h20S sample were conducted to determine the mass window, or the range of charge states, to be selected. All the proteins in the h20S complex fall within a mass range of 20,000-30,000 Da and the ESI charge state distributions for these proteins lie within the 700-2000 m/z range. Since CAD is more efficient when the intermediate charge states are selected,53 a mass window of 800-1200 m/z was selected, and this isolates 6-12 precursors varying between 38+ and 17+, depending on the ESI generated charge state distribution of the protein. The collision voltage was optimized and the ion accumulation time in the collision cell was increased from 0.2 second to 0.3 second to determine whether sequence coverage increases with accumulation time. The sequence coverage of the α5 subunit at 0.2 and 0.3 second ion accumulation times is shown in Figure 3.7A and 3.7B respectively, with the table in the inset showing the total number of ions exported, the number of matching fragment ions, and the e values for this ion accumulation time comparison. For approximately equal numbers of ions exported, the number of fragments matched was the same, and increasing the accumulation time did not yield more sequence coverage. Moreover, the relative intensities of the fragment ions obtained at 0.2 and 0.3 second accumulation times were also equal (Figure 3.8). Therefore, the precursor ions were accumulated for 0.2 second in the collision cell prior to CAD. Generally, longer ion accumulation times should be avoided since they increase the total time it takes to acquire a scan, thereby decreasing the LC resolution.

The MS/MS data obtained from CASI-CAD experiments were processed and searched with the same parameters as FS-CAD, and the results from these identifications are summarized in Table 3.1. Akin to the FS-CAD protein identification results, the CASI-CAD technique also identified 13 out of the total 14 subunits, with fragment ions being matched with mass errors under 10 ppm. As an example, the CASI-CAD fragmentation of the α4 subunit is displayed in Figure 3.9A, showing the fragment ions obtained after dissociating the charge states that were enriched between 800-1200 m/z; a zoomed-in view is displayed in Figure 3.9B to show the b95¹⁰⁺ to b99¹⁰⁺ sequence tag that was obtained by mapping the fragment ions. The sequence coverage obtained for this protein is shown in Figure 3.9C.

CASI-CAD produced a higher number of matching fragments for three proteins (highlighted in green in Table 3.1) when compared to the other methods used. The β7 subunit was the only protein not identified, since it co-eluted with the β5 subunit and did not yield any fragment ions.

The three proteins that were identified with more fragments by the CASI-CAD method were the α1, α2, and β6 subunits. α1 was found to co-elute with α7, and β6 co-eluted with β4. The α1 and β6 subunits were identified with 14 and 13 fragments respectively by FS-CAD. CASI-CAD, on the other hand, picked up more fragments, identifying the α1 and β6 subunits with 17 and 23 fragment ions respectively. Similarly, the α2 subunit co-eluted with α6 and was identified with 48 b- and y-ions by CASI-CAD, whereas FS-CAD identified only 38 ions. Moreover, the analyses of proteins that were identified equally well by both methods indicate that the data from these two methods are complementary. CASI-CAD could be used as an additional tool to enrich for fragment ions from poorly fragmenting proteins and co-eluting proteins. Though both methods are based on the same CAD mechanism, the internal energies of the multiply charged protein ions at different stages in the mass spectrometer are different, and so the voltages used in the two methods cannot be compared directly.

3.4.3. Identification of co-eluting proteins

The β7 protein was not identified due to its co-elution with the β5 subunit. Apart from this pair, three other pairs of proteins (α1-α7, α2-α6, and β4-β6) co-eluted, but both proteins in each of these pairs were identified during database searching. This was possible by including both proteins in the co-eluting pair as precursor masses when performing the search. However, ProSight does not automatically remove the fragments that have already matched to one protein. Instead, it matches the complete list of fragments again with the second precursor mass, and so the e values, which are calculated based on the number of fragments matched versus the total number of fragments entered during the search, could be artificially high due to the presence of fragment ions from the co-eluting protein.

Co-eluting proteins often present a significant problem during on-line identifications21 and sometimes result in only one of the two proteins in the co-eluting pair being identified, as with the β7-β5 pair in our study. This is due to poor fragmentation of one protein in the pair, resulting in only a few fragment ion matches and a greater chance of being a random hit, as reflected by higher e values for these proteins (Table 3.1). Efforts to separate these co-eluting pairs by decreasing the rate of change of solvent B were not successful. Other changes, such as selecting a different stationary phase and/or mobile phase, or changing the conformation of the proteins by altering the salt concentration followed by on-line desalting, can be tried in order to separate the co-eluting species.

3.4.4. N-terminal characterization and PTM analyses

Data from both methods showed the N-terminus to be acetylated for 9 proteins, and the remaining 5 proteins, which were unmodified, were all proteins in the beta ring. Only the α3 subunit was found to be phosphorylated, and this was confirmed by matching the intact molecular mass obtained to the phosphorylated form of this protein reported in the database by bottom-up methods. However, due to the poor fragmentation efficiency of phosphorylated proteins in positive mode ESI, this protein was not effectively identified by either method, and the site of modification could be Tyr160, Ser242, or Ser249 as indicated in the UniProt database. The y-ions obtained for this protein, from Gln102 to Pro108, were too far from these known phosphorylation sites to locate the site of modification. Targeted studies on characterization of the h20S complex using bottom-up42 and top-down methods40 have shown only Ser249 to be phosphorylated. However, large-scale phosphopeptide profiling by ETD54 and immunoaffinity based profiling using tyrosine specific antibodies55 have shown Ser242 and Tyr160 respectively to be phosphorylated too. This variation could be due to the α3 protein originating from different cell lines in these studies,56 and the different enrichment methods used could also cause variable results.

The MaxEnt deconvolution of this protein shows the unmodified form at an intensity 5 times lower than that of the singly phosphorylated form (Figure 3.10). This correlates well with the darker 2D gel spot obtained for the form singly phosphorylated at Ser249 in a trail of three gel spots for this protein in the proteomic analyses of the h20S complex by the Monsarrat group.40, 57 The presence of the phosphorylated form of this protein at a higher intensity than the unmodified form, and also than the other phosphorylated subunits in the proteasome complex, has been observed by other groups too,58 and this phosphorylation has been suggested to stabilize the association of the 19S regulatory complex with the 20S catalytic core complex to form the 26S functional proteasome.59

3.4.5. FS-CAD followed by ion enrichment using CASI

In addition to using the dissociation methods described above, we have also enriched for the fragment ions generated by FS-CAD by choosing a mass range in the quadrupole and accumulating the fragment ions within it. Applying the CASI technique after FS-CAD selectively enriches only the fragments from the proteins in the h20S complex that fall within the 800-1200 m/z range, as opposed to scanning the entire mass window after FS-CAD. When the h20S subunits eluted from the column, the skimmer 1 voltage was increased to 100 V to induce fragmentation and the fragment ions were enriched by using the mass window in the quadrupole, but no collision energy was applied, since it would otherwise initiate further fragmentation of the fragments. An ion accumulation time of 0.2 second in the quadrupole was used for this method.

The data collected with FS-CAD alone and with enrichment of the FS-CAD fragments by CASI were compared, and we find that the latter approach produces more matching fragment ions than the former. This is shown for the α4 protein (Figure 3.11), which was identified with 50 b- and y-type ions by FS-CAD alone (Figure 3.11A), whereas FS-CAD with CASI identified this protein with a total of 62 b- and y-ions (Figure 3.11B). Enriching for the fragments by using the CASI technique increases the total number of ions exported for the search (Figure 3.11 inset), and the e values obtained are lower for the CASI enriched data, indicating a more confident identification, with a higher number of fragment ions being mapped. Similar results with higher total numbers of matching ions and lower e values were obtained for 7 other proteins in the h20S complex (highlighted in blue in Table 3.1).

We also used longer ion accumulation times, such as 0.4 and 0.6 second, to determine whether more sequence coverage could be obtained. However, the 0.4 second accumulation time did not yield a significant increase in the number of matching fragment ions, and with 0.6 second the number of matching fragments decreased (data not shown). This result is consistent with our earlier observation on longer ion accumulation times during CASI-CAD (Figure 3.7). The apparent decrease in sequence coverage at longer ion accumulation times could be due to the space-charge effect in the collision cell. A decrease in the chromatographic resolution was also noticed at longer ion accumulation times.

3.4.6. LC-MS of h19S complex

The 19S complex is composed of at least 19 different canonical subunits with theoretical MWs ranging from 8.2 kDa to 105.8 kDa (obtained from primary sequence information with the N-terminal modifications reported in the UniProt database) and is more dynamic than the 20S complex. For efficient recovery of the high MW proteins, the LC-MS of the h19S complex was performed by changing solvent B to 90/10/0.1 ACN/IPA/FA. Though IPA works efficiently in eluting high MW species from the column, higher amounts such as 20% resulted in higher back pressure and unstable spray, and so 10% IPA was used for h19S separations.

With 1 pmol of the h19S complex loaded on the monolithic column, the intact masses of the proteins were measured, including that of Rpn1 (Figure 3.12), which is one of the two largest subunits in the 19S complex. Rpn1 and Rpn2 are suggested to have MWs of 110 and 104 kDa respectively,60 and are being explored by different research groups to understand the topology of and the role played by these high MW proteins in the protein destruction machinery.61 The intact mass measurement by LC-MS yielded an average MW of 100.2392 kDa, which is 0.0017 kDa (17 ppm at a MW of 100 kDa) less than the theoretical average MW of 100.2409 kDa calculated with an N-terminal acetylation for the Rpn1 subunit as identified by bottom-up studies.42 The approximately 10 kDa difference between the suggested and experimental masses for the Rpn1 subunit cannot be explained by the PTMs, such as phosphorylations, reported in the UniProt database, and so further targeted top-down characterization is required to determine the intact masses of such high MW proteins. Recently, these proteins have been suggested to act as docking sites for proteins conjugated with ubiquitin for degradation, and to coordinate deubiquitination and movement of the unfolded protein to the catalytic core.60 Any novel PTMs identified could shed more light on the additional roles played by these high MW proteins.
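The 17 ppm figure quoted for Rpn1 follows from the measured and theoretical average masses given above. A minimal sketch of the calculation is shown below, with the masses expressed in Da; the function name `ppm_error` is illustrative.

```python
def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    measured_da = 100239.2     # measured average MW of Rpn1, from the text
    theoretical_da = 100240.9  # theoretical average MW with N-terminal acetylation
    print(f"{ppm_error(measured_da, theoretical_da):.1f} ppm")
```

The negative sign of the result indicates that the measured mass is lower than the theoretical mass, as stated in the text.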

3.5. Conclusions

The top-down on-line protein identification technique that we have presented here, using the proteasome complex in a hybrid Qq-FT-ICR instrument, can be applied to samples of medium complexity, such as identifying proteins in endogenous protein complexes. Generally, the factors that limit the number of proteins identified by on-line protein identification methods are the peak capacity of the LC column and the duty cycle of the instrument. Using a data-independent fragmentation method kept the scan cycle time of the instrument to about 0.6 second in our method. This is beneficial since the current data-dependent fragmentation methods employed in FT-ICRs and orbitraps require 2-4 fold more time to acquire high resolution data. The resulting increase in duty cycle gives more room to couple high speed and sensitive separation techniques, such as capillary electrophoresis, with high resolution FT-ICR instruments to attain the best performance. Also, with further improvement in chromatographic performance to limit the peak widths to within 10-20 seconds for proteins in the 30-50 kDa MW range, peak capacity can be increased. For samples of higher complexity, such as whole cell lysates, further up-front fractionation techniques are required.

When this method is compared to our previous method using the TOF instrument, there are several advantages. First, the mass accuracy for the intact proteins improved significantly, and all protein subunits were measured within 1 ppm deviation using the ultra-high resolution of the FT-ICR instrument. This is a significant advantage when complex cell lysates are analyzed, since improved mass accuracy makes unambiguous identifications possible during database searches. In addition, as the MW of the protein increases, there is an increased necessity to measure it more accurately. An error of 1 Da corresponds to a deviation of only 10 ppm at a MW of 100 kDa, compared to 20 ppm at a MW of 50 kDa. It therefore becomes difficult to differentiate biological changes (such as an amidation or deamidation) from instrument error at these high MWs. This is also true for measuring the high MW b- and y-ions, which extend up to hundreds of amino acids in length and are formed by cleavage of amide bonds close to the middle of the protein. Measuring these high MW fragments accurately leads to deriving more sequence tags and increases the confidence of the identification. Furthermore, in the TOF instrument only ions up to the 6+ charge state were assigned, and this limited the sequence coverage to the terminal regions of the protein. The resolution available in the FT-ICR instrument allowed the assignment of high MW fragment ions which originated from cleavages in the middle of the protein.
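The relationship between a fixed mass shift and its size in ppm can be made concrete with a short calculation. The sketch below converts a shift in Da to ppm at several molecular weights. The deamidation shift of 0.984016 Da is the standard monoisotopic value and is not quoted in the text; the function name `shift_in_ppm` is illustrative.

```python
DEAMIDATION_SHIFT_DA = 0.984016  # standard monoisotopic value; assumed, not from the text


def shift_in_ppm(delta_da, mw_da):
    """Express a mass shift in Da as parts per million of the molecular weight."""
    return delta_da / mw_da * 1e6


if __name__ == "__main__":
    for mw in (25_000, 50_000, 100_000):
        one_da = shift_in_ppm(1.0, mw)
        deamidation = shift_in_ppm(DEAMIDATION_SHIFT_DA, mw)
        print(f"{mw} Da: 1 Da = {one_da:.1f} ppm, deamidation = {deamidation:.1f} ppm")
```

The same shift in Da shrinks in ppm terms as the MW grows, so a measurement that resolves a deamidation at 50 kDa needs roughly twice the ppm accuracy to resolve it at 100 kDa.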

With new developments in on-line intact protein separation techniques, increased mass accuracy for high MW proteins, and the inclusion of highly efficient fragmentation methods on the LC timescale, top-down protein identification will become a popular choice for studying samples of higher complexity.

3.6. References

1. Kocher, T.; Pichler, P.; Swart, R.; Mechtler, K., Analysis of protein mixtures from whole-cell extracts by single-run nanoLC-MS/MS using ultralong gradients. Nat Protoc 2012, 7, (5), 882-890.

2. Di Palma, S.; Hennrich, M. L.; Heck, A. J.; Mohammed, S., Recent advances in peptide separation by multidimensional liquid chromatography for proteome analysis. J Proteomics 2012, 75, (13), 3791-3813.

3. Zhang, Z., Automated Precursor Ion Exclusion During LC-MS/MS Data Acquisition for Optimal Ion Identification. J Am Soc Mass Spectrom 2012, 23, (8), 1400-1407.

4. Schluter, H.; Apweiler, R.; Holzhutter, H. G.; Jungblut, P. R., Finding one's way in proteomics: a protein species nomenclature. Chem Cent J 2009, 3, 11.

5. Loo, J. A.; Edmonds, C. G.; Smith, R. D., Tandem mass spectrometry of very large molecules: serum albumin sequence information from multiply charged ions formed by electrospray ionization. Anal Chem 1991, 63, (21), 2488-2499.

6. Breuker, K.; Jin, M.; Han, X.; Jiang, H.; McLafferty, F. W., Top-down identification and characterization of biomolecules by mass spectrometry. J Am Soc Mass Spectrom 2008, 19, (8), 1045-1053.

7. Han, X.; Jin, M.; Breuker, K.; McLafferty, F. W., Extending top-down mass spectrometry to proteins with masses greater than 200 kilodaltons. Science 2006, 314, (5796), 109-112.

8. Lanucara, F.; Eyers, C. E., Top-down mass spectrometry for the analysis of combinatorial post-translational modifications. Mass Spectrom Rev 2012, doi: 10.1002/mas.21348.

9. Garcia, B. A.; Pesavento, J. J.; Mizzen, C. A.; Kelleher, N. L., Pervasive combinatorial modification of histone H3 in human cells. Nat Methods 2007, 4, (6), 487-489.

10. Capriotti, A. L.; Cavaliere, C.; Foglia, P.; Samperi, R.; Lagana, A., Intact protein separation by chromatographic and/or electrophoretic techniques for top-down proteomics. J Chromatogr A 2011, 1218, (49), 8760-8776.

11. Parks, B. A.; Jiang, L.; Thomas, P. M.; Wenger, C. D.; Roth, M. J.; Boyne, M. T., 2nd; Burke, P. V.; Kwast, K. E.; Kelleher, N. L., Top-down proteomics on a chromatographic time scale using linear ion trap fourier transform hybrid mass spectrometers. Anal Chem 2007, 79, (21), 7984-7991.

12. Wu, Q.; Yuan, H.; Zhang, L.; Zhang, Y., Recent advances on multidimensional liquid chromatography-mass spectrometry for proteomics: from qualitative to quantitative analysis--a review. Anal Chim Acta 2012, 731, 1-10.

13. Zhou, H.; Ning, Z.; Starr, A. E.; Abu-Farha, M.; Figeys, D., Advancements in top-down proteomics. Anal Chem 2011, 84, (2), 720-734.

14. Michalski, A.; Damoc, E.; Lange, O.; Denisov, E.; Nolting, D.; Muller, M.; Viner, R.; Schwartz, J.; Remes, P.; Belford, M.; Dunyach, J. J.; Cox, J.; Horning, S.; Mann, M.; Makarov, A., Ultra high resolution linear ion trap Orbitrap mass spectrometer (Orbitrap Elite) facilitates top down LC MS/MS and versatile peptide fragmentation modes. Mol Cell Proteomics 2011, 11, (3), O111.013698.

15. Ahlf, D. R.; Compton, P. D.; Tran, J. C.; Early, B. P.; Thomas, P. M.; Kelleher, N. L., Evaluation of the compact high-field orbitrap for top-down proteomics of human cells. J Proteome Res 2012, 11, (8), 4308-4314.

16. Chi, A.; Bai, D. L.; Geer, L. Y.; Shabanowitz, J.; Hunt, D. F., Analysis of intact proteins on a chromatographic time scale by electron transfer dissociation tandem mass spectrometry. Int J Mass Spectrom 2007, 259, (1-3), 197-203.

17. Tran, J. C.; Zamdborg, L.; Ahlf, D. R.; Lee, J. E.; Catherman, A. D.; Durbin, K. R.; Tipton, J. D.; Vellaichamy, A.; Kellie, J. F.; Li, M.; Wu, C.; Sweet, S. M.; Early, B. P.; Siuti, N.; LeDuc, R. D.; Compton, P. D.; Thomas, P. M.; Kelleher, N. L., Mapping intact protein isoforms in discovery mode using top-down proteomics. Nature 2011, 480, (7376), 254-258.

18. McLafferty, F. W.; Breuker, K.; Jin, M.; Han, X.; Infusini, G.; Jiang, H.; Kong, X.; Begley, T. P., Top-down MS, a powerful complement to the high capabilities of proteolysis proteomics. FEBS J 2007, 274, (24), 6256-6268.

19. Udeshi, N. D.; Shabanowitz, J.; Hunt, D. F.; Rose, K. L., Analysis of proteins and peptides on a chromatographic timescale by electron-transfer dissociation MS. FEBS J 2007, 274, (24), 6269-6276.

20. Loo, J. A.; Udseth, H. R.; Smith, R. D., Collisional Effects on the Charge Distribution of Ions from Large Molecules, Formed by Electrospray-ionization Mass Spectrometry. Rapid Commun Mass Spectrom 1988, 2, (10), 207-210.

21. Vellaichamy, A.; Tran, J. C.; Catherman, A. D.; Lee, J. E.; Kellie, J. F.; Sweet, S. M.; Zamdborg, L.; Thomas, P. M.; Ahlf, D. R.; Durbin, K. R.; Valaskovic, G. A.; Kelleher, N. L., Size-sorting combined with improved nanocapillary liquid chromatography-mass spectrometry for identification of intact proteins up to 80 kDa. Anal Chem 2010, 82, (4), 1234-1244.

22. Lee, J. E.; Kellie, J. F.; Tran, J. C.; Tipton, J. D.; Catherman, A. D.; Thomas, H. M.; Ahlf, D. R.; Durbin, K. R.; Vellaichamy, A.; Ntai, I.; Marshall, A. G.; Kelleher, N. L., A robust two-dimensional separation for top-down tandem mass spectrometry of the low-mass proteome. J Am Soc Mass Spectrom 2009, 20, (12), 2183-2191.

23. Scherperel, G.; Reid, G. E., Emerging methods in proteomics: top-down protein characterization by multistage tandem mass spectrometry. Analyst 2007, 132, (6), 500-506.

24. Chanthamontri, C.; Liu, J.; McLuckey, S. A., Charge State Dependent Fragmentation of Gaseous alpha-Synuclein Cations via Ion Trap and Beam-Type Collisional Activation. Int J Mass Spectrom 2009, 283, (1-3), 9-16.

25. Mekecha, T. T.; Amunugama, R.; McLuckey, S. A., Ion trap collision-induced dissociation of human hemoglobin alpha-chain cations. J Am Soc Mass Spectrom 2006, 17, (7), 923-931.

26. Karabacak, N. M.; Li, L.; Tiwari, A.; Hayward, L. J.; Hong, P.; Easterling, M. L.; Agar, J. N., Sensitive and specific identification of wild type and variant proteins from 8 to 669 kDa using top-down mass spectrometry. Mol Cell Proteomics 2009, 8, (4), 846-856.

27. Witt, M.; Fuchser, J.; Koch, B. P., Fragmentation studies of fulvic acids using collision induced dissociation fourier transform ion cyclotron resonance mass spectrometry. Anal Chem 2009, 81, (7), 2688-2694.

28. Ye, H.; Greer, T.; Li, L., Probing neuropeptide signaling at the organ and cellular domains via imaging mass spectrometry. J Proteomics 2012, 75, (16), 5014-5026.

29. Fuchser, J.; Cornett, S.; Becker, M., High Resolution Molecular Imaging of Pharmaceuticals at Therapeutic Levels. Bruker Daltonik GmbH 2008.

30. Zhong, W.; Yang, J.; Yang, X., Structural determination of an unknown degradation product in posaconazole drug product using CASI technology in a Fourier transform ion cyclotron resonance mass spectrometer. Rapid Commun Mass Spectrom 2011, 25, (23), 3650-3655.

31. Senko, M. W.; Hendrickson, C. L.; Emmett, M. R.; Shi, S. D.-H.; Marshall, A. G., External Accumulation of Ions for Enhanced Electrospray Ionization Fourier Transform Ion Cyclotron Resonance Mass Spectrometry. J Am Soc Mass Spectrom 1997, 8, (9), 970-976.

32. Marshall, A. G.; Rodgers, R. P., Petroleomics: the next grand challenge for chemical analysis. Acc Chem Res 2004, 37, (1), 53-59.

33. Leach, F. E., 3rd; Wolff, J. J.; Xiao, Z.; Ly, M.; Laremore, T. N.; Arungundram, S.; Al-Mafraji, K.; Venot, A.; Boons, G. J.; Linhardt, R. J.; Amster, I. J., Negative electron transfer dissociation Fourier transform mass spectrometry of glycosaminoglycan carbohydrates. Eur J Mass Spectrom 2011, 17, (2), 167-176.

34. Ferguson, J. T.; Wenger, C. D.; Metcalf, W. W.; Kelleher, N. L., Top-down proteomics reveals novel protein forms expressed in Methanosarcina acetivorans. J Am Soc Mass Spectrom 2009, 20, (9), 1743-1750.

35. Tipton, J. D.; Tran, J. C.; Catherman, A. D.; Ahlf, D. R.; Durbin, K. R.; Lee, J. E.; Kellie, J. F.; Kelleher, N. L.; Hendrickson, C. L.; Marshall, A. G., Nano-LC FTICR tandem mass spectrometry for top-down proteomics: routine baseline unit mass resolution of whole cell lysate proteins up to 72 kDa. Anal Chem 2012, 84, (5), 2111-2117.

36. Wong, E.; Cuervo, A. M., Integration of clearance mechanisms: the proteasome and autophagy. Cold Spring Harb Perspect Biol 2010, 2, (12), a006734.

37. Bedford, L.; Paine, S.; Sheppard, P. W.; Mayer, R. J.; Roelofs, J., Assembly, structure, and function of the 26S proteasome. Trends Cell Biol 2010, 20, (7), 391-401.

38. Lander, G. C.; Estrin, E.; Matyskiela, M. E.; Bashore, C.; Nogales, E.; Martin, A., Complete subunit architecture of the proteasome regulatory particle. Nature 2012, 482, (7384), 186-191.

39. Loo, J. A.; Berhane, B.; Kaddis, C. S.; Wooding, K. M.; Xie, Y.; Kaufman, S. L.; Chernushevich, I. V., Electrospray ionization mass spectrometry and ion mobility analysis of the 20S proteasome complex. J Am Soc Mass Spectrom 2005, 16, (7), 998-1008.

40. Uttenweiler-Joseph, S.; Claverol, S.; Sylvius, L.; Bousquet-Dubouch, M. P.; Burlet-Schiltz, O.; Monsarrat, B., Toward a full characterization of the human 20S proteasome subunits and their isoforms by a combination of proteomic approaches. Methods Mol Biol 2008, 484, 111-130.

41. Huang, L.; Burlingame, A. L., Comprehensive mass spectrometric analysis of the 20S proteasome complex. Methods Enzymol 2005, 405, 187-236.

42. Wang, X.; Chen, C. F.; Baker, P. R.; Chen, P. L.; Kaiser, P.; Huang, L., Mass spectrometric characterization of the affinity-purified human 26S proteasome complex. Biochemistry 2007, 46, (11), 3553-3565.

43. Overath, T.; Kuckelkorn, U.; Henklein, P.; Strehl, B.; Bonar, D.; Kloss, A.; Siele, D.; Kloetzel, P. M.; Janek, K., Mapping of O-GlcNAc Sites of 20 S Proteasome Subunits and Hsp90 by a Novel Biotin-Cystamine Tag. Mol Cell Proteomics 2012, 11, (8), 467-477.

44. Bousquet-Dubouch, M. P.; Fabre, B.; Monsarrat, B.; Burlet-Schiltz, O., Proteomics to study the diversity and dynamics of proteasome complexes: from fundamentals to the clinic. Expert Rev Proteomics 2011, 8, (4), 459-481.

45. Roth, M. J.; Forbes, A. J.; Boyne, M. T., 2nd; Kim, Y. B.; Robinson, D. E.; Kelleher, N. L., Precise and parallel characterization of coding polymorphisms, alternative splicing, and modifications in human proteins by mass spectrometry. Mol Cell Proteomics 2005, 4, (7), 1002-1008.

46. Leduc, R. D.; Kelleher, N. L., Using ProSight PTM and related tools for targeted protein identification and characterization with high mass accuracy tandem MS data. Curr Protoc Bioinformatics 2007, 19, 13.6.1-13.6.28.

47. van de Meent, M. H.; Eeltink, S.; de Jong, G. J., Potential of poly(styrene-co-divinylbenzene) monolithic columns for the LC-MS analysis of protein digests. Anal Bioanal Chem 2011, 399, (5), 1845-1852.

48. Mohr, J.; Swart, R.; Samonig, M.; Bohm, G.; Huber, C. G., High-efficiency nano- and micro-HPLC--high-resolution Orbitrap-MS platform for top-down proteomics. Proteomics 2010, 10, (20), 3598-3609.

49. Eeltink, S.; Wouters, B.; Desmet, G.; Ursem, M.; Blinco, D.; Kemp, G. D.; Treumann, A., High-resolution separations of protein isoforms with liquid chromatography time-of-flight mass spectrometry using polymer monolithic capillary columns. J Chromatogr A 2011, 1218, (32), 5504-5511.

50. Kellie, J. F.; Catherman, A. D.; Durbin, K. R.; Tran, J. C.; Tipton, J. D.; Norris, J. L.; Witkowski, C. E., 2nd; Thomas, P. M.; Kelleher, N. L., Robust analysis of the yeast proteome under 50 kDa by molecular-mass-based fractionation and top-down mass spectrometry. Anal Chem 2012, 84, (1), 209-215.

51. Legido-Quigley, C.; Marlin, N.; Smith, N. W., Comparison of styrene-divinylbenzene-based monoliths and Vydac nano-liquid chromatography columns for protein analysis. J Chromatogr A 2004, 1030, (1-2), 195-200.

52. Cobb, J. S.; Easterling, M. L.; Agar, J. N., Structural characterization of intact proteins is enhanced by prevalent fragmentation pathways rarely observed for peptides. J Am Soc Mass Spectrom 2010, 21, (6), 949-959.

53. Rozman, M.; Gaskell, S. J., Charge state dependent top-down characterisation using electron transfer dissociation. Rapid Commun Mass Spectrom 2011, 26, (3), 282-286.

54. Molina, H.; Horn, D. M.; Tang, N.; Mathivanan, S.; Pandey, A., Global proteomic profiling of phosphopeptides using electron transfer dissociation tandem mass spectrometry. Proc Natl Acad Sci U S A 2007, 104, (7), 2199-2204.

55. Rush, J.; Moritz, A.; Lee, K. A.; Guo, A.; Goss, V. L.; Spek, E. J.; Zhang, H.; Zha, X. M.; Polakiewicz, R. D.; Comb, M. J., Immunoaffinity profiling of tyrosine phosphorylation in cancer cells. Nat Biotechnol 2005, 23, (1), 94-101.

56. Drews, O.; Zong, C.; Ping, P., Exploring proteasome complexes by proteomic approaches. Proteomics 2007, 7, (7), 1047-1058.

57. Claverol, S.; Burlet-Schiltz, O.; Girbal-Neuhauser, E.; Gairin, J. E.; Monsarrat, B., Mapping and structural dissection of human 20 S proteasome using proteomic approaches. Mol Cell Proteomics 2002, 1, (8), 567-578.

58. Lu, H.; Zong, C.; Wang, Y.; Young, G. W.; Deng, N.; Souda, P.; Li, X.; Whitelegge, J.; Drews, O.; Yang, P. Y.; Ping, P., Revealing the dynamics of the 20 S proteasome phosphoproteome: a combined CID and electron transfer dissociation approach. Mol Cell Proteomics 2008, 7, (11), 2073-2089.

59. Bose, S.; Stratford, F. L.; Broadfoot, K. I.; Mason, G. G.; Rivett, A. J., Phosphorylation of 20S proteasome alpha subunit C8 (alpha7) stabilizes the 26S proteasome and plays a role in the regulation of proteasome complexes by gamma-interferon. Biochem J 2004, 378, (Pt 1), 177-184.

60. Rosenzweig, R.; Bronner, V.; Zhang, D.; Fushman, D.; Glickman, M. H., Rpn1 and Rpn2 coordinate ubiquitin processing factors at proteasome. J Biol Chem 2012, 287, (18), 14659-14671.

61. Lasker, K.; Forster, F.; Bohn, S.; Walzthoeni, T.; Villa, E.; Unverdorben, P.; Beck, F.; Aebersold, R.; Sali, A.; Baumeister, W., Molecular architecture of the 26S proteasome holocomplex determined by an integrative approach. Proc Natl Acad Sci U S A 2012, 109, (5), 1380-1387.

Figure 3.1. Schematic representation of the top-down protein identification using nano LC FT-ICR MS. The human 20S proteasome complex was separated into its components using a monolithic column followed by intact mass measurement using FT-ICR MS and fragmentation using either (a) FS-CAD, (b) CASI-CAD, or (c) selectively enriching the fragments generated by FS-CAD using CASI technology. The intact and fragment masses of each protein were then searched against the database to aid protein identification.

[Figure 3.2 image: top panel, simulated gel-view (m/z 750 to 1750) for peaks 1 through 11; bottom panel, TIC from 25.0 to 45.0 min with peaks labeled 1 through 11.]

Figure 3.2. TIC of h20S complex run through RP monolithic column. All the 14 protein subunits elute within 27 to 42 min in the peaks labeled 1 through 11 (bottom panel). A simulated gel-view of the protein(s) eluting in each peak is depicted in the top panel for the corresponding elution window shown in the TIC.

[Figure 3.3 image: four intact mass spectra over m/z 800 to 2000 with charge states labeled, each with an inset showing one isotopically resolved charge state. (A) Monoisotopic mass 29376.122 Da; inset m/z 865.3 to 865.7, 1/0.0295 = 34+. (B) Monoisotopic mass 24376.105 Da; inset m/z 1017.0 to 1017.4, 1/0.0416 = 24+. (C) Monoisotopic mass 26436.206 Da; inset m/z 882.5 to 882.9, 1/0.033 = 30+. (D) Monoisotopic mass 28405.987 Da; inset m/z 917.7 to 917.9, 1/0.032 = 31+.]

Figure 3.3. Intact mass spectra of h20S proteins averaged from peaks in the TIC in Figure 3.2: (A) peak 9, (B) peak 5, (C) peak 10, and (D) peak 6 in the TIC, with the inset in each panel showing the resolution obtained for the highlighted charge state in the corresponding mass spectrum.
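The insets of Figure 3.3 assign each charge state from the spacing between adjacent isotopic peaks (for example, 1/0.0416 = 24+). A minimal sketch of that arithmetic follows, assuming that neighboring isotopologues differ by about 1 Da so that they appear about 1/z apart on the m/z axis; the function name is illustrative and not part of any instrument software.

```python
def charge_from_isotope_spacing(spacing_mz: float) -> int:
    """Estimate the charge state z from the m/z spacing of adjacent isotopic peaks.

    Adjacent isotopologues differ by roughly 1 Da in mass, so their
    spacing on the m/z axis is roughly 1/z.
    """
    if spacing_mz <= 0:
        raise ValueError("spacing must be positive")
    return round(1.0 / spacing_mz)


# Spacings read from the insets of Figure 3.3.
for spacing in (0.0416, 0.0295, 0.033, 0.032):
    print(f"spacing {spacing} m/z -> {charge_from_isotope_spacing(spacing)}+")
```

The approach only works when the isotopic peaks are resolved, which is why the resolving power of the instrument limits the highest charge state that can be assigned this way.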

[Figure 3.4 image: instrument schematic with labeled components Funnel 1, Skimmer 1, Octopole, Quadrupole, and Collision Cell.]

Figure 3.4. Schematic representation of the Bruker 15 T FT-ICR MS. The declustering potential of skimmer 1 was increased to facilitate FS-CAD. During CASI-CAD, a mass range of 800-1200 m/z was chosen in the external quadrupole and all the ions within this range (shown in red) were transferred to the collision cell, where they were accumulated for a certain amount of time and dissociated by CAD.

[Figure 3.5 image: (A) mass spectrum over m/z 800 to 2000 with charge states 15+ to 24+ labeled; inset, decharged isotopic distribution over mass 29385 to 29405, monoisotopic mass 29376.122 Da. (B) m/z 1010 to 1100 region with b-type fragment ions labeled (b36, b58, and b70 through b78), giving the sequence tag M-A-C-S-V-A-G-I. (C) α4 sequence:]

1S-R-R-Y-D-S-R-T-T-I-F-S-P-E-G-R-L-Y-Q-V-E-Y-A-M-E-A-I-G-H-A-G-T-C-L-G-I-L-A-N-D-G-V-L-L-A-A-E-R-R-N-I-H-K-L-L-D-E-V-F-F-S-E-K-I-Y-K-L-N-E-D-M-A-C-S-V-A-G-I-T-S-D-A-N-V-L-T-N-E-L-R-L-I-A-Q-R-Y-L-L-Q-Y-Q-E-P-I-P-C-E-Q-L-V-T-A-L-C-D-I-K-Q-A-Y-T-Q-F-G-G-K-R-P-F-G-V-S-L-L-Y-I-G-W-D-K-H-Y-G-F-Q-L-Y-Q-S-D-P-S-G-N-Y-G-G-W-K-A-T-C-I-G-N-N-S-A-A-A-V-S-M-L-K-Q-D-Y-K-E-G-E-M-T-L-K-S-A-L-A-L-A-I-K-V-L-N-K-T-M-D-V-S-K-L-S-A-E-K-V-E-I-A-T-L-T-R-E-N-G-K-T-V-I-R-V-L-K-Q-E-V-E-Q-L-I-K-K-H-E-E-E-E-A-K-A-E-R-E-K-K-E-K-E-Q-K-E-K-D-K260

Figure 3.5. FS-CAD of α4 protein: (A) full m/z range mass spectrum obtained after FS-CAD and the inset shows the decharged isotopic distribution of α4 protein, (B) 1000-1100 m/z region zoomed in to show the sequence tag identified from mapping the fragment ions, and (C) amino acid sequence of α4 protein with the b (red) and y (green) ions mapped by database searching. The sequence tag obtained by mapping the fragments in (B) is labeled in blue in (C).

[Figure 3.6 image: three mass spectra, (A), (B), and (C), each over m/z 700 to 1200, with ubiquitin charge states labeled in (A) and (B) and b and y fragment ions labeled in (C).]

Figure 3.6. CASI-CAD of ubiquitin: (A) full m/z range mass spectrum of ubiquitin, (B) mass spectrum obtained after accumulating only the ions between 800-1200 m/z, and (C) mass spectrum obtained after applying the collision energy, with the ubiquitin fragment ions mapped.

[Figure 3.7 image: fragmentation maps of the α5 sequence for (A) 0.2 s and (B) 0.3 s ion accumulation time; both panels show the sequence below.]

M-F-L-T-R-S-E-Y-D-R-G-V-N-T-F-S-P-E-G-R-L-F-Q-V-E-Y-A-I-E-A-I-K-L-G-S-T-A-I-G-I-Q-T-S-E-G-V-C-L-A-V-E-K-R-I-T-S-P-L-M-E-P-S-S-I-E-K-I-V-E-I-D-A-H-I-G-C-A-M-S-G-L-I-A-D-A-K-T-L-I-D-K-A-R-V-E-T-Q-N-H-W-F-T-Y-N-E-T-M-T-V-E-S-V-T-Q-A-V-S-N-L-A-L-Q-F-G-E-E-D-A-D-P-G-A-M-S-R-P-F-G-V-A-L-L-F-G-G-V-D-E-K-G-P-Q-L-F-H-M-D-P-S-G-T-F-V-Q-C-D-A-R-A-I-G-S-A-S-E-G-A-Q-S-S-L-Q-E-V-Y-H-K-S-M-T-L-K-E-A-I-K-S-S-L-I-I-L-K-Q-V-M-E-E-K-L-N-A-T-N-I-E-L-A-T-V-Q-P-G-Q-N-F-H-M-F-T-K-E-E-L-E-E-V-I-K-D-I

| Ion accumulation time (sec) | Number of matching fragment ions | Total ions exported | e value |
|---|---|---|---|
| 0.2 | 46 | 154 | 2.19E-44 |
| 0.3 | 45 | 145 | 2.6E-44 |

Figure 3.7. Fragmentation map of α5 protein identified with CASI-CAD using different ion accumulation times: (A) 0.2 second and (B) 0.3 second ion accumulation time. The table in the inset provides the total number of fragment ions exported for data analysis, the number of matching fragments, and the corresponding e values for the accumulation times compared.

[Figure 3.8 image: two CASI-CAD mass spectra (relative intensity, x10^6 scale) over m/z 925 to 1175, with fragment ion peaks labeled by charge state and m/z.]

Figure 3.8. CASI-CAD mass spectra of α5 subunit obtained with different ion accumulation times. The relative intensities of fragment ions obtained at 0.2 second (top spectrum) and 0.3 second (bottom spectrum) are similar.

[Figure 3.9 image: (A) mass spectrum over m/z 700 to 1400 with charge states 23+ to 27+ labeled; inset, decharged isotopic distribution over mass 29385 to 29405, monoisotopic mass 29376.122 Da. (B) m/z 1060 to 1130 region with fragment ions labeled (b65, b75 through b77, b95 through b100, y110, and y156), giving the sequence tag Y-L-L-Q-Y. (C) α4 sequence:]

1S-R-R-Y-D-S-R-T-T-I-F-S-P-E-G-R-L-Y-Q-V-E-Y-A-M-E-A-I-G-H-A-G-T-C-L-G-I-L-A-N-D-G-V-L-L-A-A-E-R-R-N-I-H-K-L-L-D-E-V-F-F-S-E-K-I-Y-K-L-N-E-D-M-A-C-S-V-A-G-I-T-S-D-A-N-V-L-T-N-E-L-R-L-I-A-Q-R-Y-L-L-Q-Y-Q-E-P-I-P-C-E-Q-L-V-T-A-L-C-D-I-K-Q-A-Y-T-Q-F-G-G-K-R-P-F-G-V-S-L-L-Y-I-G-W-D-K-H-Y-G-F-Q-L-Y-Q-S-D-P-S-G-N-Y-G-G-W-K-A-T-C-I-G-N-N-S-A-A-A-V-S-M-L-K-Q-D-Y-K-E-G-E-M-T-L-K-S-A-L-A-L-A-I-K-V-L-N-K-T-M-D-V-S-K-L-S-A-E-K-V-E-I-A-T-L-T-R-E-N-G-K-T-V-I-R-V-L-K-Q-K-E-V-E-Q-L-I-K-K-H-E-E-E-E-A-K-A-E-R-E-K-K-E-K-E-Q-K-E-K-D-K260

Figure 3.9. CASI-CAD of α4 protein: (A) full m/z range mass spectrum obtained after CASI-CAD and the inset shows the decharged isotopic distribution of α4 protein, (B) 1060-1150 m/z region zoomed in to show the sequence tag identified from mapping the fragment ions, and (C) amino acid sequence of α4 protein with the b (red) and y (green) ions mapped by database searching. The sequence tag obtained by mapping the fragments in (B) is highlighted in a blue box in (C).

[Figure 3.10 image: deconvoluted mass spectrum (relative intensity, x10^7 scale) over mass 28320 to 28460. Labeled peaks: α3 protein, monoisotopic mass 28326.013 Da; α3 protein + [PO3]-, monoisotopic mass 28405.998 Da; mass difference between the two peaks, 79.985 Da.]

Figure 3.10. MaxEnt deconvoluted zero-charge mass spectrum of α3 protein. The singly phosphorylated form is present at 5 times the intensity of the non-phosphorylated form.
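The assignment in Figure 3.10 rests on the mass difference between the two deconvoluted peaks. As a minimal sketch of that reasoning, the snippet below compares an observed difference against a few standard monoisotopic modification shifts. The shift values are textbook numbers rather than values from this chapter, and the 0.05 Da tolerance and the function name `candidate_ptms` are assumptions chosen for illustration only.

```python
# Standard monoisotopic mass shifts (Da) for a few common modifications.
# Textbook values, assumed here; they are not taken from the chapter.
PTM_SHIFTS_DA = {
    "phosphorylation (+HPO3)": 79.96633,
    "acetylation (+C2H2O)": 42.01057,
    "oxidation (+O)": 15.99491,
    "methylation (+CH2)": 14.01565,
}


def candidate_ptms(mass_a_da: float, mass_b_da: float, tol_da: float = 0.05) -> list:
    """Return the modifications whose shift matches |mass_b - mass_a| within tol_da."""
    delta = abs(mass_b_da - mass_a_da)
    return [name for name, shift in PTM_SHIFTS_DA.items() if abs(delta - shift) <= tol_da]


# Monoisotopic masses of the two α3 forms labeled in Figure 3.10.
print(candidate_ptms(28326.013, 28405.998))
```

With these inputs only the phosphorylation entry falls inside the tolerance; a tighter or looser tolerance would be chosen according to the mass accuracy of the measurement.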

[Figure 3.11 image: fragmentation maps of the α4 sequence for (A) FS-CAD and (B) FS-CAD with CASI; both panels show the sequence below.]

1S-R-R-Y-D-S-R-T-T-I-F-S-P-E-G-R-L-Y-Q-V-E-Y-A-M-E-A-I-G-H-A-G-T-C-L-G-I-L-A-N-D-G-V-L-L-A-A-E-R-R-N-I-H-K-L-L-D-E-V-F-F-S-E-K-I-Y-K-L-N-E-D-M-A-C-S-V-A-G-I-T-S-D-A-N-V-L-T-N-E-L-R-L-I-A-Q-R-Y-L-L-Q-Y-Q-E-P-I-P-C-E-Q-L-V-T-A-L-C-D-I-K-Q-A-Y-T-Q-F-G-G-K-R-P-F-G-V-S-L-L-Y-I-G-W-D-K-H-Y-G-F-Q-L-Y-Q-S-D-P-S-G-N-Y-G-G-W-K-A-T-C-I-G-N-N-S-A-A-A-V-S-M-L-K-Q-D-Y-K-E-G-E-M-T-L-K-S-A-L-A-L-A-I-K-V-L-N-K-T-M-D-V-S-K-L-S-A-E-K-V-E-I-A-T-L-T-R-E-N-G-K-T-V-I-R-V-L-K-Q-E-V-E-Q-L-I-K-K-H-E-E-E-E-A-K-A-E-R-E-K-K-E-K-E-Q-K-E-K-D-K260

| Technique | Number of matching fragment ions | Total ions exported | e value |
|---|---|---|---|
| FS-CAD | 50 | 135 | 4.94E-47 |
| FS-CAD with CASI | 62 | 171 | 4.98E-59 |

Figure 3.11. Comparison of sequence coverage for α4 protein by FS-CAD and FS-CAD with CASI: (A) FS-CAD alone and (B) enriching for the fragments from FS-CAD by using the CASI technique. The table in the inset provides the total number of fragment ions exported for data analysis, the number of matching fragments, and the corresponding e values for both methods.

[Figure 3.12 image: mass spectrum over m/z 800 to 1500 with charge states from 93+ to 123+ labeled; inset, TIC peak between 49 and 50 min.]

Figure 3.12. Intact mass spectrum of Rpn1, a 100.2 kDa protein in the h19S complex. The mass spectrum was averaged from the peak obtained in the TIC which is shown in the inset.
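The positions of the charge states in a spectrum such as Figure 3.12 follow from the relation m/z = (M + z × m_proton)/z for protonated ions. The following is a minimal sketch of that relation, assuming a nominal neutral mass of 100200 Da (the caption gives only 100.2 kDa) and the standard proton mass; the function name is illustrative.

```python
PROTON_MASS_DA = 1.007276  # standard value for the mass of a proton


def mz_for_charge(neutral_mass_da: float, z: int) -> float:
    """m/z of an ion carrying z protons: (M + z * m_proton) / z."""
    if z <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + z * PROTON_MASS_DA) / z


# Nominal mass assumed from the '100.2 kDa' stated in the caption.
RPN1_NOMINAL_MASS_DA = 100_200.0

for z in (93, 108, 123):
    print(f"{z}+ -> m/z {mz_for_charge(RPN1_NOMINAL_MASS_DA, z):.1f}")
```

Under these assumptions the lowest and highest labeled charge states fall inside the plotted m/z window.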

| Uniprot ID | Protein subunit | Expt. MW (monoisotopic) | Modification | Number of amino acids | FS-CAD: total b & y ions | FS-CAD: e value | CASI-CAD: total b & y ions | CASI-CAD: e value | FS-CAD with CASI: total b & y ions | FS-CAD with CASI: e value |
|---|---|---|---|---|---|---|---|---|---|---|
| P25786 | α1 | 29578.8498 | +N-acetyl Met(ini) | 263 | 14 | 2.4E-04 | 17 | 1.65E-08 | 9 | 747 |
| P25787 | α2 | 25793.2227 | -Met(ini), +N-acetyl Ala1 | 233 | 38 | 4.69E-28 | 48 | 2.62E-38 | 45 | 8.64E-32 |
| P25788 | α3 | 28405.9878 | -Met(ini), +N-acetyl Ser1, +1 PO3- | 254 | 7 | 1.46 | 15 | 1.41E-05 | 17 | 3.97E-07 |
| P25789 | α4 | 29376.1224 | -Met(ini), +N-acetyl Ser1 | 260 | 50 | 4.94E-47 | 32 | 1.37E-19 | 62 | 4.98E-59 |
| P28066 | α5 | 26436.2068 | +N-acetyl Met(ini) | 241 | 53 | 4.01E-48 | 42 | 4.15E-29 | 56 | 1.99E-58 |
| P60900 | α6 | 27292.7554 | -Met(ini), +N-acetyl Ser1 | 245 | 67 | 4.18E-79 | 42 | 1.96E-42 | 62 | 1.88E-57 |
| O14818 | α7 | 27780.5607 | -Met(ini), +N-acetyl Ser1 | 247 | 39 | 9.12E-36 | 18 | 7.29E-10 | 39 | 2.35E-29 |
| P20618 | β1 | 23533.9295 | No modification | 213 | 39 | 9.13E-31 | 27 | 6.01E-16 | 34 | 1.2E-20 |
| P49721 | β2 | 22863.6804 | +N-acetyl Met(ini) | 201 | 28 | 1.25E-20 | 21 | 1.18E-09 | 33 | 1.89E-23 |
| P49720 | β3 | 22845.3970 | -Met(ini), +N-acetyl Ser1 | 204 | 63 | 1.09E-70 | 56 | 1.09E-45 | 71 | 5.67E-68 |
| P28070 | β4 | 24376.1058 | No modification | 219 | 49 | 1.75E-57 | 48 | 1.82E-50 | 57 | 7.87E-63 |
| P28074 | β5 | 22444.0922 | No modification | 204 | 40 | 1.12E-38 | 25 | 3.42E-18 | 39 | 8.09E-33 |
| P28072 | β6 | 21889.7342 | No modification | 205 | 13 | 2.25E-10 | 27 | 8.33E-21 | 19 | 2.14E-12 |
| Q99436 | β7 | 25278.7467 | No modification | 234 | Co-elutes with β5 | - | Co-elutes with β5 | - | Co-elutes with β5 | - |

Table 3.1. Summary of h20S subunits identified by FS-CAD, CASI-CAD, and FS-CAD with CASI techniques. The total number of ions matched and the corresponding e values obtained by all three fragmentation methods are given. The highest number of ions matched by FS-CAD, CASI-CAD, and FS-CAD with CASI are in order highlighted in red, green, and blue. Except the β7 subunit, all the other proteins were identified by both the methods.

| Uniprot ID | Protein subunit | Thr. MW (monoisotopic) | Expt. MW (monoisotopic) | Error (Da) | Error (ppm) |
|---|---|---|---|---|---|
| P25786 | α1 | 29578.8710 | 29578.8498 | 0.0212 | 0.717 |
| P25787 | α2 | 25793.2400 | 25793.2227 | 0.0173 | 0.672 |
| P25788 | α3 | 28406.0180 | 28405.9878 | 0.0302 | 1.064 |
| P25789 | α4 | 29376.1500 | 29376.1224 | 0.0276 | 0.940 |
| P28066 | α5 | 26436.2160 | 26436.2068 | 0.0092 | 0.349 |
| P60900 | α6 | 27292.7790 | 27292.7554 | 0.0236 | 0.865 |
| O14818 | α7 | 27780.5847 | 27780.5607 | 0.0240 | 0.865 |
| P20618 | β1 | 23533.9450 | 23533.9295 | 0.0155 | 0.660 |
| P49721 | β2 | 22863.6890 | 22863.6804 | 0.0086 | 0.377 |
| P49720 | β3 | 22844.4210 | 22844.3789 | 0.0421 | 1.842 |
| P28070 | β4 | 24376.1190 | 24376.1058 | 0.0132 | 0.543 |
| P28074 | β5 | 22444.1170 | 22444.0922 | 0.0248 | 1.106 |
| P28072 | β6 | 21889.7860 | 21889.7342 | 0.0518 | 2.368 |
| Q99436 | β7 | 25278.7900 | 25278.7467 | 0.0433 | 1.713 |
| Average | | | | 0.0252 | 1.006 |

Table 3.2. Summary of mass accuracies for the proteins in the h20S complex. Monoisotopic theoretical and experimental MWs of all the proteins in h20S complex, along with the errors in the measured mass in Da and ppm are tabulated.
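The ppm column of Table 3.2 is the absolute error scaled by the theoretical mass. A minimal sketch of that calculation is given below for three rows of the table; the function name `ppm_error` is illustrative, and small differences in the last digit relative to the tabulated values may arise if the listed masses were rounded before tabulation (an assumption, since the chapter does not say).

```python
def ppm_error(theoretical_da: float, experimental_da: float) -> float:
    """Mass error in parts per million, relative to the theoretical mass."""
    if theoretical_da <= 0:
        raise ValueError("theoretical mass must be positive")
    return (theoretical_da - experimental_da) / theoretical_da * 1e6


# (subunit, theoretical MW, experimental MW) copied from Table 3.2.
rows = [
    ("α1", 29578.8710, 29578.8498),
    ("α4", 29376.1500, 29376.1224),
    ("β6", 21889.7860, 21889.7342),
]

for name, theo, expt in rows:
    print(f"{name}: {theo - expt:.4f} Da, {ppm_error(theo, expt):.3f} ppm")
```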

CHAPTER 4

Reagents for Enhancing Protein Charging for Electrospray Ionization Mass Spectrometry

from Organic/Aqueous Solvent Systems

4.1. Abstract

Reagents that increase charge from acidic organic/aqueous solvents (e.g., m-nitrobenzyl alcohol) and from neutral pH aqueous solutions (e.g., sulfolane, m-nitrophenyl ethanol) have been reported by different groups, including ours. Here, we demonstrate that reagents that enhance charge for proteins in aqueous solutions (at native pH) can also increase charge in solutions that tend to denature proteins (low pH/organic solvents). Some compounds that are ineffective at neutral pH (e.g., m-nitrophenol) can increase charging from denaturing solutions. The extent of charge-shifting is protein-dependent, as some proteins do not exhibit the same proportion of increased charging as other, similarly sized proteins. The increase in charge upon adding supercharging reagents was higher for disulfide bonded proteins than for high MW proteins. Effective charging is important for both classes, since it is difficult to add more charges due to the compactly folded nature of the former and limitations in the MW for the latter. An increase in the number of charges acquired moves the charge state distribution of these proteins into the lower mass/charge region, where it can be effectively analyzed. Furthermore, we tested the potential of these reagents to increase the number of charges on peptides and found that peptides that lack secondary structure can also be supercharged. Our analysis of these reagents with different classes of proteins and peptides under denaturing solvent conditions helps in understanding some of the mechanisms behind supercharging and yielded observations that encourage further research in this area.

4.2. Introduction

The hallmark of electrospray ionization (ESI) as a technique to generate and introduce gas phase molecules into a mass spectrometer is its capability for multiple charging. Before his landmark ESI-MS study of proteins, John Fenn had noted in a 1985 publication that small peptides such as gramicidin S (MW 1141) and substance P (MW 1347) yielded abundant doubly charged ions, and that if they were able to determine how multiple charging occurs, then it would “extend substantially the effective mass range, not only of the (ESI) source but of available mass analyzers as well.”1 At nearly the same time, similar observations from an atmospheric pressure ionization source were noted by researchers in the former Soviet Union.2 Fenn also predicted that the efficiency of tandem mass spectrometry (MS/MS) would benefit from multiple charging,3 and this prediction was later shown to be correct, as sequence-informative product ions could be derived from peptides and large proteins by collisionally activated dissociation (CAD) and other activation/dissociation methods.4

There has been a large body of work investigating the factors that govern the extent of ESI multiple charging for a given biomolecule.5 The roles of solvent properties, solution pH, peptide/protein sequence, and size have been explored. However, Iavarone and Williams showed that multiple charging of peptides and proteins can be enhanced with compounds such as m-nitrobenzyl alcohol (m-NBA) in solutions containing a high proportion of organic solvents and at low pH, i.e., solvent systems that are typical for ESI-MS to promote high sensitivity analysis.6 Marshall’s group reported a similar “supercharging” effect for proteins in low pH, organic/aqueous solutions containing small amounts of DMSO and DMF.7 Very recently, Douglass and Venter reported on the supercharging ability of sulfolane in a denaturing environment using 50/50/0.1 methanol/water/formic acid.8

Our laboratory first reported the supercharging effect by m-NBA in aqueous solutions (i.e., without organic solvent) at near neutral pH for the measurement of non-covalently bound protein complexes.9 Native protein complexes can remain intact, albeit with increased charging, for their measurement by ESI-MS. Newly discovered reagents, such as m-nitrophenyl ethanol, sulfolane (tetramethylene sulfone), and m-(trifluoromethyl)-benzyl alcohol, were subsequently reported by our laboratory to enhance protein charge for ESI-MS from aqueous neutral pH solution; these reagents promote charging to an equal or higher level than that observed for m-NBA.10 Regardless of the mechanism by which these compounds enhance ESI charging (vide infra), we demonstrate that sulfolane and all other reagents that enhance protein charge from neutral pH solutions have similar effects in solutions containing organic solvents and/or low pH. We also report that some compounds that do not supercharge native proteins (i.e., at neutral pH) can increase charge in denaturing solutions. Moreover, we have investigated different classes of proteins such as disulfide bonded proteins, high MW proteins, and membrane proteins, and found that the extent of enhanced charging is protein dependent, as some proteins do not exhibit the same proportion of increased charging as other similarly sized proteins. Finally, we compared the maximum charge (highest observed charge state) and average charge attained by a model protein, lysozyme, upon adding different supercharging reagents, to find the lowest concentration required to obtain the supercharging effect.

4.3. Experimental section

4.3.1. Sample preparation

Bovine proteins (insulin, β-lactoglobulin A, α-lactalbumin, ribonuclease A, ribonuclease B, and albumin), chicken lysozyme, human transferrin, antibody IgG1 kappa murine myeloma (IgG1), bradykinin, and substance P were purchased from Sigma-Aldrich (St. Louis, MO), as were m-NBA, m-nitrophenyl ethanol, nitrobenzene, m-nitrobenzonitrile, m-nitrophenol, and sulfolane.

Protein samples were diluted to a final protein concentration of approximately 5-10 µM for ESI-MS measurements in the solvent systems described in the text, and the pH was adjusted by the addition of formic acid (FA). The IgG1 sample was desalted and concentrated using centrifugal filter devices; its final concentration was 6 µM in ACN/H2O/FA 50/49.9/0.1. Bacteriorhodopsin from the purple membrane of Halobacterium salinarum (from Prof. Jim Bowie’s lab) was isolated and precipitated by the methanol-chloroform precipitation protocol described by Whitelegge et al.11 The precipitated protein was resuspended in a low volume of 100% FA, and the FA was then diluted 25-fold using the solvents described in the text. Increased ESI-generated charge states for peptides and proteins were obtained by adding the reagents to analyte solutions either directly (for reagents that were liquids at room temperature) or after dissolving them in the solvents described in the text (for reagents that were solids).

4.3.2. Mass spectrometry

Positive ion ESI mass spectra were acquired with three different systems: an ESI-TOF mass spectrometer with a dual electrospray source (Agilent ESI-TOF MS, Santa Clara, CA), an LTQ linear ion trap system (Thermo Fisher Scientific, Waltham, MA) with a nanoESI source (New Objective, Woburn, MA), and a hybrid quadrupole time-of-flight (QTOF) mass spectrometer with a Triwave ion mobility (IM) separator (QTOF/IM; Waters Synapt HDMS, Manchester, UK). The nanoESI source, using borosilicate glass capillaries with Au/Pd coatings (Proxeon Biosystems, Odense, Denmark), was operated at low analyte flow conditions (50 nL/min). For a given comparison, instrument settings were held constant as the reagent concentration was varied, to ensure that changes to the stability and charging of protein complexes may properly be attributed to changes in solution conditions.


4.3.3. Calculation of average and maximum charges

Charge state distribution (CSD) refers to the observed set of charges acquired by a protein during ESI-MS. From the CSD, the average charge was calculated using the formula:

$$q_{\mathrm{avg}} = \frac{\sum_{i=1}^{N} q_i I_i}{\sum_{i=1}^{N} I_i}$$

where $q_i$ is the charge of the i-th charge state, $I_i$ is the intensity of the i-th charge state, and $N$ is the number of measured charge states in a mass spectrum. Average charge can be simply correlated to the charge state with the highest intensity in the CSD obtained for a given protein. When comparing the supercharging effect, the maximum charge state, i.e., the highest observed charge state, was also taken into consideration. While the average charge gives an estimate of the overall charge distribution upon supercharging, the maximum charge state indicates the highest number of charges that can be attained without limitations due to electrostatic repulsion from the existing charges.
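The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used to process the spectra in this work: the function name `charge_statistics` and the example intensities are hypothetical, and the CSD is assumed to be available as a mapping from charge state to peak intensity.

```python
def charge_statistics(csd):
    """Return (average charge, maximum charge) for a charge state distribution.

    csd maps each observed charge state q_i to its intensity I_i.
    Charge states with zero intensity are treated as not observed.
    """
    observed = {q: i for q, i in csd.items() if i > 0}
    if not observed:
        raise ValueError("no charge states with non-zero intensity")
    total_intensity = sum(observed.values())
    # Intensity-weighted mean charge: sum(q_i * I_i) / sum(I_i)
    average = sum(q * i for q, i in observed.items()) / total_intensity
    # Maximum charge: highest charge state that was actually observed
    return average, max(observed)


# Hypothetical intensities, for illustration only (not measured data).
example_csd = {4: 20.0, 5: 50.0, 6: 25.0, 7: 5.0}
average_charge, maximum_charge = charge_statistics(example_csd)
```

Because the average is intensity weighted, it tracks the most abundant charge states, whereas the maximum charge depends only on whether a charge state is detected at all.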

4.4. Results and discussion

4.4.1. Reagents for supercharging under denaturing conditions

In our previous work, we reported the following 9 reagents that increased ESI positive charging for our test protein (myoglobin) when added to the aqueous analyte solution buffered to pH 6.8 with ammonium acetate: benzyl alcohol, m-nitroacetophenone, m-nitrobenzonitrile, o-NBA, m-NBA, p-NBA, m-nitrophenyl ethanol, sulfolane, and m-(trifluoromethyl)-benzyl alcohol.10

Based on the average charge state, sulfolane displayed a greater charge increase (61%) than m-NBA (21%) for myoglobin in aqueous solutions.10

These new reagents also increased protein charge from denaturing solutions, i.e., solutions containing a high proportion of organic solvent and/or low pH that tend to induce protein unfolding. Chicken lysozyme (14.3 kDa) and insulin (5.8 kDa) were used as test proteins to measure the efficiency of supercharging reagents under denaturing conditions. m-Nitrobenzonitrile (22 mM) and sulfolane (300 mM) potently supercharged insulin when sprayed from the denaturing solvent system, ACN/H2O/FA 50/49.9/0.1 (v/v/v), shifting insulin’s average charge from 4.4 to 5.4 and 5.7, respectively (Figure 4.1). m-NBA (1%) has a similar effect, shifting insulin’s average charge to 5.8, and the maximum charge observed is 7+ in the presence of any of these supercharging reagents (Table 4.1). This is a substantial increase in charging for a protein as small as insulin, which has three disulfide bonds.

In the presence of the above discussed denaturing solvent system (ACN/H2O/FA 50/49.9/0.1), 50 mM sulfolane potently supercharges another disulfide-intact protein, lysozyme, shifting the average charge from 11.1 to 14.4 (Figure 4.2A and B). An equivalent supercharging effect was observed when 50 mM m-nitrophenyl ethanol was used as the supercharging reagent (average charge 14.1) (Figure 4.2C). Also, the maximum number of charges obtained by lysozyme increases from 15+ to 19+ in the presence of 100 mM sulfolane (Table 4.1), which happens to match the number of basic groups in lysozyme (11 Arg, 6 Lys, 1 His, and the N-terminus). Earlier studies have shown that in small proteins the number of basic residues can be directly correlated to the maximum charge state observed, whereas in disulfide bonded proteins the maximum charge state observed is less than the number of basic groups due to the higher order structure imposed by the disulfide bridges.12 Loo et al. have shown that under denaturing conditions, 14+ is the most highly charged species observed for disulfide-intact lysozyme and that it increases dramatically to 20+ upon cleaving the cysteine-cysteine bonds using DTT.13

Supercharging lysozyme increases the maximum charge to an extent formerly attainable by reducing the disulfides. This is advantageous since disulfide containing proteins can be characterized without the necessity to reduce with DTT and it also removes the additional steps required to alkylate the protein to prevent disulfide bond reformation.

Reducing the disulfide linkages generally allows more multiple charging to be observed by ESI-MS. To see if more charges can be added, the four disulfide bonds in lysozyme were cleaved by adding 10 mM DTT. The increase in charge is less dramatic for the disulfide-reduced form of lysozyme, as the average charging without and with sulfolane is fairly similar (16.5 and 18.0, respectively) (Figure 4.3A and B) and a maximum charge of 22+ is observed for disulfide-reduced lysozyme regardless of the addition of sulfolane. Supercharging the reduced form of lysozyme would not be expected to generate a substantial gain in ESI charging, as the maximum charge observed for reduced lysozyme is beyond the number of basic sites.

Smith et al. have tabulated the increase in ESI charging beyond the number of basic sites in disulfide-reduced proteins such as proinsulin, ribonuclease A, interferon-1α, and lysozyme from studies by different groups.12 In addition, Loo et al. have reported that the maximum ESI charge of 20+ for reduced lysozyme was more than the number of basic sites.13 They attributed this to the presence of glutamine residues, which have a gas-phase proton affinity similar to that of lysine, as suggested by Bojesen.14 Lysozyme has three glutamine residues and therefore has a total of 22 sites (19 basic + 3 glutamine residues) where the charge can potentially reside upon extending the conformation by disulfide reduction.

4.4.2. Compounds effective only in denaturing conditions

Some compounds that do not increase ESI charging from aqueous neutral pH solutions increase charge from denaturing solutions. Previously, we reported that 20 mM m-nitrophenol appeared to denature holo-myoglobin in aqueous solutions to generate an ion series for the apo-myoglobin form.10 However, m-nitrophenol increased charging for lysozyme under denaturing conditions. The average charge state increases to 14.1 upon adding 50 mM m-nitrophenol (Figure 4.4B), compared with an average charge of 11.1 for lysozyme sprayed by itself (Figure 4.4A). To make an effective comparison under the same instrumental parameters, we sprayed lysozyme in a native solvent, 10 mM ammonium acetate. Figure 4.4C and 4.4D display the CSDs of lysozyme in native conditions without and with 10 mM m-nitrophenol, respectively. A concentration of 10 mM was chosen because of the solubility limit of this reagent in aqueous buffers. The apparent lack of change in the CSD of lysozyme when m-nitrophenol is added under native solution conditions can be attributed to the low solubility of the reagent in aqueous buffers, and it confirms our previous results obtained in native conditions.10

We also tested the supercharging ability of m-nitrophenol under denaturing conditions with a smaller protein, insulin, and found that insulin showed increased charging with 20 mM m-nitrophenol (5.3 average charge) (Figure 4.1E). Another reagent that is effective only under denaturing conditions is nitrobenzene. Nitrobenzene additions to 0.5% v/v did not alter protein charge state distributions from aqueous solutions.10 At 0.9% v/v (88 mM), close to the miscibility limit in water, unresolved peaks tailed to higher m/z, suggesting adduct binding and/or incomplete desolvation.10 In ACN/H2O/FA 50/49.9/0.1, however, increased charging of insulin with 1% nitrobenzene is evident, as it increases the average charge to 5.1 (Figure 4.1F).

4.4.3. Effect of different solvents in the supercharging efficiency of m-nitrophenol

m-Nitrophenol was found to supercharge in denaturing conditions, but not in native solvents. To further investigate the effects of solvent conditions on supercharging, lysozyme was supercharged with m-nitrophenol in each of the following solvent systems: ACN/H2O/FA 50/49.9/0.1, ACN/water 50/50, methanol/H2O/FA 50/49.9/0.1, methanol/H2O 50/50, H2O/FA 99.9/0.1, and aqueous 10 mM ammonium acetate buffer. Lysozyme (5 µM) was diluted in each of the solvents separately and tested with and without 10 mM m-nitrophenol, which was dissolved in the same diluent as the protein. A concentration of 10 mM m-nitrophenol was chosen because of the limited solubility of the reagent in aqueous buffers above this concentration, and this concentration was therefore used across all solvent systems studied.


Figure 4.5A-L shows the mass spectra acquired from testing lysozyme with m-nitrophenol in different solvent conditions. Higher charge state ions were observed upon m-nitrophenol addition to disulfide-intact lysozyme sprayed from ACN/H2O/FA 50/49.9/0.1 and ACN/water 50/50, but not from methanol/H2O/FA 50/49.9/0.1, methanol/H2O 50/50, H2O/FA 99.9/0.1, and aqueous 10 mM ammonium acetate buffer. With ACN/H2O/FA 50/49.9/0.1, the largest charge shift among all the solvent systems tested is achieved (the average charge shifted from 11.3 to 13.6) (Figure 4.5A and B). When the FA is omitted from this solvent mixture, there is still an appreciable shift in average charge (from 10.6 to 11.9) (Figure 4.5C and D). With 50/49.9/0.1 methanol/water/FA, the average charge increases only from 9.6 to 10.4 (Figure 4.5E and F), and in 50/50 methanol/water the average charge remains the same upon addition of m-nitrophenol (Figure 4.5G and H). In 99.9/0.1 water/FA, m-nitrophenol did not show any supercharging effect, as indicated by the unchanged number of charges (Figure 4.5I and J). Finally, with 10 mM ammonium acetate buffer, there was no supercharging, confirming our previous results that this reagent does not supercharge under aqueous conditions (Figure 4.5K and L).

This behavior contrasts with that of sulfolane and m-NBA, i.e., reagents which increased lysozyme charging not only from ACN-containing solvents, but also from 99.9/0.1 water/FA. A key difference between m-NBA and m-nitrophenol is that the former reagent is both a very weak base and very weak acid (pKa=14.9),15 whereas the latter reagent is acidic (pKa=8.4).16 We predict that the supercharging of m-nitrophenol could be attenuated in aqueous solutions by ion-pairing, i.e., nitrophenolate anion binding to protonated sites on the protein. Such complexes decrease overall charge when present and when collisionally dislodged as a neutral species. This is explained in detail below with the observations from the multiple charging in different solvent conditions.


In aqueous solvents, m-nitrophenol dissociates into the m-nitrophenolate anion, which shares a proton with the acidic amino acids in the protein; when desolvation occurs during ESI, the m-nitrophenolate anion may take a proton with it and leave as a neutral, and hence there is no supercharging. In solvents containing a weak base like acetonitrile (50/49.9/0.1 ACN/water/FA), m-nitrophenol dissociates less and donates the proton to the protein, which results in supercharging. To test this hypothesis, the supercharging efficiency was measured in methanol-containing solvents (50/49.9/0.1 methanol/water/FA and 50/50 methanol/water), and the absent or negligible supercharging in these solvents supports it. Also, as a control, 99.9/0.1 water/FA was tested and we found that there is no supercharging in this solvent condition, which indicates that this effect is independent of the pH. m-Nitrophenol does supercharge to some extent in 50/50 ACN/water, which again indicates the necessity of a weak base like acetonitrile. This also explains the individual roles of ACN and formic acid in supercharging. When 50/50 ACN/water and 50/49.9/0.1 ACN/water/FA are compared, the latter solvent has a large excess of protons that can be donated to the protein, and hence it supercharges more than 50/50 ACN/water alone.

4.4.4. Supercharging disulfide bonded proteins

β-lactoglobulin A (18.3 kDa) and α-lactalbumin (14.1 kDa) are disulfide bonded proteins prevalent in milk, with two and four disulfide bonds, respectively. The isoelectric points of these proteins are equivalent (pI=4.8), and both are considered acidic proteins. This is in direct contrast to lysozyme, which is a basic protein with a pI of 9.32 but shares 40% sequence homology with α-lactalbumin and has a similar three dimensional structure.17 As discussed earlier for lysozyme, the ESI charging of disulfide bonded proteins is reduced due to the spatial hindrance from the compactly folded structure and electrostatic repulsion from the existing protonated sites.


We wanted to explore if more charges can be added to these acidic disulfide bonded proteins by supercharging. Figure 4.6A and B are the ESI mass spectra of β-lactoglobulin A without and with 200 mM sulfolane obtained under denaturing conditions with ACN/H2O/FA 50/49.9/0.1. The average charge increases from 15.3 to 19.5 and the maximum charge attained increases from 19+ to 23+ upon addition of sulfolane (Table 4.1). It is important to note that β-lactoglobulin A has 21 basic sites and a maximum of 19+ charges has been reported for the disulfide-intact form of this protein.13 Figure 4.6C and D show the mass spectra of α-lactalbumin without and with 200 mM sulfolane, with the average charge increasing from 9.3 to 13.5. Upon sulfolane addition, the maximum charge increases from 13+ to 17+, which is equivalent to the total number of basic sites in α-lactalbumin.

While the maximum charge obtained is more than the number of basic sites in β-lactoglobulin A only, α-lactalbumin shows a greater increase (45.2%) than β-lactoglobulin A (27.4%) when the number of average charges acquired is compared. In spite of their extremely different pI values, lysozyme and α-lactalbumin both show a significant amount of supercharging, and this suggests that the supercharging effect in compactly folded proteins is independent of the overall charge of the protein. We also tested if other acidic proteins that do not have any disulfide bonds, such as calmodulin (16.7 kDa, pI = 4.09) and actin (42.1 kDa, pI = 5.23), can be charge enhanced, since this will bring to light the supercharging ability of acidic proteins without the influence of the disulfide bonded architecture. Calmodulin and actin did not show any substantial increase in the average and maximum charges acquired in the presence of 200 mM sulfolane (Table 4.1 and Supplemental Figure 4.1 and 4.2).

We also investigated the ability to add more charges to two other disulfide bonded proteins, ribonuclease A (RNase A) and ribonuclease B (RNase B). These two proteins have exactly the same amino acid composition, but only RNase B is glycosylated by attachment of N-linked oligosaccharides to Asn34. Five predominant types of glycosylation have been reported in RNase B, which differ in the number of mannose residues attached to the chitobiose core (Man5GlcNAc2, Man6GlcNAc2, Man7GlcNAc2, Man8GlcNAc2 and Man9GlcNAc2).18 The presence of these glycan groups imparts negative charges on the surface of RNase B and we were especially interested to find if the sugar residues impact supercharging. Figure 4.7A shows the ESI mass spectrum of RNase A sprayed from denaturing solvents; the average charge increases from 11.2 to 15 upon addition of 200 mM sulfolane (Figure 4.7B), which corresponds to a 33.9% increase. RNase B, on the other hand, is not as efficiently supercharged under the same conditions, the average charge being shifted from 11.2 to 13.7, which is a 22.3% increase (Figure 4.7C and D). It should also be noted that due to the heterogeneity in the number of mannose residues present, the signal from the protein is distributed over a wide range of m/z. This less dramatic shift in the number of charges gained can be explained by the bulky glycan groups present on the surface of the protein, which may cause spatial hindrance and also impart negative charge to the surface of the protein, decreasing the likelihood of picking up more charges.
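The percentage increases quoted in this section follow directly from the average charges before and after sulfolane addition. A minimal sketch of that arithmetic is given below; the function name `percent_increase` is hypothetical, and because the inputs are the rounded average charges quoted in the text, the recomputed values may differ from the reported ones in the last digit.

```python
def percent_increase(before, after):
    """Relative increase in average charge, expressed in percent."""
    if before <= 0:
        raise ValueError("average charge before supercharging must be positive")
    return 100.0 * (after - before) / before


# Average charges (without, with 200 mM sulfolane) as quoted in the text.
reported_average_charges = {
    "beta-lactoglobulin A": (15.3, 19.5),
    "alpha-lactalbumin": (9.3, 13.5),
    "RNase A": (11.2, 15.0),
    "RNase B": (11.2, 13.7),
}

increases = {
    protein: percent_increase(before, after)
    for protein, (before, after) in reported_average_charges.items()
}

for protein, value in increases.items():
    print(f"{protein}: {value:.1f}% increase in average charge")
```

Expressing the shift relative to the starting average charge makes proteins with different baseline charging comparable, which is how α-lactalbumin and β-lactoglobulin A, or RNase A and RNase B, are ranked in the discussion above.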

Supercharging reagents can also be used in the negative mode to abstract protons from the protein, allowing it to ionize with more negative charges. Sulfolane is both a very weak base and a weak acid, and has been reported to supercharge oligosaccharides in the negative mode.19 We wanted to see if the negative charges on the glycans in RNase B can be supercharged in the negative mode. RNase A and RNase B were diluted in ACN/H2O 50/50 containing 10 mM ammonium acetate to facilitate charging in the negative mode. In RNase A, the base peak shifted by one charge state from 6- to 7- (Figure 4.7E and F), while there was no change in the base peak in RNase B upon supercharging (Figure 4.7G and H).

For RNase A, which does not have any glycans to be ionized in the negative mode, our results are consistent with the results published recently by Douglass and Venter.8 These authors supercharged cytochrome c with 100 mM sulfolane in the positive and negative modes with methanol/H2O (96/4) as solvent. While the positive mode showed a dramatic increase in both average and maximum charges, there was no supercharging in the negative mode.8 On the other hand, the observation that RNase B does not supercharge in the negative mode in spite of the presence of glycans is in contrast to the report by Huang et al., who showed that heparan sulfate oligosaccharides can be supercharged with sulfolane.19 While heparan sulfate is a linear acidic polysaccharide, RNase B is a polypeptide with both acidic and basic groups. More experiments are required to determine if supercharging in acidic proteins and glycosylated proteins is compromised due to the presence of competing acidic and basic groups.

4.4.5. Supercharging high MW proteins

Supercharging high MW proteins is advantageous since the increased charging shifts the mass spectrum into the lower m/z region from the typical higher m/z area where most proteins with a significant size ionize. We explored the multiple charging effects of high MW proteins such as bovine serum albumin (BSA, 66.5 kDa), transferrin (78 kDa), and IgG1 (148.5 kDa) by adding supercharging reagents. This will also help with answering the underlying question if high MW proteins can be charge shifted, since the charge density and size of the molecule play important roles in the supercharging phenomenon. 5-10 µM concentration of these proteins in ACN/H2O/FA 50/49.9/0.1 was analyzed with and without the addition of reagents.
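The shift to lower m/z follows from the relationship between neutral mass, charge, and m/z for protonated ions. The sketch below illustrates this for a protein of nominal mass 78 kDa (the value quoted above for transferrin); the function name `mz_of_protonated_ion` and the two charge states are hypothetical choices made only for illustration, and the ions are assumed to carry protons as the sole charge carriers.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz_of_protonated_ion(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion for a neutral mass in Da and a positive integer charge."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


nominal_mass = 78000.0  # Da, nominal mass quoted for transferrin

# Illustrative charge states only; they are not taken from the measured spectra.
mz_lower_charge = mz_of_protonated_ion(nominal_mass, 40)
mz_higher_charge = mz_of_protonated_ion(nominal_mass, 50)

print(f"40+: m/z {mz_lower_charge:.1f}")
print(f"50+: m/z {mz_higher_charge:.1f}")
```

Since m/z scales approximately as M/z, every additional charge moves the ion to lower m/z, and the effect is largest in absolute terms for high MW proteins whose charge states otherwise fall at high m/z.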

Figure 4.8A-F shows the MS profiles of transferrin with increasing concentrations of sulfolane ranging from 25 mM to 200 mM. The average charge increases from 38 without any sulfolane to 43.4 at 75 mM sulfolane, with intermediate concentrations showing a proportional increase in average charge state (Table 4.2). Further increases in the amount of sulfolane to 100 mM and 200 mM did shift the charge state envelope of transferrin, but the protein-sulfolane adducts also increase and the activation energy applied is not enough to dissociate the adducts. Higher resolution is required to assign the charge state distributions obtained at high concentrations of the reagent. We also noticed that the product ions formed by applying higher voltages in the nozzle-skimmer region can be supercharged when a higher concentration of sulfolane is added. The b-ions obtained are at 6+ and 7+ charge states without any reagent (Figure 4.9A), but addition of 200 mM sulfolane yields the same b-ions at 8+ and 9+ charges (Figure 4.9B), which is very useful in subsequent MS techniques to obtain more information.

Transferrin also showed similar supercharging effects with m-nitrophenyl ethanol in denaturing conditions (Table 4.2). Previously, Marshall’s group has shown charge increases for transferrin eluting from a reversed phase column by adding 5% DMSO v/v to the LC solvent, and they reported a dramatic shift in the highest charge state from 67+ to 77+.7 The maximum charge attained by transferrin (52+) in our experiments was lower than in their LC results, but this can be attributed to the high percentage of ACN present when transferrin elutes from the column. Moreover, regardless of the efficiency of supercharging, DMSO has additional conformational effects and proteins can be denatured at high DMSO concentrations.20

BSA is similar in size to transferrin, and addition of 200 mM sulfolane shifted the average charge of BSA from 51 to 59 (Figure 4.10), while the maximum charge increased from 68+ to 73+. With 22.5 mM m-nitrophenyl ethanol, similar results were obtained (Table 4.2).

IgG1 is almost twice the size of transferrin and we were interested to find if such large proteins can be charge enhanced. Mass spectra were obtained from desalted IgG1 samples by adding different supercharging reagents, since with sizeable proteins, it is necessary to try different reagents to make sure that the limited supercharging is not due to the reagent itself.


The average charge without any reagent was 50.7 and it shifted to 60.9 with the addition of m-nitrophenyl ethanol (Figure 4.11). m-NBA and sulfolane also increased the average charge, to 64.2 and 62.7, respectively (Table 4.2). Increasing the charge on large proteins such as antibodies helps with top-down characterization by utilizing the charge increase in the precursor and subsequent product ions. It will be interesting to compare the sequence coverage obtained for disulfide-intact IgG1 with and without supercharging. The size and disulfide bonded architecture of such proteins create a considerable hindrance to fragmenting them, and if supercharging can overcome this hindrance without the need to reduce the disulfide bonds, larger proteins can be characterized intact.

4.4.6. Supercharging membrane proteins

Membrane protein analysis is challenging due to the precipitation/aggregation issues encountered during sample preparation for proteomic investigations.21 The hydrophobic stretches of amino acids referred to as the ‘transmembrane domains’ require special solvent conditions to keep the protein soluble for analysis by mass spectrometry.22 We were interested to find if the membrane protein can be solubilized by adding supercharging reagents, since this will eliminate the requirement of special solvents for ESI-MS analysis. In addition, we wanted to explore the supercharging potential of membrane proteins since this will help with their characterization by analyzing the higher charge states formed during supercharging.

Bacteriorhodopsin (BR) is a classic example of an integral membrane protein that constitutes 75% of the membrane of the purple bacterium Halobacterium salinarum. Different groups have characterized this protein by precipitating it out of the membrane.23-26 After precipitation, the rhodopsin moiety falls off from BR and the apo-protein is termed bacterioopsin (BO). We precipitated BO from the membrane using the methanol-chloroform protocol11 and resolubilized the protein using a small amount of neat FA, followed by diluting the FA about 25-fold by adding water. Aliquots were taken from this preparation and different concentrations of sulfolane ranging from 300 mM to 3 M were added to find if adding the supercharging reagent yields a signal from BO under conditions that are devoid of denaturing solvents. Addition of 300 mM sulfolane to BO dissolved in water gave only a negligible amount of signal in ESI-MS, and the signal intensity and CSD of BO were the same with and without sulfolane (Figure 4.12). There was no improvement in signal intensity with increasing concentrations of sulfolane. This can be attributed to the low solubility of BO in a polar solvent such as water, which the addition of sulfolane did not improve. There is an older report on utilizing sulfolane as a solubilizing agent for hydrophobic proteins such as zein and globins, in which the researchers employed circular dichroism to show that sulfolane is an effective solubilizing agent for water-insoluble proteins.27 They also demonstrated various properties of sulfolane, such as its stability in aqueous solutions, its lack of chemical modification of proteins in comparison to other denaturants like urea, and its compatibility with electrophoretic and chromatographic techniques.27 Though our results contradict their findings, similar proteins and experiment types have to be compared before drawing any further conclusions.

To confirm that the solubility of BO is low in the absence of denaturing solvents, BO was resuspended in ACN/H2O 50/50 (this step was done after solubilizing the precipitated protein pellet in neat FA as in the earlier experiment). In the presence of 50% ACN, the ESI-MS signal was much better and the maximum charge attained was 24+, with the average charge being 16.9 (Figure 4.13A). When 300 mM sulfolane was added to BO in 50% ACN, the CSD narrowed to a few predominant charge states. The maximum charge attained did not increase and the average charge was 18.1 (Figure 4.13B).

The narrowing of the CSD upon addition of sulfolane contrasts with the results obtained from all other proteins. To investigate this observation further, different denaturing solvent conditions were tested to find out if it was related to the nature of the solvent. BO was precipitated and solubilized in neat FA by the above-mentioned procedure and diluted in each of the following solvents: chloroform/methanol/1% FA (4/4/1), ACN/H2O 80/20, and isopropanol/H2O 50/50. In chloroform/methanol/1% FA (4/4/1), the classical solvent for membrane protein analysis, the maximum and average charges decreased in the presence of 300 mM sulfolane (Figure 4.14 and Table 4.2). All the other solvent conditions tested also showed similar results (Table 4.2), and the general observation was that adding sulfolane led to charge state coalescence, which narrows the signal to a few predominant charge states. Recently, Meyer and Komives reported a similar effect for peptides when DMSO was added as a supercharging reagent.28 They proposed that DMSO improves spray stability and desolvation efficiency, which might concentrate the total signal into a few predominant charge states.

In our experiments on adding sulfolane to other classes of proteins in denaturing and native conditions, we did not notice this kind of effect. To test if this effect is specific to hydrophobic proteins such as BO, we tested BO with m-NBA. Addition of 0.5% m-NBA did not narrow the CSD of BO. The maximum charge and average charge obtained remained the same after adding m-NBA (Table 4.2). These results suggest that the charge state coalescence effect is specific to the combination of BO and sulfolane, and is independent of the choice of denaturing solvent.

4.4.7. Supercharging peptides

Previously, Jensen’s group has explored the supercharging effects of m-NBA on tryptic peptides and phosphopeptides, and they have shown that the predominant charge increases for peptides upon adding 0.1% m-NBA to the LC solvent during reversed phase separations.29 We chose to explore this multiple charging phenomenon of peptides by testing a few standard peptides with supercharging compounds, since this will answer basic questions such as conformational effects during supercharging.

Reagents such as sulfolane, m-NBA, and m-nitrophenyl ethanol have been shown to increase charging across all sizes of proteins. When extending the same approach to peptides in denaturing solvent conditions, we found that the ESI charging of small peptides can also be increased. The most abundant charge state of bradykinin (RPPGFSPFR) under denaturing conditions was 2+ (Figure 4.15A). In the presence of 25 mM sulfolane, the most abundant charge state shifts to 3+ (Figure 4.15B). Moreover, the maximum charge also increases to 4+ upon addition of sulfolane (the maximum charge without sulfolane is 3+), despite the fact that there are only 3 potentially basic sites (two Arg residues and the N-terminus). Another aspect that is evident from supercharging peptides is that the mechanism of supercharging may not be based only on denaturation, since in biomolecules like peptides, which do not have a significant amount of secondary structure, denaturation alone could not be the cause of supercharging. The roles played by other characteristics such as reagent basicity can explain the supercharging of peptides.9

The same effect was observed for substance P (RPKPQQFFGLM), where the most abundant charge state shifts from 2+ to 3+ with 25 mM sulfolane and a 4+ charge state was observed (Table 4.2). Substance P also has only three basic sites in the sequence where the charge could potentially reside. The presence of a fourth charge in both bradykinin and substance P shows that other residues such as glutamine or proline could potentially be the sites where an extra charge could reside under supercharging conditions.
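The count of potentially basic sites used in this comparison can be reproduced from the peptide sequences given above. The following sketch assumes the convention used in this chapter, namely that Arg, Lys, and His side chains plus the N-terminal amine are counted as basic sites; the function name `count_basic_sites` is hypothetical.

```python
BASIC_RESIDUES = {"R", "K", "H"}  # Arg, Lys, His (one-letter codes)


def count_basic_sites(sequence, include_n_terminus=True):
    """Count Arg/Lys/His residues, optionally adding one site for the N-terminus."""
    side_chain_sites = sum(1 for residue in sequence.upper() if residue in BASIC_RESIDUES)
    return side_chain_sites + (1 if include_n_terminus else 0)


peptides = {
    "bradykinin": "RPPGFSPFR",
    "substance P": "RPKPQQFFGLM",
}

basic_sites = {name: count_basic_sites(seq) for name, seq in peptides.items()}

for name, n_sites in basic_sites.items():
    print(f"{name}: {n_sites} potentially basic sites")
```

Both peptides have three such sites, so the observed 4+ charge state exceeds this count by one, which is why less basic residues such as glutamine or proline are considered as possible locations for the additional charge.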

4.4.8. Reagent concentration versus supercharging

To find the concentrations of various supercharging reagents necessary to enhance charging without losing signal intensity, lysozyme was tested with increasing concentrations of supercharging reagents to find the concentration at which there is no further increase in charge. All the experiments were performed in denaturing conditions with 50/49.9/0.1 ACN/H2O/FA as the solvent to dilute the protein and to make appropriate concentrations of the supercharging reagents. The plots of reagent concentration against average charge and maximum charge are shown in Figure 4.16A-D, and the corresponding mass spectra acquired at increasing concentrations of each supercharging reagent are displayed in Supplemental Figures 4.3-4.6.

The average charge acquired by lysozyme increases significantly upon addition of 50 mM sulfolane (equivalent to 0.47%) and ascends steadily until 250 mM, where an average charge of 16.4 is obtained, and does not increase further at 300 mM (Figure 4.16A and Supplemental Figure 4.3). The maximum charge acquired without any sulfolane is 15+, and at 150 mM sulfolane (equivalent to 1.41%) the maximum charge observed increases to 20+, which exceeds the number of basic sites and does not increase further at higher concentrations.
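The correspondence between molar and percent concentrations quoted for sulfolane can be approximated as a volume percentage from the molar mass and density of the neat reagent. The sketch below is an illustration under stated assumptions rather than the conversion used to prepare the solutions: the molar mass (120.17 g/mol) and density (about 1.26 g/mL) are approximate literature values that are not given in this chapter, the percentages are assumed to be v/v, and the function name `millimolar_to_percent_vv` is hypothetical.

```python
def millimolar_to_percent_vv(conc_mM, molar_mass_g_per_mol, density_g_per_mL):
    """Convert a molar concentration (mM) to an approximate % v/v of neat reagent."""
    grams_per_litre = conc_mM * 1e-3 * molar_mass_g_per_mol
    millilitres_per_litre = grams_per_litre / density_g_per_mL
    return millilitres_per_litre / 10.0  # mL per 100 mL of solution


# Assumed approximate literature values for sulfolane (not taken from this chapter).
SULFOLANE_MOLAR_MASS = 120.17  # g/mol
SULFOLANE_DENSITY = 1.26       # g/mL

percent_by_concentration = {
    conc: millimolar_to_percent_vv(conc, SULFOLANE_MOLAR_MASS, SULFOLANE_DENSITY)
    for conc in (50, 100, 150)
}

for conc, percent in percent_by_concentration.items():
    print(f"{conc} mM sulfolane is approximately {percent:.2f}% v/v")
```

With these assumed constants the results land close to the 0.47%, 0.94%, and 1.41% quoted in the text; small differences are expected because the exact density used for the original conversion is not stated.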

With m-NBA, the average charge increases from 11.2 to 14.1 at 50 mM m-NBA (equivalent to 0.6% m-NBA), and at 100 mM m-NBA (equivalent to 1.19%) there is a very slight increase to 14.2, after which it does not increase (Figure 4.16B and Supplemental Figure 4.4). A maximum of 18 charges is attained upon adding 100 mM m-NBA and there is no additional increase at 200 mM (equivalent to 2.3%).

m-Nitrophenyl ethanol behaves slightly differently, since the average charge of 14.1 acquired with 50 mM m-nitrophenyl ethanol drops to 14 and 13.3 at 100 and 200 mM, respectively, while the maximum charge remains stable at 17+ at these concentrations (Figure 4.16C and Supplemental Figure 4.5). This drop in the supercharging trend cannot be explained at this point, but it is important to note that with some reagents there is not always a sigmoidal trend in the degree of supercharging. m-Nitrophenol shows the same amount of supercharging as m-NBA, with the average charge increasing to a maximum of 14.4 at 200 mM, and the maximum charge does not increase above 18+ (Figure 4.16D and Supplemental Figure 4.6).

When equivalent concentrations of all the reagents are compared, sulfolane shows the largest charge increase at the lowest concentration tested (50 mM). In terms of maximum charge, the other reagents reach up to 18+, whereas sulfolane at 150 mM generates 20+ charges on lysozyme. In addition, when the supercharging efficiencies of sulfolane and m-NBA are compared, 100 mM sulfolane (equivalent to 0.94%) yields an average charge of 15.3 and a maximum charge of 19+, whereas 100 mM m-NBA (equivalent to 1.19%) generates an average charge of 14.2 and a maximum charge of 18+. These results contrast with the recent study by the Williams group, which reported that the supercharging efficiency of sulfolane is lower than that of m-NBA on a per-mole basis.30 However, the discrepancy could be due to the solvent conditions used to test the supercharging ability (denaturing in our study versus native pH in their report).

4.4.9. Protein conformation during supercharging

To investigate the effects of protein conformation on supercharging, we analyzed the ion mobility of proteins with and without supercharging. An ion in a compact form has a higher mobility, and hence a shorter drift time, than the same ion in a more open conformation as it travels through the ion mobility separation device. The drift time distributions for multiply charged ions of transferrin before and after adding 50 mM sulfolane (Figure 4.17A and B) display the arrival times of different charge states. However, the ion mobility arrival time distributions for the same charge state of transferrin with and without supercharging could not be compared, because the peak widths of the transferrin charge states increase upon supercharging (data not shown) due to formation of protein-sulfolane adducts, and the charge states are not isotopically resolved. Therefore, we analyzed the ion mobility profile of a low MW protein, lysozyme, and compared the drift time of a particular charge state with and without supercharging, since its charge states can be isotopically resolved and a more accurate comparison of drift time can be performed.

The two-dimensional plots of m/z against drift time for lysozyme with and without 200 mM sulfolane in denaturing conditions (Figure 4.18A and B) and native solvent conditions (Figure 4.19A and B) are displayed, and the corresponding mass spectra are shown in Figure 4.20A-D. Although it would be ideal to compare the same charge state of lysozyme with and without supercharging in both denaturing and native solvent conditions, no single charge state is present in the charge state distributions obtained under all the different conditions (Figure 4.20A-D). When the 9+ charge state of lysozyme in native solvent conditions is compared with and without supercharging (Figure 4.21A and B), there is no difference in the drift time obtained. Similar results were obtained when the 12+ charge state was compared in denaturing solvent conditions (Figure 4.21C and D). However, the Williams group observed that, in native conditions, when different charge states of lysozyme with and without reagent were compared, the collision cross sections of the higher charge state ions were larger; they concluded that supercharging denatures the protein and causes a more open conformation, which in turn reduces the mobility.30 In our data, by contrast, there is a clear trend across all the charge states of lysozyme: addition of the reagent does not change the drift time in either native or denaturing conditions. Hence our data suggest that the conformation of the protein remains unchanged upon supercharging.


4.5. Conclusions

Compounds such as sulfolane, nitrobenzonitrile, and m-nitrophenyl ethanol were found to be effective in enhancing charge from both native and denaturing environments. m-Nitrophenol and nitrobenzene had limited solubility in aqueous solvents and did not affect the CSD of proteins in these solvents. However, these two reagents were found to be potent supercharging agents in denaturing conditions. For m-nitrophenol, the supercharging ability was found to be greater in a weak base such as ACN than in the other solvents studied. This highlights the importance of investigating the chemical properties of the supercharging reagents when probing the mechanisms behind supercharging. Disulfide-bonded proteins displayed a greater extent of supercharging than the other classes of proteins. Increasing the number of charges acquired without the need for disulfide reduction is beneficial for characterizing disulfide-bonded proteins. High MW proteins did not show the same extent of supercharging as the other classes of proteins, and charge density might play a role in limiting the number of charges in this class. In addition, the gas-phase electrophoretic mobility diameters of high MW proteins can be studied to find whether any correlation exists between size and supercharging. Supercharging of peptides showed that species that lack secondary structure can be supercharged, and future investigations with synthetic peptides will be beneficial for finding the roles played by non-basic amino acids in supercharging. Finally, the choice of supercharging reagent concentration is important, since loss of signal intensity and adduct formation can be limited by selecting an appropriate concentration of the reagent.


4.6. References

1. Whitehouse, C. M.; Dreyer, R. N.; Yamashita, M.; Fenn, J. B., Electrospray interface for liquid chromatographs and mass spectrometers. Anal Chem 1985, 57, (3), 675-679.

2. Alexandrov, M. L.; Gall, L. N.; Krasnov, N. V.; Nikolaev, V. I.; Pavlenko, V. A.; Shkurov, V. A., Extraction of ions from solutions under atmospheric pressure as a method for mass spectrometric analysis of bioorganic compounds. Rapid Commun Mass Spectrom 2008, 22, (3), 267-270.

3. Meng, C. K.; Mann, M.; Fenn, J. B., Of protons or proteins. Atoms, Molecules and Clusters 1988, 10, 361-368.

4. Loo, J. A.; Edmonds, C. G.; Smith, R. D., Primary sequence information from intact proteins by electrospray ionization tandem mass spectrometry. Science 1990, 248, (4952), 201-204.

5. Cole, R. B., Some tenets pertaining to electrospray ionization mass spectrometry. J Mass Spectrom 2000, 35, 763-772.

6. Iavarone, A. T.; Jurchen, J. C.; Williams, E. R., Supercharged protein and peptide ions formed by electrospray ionization. Anal Chem 2001, 73, (7), 1455-1460.

7. Valeja, S. G.; Tipton, J. D.; Emmett, M. R.; Marshall, A. G., New reagents for enhanced liquid chromatographic separation and charging of intact protein ions for electrospray ionization mass spectrometry. Anal Chem 2010, 82, (17), 7515-7519.

8. Douglass, K. A.; Venter, A. R., Investigating the role of adducts in protein supercharging with sulfolane. J Am Soc Mass Spectrom 2012, 23, (3), 489-497.

9. Lomeli, S. H.; Yin, S.; Ogorzalek Loo, R. R.; Loo, J. A., Increasing charge while preserving noncovalent protein complexes for ESI-MS. J Am Soc Mass Spectrom 2009, 20, (4), 593-596.

10. Lomeli, S. H.; Peng, I. X.; Yin, S.; Loo, R. R.; Loo, J. A., New reagents for increasing ESI multiple charging of proteins and protein complexes. J Am Soc Mass Spectrom 2010, 21, (1), 127-131.

11. Whitelegge, J. P., HPLC and mass spectrometry of intrinsic membrane proteins. Methods Mol Biol 2004, 251, 323-340.

12. Smith, R. D.; Loo, J. A.; Edmonds, C. G.; Barinaga, C. J.; Udseth, H. R., New developments in biochemical mass spectrometry: electrospray ionization. Anal Chem 1990, 62, (9), 882-899.

13. Loo, J. A.; Edmonds, C. G.; Udseth, H. R.; Smith, R. D., Effect of reducing disulfide-containing proteins on electrospray ionization mass spectra. Anal Chem 1990, 62, (7), 693-698.

14. Bojesen, G., The Order of Proton Affinities of the 20 Common L-α-Amino Acids. J Am Chem Soc 1987, 109, 5557-5558.

15. Sowa, G. A.; Hengge, A. C.; Cleland, W. W., 18O Isotope Effects Support a Concerted Mechanism for Ribonuclease A. J Am Chem Soc 1997, 119, 2319-2320.

16. Reich, H. J., http://www.chem.wisc.edu/areas/reich/pkatable/index.htm. 2005.

17. Arnon, R.; Maron, E., An immunological approach to the structural relationship between hen egg-white lysozyme and bovine α-lactalbumin. J Mol Biol 1971, 61, (1), 225-235.

18. Reid, G. E.; Stephenson, J. L., Jr.; McLuckey, S. A., Tandem mass spectrometry of ribonuclease A and B: N-linked glycosylation site analysis of whole protein ions. Anal Chem 2002, 74, (3), 577-583.

19. Huang, Y.; Shi, X.; Yu, X.; Leymarie, N.; Staples, G. O.; Yin, H.; Killeen, K.; Zaia, J., Improved liquid chromatography-MS/MS of heparan sulfate oligosaccharides via chip-based pulsed makeup flow. Anal Chem 2011, 83, (21), 8222-8229.

20. Hamdy, O. M.; Julian, R. R., Reflections on charge state distributions, protein structure, and the mystical mechanism of electrospray ionization. J Am Soc Mass Spectrom 2011, 23, (1), 1-6.

21. Helbig, A. O.; Heck, A. J.; Slijper, M., Exploring the membrane proteome--challenges and analytical strategies. J Proteomics 2010, 73, (5), 868-878.

22. Whitelegge, J. P., Tandem mass spectrometry of integral membrane proteins for top-down proteomics. Trends Anal Chem 2005, 24, (7), 576-582.

23. Whitelegge, J. P.; Gundersen, C. B.; Faull, K. F., Electrospray-ionization mass spectrometry of intact intrinsic membrane proteins. Protein Sci 1998, 7, (6), 1423-1430.

24. Schaller, J., Analysis of hydrophobic proteins and peptides by mass spectrometry. Methods Mol Biol 2000, 146, 425-437.

25. Hufnagel, P.; Schweiger, U.; Eckerskorn, C.; Oesterhelt, D., Electrospray ionization mass spectrometry of genetically and chemically modified bacteriorhodopsins. Anal Biochem 1996, 243, (1), 46-54.

26. Barnidge, D. R.; Dratz, E. A.; Jesaitis, A. J.; Sunner, J., Extraction method for analysis of detergent-solubilized bacteriorhodopsin and hydrophobic peptides by electrospray ionization mass spectrometry. Anal Biochem 1999, 269, (1), 1-9.

27. Vecchio, G.; Righetti, P. G.; Zanoni, M.; Artoni, G.; Gianazza, E., Fractionation techniques in a hydro-organic environment. I. Sulfolane as a solvent for hydrophobic proteins. Anal Biochem 1984, 137, (2), 410-419.

28. Meyer, J. G.; Komives, E. A., Charge state coalescence during electrospray ionization improves peptide identification by tandem mass spectrometry. J Am Soc Mass Spectrom 2012, 23, (8), 1390-1399.

29. Kjeldsen, F.; Giessing, A. M.; Ingrell, C. R.; Jensen, O. N., Peptide sequencing and characterization of post-translational modifications by enhanced ion-charging and liquid chromatography electron-transfer dissociation tandem mass spectrometry. Anal Chem 2007, 79, (24), 9243-9252.


30. Sterling, H. J.; Daly, M. P.; Feld, G. K.; Thoren, K. L.; Kintzer, A. F.; Krantz, B. A.; Williams, E. R., Effects of supercharging reagents on noncovalent complex structure in electrospray ionization from aqueous solutions. J Am Soc Mass Spectrom 2010, 21, (10), 1762-1774.


[Figure 4.1 image: six stacked ESI mass spectra of insulin, m/z 800-1800. Panels: (A) no reagent, (B) 1% m-NBA, (C) 300 mM sulfolane, (D) 22 mM m-nitrobenzonitrile, (E) 20 mM m-nitrophenol, (F) 1% nitrobenzene.]

Figure 4.1. Supercharging insulin with different reagents in denaturing conditions. ESI-MS of insulin sprayed with (A) no reagent, (B) m-NBA, (C) sulfolane, (D) m-nitrobenzonitrile, (E) m-nitrophenol, and (F) nitrobenzene in denaturing solvent conditions.

[Figure 4.2 image: three stacked ESI mass spectra of lysozyme, m/z 800-1900. Panels: (A) no reagent, (B) 50 mM sulfolane, (C) 50 mM m-nitrophenyl ethanol.]

Figure 4.2. Supercharging disulfide-intact lysozyme with different reagents in denaturing conditions. ESI-MS of disulfide-intact lysozyme in ACN/H2O/FA 50/49.9/0.1 sprayed without any reagent (A), with 50 mM sulfolane (B), and 50 mM m-nitrophenyl ethanol (C).

[Figure 4.3 image: two stacked ESI mass spectra of disulfide-reduced lysozyme, m/z 600-1600. Panels: (A) no reagent, (B) 200 mM sulfolane.]

Figure 4.3. Supercharging disulfide-reduced lysozyme with sulfolane in denaturing conditions. ESI-MS of lysozyme reduced with DTT and sprayed without any reagent (A) and with 200 mM sulfolane (B) in denaturing solvent conditions.

[Figure 4.4 image: four stacked ESI mass spectra of lysozyme, m/z 700-2200. Panels: (A) no reagent, denaturing; (B) m-nitrophenol, denaturing; (C) no reagent, native; (D) m-nitrophenol, native.]

Figure 4.4. Supercharging disulfide-intact lysozyme with m-nitrophenol in denaturing and native solvent conditions. ESI-mass spectra of lysozyme sprayed by itself (A) and with addition of 50 mM m-nitrophenol (B) in denaturing conditions. (C) and (D) are the charge state distributions of lysozyme sprayed without and with 10 mM m-nitrophenol respectively, in 10 mM ammonium acetate buffer (native conditions).

[Figure 4.5 image: twelve ESI mass spectra of lysozyme, panels (A)-(L), arranged in three columns; each solvent system is shown with no reagent and with 10 mM m-nitrophenol. x-axes: m/z.]

Figure 4.5. Supercharging disulfide-intact lysozyme with m-nitrophenol in different solvent systems. ESI-mass spectra of disulfide-intact lysozyme sprayed in different solvent conditions without and with 10 mM m-nitrophenol (m-NP). (A) 50/49.9/0.1 ACN/H2O/FA, (B) 50/49.9/0.1 ACN/H2O/FA with m-NP, (C) 50/50 ACN/water, (D) 50/50 ACN/water with m-NP, (E) 50/49.9/0.1 methanol/H2O/FA, (F) 50/49.9/0.1 methanol/H2O/FA with m-NP, (G) 50/50 methanol/H2O, (H) 50/50 methanol/H2O with m-NP, (I) 99.9/0.1 H2O/FA, (J) 99.9/0.1 H2O/FA with m-NP, (K) 10 mM ammonium acetate buffer, and (L) 10 mM ammonium acetate buffer with m-NP.

[Figure 4.6 image: four ESI mass spectra. Panels: (A) β-lactoglobulin A, no reagent; (B) β-lactoglobulin A, 200 mM sulfolane; (C) α-lactalbumin, no reagent; (D) α-lactalbumin, 200 mM sulfolane. x-axes: m/z.]

Figure 4.6. Supercharging β-lactoglobulin A and α-lactalbumin with sulfolane. (A) and (B) are ESI-MS of β-lactoglobulin A without and with 200 mM sulfolane respectively. (C) and (D) are the ESI-MS of α-lactalbumin without and with 200 mM sulfolane respectively. All spectra were obtained under denaturing conditions.

[Figure 4.7 image: eight ESI mass spectra of RNase A and RNase B, panels (A)-(H). Panels (A)-(D) are positive mode and panels (E)-(H) are negative mode; each protein is shown with no reagent and with 200 mM sulfolane. x-axes: m/z.]

Figure 4.7. Supercharging RNase A and RNase B in positive and negative modes with sulfolane. RNase A and RNase B were sprayed without and with 200 mM sulfolane in positive and negative modes: (A) +ESI-MS RNase A no reagent, (B) +ESI-MS RNase A with sulfolane, (C) +ESI-MS RNase B no reagent, (D) +ESI-MS RNase B with sulfolane, (E) -ESI-MS of RNase A no reagent, (F) -ESI-MS of RNase A with sulfolane, (G) -ESI-MS of RNase B no reagent, and (H) -ESI-MS of RNase B with sulfolane. ACN/H2O/FA 50/49.9/0.1 and ACN/water 50/50 with 10 mM ammonium acetate were used as ESI solvents for collecting the data in positive and negative modes respectively.

[Figure 4.8 image: six stacked ESI mass spectra of transferrin, m/z 1200-2800. Panels: (A) no reagent, (B) 25 mM sulfolane, (C) 50 mM sulfolane, (D) 75 mM sulfolane, (E) 100 mM sulfolane, (F) 200 mM sulfolane.]

Figure 4.8. Supercharging transferrin with different concentrations of sulfolane. ESI-MS of transferrin acquired under denaturing solvent conditions with (A) no reagent, (B) 25 mM sulfolane, (C) 50 mM sulfolane, (D) 75 mM sulfolane, (E) 100 mM sulfolane, and (F) 200 mM sulfolane.

[Figure 4.9 image: two stacked nozzle-skimmer dissociation spectra of transferrin, m/z 700-1100, panels (A) and (B), with b-ions 56-61 labeled at the 6+, 7+, 8+, and 9+ charge states.]

Figure 4.9. Comparing nozzle-skimmer dissociation of transferrin with and without sulfolane. Nozzle-skimmer dissociation of transferrin obtained without (A) and with 200 mM sulfolane (B). The b-ion assignments from N-terminal 56-61 amino acids at different charge states are labeled in different colors: 6+ (red), 7+ (green), 8+ (blue), and 9+ (black).

[Figure 4.10 image: two stacked ESI mass spectra of BSA, m/z 900-2500. Panels: (A) no reagent, (B) 200 mM sulfolane.]

Figure 4.10. Supercharging BSA with sulfolane. ESI-MS of BSA in denaturing solvents acquired (A) without any reagent and (B) with 200 mM sulfolane.

[Figure 4.11 image: two stacked ESI mass spectra of IgG1, m/z 2000-3800. Panels: (A) no reagent, (B) 22.5 mM m-nitrophenyl ethanol.]

Figure 4.11. Supercharging IgG1 with m-nitrophenyl ethanol. ESI-MS of IgG1 in denaturing solvents acquired with (A) no reagent and (B) 22.5 mM m-nitrophenyl ethanol.

[Figure 4.12 image: two stacked ESI mass spectra of BO in H2O, m/z 600-2300. Panels: (A) no reagent, (B) 300 mM sulfolane.]

Figure 4.12. Supercharging bacterioopsin with sulfolane in aqueous conditions. ESI-MS of precipitated BO resuspended in a small volume of neat formic acid followed by 25-fold dilution in 100% water (A) and with addition of 300 mM sulfolane (B).

[Figure 4.13 image: two stacked ESI mass spectra of BO in 50/50 ACN/H2O, m/z 600-2300. Panels: (A) no reagent, (B) 300 mM sulfolane.]

Figure 4.13. Supercharging bacterioopsin with sulfolane in denaturing conditions. ESI-MS of precipitated BO resuspended in a small volume of neat formic acid followed by 25-fold dilution in ACN/water 50/50 with (A) no reagent and (B) 300 mM sulfolane.

[Figure 4.14 image: two stacked ESI mass spectra of BO in CHCl3/MeOH/1% FA (4/4/1), m/z 900-2600. Panels: (A) no reagent, (B) 300 mM sulfolane.]

Figure 4.14. Supercharging bacterioopsin with sulfolane in chloroform/methanol/1% FA. ESI-MS of precipitated BO resuspended in a small volume of neat formic acid followed by 25-fold dilution in chloroform/methanol/1% FA (4/4/1) solvent with (A) no reagent and (B) 300 mM sulfolane.

[Figure 4.15 image: two stacked nanoelectrospray mass spectra of bradykinin, m/z 240-600, y-axis: relative intensity (%). Panels: (A) no reagent, (B) 25 mM sulfolane, with a zoomed-in view of the M+4H+ ion showing peaks at m/z 265.90, 266.15, 266.40, and 266.65.]

Figure 4.15. Supercharging bradykinin with sulfolane in denaturing conditions. Nanoelectrospray mass spectra of bradykinin in ACN/H2O/FA 50/49.9/0.1 solvent with no reagent added (A) and with addition of 25 mM sulfolane (B). The zoomed-in view shows the 4+ molecular ion of bradykinin which is present only when sulfolane is added.

Figure 4.16. Plots of reagent concentration versus average and maximum charge. Trends for average charge (blue) and maximum charge (red) as a function of concentration for (A) sulfolane, (B) m-NBA, (C) m-nitrophenyl ethanol, and (D) m-nitrophenol.


[Figure 4.17 image: two-dimensional ion mobility plots, panels (A) and (B); axes: m/z and drift time (bins).]

Figure 4.17. Ion mobility spectrometry analysis of transferrin in denaturing conditions. Plots of drift time versus m/z for transferrin with (A) no reagent and (B) 50 mM sulfolane.

[Figure 4.18 image: two-dimensional ion mobility plots, panels (A) and (B); axes: m/z and drift time (bins).]

Figure 4.18. Ion mobility spectrometry analysis of lysozyme in denaturing conditions. Plots of drift time versus m/z for lysozyme with (A) no reagent and (B) 200 mM sulfolane.

[Figure 4.19 image: two-dimensional ion mobility plots, panels (A) and (B); axes: m/z and drift time (bins).]

Figure 4.19. Ion mobility spectrometry analysis of lysozyme in native solvent conditions. Plots of drift time versus m/z for lysozyme with (A) no reagent and (B) 200 mM sulfolane.

[Figure 4.20 image: four stacked ESI mass spectra of lysozyme, m/z 600-2200, y-axis: relative intensity (%). Panels: (A) denaturing, no reagent; (B) denaturing, 200 mM sulfolane; (C) native, no reagent; (D) native, 200 mM sulfolane.]

Figure 4.20. Mass spectra of lysozyme acquired during drift time analysis. ESI-MS of lysozyme in denaturing solvent conditions without (A) and with 200 mM sulfolane (B). The mass spectra in (C) and (D) were acquired in native solvents without and with 200 mM sulfolane respectively. To compare the drift times with and without supercharging, the 12+ charge state was selected in denaturing conditions and the 9+ charge state was selected in native conditions.

[Figure 4.21 image: four arrival time distributions, panels (A)-(D); x-axis: time (msec), y-axis: relative intensity (%).]

Figure 4.21. Ion mobility arrival time distributions for selected charge states of lysozyme with and without supercharging. Plots of drift time for the 9+ charge state of lysozyme obtained in native solvent conditions without (A) and with 200 mM sulfolane (B). (C) and (D) are the drift time plots for the 12+ charge state of lysozyme acquired in denaturing solvent conditions without and with 200 mM sulfolane respectively.

Columns: protein; solvent conditions; average and maximum charge without any reagent; supercharging reagent with concentration; average and maximum charge with the supercharging reagent.

Protein                    Solvent conditions              Avg   Max   Supercharging reagent          Avg   Max
Insulin                    ACN/H2O/FA 50/49.9/0.1          4.4   6+    22 mM m-nitrobenzonitrile      5.4   7+
Insulin                    ACN/H2O/FA 50/49.9/0.1          4.4   6+    300 mM sulfolane               5.7   7+
Insulin                    ACN/H2O/FA 50/49.9/0.1          4.4   6+    1% m-NBA                       5.8   7+
Insulin                    ACN/H2O/FA 50/49.9/0.1          4.4   6+    20 mM m-nitrophenol            5.3   7+
Insulin                    ACN/H2O/FA 50/49.9/0.1          4.4   6+    1% nitrobenzene                5.1   7+
Lysozyme disulfide-intact  ACN/H2O/FA 50/49.9/0.1          11.1  15+   50 mM sulfolane                14.4  18+
Lysozyme disulfide-intact  ACN/H2O/FA 50/49.9/0.1          11.1  15+   100 mM sulfolane               15.3  19+
Lysozyme disulfide-intact  ACN/H2O/FA 50/49.9/0.1          11.1  15+   50 mM m-nitrophenyl ethanol    14.1  17+
Lysozyme disulfide-intact  ACN/H2O/FA 50/49.9/0.1          11.1  15+   50 mM m-nitrophenol            14.1  17+
Lysozyme disulfide-intact  10 mM Amm. Acetate              7.9   9+    10 mM m-nitrophenol            8.2   9+
Lysozyme disulfide-reduced ACN/H2O/FA 50/49.9/0.1          16.5  22+   200 mM sulfolane               18    22+
β-lactoglobulin A          ACN/H2O/FA 50/49.9/0.1          15.3  19+   200 mM sulfolane               19.5  23+
α-lactalbumin              ACN/H2O/FA 50/49.9/0.1          9.3   13+   200 mM sulfolane               13.5  17+
RNase A, positive ESI      ACN/H2O/FA 50/49.9/0.1          11.2  15+   200 mM sulfolane               15    18+
RNase A, negative ESI      ACN/H2O 50/50, 10 mM Amm. Ace.  6     7-    200 mM sulfolane               6.6   8-
RNase B, positive ESI      ACN/H2O/FA 50/49.9/0.1          11.2  15+   200 mM sulfolane               13.7  17+
RNase B, negative ESI      ACN/H2O 50/50, 10 mM Amm. Ace.  6.3   7-    200 mM sulfolane               6.1   7-
Calmodulin                 ACN/H2O/FA 50/49.9/0.1          13.7  19+   200 mM sulfolane               13.2  18+
Actin                      ACN/H2O/FA 50/49.9/0.1          38.2  53+   200 mM sulfolane               41.9  53+

Table 4.1. Summary of average and maximum charges acquired by supercharging different proteins with various reagents. Average and maximum number of charges obtained with and without adding supercharging reagents for insulin, lysozyme, β-lactoglobulin A, α-lactalbumin, RNase A, RNase B, calmodulin, and actin are tabulated. The solvent condition and concentration of reagent tested are also provided.

Columns: protein; solvent conditions; average and maximum charge without any reagent; supercharging reagent with concentration; average and maximum charge with the supercharging reagent. A dash indicates that no value is reported.

Protein      Solvent conditions       Avg   Max   Supercharging reagent          Avg   Max
BSA          ACN/H2O/FA 50/49.9/0.1   51    68+   200 mM sulfolane               59    73+
BSA          ACN/H2O/FA 50/49.9/0.1   51    68+   22.5 mM m-nitrophenyl ethanol  55    71+
Transferrin  ACN/H2O/FA 50/49.9/0.1   38    46+   25 mM sulfolane                41.2  52+
Transferrin  ACN/H2O/FA 50/49.9/0.1   38    46+   50 mM sulfolane                42.3  52+
Transferrin  ACN/H2O/FA 50/49.9/0.1   38    46+   75 mM sulfolane                43.4  52+
Transferrin  ACN/H2O/FA 50/49.9/0.1   38    46+   100 mM sulfolane               -     -
Transferrin  ACN/H2O/FA 50/49.9/0.1   38    46+   200 mM sulfolane               -     -
Transferrin  ACN/H2O/FA 50/49.9/0.1   38    46+   22.5 mM m-nitrophenyl ethanol  41.1  52+
IgG1         ACN/H2O/FA 50/49.9/0.1   50.7  62+   50 mM sulfolane                62.7  62+
IgG1         ACN/H2O/FA 50/49.9/0.1   50.7  62+   22.5 mM m-nitrophenyl ethanol  60.9  67+
IgG1         ACN/H2O/FA 50/49.9/0.1   50.7  62+   0.5% m-NBA                     64.2  70+
IgG1         ACN/H2O/FA 50/49.9/0.1   50.7  62+   10 mM m-nitrophenol            53.5  62+
BO           ACN/H2O 50/50            16.9  24+   300 mM sulfolane               18.1  24+
BO           CHCl3/MeOH/1%FA (4/4/1)  20    31+   200 mM sulfolane               18.5  29+
BO           CHCl3/MeOH/1%FA (4/4/1)  20    31+   0.5% m-NBA                     20.2  31+
BO           ACN/H2O 80/20            20.2  31+   200 mM sulfolane               18.5  24+
BO           IPA/H2O 50/50            21.2  31+   200 mM sulfolane               19.2  25+
Bradykinin   ACN/H2O/FA 50/49.9/0.1   2+    3+    25 mM sulfolane                3+    4+
Substance P  ACN/H2O/FA 50/49.9/0.1   2+    3+    25 mM sulfolane                3+    4+

Table 4.2. Summary of average and maximum charges acquired by supercharging high MW proteins and peptides with different reagents. Average and maximum number of charges obtained with and without adding supercharging reagents for high MW proteins (BSA, transferrin, and IgG1), BO, and peptides (bradykinin and substance P) are tabulated. The solvent condition and concentration of reagent tested are also provided.

[Supplemental Figure 4.1 image: two stacked ESI mass spectra of calmodulin, m/z 1000-2200. Panels: (A) no reagent, (B) 200 mM sulfolane.]

Supplemental Figure 4.1. Supercharging calmodulin with sulfolane. ESI-MS of calmodulin in denaturing solvents acquired with (A) no reagent and (B) 200 mM sulfolane.

[Supplemental Figure 4.2 image: two stacked ESI mass spectra of actin, m/z 700-2100. Panels: (A) no reagent, (B) 200 mM sulfolane.]

Supplemental Figure 4.2. Supercharging actin with sulfolane. ESI-MS of actin in denaturing solvents acquired with (A) no reagent and (B) 200 mM sulfolane.

[Supplemental Figure 4.3 image: seven stacked ESI mass spectra of lysozyme, m/z 700-2200: no reagent, then 50, 100, 150, 200, 250, and 300 mM sulfolane.]

Supplemental Figure 4.3. Supercharging lysozyme with increasing concentrations of sulfolane in denaturing conditions.

[Supplemental Figure 4.4 image: four stacked ESI mass spectra of lysozyme, m/z 700-2200: no reagent, then 50, 100, and 200 mM m-NBA.]

Supplemental Figure 4.4. Supercharging lysozyme with increasing concentrations of m-NBA in denaturing conditions.

[Supplemental Figure 4.5 image: four stacked ESI mass spectra of lysozyme, m/z 700-2200: no reagent, then 50, 100, and 200 mM m-nitrophenyl ethanol.]

Supplemental Figure 4.5. Supercharging lysozyme with increasing concentrations of m-nitrophenyl ethanol in denaturing conditions.

[Supplemental Figure 4.6 image: four stacked ESI mass spectra of lysozyme, m/z 700-2200: no reagent, then 50, 100, and 200 mM m-nitrophenol.]

Supplemental Figure 4.6. Supercharging lysozyme with increasing concentrations of m-nitrophenol in denaturing conditions.

CHAPTER 5

Enhanced Multiple Charging for Protein LC-MS in the Chromatographic Timescale Utilizing Supercharging Reagents

5.1. Abstract

Supercharging reagents enhance the ESI multiple charging of proteins and peptides, which shifts the charge state envelope of the protein to a lower mass-to-charge region where it can be analyzed at high resolution by most mass spectrometers. In addition, selecting the most abundant charge state after supercharging yields higher fragmentation efficiency, and hence more sequence-informative product ions, than selecting the most abundant charge state without supercharging. So far, supercharging has mostly been studied by infusing targeted proteins/peptides with the reagent added to the solvent used for ESI. Only a few groups have demonstrated charge enhancement for proteins/peptides eluting from a liquid chromatography (LC) column by adding the supercharging reagents to the LC solvents. The compatibility and efficiency of other potent reagents such as sulfolane remain to be explored. Here, we report the supercharging ability of sulfolane for intact proteins of a wide MW range during HPLC-MS. Other new methods of reagent introduction, such as in the vapor phase during LC-MS, were also explored. In addition to the charge-enhancing effect, introduction of the reagent in the vapor phase increases signal intensity compared to experiments performed without the reagent, and can thus be used to enhance signal in circumstances where the signal intensity is low.

5.2. Introduction

The multiple charging phenomenon observed during electrospray ionization (ESI) for mass spectrometry extends the typical limit of common mass analyzers (2000 m/z) to allow measurement of high MW species.1 Protein-protein and protein-ligand interactions and covalent modifications can also greatly increase the MW of the complex/molecule, and these species are efficiently analyzed by virtue of multiple charging. In addition to effective mass measurement, probing the primary sequence of a biomolecule by tandem MS methods also depends on the charge state chosen for dissociation. Several studies have reported the benefits of selecting highly charged species for fragmentation schemes based on electron transfer/capture.2-4
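The relationship between charge and m/z that underlies this extension of the mass range can be sketched in a few lines of Python. The proton mass is a standard constant; the function name and the lysozyme mass of about 14,305 Da are approximate values assumed here for illustration only.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz_of_charge_state(neutral_mass, z):
    """m/z of an [M + zH]z+ ion in positive ESI."""
    if z <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + z * PROTON_MASS) / z


# Assumed approximate average mass of lysozyme (illustrative only).
LYSOZYME_MASS = 14305.0

if __name__ == "__main__":
    for z in (9, 11, 15, 20):
        print(f"{z}+ : m/z {mz_of_charge_state(LYSOZYME_MASS, z):.1f}")
```

Each additional charge moves the ion to lower m/z, which is why a higher average charge places the whole charge state envelope in a lower m/z region.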

Enhanced multiple charging was introduced by the Williams group when they demonstrated that the extent of multiple charging observed for proteins and peptides during ESI can be further increased by adding reagents such as m-nitrobenzyl alcohol (m-NBA).5 This finding stirred further research in this area and led to the discovery of new reagents by different groups.6, 7 Currently, the mechanism(s) by which these reagents increase charge are still debated.8, 9 However, it is evident that the supercharging effect varies with the solvent conditions (native/denaturing), the chemical nature of the reagent (weakly basic/weakly acidic/denaturant), and the pI (acidic/basic) and MW of the protein. Regardless of these variations, these charge-enhancing reagents have a wide variety of applications for both targeted and complex proteomic studies.

Increased multiple charging in native solvent conditions (at near neutral pH, without any organic solvent) is beneficial for targeted analysis of noncovalent interactions. Yin and Loo demonstrated that the characterization of protein-ligand complexes can be improved by adding supercharging reagents. When the higher charge states obtained by supercharging were interrogated by CAD and ECD, they yielded information about the ligand-bound form of the protein, whereas fragmenting the charge states from the non-supercharged form of the complex yielded product ions pertaining to the apo-form.10 Brodbelt's group showed that IRMPD of highly charged peptides obtained by supercharging resulted in increased sequence coverage and yielded more diagnostic ions, which help with characterizing the modifications present.11 These reports have highlighted the advantages of dissociating the higher charge state ions obtained by supercharging with different fragmentation techniques, and have ultimately laid the foundation for its application in high throughput analysis of complex samples.

For large-scale protein and peptide analysis, supercharging reagents can be added as modifiers to LC solvents to facilitate increased charging during elution from the column. Jensen’s group first implemented this approach by adding m-NBA to the LC solvents and showed that ETD fragmentation efficiency, and hence sequence coverage, increased for tryptic peptides and phosphopeptides during on-line LC-MS/MS.12 The Marshall lab extended this idea to proteins and showed that intact proteins eluting from a column can be efficiently supercharged by adding reagents such as DMSO and DMF.7

During LC-MS analysis, the signal from the protein is confined to the elution time window (usually 20-45 seconds for medium to high MW proteins), and medium to high MW proteins tend to have broad charge state envelopes that often stretch into the higher m/z region. High resolution instruments such as FT-ICR are often employed for analyzing high MW proteins, and the resolving power of this instrument is inversely proportional to m/z.13 The resolving power achievable is also directly proportional to the acquisition time and to the frequency of the detected signal. Higher m/z ions have lower frequencies, and their resolving power is therefore lower than that of their lower m/z (i.e., higher frequency) counterparts. Because the resolving power is directly related to the length of the acquired transient, a longer transient must be acquired if higher resolving power is required for high m/z ions. Therefore, experiments that require high resolving power will have a much lower duty cycle. From an LC perspective, this means that LC experiments in which the MS data are acquired at higher resolving power will have fewer data points across the LC run, and therefore a lower LC resolution.
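The scaling described above can be illustrated with a short numerical sketch. It uses the textbook expression for the unperturbed cyclotron frequency, f = eB/(2π·m/z), together with a rough estimate of resolving power as R ≈ k·f·T. The constant k = 0.5 and the function names are assumptions made for illustration only. The 15 T field and 0.34 s transient are the values reported later in this chapter for the FT-ICR instrument.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # unified atomic mass unit in kg


def cyclotron_frequency_hz(mz, b_tesla):
    """Unperturbed cyclotron frequency f = e*B / (2*pi*(m/z)) in Hz."""
    return E_CHARGE * b_tesla / (2.0 * math.pi * mz * DALTON)


def estimated_resolving_power(mz, b_tesla, transient_s, k=0.5):
    """Rough estimate R ~ k * f * T; k = 0.5 is an assumed order-one constant."""
    return k * cyclotron_frequency_hz(mz, b_tesla) * transient_s


def transient_needed_s(target_r, mz, b_tesla, k=0.5):
    """Transient length needed to reach target_r under the same assumption."""
    return target_r / (k * cyclotron_frequency_hz(mz, b_tesla))


# Compare ions across the acquisition window at 15 T with a 0.34 s transient.
for mz in (1000.0, 2000.0, 3000.0):
    f = cyclotron_frequency_hz(mz, 15.0)
    r = estimated_resolving_power(mz, 15.0, 0.34)
    t = transient_needed_s(100000.0, mz, 15.0)
    print(f"m/z {mz:6.0f}: f = {f / 1e3:7.1f} kHz, R ~ {r:8.0f}, "
          f"transient for R = 100000 ~ {t:.2f} s")
```

Whatever value is chosen for k, the ratios are fixed: doubling m/z halves the estimated resolving power at a given transient length, and doubles the transient needed to hold the resolving power constant.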

On-line supercharging with HPLC has significant advantages, since enhanced charging during the elution timescale shifts the charge state envelope to a lower m/z region where it can be efficiently analyzed at high resolution without compromising the scan speed. At higher resolution, the mass measurement accuracy for these high MW species increases, which becomes important for unambiguous protein identification in complex samples. Furthermore, during LC-MS/MS, mass spectrometers are programmed to select the ‘most abundant’ charge states for fragmentation, and the intense charge states chosen are not usually the highly charged ones. Many supercharging reagents shift the charge state envelope such that the maximum charge obtained without supercharging becomes the most abundant charge state upon addition of the reagent. Selecting the most abundant charge state obtained during supercharging therefore has the potential to increase sequence coverage during on-line LC-MS/MS. In addition, the product ions formed during supercharging could in turn be highly charged, which makes their mass measurement more efficient.
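As a brief illustration of how higher charge states move a protein to lower m/z, the sketch below computes m/z = (M + z·m_p)/z for a range of charge states. The helper name is hypothetical. The neutral mass is the ubiquitin MW quoted later in this chapter (8565.06 Da), and the charge range of 5+ to 13+ is chosen only for illustration.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in Da


def mz_of_charge_state(neutral_mass_da, z):
    """m/z of the [M + zH]z+ ion for a neutral mass in Da and a positive charge z."""
    if z <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + z * PROTON_MASS_DA) / z


UBIQUITIN_MW_DA = 8565.06  # MW of ubiquitin as quoted in this chapter

# Each additional proton moves the ion to a lower m/z value.
for z in range(5, 14):
    print(f"{z:2d}+  m/z = {mz_of_charge_state(UBIQUITIN_MW_DA, z):8.2f}")
```

For this mass, the 7+ ion falls near m/z 1225 and the 13+ ion near m/z 660, which shows how a shift of the envelope toward higher charge places the signal in a region where resolving power is higher.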

Since the LC-MS analysis takes place in denaturing conditions, it is worthwhile to explore the capabilities of reagents that have shown promise in denaturing conditions. Sulfolane is one such reagent, and its charge enhancing efficiency in denaturing solvents was found to be higher than that of some of the other reagents compared.14 We have explored the efficiency of sulfolane for on-line LC-MS analysis of intact proteins by adding it to the LC solvent, and we discuss the solvent compatibility, column performance, and supercharging ability when sulfolane is added to the LC solvent. In addition, we describe methods by which supercharging reagents can be delivered in the vapor phase during ESI, which should stimulate further research into novel ways of introducing supercharging reagents.

5.3. Experimental section

5.3.1. Sample preparation

All protein standards and supercharging reagents (sulfolane and m-NBA) were purchased from Sigma-Aldrich (St. Louis, MO) unless otherwise stated. Standard proteins (bradykinin, neurotensin, ubiquitin, ribonuclease A, transferrin, BSA, and carbonic anhydrase) were made at 1 mg/mL concentration in MS grade water. An equimolar mixture of these 7 proteins (10 pmoles each) was prepared by diluting in 0.1% formic acid (FA) and used for experiments with the Agilent column. For experiments with the Dionex column, an equimolar mixture of ubiquitin, RNase A, lysozyme, and myoglobin was prepared and diluted to 1 pmol of each protein in 5% ACN/0.1% FA.

5.3.2. LC conditions and MS parameters

5.3.2.1. Supercharging during LC-MS by adding reagents directly to LC solvents

A 5 µL aliquot of the standard protein mix containing 10 pmoles of each protein was injected onto an Agilent C3 column (2.1 mm x 150 mm, 300 Å pore size). LC-MS was performed with an Agilent 1200 LC system connected on-line to an Agilent 6220 ESI-TOF mass spectrometer (Santa Clara, CA). Solvent A was 0.1% FA and solvent B was 90% ACN/10% H2O containing 0.1% FA. For on-line supercharging experiments, 100 mM sulfolane was added to both solvents A and B. The flow rate was 0.2 mL/min and the following gradient was used: 5% B for 3 min, 5%-25% B in 1 min, 25%-55% B in 30 min, 55%-80% B in 2 min, and held at 80% B for 9 min. The column was finally equilibrated with 5% B for 10 min prior to the next run. The column temperature was maintained at 60 °C throughout the run.
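The gradient above can be written as a table of breakpoints and interpolated linearly to obtain the solvent composition at any time point. This is a minimal sketch: the function and variable names are hypothetical, the segments are assumed to be strictly linear, and the 10 min re-equilibration at 5% B is left out because it follows the run.

```python
# (time in min, % B) breakpoints for the Agilent C3 gradient described above.
GRADIENT = [
    (0.0, 5.0),    # start at 5% B
    (3.0, 5.0),    # 5% B for 3 min
    (4.0, 25.0),   # 5%-25% B in 1 min
    (34.0, 55.0),  # 25%-55% B in 30 min
    (36.0, 80.0),  # 55%-80% B in 2 min
    (45.0, 80.0),  # held at 80% B for 9 min
]

ACN_FRACTION_IN_B = 0.90  # solvent B is 90% ACN/10% H2O


def percent_b(t_min, program=GRADIENT):
    """Linearly interpolated % B at time t_min (assumes linear segments)."""
    if t_min < program[0][0] or t_min > program[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("gradient program is not sorted by time")


def percent_acn(t_min):
    """Approximate % ACN delivered at time t_min (solvent A contains no ACN)."""
    return ACN_FRACTION_IN_B * percent_b(t_min)


for t in (2.0, 3.5, 19.0, 35.0, 40.0):
    print(f"t = {t:4.1f} min: {percent_b(t):5.1f}% B, ~{percent_acn(t):4.1f}% ACN")
```

The values refer to the composition programmed at the pump. The composition reaching the column lags behind by the system dwell volume, which is not modeled here.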

The source parameters in the MS were as follows: ESI capillary voltage 4300 V, drying gas temperature 350 °C, drying gas flow rate 10 L/min, nebulizer pressure 25 psig, and fragmentor voltage 250 V. The data were acquired at a rate of 1 spectrum/second and the acquisition window was set from 100 to 3000 m/z.

5.3.2.2. Supercharging during LC-MS by delivering reagent vapors

Chromatography was performed using a Dionex Ultimate 3000 rapid separation LC system (Thermo Scientific, Waltham, MA), which consists of a binary high-pressure gradient pump and a ternary low-pressure loading pump. A 5 µL aliquot of the standard protein mix containing 1 pmole of each protein was injected onto a Dionex Pepswift reversed phase monolithic trap column (200 µm x 5 cm) and washed for 10 min with 100% water containing 0.05% TFA at a flow rate of 5 µL/min using the loading pump. This was followed by gradient elution into the Dionex Pepswift reversed phase monolithic analytical column (100 µm x 25 cm). 100% water with 0.1% FA (solvent A) and 100% ACN with 0.1% FA (solvent B) were delivered at a flow rate of 1 µL/min and the following gradient was used: 10% B for 10 min, increased to 25% B at 15 min, 50% B at 49 min, 80% B at 54 min, held at 80% B for 5 min, and finally equilibrated at 10% B for 10 min. Both the analytical and trap columns were held at 60 °C throughout the run to aid protein separations.

The Dionex nanoLC system was connected in-line with a Bruker solariX hybrid Qq-15 T FT-ICR mass spectrometer (Bruker Daltonics, Billerica, MA). The CaptiveSpray source (Bruker Daltonics, Billerica, MA) was used to couple the LC flow directly into the inlet of the mass spectrometer. It consists of a non-tapered 20 µm i.d. silica tip which produces a stable spray with minimized clogging. The MS source parameters were optimized for the LC flow rate of 1 µL/min and were as follows: drying gas flow of 4 L/min at a temperature of 250 °C and a capillary voltage of -1.5 kV. The capillary exit and deflector plate were at 250 V and 230 V respectively. The funnel 1 and skimmer 1 were held at 140 V and 30 V respectively. The data were acquired in the broadband detection mode, using 1 MHz data points, with a mass range of 150 to 3000 m/z. The total duration of an MS scan event was 0.63 second, which includes a transient length of 0.34 second. The number of scans averaged for one spectrum was set to one.
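The acquisition timings above determine how many spectra are recorded across a chromatographic peak. The sketch below is a simple estimate, assuming back-to-back scan events and using the 20-45 second elution window quoted in the introduction of this chapter. The function names are hypothetical.

```python
import math


def spectra_across_peak(peak_width_s, scan_event_s):
    """Number of complete scan events that fit within one elution window."""
    return math.floor(peak_width_s / scan_event_s)


def detection_fraction(transient_s, scan_event_s):
    """Fraction of each scan event spent acquiring the transient."""
    return transient_s / scan_event_s


FTICR_SCAN_EVENT_S = 0.63   # total duration of one MS scan event (FT-ICR)
FTICR_TRANSIENT_S = 0.34    # transient length within each scan event
TOF_SCAN_EVENT_S = 1.0      # 1 spectrum/second on the ESI-TOF

for width_s in (20.0, 45.0):
    n_icr = spectra_across_peak(width_s, FTICR_SCAN_EVENT_S)
    n_tof = spectra_across_peak(width_s, TOF_SCAN_EVENT_S)
    print(f"{width_s:.0f} s peak: {n_icr} FT-ICR spectra, {n_tof} TOF spectra")

fraction = detection_fraction(FTICR_TRANSIENT_S, FTICR_SCAN_EVENT_S)
print(f"Transient occupies {fraction:.0%} of each FT-ICR scan event")
```

Under these assumptions, a 20 second peak is sampled by about 31 FT-ICR spectra and a 45 second peak by about 71. A longer transient would reduce these counts proportionally.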

Sulfolane and m-NBA were diluted in a 1:4 ratio in a volatile solvent such as isopropanol. Supercharging experiments were performed by applying the reagents either directly to the syringe filter (Figure 5.1A) or by spotting them on the inside walls of the Luer-lock filter adapter (Figure 5.1B), which connects the filter to the CaptiveSpray source (Figure 5.1C). In addition to using the air filter, another setup, described below, was used to deliver the vapors generated from supercharging reagents dissolved in a volatile solvent. 300 mM sulfolane and 0.3% m-NBA were each solubilized separately in 100% ACN in a solvent bottle. The vapors from this bottle were sent inside the CaptiveSpray source by attaching a piece of tubing from the air filter inlet in the source to the headspace of the solvent bottle (Figure 5.2). The tubing that goes inside the bottle was monitored to make sure that it did not come into contact with the solvent, which would otherwise be pulled in by the suction created by the vacuum in the source region of the instrument. Another piece of tubing was immersed in the solvent with the other end left open to keep the bottle vented. During on-line supercharging, this setup was maintained throughout the run time.

5.3.3. Calculation of average and maximum charges

Charge state distribution (CSD) refers to the observed set of charges acquired by a protein during ESI-MS. From the CSD, the average charge was calculated using the formula:

$$q_{\mathrm{avg}} = \frac{\sum_{i=1}^{N} q_i I_i}{\sum_{i=1}^{N} I_i}$$

where q_i is the charge of the i-th charge state, I_i is the intensity of the i-th charge state, and N is the number of measured charge states in a mass spectrum. The average charge can be simply correlated to the charge state with the highest intensity in the CSD obtained for a given protein. When comparing the supercharging effect, the maximum charge state, i.e., the highest observed charge state, was also taken into consideration. While the average charge state gives an estimate of the overall charge distribution upon supercharging, the maximum charge state is an indication of the highest number of charges that can be attained without limitations due to electrostatic repulsion from the existing charges.
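The calculation of the average and maximum charge can be sketched in a few lines. The function names are hypothetical, the intensity threshold for counting a charge state as observed is an assumption, and the intensities in the example are invented purely to show the arithmetic; they are not measured data.

```python
def average_charge(csd):
    """Intensity-weighted average charge, sum(q_i * I_i) / sum(I_i).

    csd maps each observed charge state q_i to its intensity I_i.
    """
    total_intensity = sum(csd.values())
    if total_intensity <= 0:
        raise ValueError("total intensity must be positive")
    return sum(q * i for q, i in csd.items()) / total_intensity


def maximum_charge(csd, min_intensity=0.0):
    """Highest charge state whose intensity exceeds min_intensity (assumed cutoff)."""
    observed = [q for q, i in csd.items() if i > min_intensity]
    if not observed:
        raise ValueError("no charge state exceeds the intensity cutoff")
    return max(observed)


# Hypothetical charge state distribution (arbitrary intensity units).
example_csd = {9: 1.0, 10: 2.0, 11: 1.0}

print(f"average charge = {average_charge(example_csd):.1f}")
print(f"maximum charge = {maximum_charge(example_csd)}+")
```

Because the average is intensity weighted, it follows the most intense charge states, whereas the maximum charge depends on where the cutoff for an observed peak is set.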

5.4. Results and discussion

5.4.1. Supercharging proteins during LC-MS with sulfolane in the LC solvent

Previously, sulfolane has been shown to enhance the charge of proteins in native conditions.15 Recent infusion experiments from our group14 and Venter’s group8 have shown that this reagent also works well under denaturing conditions. We wanted to extend this charge enhancement to proteins eluting from a column during LC, since increasing charge is beneficial for identifying and characterizing the proteins in a complex mixture analyzed by LC-MS.

Sulfolane (100 mM) was added to both solvents A and B to keep the amount of sulfolane uniform throughout the LC run. It is important to keep the concentration of sulfolane low, since sulfolane adducts are observed at higher charge states with increasing sulfolane concentrations. These adducts are well resolved for low MW proteins, but for high MW proteins, the activation energy applied in the source region is not sufficient to dissociate them, and so the charge state envelope of the protein appears as an unresolved hump. The 100 mM concentration was chosen because, during infusion experiments in denaturing conditions with high MW proteins such as BSA and transferrin, it gave the maximum amount of supercharging without forming unresolved charge state envelopes.14

In the total ion chromatograms (TIC) obtained without and with 100 mM sulfolane in the LC solvents (Figure 5.3A and B), the order of elution of all the proteins in the standard protein mix is preserved. However, the retention times (tR) of the proteins become shorter with sulfolane addition. This decrease in tR upon addition of sulfolane can be attributed to the increase in the non-polar nature of the solvent. Bradykinin, RNase A, and neurotensin (denoted as peaks 1, 2, and 3 in Figure 5.3A) elute 0.4 min earlier when sulfolane is added (Figure 5.3B). The shift in tR was highest for ubiquitin and hemoglobin β chain (labeled as peaks 4 and 6 respectively), which elute 2.2 min and 1.2 min earlier respectively. Hemoglobin β chain was found to be a contaminant in the bovine carbonic anhydrase sample. High MW proteins (peaks 7, 8, and 9) elute 1.5 min earlier on average with solvents containing sulfolane. Previously, different groups have reported such a decrease in tR for peptides and proteins when m-NBA and DMSO were added to LC solvents.7, 12 In addition, Jensen et al. noticed an increase in peak width for peptides eluting from the column with increasing concentrations of m-NBA.12

Also, a small peak labeled 5 in Figure 5.3B appears only upon sulfolane addition; it corresponds to a MW of 8591.28 Da, which is 26.22 Da more than the MW of ubiquitin (8565.06 Da). The chromatographic resolution is high enough to separate this species from ubiquitin, but the peak is unique to sulfolane addition and does not correspond to any of the well-known protein-salt adducts.
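The comparison with common adducts can be checked with a short sketch. The candidate list is an illustrative assumption rather than the list of species considered in this work: it contains sodium and potassium replacing a proton, and a non-covalent sulfolane adduct, with shifts computed from standard average atomic masses. The 0.5 Da tolerance is also an assumption.

```python
UBIQUITIN_MW_DA = 8565.06   # MW of ubiquitin quoted above
PEAK5_MW_DA = 8591.28       # MW of the species in peak 5

# Average-mass shifts in Da; this candidate list is an illustrative assumption.
CANDIDATE_SHIFTS_DA = {
    "Na replacing H": 22.98977 - 1.00794,
    "K replacing H": 39.0983 - 1.00794,
    "sulfolane adduct (C4H8O2S)": 120.17,
}


def matching_candidates(observed_shift_da, candidates, tol_da=0.5):
    """Names of candidate modifications within tol_da of the observed shift."""
    return [
        name
        for name, shift in candidates.items()
        if abs(observed_shift_da - shift) <= tol_da
    ]


observed_shift = PEAK5_MW_DA - UBIQUITIN_MW_DA
print(f"observed shift = {observed_shift:.2f} Da")
for name, shift in CANDIDATE_SHIFTS_DA.items():
    print(f"  {name}: {shift:.2f} Da (difference {observed_shift - shift:+.2f} Da)")
print("matches:", matching_candidates(observed_shift, CANDIDATE_SHIFTS_DA))
```

None of these candidates falls within the tolerance of the 26.22 Da shift, which is consistent with the statement above that the peak does not correspond to a well-known salt adduct.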

Regarding the multiple charging effect, for small proteins such as RNase A, the average charge increases from 9.6 to 14.4 when sulfolane is present in the solvent (Figure 5.4A and B). For the high MW proteins, BSA and transferrin, in spite of choosing a concentration that showed well-resolved charge states during infusion experiments, the protein-sulfolane adducts predominated and resulted in an unresolved hump-like distribution (Figure 5.5A and B). The concentration of sulfolane in the solvents was lowered to 20 mM to determine whether a resolved CSD could be obtained for high MW proteins. While the low sulfolane concentration restored the tR close to that obtained without sulfolane in the TIC (Figure 5.3C), the unresolved CSD of high MW proteins persisted (Figure 5.5C). For low MW proteins such as RNase A, 20 mM sulfolane still produced effective multiple charging similar to 100 mM sulfolane (Figure 5.4C). Furthermore, peak 5, which corresponds to a MW of 8591 Da, was still present at 20 mM sulfolane (Figure 5.3C).

The addition of sulfolane to chromatographic solvents also resulted in other undesirable changes, such as irreproducible retention times between runs performed with sulfolane. Furthermore, when the column was subsequently run without any sulfolane, the tR did not revert to the original values seen in the TIC in Figure 5.3A. Sulfolane appears to permanently alter the active sites in the column, and efforts to regenerate the column did not restore the original behavior.

5.4.2. Supercharging proteins by spotting reagents in the CaptiveSpray air filter

Enhanced multiple charging of proteins eluting on-line from a column by adding reagents to the LC solvents is advantageous; however, the chromatographic behavior and column performance must be kept under optimum conditions. Retention time changes, loss of column integrity, and the inability to revert to the conventional solvent system are all consequences of adding reagents to the LC solvents. To overcome these unfavorable chromatographic issues, other methods of delivering reagents to proteins eluting on-line can be considered.

The CaptiveSpray source couples an LC column through a capillary spray tip directly into the inlet of the mass spectrometer. A vacuum seal is created and ambient air is pulled in and around the spray tip, creating an air funnel that sweeps the spray tip and desolvates the sample ions as they are pulled into the instrument. A syringe air filter removes the contaminants present in the ambient air and thus reduces the background ions. These air filters can be adapted to spray lock-mass calibrants for on-line LC-MS experiments, and the calibrant delivery can be influenced by increasing source parameters such as drying gas flow and temperature. We extended the use of air filters to deliver the supercharging reagent during ESI by spotting the reagent on them.

This method of applying the reagent was tested by infusing a sample of lysozyme in denaturing solvent conditions and installing different filters spotted with sulfolane or m-NBA. As a control, a filter with no reagent applied was used. With a filter that had no reagent, the average charge obtained for lysozyme was 10.5 (Figure 5.6A), and this increased to 10.8 with the filter that had m-NBA (Figure 5.6B). When the air filter supplemented with sulfolane was placed in the source, the average charge increased to 10.9 (Figure 5.6C). We also noticed an increase in the signal intensity in the presence of filters with supercharging reagents. When the relative ion intensities with and without reagent on the filter are compared, there is a 2.5-fold increase in the presence of either of the supercharging reagents. Other adjustments, such as increasing the drying gas flow and temperature, were tried, since the drying gas heats the source block, which might enhance vaporization of the reagent. However, the increase in charge remained the same. It is possible that the concentration of supercharging reagent in the vapor phase is low. Sulfolane and m-NBA have high boiling points and consequently the vapor pressure of these reagents is low. For instance, the boiling point of sulfolane is 287 °C,16 and the vapor pressure exerted by sulfolane is 6.15 × 10⁻⁴ mm Hg.17 For comparison, the vapor pressure of ACN is 90 mm Hg.18 The relatively low concentration of sulfolane in the vapor phase could be limiting the charge enhancement in this experimental setting.

5.4.3. Supercharging during LC timescale using reagent vapors

Since spotting the reagents in the air filter showed only modest increments in the average and maximum charges, other methods, such as dissolving the reagents in a volatile solvent such as ACN and delivering the vapors generated, were explored. At room temperature, ACN is highly volatile, and when the ACN vapors are transported into the source, the vapors from the supercharging reagents can also be delivered. A standard protein cocktail was analyzed by LC-MS without and with solvent mediated reagent delivery as shown in Figure 5.2, and the TICs obtained are displayed in Figure 5.7A and B. This method of reagent delivery does not impact column performance or retention time, which are its main advantages compared to adding the reagents to LC solvents.

Figures 5.8A and B show the mass spectra of RNase A obtained without and with 300 mM sulfolane in the delivery solvent. The average charge shifted from 9.2 to 9.8 upon transfer of sulfolane in the vapor phase, and a maximum charge of 14+ was attained only when sulfolane was present. Similar results were obtained when 0.3% m-NBA was added to ACN (data not shown). Other proteins in the chromatogram also showed similar slight charge increases, as shown in Figures 5.9 and 5.10 for ubiquitin and myoglobin respectively. The average charge for ubiquitin increased from 7.8 to 8.8, and for myoglobin it increased from 17.6 to 18.2. Similar to spotting the reagents in the air filter, this method also showed a 2-3 fold increase in the signal intensity for RNase A and ubiquitin, while the intensity of myoglobin remained the same with and without sulfolane vapor.

Signal enhancement is interesting since most supercharging reagents decrease the signal intensity while increasing charge, mostly due to adduct formation, which splits the signal among many peaks. Recently, our group and Chen’s group have noticed similar signal-enhancing effects with the addition of sulfolane and m-NBA in solvents containing 0.1% trifluoroacetic acid (TFA) as the modifier.19 In both infusion and LC-MS experiments, TFA was used as a modifier and supercharging reagents were introduced into the plume by desorption electrospray ionization (DESI). The authors postulate that the supercharging reagents might interact with TFA anions and limit the association of TFA anions with the protein ions, which might enhance the signal in the presence of TFA. While TFA has excellent benefits as an ion-pairing agent during LC, it causes signal suppression in MS even at very low concentrations. Future experiments can be devised to retain the benefits of using TFA in the LC while delivering supercharging reagents in the vapor phase using the CaptiveSpray air filters to enhance the signal intensity in the MS.

5.5. Conclusions

Supercharging during on-line LC-MS/MS has tremendous advantages for both bottom-up and top-down proteomic studies. Enhancing charge while the proteins/peptides are eluting in the chromatographic timescale is beneficial for both MS and tandem MS. While addition of sulfolane to the LC solvents enhances the charge for both medium and high MW proteins, unfavorable chromatographic effects are also associated with it. Reagent delivery in the vapor phase overcomes the disadvantages encountered during LC solvent addition, but the charge enhancing effect is moderate. However, the increase in signal intensity obtained during reagent delivery in the vapor phase is beneficial and can be considered for proteins with low ionization efficiency, when the sample amounts loaded on the column are low, or when TFA is used for performing chromatography. Very recently, Tsybin’s group utilized a dual spray tip for ESI to infuse supercharging reagents such as m-NBA or sulfolane through one spray tip while the protein was infused separately through the other, so that the supercharging reaction takes place just before ion liberation from the ESI droplet.20 With the development of innovative methods to introduce the reagent in either the solution or the vapor phase just before the formation of the electrospray plume, and with the testing of other supercharging reagents, the benefits of charge enhancement during LC-MS/MS can be achieved.

5.6. References

1. Fenn, J. B.; Mann, M.; Meng, C. K.; Wong, S. F.; Whitehouse, C. M., Electrospray ionization for mass spectrometry of large biomolecules. Science 1989, 246, (4926), 64-71.

2. Good, D. M.; Wirtala, M.; McAlister, G. C.; Coon, J. J., Performance characteristics of electron transfer dissociation mass spectrometry. Mol Cell Proteomics 2007, 6, (11), 1942-1951.

3. Horn, D. M.; Breuker, K.; Frank, A. J.; McLafferty, F. W., Kinetic intermediates in the folding of gaseous protein ions characterized by electron capture dissociation mass spectrometry. J Am Chem Soc 2001, 123, (40), 9792-9799.

4. Zubarev, R. A.; Horn, D. M.; Fridriksson, E. K.; Kelleher, N. L.; Kruger, N. A.; Lewis, M. A.; Carpenter, B. K.; McLafferty, F. W., Electron capture dissociation for structural characterization of multiply charged protein cations. Anal Chem 2000, 72, (3), 563-573.

5. Iavarone, A. T.; Jurchen, J. C.; Williams, E. R., Supercharged protein and peptide ions formed by electrospray ionization. Anal Chem 2001, 73, (7), 1455-1460.

6. Lomeli, S. H.; Peng, I. X.; Yin, S.; Loo, R. R.; Loo, J. A., New reagents for increasing ESI multiple charging of proteins and protein complexes. J Am Soc Mass Spectrom 2010, 21, (1), 127-131.

7. Valeja, S. G.; Tipton, J. D.; Emmett, M. R.; Marshall, A. G., New reagents for enhanced liquid chromatographic separation and charging of intact protein ions for electrospray ionization mass spectrometry. Anal Chem 2010, 82, (17), 7515-7519.

8. Douglass, K. A.; Venter, A. R., Investigating the role of adducts in protein supercharging with sulfolane. J Am Soc Mass Spectrom 2012, 23, (3), 489-497.

9. Hamdy, O. M.; Julian, R. R., Reflections on charge state distributions, protein structure, and the mystical mechanism of electrospray ionization. J Am Soc Mass Spectrom 2011, 23, (1), 1-6.

10. Yin, S.; Loo, J. A., Top-Down Mass Spectrometry of Supercharged Native Protein-Ligand Complexes. Int J Mass Spectrom 2010, 300, (2-3), 118-122.

11. Madsen, J. A.; Brodbelt, J. S., Comparison of infrared multiphoton dissociation and collision-induced dissociation of supercharged peptides in ion traps. J Am Soc Mass Spectrom 2009, 20, (3), 349-358.

12. Kjeldsen, F.; Giessing, A. M.; Ingrell, C. R.; Jensen, O. N., Peptide sequencing and characterization of post-translational modifications by enhanced ion-charging and liquid chromatography electron-transfer dissociation tandem mass spectrometry. Anal Chem 2007, 79, (24), 9243-9252.

13. Makarov, A.; Denisov, E.; Kholomeev, A.; Balschun, W.; Lange, O.; Strupat, K.; Horning, S., Performance evaluation of a hybrid linear ion trap/orbitrap mass spectrometer. Anal Chem 2006, 78, (7), 2113-2120.

14. Lakshmanan, R.; Loo, R. R.; Loo, J. A., Reagents for Enhancing Protein Charging for Electrospray Ionization Mass Spectrometry from Organic/Aqueous Solvent Systems. (Manuscript under preparation).

15. Lomeli, S. H.; Yin, S.; Ogorzalek Loo, R. R.; Loo, J. A., Increasing charge while preserving noncovalent protein complexes for ESI-MS. J Am Soc Mass Spectrom 2009, 20, (4), 593-596.

16. Sterling, H. J.; Daly, M. P.; Feld, G. K.; Thoren, K. L.; Kintzer, A. F.; Krantz, B. A.; Williams, E. R., Effects of supercharging reagents on noncovalent complex structure in electrospray ionization from aqueous solutions. J Am Soc Mass Spectrom 2010, 21, (10), 1762-1774.

17. http://www.gsi-net.com/en/publications/gsi-chemical-database/single/488.html.

18. http://www.gsi-net.com/en/publications/gsi-chemical-database/single/6.html.

19. Liu, Y.; Miao, Z.; Lakshmanan, R.; Loo, R. R.; Loo, J. A.; Chen, H., Signal and charge enhancement for protein analysis by liquid chromatography–mass spectrometry with desorption electrospray ionization. Int J Mass Spectrom 2012, 325-327, 161-166.

20. Miladinovic, S. M.; Fornelli, L.; Lu, Y.; Piech, K. M.; Girault, H. H.; Tsybin, Y. O., In-spray supercharging of peptides and proteins in electrospray ionization mass spectrometry. Anal Chem 2012, 84, (11), 4647-4651.

Figure 5.1. Supercharging reagent delivery through CaptiveSpray air filter. Sulfolane or m-NBA was spotted either directly on the syringe filter (A) or on the Luer-lock adapter (B). The syringe filter and adapter were re-installed and fixed on the CaptiveSpray source (C).

Figure 5.2. Supercharging reagent delivery in the vapor phase using 100% ACN. 300 mM sulfolane or 0.3% m-NBA was solubilized in 100% ACN in a solvent bottle and the vapors from this bottle were sent inside the CaptiveSpray source by attaching a piece of tubing (A) from the headspace of the solvent bottle to the air filter inlet in the source. Another piece of tubing (B) was immersed in the solvent with the other end left open in order to keep it vented.

[Figure 5.3 image: three stacked total ion chromatograms (panels A, B, and C) with peaks labeled 1 to 9; x-axis: time in min (8 to 24); y-axis: relative intensity.]

Figure 5.3. Supercharging during LC-MS with sulfolane in the LC solvent. Total ion chromatograms of a standard 7 protein mix run through Agilent C3 column with LC solvents containing (A) no sulfolane, (B) 100 mM sulfolane, and (C) 20 mM sulfolane. The peaks labeled 1 through 9 represent the following proteins: bradykinin (1), RNase A (2), neurotensin (3), ubiquitin (4), unknown protein with MW 8591 Da (5), hemoglobin β chain (6), transferrin (7), BSA (8), and carbonic anhydrase (9).

[Figure 5.4 image: three stacked mass spectra (panels A, B, and C) with labeled charge states between 9+ and 15+; x-axis: m/z (800 to 2000).]

Figure 5.4. Mass spectra of RNase A with sulfolane in the LC solvent. Mass spectra of RNase A (extracted from peak 2 in the TIC in Figure 5.3) eluting from the C3 column with LC solvents containing (A) no sulfolane, (B) 100 mM sulfolane, and (C) 20 mM sulfolane. The average and maximum charge acquired increase upon sulfolane addition to the LC solvent.

[Figure 5.5 image: three stacked mass spectra (panels A, B, and C) with labeled charge states between 53+ and 62+; x-axis: m/z (800 to 1900).]

Figure 5.5. Mass spectra of BSA with sulfolane in the LC solvent. Mass spectra of BSA (extracted from peak 8 in the TIC in Figure 5.3) eluting from the C3 column with LC solvents containing (A) no sulfolane, (B) 100 mM sulfolane, and (C) 20 mM sulfolane. The average and maximum charge increased upon adding 100 mM sulfolane to the LC solvent, but the CSD is unresolved due to the presence of sulfolane adducts. Decreasing the sulfolane concentration to 20 mM reduced the adduct formation and resulted in a modest resolution of the charge states.

[Figure 5.6 image: three stacked mass spectra (panels A, B, and C) with labeled charge states 8+ to 14+; x-axis: m/z (1000 to 2000); y-axis: relative intensity (x10^7).]

Figure 5.6. Mass spectra of lysozyme with supercharging reagents spotted in the air filters. Mass spectra of lysozyme obtained with air filters containing (A) no reagent, (B) m-NBA, and (C) sulfolane. Though the increase in the average charge is modest upon adding reagents, the relative intensities of the peaks in the CSD increase during supercharging when compared to no reagent addition.

[Figure 5.7 image: two stacked total ion chromatograms (panels A and B) with peaks labeled 1 to 4; x-axis: time in min (11 to 17); y-axis: relative intensity.]

Figure 5.7. Supercharging during LC-MS with sulfolane delivery in the vapor phase. Total ion chromatograms of a standard protein mix run through Dionex monolithic column with (A) no reagent delivery and (B) vapors from 300 mM sulfolane mixed in 100% ACN. The peaks labeled 1 through 4 represent the following proteins: RNase A (1), ubiquitin (2), lysozyme (3), and myoglobin (4).

[Figure 5.8 image: two stacked mass spectra (panels A and B) with labeled charge states 6+ to 13+; x-axis: m/z (1000 to 2400); y-axis: relative intensity (x10^7).]

Figure 5.8. Mass spectra of RNase A with sulfolane delivery in the vapor phase. Mass spectra of RNase A obtained with (A) no reagent and (B) 300 mM sulfolane delivered in vapor phase. In addition to charge enhancement, there is a 3-fold increase in the signal intensity during sulfolane addition.

[Figure 5.9 image: two stacked mass spectra (panels A and B) with labeled charge states 5+ to 13+; x-axis: m/z (600 to 1800); y-axis: relative intensity.]

Figure 5.9. Mass spectra of ubiquitin with sulfolane delivery in the vapor phase. Mass spectra of ubiquitin obtained with (A) no reagent and (B) 300 mM sulfolane delivered in vapor phase. In addition to charge enhancement, the signal intensity doubles during sulfolane addition.

[Figure 5.10 image: two stacked mass spectra (panels A and B) with labeled charge states 11+ to 23+; x-axis: m/z (600 to 1600); y-axis: relative intensity (x10^7).]

Figure 5.10. Mass spectra of myoglobin with sulfolane delivery in the vapor phase. Mass spectra of myoglobin obtained with (A) no reagent and (B) 300 mM sulfolane delivered in vapor phase. There is a slight increase in the average charge from 17.6 to 18.2 upon sulfolane addition.

CHAPTER 6

Conclusions and Future Directions

6.1. Overview

The top-down methods I have presented in this thesis are efficient approaches for high throughput protein identification in the chromatographic timescale. In the current methods used for top-down protein identification, instruments with dual mass analyzers are used to measure the intact and fragment ions separately. By taking advantage of the resolution and mass accuracy available in the TOF mass analyzer, we have combined it with in-source dissociation of proteins eluting on-line from an LC column and demonstrated that it is possible to utilize a stand-alone TOF instrument for the purpose of protein identification.

Additionally, in the current top-down methods, the instrument time spent in isolating and fragmenting intact protein charge states decreases the duty cycle of the instrument. To overcome this disadvantage, we have developed a new data-independent protein identification method known as CASI-CAD. In this technique, multiple intact protein charge states are selected and dissociated simultaneously inside the collision cell. This greatly reduces the time it takes to acquire the fragment ion data from the multiple charge states selected.

To increase the efficiency of intact protein measurement and also fragmentation of protein precursor charge states in the chromatographic timescale, we have developed techniques to introduce supercharging reagents into the electrospray plume in the liquid and vapor phases. In addition to enhancing charge, these reagents are promising in that they can increase the signal intensity for proteins eluting from a column.


6.2. Top-Down protein identification using a single-stage LC-ESI-TOF MS

In Chapter 2, I describe a high throughput platform for protein identification using a stand-alone TOF instrument and the human 20S proteasome as a model complex. The protein subunits in the proteasome complex elute on-line from an LC column and undergo fragmentation in the nozzle-skimmer (NS) region. Both the intact and fragment ion masses are measured using the TOF mass analyzer.

Many research groups have infused targeted proteins and performed NS dissociation in instruments with a single mass analyzer, employing it as a pseudo tandem MS method. However, when combined with an up-front protein separation technique, it is an inexpensive solution to identify proteins in complex samples. The highlight of the method is that the NS voltage can be switched from low to high every second, enabling both intact and fragment ion mass measurements in the same run. In addition, the resolving power of 10,000 at an m/z of 1000 available in this instrument is sufficient to precisely assign the multiply charged fragment ions up to 6+ during data interpretation. Furthermore, the advantage of using the TOF instrument is that the scan speed is high enough to preserve the chromatographic resolution.
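The link between resolving power and the highest assignable fragment charge can be sketched as follows. This is a rough estimate under stated assumptions, not the criterion used in Chapter 2: resolving power is taken as (m/z)/FWHM, adjacent isotope peaks of a z+ ion are spaced by about 1.00335/z in m/z, and the requirement that this spacing be at least 1.5 times the peak width is an illustrative choice of mine. The function names are hypothetical.

```python
import math

ISOTOPE_SPACING_DA = 1.00335  # approximate 13C - 12C mass difference


def peak_width(mz, resolving_power):
    """FWHM in m/z units, taking resolving power as (m/z) / FWHM."""
    return mz / resolving_power


def max_resolvable_charge(mz, resolving_power, separation_factor=1.5):
    """Largest charge z whose isotope spacing (about 1.00335 / z) is at
    least separation_factor times the FWHM.

    separation_factor is an assumed criterion chosen for illustration.
    """
    fwhm = peak_width(mz, resolving_power)
    return math.floor(ISOTOPE_SPACING_DA / (separation_factor * fwhm))


# Resolving power of 10,000 at m/z 1000, as quoted for the TOF instrument.
print(max_resolvable_charge(1000.0, 10_000))
```

With these assumptions the estimate at m/z 1000 is 6, in line with the 6+ limit stated in the text; a looser criterion (separation_factor of 1.0) would give 10, so the result depends strongly on how "resolved" is defined.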

The method was successful in that the intact masses of all the subunits were obtained and short stretches of amino acid sequence information were obtained for ten subunits. Four proteins co-eluted with other proteins and sequence information was not obtained for these co-eluting proteins. In addition, by comparing the intact MWs obtained to the theoretical MWs in the database, the presence of a phosphorylation in one of the subunits was identified. Finally, the N-terminal modifications in all the subunits were identified by utilizing the intact mass obtained and fragment ions originating from the N-terminus of the subunits.

While this method is highly useful for identification of proteins in samples with moderate complexity, the upper limit on the number of proteins identified using this method is dependent on the peak capacity of the LC column. Co-eluting proteins often present a significant problem in reversed phase LC, and when a data-independent technique such as NS dissociation is chosen for fragmentation, fragments from the co-eluting proteins are present together in the mass spectrum, posing additional problems during data interpretation. For these situations where peak capacity is decreased due to co-elution, additional up-front fractionation techniques are required. Secondly, with the resolving power available in this instrument, fragment ions up to 6+ charges can be unambiguously assigned. This translates to identifying the sequence information pertaining to the N- and C-terminal regions of the protein. While this is sufficient for identifying non-homologous proteins, for proteins which have a relatively high degree of sequence similarity at one or both termini, higher resolution instruments such as FT-ICR MS or orbitrap are necessary. With the help of higher resolution, the higher m/z fragments originating from cleavages in the middle of the protein can be identified and this helps with identifying homologous proteins.

6.3. Top-Down protein identification using data-independent fragmentation in FT-ICR MS

In Chapter 3, I describe a new data-independent fragmentation method for selecting multiple charge states of the protein and fragmenting all simultaneously in the collision cell. A majority of current high throughput top-down protein identification methods employ CAD for proteins less than 20 kDa and NSD for proteins with higher MW. While NSD is a data-independent dissociation method, CAD is performed by isolating the precursor charge states and fragmenting each separately in the collision cell, followed by merging the fragment ions during post-processing. There are several disadvantages to employing data-dependent CAD. First, the duty cycle of the instrument is lowered, particularly in high resolution instruments such as orbitraps and LTQ-FTs, when fragment ion data is set to be acquired at a high resolution. As it takes more time to acquire the fragment ion data, the subsequent proteins eluting from the column are not adequately sampled and this lowers chromatographic resolution.

Next, during data-dependent CAD, usually the top three most abundant charge states of the protein are selected for fragmentation. Many reports have shown that different precursor charge states undergo cleavage through different channels and it is important to fragment multiple charge states to get comprehensive sequence coverage. In the new data-independent fragmentation method, 'CASI-CAD', multiple charge states were isolated and fragmented by selecting a broad mass window of 800-1200 m/z in the quadrupole. This eliminates the need to isolate the precursors based on data during acquisition and also yields fragmentation from multiple charge states simultaneously. In addition, the time it takes to acquire a scan is 0.63 second, which is two-fold shorter than the current data-dependent cycle time in the orbitrap set at the same resolution.
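To illustrate why a single 800-1200 m/z window captures many precursor charge states at once, the following minimal sketch lists the charge states of a protein whose m/z values fall inside the window. The 25 kDa neutral mass is a hypothetical value chosen for illustration, the function names are my own, and the m/z of each charge state is computed as (M + z × proton mass)/z for positive-mode protonation.

```python
PROTON_MASS = 1.007276  # Da


def mz(neutral_mass, z):
    """m/z of the [M + zH]z+ ion."""
    return (neutral_mass + z * PROTON_MASS) / z


def charge_states_in_window(neutral_mass, low=800.0, high=1200.0, max_z=100):
    """Charge states whose m/z lies inside the isolation window."""
    return [z for z in range(1, max_z + 1) if low <= mz(neutral_mass, z) <= high]


# Hypothetical 25 kDa protein, for illustration only.
states = charge_states_in_window(25_000.0)
print(states[0], states[-1], len(states))
```

For this hypothetical mass, charge states 21+ through 31+ fall inside the window, so eleven precursors would be fragmented together without any data-dependent selection.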

When comparing CASI-CAD with the widely adopted NSD, the data from the proteasome complex show that for proteins which co-elute, CASI-CAD yields more fragment ions than NSD. Typically, when two proteins co-elute from the column, the ionization efficiency of one of the two proteins is lower and similarly, during fragmentation, one of the two proteins in the pair yields only a few fragment ions. When CASI-CAD is used in this situation, the mass filter set in the quadrupole selects only the ions within the 800-1200 m/z range for measurement as opposed to a broadband spectrum. The fragment ions that fall within this mass range are measured more efficiently than the intact protein ions. Consequently, for the co-eluting proteins, the CASI-CAD method picked up additional fragment ions.

Furthermore, as an additional data-independent dissociation technique, the fragment ions generated after NSD were enriched by using the CASI technique. By filtering the fragment ions in the 800-1200 m/z range after FS-CAD, more sequence coverage was obtained for 8 subunits in the complex. By combining the fragmentation efficiency of the FS-CAD technique with the fragment ion selectivity of the CASI principle, more sequence coverage was obtained than using either technique alone. This new method is an efficient addition to the top-down data-independent acquisitions, and can be employed in other instrument types such as LTQ-FT-ICR MS by setting a mass range in the LTQ before fragment ions are measured in the FT mass analyzer.

In both CASI-CAD and FS-CAD followed by CASI enrichment, ion accumulation time in the quadrupole was increased from 0.2 second to 0.4 and 0.6 second to find if sequence coverage can be increased by accumulating the fragment ions for a longer time. For small molecule analysis, there are studies where the signal from an isolated small molecule ion is accumulated by CASI to increase the signal intensity before fragmenting by CAD. While this is advantageous for small molecules where only one or a few related ions are isolated, this strategy was not helpful in protein analysis where multiple charge states within the 800-1200 m/z window are selected. By increasing the accumulation time to 0.4 second, the sequence coverage was the same as that of 0.2 second. When 0.6 second accumulation was used, the number of fragment ions decreased and this can be attributed to a larger number of charges being present in the collision cell for a longer duration, leading to the space charge effect. A decrease in the chromatographic resolution (increase in the peak width and decrease in the total ion current) was also observed at 0.6 second accumulation.

When this method is compared to the NSD in the TOF instrument, there are several advantages. First, the mass accuracy of the intact proteins significantly improved and all the subunits were measured with a deviation of 1 ppm using the ultra-high resolution of the FT-ICR instrument. This is a significant advantage when complex cell lysates are analyzed since unambiguous identifications during database searches are possible by improved mass accuracy. In addition, when high MW proteins are analyzed, increased mass accuracy plays a significant role in differentiating small differences such as amidation/deamidation (mass difference of 1 Da) and disulfide-intact/reduced states (approximately 2 Da difference for every disulfide bond reduced, since two hydrogen atoms are added). These small changes are very common in therapeutic proteins such as antibodies and currently, these changes are measured by peptide mapping where the antibody is digested with a proteolytic enzyme and the digested sample is run through a column followed by characterizing the peptides. While this peptide mapping strategy is necessary when characterizing the site of modifications initially in the antibody, for routine stability/efficacy testing, this is not required. By measuring the intact mass accurately, the steps involved in digestion and characterizing the peptides can be avoided.
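The relation between a ppm-level mass accuracy and a roughly 1 Da shift can be made concrete with a short sketch. The masses used (25 kDa and 150 kDa, the latter being a typical antibody size) are illustrative assumptions, the deamidation shift is the monoisotopic value of about 0.984 Da, and the function names are my own.

```python
DEAMIDATION_SHIFT_DA = 0.984016  # monoisotopic Asn -> Asp mass shift


def tolerance_da(mass_da, tol_ppm):
    """Absolute half-width in Da of a +/- tol_ppm mass tolerance."""
    return mass_da * tol_ppm * 1e-6


def shift_in_ppm(shift_da, mass_da):
    """A fixed mass shift expressed in ppm of the protein mass."""
    return shift_da / mass_da * 1e6


# Illustrative masses only: a 25 kDa subunit and a 150 kDa antibody.
for mass in (25_000.0, 150_000.0):
    print(mass, tolerance_da(mass, 1.0),
          round(shift_in_ppm(DEAMIDATION_SHIFT_DA, mass), 2))
```

The same 0.984 Da shift shrinks from roughly 39 ppm at 25 kDa to under 7 ppm at 150 kDa, which is why mass accuracy at the ppm level matters more as the protein gets larger.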

In the TOF instrument only ions up to 6+ charge states were assigned and this limited the sequence coverage to the terminal regions of the protein. On the other hand, the resolution available in the FT-ICR instrument allowed assigning high MW fragment ions which originated from cleavages in the middle of the protein. This is advantageous in situations where a protein could be potentially modified at one of the several possible sites by the same modification. For instance, if a protein’s intact MW observed and the theoretical MW in the database are different by 42 Da, it is evident that an acetylation is present, but given that there are several sites where this modification could reside (in the N-terminal methionine/alanine residues or any lysine residue), unambiguous characterization is not possible. By obtaining more sequence coverage pertaining to residues in the middle of the protein, it is possible that the site of modification could be identified. However, this also depends on the type of fragmentation used, since in CAD, labile PTMs are not preserved.
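The reasoning from an intact mass difference to a candidate modification can be sketched as a simple lookup. This is a minimal illustration, not the database search used in this work: the table holds a few well-known monoisotopic shifts, while the 0.5 Da tolerance, the example masses, and the function name are assumptions made for the example.

```python
# Monoisotopic mass shifts (Da) of a few common modifications.
KNOWN_SHIFTS = {
    "acetylation": 42.0106,
    "phosphorylation": 79.9663,
    "methylation": 14.0157,
    "oxidation": 15.9949,
}


def candidate_modifications(observed_mw, theoretical_mw, tol_da=0.5):
    """Names of modifications whose shift matches the observed mass
    difference within tol_da (an assumed tolerance)."""
    delta = observed_mw - theoretical_mw
    return sorted(
        name for name, shift in KNOWN_SHIFTS.items()
        if abs(delta - shift) <= tol_da
    )


# Hypothetical masses: the observed MW is 42 Da above the database MW.
print(candidate_modifications(25_042.0, 25_000.0))
```

A lookup of this kind only names the modification; as noted in the text, locating it on the sequence still requires fragment ions that bracket the modified residue.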

In the TOF and FT-ICR data, the presence of phosphorylated α3 subunit was identified by matching the intact MW obtained with the phosphorylation reported in the database for this subunit. This is an important advantage available in the top-down technique. Though the site of modification was not identified in the data from either instrument platform, determining the presence of the modification itself is a significant achievement and with further selective enrichment for phosphoproteins, the site of modification can be identified. In addition, the nature of the N-terminus was identified by both the methods (TOF and FT-ICR) for all the subunits. This was possible by identifying the fragments originating from the N-terminus for proteins that eluted singly. For co-eluting proteins, since the efficiency of fragmentation is low, the N-terminal fragment ions were not present. However, when the accurate intact mass obtained was compared to the theoretical mass in the database, the N-terminal modification was identified in these proteins.

Moreover, the intensity of the phosphorylated α3 subunit was found to be higher than that of its non-phosphorylated counterpart. While quantifying intact proteins using MS is in the primitive stages of method development currently, it holds promise for the future. Also, by combining MS with CE, instead of RPLC, the modified and unmodified forms of the protein can be separated before MS analysis, which gives a greater chance to identify the modification by fragmenting the modified form separately from the unmodified form.

Finally, in both TOF and FT-ICR data, internal fragment ions were present and these were not interpreted. The current data processing tools do not consider internal fragments when comparing the observed and theoretical fragments. Further improvement in the search tools to look for internal fragment ions after generating a list of potential protein hits will be very valuable to interpret more fragment ions and thereby increase the sequence coverage.

6.4. Supercharging reagents for enhancing charge during protein elution from HPLC

In Chapters 4 and 5, I present the results from supercharging targeted proteins by direct infusion and delivering the reagent for proteins eluting from an LC column. Supercharging is a recent addition to the field of top-down mass spectrometry and only a few groups have tested these reagents for a few standard proteins. In order to better understand the mechanism behind supercharging and to develop methods for utilizing these reagents efficiently in top-down studies, these reagents have to be tested with various classes of proteins in different solvent conditions.

There are several conclusions from supercharging targeted proteins. First, higher degree of supercharging was observed for medium sized disulfide-intact proteins while disulfide- reduced counterparts did not further increase in charge. This leads to questioning if the number of basic sites and glutamine residues (from supercharging disulfide-reduced lysozyme, the maximum charge happens to correlate to the number of basic sites and glutamine residues present) determines the number of charges that can be attained. In addition, supercharging proteins that are already in the denatured state addresses the important controversy of the mechanism behind supercharging. While it could be possible that proteins in the native state can tend to become denatured while supercharging, charge enhancing the proteins that are already in the denatured state can be explained by examining other factors such as reagent basicity and direct interaction of the reagent with the solvent. Similarly, for higher MW disulfide bonded proteins such as BSA and transferrin, the supercharging effect might be limited by electrostatic repulsion resulting from the size/charge ratio.
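The proposed correlation between maximum charge and the number of basic sites plus glutamine residues can be checked for any sequence with a short count. This is a minimal sketch under stated assumptions: basic sites are taken as Lys, Arg and His plus the N-terminal amine (counting the N-terminus is my assumption), and the sequence shown is a made-up placeholder, not lysozyme.

```python
BASIC_RESIDUES = set("KRH")  # Lys, Arg, His


def count_basic_sites(sequence, include_n_terminus=True):
    """Number of basic residues, optionally plus the N-terminal amine."""
    count = sum(1 for residue in sequence.upper() if residue in BASIC_RESIDUES)
    return count + (1 if include_n_terminus and sequence else 0)


def count_glutamine(sequence):
    """Number of glutamine (Q) residues."""
    return sequence.upper().count("Q")


# Made-up placeholder sequence for illustration only.
example = "MKRHQAQK"
print(count_basic_sites(example), count_glutamine(example))
```

The sum of the two counts could then be compared with the highest charge state observed for the disulfide-reduced protein.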

Next, acidic disulfide bonded proteins such as β-lactoglobulin A and α-lactalbumin were supercharged efficiently; however, acidic proteins such as calmodulin and actin did not show any supercharging. This brings out the individual roles of acidic amino acids and the presence of disulfide bonds in supercharging. Despite the denaturing solvent conditions, the disulfide-intact architecture might keep the protein in a relatively constrained form and when supercharging reagents are added, enhanced charging is obtained. On the other hand, in acidic proteins without any disulfide bonds, these constraints are not present and so supercharging ability is only limited by the constituent amino acids in the protein. Similarly, in the glycosylated proteins, the net negative charge might impede the supercharging efficiency. For these acidic/glycosylated proteins, sulfolane might not be a suitable reagent. Other reagents such as m-NBA, which is both a weak base and a weak acid, can be used in the negative mode to supercharge these proteins.

Only a few groups have studied the efficiency of supercharging reagents in the LC-MS mode. The results presented in Chapter 5 are exploratory in nature to find the suitable reagent and delivery method for charge enhancing during LC-MS. By studying the efficiency of different reagents, sulfolane was found to be a potent supercharging reagent under denaturing conditions and its efficiency for charge enhancing proteins when eluting from the column was analyzed. By adding it to the LC solvents, the charge enhancing effect of sulfolane was significant, while its reactivity with the active sites in the column limited its usage.

In order to find other efficient methods for reagent delivery, the air filter adapters in the CaptiveSpray source were employed. This is a simple method to deliver the reagent, with only the vapor pressure of the sulfolane limiting the concentration that can be achieved in the spray. Sulfolane has a very high boiling point (287 °C at 760 Torr) and the vapor pressure exerted is consequently low. This might be the reason behind the moderate supercharging effect of sulfolane. Furthermore, to enhance the rate at which sulfolane can volatilize, 100% ACN was used as a delivery vehicle to increase the concentration of sulfolane in the spray. While this method also showed only a moderate increase in supercharging, it increased the signal intensity of the ESI multiple charge states of several proteins and this could be used in situations where TFA is added as an LC modifier. The signal suppression caused by TFA can be negated by delivering sulfolane in the vapor phase.

6.5. Concluding remarks

The goal of the dissertation was to develop high throughput top-down methods to identify proteins in complex samples during elution from an LC column. The methods presented here are valuable additions for the top-down community and can be used for data-independent fragmentation of proteins in the LC timescale. In addition, supercharging reagents can be used as a means to enhance top-down protein identification in the chromatographic timescale.


Neurocrit Care (2010) 12:324–336 DOI 10.1007/s12028-010-9342-5

ORIGINAL ARTICLE

Metabolic Crisis After Traumatic Brain Injury is Associated with a Novel Microdialysis Proteome

R. Lakshmanan • J. A. Loo • T. Drake • J. Leblanc • A. J. Ytterberg • D. L. McArthur • M. Etchepare • P. M. Vespa

Published online: 12 March 2010 © Springer Science+Business Media, LLC 2010

Abstract

Background To examine if the metabolic distress after traumatic brain injury (TBI) is associated with a unique proteome.

Methods Patients with severe TBI prospectively underwent cerebral microdialysis for the initial 96 h after injury. Hourly sampling of metabolism was performed and patients were categorized as having normal or abnormal metabolism as evidenced by the lactate/pyruvate ratio (LPR) threshold of 40. The microdialysate was frozen for proteomic batch processing retrospectively. We employed two different routes of proteomic techniques utilizing mass spectrometry (MS) and categorized as diagnostic and biomarker identification approaches. The diagnostic approach was aimed at finding a signature of MS peaks which can differentiate these two groups. We did this by enriching for intact peptides followed by MALDI-MS analysis. For the biomarker identification approach, we applied classical bottom-up (trypsin digestion followed by LC-MS/MS) proteomic methodologies.

Results Five patients were studied, 3 of whom had abnormal metabolism and 2 who had normal metabolism. By comparison, the abnormal group had higher LPR (1609 ± 3691 vs. 15.5 ± 6.8, P < 0.001), higher glutamate (157 ± 84 vs. 1.8 ± 1.4 μM, P < 0.001), and lower glucose (0.27 ± 0.35 vs. 1.8 ± 1.1 mmol/l, P < 0.001). The abnormal group demonstrated 13 unique proteins as compared with the normal group in the microdialysate. These proteins consisted of cytoarchitectural proteins, as well as blood breakdown proteins, and a few mitochondrial proteins. A unique as yet to be characterized peptide was found at m/z (mass/charge) 4733.5, which may represent a novel biomarker of metabolic distress.

Conclusion Metabolic distress after TBI is associated with a differential proteome that indicates cellular destruction during the acute phase of illness. This suggests that metabolic distress has immediate cellular consequences after TBI.

Keywords Traumatic brain injury · Microdialysis · Proteomics · Metabolic crisis · Biomarkers

R. Lakshmanan · J. A. Loo · A. J. Ytterberg
UCLA Department of Chemistry and Biochemistry, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA

T. Drake · J. Leblanc
UCLA Department of Pathology and Laboratory Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA

D. L. McArthur · M. Etchepare · P. M. Vespa (corresponding author)
UCLA Department of Neurosurgery and Neurology, David Geffen School of Medicine at UCLA, Ronald Reagan UCLA Medical Center, 757 Westwood Blvd, Room, RR 6236A Los Angeles, CA, USA
e-mail: [email protected]

Introduction

Traumatic brain injury (TBI) results in an immediate cellular injury often consisting of a hemorrhagic injury and/or early cell death in a defined region of the brain. Secondary cellular injury resulting in cell loss occurs in regions that were not directly involved in the primary injury. Brain regions that are in metabolic distress during the early post-injury phase are more prone to long-term tissue atrophy [1]. The mechanisms involved that lead to metabolic distress and eventual tissue loss are as yet unknown. Better characterization of the metabolic distress state is needed in order to understand its significance.


Cerebral microdialysis has recently gained increased popularity as a clinical investigation monitor of brain metabolism after TBI [2–4]. Preliminary studies have outlined several neurochemical markers of metabolic distress after injury, including elevated glutamate, elevated glycerol, and elevated lactate/pyruvate ratio as well as low glucose concentrations. These markers tell us about the metabolic state of the tissue, but tell us little about the mechanisms leading to this distress. Conversely, microdialysis also is capable of measuring other non-metabolic compounds that may reflect important mechanisms of how cells survive or die. Proteomics is one method of analyzing microdialysis fluid and it may shed light on important processes associated with metabolic distress in human subjects. Formal assessments of the human proteome after TBI have recently been studied in CSF and blood [5–8], but assessment of changes in proteome based on metabolic state has heretofore not been addressed.

The purpose of this study is to initially characterize the differential proteome after TBI based on the metabolic state of the tissue. The metabolic state of the tissue is defined by the lactate/pyruvate ratio (LPR). In normal metabolic states, the LPR is <25 and in metabolic distress the LPR is >40. We have carefully analyzed five subjects in this report, three of whom are in metabolic distress and two of whom are not. Our proteomic methods enable us to evaluate the presence of protein fragments in the extracellular fluid despite the relatively limited membrane size of the microdialysis probe (i.e., 20 kD). We report the novel finding of a differential proteome including a specific peptide under the various pathological conditions leading to metabolic distress.

Materials and Methods

Patient Population

This study was approved by our institutional review board and was conducted as part of the UCLA Brain Injury Research Center in patients with traumatic brain injury. The main inclusion criteria were (1) a Glasgow Coma Scale (GCS) score of ≤8 or a GCS of 9–14 with computerized tomography (CT) brain scans demonstrating intracranial bleeding, and (2) the need for intracranial pressure monitoring. Surrogate informed consent was obtained. Over 2 years, five consecutive patients were enrolled and subsequently studied with continuous microdialysis monitoring and acute and long-term brain MRI.

TBI Treatment Protocol

This protocol has been previously outlined (9). All patients were admitted to the neurointensive care unit after initial stabilization in the emergency room or after surgery. Craniotomies were performed for evacuation of intracranial mass lesions and hematomas. Intracranial pressure was measured using a ventriculostomy and intracranial pressure (ICP) was maintained below 20 mmHg using a standardized treatment protocol including head of bed up to 30°, mild hyperventilation (PaCO2 = 30–35 mmHg), external ventriculostomy with cerebrospinal fluid (CSF) drainage, moderate sedation with low doses of propofol, euglycemia (90–120 mg/dl), and maintenance of mild hypernatremia (sodium = 140–145 mmol/l). Refractory intracranial pressure was managed by pentobarbital-induced burst suppression coma. Cerebral perfusion pressure (CPP) was kept above 60 mmHg with volume repletion and vasopressors. Continuous jugular venous saturation was monitored with SJO2 kept 60–70% through adjustments in CPP. Continuous electroencephalographic (EEG) monitoring was performed to assess the presence of seizure activity or to monitor the barbiturates effect when burst-suppression was induced.

Cerebral Microdialysis

The microdialysis catheter (10-cm flexible shaft, 10-mm membrane length, 20-kD cutoff, CMA microdialysis AB, Solna, Sweden) was inserted via a twist-drill burr hole adjacent to the preexisting ventriculostomy at a depth of 2–3 cm under the skin and at an angle of 30° anterior and lateral to the ventriculostomy catheter so that its tip was located in the frontal white matter. The microdialysis catheter was inserted in the right frontal lobe in 4/5 patients, with one placed on the right temporal lobe after removal of a traumatic hematoma on that side. The location of the microdialysis catheter was confirmed by computerized tomographic (CT) scan to be present in normal appearing white matter (Fig. 1). The microdialysis catheter was tunneled 3 cm under the skin and secured to the scalp with sutures. It was then attached to the perfusion pump (CMA 106 microdialysis pump, CMA microdialysis AB, Solna, Sweden). Normal saline was perfused through the catheter at a rate of 1 μl/min. Fluid in the samples was collected every hour, analyzed immediately on the CMA 600 for glucose, lactate, pyruvate, and glycerol and frozen in dry ice or in a freezer. We collected microdialysis hourly samples starting from post-injury hour 20 through post-injury hour 140. Repeat analysis was conducted within 4 weeks with stability of the results demonstrated for all analytes (R² = 0.92). The concentration of four analytes (glucose, lactate, pyruvate, and glutamate) was measured in the automated analyzer (CMA 600 microdialysis analyzer, CMA microdialysis AB, Solna, Sweden). The concentration of each analyte was measured twice and the mean value used in the final analysis. Values were considered


abnormal as follow: glutamate values above 5 μmol/l, lactate/pyruvate ratio (LPR) above 40, glucose values <0.20 mmol/l, and glycerol >50 μmol/l (16). Patients with LPR > 40 were designated as "abnormal" samples (Abn1, Abn2, Abn3) and patients with LPR < 40 were designated as "normal" samples (Nor1 and Nor2). For the proteomic studies outlined below, representative hourly samples of microdialysate from each patient were taken once every 4 h from the start of monitoring up to 84 h after injury or until the end of microdialysis monitoring.

Fig. 1 The computerized tomographic (CT) and magnetic resonance imaging of each patient. The three patients with elevated LPR are labelled as abnormal and the two with normal LPR are labelled normal. The arrow shows the locations of the microdialysis and the ventriculostomy (EVD). Each pair of images is labelled by patient

Imaging and Volumetric Analysis

Acute computerized tomographic (CT) imaging was performed using 0.5-cm slice thickness, non-contrast axial images. Acute and magnetic resonance imaging (MRI) was performed using Volumetric T1 Magnetization Prepared Gradient Echo (MPRAGE: TR 1900 ms, TE 3.5 ms, FOV 256 × 256, 1-mm slice thickness), axial fluid inversion recovery (FLAIR: TR 9590 ms, TE 70 ms, FOV 512 × 384, slice thickness 3 mm), axial gradient recalled echo axial (GRE: TR 1500 ms, TE 7 ms, FOV 512 × 384, slice thickness 3 mm). Microdialysis catheter location was verified on CT or MRI images. In normal subjects, the microdialysis catheter was contained within normal appearing white matter on MRI.

Proteomic Methods

Peptide Profiling with Magnetic Beads and Matrix-Assisted Laser Desorption Ionization Time-Of-Flight (MALDI-TOF) MS

Peptides in the hourly microdialysate samples from each patient were enriched by binding to C18 magnetic beads. 5 μl of microdialysate sample was diluted with 65 μl of

binding solution which has 5% acetonitrile (ACN) and 0.1% formic acid (FA), and mixed with 10 μl of 25 mg/ml C18 Dynabeads® (Invitrogen, Carlsbad, CA, USA). The samples were incubated in a shaker for 20 min. The magnetic beads with the bound peptides were focused by placing a magnet externally to the sample tubes, and the supernatant which has the salts and other contaminants was removed. The beads were washed thrice with 80 μl of the binding solution. To elute the bound peptides, 10 μl of 50% ACN was added to the beads and mixed well. Approximately 5 μl of elute was mixed with 5 μl of MALDI matrix (7 mg/ml α-cyano-4-hydroxycinnamic acid in 70% ACN and 0.1% trifluoroacetic acid), and 2 μl was spotted onto the MALDI target plate in duplicates and analyzed by MALDI-MS (PerkinElmer-Sciex prOTOF MALDI-TOF mass spectrometer, Shelton, CT, USA). The instrument was calibrated externally using a peptide mixture (PepMix1, LaserBio Labs, Sophia-Antipolis Cedex, France) which has five different peptides with mass ranging from m/z 1000 to 2500. The laser intensity was set at 50%, and the acquisition mass window ranged from m/z 1500 to 15000. Spectra were collected using an automated data acquisition tool in the software, which irradiated each sample well with 100 shots of laser at 32 different spots and the data from all these spots were added to generate the accumulated mass spectrum. Mass spectra obtained from both abnormal and normal samples were compared to find peaks that are different between these two sample groups. Progenesis PG600 software (Nonlinear Dynamics, Newcastle upon Tyne, UK) was used to recognize the differential peaks. PG600 aligns all the spectra from both groups and displays a 'Simulated gel view'.

Pooling Microdialysate Samples Initially, from each patient, 60 μl of microdialysate sample was collected every hour and from this 20 μl was removed for small molecule analysis and profiling with C18 beads. For further identification of proteins and peptides in the microdialysate, all the hourly timepoints were pooled to have one representative sample for each patient.

Tandem MS on Endogenous Peptides To obtain sequence information of the peptides in the microdialysate sample, tandem MS (MS/MS) of the pooled microdialysate sample using a quadrupole time-of-flight (QTOF) mass spectrometer (Waters Synapt HDMS, Manchester, UK) was performed. Pooled samples were concentrated using our in-house made C18 microcolumns according to Gobom et al. [9]. In short, the microcolumns were fabricated with gel loading pipette tips (Eppendorf, Westbury, NY, USA) and Poros R2 (C18) stationary phase (Applied Biosystems, Foster City, CA, USA). After constricting the ends of the gel loading tip, C18 material was packed on these tips to achieve a 3–4-mm column length and equilibrated with 20 μl of 5% FA. 15 μl of each of abnormal subject #3 (Abn3) pool and Nor pool were diluted with 20 μl of 5% FA and loaded slowly on the column. The column was then washed with 20 μl of loading solvent and eluted directly into electrospray ionization (ESI) needles with 1 μl of 50% methanol and 1% FA. The collision energy for collisionally activated dissociation (CAD) MS/MS and the acquisition time were optimized for each targeted peptide to generate MS/MS spectra of sufficient quality. The spectra were processed using both MassLynx and MaxEnt3, and Mascot Distiller (Matrix Science, London, UK) to generate mass lists to search an in-house version of the Mascot search engine (Matrix Science). Additionally, the spectra were interpreted manually and the derived sequences searched against the NCBI database using BLAST.

Protein Digestion with Trypsin To identify all the proteins present in the microdialysate, a classical proteomic bottom-up approach was undertaken. Protein concentration in the pooled samples was determined using MicroBCA protein assay (Pierce, Rockford, IL, USA). The protein concentration ranged from 0.15 to 0.19 μg/μl in the abnormal samples and from 0.08 to 0.13 μg/μl in the normal samples. To 10 μg of protein from each patient, 25 μl of 100% dimethyl sulfoxide (final concentration 10%) was added to denature the proteins. It was then digested with 0.5 μg of trypsin (Promega, Madison, WI, USA) in 50-mM ammonium bicarbonate buffer at 37°C overnight. After digestion, the samples were dried down, resuspended in 10 μl of 5% FA, and desalted using Omix C18 tips (Varian, CA, USA).

Protein Identification by Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS) LC-MS/MS was performed with a QTOF mass spectrometer (QSTAR Pulsar XL, Applied Biosystems, Foster City, CA, USA) equipped with a nanoelectrospray interface (Protana, Odense, Denmark) and a nano-HPLC system (Dionex/LC Packings, Sunnyvale, CA). For each LC-MS/MS run, 2 μg of the trypsin digest was resuspended in 6 μl of 0.1% FA and injected into the nano-LC equipped with a homemade precolumn (150 μm × 5 mm) and an analytical column (75 μm × 150 mm) packed with Jupiter Proteo C12 resin (particle size 4 μm, Phenomenex, Torrance, CA). The precolumn was washed with the loading solvent (0.1% FA) for 4 min before the sample was injected onto the LC column. The eluants used for the LC were 0.1% FA (solvent A) and 95% ACN containing 0.1% FA (solvent B). The flow rate was 200 nl/min, and the following gradient was used: 3–21% B in 36 min, 21–35% B in 14 min, 35–80% B in 4 min, and maintained at 80% B for 10 min. The column was finally equilibrated with 3% B for 16 min prior


to the next run. All five patient samples were analyzed in triplicate. The tandem MS results from both the sample groups were searched against various databases (IPI human version 3.39, MSDB_06 and SwissProt) using MASCOT (Matrix Science) sequence search algorithm allowing for acetyl (N-term), formyl (N-term), oxidized methionine, carbamidomethylated cysteine, pyro-Glu (cyclization of N-terminal glutamine and glutamate), methyl (C-term), methylated aspartate and glutamate. The mass tolerance was set to 0.3 Da for both precursor and fragment ions. Also, a semi-tryptic search against IPI human database was included to account for endogenous peptides that were cleaved by trypsin. The results from all the searches were merged and recalibrated using our in-house developed Perl scripts. The query lists generated from the search results were converted to peptide/protein lists and grouped according to proteins that share peptides. Finally, the following criteria were used for generating a list of proteins that is present in each sample replicate:

(1) The protein should be identified in at least two replicates (out of 3 total) in each patient.
(2) If it is a single peptide match in the protein identified, peptide score should be more than 35.
(3) If there are two or more peptide matches, peptide score should be more than 20.

Phosphopeptide Enrichment with TiO2 Microcolumns Pooled abnormal and normal samples were enriched for phosphopeptides using in-house made titanium dioxide (TiO2) microcolumns. These microcolumns were made on P10 pipette tips using plugs of 3M Empore C8 extraction disks (3M Bioanalytical Technologies, St. Paul, MN, USA) to close the end of the tips. A 3–4-mm column length of TiO2 stationary phase was made using 4-µm TiO2 (GL Sciences Inc., Torrance, CA, USA). The column was equilibrated with 30 µl of 80% ACN, 3% trifluoroacetic acid. The samples were diluted four times in the same solvent used for equilibration and loaded slowly on the column, followed by a wash with the same solvent. The peptides were eluted step-wise using 2 µl of ddH2O, 10 µl of 1% ammonium hydroxide, and 2 µl of 40% ACN, 0.1% FA. 0.5 µl of elute and 0.5 µl of 2,5-dihydroxybenzoic acid (DHB) matrix (10 mg/ml in 25% ACN, 15% isopropanol, 0.5% phosphoric acid) were spotted using dried droplet method on a MALDI target plate coated with lanolin. The spots were analyzed using MALDI-TOF MS (Voyager DE-STR, Applied Biosystems, Foster City, CA, USA).

Dephosphorylation of Phosphopeptides Using Phosphatase The phosphopeptides eluted from TiO2 microcolumns were dephosphorylated to confirm the phosphorylation. The peptides were dephosphorylated at pH 7.8 according to Larsen et al. [10]. Briefly, 1 µl of the TiO2 purified phosphopeptides was added into a plug of 20 µl of 0.05-U alkaline phosphatase in 50-mM ammonium bicarbonate in a C18 microcolumn. The top of the column was covered with parafilm and column was incubated at 37°C for 1 h. 10 µl of 5% FA was added to stop the reaction by acidification and the peptides were loaded onto C18 material. After washing with 5% FA, the peptides were eluted using 1 µl of DHB matrix and spotted on the MALDI target plate for analysis.

Phosphopeptide Identification by Tandem MS Using Synapt HDMS To obtain phosphopeptide sequences, they were further purified using R3 oligo resin (Applied Biosystems) and fragmented using a QTOF mass spectrometer (Waters Synapt HDMS). Briefly, 1 µl of phosphopeptides eluted from TiO2 microcolumns was diluted with 20 µl of 5% FA and loaded slowly onto R3 oligo stationary phase, washed with 20 µl of 5% FA and eluted with 1 µl of 50% methanol and 1% formic acid directly into ESI needles. Spectra were acquired manually on the Synapt HDMS as described above.

Results

Clinical Characteristics

Five TBI patients were studied with microdialysis. Three subjects had metabolic distress and two did not. The injury characteristics are outlined in Table 1. The three patients with metabolic distress had the microdialysis probe in injured tissue, and had elevated ICP, whereas the remainder had microdialysis probe in normal appearing tissue. The injured tissue consisted of a secondary ischemic stroke due to traumatic internal carotid artery dissection, pericontusional tissue near an evacuated temporal contusion and hypoxic ischemic tissue due to hypotensive shock after TBI. Figure 1 outlines the probe location and tissue typing in the five patients.

Metabolic Profile of Microdialysate

Three patients had persistent metabolic crisis during the observation period (Abnormal 1–3). The microdialysate data were analyzed using the mixed effects model for group, time, and interaction for group × time. During this period, the abnormal group had higher LPR (1609 ± 3691 vs. 15.5 ± 6.8, P < 0.001), higher glutamate (157 ± 84 vs. 1.8 ± 1.4 µM, P < 0.001), and lower glucose (0.27 ± 0.35 vs. 1.8 ± 1.1 mmol/l, P < 0.001). Glycerol values were similar in the two groups by the mixed model analysis despite higher mean values in the abnormal group (186 ± 120 vs. 21 ± 9, P < 0.11). The temporal pattern


of microdialysis changes indicates that the maximum abnormality in the tissue was present within the initial 48 h after injury. Figure 2 shows the time course for the abnormal and normal patients. Table 2 outlines the mean microdialysis results for each group.

Initial Peptide Profiling Using MALDI-TOF MS

Using Progenesis PG600 to generate simulated gel views, the abnormal and normal patient profiles were reviewed. Each horizontal row in this gel view represents a mass spectrum obtained from the fluid collected at a certain time point (hours after injury) and profiled using magnetic beads coupled with MALDI-MS. The mass spectra of individual time points from all abnormal and normal samples are stacked separately to obtain the gel view shown in Fig. 3a, b, respectively. The mass spectra obtained from all three abnormal patients have a peak at m/z 4733.5. This peak is not present in either of the normal LPR patients.

Identification of Proteins Present in Microdialysate Samples

A total of 35 proteins were identified by LC-MS/MS in Abn1, 29 proteins in Abn2, and 26 proteins in Abn3. Compared to this, the number of proteins identified in the normal samples was relatively small; eight proteins were identified from Nor1 and nine proteins from Nor2. Venn diagrams that illustrate the number of proteins uniquely present and those that are common between patients in both the sample groups were constructed (Fig. 4). Proteins present in at least two out of three patients in abnormal samples are listed in Table 3 and for normal samples in Table 4. All three abnormal samples have 13 proteins in common and there are 9 other proteins that are present in at least two out of three abnormal samples (Table 3). There are six proteins common to both the normal samples (Table 4), and three of these were present in all three abnormals as well. The proteins unique to each patient in both the sample groups were not included in the analysis because these proteins may represent individual variations specific to each patient. As one can see, there is a greater proportion of structural proteins seen in the metabolic distress group.

Endogenous Peptides Identified by Tandem MS

The fragmentation spectra obtained from tandem MS experiments were processed using designated software and interpreted both manually and using the MASCOT search algorithm. Figure 5 shows the full-scan deconvoluted mass spectrum of pooled Abn3 sample eluted from C18 microcolumn, with the labeled peaks further identified by tandem MS experiments. The ions at m/z 997.49, 1195.62, and 1308.71 matched to a part of hemoglobin beta chain protein sequence with N-terminal variations as shown in Fig. 5 (inset). The m/z 1798.93 ion matched to a different stretch of amino acids within the same protein. The peptide at m/z 3039.54 matched to 30 amino acids from the N-terminus of the hemoglobin alpha chain and the m/z 2022.03 ion matched to 20 amino acids close to the C-terminus of the same polypeptide. Another cluster of four peptides (m/z 3326.67, 3342.66, 3358.64, and 3374.66) differing from one another by 16 Da also matched to the first 32 residues from the N-terminus of hemoglobin alpha chain. The peptides at m/z 3342.66, 3358.64, and 3374.66 are singly, doubly, and triply oxidized forms of the m/z 3326.67 ion, respectively. In contrast, the full-scan mass spectrum of the normal sample (Fig. 6) eluted from C18 microcolumn does not have any of the peptides identified from the abnormal sample.

Phosphopeptide Enrichment, Confirmation, and Identification

Two phosphopeptides at m/z 1545.6 and 1616.6 were both present in all the pooled abnormal (Abn1, Abn2, and Abn3) samples enriched using TiO2 microcolumns (Fig. 7). Neither of these two peaks was present in the pooled Nor1 sample and there is a small peak at m/z 1616.6 in the Nor2 pool (Fig. 7). The dephosphorylation of these two phosphopeptides using alkaline phosphatase showed an 80 Da decrease in mass for both peaks, depicting the loss of the phosphate group (data not shown). The MS/MS spectra of the peptide ion at m/z 1616.6 matched to the published sequence of fibrinopeptide A and the peptide at m/z 1545.6

Table 1 Injury characteristics for each patient. GOSe 12 indicates the Glasgow outcome score extended at 12 months after injury

Subject | Injury | MD location | MD tissue character | Sampling time | ICP elevation | GOSe 12
Abnl 1 | C-TBI | R Temporal | Pericontusional | PIH 43–106 | Y | 4
Abnl 2 | C-TBI | R Frontal | Ischemia | PIH 29–80 | Y | 1
Abnl 3 | GSW | R Frontal | Ischemia | PIH 19–67 | Y | 1
Norm 1 | C-TBI | R Frontal | NAWM | PIH 60–108 | N | 8
Norm 2 | C-TBI | R Frontal | NAWM | PIH 23–111 | N | 8


Fig. 2 The time plot of hourly microdialysis lactate/pyruvate ratio for all five patients. The abnormal patients are solid and normal patients are open circles. Each patient is represented by a continuous stream of data points. The LPR values have been log transformed to be able to plot the time series

Panels: LPR Abnl 1, LPR Abnl 2, LPR Abnl 3 (inset: LPR range 40–200), LPR Norm 1, LPR Norm 2. Axes: LPR versus post injury hour.


Table 2 Microdialysis mean values are shown for each group

Group | Glucose (mmol/l) | Lactate (mmol/l) | Glutamate (µM) | Pyruvate (µM) | L/P ratio | Glycerol (µM) | Urea (µM)
Abnl | 0.27 ± 0.30 | 3.25 ± 2.1 | 161 ± 73 | 23 ± 29 | 1530 ± 2141 | 186 ± 89 | 1.34 ± 1.2
Norm | 2.3 ± 0.74 | 1.1 ± 0.5 | 1.4 ± 0.05 | 89 ± 9 | 13.5 ± 6.4 | 15 ± 7 | 1.37 ± 0.56

The values were all significantly different except glycerol and urea
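Because lactate is tabulated in mmol/l and pyruvate in µM, the lactate/pyruvate ratio needs a unit conversion before dividing. The following is a minimal sketch of that arithmetic, not the analysis used in the study; the function name `lactate_pyruvate_ratio` is hypothetical. The ratio of the group means computed here is not expected to reproduce the tabulated L/P ratio, which was presumably averaged over the individual hourly ratios (an assumption; the text does not state how it was computed).

```python
def lactate_pyruvate_ratio(lactate_mmol_per_l, pyruvate_umol_per_l):
    """Return the dimensionless lactate/pyruvate ratio.

    Lactate is converted from mmol/l to umol/l so both analytes share a unit.
    """
    lactate_umol_per_l = lactate_mmol_per_l * 1000.0
    return lactate_umol_per_l / pyruvate_umol_per_l


if __name__ == "__main__":
    # Group means taken from Table 2 (lactate in mmol/l, pyruvate in uM).
    group_means = {"Abnl": (3.25, 23.0), "Norm": (1.1, 89.0)}
    for group, (lactate, pyruvate) in group_means.items():
        ratio = lactate_pyruvate_ratio(lactate, pyruvate)
        print(f"{group}: ratio of group means = {ratio:.1f}")
```

For the abnormal group the ratio of means is far below the tabulated mean L/P ratio, which is consistent with a few extreme hourly values dominating an average of ratios.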

Fig. 3 a, b The gel view of the mass spectra of abnormal and normal patients. The window shown is from m/z 4500–5000. The hyperdense band at m/z 4733.5 is seen in each of the abnormal patients and in none of the normal patients

also matched to fibrinopeptide A missing the N-terminal alanine residue (Fig. 8).
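The mass differences reported for these peptides can be checked against standard monoisotopic masses. The following is a minimal sketch, not the software used in the study: the function name `annotate_shift`, the 0.1 Da tolerance, and the restriction to three candidate shifts are assumptions made for illustration, and the peaks are treated as singly charged so that m/z differences equal mass differences. The peak values are those reported in the text.

```python
# Monoisotopic masses (Da) of the three mass shifts discussed in the text.
KNOWN_SHIFTS = {
    "phosphate (HPO3)": 79.96633,
    "oxidation (O)": 15.99491,
    "alanine residue (C3H5NO)": 71.03711,
}


def annotate_shift(mass_high, mass_low, tolerance=0.1):
    """Return the name of the known shift matching the difference, or None.

    The 0.1 Da tolerance is an illustrative assumption, not a study parameter.
    """
    delta = mass_high - mass_low
    for name, shift in KNOWN_SHIFTS.items():
        if abs(delta - shift) <= tolerance:
            return name
    return None


if __name__ == "__main__":
    # Phosphopeptide pair enriched on the TiO2 microcolumns (Fig. 8).
    print(1616.6, 1545.6, annotate_shift(1616.6, 1545.6))
    # Hemoglobin alpha cluster: successive peaks differ by about 16 Da.
    cluster = [3326.67, 3342.66, 3358.64, 3374.66]
    for low, high in zip(cluster, cluster[1:]):
        print(high, low, annotate_shift(high, low))
```

Under these assumptions, the 71.0 difference between m/z 1616.6 and 1545.6 matches one alanine residue, and each step in the hemoglobin alpha cluster matches one added oxygen.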

Discussion

The principal findings of this study are (1) the extracellular fluid of TBI patients has a representative proteome that can be detected through microdialysis, (2) the proteome appears to be differentially affected with greater numbers of cytoskeletal proteins in those patients exhibiting extreme metabolic distress, and (3) there appears to be a novel peptide that as yet has not been well characterized. These results indicate that metabolic distress is accompanied by various levels of early proteolysis and that this proteolysis

Fig. 4 A Venn diagram showing the number of unique proteins for each subject and those proteins that were common among the abnormal and normal subgroups. The intersection shows the number of proteins that were in common


Table 3 List of unique proteins present in at least two out of three abnormal samples

Acc. No. | Description | Abn1 Abn2 Abn3 | No. of AA | MW
IPI00410714 | Hemoglobin subunit alpha | X X X | 142 | 15258
IPI00654755 | Hemoglobin subunit beta | X X X | 147 | 15998
IPI00029717 | Isoform 2 of fibrinogen alpha chain | X X X | 644 | 69757
IPI00021885 | Isoform 1 of fibrinogen alpha chain | X X X | 866 | 94973
IPI00021907 | Isoform 1 of myelin basic protein | X X X | 304 | 33117
IPI00220327 | Keratin, type II cytoskeletal 1b | X X X | 644 | 66039
IPI00021440 | Actin, cytoplasmic 2 | X X X | 375 | 41793
IPI00798155 | Ubiquitin | X X X | 106 | 12237
FETUA_HUMAN | Alpha-2-HS-glycoprotein | X X X | 367 | 39325
IPI00643115 | Stathmin | X X X | 116 | 13492
IPI00220828 | Thymosin beta-4 | X X X | 44 | 5053
IPI00220827 | Thymosin beta-10 | X X X | 44 | 5026
IPI00012011 | Cofilin-1 | X X X | 166 | 18502
IPI00025363 | Isoform 1 of glial fibrillary acidic protein | X X | 432 | 49880
IPI00217465 | Histone H1.2 | X X | 213 | 21365
IPI00298497 | Fibrinogen beta chain | X X | 491 | 55928
IPI00021841 | Apolipoprotein A-I | X X | 267 | 30778
IPI00793930 | Tubulin alpha | X X | 335 | 37218
IPI00745872 | Albumin | X X | 609 | 69367
IPI00883934 | Prothymosin alpha | X X | 73 | 8161
IPI00008868 | Microtubule-associated protein 1B | X X | 2468 | 270620
IPI00216461 | Acylphosphatase-2 | X X | 127 | 13891

Listed is the number of amino acids (AA) and the molecular weight (MW) of the native, undigested protein. We identified only fragments of the native protein using the methods outlined above
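The selection behind Table 3 combines the three identification criteria from the Methods with the "at least two out of three samples" rule. The following is a minimal sketch of one reading of those rules, not the authors' in-house Perl scripts: all function names and the example data are hypothetical, and the treatment of multi-peptide matches (every matched peptide must score above 20) is an assumption, since the text does not say whether the threshold applies to each peptide.

```python
from collections import Counter


def passes_score_rule(peptide_scores):
    """Criteria (2) and (3): a single peptide match needs a score above 35;
    two or more matches need scores above 20 (assumed here to apply to each)."""
    if len(peptide_scores) == 1:
        return peptide_scores[0] > 35
    return len(peptide_scores) >= 2 and all(s > 20 for s in peptide_scores)


def identified_in_patient(replicates, min_replicates=2):
    """Criterion (1): keep proteins accepted in at least two of the replicates.

    `replicates` is a list of dicts mapping protein name -> peptide scores.
    """
    counts = Counter()
    for replicate in replicates:
        for protein, scores in replicate.items():
            if passes_score_rule(scores):
                counts[protein] += 1
    return {protein for protein, n in counts.items() if n >= min_replicates}


def shared_by_at_least(sample_sets, k):
    """Return proteins present in at least k of the per-patient sets."""
    counts = Counter(protein for s in sample_sets for protein in s)
    return {protein for protein, n in counts.items() if n >= k}


if __name__ == "__main__":
    # Hypothetical replicate data for one patient (scores are made up).
    patient = [
        {"protein A": [40], "protein B": [22, 31]},
        {"protein A": [25, 28], "protein B": [19, 30]},
        {"protein A": [12], "protein C": [50]},
    ]
    print(identified_in_patient(patient))
    # Hypothetical per-patient protein sets for three samples.
    samples = [{"protein A", "protein B"}, {"protein A"}, {"protein B", "protein C"}]
    print(shared_by_at_least(samples, 2))
```

Counting memberships rather than intersecting sets pairwise keeps the same function usable for the "all three samples" case by passing `k=3`.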

Table 4 List of proteins present in both normal samples

Acc. No. | Description | No. of AA | MW
IPI00220327 | Keratin, type II cytoskeletal 1b | 644 | 66039
IPI00029717 | Isoform 2 of fibrinogen alpha chain | 644 | 69757
IPI00021885 | Isoform 1 of fibrinogen alpha chain | 866 | 94973
IPI00607600 | Amyloid precursor-like protein 1 | 651 | 72247
IPI00429366 | Programmed cell death 1 ligand 2 | 183 | 20867
IPI00470716 | Isoform 2 of neuroendocrine protein 7B2 | 211 | 23659

may be one mechanism through which long-term damage is done by metabolic distress.

Previous Proteomic Studies in TBI

Previously, a few groups have studied the proteome found in human serum [6, 7] or cerebrospinal fluid (CSF) [8, 11, 12] after TBI. These studies have focused on a general description of the proteome or selective study of unique proteins as predictive biomarkers. A variety of cytoskeletal and neuronal proteins such as S100B, neuron-specific enolase (NSE), myelin-basic protein (MBP), and glial fibrillary acidic protein (GFAP) have been detected in these studies [13–15]. Interpretation of proteomic studies on serum is challenging since the proteins or proteolytic fragments from the damaged brain tissue are reabsorbed via the arachnoid villi into the blood [7] and this may lead to additional dilution effects.

Proteomic studies on cerebral microdialysis have been much more limited. The first description of a proteomic study of microdialysis was by Maurer et al. [16], and was done in ischemic stroke patients using two-dimensional gel electrophoresis followed by mass spectrometry; they found 10 proteins unique to the microdialysate sample. Surprisingly, some proteins identified in Maurer’s study were larger than 20 kD, which is the stipulated cut-off of the microdialysate semi-permeable membrane. It could be that the folded three-dimensional structure of some of these


proteins may allow for their retainment in the microdialysate sample. A more likely explanation is that fragments <20 kD of larger proteins are being identified using our methods. A more limited analysis of selective whole proteins in microdialysis, namely amyloid proteins have been done by Brody et al. [17] and Marklund et al. [18], using larger membrane (100 kD) catheters. In the Brody report, amyloid proteins appeared to be more prevalent under conditions of normal LPR and increasingly present as a function of time after injury. Marklund reported the presence of Tau and Aβ42 without specific comment about the state of metabolic distress. In neither study was a comprehensive approach performed to determine the range of potential proteins present nor the comparison of various

Fig. 5 An example of deconvoluted mass spectra in the abnormal patients. The inset shows the list of peptides that matched to hemoglobin beta. Many more peaks are visible as compared with the normal subject in Fig. 6

Axes: relative intensity (%) versus mass (500–7000). Insets: peptides matching hemoglobin alpha (HbA) and hemoglobin beta (HbB). Labeled peaks: HbB 997.49, 1195.62, 1308.71, 1798.93; HbA 2022.03, 3039.54, 3326.67, 3342.66, 3358.64, 3374.66; unlabeled 2532.24, 2797.40, 4733.46.

Fig. 6 An example of deconvoluted mass spectra in the normal patients

Axes: relative intensity (%) versus mass (400–2400). Peaks: 427.32, 451.41, 518.23, 540.21, 608.27.


Fig. 7 The phosphopeptides at m/z 1545 and 1616 aligned from all abnormal and normal samples

Fig. 8 Phosphopeptides at m/z 1616 and 1545 matched to fibrinopeptide A of fibrinogen alpha chain precursor protein. The difference between these two peptides is the alanine residue shown in box

metabolic states. In our study, we sought to explore the range of proteins that occur under conditions of metabolic distress rather than focus on one pathway.

Our main findings are that there is a differential set of proteins present in tissue that is under metabolic distress as compared with tissue that is metabolically more normal. The types of proteins that were found are variable in class but several of these proteins raise the possibility that particular chemical processes are occurring under conditions of metabolic distress. First, we see a wider array of cytoarchitectural proteins (i.e., keratin, actin, tubulin alpha, cofilin) as compared with the normal metabolic state. Stathmin is a proto-oncogene that is important in the disassembly of microtubules in the cytoskeleton and can become active once phosphorylation kinases are upregulated [19]. Thymosin beta 4 is a regulatory protein of actin, thereby regulating intracellular levels of actin. Presumably, metabolic distress leads to upregulation of these kinases and in turn may activate stathmin, leading to microtubular disassembly. It is interesting to consider the presence of cytoskeletal proteins together with the increased level of extracellular glycerol in these samples. Glycerol is a proposed biomarker of cell membrane integrity and increases under conditions of membrane breakdown. It is possible that the observed proteins reflect loss of cytoarchitecture and subsequent necrosis and exudation of cellular contents into the extracellular space. Secondly, we see the presence of acylphosphatase, which is a 98 amino acid enzyme that has a known role in mediating glycolysis, mediates the endoplasmic reticulum’s function in protein folding and unfolding, and can mediate formation of amyloid under experimental conditions. The presence of acylphosphatase may indicate compensatory responses in glycolysis and/or alteration of the protein folding/unfolding/aggregation functions under conditions of metabolic distress. Activation of proteasome pathways may underlie the presence of multiple structural proteins in these samples. Hence, metabolic distress may trigger cytoskeletal disruption leading to the proteome that we have observed. This finding matches well with experimental reports of cytoskeletal disruption after TBI in which the calpain–calpastatin–caspase pathways are activated leading to cell death [20] and with proteomics performed on fresh whole brain pathological specimens after TBI [21].

In addition to cytoskeletal proteins, we found evidence of hematological peptides (hemoglobin subunit alpha and beta, fibrinogen alpha chain, and fibrinopeptide A) under conditions of metabolic distress. The presence of blood in the extracellular space could well explain this finding. However, this also raises the possibility that the presence of blood may have led to cellular toxicity through oxidative stress. Alternatively, microscopic bleeding could have resulted from the demise of tissue and be a secondary event rather than a causative factor. Despite the presence of hematological peptides, the specific proteome in the normal and abnormal subgroups was not representative of the blood proteome after TBI [21]; hence we do not think this is related to blood brain barrier disruption alone.

Metabolic distress after TBI is characterized by a reduction in energy formation that may or may not occur in the context of ischemia. LPR has been correlated regionally with reduced oxidative metabolism [22]. The LPR is a good biomarker of metabolic distress. The duration of elevated LPR correlates with long-term regional atrophy [1]. Elevation of LPR for several days after TBI is associated with more severe atrophy of the frontal lobe, despite normal appearance of the frontal lobe on acute imaging. This suggests that there may be ongoing damage to normal appearing brain over time after the initial impact. The current proteomic data suggest that under extremely high LPR conditions, there is early loss of cellular integrity and proteolysis. This is a novel finding with regards to LPR and helps to underscore the importance of LPR as a biomarker in critically ill patients.

Limitations

This study is limited by several important features. First, the sample size is quite small but this was truly exploratory and a proof of concept rather than a definitive assessment of the proteome. Second, the small membrane pore size limits the sampling to fragments of larger proteins and/or small proteins. Use of a 100-kD probe may be better, but is also fraught with technical issues of requiring the perfusate fluid to contain macromolecules (dextran or albumin) which may in themselves affect proteomic sampling. Hence, the ideal methodology and catheter size for proteomics is unknown. Given the spectrum of sizes and the apparent proteolysis that occurred, rendering fragments of large proteins capable of being dialyzed via the small pore size, we are not certain that a bigger pore size is really necessary. Thirdly, we were not able to identify the novel protein at m/z 4733.5. This analysis requires more fluid than we had remaining. We anticipate needing an additional 600 µL of fluid to do this. Hence, we have more work to do to evaluate if this is truly unique and what potential biological role it plays.

Conclusions

The evaluation of small molecular weight peptides is feasible using cerebral microdialysis. Tissue under metabolic distress displays a reproducible and unique proteome as compared with normal appearing TBI tissue. The proteome suggests ongoing cytoarchitectural disruption that may be mediated by proteasomal activity.


References

1. Marcoux J, McArthur DA, Miller C, Glenn TC, Villablanca P, Martin NA, et al. Persistent metabolic crisis as measured by elevated cerebral microdialysis lactate-pyruvate ratio predicts chronic frontal lobe brain atrophy after traumatic brain injury. Crit Care Med. 2008;36:2871–7.
2. Bullock R, Zauner A, Woodward JJ, Myseros J, Choi SC, Ward JD, et al. Factors affecting excitatory amino acid release following human head injury. J Neurosurg. 1998;89:507–18.
3. Vespa M, Prins M, Ronne-Engstrom E, Caron M, Shalmon E, Hovda DA, et al. Increase in extracellular glutamate caused by reduced cerebral perfusion pressure and seizures after human traumatic brain injury: a microdialysis study. J Neurosurg. 1998;89:971–82.
4. Hillered L, Vespa PM, Hovda DA. Translational neurochemical research in acute human brain injury: the current status and potential future for cerebral microdialysis. J Neurotrauma. 2005;22:3–41.
5. Pineda JA, Wang KK, Hayes RL. Biomarkers of proteolytic damage following traumatic brain injury. Brain Pathol. 2004;14:202–9.
6. Berger RP. The use of serum biomarkers to predict outcome after traumatic brain injury in adults and children. J Head Trauma Rehabil. 2006;21:315–33.
7. Haqqani AS, Hutchison JS, Ward R, Stanimirovic DB. Biomarkers and diagnosis; protein biomarkers in serum of pediatric patients with severe traumatic brain injury identified by ICAT-LC-MS/MS. J Neurotrauma. 2007;24:54–74.
8. Gao WM, Chadha MS, Berger RP, Omenn GS, Allen DL, Pisano M, et al. A gel-based proteomic comparison of human cerebrospinal fluid between inflicted and non-inflicted pediatric traumatic brain injury. J Neurotrauma. 2007;24:43–53.
9. Gobom J, Nordhoff E, Mirgorodskaya E, Ekman R, Roepstorff P. Sample purification and preparation technique based on nano-scale reversed-phase columns for the sensitive analysis of complex peptide mixtures by matrix-assisted laser desorption/ionization mass spectrometry. J Mass Spectrom. 1999;34(2):105–16.
10. Larsen MR, Sørensen GL, Fey SJ, Larsen PM, Roepstorff P. Phospho-proteomics: evaluation of the use of enzymatic de-phosphorylation and differential mass spectrometric peptide mass mapping for site specific phosphorylation assignment in proteins separated by gel electrophoresis. Proteomics. 2001;1(2):223–38.
11. Zuberovic A, Wetterhall M, Hanrieder J, Bergquist J. CE MALDI-TOF/TOF MS for multiplexed quantification of proteins in human ventricular cerebrospinal fluid. Electrophoresis. 2009;30:1836–43.
12. Hanrieder J, Wetterhall M, Enblad P, Hillered L, Bergquist J. Temporally resolved differential proteomic analysis of human ventricular CSF for monitoring traumatic brain injury biomarker candidates. J Neurosci Methods. 2009;177:469–78.
13. Lumpkins KM, Bochicchio GV, Keledjian K, Simard JM, McCunn M, Scalea T. Glial fibrillary acidic protein is highly correlated with brain injury. J Trauma. 2008;65:778–82. discussion 782–774.
14. Zemlan FP, Jauch EC, Mulchahey JJ, Gabbita SP, Rosenberg WS, Speciale SG, et al. C-tau biomarker of neuronal damage in severe brain injured patients: association with elevated intracranial pressure and clinical outcome. Brain Res. 2002;947:131–9.
15. Afinowi R, Tisdall M, Keir G, Smith M, Kitchen N, Petzold A. Improving the recovery of S100B protein in cerebral microdialysis: implications for multimodal monitoring in neurocritical care. J Neurosci Methods. 2009;181:95–9.
16. Maurer MH, Berger C, Wolf M, Futterer CD, Feldmann RE Jr, Schwab S, et al. The proteome of human brain microdialysate. Proteome Sci. 2003;1:7.
17. Brody DL, Magnoni S, Schwetye K, Spinner ML, Esparza TL, Stocchetti N, et al. Science. 2008;321:1221–4.
18. Marklund N, Blennow K, Zetterberg H, Ronne-Engstrom E, Enblad P, Hillered L. Monitoring of brain interstitial total tau and beta amyloid proteins by microdialysis in patients with traumatic brain injury. J Neurosurgery. 2009;110:1227–37.
19. Cassimeris L. The oncoprotein 18/stathmin family of microtubule destabilizers. Curr Opin Cell Biol. 2002;14(1):18–24.
20. Ottens AK, Kobeissy F, Golden E, Ahang A, Haskins W, Chen S, et al. Neuroproteomics in neurotrauma. Mass Spec Rev. 2006;25:380–408.
21. Yang X, Yang S, Wang J, Zhang X, Wang C, Hong G. Expressive proteomics profile changes of injured human brain cortex due to acute brain trauma. Brain Injury. 2009;23:830–40.
22. Vespa P, Bergsneider M, Hattori N, Wu HM, Huang SC, Martin NA, et al. Metabolic crisis without brain ischemia is common after traumatic brain injury: a combined microdialysis and positron emission tomography study. J Cereb Blood Flow Metab. 2005;25:763–74.
