
Fourier Transform Ion Cyclotron Resonance Mass Spectrometry Method Development for Top-Down Protein Fragmentation and Complex Organic Mixture Analysis

by Nicholas Daniel Schmitt

Bachelor of Science, University of Connecticut

A dissertation submitted to

The Faculty of the College of Science of Northeastern University in partial fulfillment of the requirements for the degree of

December 2nd, 2020

Dissertation Committee

Jeffrey N. Agar, Chair

Alexander R. Ivanov

Aron Stubbins

Roger W. Giese

Dedication

To my mother, Paula Schmitt, my father, Thomas Schmitt, my siblings Anthony, Phillip, Joseph (Mar 2nd, 1995 – Jan 10th, 2016) and Anna Schmitt, my fiancée, Aishwarya Mondkar, and my cat, Winni, for all of your support and encouragement. Thank you for being there.


Acknowledgments

I would first like to express my gratitude to Prof. Jeffrey N. Agar for continuous support and guidance during my Ph.D. research. Jeff always encouraged me to learn by diving in and being OK with failure on a first attempt, which taught me to be less critical of myself, helped me gain momentum in the lab, and brought me from mechanically disinclined to mechanically competent. He was the constant and most welcome voice of reason, questioning and refining my ideas, teaching me to always consider the reviewer and average when planning experiments to ensure thoroughness, understanding, and acceptance. I am a greater scientist and person as a result of his mentorship.

I would like to thank my current and former lab-mates: Dr. Joseph P. Salisbury for introducing me to the lab, its ongoing projects, and its instruments; Dr. Jeniffer V. Quijada for teaching me metabolic labeling techniques and Q-ToF mass spectrometry, and for always being supportive and positive; Dr. Daniel P. Donnelly for his extraordinary efforts in keeping our lab functional and running smoothly, and for always providing interesting insights; Richa Sarin for helping me become a better mass spectrometrist, providing useful critiques when asked, and always being supportive; Dr. Krishna Aluri and Dr. Catherine Rawlins for their support and advice through the years; Md. Amin Hossain, Novera Alam, Madison McMinn, and all other current Agar lab-members for their support and critical insights through the final years of my research; and all of the master's and undergraduate students that I have mentored over the years, especially Natalie Leung, Khanh Vu, Joshua Berger, and Sydney Geyer, for their hard work and dedication, and for making me a better mentor and scientist.

Thank you to my thesis committee members and collaborators: Prof. Aron Stubbins, his former post-doc Sasha Wagner and their lab for allowing and helping me to develop my complex organic mixture analysis skills and teaching me important environmental concepts; Prof. Alexander Ivanov and his former research associate Antonius Koller for teaching me Orbitrap mass spectrometry and for their significant contributions to my understanding of separation chemistry; Prof. Nathalie Agar and her lab for teaching me tissue processing and MALDI imaging techniques; Prof. Roger Giese and research scientist Poguang Wang for teaching me GC-MS and small molecule analysis; Somak Ray for his support of all of my data and -omics pursuits and teaching me how to find and use the correct software for a given problem; Jeremy Wolff and everyone else at Bruker for making me a better mass spectrometrist and supporting my research through difficult times learning, mastering and maintaining the FTICR-MS.


Abstract of Dissertation

Mass spectrometry is a versatile technique capable of addressing many scientific queries. As new questions in the field of medicine and disease physiology arise, new analytical methods are needed that can probe the biochemistry, elucidate mechanisms, identify biomarkers, and better inform scientific research direction. This dissertation demonstrates analytical techniques developed to: help illuminate connections between the causes of familial and sporadic amyotrophic lateral sclerosis (ALS) disease progression; better understand antibody recognition as it relates to Cu/Zn superoxide dismutase (SOD1); better characterize proteoform differences in difficult-to-analyze proteins by utilizing internal fragments; more accurately assign cohorts using matrix-assisted laser desorption ionization (MALDI) mass spectrometry imaging (MSI); accurately identify a complex range of metabolites in an automated workflow; and achieve high mass accuracy on trace lipids from complex samples. This dissertation also shows how certain problems or gaps in the literature often need to be considered from a fresh perspective to push fields of research forward. The chapters included here contain articles and excerpts from published and submitted works, as well as some method development details. The focus of chapters 2-4 is the ALS-associated protein SOD1, mutations of which cause 2-7% of ALS cases. SOD1 post-translational modifications are associated, amidst controversy, with 50% of sporadic (idiopathic) ALS. Conformational changes in this protein have been associated with ALS, and the aim of these chapters is to better understand how these changes can be studied to produce more consistent experimental results in a field plagued by contradictions. Chapter five focuses on MALDI-MSI as a tool for single-cell imaging, showing how MSI data could be used to replace and improve upon traditional fluorescence methods for cell-type assignment. The techniques developed in chapter five have the potential to resolve major controversies in SOD1 structural biology (chapter 2) by integrating antibody-based structural characterization (chapter 3) with MS-based structural characterization (chapter 4). Chapter six details additional analytical methods developed mainly for dissolved organic matter (DOM) analysis and high-resolution lipid analysis, in collaboration with research projects in the marine sciences and metabolic pathway physiology. Notably, the methods of chapter six can be applied to improve the analysis of small molecules (e.g. < 700 Da) in general. Collectively, this work exemplifies how mass spectrometry and associated analytical techniques can be applied to solve a wide array of scientific questions.


Table of Contents

Dedication
Acknowledgments
Abstract of Dissertation
Table of Contents
List of Figures
List of Tables
Abbreviations
Chapter 1: Introduction
1.1 Dissertation organization, themes, and contributions
1.2 SOD1-mediated ALS
1.3 Improving the reproducibility of SOD1 immunochemistry
1.4 Eliminating ambiguity in internal fragment ion identification
1.5 Challenges in MALDI-MSI
1.6 High-Resolution FTICR-MS
Chapter 2: Parsing Disease-relevant Protein Modifications from Epiphenomena: Perspective on the Structural Basis of SOD1-Mediated ALS
2.0 Statement of Contribution
2.1 Abstract
2.2 Introduction
2.3 Results & Discussion
2.4 Conclusion & Perspective
2.5 Acknowledgements
Chapter 3: The observation of conformational bias in pan-specific anti-SOD1 antibodies
3.0 Statement of Contribution
3.1 Abstract
3.2 Introduction
3.3 Experimental Procedures
3.4 Results
3.5 Discussion
Chapter 4: Increasing Top-Down MS Sequence Coverage by an Order of Magnitude through Optimized Internal Fragment Generation and Assignment
4.0 Statement of Contribution
4.1 Abstract
4.2 Introduction
4.3 Experimental Procedures
4.5 Conclusions
4.6 Method Development
Chapter 5: Genetically Encoded Fluorescent Proteins Enable High-Throughput Assignment of Cell-cohorts Directly from MALDI-MS Images
5.0 Statement of Contribution
5.1 Abstract
5.2 Introduction
5.3 Experimental Procedures
5.4 Results and Discussion
5.5 Conclusions
Chapter 6: High Resolution Complex Organic Mixture Mass Spectrometry Method Development
6.0 Statement of Contribution
6.1 Introduction
6.2 Dissolved Organic Matter FTICR-MS Method Development
6.3 Lipid Mass Spectrometry Method Development
Chapter 7: Conclusions and Future Directions
References


List of Figures

Figure 2-1. Hypothesized model for SOD1-mediated fALS disease progression

Figure 2-2. Motor neurons expressing wild-type, G93A, and W32F/G93A SOD1 viability

Figure 2-3. Cysteinylation protects SOD1’s Cys111 from oxidative

Figure 2-4. Covalent dimerization of pathogenic G93A variant of SOD1 with DTME

Figure 2-5. SOD1 Protein structure perturbations and stabilizations

Figure 3-1. Western blotting with SOD-100 is biased against the detection of more-folded conformations of SOD1

Figure 3-2. Heat denaturation of SOD1 prevents surface-spreading if transferring proteins to nitrocellulose membrane under high salt conditions

Figure 3-3. Heat/Reduction treatment of nitrocellulose membrane can recover SOD1’s SOD-100 epitope exposure

Figure 3-4. Multiple conformational states of SOD1 are revealed by conformation-sensitive non-quenching intact protein global-HDX

Figure 3-5. In-gel heating and reduction protocol enables normalization of SOD1 epitope presentation for SOD-100

Figure 3-6. In-gel heating and reduction protocol enables cell-based assay detection of crosslink-stabilized SOD1

Figure 4-1. Evaluation of FSD b-ion generation of SOD1G93A as modulated by declustering potential

Figure 4-2. Top: FSD fragmentation of reduced and denatured SOD1G93A at various declustering potentials

Figure 4-3. Internal fragment ion ambiguity as a function of protein size, frameshift assignment and mass accuracy

Figure 4-4. “0 ppm” Ambiguity of three model proteins

Figure 4-5. Pseudo-MS3 analysis of the ambiguous internal fragments

Figure 4-6. Internal fragments enable 100% sequence coverage and disulfide bond localization


Figure 4-7. Internal fragment assignment mapping procedure

Figure 4-8. Proposed workflow for internal fragment assignment

Figure 5-1. The set-up of fiducial “teach points” in flexImaging introduces registration errors that preclude the targeted analysis of individual cells

Figure 5-2. Genetically-encoded fluorescence enables the detection of cell cohorts in situ

Figure 5-3. Isolation of EYFP proteoforms from YFP-16 mouse brains

Figure 5-4. Characterization of EYFP

Figure 5-5. Tissue washing techniques for proteins can delocalize EYFP

Figure 5-6. Detergent enhancement of matrix can improve protein ionization in MALDI-MS

Figure 5-7. Lowering the nozzle temperature on the TM Sprayer improved protein extraction and EYFP detection

Figure 5-8. Protein extraction and spatial resolution are limited by the size of the matrix droplets

Figure 5-9. The solvent composition of the matrix solution influences the detection of proteins in situ

Figure 5-10. Acetonitrile is the preferred organic solvent for automated matrix deposition techniques

Figure 5-11. Optimized in situ detection of EYFP in MALDI-MS using two different automated deposition systems

Figure 5-12. MALDI-MS images in register with the fluorescence image of a YFP-16 mouse brain at 50 and 25 µm spatial resolution

Figure 5-13. Segmentation map of manually segmented gray and white matter regions, overlaid on a fluorescence image of EYFP

Figure 5-14. EYFP is the most abundant protein within its mass range

Figure 6-1. FTICR-MS spectra of DOM from Suwannee River, varying time of flight

Figure 6-2. Examination of a single m/z range in a DOM spectrum

Figure 6-3. Optimization of skimmer 1 voltage for DOM samples


Figure 6-4. Data acquisition file size (through increased FID length) and DOM signal

Figure 6-5. Optimization of number of scans per sample

Figure 6-6. The impact of sample concentration on signal intensity, ions detected, and peak shape

Figure 6-7. Lowering ion accumulation time allows for use of higher concentration of sample in spray

Figure 6-8. Auto-sampling method development

Figure 6-9. SRFA Spectrum with finalized DOM analysis method (full spectrum)

Figure 6-10. SRFA Spectrum with finalized DOM analysis method (345 m/z to 380 m/z)

Figure 6-11. SRFA Spectrum with finalized DOM analysis method (353.00 m/z to 353.20 m/z)

Figure 6-12. High resolution mass spectra generated using ESI-FT-ICR MS in CASI mode, from lipid extracts of MDA-MB-231 cells

Figure 6-13. Increased phospholipid saturation in the fat1Δ mutant


List of Tables

Table 2-1. Select disease-associated mutations that cause protein deamidation

Table 4-1. Calculated ambiguity data on the three model proteins

Table 4-2. Fragmentation coverage and ions generated when using terminal fragments alone vs. using terminals and internals

Table 4-3. Demonstration of sequential top-down fragmentation of SOD1G93A

Table 4-4. Achieving high mass accuracy, resulting in a 1 ppm mass error tolerance, prevents incorrectly assigning internal fragments

Table 4-5. ‘N-terminal to alanine’ portion of denatured fragmentation propensity chart

Table 5-1. Known mouse brain proteins used for internal calibration of MALDI-MSI spectra

Table 6-1. DOM Analysis Final Method Parameters

Table 6-2. Molecular formulas and lipid identification was performed by querying the Human Metabolome Database


Abbreviations

2-ME 2-mercaptoethanol

3D Three dimensional

5-FUrd 5-fluorouridine

Ab Antibody

ACN Acetonitrile

AD Alzheimer’s disease

AEX Anion exchange chromatography

ALS Amyotrophic lateral sclerosis

Am. Ammonium

Amb. Ambiguity

Arr. Arrangement

BM BMOE-cross-linked

BME Beta-mercaptoethanol

BMOE Bismaleimidoethane

BSA Bovine serum albumin

CA Carbonic anhydrase

CAD Collisionally activated dissociation

CASI Continued accumulation of selected ions

Cer Ceramide

CHCA α-cyano-4-hydroxycinnamic acid

Cu Copper


D2O Deuterium oxide

DAPI 4′,6-diamidino-2-phenylindole

DC Direct current

DESI Desorption electrospray ionization

DMEM Dulbecco's modified eagle medium

DNA Deoxyribonucleic acid

DOI Digital object identifier

DOM Dissolved organic matter

DTME Dithiobismaleimidoethane

DTT Dithiothreitol

ECD Electron capture dissociation

EDTA Ethylenediaminetetraacetic acid

EMA European medicines agency

ESI Electrospray ionization

ETD Electron transfer dissociation

EYFP Enhanced yellow fluorescent protein

FA Frameshift ambiguity

fALS Familial amyotrophic lateral sclerosis

FAT Fast axonal transport

FDR False discovery rate

FID Free induction decay

FP Fluorescent protein

FPLC Fast protein liquid chromatography


FRE Fiducial registration error

FS Frameshift

FSD Funnel-skimmer dissociation

FTICR Fourier transform ion cyclotron resonance

FTIR Fourier transform infrared spectroscopy

FTMS Fourier transform mass spectrometry

FUS Fused in sarcoma

GFP Green fluorescent protein

H Heated

H&E Hematoxylin and eosin

HCD Higher-energy collisional dissociation

HDX Hydrogen-deuterium exchange

HIC Hydrophobic interaction chromatography

HPLC High performance liquid chromatography

HRP Horseradish peroxidase

HSS Hollow structural sections

ICP-MS Inductively coupled plasma mass spectrometry

ICR Ion cyclotron resonance

ID Identification

IH In-house

IHC Immunohistochemistry

IMC Imaging mass cytometry

IPA Isopropyl alcohol


ITO Indium tin oxide

ivLESA In vitro liquid extraction surface analysis

LAESI Laser ablation electrospray ionization

LC Liquid chromatography

LSC Laser scanning cytometry

LSDs Lysosomal storage diseases

m/z Mass-to-charge ratio

MALDI Matrix-assisted laser desorption/ionization

MeOH Methanol

mM and mL Millimolar and milliliter

MS/MS Tandem mass spectrometry

MSI Mass spectrometry imaging

MWCO Molecular weight cut-off

N Native (non-treated)

NaCl Sodium chloride

NCE Normalized collision energy

NMR Nuclear magnetic resonance

PAGE Polyacrylamide gel electrophoresis

PBS Phosphate-buffered saline

PC Phosphatidylcholine

PD Parkinson’s disease

PDB Protein data bank

PE Phosphatidylethanolamine


PLOT Porous layer open tubular

PMF Peptide mass fingerprinting

POM Particulate organic matter

ppm Parts-per-million

PTM Post-translational modification

Q Quadrupole

R Reduced

RF Radio frequency

RH Reduced and heated

RNA Ribonucleic acid

RNA Seq Ribonucleic acid sequencing

SA Sinapic acid

sALS Sporadic Amyotrophic Lateral Sclerosis

SDS Sodium Dodecyl Sulfate

SIMS Secondary-Ion Mass Spectrometry

SNAP Sophisticated numerical annotation procedure

SOD1 Cu/Zn superoxide dismutase

SPE Solid phase extraction

TBS Tris-buffered saline

TBST Tris-buffered saline tween-20

TD Top-down

TDMS Top-down mass spectrometry

TFA Trifluoroacetic Acid


TIC Total ion current

ToF Time-of-Flight

TOF/TOF Tandem Time-of-Flight

Tris Tris(hydroxymethyl)aminomethane

UB Ubiquitin

μM and μL Micromolar and microliter

US FDA United States Food and Drug Administration

UVPD Ultraviolet photodissociation

WT Wild type

YFP Yellow Fluorescent Protein


Chapter 1

Introduction

1.1 Dissertation organization, themes, and contributions

This dissertation is organized into five content-focused chapters, preceded by this introduction and concluded with closing remarks and future directions. The chapters contain some combination of published work, completed work submitted or in revision for submission to peer-reviewed journals, and bioanalytical chemistry method development centered on expanding our laboratory’s scope and capabilities. The general theme of the dissertation is that confounding and unaddressed issues that persist in various fields of research often require deep consideration followed by new perspectives or analytical approaches. With respect to ALS, we develop methods to address one of the major controversies in the field: why approximately equal numbers of reputable groups have reported evidence for, and against, the hypothesis that post-translational modification of SOD1 can cause sporadic ALS. More specifically, we address these issues in the following themes by chapter: Chapter 2) Amyotrophic lateral sclerosis, particularly how sporadic (non-inherited and idiopathic) forms of the disease may be caused by mechanisms similar to those of familial (inherited) forms; Chapter 3) The risks inherent to interpreting antibody-protein interaction results, particularly from immunohistochemistry and western blot experiments, due to irregularities in the antibodies being used; Chapter 4) Experimental and data-analysis methods that allow for the complete assignment of all fragments generated in top-down proteomic experiments, significantly expanding sequence and PTM coverage, fragmentation sites within the protein, and number of unique ions assigned; Chapter 5) How combining various proteomic and auxiliary analytical techniques can lead to the ability to assign cell-cohorts in MALDI-MSI studies without the need for fluorescence microscopy or similar image registration techniques; Chapter 6) A transparent display of FTICR-MS method development for complex organic mixtures, how various tuning parameters interplay with each other, and how these methods can be adapted for different applications. Each chapter herein is self-contained, providing cited background information that places the results and discussion in context. In what follows in this introduction, I introduce the topics discussed in this dissertation and provide thoughts on the challenges facing these fields of research and on how the work presented in the following chapters serves to address them.

1.2 SOD1-mediated ALS

Amyotrophic lateral sclerosis (ALS) is a late-onset syndromic neurodegenerative disease characterized by the death of spinal and cortical motor neurons. As with other neurodegenerative diseases, a small population of cells is selectively vulnerable to dying (in early stages < 1% of total neuronal cells). Because tissue homogenization techniques dilute (rare) cell-specific PTMs below the detection limit, the imaging techniques developed in chapter 5 can provide unique insight into ALS etiology. Although a portion of ALS (ca. 20%) has been linked to genetic mutations in several proteins, including Cu/Zn superoxide dismutase (SOD1), the majority is sporadic, without discernable genetic basis and elusive in origin.1-2 One hypothesis for sporadic cases of ALS is that their onset could be caused by mechanisms similar to those that lead to motor neuron death in familial ALS, notably destabilization of SOD1, leading to protein aggregation.3-5 Post-translational modifications (PTMs) are present in the majority of human proteins, resulting in a variety of proteoforms for each expressed transcript.6-7 Chapter 2 of this dissertation explores the concept that specific PTMs of SOD1 and other disease-associated proteins cause the exact same molecular change as mutations of the genome, providing a potential link between familial and sporadic forms of these diseases.4, 8-9

1.3 Improving the reproducibility of SOD1 immunochemistry

Improper interpretation of experimental results due to absent or incomplete antibody validation has caused a growing crisis in biomedical research in recent years.10-14 Such challenges have affected SOD1-mediated ALS research, leading to contradictory results in studies of sporadic ALS under similar experimental conditions.8, 15-17 Slight differences in sample handling prior to immunohistochemistry and western blot analysis could unexpectedly lead to divergent experimental outcomes. Chapter 3 of this dissertation explores what factors may be leading to these disparate results, specifically examining what controls SOD1 epitope exposure when using the popular anti-SOD1 antibody, SOD-100. It further examines SOD1 membrane binding properties and treatment-specific conformations and their dynamics, and develops a protocol for normalizing western blot signal to ameliorate the issue. This protocol could have implications beyond ALS research alone, suggesting that normalization treatments could be useful in all biomedical fields of disease research involving protein aggregation.


1.4 Eliminating ambiguity in internal fragment ion identification

Top-down proteomics is often limited in its ability to provide adequate sequence coverage in the middle regions of a protein due to prioritization of the generation and assignment of terminal fragment ions.18-19 A major barrier to improving this coverage is the general inability to consistently produce and assign internal fragment ions, which preferentially cover these middle regions.20-22 Poor mass accuracy can account for >75% of these internal fragments being ambiguous, while the remainder are ambiguous by . Chapter 4 of this dissertation breaks down exactly what factors lead to internal fragment ambiguity, introduces new classifications to aid in assigning these fragments, reports statistics on these classes across a range of model proteins, provides a framework for assigning 100% of internal fragments, and demonstrates this method with familial ALS-associated SOD1G93A. It also expands on previously existing textual and graphical methods of interpreting internal fragment assignment at the level of individual peptides and the entire protein. These advancements will allow for significantly expanded sequence coverage in top-down proteomics, improve proteoform distinction, and facilitate discussions pertaining to protein sequence coverage.
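To make the role of mass accuracy concrete, the sketch below enumerates the internal fragments of a short, hypothetical peptide and counts how many have at least one other internal fragment within a given ppm tolerance. It is an illustration only, not the assignment procedure of Chapter 4: the toy sequence, the function names, and the simplification that a fragment's mass is the sum of its monoisotopic residue masses are all assumptions made here.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def internal_fragment_masses(seq, min_len=2):
    """Sorted masses of all fragments that contain neither terminus.

    Assumption: mass is approximated as the sum of residue masses,
    rounded to 6 decimals so equal compositions compare as equal.
    """
    masses = []
    for start in range(1, len(seq) - 1):               # skip the first residue
        for end in range(start + min_len, len(seq)):   # stop before the last residue
            masses.append(round(sum(MONO[aa] for aa in seq[start:end]), 6))
    return sorted(masses)


def ambiguous_fraction(seq, ppm):
    """Fraction of internal fragments with another fragment within `ppm`."""
    masses = internal_fragment_masses(seq)
    ambiguous = 0
    for i, m in enumerate(masses):
        tol = m * ppm * 1e-6
        near_prev = i > 0 and m - masses[i - 1] <= tol
        near_next = i + 1 < len(masses) and masses[i + 1] - m <= tol
        ambiguous += bool(near_prev or near_next)
    return ambiguous / len(masses)


TOY_SEQUENCE = "MAGTGAK"  # hypothetical peptide, chosen only for illustration
for tolerance_ppm in (0, 1, 10):
    print(tolerance_ppm, ambiguous_fraction(TOY_SEQUENCE, tolerance_ppm))
```

In this toy sequence, fragments such as AG and GA have identical composition, so they remain indistinguishable by mass alone at any tolerance; tightening the ppm window can only remove ambiguity between fragments whose compositions differ.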

1.5 Challenges in MALDI-MSI

Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry imaging (MSI) is an advanced technique, growing in popularity, that allows tissue sections to be analyzed via mass spectrometry in situ, providing localization information for biomolecules within tissue. This technique is limited largely by spatial resolution and proper cellular registration. These challenges hinder proper inferences as to which molecules are present in specific cell-types. Chapter 5 of this dissertation introduces a technique that addresses these challenges by elucidating cell type directly via mass spectrometry, whereas auxiliary techniques would normally be required. It accomplishes this through genetically encoded expression of a fluorescent protein in specific neurons to allow the co-registration of fluorescence imaging and MALDI-MSI, proving a concept that can be expanded to other proteins native to specific cell-types. These advancements provide a path towards single-cell MSI, especially as instrumentation continues to improve.

1.6 High-Resolution FTICR-MS

Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS) is a high-resolution mass spectrometry technique that determines ion mass-to-charge ratios from their cyclotron frequencies in a magnetic field.23-24 This type of mass spectrometer offers the highest resolving power for complex organic mixtures of any commercially available instrument at this time.25 In our laboratory, a Bruker solariX XR 9.4 T FTICR mass spectrometer was used for most of the spectral data acquisition presented in this dissertation. Chapter 6 demonstrates how a method was developed for dissolved organic matter (DOM), a complex organic mixture of molecules present in all natural water and soil samples. Tuning this instrument for analysis of small molecules requires several key considerations (from over seventy adjustable parameters) compared to tuning for top-down protein analysis, including oscillation rates and voltage differentials throughout the instrument’s ion optics and trapping plates. These considerations are then applied to complex lipid samples to provide isotopic fine structure and enable relative quantitation.
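As a rough illustration of how cyclotron frequency relates to mass-to-charge ratio, the unperturbed cyclotron frequency of an ion is f = zeB/(2πm). The sketch below evaluates this relationship at the 9.4 T field strength mentioned in this section. It ignores the trapping-field and space-charge corrections that real calibration requires, and the function name and example m/z values are illustrative assumptions, not anything taken from the instrument software.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs (exact SI value)
DALTON = 1.66053906660e-27   # unified atomic mass unit in kg (CODATA 2018)


def cyclotron_frequency_hz(mz, b_tesla=9.4):
    """Unperturbed cyclotron frequency in Hz for an ion of the given m/z.

    mz is in daltons per elementary charge; b_tesla is the magnetic field.
    Assumption: ideal uniform field, no electric trapping potential.
    """
    return E_CHARGE * b_tesla / (2.0 * math.pi * mz * DALTON)


# Hypothetical m/z values spanning a typical small-molecule range.
for mz in (200.0, 400.0, 800.0):
    print(f"m/z {mz:6.1f} -> {cyclotron_frequency_hz(mz) / 1e3:8.1f} kHz")
```

Because frequency scales inversely with m/z, closely spaced masses map to closely spaced frequencies, which is why longer transients (more cycles observed) translate into higher resolving power.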


Chapter 2

Parsing Disease-relevant Protein Modifications from Epiphenomena: Perspective on the Structural Basis of SOD1-Mediated ALS

Nicholas D. Schmitt1,2, Jeffrey N. Agar1,2,3

1 Department of Chemistry and , 2 Barnett Institute of Chemical and Biological Analysis, 3Department of Pharmaceutical Sciences, Northeastern University, Boston, MA 02115, USA

Reproduced with permission from Parsing Disease-relevant Protein Modifications from Epiphenomena: Perspective on the Structural Basis of SOD1-Mediated ALS. Journal of Mass Spectrometry 2017, 52 (7), 480-491. DOI: 10.1002/jms.3953 Copyright 2020 Wiley Online Library.9


2.0 Statement of Contribution

Nicholas D. Schmitt and Jeffrey N. Agar reviewed all relevant literature, formulated and wrote the perspective, and generated all figures.


2.1 Abstract

Conformational change and modification of proteins are involved in many cellular functions. However, they can also have adverse effects that are implicated in numerous diseases. How structural change promotes disease is generally not well understood. This perspective illustrates how mass spectrometry (MS), followed by toxicological and epidemiological validation, can discover disease-relevant structural changes and therapeutic strategies. We (with our collaborators) set out to characterize the structural and toxic consequences of disease-associated mutations and post-translational modifications (PTMs) of the cytosolic antioxidant protein Cu/Zn-Superoxide dismutase (SOD1). Previous genetic studies discovered > 180 different mutations in the SOD1 gene that caused familial (inherited) amyotrophic lateral sclerosis (fALS). Using HDX-MS, we determined that diverse disease-associated SOD1 mutations cause a common structural defect: perturbation of the SOD1 electrostatic loop. X-ray crystallographic studies had demonstrated that this leads to protein aggregation through a specific interaction between the electrostatic loop and an exposed beta-barrel edge strand. Using epidemiology methods, we then determined that decreased SOD1 stability and increased protein aggregation are powerful risk factors for fALS progression, with a combined hazard ratio > 300 (for comparison, a lifetime of smoking is associated with a hazard ratio of ~15 for lung cancer). The resulting structural model of fALS etiology supported the hypothesis that some sporadic ALS (sALS, ~80% of ALS is not associated with a gene defect) could be caused by post-translational protein modification of wild-type SOD1. We developed immunocapture antibodies and high sensitivity top-down MS methods, and characterized PTMs of wild-type SOD1 using human tissue samples. Using global-HDX, X-ray crystallography, and neurotoxicology, we then characterized toxic and protective subsets of SOD1 PTMs. To cap this perspective, we present proof-of-concept that post-translational modification can cause disease. We show that numerous mutations (N→D; Q→E), which result in the same chemical structure as the PTM deamidation, cause multiple diseases.

2.2 Introduction

Most human proteins are post-translationally modified, often in more than one way, from simple N-terminal acetylation to complex phosphorylation, lipidation, and glycosylation patterns.7 These processes, among others, result in multiple natural proteoforms of each protein.26 Conversely, aberrant protein modifications can result from changes in post-translational processing and protein catabolism, xenobiotics (e.g. nutrients, pesticides, pharmaceuticals), and genetic mutation. Changes in protein amino acid composition can lead to changes in their secondary, tertiary, and quaternary structures, including aggregation. For example, modifications as small as replacing a hydrogen with a methyl group (G93A) lead to exposure of a toxic epitope within the Cu/Zn-Superoxide dismutase (SOD1) protein,8 and to a highly penetrant (all carriers develop the disease), rapidly progressing (3.1 years mean survival) 5 form of amyotrophic lateral sclerosis (ALS). Modifications can result in loss-of-function (e.g. the metabolic diseases including myotonia congenita 27 and the lysosomal storage diseases (LSDs) 28) or a gain-of-function, which can either augment physiological processes or be cytotoxic.29-30

The consequence of protein modifications depends upon evolutionary context: mutations in the gene encoding hemoglobin protect carriers from malaria while causing sickle-cell anemia. Likewise, the myriad of mutations that cause late-onset diseases were not manifest until human survival increased significantly (more than doubled 31) during the 19th and 20th centuries. A testament to the disease-relevance of aberrantly modified proteins is that they are histopathological hallmarks of the “proteinopathies.” These include some of the most prevalent, often late-onset, diseases. Modified proteins define, to name a few, the Tauopathies (Alzheimer’s, Pick’s complex, chronic traumatic encephalopathy, etc. 32), Synucleinopathies (Parkinson’s, dementia with Lewy bodies, multiple system atrophy 33), and secondary Ubiquitinopathies (lysosomal storage disorders, amyotrophic lateral sclerosis, etc. 34-36).

Amyotrophic lateral sclerosis (ALS), as well as Parkinson’s disease (PD) and

Alzheimer’s disease (AD), are syndromic diseases. Unknown causes, as well as a variety of

mutations, somehow converge upon the death of disease-specific populations of selectively

vulnerable neurons (e.g. the death of brain and spinal cord motor neurons in ALS). The

molecular mechanisms of cell death, including whether common pathways are involved (e.g.

proteostasis or RNA processing), is unknown and research is fundamentally limited by the

following problems. The majority of disease is idiopathic/sporadic (i.e., no discernable genetic

basis) and is not strongly associated with environment (e.g. behavior or xenobiotics). Thus, idiopathic disease cannot be recapitulated in laboratory models, and since biopsy is not an

option, studies must involve post-mortem human tissue samples. Unfortunately, most changes

observed in end-stage neurodegenerative disease tissues are epiphenomena (e.g. most cells of

interest are already dead and inflammation dominates). A minority of disease cases are familial

and further subdivided into a myriad of mutations in multiple genes. These can be used to create

suitable transgenic laboratory models of disease, with the major caveat of potentially not

representing the disease as a whole, and the major benefits of permitting longitudinal disease

studies and facilitating drug development.

Research must strike a balance between human tissue samples, which best represent

disease variation but underrepresent causative mechanisms, and animal models, which do the

opposite. In either case, we use mass spectrometry (MS) to compare disease and control

preparations, and validate the MS findings using secondary methods (e.g. toxicology, phenotype,

epidemiology, familial disease models, etc.). We primarily studied SOD1-mediated ALS, which

is an adult onset, rapidly progressive (generally ~2-5 years survival) motor neuron disease that

affects ~ 20,000 Americans at any given time, and results in the death of ~1/1000 humans (for

review see Taylor et al. 37). Voluntary muscle control is lost as motor neurons in the brain and

spinal cord degenerate by a yet-uncharacterized mechanism. The first genetic cause of ALS was

found in the gene encoding SOD1.2 Since this time, more than 180 mutations in the SOD1 gene

have been linked to familial ALS (fALS) (http://alsod.iop.kcl.ac.uk/). Like most dominantly

inherited diseases, SOD1 fALS results from a toxic gain-of-function rather than the loss of

SOD1 enzymatic activity. This new toxic function does not appear to be related to SOD1’s canonical function (dismutase activity). One of the leading hypotheses is that the toxic gain-of-function is a modified conformation that leads to SOD1 dimer dissociation 8, 38-39 followed by

aggregation.8, 40-41 In the first part of this perspective we demonstrate that diverse SOD1

mutations result in a common structural consequence that can lead to toxic protein aggregation.

Recent evidence, including our collaborative work, extended this association, linking modified wild type SOD1 (SOD1WT) to sporadic ALS (sALS) 2, 42 (for dissenting opinions see 43-

44). This perspective outlines our hypothesis-driven approach to characterizing the structural and

toxic consequences of mutations in SOD1 that cause a subset of fALS, and studies of

modifications of the same protein in sporadic ALS. SOD1 mutations are found in only 2-7 % of

ALS cases (e.g. ~ 6 % of US cases), (http://alsod.iop.kcl.ac.uk/)45 and there are other prevalent

mutations that were not studied (e.g. C9orf72, TDP-43, etc.).2, 46-47 We concentrated on SOD1-

ALS mutations because they cause – are not just associated with – ALS, and because they have

diverse, well-characterized severity. For example, carriers of a single allele of either the A4V or

H46R SOD1 mutation who live a normal lifespan unfortunately develop ALS. Whereas A4V

patients survive an average of 1.2 years after the onset, H46R patients survive an average of 18

years.5 This allows the structural changes we observe in fALS variants to be rank-ordered in

terms of disease severity.

One hypothesis for how modified SOD1WT can lead to sALS asserts that modifications

that affect surface charge such as phosphorylation, deamidation, and oxidation, can lead to toxic

conformations,8 instability, and misfolding.48-49 Indeed, certain mutations and PTMs induce

similar – even identical – structural change. For example, mutations are often used as a mock-

PTM (e.g., converting serine or threonine to aspartic acid to imitate permanent phosphorylation).

Due to structural similarities between their mutationally- and post-translationally-modified proteoforms, certain proteins (including all of the histopathological hallmarks discussed above) have been considered as possible links between familial and sporadic forms of disease. These include amyloid-beta 50 and Tau 51 (AD), alpha-synuclein 52 and Parkin 53 (PD), and TDP-43 54

and SOD1 42, 55-57 (ALS). Although there is considerable support for the hypothesis that PTMs can be responsible for sporadic disease,42, 53, 58-59 this has never been proven. In the second part of this perspective we present evidence of both toxic and protective (oxidation 8, 49 and

cysteinylation,4 respectively) PTMs in human tissue, and how these informed our current

therapeutic strategies. We punctuate this perspective with proof that PTMs can cause disease, if

occurring on a large enough scale, by illustrating a variety of diseases (see Table 2-1) that are

caused by mutations that result in a structure that is equivalent to that of post-translational

deamidation.

2.3 Results & Discussion

To understand most dominantly inherited diseases, and probably most neurodegenerative

diseases, we must avoid the common misconception that disease results from the loss of a

particular protein’s canonical (normal) function. Instead, disease symptoms commonly result from a novel modification-induced function. We hypothesized this toxic-gain-of-function involved a change in SOD1 conformation and employed top-down MS to characterize the structural consequences of ALS-associated SOD1 modifications. Numerous 3D structures (NMR and X-ray), neurotoxicology assays, and epidemiology studies allowed our results to be placed into context, resulting in a structure-based model for SOD1-mediated ALS onset and progression. This knowledge was then translated into therapeutic strategies. This work was highly collaborative (see acknowledgments).

2.3.1 Top-down and ultrahigh resolution mass spectrometry methods to enrich the detection of disease-relevant modifications.

Popular MS sample preparation techniques were designed for large-scale protein identification rather than discovering disease-relevant protein modifications. Most proteomics studies involve disulfide reduction, cysteine alkylation, and endoproteinase digestion.60-61

Reduction and alkylation result in loss of important prevalent in vivo cysteine modifications (e.g.

disulfides, sulfenic acid, nitrosylation, and S-thiolation).62 This problem greatly affects SOD1 – a

disulfide-containing metalloprotein with variable S-thiolation – and all of these PTMs could be

involved in ALS etiology. The endoproteinase digestion step results in: 1) loss of metal

cofactors (~30% of proteins are metalloproteins); 2) loss of correlation between PTMs (e.g.,

phosphorylation of one residue promotes methylation of a second residue); 3) loss of information

concerning the relative abundance of PTMs;63 and 4) scrambling of the sites of disulfide bonds

and S-thiolation.64 All of these problems apply to SOD1 and potentially ALS. The high sensitivity of mass spectrometers, which is generally an advantage, also increases the probability of detecting low-abundance modifications, including artifacts of sample preparation and

proteoforms with concentrations that are too low to have a significant physiological effect.

Top-down MS techniques can overcome all of the problems described above: eliminating

artifacts of reduction and alkylation, maintaining labile PTMs and their location, and providing

information on their relative abundance and functional relationship.18 In a series of manuscripts,

we developed top-down techniques for PTM characterization. The most important of these

include the following: high sensitivity immunoaffinity techniques;4, 41 the debut of the Big

Mascot (Mascot TD) search engine, which remains one of the few search engines capable of assigning internal fragment ions (which are much more abundant in top-down studies);22 a

comprehensive model of the dissociation pathways of intact protein collisionally activated

dissociation;20 and methods for detecting S-thiolation, a prevalent PTM that is either missed or

location-scrambled in bottom-up studies.62 Other groups have employed top-down MS to

demonstrate the importance of characterizing PTMs to identify disease biomarkers, such as with

chronic heart failure.65 To better define the chemical composition of PTMs and to enable

quantitative metabolic labeling experiments in future studies, we developed ultrahigh-resolution

MS methods. In this series of manuscripts we presented fast and accurate algorithms and

programs for modeling spectra with resolved isotopic fine structure.66-69 Isotopic fine structure

can be used, among other things, to determine the elemental composition of PTMs and to discern

between isobaric modifications, including the S1 and O2 modifications we observe on the Cys111

residue of SOD1.
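
As a rough illustration of why ultrahigh resolution is needed to tell these two modifications apart, the sketch below computes the mass gap between adding one sulfur (S1) and adding two oxygens (O2), and the resolving power that gap implies for an intact protein. The monoisotopic masses are standard tabulated values. The monomer mass of 15,800 Da is an assumed round number chosen for illustration, not a measured value, and the function names are hypothetical.

```python
# Monoisotopic masses (Da) of the most abundant isotopes (standard tabulated values).
MASS_32S = 31.97207117
MASS_16O = 15.99491462

# Assumed, illustrative monomer mass; not a measured value for any proteoform.
ASSUMED_SOD1_MONOMER_MASS_DA = 15_800.0


def mass_difference_s1_vs_o2() -> float:
    """Mass gap (Da) between a +O2 adduct and a +S1 adduct."""
    return 2 * MASS_16O - MASS_32S


def required_resolving_power(protein_mass_da: float) -> float:
    """Simple m / delta-m estimate of the resolving power needed to
    separate the +S1 and +O2 forms of a protein of the given mass."""
    return protein_mass_da / mass_difference_s1_vs_o2()


delta = mass_difference_s1_vs_o2()
power = required_resolving_power(ASSUMED_SOD1_MONOMER_MASS_DA)
print(f"S1 vs O2 mass difference: {delta:.5f} Da")
print(f"Estimated resolving power near {ASSUMED_SOD1_MONOMER_MASS_DA:.0f} Da: {power:,.0f}")
```

This is only a first-order estimate: the isotopic envelopes of the two forms overlap, so distinguishing them in practice may demand more than the simple m / delta-m figure suggests.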

2.3.2 HDX-MS demonstrates that diverse SOD1 mutations result in a common structural perturbation.

Given that hundreds of known fALS-related SOD1 mutations cause a similar disease phenotype, ALS, we hypothesized that SOD1 variants shared a common toxic conformation.

Whereas biochemical studies70 and solution x-ray scattering71 showed that the properties of

fALS-linked SOD1 variants differed significantly from SOD1WT, X-ray crystallographic studies

of fALS variants showed little structural variation from SOD1WT.72-75 It is well known that the crystallization process can favor stable conformations, which for fALS variants would lead to the underrepresentation of disease-relevant conformations in X-ray crystallography studies. To better understand the solution structures of fALS variants, we analyzed fourteen biochemically diverse

ALS variants using hydrogen-deuterium exchange mass spectrometry (HDX-MS), described in

Molnar 2009.76 HDX-MS allows for protein secondary and tertiary structure to be examined in

higher resolution than offered by circular dichroism or charge-state distribution MS, respectively. For a review from a leading HDX-MS group, see Wales et al.77 We determined that

all fALS variants share one thing in common – a perturbed electrostatic loop (Loop VII, residues

121-142, Figure 2-1). Several variants also had perturbations near the dimer interface and zinc binding loop. Note that the metal content of wild type SOD1 and each variant was determined by inductively coupled plasma-MS (ICP-MS), and while SOD1WT was found to be completely

metallated, the variants were metal deficient to varying degrees as detailed in our previous work.76

Figure 2-1. Hypothesized model for SOD1-mediated fALS disease progression. (Top) Mutations at various locations in SOD1, as demonstrated by HDX-MS in Molnar et al. 2009, lead to

perturbation of the electrostatic loop (residues 121-142). (Middle) This perturbation exposes a toxic epitope, which enables a gain-of-interaction, possibly between a region of the electrostatic

loop (red, rest of loop is pink) of one SOD1 homodimer and the exposed beta strand edges of beta strands V and VI (green and blue) of another SOD1 dimer, as demonstrated by the Hart

group (Elam, J.S., et al. 2003, adapted from PDB entry 1OZU). (Bottom) This leads to aberrant

interactions of SOD1 dimers, resulting in greater unnatural quaternary structure, fibrillization,

and aggregation.

2.3.3 A structure-based mechanism of neurodegeneration: The survival of SOD1-mediated

ALS patients is decreased by protein instability and protein aggregation.

Before we demonstrated that fALS variants shared a common structural defect,

perturbation of the electrostatic loop, the only universally shared characteristic was the formation

of intracellular aggregates. Many researchers considered fALS SOD1 aggregates to be

epiphenomena, perhaps even the result of a protective sequestering process. Likewise, some

researchers asserted that fALS SOD1 variants were less stable than SOD1WT,78 while other researchers found no correlation with stability and disease – noting that certain mutations had

WT-like stability and activity.79 Other neurodegenerative diseases were plagued by a similar lack of consensus as to whether aggregation was toxic, an epiphenomenon, or even protective. To address the importance of protein stability and aggregation in ALS, we assembled the largest fALS patient epidemiology database, the largest data set of SOD1 variant stabilities, and the

largest data set of protein aggregation rates. Using the theory introduced by Chris Dobson, we

generated a model that could predict the rate of protein aggregation based upon the

physicochemical changes induced by a protein modification (Δ charge, Δ hydrophobicity, Δ

entropy).

In the first study of its kind, we used well-established epidemiology hazard/risk-

analysis methods to model how protein physicochemical changes affect disease progression.5 We

found that decreased protein stability (hazard ratio of 24) and increased aggregation propensity

(hazard ratio of 13) are risk factors for SOD1-fALS. Taken alone, the hazard ratio of loss-of-stability (or aggregation propensity) with respect to SOD1-fALS progression is similar to that of smoking with respect to lung cancer (hazard ratio ~15).80 Simply put, smokers have a 15-fold

increased risk of dying from lung cancer, and the rate of fALS progression (severity) increases in

proportion to both protein instability and aggregation propensity. Taken together, loss-of-

stability and aggregation propensity have a hazard ratio of 333, indicating their effects are

synergistic and potent. A model combining loss-of-stability and aggregation could account for

more than two-thirds of the large variability in SOD1-fALS patient survival times. Given that

SOD1 mutations are highly penetrant (i.e. generally accepted as causing fALS) and the physicochemical changes are irreducible (i.e. not prone to spurious associations), this model accounts for most of what causes fALS progression. Combined with the relatively high cellular concentrations of SOD1 in motor neurons, this model also provides a potential mechanism for the selective vulnerability of motor neurons in ALS. Although the use of epidemiology methods to qualify protein structural change was unprecedented and initially controversial, this work has

since been confirmed in numerous ALS-related studies81-83 and extended to animal models of

AD.84
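
The hazard-ratio arithmetic in this section can be sketched as follows. The sketch assumes a proportional hazards form, which the text does not spell out, in which independent risk factors multiply and a survival curve S0(t) becomes S0(t) raised to the power of the hazard ratio. The three hazard ratios are the values quoted above; the function names and the baseline survival of 0.99 are hypothetical and serve only to show the calculation.

```python
import math

# Hazard ratios quoted in the text.
HR_STABILITY = 24.0          # decreased protein stability
HR_AGGREGATION = 13.0        # increased aggregation propensity
HR_COMBINED_REPORTED = 333.0  # both factors taken together


def multiplicative_hazard_ratio(*ratios: float) -> float:
    """Combined hazard ratio if the factors acted independently (assumption)."""
    return math.prod(ratios)


def survival_under_hazard_ratio(baseline_survival: float, hazard_ratio: float) -> float:
    """Proportional hazards relation: S(t) = S0(t) ** HR (assumed model form)."""
    return baseline_survival ** hazard_ratio


expected_if_independent = multiplicative_hazard_ratio(HR_STABILITY, HR_AGGREGATION)
ratio_to_reported = HR_COMBINED_REPORTED / expected_if_independent

print(f"Product of the individual hazard ratios: {expected_if_independent:.0f}")
print(f"Reported combined ratio / product: {ratio_to_reported:.3f}")

# Hypothetical baseline: 99% survival over some interval in the reference group.
print(f"Survival at HR=24 given baseline 0.99: {survival_under_hazard_ratio(0.99, HR_STABILITY):.3f}")
```

Under these assumptions the reported combined value can be compared directly with the simple product of the two individual ratios; the published analysis may have used covariate scalings that this sketch does not capture.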

Combined with our structural characterization of fALS SOD1 variants and the X-ray crystallographic studies of the Hart group,85 a mechanism of fALS etiology emerged. Diverse

mutations in the SOD1 gene result in perturbation of the electrostatic loop and loss of stability,

which promotes a novel interaction with an exposed beta-barrel. This novel interaction and the dissociation of the SOD1 dimer 86-87 provide templates that nucleate SOD1 aggregation.

Following nucleation, the rate of SOD1 aggregate propagation (and patient’s disease

progression) changes quantifiably based upon the physicochemical properties of the variant

(specifically aggregation increases in proportion to losses in charge and conformational entropy

and gain of hydrophobicity). (Figure 2-1) Additionally, the Hart group and collaborators have

shown that numerous individual peptides isolated from SOD1 form fibril-like aggregates, even in

their fALS-mutated form.88

2.3.4 Can PTMs of SOD1 lead to sporadic ALS?

In the 1950s Denham Harman proposed the free radical theory of aging, where

deleterious oxidative modifications of DNA, the equivalent of somatic mutations, increased with age.89 This theory was extended to other biomolecules, including lipids and proteins, and gave

rise to the theory that aging resulted from a vicious cycle involving increasing oxidative damage

to proteins that resulted in the inhibition of proteolytic enzymes. These theories, combined with the observation that just about any structural modification of SOD1 causes ALS (180/184 known coding mutations), apparently through similar mechanisms, led to the hypothesis that PTMs of

SOD1 can cause sporadic (no SOD1 mutation) ALS. The first step in addressing this hypothesis was to characterize whether SOD1 becomes modified in vivo during ALS. There are no animal

models of sALS and tissue biopsies of brain and spinal cord of ALS patients are not performed.

Our studies therefore involved post-mortem brain and spinal cord samples, but were augmented

by (living) human blood and mice overexpressing SOD1. It should be noted that end-stage

neurodegenerative disease tissue is highly inflammatory, and consequently there are bound to be

numerous modifications that are the result of inflammation rather than primary causes. Parsing

the toxic modifications from the epiphenomena would require additional neurotoxicology

studies.

The discovery of in vivo SOD1 PTMs including oxidation and S-thiolation. We began by developing a quick and comprehensive isolation for SOD1 from human samples. Using SOD1 purified from human erythrocytes, and then modified using SOD1’s enzymatic end-product, peroxide, until a subpopulation of the SOD1 contained oxidative PTMs, we raised polyclonal antibodies to SOD1.41 SOD1 specific antibodies were affinity-purified and could deplete all

SOD1 from tissue samples (as confirmed by western blot using commercial SOD1 antibodies).

Using SOD1 purified by our antibody affinity methods, we identified oxidative modifications in a fALS SOD1G93A mouse model, in human erythrocytes, and in ALS tissues. Top-down MS

detected the relative abundance of the modified forms, including oxidized forms; metallation and disulfide bond status; relative tertiary structure through charge state distribution; and localization of the oxidative modifications (confirmed by endoproteinase digestion). The most abundant modification was cysteinylation of Cys111, followed by oxidative modifications of residues

Trp32 and Cys111. These studies showed that Trp32 and Cys111 were both modified by oxygen

(1-3), and the predominant Cys111 modification was to cysteine sulfonic acid.

2.3.5 The identification of toxic and protective SOD1 PTMs.

Toxic oxidative modifications of Trp32. Research had suggested that the SOD1 mutations associated with fALS promoted higher levels of PTMs,70, 90-91 that SOD1WT could also be modified, and that oxidative modifications of SOD1 induce aggregation as shown in vitro.56, 92-93 We

investigated whether the oxidative modifications to Trp32 that we observed in humans promoted

aggregation and toxicity. A Trp32Phe mutation was introduced to minimize SOD1 oxidation at residue 32. Using a well-established primary cell culture model of toxicity, the survival and aggregation of neurons injected with the fALS SOD1G93A were compared to those injected with

SOD1G93A/ W32F (and to SOD1W32F and SOD1WT controls). The Trp32Phe mutation was able to fully “rescue” G93A mutations, with the SOD1G93A/ W32F double mutant having its toxicity

reduced to that of normal SOD1WT while also producing fewer aggregates. (Figure 2-2) Recent studies have shown that SOD1 aggregation can occur through Trp32 oxidation that leads to di-tryptophan covalent cross-links between SOD1 monomers.94

Figure 2-2. Over the course of 10 days, motor neurons expressing wild-type, G93A, and

W32F/G93A SOD1 were monitored for viability. (Top) SOD1G93A proved to be significantly

more toxic to motor neurons than wild-type SOD1. (Bottom) Mutating SOD1G93A’s Trp32 to Phe resulted in a rescue of the pathogenic variant’s toxicity to motor neurons, demonstrating similar viability to wild-type SOD1. (This research was originally published in Taylor, D.M., et al.

Tryptophan 32 Potentiates Aggregation and Cytotoxicity of a Copper/Zinc Superoxide

Dismutase Mutant Associated with Familial Amyotrophic Lateral Sclerosis. Journal of

Biological Chemistry. 2007; 282(22):16329-16335. © the American Society for Biochemistry and Molecular Biology.)

Toxic oxidative modifications of Cys111. In similar studies, the Borchelt, Carri, and Araki groups demonstrated that C111S mutations, which prevented the oxidation we observed in

Cys111, could rescue the toxicity of a variety of SOD1 mutants.95-97 A collaborative study with

the Bosco and Brown groups employed an important structural tool for SOD1-mediated ALS, the

C4F6 conformation-specific anti-SOD1 antibody. This monoclonal antibody was raised against

disease-variant SOD1G93A in mice,98 and does not interact with unmodified SOD1WT. This tool

serves as an important complement to MS and X-ray crystallography. In this study, top-down

MS/MS using electron capture dissociation (ECD) determined the oxidative site of modification

on SOD1 and enabled the production of SOD1 modified (by SOD1’s end product, peroxide)

exclusively at Cys111. Recombinant SOD1WT with Cys111 oxidized to sulfonic acid was shown to be toxic – indeed as toxic as fALS variants – through blocking of anterograde Fast Axonal

Transport (FAT) in a squid axoplasm assay. SOD1WT extracted from C4F6-immunoreactive

sALS tissue was also shown to inhibit FAT, while recombinant SOD1WT and SOD1 extracted

from control tissue did not. Also of note was that inclusion of the C4F6 antibody in FAT-

inhibition assays prevents fALS-associated SOD1 variants from inhibiting FAT, indicating that

the conformational epitope that causes this inhibition is masked when the antibody is bound.8

Identification of the epitope recognized by conformation specific antibodies: SOD1’s electrostatic loop and zinc binding region mediate the exposure of a hidden toxic epitope. With

misfolded SOD1 established as a possible link between certain cases of fALS and sALS, and potential oxidative mechanisms and methods to protect against this having been identified, we

felt it was important to determine the epitope of toxic-SOD1 antibody C4F6.8, 98 Epitope

mapping can be a difficult process, most often completed through cross-linking MS with various chemical functionalities and spacer arm lengths, resulting in a rough idea of the binding site, not

necessarily an exact picture. Site-directed mutagenesis was again a useful tool helping elucidate

which residues in the identified loops (IV and VII, the zinc binding region and electrostatic loop,

respectively) were important for antibody recognition.99 This revealed that residues Asp92 and

Asp96 are crucial for C4F6 binding, whether as part of the epitope or in exposing the epitope. An

important note is that removal of loops IV and VII still results in a folded protein that interacts with C4F6 as well as the original antigen. This further confirms that the destabilization of SOD1 through mutations or PTMs results in instability of the electrostatic loop and zinc binding regions in some forms of SOD1-mediated ALS, exposing a previously concealed toxic epitope and releasing coordinated copper and zinc. Simple de-metalation of SOD1WT also exposes this

epitope to some degree, indicating that any metal-deficient disease-variant or PTM-product

SOD1 could potentially be neurotoxic.

Protective S-thiolation of Cys111. SOD1’s Cys111 can be oxidized,16, 100 and modified

by copper,101 glutathione,49, 102 and cysteine.101, 103 We isolated SOD1 from human nervous tissue

and identified both oxidation and cysteinylation of Cys111.4, 104 Cysteinylation had not been

previously identified on SOD1, and is rarely identified because the reduction and alkylation steps of standard bottom-up proteomic workflows, together with disulfide scrambling during MS analysis, eliminate the possibility of detecting thiol modifications. Quantitative accuracy allowed for the determination of relative amounts of each species, and showed that unmodified SOD1 was not present in the oxidatively-stressed sample. (Figure 2-3) MS, with the help of X-ray crystallography, showed that cysteinylation was only possible at a ratio of one

modification per dimer. However, while one monomer’s Cys111 was blocked from oxidation, it

did not necessarily block oxidation of the adjacent cysteine in the monomer-pair.

Figure 2-3. Cysteinylation protects SOD1’s Cys111 from oxidative damage. Mass spectrometry

showed that SOD1wt underwent oxidation at Cys111 (MS/MS not shown), resulting in a mix of

SOD1-Cys111 Sulfinic and Sulfonic acid (2 and 3 oxygens on Cysteine, respectively) with no unmodified SOD1wt remaining. SOD1wt that had been cysteinylated at Cys111 was protected

from oxidative damage, showing very little oxidative products and remaining cysteinylated.

(Reprinted with permission from Auclair, J.R., et al., Post-Translational Modification by

Cysteine Protects Cu/Zn-Superoxide Dismutase from Oxidative Damage. Biochemistry, 2013.

52(36):6137-6144. © 2013 American Chemical Society.)

Further thermodynamic experiments with the Cys111-cysteinylated form of SOD1

revealed that oxidation of SOD1 decreased the protein’s melting temperature by 23°C, whereas

cysteinylation only decreased the melting temperature by 5 °C.104 This suggests that modification

by cysteine is slightly destabilizing, but prevents the much more destabilizing modification of

oxidation. The cellular concentration of cystine, the dimeric oxidized form of cysteine, has been

shown to increase (along with oxygen) in cells with age and oxidative stress.105 This supports our

hypothesis that SOD1 may become cysteinylated in vivo in a protective manner as a method to compensate for increased oxidative stress loads, protecting proteins from aberrant cysteine

oxidation. In this regard, the therapeutic approach of cross-linking the cysteine dyad pairs on

non-covalently bound monomer pairs also serves to protect Cys111 from oxidation.

2.3.6 Mass spectrometry-informed therapeutic strategies

Stabilizing the SOD1 dimer using cross-linkers. We hypothesized that covalently cross-linking the SOD1 dimer using Cys111 residues at the dimer interface 38 would simultaneously

prevent the toxicity associated with oxidative modification of Cys111, while kinetically

stabilizing the SOD1 dimer. Our proof-of-concept study demonstrated this using

dithiobismaleimidoethane (DTME) and bismaleimidoethane (BMOE), which are not practical as

therapeutics due to their lack of specificity for this target. This idea has since evolved to the use

of cyclic disulfides, as they specifically target proximal cysteine dyads and do not permanently

fix themselves to lone cysteines.106 The modification sites and products of protein cross-linking

reactions were confirmed using top-down and bottom-up MS. Because the process of electrospray ionization disrupts the natural noncovalent dimer of SOD1, a cross-linking assay

where two monomers are tethered together in solution is easy to monitor, as a shift in mass is

observed from the monomer mass to the mass of the dimer plus the mass of the portion of the

cross-linker that remains attached to the protein (Figure 2-4). This assay was also used to show that pathogenic

variants of SOD1, like the G93A variant, can be stabilized with cross-linking.
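
The mass shift monitored in this assay can be sketched with a short calculation. It assumes that a bis-maleimide cross-linker such as DTME adds to each cysteine thiol without losing any atoms, so that the whole linker mass remains attached, and it uses an assumed round monomer mass of 15,800 Da rather than a measured value. The DTME average mass is computed from its formula, C12H12N2O4S2. The function names are hypothetical.

```python
PROTON_MASS_DA = 1.007276   # mass of a proton, used for positive-mode m/z
DTME_MASS_DA = 312.36       # average mass of DTME, C12H12N2O4S2

# Assumed, illustrative monomer mass; not a measured value for any variant.
ASSUMED_MONOMER_MASS_DA = 15_800.0


def crosslinked_dimer_mass(monomer_mass_da: float, linker_mass_da: float) -> float:
    """Mass of two monomers tethered by one linker, assuming the entire
    linker stays attached (thiol addition to maleimide, no leaving group)."""
    return 2 * monomer_mass_da + linker_mass_da


def mz(neutral_mass_da: float, charge: int) -> float:
    """m/z of a species carrying `charge` protons in positive-ion mode."""
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


dimer_mass = crosslinked_dimer_mass(ASSUMED_MONOMER_MASS_DA, DTME_MASS_DA)
print(f"Monomer: {ASSUMED_MONOMER_MASS_DA:.2f} Da")
print(f"Cross-linked dimer: {dimer_mass:.2f} Da")
for z in (10, 15, 20):
    print(f"  z={z:2d}: monomer m/z {mz(ASSUMED_MONOMER_MASS_DA, z):.2f}, "
          f"dimer m/z {mz(dimer_mass, z):.2f}")
```

Because the non-covalent dimer dissociates during electrospray, a signal at the cross-linked dimer mass indicates that the two monomers were covalently tethered in solution.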

Figure 2-4. Covalent dimerization of pathogenic G93A variant of SOD1 with DTME – A

therapeutic stabilization strategy. (A) Mass spectrum of SOD1G93A. (B) Mass spectrum of

DTME-cross-linked SOD1G93A stabilizes the non-covalent dimer with a covalent cross-link that

is not lost upon ionization.

Studies by other groups indicate that increased SOD1 dimer dissociation occurs in several

fALS variants. A large scale thermodynamic analysis of SOD1 dimer and monomer stability by

the Dokholyan group showed that 70 out of 75 pathogenic SOD1 variants demonstrate decreased

dimer stability and/or increased dimer dissociation.87 Our HDX studies of demetallated fALS

variants (which are predominantly monomeric) are consistent with these observations. However,

we did not detect perturbation at the dimer interface in some of the metallated variants,

indicating that they are not the major species. Given that all of the fALS variants destabilize

SOD1 (either its metallated or apo form) and that the rate-limiting step of SOD1 aggregation is dimer dissociation, the balance of the evidence supports that stabilizing the SOD1 dimer is one viable therapeutic approach (Figure 2-5).

Figure 2-5. (Top) Oxidation of SOD1’s Cys111 and Trp32 perturbs the structure of the protein in

a similar way as any of the pathogenic disease variants, leading to ALS-linked misfolded conformations of the protein. (Bottom) Therapeutic approaches that have been proposed to prevent these oxidation events are (Left) cross-linking proximal Cys111’s on SOD1 monomer

pairs to block the cysteines from oxidation and maintain the homodimer and (Right) blocking

oxidation of Trp32 with a ligand, which also serves to stabilize the protein structure.

Trp32 binding pharmacological chaperones. Because SOD1’s Trp32 was shown to be

oxidized in vivo and had high potential to be involved in protein-protein interactions, we

hypothesized that blocking this residue with small-molecule binding could replicate the effects

we observe when mutating this residue to the less oxidation-prone phenylalanine. Mimicking this

rescue phenotype would allow for increased protein stability and may prevent SOD1-mediated

disease onset and progression. Our group began further examination of this binding site and its

potential as a therapeutic route to combat SOD1-mediated ALS through a combination of in silico docking, protein stability assays, and competitive MS binding assays. 107 The docking was examined using GLIDE,108 searching for compounds that have high theoretical propensity of

binding to this site (Figure 2-5). Three compounds previously shown to bind to Trp32 and three

new compounds from the in silico docking were chosen, and were incubated with unstable

fALS-variants of the protein. Binding affinity and protein stability before and after compound binding were evaluated with microscale thermophoresis and differential scanning fluorimetry.

These compounds were also evaluated by MS through a competitive binding assay under light electrospray conditions. We found that each of the compounds that bound offered only a slight increase in protein stability, with its greater contribution likely being protection from oxidation. In fact, Trp32 and a small ligand binding site in loop II of the protein are the only native surface sites that have been confirmed as drug-targetable.109 Previously, X-ray

crystallography had determined that 5-fluorouridine (5-FUrd) binds to Trp32 through aromatic

stacking, offering increased stability to the protein upon binding.71

2.3.7 “Deamidating” mutations provide proof of concept that PTMs can cause disease.

One of the most commonly studied instances of a PTM causing disease, and one with the

most obvious potential connection between familial and sporadic disease onset, is found with

deamidation. Numerous familial diseases, including the neurodegenerative disorders AD and

ALS, provide genetic proof of the concept that deamidation can cause disease. For example, the

fALS SOD1 mutation, Asn86Asp (asparagine to aspartic acid mutation at residue 86) results in a

protein that is chemically identical to SOD1WT that underwent post-translational asparagine

deamidation at asparagine 86. A cursory review of the mutations that cause various diseases

(Table 2-1) reveals that deamidation mutations (Asn→Asp or Gln→Glu) occur in AD;110

Autosomal Recessive Polycystic Kidney Disease;111 Neuronal Ceroid Lipofuscinosis;112-113

Fanconi Anemia;114 Waardenburg II Deafness;115 and Prostate Cancer.116 If these protein variants

lead to disease, deamidation at the same residue as indicated here could lead to the same disease progression if it occurs on a large scale.

Disease                                         Protein            Mutation                Ref.
ALS                                             SOD1               Asn86Asp; Asn139Asp
                                                TDP-43             Asn378Asp; Asn390Asp    117
Alzheimer                                       Presenilin         Asn135Asp               110
Autosomal Recessive Polycystic Kidney Disease   PKHD1              Asn3175Asp              111
Neuronal Ceroid Lipofuscinosis                  PPT1               Gln177Glu               112
                                                CLN8               Gln256Glu               113
Fanconi Anemia                                  FANCA              Gln1128Glu              114
                                                                   Gln1235Glu              118
Waardenburg II Deafness                         MITF               Asn278Asp               115
Prostate Cancer                                 Androgen receptor  Asn222Asp; Asn756Asp    116

Table 2-1. Select disease-associated mutations that cause protein deamidation.

There have also been documented cases of misdiagnoses when symptoms manifest in

similar ways, as has been the case with ALS and (the deamidation-dependent) Celiac disease. In


Celiac disease, transglutaminase-mediated deamidation of a wheat gluten-derived peptide

renders the peptide antigenic,119-122 thereby eliciting an autoimmune disorder. Celiac disease is

usually associated with neurological symptoms,123-124 including pure motor variants resembling

ALS.124-125 These and more recent studies provide evidence of overlap in Celiac disease and ALS symptoms to the extent that misdiagnosis of Celiac disease for ALS has occurred.126

Transglutaminase 6 antibodies have been found in ALS patient serum.127 Additionally,

cerebrospinal fluid of ALS patients has shown elevated levels of transglutaminase,128 and other

findings show that transglutaminase is involved in aberrant misfolded SOD1 assembly leading to

neuroinflammation and ALS disease progression in mice.129 This implies that sporadic ALS

etiology may have substantial overlap with Celiac disease.

2.4 Conclusion & Perspective

Mass spectrometry has proven an indispensable tool in aiding with localization of

modification sites, elucidation of the structural dynamics of protein variants, and revelation of

the location of pathogenic epitopes in the protein. As modern medicine allows for further

increases in life expectancy, neurological disorders have become more prominent and are

accounting for more and more deaths each year.[86] In fact, AD is practically inevitable if one

lives long enough, increasing exponentially with age and reaching ~50% prevalence by age

95.130 There are comprehensive reviews available about the current state of knowledge on several diseases discussed here, such as ALS,37 AD,131 PD,132 and prion diseases.133 We note that RNA metabolism appears to be critical to additional (non-SOD1) genetic variants of ALS, and was not

addressed by our studies. Another theme common to most, if not all, genetic variants of ALS and

indeed neurodegeneration in general, is that modified proteins inhibit proteolysis, promoting a

vicious cycle that ends in the accumulation and aggregation of misfolded proteins. We have shown that fALS SOD1 mutations selectively inhibit proteasome activity in affected motor neurons and tissues.134-137

Using a variety of toxicology assays, we have determined that oxidative and mutation-based modifications of SOD1 have similar toxicities. Using traditional HDX-MS, we have determined that 14 of the most prevalent fALS mutations result in perturbation of the electrostatic loop, but we have not yet performed these studies with all 180+ fALS variants or with oxidative modifications. Instead, evidence for perturbation of the electrostatic loop of oxygen-modified SOD1 comes from conformation-specific antibodies, and evidence for destabilization from reduction of their unfolding temperatures and from increased rates of global HDX. We have no direct evidence that electrostatic loop perturbation is toxic, per se, only X-ray crystallographic studies showing how electrostatic loop perturbation can lead to aggregation. We also do not understand the allostery of SOD1, in particular how distant structural perturbations affect the electrostatic loop or how electrostatic loop perturbation affects dimer dissociation.

This perspective concerns oxidative modifications of proteins. Metals, as discussed above, are also important structural determinants of many proteins, including SOD1.138-139 For example, loss of copper and zinc results in toxic proteoforms of SOD1 (reviewed in 140). Restoring these metals has resulted in the most effective treatment to date in animal models of ALS,

CuATSM.141-142 Our model for SOD1-mediated ALS indicates that the toxicity of mutations in the SOD1 gene results from changes to SOD1 protein structure. The correlation of changes in

SOD1 physicochemical properties to both fALS progression and in vitro toxicity models supports this. Whether changes in protein structure are responsible for all ALS or only a subset


of the disease is unknown. On the one hand, additional prevalent genetic risk factors for fALS

(FUS, TDP-43, Ubiquilin 2, C9orf72, etc.) generally result in protein aggregation. On the other

hand, these same risk factors also affect RNA function. The disease-relevance of changes

to protein and RNA function remains to be determined.

With few selective pressures against the plethora of different (late onset) fALS SOD1

mutations existing until lifespan rapidly increased in the 20th century, it appears that SOD1 evolved into a space of narrow protein stability. Even the slightest perturbations to SOD1 structure lead to disease. This is consistent with the concept of an entatic state, in which a protein is held in a highly strained and delicate state to promote catalysis. Our studies indicate that fALS mutations exert their toxic effects through a common, aggregation-prone structural intermediate. In human tissue, we identified oxidative PTMs that also destabilize SOD1 and are toxic. Our current therapeutic approach involves preventing these PTMs and kinetically

stabilizing SOD1. We have not demonstrated that oxidative modifications of SOD1WT occur on a large enough scale to match the pathogenicity of fALS-linked SOD1 variants. Whether oxidative

PTMs occur selectively in the highly metabolic motor neuron remains to be determined. Others have proposed that SOD1-linked sALS can progress through a small population of aberrant modifications that lead to protein-templating to convert SOD1WT to a toxic conformation.143

Further creative uses of bio-chemical and -physical experiments coupled with mass

spectrometry show great promise in advancing our understanding of disease at the molecular

level.


2.5 Acknowledgements

These studies began during Agar’s post-doctoral fellowship at the Montreal Neurological

Institute in Heather Durham’s neurotoxicology laboratory, and involved numerous clinical, biochemical, and structural collaborators, including the laboratories of Ashutosh Tiwari, Larry

Hayward, Yoshi Hamuro, Bernard Gibbs, Greg Petsko, Dagmar Ringe, Bob Brown Jr., Jared

Auclair, Nathalie Agar, and Daryl Bosco, and our ongoing collaboration with Bruker Daltonics.

We’d also like to thank all former Agar lab members for their contributions to the research presented.


Chapter 3

The Observation of Conformational Bias in Pan-specific Anti-SOD1 Antibodies

Nicholas D. Schmitt1,2, Meenal M. Chaudhari3, Fnu Ruchika3, Natalie Y. Leung, Jeffrey N. Agar1,2,3,*

1Department of Chemistry and Chemical Biology, 2Barnett Institute of Chemical and Biological Analysis, 3Department of Pharmaceutical Sciences, Northeastern University, Boston, MA, USA.

This chapter was written with intent to publish. It is currently in final revisions.


3.0 Statement of Contribution

Experimental contributions to this chapter by Nicholas D. Schmitt are as follows: Nicholas D. Schmitt, under the guidance of Jeffrey N. Agar, designed all experiments, performed all western blot analyses with assistance from Meenal Chaudhari, Fnu Ruchika, and Natalie Leung, designed and performed global spray-quench HDX analysis, and performed peptide-level HDX experiments with extensive assistance from Richa Sarin. Nicholas D. Schmitt generated all figures and wrote the manuscript.


3.1 Abstract

Numerous Cu/Zn-Superoxide Dismutase (SOD1) immunolabeling studies, often

employing the same reagents, have produced contradictory results with respect to the

involvement of modified wild-type SOD1 in sporadic ALS. Polyclonal anti-SOD1 antibodies are

a mainstay of these and other immunopurification, western blotting, immunocytochemistry, and

immunohistochemistry studies. These antibodies are raised against purportedly folded and active

SOD1, are widely considered able to recognize both native and denatured SOD1, and are consequently termed “pan-SOD1” (pan-specific anti-SOD1) antibodies. Their many uses include detecting the total quantity of SOD1 and normalizing the results of conformation- and

modification-specific antibodies. We show that contrary to prevailing thought, these polyclonal

antibodies are not pan-SOD1 antibodies; to the extent that in certain preparations the entire

cellular complement of SOD1 is not detected. We found that all conformations of SOD1 other

than fully reduced and denatured preparations had diminished pan-SOD1 activity in western

blots, especially relatively folded, dimeric SOD1. Using a conformation-sensitive global

hydrogen-deuterium exchange mass spectrometry assay, SOD1 proteoforms, conformation and

dynamics were distinguished with respect to the extent of antigen retrieval. Based upon these

results, antibody recognition was optimized, resulting in a method that allows all forms of SOD1

to be detected by western blot, recovering signal from heretofore underrepresented native

conformations.


3.2 Introduction

Over 30,000 manuscripts include reference to Cu/Zn-Superoxide Dismutase (SOD1), a homodimeric protein that converts the reactive oxygen species (ROS) superoxide into hydrogen peroxide and molecular oxygen. In its role as the cell’s principal antioxidant enzyme, and in other emerging roles, SOD1 is an important modulator of the aging process. In 1993, mutations in the SOD1 gene were shown to cause 20% of familial amyotrophic lateral sclerosis (fALS).2 These mutations result in

a gain-of-toxic function and perturb SOD1’s native structure. A variety of anti-SOD1 antibodies were used to help characterize these structural changes, which include reduction of the native, intrasubunit disulfide (Cys57 – Cys146); loss of metals; dimer dissociation;144 post-translational

modification; and loop-disorder.145

Post-translational modifications, including those that lead to the formation of higher-order

structures, occur during aging and are accelerated by diseases of aging. This led many to propose

a role for SOD1 modification in sporadic neurodegeneration. Anti-SOD1 antibody titers are

associated with survival in sporadic ALS.146 Immunization with peptides that are revealed in

unfolded forms of SOD1,144, 147 as well as anti-SOD1 antibodies, are also being considered as

ALS treatments.148 Using anti-SOD1 antibodies and other tools, modified SOD1 has been

observed in healthy controls,41 aging,149 Parkinson’s disease,150 and Alzheimer’s disease.150 Using conformation-specific and pan-SOD1 antibodies, modifications of SOD18, 151 and SOD1-positive inclusions57,

152-153 have been observed in sporadic ALS. There are high-profile examples of conflicting

results obtained with these antibodies.154 For example, some studies detected unfolded SOD1 in

ALS tissues,8, 151 whereas others did not.15, 17 Considering the rapid development of SOD1- targeting therapies, the resolution of this controversy carries therapeutic implications.155


Whether SOD1 is involved in a portion of sporadic ALS remains one of the largest

controversies in the field and is based entirely on some well-respected groups observing, and others not observing, anti-misfolded SOD1 immunolabeling in sporadic patients. Appropriate use of these antibodies requires that proteins not be exposed to the temperature extremes or detergents used in standard protocols, including native-gel electrophoresis preceding western blotting, and “gentle”, low temperature epitope presentation preceding conformation-specific immunohistochemistry. Slight differences in methodology, for example the heating step during antibody presentation, have been hypothesized as potential confounds, but have not been confirmed experimentally. Defining the appropriate experimental conditions for such antibodies is a critical unmet need.

Results with conformation-specific antibodies tend to be normalized to a positive control

pan-SOD1 antibody, and are therefore predicated upon antibodies that can recognize the entire

complement of SOD1. SOD-100 is the most commonly used pan-SOD1 antibody and is raised

in rabbits. It is thought to be raised by the inoculation of rabbits with commercially available

intact SOD1 purified from human erythrocytes, though exactly how this occurs is a trade secret.

Our group has also created SOD-100-like polyclonal antibodies using commercial SOD1

preparations and rabbits.41 SOD-100 and similar antibodies raised against native SOD18, 41 have been used to detect denatured SOD1 in denaturing SDS-PAGE western blotting, in immunocytochemistry and immunohistochemistry studies,156 and to immunopurify SOD1.41

These studies are consistent with SOD-100’s ability to interact with a wide variety of SOD1

conformations and its use as a pan-SOD1 antibody.

Here, we show that the SOD-100 antibody has little-to-no affinity for relatively folded

native forms of SOD1 in western and dot blots. We also developed a mass spectrometry assay


capable of monitoring treatment-dependent protein conformation in solution that is more

sensitive than charge state distribution mass spectrometry. The results of this assay applied to the

SOD1 conformers we generate indicate at least five unique conformations of SOD1 exist in

solution. Based upon these results we develop a western blotting assay that normalizes pan-

SOD1 activity. Collectively, our results demonstrate yet another example of the importance of protein standard and antibody validation.

3.3 Experimental Procedures

3.3.1 SOD1 Sample Preparation

Human SOD1 was purified from yeast cells using hydrophobic interaction chromatography followed by anion exchange chromatography as previously described.38 Mass

spectrometry revealed this preparation of SOD1 to be N-terminally acetylated as is found in humans, with an intact intramolecular disulfide bond, and lacking any unnatural modifications.

Protein samples used in either gel electrophoresis experiments or dot-blotting experiments were prepared to the following SOD1 sample types: 1) Non-treated Sample (N) – SOD1 was left in 10

mM ammonium acetate at room temperature for the duration of any other sample treatments. 2)

Reduced Sample (R) – SOD1 was incubated at room temperature in 10 mM ammonium acetate

with 10 mM DTT for 10 minutes. 3) Heated Sample (H) – SOD1 was incubated in 10 mM ammonium acetate at 95 °C for 10 minutes. 4) Reduced and Heated Sample (RH) – SOD1 was incubated in 10 mM ammonium acetate with 10 mM DTT at 95 °C for 10 minutes.


3.3.2 Cell culture

HeLa cells were cultured in DMEM with 10% fetal bovine serum and

penicillin/streptomycin in a 24-well Corning CellBIND surface (Corning Life Sciences,

Tewksbury, MA) with 5% CO2 at 37 °C. The cells were cultured to monolayer confluency.

3.3.3 Dot Blotting

When monomer/dimer status of SOD1 was already known, gel electrophoresis was

skipped in the event that it may have been modifying SOD1 conformation and dot-blotting was

used for direct on-membrane probing of SOD1 conformation. SOD1 was diluted to 0.5 mg/ml

and 2 µL of this solution under experiment-specific buffer conditions was spotted onto a nitrocellulose membrane before allowing the sample to dry and proceeding to western blotting procedure, without the electrophoretic transfer step.

3.3.4 Gel Electrophoresis

Prepared and treated protein samples were loaded on a Mini-Protean TGX Native Gel

(BioRad, Hercules, CA) with no SDS in the sample or running buffer and no reductant present.

Samples were run until separated, typically 45 minutes at a constant amperage of 120 A. At this

point, an optional in-gel denaturing and reduction step was used in certain cases, depending on

the desired outcome of the experiment and is described below.

3.3.5 In-gel Denaturing and Reduction Procedure

The steps described here were added to the previously described gel electrophoresis when the experiment described used this optional step. Following native protein gel electrophoresis,


water in a water bath was heated to just-boiling, and a glass container floating in the water bath was filled to a depth of one inch with 1X Tris-Glycine buffer with 10 mM BME and allowed to reach 90 °C in a fume hood. The protein gel was submerged in the reductive buffer and incubated for 8 minutes, then rinsed with water before transfer to a nitrocellulose membrane.

3.3.6 Western Blotting

A Trans-Blot Turbo was used for protein transfer from the native gel to a 0.2 µm nitrocellulose membrane (Millipore, Billerica, MA, USA). Following transfer, the membrane was incubated in 0.1% Ponceau S with 5% acetic acid for 5 minutes, before de-staining and confirmation of protein transfer. The Ponceau stain was then washed off and the membrane was blocked in 5% milk in TBST for 1 hour at room temperature. The membrane was then rinsed and incubated in 2.5% milk TBST with SOD-100 primary antibody (StressGen, San Diego, CA,

USA) overnight at 4 °C. Following primary antibody incubation, the membrane was washed 5 times in 1X TBST and then incubated in Pierce Goat Anti-Rabbit IgG secondary antibody

(Thermo Scientific, Waltham, MA, USA) in 2.5% milk TBST for 1.5 hours at room temperature.

Following secondary antibody incubation, the membrane was washed and exposed to Pierce®

ECL Western Blotting Substrate reagents for 5 minutes prior to imaging with a Bio-Rad Gel Doc

Imager. Images were acquired with exact exposure times dependent on the experiment to allow for comparison between various membranes that were part of the same experiment but imaged separately.


3.3.7 In-Gel Assay to Detect Full Cellular SOD1-component

Untreated HeLa cells were lysed in 6X-SDS-PAGE loading buffer. This aliquot was split to create identical samples that were then heated to a range of temperatures (100, 90, 80, 70, 60,

25 °C), with half of the samples containing 2-ME as a reductant. The In-gel Denaturing and

Reduction Procedure was then followed.

3.3.8 SOD1 Protein Mass Spectrometry

SOD1 was characterized using Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS). For ESI-FTICR-MS analysis, 1 µM SOD1 was introduced into the mass spectrometer at an infusion rate of 2 µL/min and ionized at atmospheric pressure using electrospray ionization (ESI) in positive ion mode. Mass spectra were acquired using a 9.4 Tesla SolariX XR FT-ICR-MS using ftmsControl (Bruker Daltonics,

Billerica, MA). Spectra were analyzed using DataAnalysis (Bruker).

3.3.9 Conformation-Sensitive Global Hydrogen-Deuterium Exchange Mass Spectrometry

Native FTICR Mass Spectrometry spraying from 50:50 H2O:D2O with 10 mM ammonium acetate pH 7.4 was used to distinguish SOD1 conformers. Recombinant human

SOD1 after various pre-treatments, which had previously been extensively characterized by our lab,4, 157 was diluted from 50 to 1 µM in 50:50 water and deuterium oxide with 10 mM

ammonium acetate at pH 7.4 containing no added organic solvent or acid, maintaining aqueous treatment-dependent conformation conditions. These samples were pre-treated as specified in 10 mM ammonium acetate and then diluted in a 10 mM ammonium acetate buffer containing 50% water and 50% deuterium oxide. Mass/charge measurements were taken on a logarithmic


timeline from 90 seconds to 6.4 hours in real time employing a spray-quench method. In data

analysis, relative rates of deuteration were measured with a consistent method between samples.

The samples were directly infused into a SolariX XR FT-ICR Mass Spectrometer for

measurement (Bruker Daltonics, Billerica, MA). This assay left solvent-accessible side-chains

and amide hydrogens half-deuterated upon measurement to assess exposed protein sidechains as

a proxy for exposed surface area.
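As a rough illustration of how the half-deuterated state described above translates into an intact-mass shift, the following sketch assumes that every exchanged site carries, on average, a deuterium fraction equal to that of the 50:50 solvent (0.5) and that isotope fractionation and back-exchange are negligible. The function names and the example site count are hypothetical; they are not taken from the data analysis used in this work.

```python
# Monoisotopic masses in Da.
MASS_H = 1.007825  # protium (1H)
MASS_D = 2.014102  # deuterium (2H)


def expected_mass_shift(n_sites: float, d_fraction: float = 0.5) -> float:
    """Mass increase (Da) when n_sites hydrogens equilibrate with the solvent.

    Assumption: each exchanged site is deuterated with probability d_fraction.
    """
    return n_sites * d_fraction * (MASS_D - MASS_H)


def exchanged_sites(mass_shift_da: float, d_fraction: float = 0.5) -> float:
    """Invert expected_mass_shift: number of sites implied by a measured shift."""
    return mass_shift_da / (d_fraction * (MASS_D - MASS_H))


# Hypothetical example: 100 solvent-accessible exchangeable hydrogens.
shift = expected_mass_shift(100)
print(f"Expected shift for 100 sites: {shift:.3f} Da")
print(f"Sites implied by that shift: {exchanged_sites(shift):.1f}")
```

Under these assumptions each exchanged site contributes roughly 0.5 Da, so a more-exposed conformer should appear at a higher intact mass than a more-protected one at the same time point.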

3.4 Results

3.4.1 Pan-SOD1 antibody exhibits preparation-dependent selectivity

In the course of SOD-100 immunolabeling validation studies of SOD1, we observed

remarkable preparation-dependent differences in labeling intensity for samples with the same

quantity of SOD1. On the one hand, SOD-100 can immunodeplete and purify active SOD1 from a variety of preparations41 and is able to intensely label SOD1 in denaturing western blots,

consistent with pan-SOD1 activity. On the other hand, we observed SOD1 immunoreactivity

nearly disappear in non-reducing western blots (Figure 3-1). Given that such bias could provide a

basis for the disparate results obtained in ALS immunolabelling studies, we sought to determine the nature of this bias and then overcome it.


Figure 3-1. Western blotting with SOD-100 is biased against the detection of more-folded

conformations of SOD1, significantly underrepresenting the presence of these conformations.

SOD1 is a 15.9 kDa protein that migrates faster in native gels when more-denatured. Top: Four

lanes of 1 µg / lane SOD1 pre-treated as indicated and described in the experimental procedures

section were loaded in a non-reducing native gel. SOD1 quantitation appears relatively consistent when examined using Coomassie stain albeit with more sample spreading occurring in the pre-heated samples. Bottom: Western blot detection would suggest that these same four


samples are present in very different quantities. If SOD-100 detection of SOD1 were consistent

with universal recognition of several epitopes on the protein, western blot quantitation would

resemble Coomassie quantitation. SOD-100 better-recognizes denatured conformers of SOD1,

with its preference being the reduced and denatured sample. Bars represent average of three

individual protein gels and blots and image chosen was representative of the data, with relative

intensities to the most intense band being reported.
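The quantitation described in this caption (intensities relative to the most intense band, averaged over three blots) can be sketched as follows. This is only one plausible reading: it assumes each blot is scaled to its own most intense band before averaging, which is not stated explicitly above. The function name, lane labels, and densitometry values are placeholders invented for illustration and are not measurements from this work.

```python
from statistics import mean


def relative_band_intensities(blots):
    """Scale each blot to its most intense band, then average across blots.

    blots: list of dicts mapping a lane label to a raw band intensity.
    Assumption: every blot contains the same set of lanes.
    """
    scaled = []
    for blot in blots:
        top = max(blot.values())
        scaled.append({lane: value / top for lane, value in blot.items()})
    return {lane: mean(s[lane] for s in scaled) for lane in scaled[0]}


# Placeholder values in arbitrary units (NOT experimental data).
blots = [
    {"lane_1": 10.0, "lane_2": 40.0, "lane_3": 50.0, "lane_4": 100.0},
    {"lane_1": 20.0, "lane_2": 60.0, "lane_3": 80.0, "lane_4": 200.0},
    {"lane_1": 5.0, "lane_2": 20.0, "lane_3": 30.0, "lane_4": 50.0},
]

for lane, value in relative_band_intensities(blots).items():
    print(f"{lane}: {value:.2f}")
```

Scaling within each blot before averaging keeps differences in exposure between separately imaged membranes from dominating the average.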

We began by addressing the repeatability of anti SOD1 antibody bias, ordering additional

SOD-100 lots, and retrieving archived SOD-100 lots as well as our own polyclonal antibodies4, 41, 62

generated using commercially available SOD1 (S9636, Sigma-Aldrich, St. Louis, MO, USA), which was folded and active upon injection into rabbits. Our results were not observed to be antibody or SOD1 preparation-dependent; rather, they appear to be a consistent feature of rabbit anti-SOD1 polyclonal antibodies prepared using commercially available SOD1. To ensure that these results were not an artifact of protein preparation of SOD1, we repeated experiments on multiple enzymatically-active and mass spectrometry-validated human SOD1 preparations38, 104

expressed in, and purified from, yeast cells using standard methods; and unpurified SOD1 from

human cell line extracts.

Having ruled out antibody lot- and SOD1-source specific artifacts, we explored the

nature of SOD1 preparations that do and do not exhibit strong SOD-100 immunoreactivity. A

salient structural feature of SOD1 is that it is among the most stable mesophilic proteins, with its

native form denaturing at temperatures > 90 °C. The loss of native Cu or Zn, and reduction of the

native intramolecular disulfide, are concerted and decrease this melting point by as much as 30

°C.158 In addition, SOD1 undergoes a three-state unfolding transition that begins with dimer


dissociation to folded monomers, followed by an unfolding of monomers.159-160 Thus, a variety of structural permutations could provide a basis for the observed differences in immunoreactivity.

Due to the published literature indicating SOD-100 is able to recognize native and

unfolded SOD1, it was impossible to predict whether a given permutation would increase or

decrease immunoreactivity. We therefore systematically characterized the immunoreactivity of

treatments that are known to unfold native SOD1 to varying (or no) degree, creating non-treated

native (N), intramolecular disulfide-reduced (R), heated (H), and disulfide-reduced and heated

(RH) samples as described in the experimental procedures section. Neither heating to 95 °C161

nor DTT treatment alone fully denatures SOD1, but the two treatments do so when combined. We then compared the

western blotting intensity of non-treated, heated, disulfide-reduced, and heated/reduced SOD1

(Figure 3-1). These results indicated that the western blot signal intensity of native SOD1 was

remarkably lower than that of denatured SOD1. In addition, neither heating nor reduction

treatment (i.e. partially folded) alone resulted in western blot intensity as high as their combined

treatment (i.e. denatured) (Figure 3-1). These results suggest that these four distinct treatment

patterns of SOD1 generated at least four distinct conformational populations of SOD1 with key

structural differences and the least-native preparations exhibiting the highest immunoreactivity.

3.4.2 Determining the mechanistic basis of antibody affinity

To determine whether SOD-100 detection was being biased by post-translational

modifications, we characterized SOD1 expressed in yeast and purified in our laboratory by mass

spectrometry. We determined that this SOD1 was devoid of non-native PTMs, having proper N-

terminal acetylation in all cases and native intramolecular disulfide linkage before reduction. An


alternative hypothesis for the observed differences in immunolabeling of the N, H, R, and RH SOD1 preparations was differences in transfer efficiency from the SDS-PAGE gel to the

immunoblotting membrane. Transfer efficiency of the generated-conformations was tested in

three ways: i) membrane-binding of SOD1 by dot blot under two saline conditions using both

Ponceau S and amido black to probe for salinity and protein-binding-dye differences; ii)

Coomassie staining of protein gels after transfer to confirm protein was not getting stuck in the

gel matrix; iii) Ponceau S or amido black staining of nitrocellulose membranes after protein

transfer from gels. All of these methods indicated that protein transfer was not responsible for

loss of SOD-100 signal.

Membrane-protein interactions can be indirectly probed through dot-blotting out of

solutions of different ion content. Treating samples in our previously described N, H, R, RH

scheme and applying them to nitrocellulose under either low salt (1 mM) or high salt (1 M)

conditions, we observed three key findings. i) Our total protein quantitation results were likely

not biased due to dye-protein interactions as we observed similar results with both Ponceau Red

and Amido black, which have quite different structures and have been shown to bind to proteins

in slightly different ways162 (Figure 3-2). ii) Membrane binding capacity likely wasn’t the issue,

as all conformations of SOD1 seemed to adhere in relatively equal quantities with no clear trend

observed. ImageJ quantitation163 reveals that these protein molecules are indeed conserved on the membrane despite spreading more to find a place to bind or having slower on-rates (Figure 3-2).

iii) Treatment of the nitrocellulose membrane at 90 °C in 1X TBS with 10 mM 2-mercaptoethanol (reductant) for 8 minutes resulted in complete SOD-100 signal recovery for

SOD1 that was heated to 95 °C for 10 minutes prior to dot-blotting, indicating the SOD1 was

present but simply not previously presenting the correct antigens for SOD-100 (Figure 3-3).


Taken together, these results suggested that poor transfer efficiency was not responsible for the observed differences in western blot intensity and epitope recovery is possible both in vitro and on membrane.

Figure 3-2. Heat denaturation of SOD1 prevents surface-spreading if transferring proteins to

nitrocellulose membrane under high salt conditions. Folded SOD1 does not present enough

hydrophobic surface to efficiently transfer to nitrocellulose membrane without spreading. SOD1

was incubated under indicated conditions and spotted to membrane in both low salt solution (1


mM NaCl) and high salt solution (1 M NaCl). All forms of protein seemed to have high binding

capabilities, while the two non-heated forms of SOD1 were seen to spread more before adhering

to the membrane in high salt condition. Quantitation with ImageJ revealed that SOD1, though

spread out more on the membrane, was still present in the same quantities when probed with two common protein dyes, Ponceau S and Amido Black. These results also indicate normalization of protein conformation before transfer to membrane is essential for consistent transfer to membrane. Bar graphs indicate relative abundance between each set of samples, quantified relative to the most intense band for each dye used, and are averages of the intensities of the three visible spots. These results suggest different surface-exposed area or residues for these different conformations of SOD1.


Figure 3-3. Heat/Reduction treatment of nitrocellulose membrane can recover SOD1’s SOD-100 epitope exposure. Heat alone can recover signal of previously-reduced SOD1, but both treatments are required to normalize signal of all samples. Quantitation shows average of the three visible replicates.


3.4.3 SOD1 conformation assessment by conformation-sensitive global-HDX-MS

Native charge state distribution mass spectrometry was used to examine these various

treatment-generated conformations of SOD1 in greater detail, revealing that the non-treated and disulfide-reduced conformations adopted predominantly a 6+ charge state, whereas the heated and the heated and reduced samples adopted predominantly an 8+ charge state. To improve upon this assay’s sensitivity, we changed the aqueous portion of the spray solvent to 50:50 H2O:D2O, while otherwise maintaining the 10 mM ammonium acetate pH 7.4 buffer. As these samples were measured without an in vitro quenching step replacing exchanged deuterons with solvent hydrogens, fast-exchanging sidechains and solvent-accessible backbone amides remained at their in-solution deuteration levels, which gave us a more sensitive method to observe conformational differences between samples, as shown in Figure 3-4. Four separate preparations of SOD1 were tested as described previously in the experimental procedures section: non-treated, reduced, heated, and reduced and heated.
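To give a sense of where the 6+ and 8+ charge states fall on the m/z axis, the sketch below applies the standard relation m/z = (M + z·m_proton)/z for a protonated ion. It assumes a nominal neutral mass of 15,900 Da (the 15.9 kDa figure quoted for SOD1 in the Figure 3-1 caption) and ignores bound metals, modifications, and deuterium uptake, so the values are illustrative only. The function name is hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def mz_of_charge_state(neutral_mass_da: float, charge: int) -> float:
    """m/z of an [M + zH]z+ ion for a given neutral mass and charge."""
    return (neutral_mass_da + charge * PROTON_MASS) / charge


NOMINAL_SOD1_MASS = 15_900.0  # Da; nominal value, an assumption for illustration

for z in (6, 8):
    print(f"{z}+ : m/z {mz_of_charge_state(NOMINAL_SOD1_MASS, z):.2f}")
```

Because m/z falls as charge rises, a shift of the predominant charge state from 6+ to 8+ moves the signal to lower m/z, which is consistent with a more extended conformer accepting more charge during electrospray.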


Figure 3-4. Multiple conformational states of SOD1 are revealed by conformation-sensitive non-quenching intact protein global-HDX. Hydrogen/deuterium exchange intact protein mass spectrometry shows at least five distinct conformational states of SOD1 in solution using a spray-quench global HDX method, with variations between different metalation states within samples.

At least two metalation states were observed for all samples except for non-treated native, representing the protein binding one or both of its copper and zinc cofactors. In the case of the heat-only pre-treatment, two unique conformations were observed for the 1 metal cofactor state, designated as conformations A and B. Measurements were taken beginning at 90 seconds, doubling the time interval between each successive measurement until the 23,040-second (6.4 hr) time point was reached. The average of three separate experimental replicates is shown for each sample. The predominant charge state observed for the native and reduction-only samples was 6+, whereas for the denatured and the reduced and denatured samples it was 8+.
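The acquisition schedule described in this caption can be reproduced with a short sketch. It assumes that "doubling" means each time point is twice the previous one, a reading chosen because 90 s doubled eight times gives exactly 23,040 s. The function name is hypothetical.

```python
def doubling_timepoints(first_s: int, last_s: int) -> list[int]:
    """Return acquisition times (s) that double from first_s up to last_s."""
    times = []
    t = first_s
    while t <= last_s:
        times.append(t)
        t *= 2
    return times


timepoints = doubling_timepoints(90, 23_040)
for t in timepoints:
    print(f"{t:>6d} s  ({t / 3600:.3f} h)")
```

Under this reading the series contains nine time points, the last of which (23,040 s) corresponds to 6.4 h.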


The results shown in Figure 3-4 indicated that all pre-treatment conformations of SOD1 had different levels of deuterium uptake with varying exchange dynamics, indicating at least five unique in-solution SOD1 conformations. All treated samples were observed to have more-unfolded conformations than the non-treated sample. We observed that our reduced/heated

sample formed a more compact and side-chain protected conformation than the heat-only

sample, indicating levels of re-annealing that could indicate an alternative non-native but folded

conformation which allows for presentation of SOD-100’s primary epitope. Additionally, two

alternate conformations of single-metal-binding SOD1 in the heat-treated sample were observed,

one matching the two-metal-binding conformation and one unique, showing that metalation state

is an important driver of protein conformation and likely contributes to antibody recognition in

all conformations. This technique provided a sensitive method to monitor protein conformation

that was used to ensure heat and reductant treatment normalized protein conformation prior to

western blot detection.

3.4.4 Development of a normalization method for total SOD1 quantitation

We have demonstrated thus far that SOD-100 has an increased affinity for reduced and

denatured SOD1 and a decreased affinity for more-folded conformations, showing at least five

unique conformations present as confirmed by mass spectrometry. We then developed a method

that could normalize SOD-100 detection of SOD1, regardless of the conformation of the sample

prior to gel electrophoresis. In this way, differences in conformation could be maintained and

observed through migration distance or through SOD-100 probing, but if quantitation of total

SOD1 was required, a single-step procedure could be used to normalize SOD1 epitope

presentation. We had previously observed by dot blot that signal of previously heated SOD1 by

SOD-100 could be recovered using a reductive membrane treatment, but also that normalizing

conformation prior to nitrocellulose transfer resulted in more even and consistent transfer

without diffusion across the membrane. Incubation of the protein gel in 1X Tris-Glycine buffer with 10 mM 2-ME for 8 minutes at 90 °C normalized protein conformation of all samples tested

to relatively the same signal intensity (Figure 3-5). The combination of treatments without over-

treatment consistently showed signal recovery and normalization of all samples.

Figure 3-5. In-gel heating and reduction protocol enables normalization of SOD1 epitope presentation for SOD-100. Western blot detection using SOD-100 primary antibody and appropriate (goat anti-rabbit) HRP-conjugated secondary antibody under various sample pre-treatment and gel treatment conditions. Samples were pre-treated in vitro prior to gel-loading, run on a native gel in the absence of SDS and reductant, and incubated in 1X Tris-Glycine

buffer for 8 minutes (No In-Gel Treatment), with either heat added to 90°C (In-Gel Heating), 10

mM DTT added (In-Gel Reduction), or both heat and DTT (In-Gel Heating and Reduction).

After gel treatments, proteins were semi-dry-transferred to nitrocellulose membrane for antibody

probing. Heat treatment of the gel was seen to improve detection of previously reduced or native

samples. Reduction treatment improved detection of previously heated sample. Heat and

reduction treatment together normalized detection of all samples. Quantitation relative to the

most intense band in each blot is shown as the average of three replicates, with the most representative

gel image chosen for display.
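
The quantitation described in this caption (each band expressed relative to the most intense band in its blot, then averaged over replicates) can be sketched as follows. The lane labels and intensities are hypothetical placeholders, not data from this work, and the function name is an assumption made for illustration.

```python
def relative_band_intensities(blots):
    """Normalize each blot to its most intense band, then average replicates.

    blots: list of replicate blots, each a dict mapping lane label -> raw intensity.
    """
    lanes = list(blots[0])
    normalized = [
        {lane: blot[lane] / max(blot.values()) for lane in lanes} for blot in blots
    ]
    return {
        lane: sum(rep[lane] for rep in normalized) / len(normalized) for lane in lanes
    }


# Hypothetical raw densitometry values for three replicate blots.
replicates = [
    {"native": 50.0, "heated": 100.0},
    {"native": 30.0, "heated": 60.0},
    {"native": 20.0, "heated": 40.0},
]
averaged = relative_band_intensities(replicates)
print(averaged)
```

Normalizing within each blot before averaging keeps differences in exposure between replicate blots from dominating the mean.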

3.4.5 Development of a cell-based assay of SOD1 disulfide status and quaternary structure.

With this protocol in place, we were able to return to our original goal of developing a

cellular assay to monitor small molecule-mediated SOD1 cross-linking. A number of experiments require non-reducing sample preparation and gel conditions. These include defining the ratio of reduced to oxidized native SOD1 disulfide; assessing quaternary structure, which varies from monomer to microscopically visible inclusions in ALS-associated variants; detecting

S-thiolation; and screening for compounds that stabilize the SOD1 dimer. Reduction transforms

SOD1 from a dimer to a monomer, releases S-thiolation, and breaks the native disulfide.

Consistent with the studies with purified proteins presented above, SOD1 was not detectable

from HeLa cell extracts following incubation in non-reducing 6X-SDS PAGE sample loading buffer even after incubation at 60 °C for 5 minutes in a thermomixer (Figure 3-6). We therefore reason that this protocol can easily be used following non-reducing gel electrophoresis on cell extracts to detect the entire cellular complement of SOD1, regardless of conformation, disulfide status, or cross-linker present prior to analysis.

Figure 3-6. In-gel heating and reduction protocol enables cell-based assay detection of crosslink-stabilized SOD1. (Top) Reduction of samples or heating to 80°C creates an SOD-100-detectable SOD1 monomer (red arrows). (Bottom) Reduction and heating of the gel, following electrophoresis but before electroblotting, allows the detection of SOD1 dimers (blue arrows).

3.4.6 Potential pitfalls of anti-SOD1 antibodies and current antibody validation protocols.

In “A Proposal For Validation of Antibodies” Uhlen et al. gave five different methods

for vetting antibodies, e.g. knocking down the gene of interest, using a second antibody,

and immunoprecipitation MS.12 Despite the inconsistencies presented here for anti-SOD1 antibodies,

they would have been validated using any of the five acceptable measures. As these authors

remarked, antibodies to post-translationally modified proteins, which we note should explicitly

include folding variants, “may require a unique set of strategies for confirming antibody

specificity.” The unusually high thermal stability of SOD1, the counterintuitive specificity of

SOD-100 (an antibody raised against folded SOD1) for unfolded SOD1, and purification-based,

unnatural post-translational modification affecting a potential antigen, led to a circumstance where

SOD1 is not detected by western blotting (and presumably immunohistochemistry). Given that

part of the problem is inherent to SOD1, as well as the many contradictory results using other anti-SOD1 antibodies, this problem is likely to be relevant to SOD1 immunochemistry in general.

3.5 Discussion

Here, we have demonstrated an easily observable discrepancy that arises when probing for SOD1 in both protein-specific and cell extract experiments that can confound both qualitative and quantitative analysis through polyclonal antibody probing of protein antigens. This issue of conformation-specific selectivity is present when the protein is probed under non-denaturing and/or non-reducing conditions. We have provided a method for recovering signal from less-probable conformations of the protein that allows for normalization of quantitation and detection following non-denaturing PAGE and native dot-blotting. This method can be used effectively and preferably on a protein gel following electrophoresis and prior to protein transfer to membrane, but can also be performed slightly less effectively on a nitrocellulose membrane. We have additionally developed an improvement of native charge state distribution mass spectrometry that incorporates deuterium exchange to increase sensitivity when monitoring protein conformation. This assay can be used in quality control to ensure conformation is consistent between different preparations of proteins prior to immunochemistry experiments.

Our results indicate that many previously published works may have their results inaccurately portrayed due to this discrepancy between claimed antibody selectivity and actual

experimental selectivity. We recommend that all those studying this protein with

immunochemistry methods perform this or a similar procedure prior to antibody probing. It is not our purpose to determine which publications were unknowingly analyzed incorrectly in light of these revelations, only to inform the research community of these findings and raise this

question for all antibodies. We hope that each group can ensure that their previous interpretations

are correct and help lead SOD1-mediated ALS research in a successful direction going forward.

Chapter 4

Increasing Top-Down MS Sequence Coverage by an Order of Magnitude through Optimized

Internal Fragment Generation and Assignment

Nicholas D. Schmitt,1,3 Joshua M. Berger,1,3 Jeremy B. Conway,1,3 Jeffrey N. Agar1,2,3,*

1Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, United States 2Department of Pharmaceutical Sciences, Northeastern University, Boston, MA 02115, United States 3Barnett Institute for Chemical and Biological Analysis, Northeastern University, Boston, MA 02115, United States

Reproduced with permission from Analytical Chemistry, submitted for publication. Unpublished work copyright 2020 American Chemical Society.

4.0 Statement of Contribution

Experimental contributions to this chapter by Nicholas D. Schmitt are as follows: Nicholas D.

Schmitt, under the guidance of Jeffrey N. Agar, designed all experiments, performed all Mass

Spectrometry analyses, preprocessed all data, processed all data with Joshua Berger, generated all figures, and wrote the manuscript.

4.1 Abstract

A major limitation of intact protein fragmentation is the lack of sequence coverage within

proteins’ interiors. We show that collisionally activated dissociation (CAD) produces extensive

internal fragmentation within proteins’ interiors that fills existing gaps in sequence coverage, including disulfide loop regions that cannot be characterized using terminal fragments. A barrier to the adoption of internal fragments is the lack of standardized methods for their generation and

assignment. To provide these we explore the effects of protein size, mass accuracy, internal

fragment size, CAD activation energy, and data preprocessing upon the production and

identification of internal fragments. We also identify and mitigate the major source of ambiguity

in internal fragment identification, which we term “frameshift ambiguity.” Such ambiguity

results from sequences containing any “middle” portion surrounded by the same composition on

both termini, which upon fragmentation can produce two internal fragments of identical mass,

yet out of frame by one or more amino acids (e.g. TRAIT producing TRAI or RAIT). We show

that such instances permit the a priori assignment of the middle sequence portion. This insight

and our optimized methods permit the unambiguous assignment of greater than 97% of internal

fragments using only accurate mass. We show that any remaining ambiguity in internal fragment

assignment can be removed by consideration of fragmentation propensities or by (pseudo)-MS3.

Applying these methods resulted in a 10-fold and 43-fold expanded number of identified ions,

and concomitant 7- and 16-fold improvement in fragmentation sites, respectively, for native and

reduced forms of a disease-associated SOD1 variant.

4.2 Introduction

Comprehensive protein analysis with mass spectrometry is traditionally performed

“bottom-up” by digesting the protein into smaller fragments with proteases and then analyzing

those fragments.164-165 Newer methods such as middle-up, middle-down, and top-down, where larger segments or the entire protein are analyzed, have gained popularity as technologies have improved.19, 166 These methods offer improved association of post-translational modifications

(PTMs) to proteoforms and minimize sample preparation artifacts.6, 167 Of the

activation/dissociation techniques used for top-down MS, collisionally activated dissociation

(CAD) remains the most accessible due to the availability of in-source and collision-cell/ion-trap

CAD on practically all hybrid mass spectrometers. The goal of this work is to enable MS

practitioners to improve the CAD MS/MS sequence coverage for top-down experiments, via the

use of internal fragment ions, particularly in the “missing middle” within the interior of proteins.

Notably, our results can also be applied to bottom-up and middle-down MS/MS, as well as other

dissociation methods.

In a seminal top-down study, McLuckey and colleagues demonstrated that different

(intact) ubiquitin charge states produced remarkably different product ions during resonant (i.e.

ion-trap) excitation.168 The resulting product ions, however, could not be assigned due to the

limited resolution and mass accuracy of that era’s ion trap instrumentation. We debuted “Big

Mascot” (later marketed as Mascot TD), an upgrade to the popular bottom-up Mascot search engine,169 for automated top-down database searches22 and defined methods for assigning

internal fragment ions generated in top-down MS.20, 22 Using Mascot TD, Fourier transform ion

cyclotron resonance (FTICR) MS, a range of activation energies, and additional proteins, we

confirmed the charge state-dependence for top-down CAD and assigned >95 % of the resulting

product ions.20 We also elucidated the prevalent top-down fragmentation channels and their energy-dependence, and showed that internal fragments (those resulting from at least two backbone cleavages and not retaining the protein’s N- or C-terminus) were among the most prevalent ions produced via top-down CAD MS/MS.

From this work and the comprehensive studies of polypeptide fragmentation propensities by the Wysocki (bottom-up)170-174 and Kelleher groups (native- and denatured-state top-down)21, 175 it is clear that peptides and proteins undergo similar CAD fragmentation mechanisms.

Kelleher and colleagues provided a rationale for the increased prevalence of internal fragments in top-down MS, i.e. the number of possible internal fragments scales as $\frac{(n - l - 1) + (n - l - 1)^2}{2}$, where n represents the protein length and l represents the minimum fragment length

considered.21 Kelleher and colleagues proposed methods for presenting internal fragmentation

maps,21 which we employ and expand upon here. Notable applications of internal fragments include characterizing macromolecular complexes,176 intact disulfide bonds,177-178

biotherapeutics,179 NF Kappa proteoform quantification180 and two-dimensional “deep” top-down sequencing177 of calmodulin. These and additional studies demonstrated the generation of

internal fragments by a variety of dissociation techniques, including activated ion electron

transfer dissociation,181 ultraviolet photodissociation (UVPD),182 and high-energy electron

capture dissociation.183 Accounting for internal fragments may not be necessary if the goal is to

assign a particular mass (e.g. intact protein) to a particular gene (i.e. identification). In such experiments, or when search engines cannot utilize internal fragments, the generation of such fragments may be minimized by using relatively low collision energy, to increase the prevalence of terminal fragments.184 However, if the goal is to maximize sequence coverage, e.g. to characterize proteoforms185, accounting for internal fragments is indicated. Moreover, certain

applications, for example TD localization of PTMs within disulfide-bond intra-links, require the generation and analysis of internal fragments.

Despite their prevalence and potential for improving protein characterization, internal fragments often remain overlooked. As a result, most of the ions generated during typical

collisionally-activated top-down experiments, and a minority with UVPD, remain unassigned

during database searches.184, 186 Internal fragments are currently not accounted for because

popular top-down database search programs (e.g. those with well-tuned protein-ID scoring algorithms) do not assign internal fragments. While software options for top-down proteomics data analysis continue to grow with new tools such as TopPIC,187 MASH Suite,188 and Informed-Proteomics,189 assignment of internal fragments in these packages is still not possible,190-191

although other online tools allow for potential fragment lists to be generated.192 Those

laboratories that do utilize internal fragments currently do so using in-house private software

solutions and without explicit consideration of the ambiguity in their assignments. We show

here that Mascot TD (or any version of Mascot up to 16 kDa) can be enabled to automatically

and accurately assign internal fragments (we recommend against using Mascot TD for assigning

statistical significance to a top-down protein ID—better alternatives have been reviewed).190-191

There have been several studies showing how to use internal fragments to increase

sequence coverage.20-22, 183 However, several challenges remain, including identifying

applications that would benefit from internal fragment IDs and avoiding the false positive IDs

associated with a larger search space. We are not aware of studies that explicitly address how to:

maximize the formation of internal fragments; account for their peculiar ambiguities during

identification; and assure their accurate assignment. We address these issues here, develop

guidelines for the assignment of internal fragments, and show examples of the fold-improvements

(e.g. 10-50) in sequence coverage and modification ID that internal fragments

provide. Collectively, these allow for significant increases in sequence coverage combined with

high confidence in internal fragment assignments.

4.3 Experimental Procedures

4.3.1 Expression of Recombinant SOD1G93A in yeast

Expression and purification of SOD1G93A was conducted as previously published.38, 193

Briefly, EGy118ΔSOD1 yeast transformed with an SOD1G93A YEp351 expression vector were

grown at 30 °C for 44 h. Cultures were centrifuged and lysed with 0.5 mm glass beads in a blender, and then subjected to a 60% ammonium sulfate precipitation. The sample was then

pelleted, and resultant supernatant was diluted to 2.0 M ammonium sulfate. This diluted sample

was passed over a phenyl-sepharose 6 fast flow (high sub) hydrophobic interaction

chromatography column (GE Life Sciences) using a linearly declining salt gradient for 300 mL

from a high salt buffer (2.0 M ammonium sulfate, 50 mM potassium phosphate dibasic, 150 mM

sodium chloride, 0.1 mM EDTA, 0.25 mM DTT, pH 7.0) to a low salt buffer (50 mM

dipotassium phosphate, 150 mM sodium chloride, 0.1 mM EDTA, 0.25 mM DTT, pH 7.0).

SOD1G93A fractions eluted between 1.6 and 1.1 M ammonium sulfate and were confirmed with

SDS-PAGE analysis. These fractions were pooled, and buffer exchanged to 10 mM TRIS pH

8.0. Pooled fractions were then loaded onto a Mono Q 10/100 anion exchange column (GE Life

Sciences) and eluted from a linearly inclining salt gradient from a low salt buffer (10 mM Tris pH 8.0) to a high salt buffer (10 mM Tris pH 8.0, 1 M sodium chloride) from 0 – 30%.

SOD1G93A fractions were collected between 5 and 12% high salt and were confirmed with SDS-

PAGE, western blot, and FT-ICR-MS analysis.

4.3.2 Sample Preparation and Mass Spectrometry

FTICR-MS analysis was performed on a Bruker 9.4 T solariX XR Mass Spectrometer

(Bruker, Billerica, MA) with samples introduced by direct infusion. For native analysis,

SOD1G93A was diluted to 2 µM in 10 mM ammonium acetate, pH 7. For “reduced” analysis, the native intrasubunit disulfide was reduced in 10 mM ammonium acetate, pH 7 with DTT at 95 °C for 10 minutes before diluting to 2 µM in 50/50 aqueous 10 mM ammonium acetate, pH 7 and acetonitrile with 0.1% formic acid. Measurements were taken using variations of previously published techniques for FTICR-MS-FSD.20, 22 Intact SOD1G93A was infused at 2 µL/min and

100 two megaword transients were summed, examining the frequency range corresponding to

300-3000 m/z using ramped excitation. FSD was achieved in the region between skimmer 1 and funnel 2 with a declustering potential ranging between 80 and 140 V in increments of 10 V. For pseudo-MS3 analysis, isolation windows of 3 m/z were used with CAD differentials ranging from 20-35 V. For mass accuracy better than 1 ppm, the ICR cell should be properly shimmed and baked-out before taking measurements. All spectra should either be externally calibrated, with similar numbers of charges maintained within the cell, or internally calibrated to ubiquitous co-sprayed or product ions. In this study, internal calibrations were performed on all FSD spectra using between 10 and 15 well-characterized b-type product ions (Figure 4-1).

Figure 4-1. Evaluation of FSD b-ion generation of SOD1G93A as modulated by declustering

potential (Skimmer 1). Top Left and Top Right: 1+ and 2+, respectively, b-ion series intensities as they change with skimmer 1 voltage. Bottom: Bar graph of the sum of these intensities in both plots. These are the ions used for internal calibration.

4.3.3 Data Analysis

Bruker’s DataAnalysis with SNAP II deconvolution/deisotoping and BioTools software were used for data processing, peak-picking/mass assignment, and spectral interpretation, respectively. All SNAP II parameters were optimized to keep false discovery rate below 1% with a minimum signal/noise of 2/1 and a quality factor threshold of 0.5. Averagine was replaced with the exact elemental composition of SOD1 to improve ion assignment through better monoisotopic peak determination. Matrix Science’s Mascot TD was used for ion assignment and scoring. Custom instrument parameters were configured to search for appropriate precursors and

only b, y, and yb fragment types for direct comparison of scores. At the time of these

experiments Mascot TD was not equipped to account for disulfide bonds. A native workaround

was used, introducing a variable modification to cysteines of 2.016 Da, resulting in the

identification of fragments containing disulfides as well as correctly predicting location in all instances for fragments containing a disulfide-intralinked region. Microsoft Excel was used to organize all potential ions and further evaluate them for consideration below using the following logic. Internal fragments are generated sequentially from terminal fragments.21 Therefore, if an

ion matched both an internal fragment and a terminal fragment, was observed at lower

fragmentation energy, and diminished at higher energy, it was assigned as the terminal fragment.

Ions matching internal fragments were sorted into two categories: unambiguous assignments or

ambiguous assignments. The ambiguous assignments (matching more than one theoretical

internal fragment) were assigned to one of three subcategories: arrangement ambiguous

(the same empirical formula matching different locations within the protein, e.g. SEE, ESE, and

TDE); frameshift ambiguous (a sequentially identical “middle” portion flanked by the same

composition on the N- or C- termini, e.g. TRAI vs. RAIT or SEERAI vs. RAISEE), or mass

accuracy ambiguous (assignable with perfect mass accuracy but not with the mass error tolerance

used). For more details on these categories, see the results section. These ions were then

tabulated and used to construct the ion fragmentations maps depicted below.
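
The arrangement and frameshift categories described above can be illustrated with a short Python sketch. It is not the Excel workflow used in this study: it groups theoretical internal fragments by elemental composition (i.e. "0 ppm" ambiguity only, so mass accuracy ambiguity is not modeled) and labels a pair "frameshift" when the two fragments overlap and "arrangement" when they do not. The minimum length of 3 residues, the function names, and the two test peptides are assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Elemental composition (C, H, N, O, S) of each amino acid residue.
RESIDUE_FORMULA = {
    "G": (2, 3, 1, 1, 0), "A": (3, 5, 1, 1, 0), "S": (3, 5, 1, 2, 0),
    "P": (5, 7, 1, 1, 0), "V": (5, 9, 1, 1, 0), "T": (4, 7, 1, 2, 0),
    "C": (3, 5, 1, 1, 1), "L": (6, 11, 1, 1, 0), "I": (6, 11, 1, 1, 0),
    "N": (4, 6, 2, 2, 0), "D": (4, 5, 1, 3, 0), "Q": (5, 8, 2, 2, 0),
    "K": (6, 12, 2, 1, 0), "E": (5, 7, 1, 3, 0), "M": (5, 9, 1, 1, 1),
    "H": (6, 7, 3, 1, 0), "F": (9, 9, 1, 1, 0), "R": (6, 12, 4, 1, 0),
    "Y": (9, 9, 1, 2, 0), "W": (11, 10, 2, 1, 0),
}


def formula(fragment):
    """Sum the elemental compositions of the residues in a fragment."""
    return tuple(sum(RESIDUE_FORMULA[aa][k] for aa in fragment) for k in range(5))


def internal_spans(sequence, min_len=3):
    """Yield (start, end) slices that contain neither terminal residue."""
    n = len(sequence)
    for start in range(1, n - 1):
        for end in range(start + min_len, n):
            yield start, end


def classify_ambiguity(sequence, min_len=3):
    """Return (fragment_1, fragment_2, category) for every same-formula pair."""
    groups = defaultdict(list)
    for start, end in internal_spans(sequence, min_len):
        groups[formula(sequence[start:end])].append((start, end))
    pairs = []
    for spans in groups.values():
        for (s1, e1), (s2, e2) in combinations(spans, 2):
            # Spans are generated by increasing start, so s1 < s2 here.
            # Overlapping spans share a "middle" and differ only in flanks of
            # equal composition (frameshift); non-overlapping spans come from
            # remote regions of the sequence (arrangement).
            category = "frameshift" if s2 < e1 else "arrangement"
            pairs.append((sequence[s1:e1], sequence[s2:e2], category))
    return pairs


print(classify_ambiguity("GTADRTW"))    # hypothetical peptide containing TADRT
print(classify_ambiguity("GSEEKTDEW"))  # hypothetical peptide containing SEE and TDE
```

For the first peptide the only ambiguous pair is TADR/ADRT, whose shared middle (ADR) is certain regardless of which assignment is correct. The second peptide yields the SEE/TDE arrangement pair as well as frameshift pairs such as SEEK/KTDE, where the flanking portions are longer than one residue.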

4.4 Results and Discussion

The objectives of this study were to develop methods for accurately assigning internal

fragments and to use these to achieve comprehensive protein analysis. We explored the effects of

collision energy,21 error tolerance, protein size, fragment size, consideration of fragmentation

propensity, and MS3 on the accurate assignment of internal fragments. We identified a class of ambiguity that is unique to internal fragments (“frameshifts” detailed below), provide a theoretical framework for its origin, and propose methods to mitigate this type of uncertainty.

We applied these findings to the comprehensive characterization of the ALS-associated SOD1

variant SOD1G93A, which presents a number of challenges for CAD studies, including a stable β-sheet-rich structure that results in low fragmentation yields, an internal glycine to alanine mutation, and an intrasubunit disulfide.194-195

Previous studies have not addressed the percentage of internal fragments that can be assigned using only accurate mass. Guidelines for maximizing internal fragment production and

differentiating internal- from terminal fragments through variations in CAD energy are known,20-21 are illustrated for our setup in Figure 4-2, and are applied throughout this study. To encompass the range of masses characterized by typical mass spectrometer settings, throughout this study a low mass limit of 300 Da was chosen, and the high mass limit chosen was that of intact protein.

Accurate assignment should depend upon how the number of theoretically possible fragments,

i.e. the size of the searched database, scales with protein size, which has been previously

described as $\frac{(n - 1) + (n - 1)^2}{2}$, where n represents the protein length.21 While it is not the

focus of this study, these scaling effects were confirmed within the range of small to average

sized proteins, namely SOD1G93A (15.9 kDa) compared to a protein approximately half of its

mass, bovine ubiquitin (8.6 kDa), one approximately twice its mass, human carbonic anhydrase

(29.2 kDa), and one approximately four times its mass, bovine serum albumin (BSA, 66.4 kDa).
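
The second-order growth of the search space can be checked numerically. The sketch below evaluates the scaling expression in its general form, ((n − l − 1) + (n − l − 1)²)/2 with l the minimum fragment length in residues, and compares it with a brute-force count of internal fragments. The protein lengths in the loop are arbitrary round numbers used only to show the trend; they are not the lengths of the proteins studied here, and the function names are hypothetical.

```python
def count_internal_fragments(n, min_len):
    """Closed-form count: ((n - l - 1) + (n - l - 1)^2) / 2."""
    m = max(n - min_len - 1, 0)
    return (m + m * m) // 2


def brute_force_count(n, min_len):
    """Count (start, end) slices that exclude both termini, for min_len >= 1."""
    return sum(
        1 for start in range(1, n - 1) for end in range(start + min_len, n)
    )


for length in (100, 200, 400):  # arbitrary illustrative lengths
    print(length, count_internal_fragments(length, min_len=3))
```

Doubling the protein length roughly quadruples the number of theoretical internal fragments, which is why mass accuracy matters more for larger proteins.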

Figure 4-2. Top: FSD fragmentation of reduced and denatured SOD1G93A at various declustering potentials shows that as fragmentation voltage increases, production of terminal ions decreases, and production of internal fragments increases, before beginning to decrease at very high potentials. This phenomenon was also observed for native SOD1G93A. Differentials of Mascot scores including or ignoring internal yb fragments are shown below each declustering potential value. Bottom: Mascot scores of the corresponding graphs seen above, plotted both with only b and y terminal fragments (orange), and b and y terminal fragments with inclusion of yb internal fragments (blue).
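
The energy dependence shown in this figure is what allows an ion that matches both a terminal and an internal fragment to be resolved. A minimal sketch of such a rule is given below. The specific thresholds (peak intensity in the lower half of the voltage scan, and less than half of the peak intensity remaining at the highest voltage) are assumptions made for illustration, not criteria taken from this work, and the intensity profiles are hypothetical.

```python
def looks_terminal(intensity_by_voltage, residual_fraction=0.5):
    """Heuristic: terminal-like ions peak at low energy and diminish at high energy.

    intensity_by_voltage: dict mapping declustering potential (V) -> ion intensity.
    residual_fraction: assumed cutoff for "diminished" at the highest voltage.
    """
    voltages = sorted(intensity_by_voltage)
    peak_voltage = max(voltages, key=lambda v: intensity_by_voltage[v])
    peak = intensity_by_voltage[peak_voltage]
    midpoint = (voltages[0] + voltages[-1]) / 2
    diminished = intensity_by_voltage[voltages[-1]] < residual_fraction * peak
    return peak_voltage < midpoint and diminished


# Hypothetical intensity profiles across the 80-140 V scan in 10 V steps.
terminal_like = {80: 10.0, 90: 9.0, 100: 7.0, 110: 5.0, 120: 3.0, 130: 1.0, 140: 0.5}
internal_like = {80: 0.0, 90: 1.0, 100: 3.0, 110: 6.0, 120: 9.0, 130: 10.0, 140: 8.0}
print(looks_terminal(terminal_like), looks_terminal(internal_like))
```

In practice the cutoffs would need to be tuned to the instrument and protein, since the voltage at which internal fragments begin to dominate is expected to vary.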

Employing mass accuracies of 1 ppm as the only search constraint allowed c.a. 80% of internal fragments to be unambiguously assigned within the mass range of 9-29 kDa and c.a.

70% at 66 kDa (Table 4-1). The ability of accurate mass—alone—to be sufficient for assigning most internal fragments is understandable. For example, neglecting protein-specific motifs and composition, even the “worst case” for mass-degenerate internal fragments, i.e. those derived from two abundant amino acids (e.g. A or G occurring in c.a. 10% of positions), should occur c.a. once per 10 kDa within a given protein. Consistent with the second-order increase in the number of theoretical internal fragments (and the searched database size) with respect to protein size, we observed that mass accuracy was relatively more important for internal fragment identification with larger proteins (Figure 4-3 and Table 4-1). For example, using 20 ppm accuracy as the only search constraint, 69% and 46% of internal fragments could be definitively assigned from the 9 and 29 kDa proteins, ubiquitin and carbonic anhydrase, respectively (Table

4-1).
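
The effect of the mass error tolerance on assignment can be sketched in Python as follows. The two theoretical masses are hypothetical values separated by 8 ppm, chosen only to show how a fragment that is unambiguous at 1 ppm becomes "mass accuracy ambiguous" at 20 ppm; the function names are likewise illustrative.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def candidates_within(observed, theoretical_masses, tol_ppm):
    """Names of all theoretical fragments within tol_ppm of the observed mass."""
    return sorted(
        name
        for name, mass in theoretical_masses.items()
        if abs(ppm_error(observed, mass)) <= tol_ppm
    )


# Hypothetical theoretical masses (Da) of two candidate internal fragments.
theoretical = {"fragment_A": 500.0000, "fragment_B": 500.0040}
observed = 500.0001

for tol in (1, 20):
    matches = candidates_within(observed, theoretical, tol)
    status = "unambiguous" if len(matches) == 1 else "ambiguous"
    print(f"{tol} ppm: {matches} ({status})")
```

Because the number of candidates grows with the square of protein length, the chance that two candidates fall within the same tolerance window grows with protein size as well.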

Figure 4-3. Internal fragment ion ambiguity as a function of protein size, frameshift assignment

(top: % decrease is shown for 1 ppm mass error tolerance and is constant throughout the range of mass error tolerance), and mass accuracy (bottom). For these proteins high mass accuracy measurements enable the accurate assignment of >79% of internal fragments by mass alone, with another 13-18% assignable by considering the rules we present for mitigating frameshift ambiguity. The remaining ambiguous fragments (arrangement ambiguity: c.a. 2-3% of total) can be assigned by fragmentation propensity scores or MS3, see Table 4-1 for tabulated values, including additional calculations for BSA (66.4 kDa).

Table 4-1. Calculated ambiguity data on the three model proteins used in this study, with

additional statistics on BSA as an exemplar protein approximately twice the size of carbonic anhydrase for scale, calculated up to 1 ppm mass error tolerance. All ambiguities calculated using 300 Da as minimum fragment length. The percentages in the bottom section of the table are plotted in Figure 4-3, after assignment of frameshift fragments. Abbreviations used: Frags. –

Fragments, Amb. – Ambiguity, FS – Frameshift, Arr.- Arrangement.

Our results (0 ppm, Figure 4-3 and Table 4-1) indicate that independent of protein size, c.a. 20% of internal fragments could not be assigned by accurate mass, i.e. two or more theoretical internal fragments share a molecular formula. Further analysis indicated the majority of this ambiguity resulted from what we term here “frameshift ambiguity” involving adjacent sequence motifs, e.g. an experimental mass matched the theoretical masses of internal fragments TADR and ADRT, both of which had to have arisen from TADRT. Importantly, such

frameshift ambiguity is minor in degree, because despite the fact that a fragment ion can be assigned to two possible internal fragments (e.g. TADR and ADRT), the sequence assignment of

all but the terminal amino acid of each fragment is certain, a priori (e.g. ADR). To our

knowledge the ability to extract a sequence assignment based upon sequence alignment is novel.

Frameshift ambiguity will occur anytime a particular sequence is flanked by sequences having

the same composition (i.e. empirical formula and mass). Detailed analysis indicated that with but

a few exceptions (e.g. ISLSGDHCIIGRTLV in SOD1G93A sequence, where ISL and TLV have the

same empirical formula and mass) frameshift ambiguity resulted from sequences flanked (i.e.

surrounded) by one amino acid (e.g. GLTEG, yielding GLTE and LTEG, in the SOD1G93A sequence). Such

frameshift ambiguity therefore scales approximately as the sum of all $FA_X = c^2 - c$, where $FA_X$

represents the number of frameshift ambiguity events resulting from a particular amino acid residue

type (e.g. $FA_T$ for Thr), and c represents the count of that residue type within the protein. Total

frameshift ambiguity for each protein is then calculated as the sum of frameshift ambiguity for

each residue type within the protein. This concept, as well as a three-term equation of FA that accounts for employing a low mass cutoff, are illustrated in detail in Figure 4-4. As shown in

Figure 4-3 accounting for “frameshifted” internal fragments allowed for the assignment of >94% of internal fragments, and a 17% increase in sequence coverage, at a mass accuracy of 1 ppm.
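
The single-residue approximation, in which each residue type with count c contributes c² − c frameshift ambiguity events, can be computed directly from a sequence. The sketch below implements only this approximate form: it does not include the additional low-mass-cutoff term of the three-term equation in Figure 4-4, and it treats leucine and isoleucine as one residue type because they are isomers. The function name and the short test sequences are illustrative assumptions.

```python
from collections import Counter


def approximate_frameshift_ambiguity(sequence):
    """Return (per-residue FA values, total FA) using FA_X = c^2 - c."""
    counts = Counter(sequence.replace("I", "L"))  # I and L are isomeric
    per_residue = {aa: c * c - c for aa, c in counts.items()}
    return per_residue, sum(per_residue.values())


per_residue, total = approximate_frameshift_ambiguity("TADRT")
print(per_residue, total)
```

For TADRT the two threonines give c² − c = 2, matching the two frameshifted fragments (TADR and ADRT) discussed above, while residues that occur only once contribute nothing.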

Figure 4-4. “0 ppm” Ambiguity (frameshift and arrangement, combined) of three model proteins was compared, accounting for how much stems from frameshift ambiguity, due to composition repeats and other events (see note below), and arrangement ambiguity, which collectively includes all other instances of non-mass accuracy ambiguity. A. Depiction of what frameshift ambiguity is and when it applies and does not apply. Fragments whose colors match have frameshift ambiguity, where whether the fragment is the first or second in the set does not affect which region of the protein is being evaluated. The two R’s at the end of the sequence shown are too proximal to generate any frameshift ambiguity if using a low-mass cutoff of 300 Da as we did, or if only considering internal fragments of length 3 or greater. B. Formula for

approximating frameshift ambiguity for each residue in a protein. Exact values depend on specific protein sequence. This formula uses the two left-most columns of each set in Table C to calculate the frameshift ambiguity for each residue. These are then summed to calculate the total frameshift ambiguity for the protein. C. Table for calculating single-residue frameshift ambiguity for each residue set of each protein. This count is unique for all proteins, depends upon the arrangement of amino acids, and can be calculated directly with this method or approximated from only the count of each amino acid without considering the final term in the equation for larger proteins. Notice that leucine and isoleucine are combined here because they are isomers.

D. Summary table showing in both total count and percentage how much ambiguity results from frameshift and arrangement ambiguity for these three model proteins. Frameshift ambiguity as a percentage of total ambiguity increases with protein size as a general trend but will always be specific to sequence and degeneracy.

Note: Repeats of the terminal amino acid result in terminal:internal frameshift ambiguity, which could also be considered if fragmentation energy is not being considered to assign a fragment as a terminal or internal.

The remaining “0 ppm” ambiguity originated from spatially remote sequences of identical empirical formula that didn’t fit the definition of frameshift ambiguity, making up the smallest portion (c.a. 2-3%) of internal fragments observed. We collectively categorize these as

“arrangement ambiguity” as they differ only in amino acid sequence (e.g. ADR and RAD) or atomic connectivity (e.g. EQKES and TADKA, Figure 4-5). To distinguish between and assign internal fragments with arrangement ambiguity, and to assign the terminal amino acid in the case of frameshift ambiguity, two additional means can be used: 1) Using fragmentation propensities,


assign the terminal amino acid. Multi-protein studies have been conducted that, for a subset of

amino acids,175 indicate whether N- or C-terminal fragmentation is more likely; 2)

perform MS3 (or pseudo MS3). Figure 4-5 depicts two examples of pseudo-MS3 being used to determine which of two theoretical fragments an arrangement ambiguous fragment ion should be

assigned to. These sequence motifs lie within the “loop” region of the protein that results from an

intramolecular disulfide-bond. As a result, if this disulfide bond is intact, these sequence motifs can only be detected using internal fragments.
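The distinction between frameshift and arrangement ambiguity can be illustrated with a minimal Python sketch. It is not the assignment software used in this work: the toy sequence, the function names, and the rule that a frameshift pair is a window slid by exactly one residue are all assumptions made for illustration. Fragments are grouped by amino acid composition only, so pairs that share an empirical formula but not a composition (such as EQKES and TADKA) would need a formula-based comparison instead.

```python
from collections import defaultdict


def composition(fragment):
    """Sorted residue string; I is mapped to L because the two are isomers."""
    return "".join(sorted(fragment.replace("I", "L")))


def isobaric_groups(sequence, min_len=3):
    """Group internal fragments (neither terminus included) by composition."""
    groups = defaultdict(list)
    n = len(sequence)
    for start in range(1, n - 1):              # skip the N-terminal residue
        for end in range(start + min_len, n):  # exclusive end; skip the C-terminal residue
            groups[composition(sequence[start:end])].append((start, end))
    # Keep only compositions shared by more than one fragment.
    return {comp: frags for comp, frags in groups.items() if len(frags) > 1}


def classify(a, b):
    """Label two same-composition fragments given as (start, end) index pairs."""
    (s1, e1), (s2, e2) = sorted([a, b])
    if s2 - s1 == 1 and e2 - e1 == 1:          # window slid by one residue
        return "frameshift"
    return "arrangement"


toy = "GKAVLKG"  # hypothetical sequence, not taken from SOD1
for comp, frags in isobaric_groups(toy).items():
    print(comp, [toy[s:e] for s, e in frags], classify(*frags[:2]))
```

In the toy sequence, KAVL and AVLK share a composition and differ only by a one-residue shift, so the core AVL is assigned either way. Applying `classify` to the ADR and RAD positions of a sequence such as GADRTTRADG returns "arrangement" because the two fragments are spatially remote.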

Figure 4-5. Pseudo-MS3 analysis of the ambiguous internal fragments observed at 602.2779

(top) and 1351.7141 (bottom) m/z. The intact mass of each ion above is an exact match to two

theoretical internal fragments. Isolation of the internal fragment of concern and subsequent MS3

with CAD permits the unambiguous assignment of these fragments.


We created a heatmap summarizing the results of this study following the methods of

Durbin et al.21 We also introduced an “All Ion Fragmentation Map”, where data from the

terminal fragments and internal fragments are combined into one image depicting protein coverage, fragmentation patterns and frequencies, and allowing for comparison of two different modification or connectivity states of a protein (Figure 4-6). In this example of our method, we were able to assign 76% of internal fragments using mass accuracy of 1 ppm, and the remaining ambiguous fragments using either fragmentation propensity or pseudo-MS3. When assigning

fragments with fragmentation propensity, at the time of publication, we recommend a method

comparing summed N- and C- terminal propensities from the work of Haverland et al.175


Figure 4-6. Internal fragments enable 100% sequence coverage and disulfide bond localization.

Comparison of native and reduced SOD1G93A top-down sequencing between using only terminal

fragments, only internal fragments, and both together. FSD energies of 100-140 V were used for

native protein, and 80-140 V for reduced protein, both in increments of 10 V. The black boxes

and connecting line in Native SOD1G93A indicate an intramolecular disulfide which complicates

terminal sequencing. The combined results of the top (orange-rimmed) and center (green-rimmed) plots are complementary and allow the creation of the bottom (purple-rimmed) plot.

Consideration of internal fragments expanded cleavage site sequence coverage of disulfide-

reduced SOD1G93A from 30% to 81%, coverage of native SOD1G93A from 15% to 65%, number

of product ions used for reduced SOD1G93A from 54 to 344, and number of product ions used

for native SOD1G93A from 32 to 157.

We compare Native and Reduced SOD1G93A, where a disulfide bond generating a loop

region within the protein is either intact or has been cleaved. Coverage within the loop region during native analysis is impossible without use of internal fragments, as shown in Figure 4-6.

Reducing the protein prior to analysis and spraying out of acidic solution allows for significantly

improved top-down protein coverage. Using the acquisition and data analysis methods described

in this work, we expanded “black box” (i.e. using default preprocessing parameters) sequence

coverage by 6- and 16-fold for native and reduced SOD1G93A, respectively, and “experienced-user” sequence coverage by 4- and 3-fold for native and reduced SOD1G93A, respectively.

Frameshift ambiguity assignments alone accounted for a c.a. 1.5-fold improvement in unique ion assignments (Table 4-2).


                                          Terminal Fragments Only    Terminals and Internals
                                          Native      Reduced        Native      Reduced

Before Preprocessing Optimization, Before Frameshift Assignment
  Residues Observed in Fragments          22.8%       10.5%          70.0%       88.2%
  Sequence Coverage (Cleavage Sites)      9.9%        5.2%           48.7%       58.0%
  Product Ions Assigned                   15          8              92          119

After Preprocessing Optimization, Before Frameshift Assignment
  Residues Observed in Fragments          100.0%      100.0%         100.0%      100.0%
  Sequence Coverage (Cleavage Events)     15.1%       29.6%          59.2%       75.0%
  Product Ions Assigned                   32          54             97          234

After Preprocessing Optimization, After Frameshift Assignment
  Residues Observed in Fragments          100.0%      100.0%         100.0%      100.0%
  Sequence Coverage (Cleavage Sites)      15.1%       29.6%          64.5%       80.9%
  Product Ions Assigned                   32          54             157         344

Table 4-2. Fragmentation coverage and ions generated when using terminal fragments alone vs. using terminals and internals. Results are shown at three indicated steps in the acquisition and assignment process. Key improvements of preprocessing include correcting the repetitive building block from Averagine to the exact protein composition, tuning the quality factor threshold to achieve a false positive rate lower than 1%, and applying an internal calibration to commonly observed b ions. Assigning previously ambiguous frameshift fragments accounted for a 1.62- and 1.47-fold increase in product ions assigned for native and reduced, respectively.
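The fold improvements quoted in the text can be recomputed from Table 4-2 with a short Python sketch. The pairing of baselines is an assumption on our part: “black box” is taken as terminal-only coverage before preprocessing optimization, and “experienced-user” as terminal-only coverage after preprocessing optimization. The variable names are illustrative.

```python
# Values transcribed from Table 4-2.
ions_before_frameshift = {"native": 97, "reduced": 234}     # terminals + internals
ions_after_frameshift = {"native": 157, "reduced": 344}     # terminals + internals
coverage_final = {"native": 64.5, "reduced": 80.9}          # %, terminals + internals
coverage_black_box = {"native": 9.9, "reduced": 5.2}        # %, assumed baseline
coverage_experienced = {"native": 15.1, "reduced": 29.6}    # %, assumed baseline


def fold(after, before):
    """Ratio of after to before for each protein state."""
    return {state: after[state] / before[state] for state in after}


ion_fold = fold(ions_after_frameshift, ions_before_frameshift)
black_box_fold = fold(coverage_final, coverage_black_box)
experienced_fold = fold(coverage_final, coverage_experienced)

for label, result in [("frameshift ion gain", ion_fold),
                      ("vs. black box", black_box_fold),
                      ("vs. experienced user", experienced_fold)]:
    print(label, {state: round(value, 2) for state, value in result.items()})
```

Under these assumptions the ion ratios reproduce the 1.62- and 1.47-fold values in the caption, and the coverage ratios fall close to the approximate 6-, 16-, 4-, and 3-fold figures quoted in the text.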

4.5 Conclusions

We addressed the principal challenges preventing the application of internal fragments in top-down protein mass spectrometry (e.g. scaling of ambiguity with protein size and mass accuracy) and introduced strategies to overcome these that enable increased sequence coverage.


We present the concept of frameshift ambiguity and show that, in such cases, all but the

terminal residue can be assigned a priori. Of note, mass accuracy of better than 1 ppm and accounting for frameshift ambiguity should be prioritized for comprehensive protein analysis, since these resulted in fragments covering 100% of SOD1’s sequence and a jump from a level IV

(high ambiguity proteoform identification) to a level I (lowest ambiguity proteoform

identification).185 Notably, we demonstrated these capabilities using a relatively challenging protein for TDMS, SOD1, which: has Amyotrophic Lateral Sclerosis-causing mutations and

potentially toxic modifications existing within the intra-linked “loop” region of the protein; is difficult to denature and fragment compared to benchmark proteins such as ubiquitin;167 is

multimeric; and has a variety of functional (e.g. disulfide and metals) and disease-associated modifications.9, 196

While the focus of this study is upon using internal fragments to improve sequence

coverage, we did note that the use of internal fragments also increased the confidence level in

Mascot IDs (Figure 4-2). We expect that in experiments where the number of observed internal

fragments scales with the number of theoretically possible fragments, or in instances when only a

small proportion of terminal ions can be observed, internal fragments can have utility in protein

identification. Further investigations into the mechanisms governing internal fragment formation

will further guide ion assignment, along with improvements to mass spectrometer mass accuracy

and enhanced MS3 capabilities.

4.6 Method Development

This section serves to illustrate various elements of method development for internal fragment ion assignment discussed throughout this chapter: Table 4-3 shows one of many


examples of a terminal fragment generated at relatively low skimmer 1 voltage (70 V in this case) beginning to further fragment at higher voltages into a secondary terminal ion and its associated internal fragment; Table 4-4 shows the importance of achieving high mass accuracy when performing top-down fragmentation, preventing mis-assignments and achieving greater sequence coverage; Table 4-5 demonstrates how fragmentation propensities are calculated, assigned, and subsequently used to assign ambiguous internal fragments in the case of arrangement or frameshift ambiguity; Figure 4-7 demonstrates the internal fragment mapping process that is used to generate internal fragment and all ion coverage maps seen in Figure 4-6;

Figure 4-8 is an assignment chart that can be used to evaluate and assign all internal fragments in

a TDMS experiment.

Ion                              [M+H]        Intensity (AU)    m/z         Charge
Terminal b96                     10075.981    16220193          1008.505    10+
2° Terminal b54                  5768.928     8363859           962.328     6+
Associated Internal yb(55-96)    4308.060     5901099           1077.771    4+

Table 4-3. Demonstration of sequential top-down fragmentation of SOD1G93A at a declustering potential (90 V) where a terminal b-ion begins to undergo further fragmentation to a secondary terminal ion and its associated internal fragment. This sequence shows SOD1G93A intact protein

initially fragmenting to a b96 ion observed with a 10+ charge, which subsequently begins to

further fragment at the amide bond between residues 54 and 55 generating a 6+ b54 ion and a 4+

yb(55-96) ion.
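The internal consistency of Table 4-3 can be checked with a short Python sketch. It assumes that the [M+H] column is the singly protonated monoisotopic mass and that each additional charge adds one proton; the function and variable names are illustrative.

```python
PROTON = 1.007276  # Da, mass of a proton

# (singly protonated mass [M+H]+, charge state, reported m/z) from Table 4-3
ions = {
    "b96": (10075.981, 10, 1008.505),
    "b54": (5768.928, 6, 962.328),
    "yb(55-96)": (4308.060, 4, 1077.771),
}


def mz_from_mh(mh, z):
    """m/z of the z+ ion computed from the singly protonated mass."""
    return (mh + (z - 1) * PROTON) / z


for name, (mh, z, reported) in ions.items():
    print(f"{name}: calculated m/z {mz_from_mh(mh, z):.3f}, reported {reported}")

# The two products should account for the precursor. Each [M+H]+ value carries
# one proton, so one proton is subtracted from the sum of the products.
mass_balance = ions["b54"][0] + ions["yb(55-96)"][0] - PROTON - ions["b96"][0]
charge_balance = ions["b54"][1] + ions["yb(55-96)"][1] - ions["b96"][1]
print(f"mass balance: {mass_balance:+.4f} Da, charge balance: {charge_balance:+d}")
```

The b54 and yb(55-96) masses sum to the b96 mass to within about 1 mDa, and their charges (6+ and 4+) sum to the 10+ charge of the precursor, which is consistent with the sequential fragmentation described in the caption.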


Potential Internal Fragment              Calc. Mass    Meas. Mass    Error (ppm)    ID?
DVSIED                                   659.2883      659.2875      1.18           No
GPKDEERHVGDLGNVTADKDAVADVSIEDSVISLS      3619.7722     3619.7761     1.07           No
PKDEERHVGDLGNVTADKDAVADVSIEDSVISLSG      3619.7722     3619.7761     1.07           No
SAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKD       3619.7749     3619.7761     0.33           Yes

Table 4-4. Achieving high mass accuracy, which permits a 1 ppm mass error tolerance, prevents incorrect assignment of internal fragments. At mass error tolerances greater than 1.18 ppm, the top fragment

(DVSIED) would have been assigned; the tighter tolerance rejects this assignment, and MS3 confirms that it would have been a mis-assignment (data not shown). At mass error tolerances greater than 1.07 ppm, the bottom three potential fragments would have been assigned either as all three or as none, depending on software settings. A mass error tolerance of 1 ppm allows this fragment to be assigned as the third option, and MS3 confirms this assignment (data not shown).
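The effect of the tolerance on Table 4-4 can be sketched in a few lines of Python. The ppm error is computed in the usual way, as (measured - calculated) / calculated × 10^6. Because the tabulated masses are rounded to four decimals, the recomputed errors differ slightly from the tabulated ones; the function names are illustrative.

```python
# (candidate sequence, calculated mass, measured mass) from Table 4-4
candidates = [
    ("DVSIED", 659.2883, 659.2875),
    ("GPKDEERHVGDLGNVTADKDAVADVSIEDSVISLS", 3619.7722, 3619.7761),
    ("PKDEERHVGDLGNVTADKDAVADVSIEDSVISLSG", 3619.7722, 3619.7761),
    ("SAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKD", 3619.7749, 3619.7761),
]


def ppm_error(measured, calculated):
    """Signed mass error in parts per million."""
    return (measured - calculated) / calculated * 1e6


def assign(cands, tolerance_ppm):
    """Return the candidates whose absolute mass error is within tolerance."""
    return [seq for seq, calc, meas in cands
            if abs(ppm_error(meas, calc)) <= tolerance_ppm]


for tolerance in (1.0, 1.5):
    print(f"{tolerance} ppm tolerance -> {assign(candidates, tolerance)}")
```

At a 1 ppm tolerance only the SAGPHF... candidate survives, whereas a 1.5 ppm tolerance would accept all four candidates, including the two frameshifted sequences that share one calculated mass.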


N-terminal to    C-terminal to    Residue    Propensity
fragmentation    fragmentation    Pair       (dTD)
A                V                A|V        19.7%
A                L                A|L        13.8%
A                P                A|P        13.2%
A                Y                A|Y        13.0%
A                M                A|M        12.6%
A                A                A|A        12.0%
A                I                A|I        12.0%
A                C                A|C        11.3%
A                T                A|T        11.0%
A                N                A|N        10.7%
A                G                A|G        10.1%
A                D                A|D        9.5%
A                W                A|W        9.2%
A                F                A|F        8.6%
A                S                A|S        8.0%
A                Q                A|Q        6.7%
A                E                A|E        6.0%
A                K                A|K        3.9%
A                H                A|H        1.5%
A                R                A|R        0.4%

Table 4-5. ‘N-terminal to alanine’ portion of the denatured fragmentation propensity chart, with values taken from Haverland et al. (2017)175 and re-organized by our lab for predicting internal fragment identity. These values are used in the case of ambiguous internal fragments. Haverland et al. also provided data for native fragmentation propensities, which were used to assign internal ions from native fragmentation. Candidate internal fragments were compared by summing the propensity scores of their N- and C-terminal cleavage events and choosing the candidate with the higher sum if it was larger by more than 5%. If the sums were within a 5% margin, both fragments were assumed to have formed and both were assigned.
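The comparison rule in the caption can be sketched as follows. This is an illustration under stated assumptions, not the procedure as implemented in our lab: the 5% margin is interpreted here as a relative difference, the candidate names and their cleavage sites are hypothetical, and only a few A|X values from Table 4-5 are included, whereas a real comparison would draw on the full N-terminal and C-terminal propensity charts.

```python
# Subset of denatured propensities (%) transcribed from Table 4-5.
PROPENSITY = {"A|V": 19.7, "A|Y": 13.0, "A|M": 12.6,
              "A|K": 3.9, "A|H": 1.5, "A|R": 0.4}


def summed_propensity(sites):
    """Sum the propensities of the cleavage events that create a fragment."""
    return sum(PROPENSITY[site] for site in sites)


def choose(candidates, margin=0.05):
    """Keep every candidate whose score is within `margin` of the best score.

    `candidates` maps a candidate name to the residue pairs at its cleavage
    sites. The margin is treated as relative, which is an assumption.
    """
    scores = {name: summed_propensity(sites) for name, sites in candidates.items()}
    best = max(scores.values())
    return sorted(name for name, score in scores.items()
                  if score * (1 + margin) >= best)


# Hypothetical candidates: the first pair differs clearly, the second does not.
print(choose({"candidate_1": ("A|V", "A|K"), "candidate_2": ("A|R", "A|H")}))
print(choose({"candidate_3": ("A|Y",), "candidate_4": ("A|M",)}))
```

The first call keeps a single candidate because its summed score is far higher; the second keeps both because their scores lie within the margin, mirroring the rule that both fragments are assigned in that case.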


Figure 4-7. Internal fragment assignment mapping procedure. Displayed is the color-coded method we used to map internal fragments to the SOD1G93A sequence. In the left-side vertical

columns are the protein sequence, coverage scores, and fragmentation scores. Internal fragments

observed are unambiguous as yb (magenta), unambiguous as yb and ya ions (blue), frameshift

ambiguous (yellow), or arrangement ambiguous (purple). The purpose of this figure is to show

the general method which could be further evaluated and translated into the final heat map seen

in Figure 4-6.


Figure 4-8. Proposed workflow for internal fragment assignment. Accurate mass alone

(assuming good resolving power) allows for the majority of fragments to be assigned. Of the remaining ambiguous fragments, accounting for frameshifts allows for greater than 93% of total fragments to be locally assigned. Using established fragmentation propensities allows for putative assignments of all ambiguous fragments. Further evaluations with MS3 experiments allow for definitive assignments.


Chapter 5

Genetically Encoded Fluorescent Proteins Enable High-Throughput Assignment of Cell-cohorts Directly from MALDI-MS Images

Nicholas D. Schmitt†‡, Catherine M. Rawlins†‡, Elizabeth C. Randall§, Xianzhe Wang†, Antonius Koller†, Jared R. Auclair†‖, Jane-Marie Kowalski∇, Paul J. Kowalski∇, Ed LutherO, Alexander R. Ivanov†, Nathalie Y.R. Agar§⁋, Jeffrey N. Agar*†O

†Department of Chemistry and Chemical Biology, and Barnett Institute of Chemical and Biological Analysis, Northeastern University, Boston, MA, 02115, USA

‡These authors contributed equally to this work.

§Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA

‖Biopharmaceutical Analysis Training Laboratory (BATL), Northeastern University Innovation Campus, Burlington, MA, 01803, USA

∇Bruker Daltonics, 40 Manning Road, Billerica, MA, 01821, USA

ODepartment of Pharmaceutical Sciences, Northeastern University, Boston, MA, 02115, USA

⁋Department of Neurosurgery, Brigham and Women’s Hospital, Department of Cancer Biology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, 02115, USA

Reproduced with permission from Genetically Encoded Fluorescent Proteins Enable High-Throughput Assignment of Cell-cohorts Directly from MALDI-MS Images, Analytical Chemistry 2019 91 (6), 3810-3817 DOI: 10.1021/acs.analchem.8b03454 Copyright 2020 American Chemical Society.197


5.0 Statement of Contribution

Experimental contributions to this chapter by Nicholas D. Schmitt are as follows: Nicholas D.

Schmitt and Jeffrey N. Agar performed all Mass Spectrometry Imaging data analysis. Nicholas

D. Schmitt performed all top-down mass spectrometry data analysis. Nicholas D. Schmitt,

Jeffrey N. Agar, and Catherine M. Rawlins interpreted all results, prepared all Figures, and wrote the manuscript.


5.1 Abstract

Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry imaging (MSI)

provides a unique in situ chemical profile that can include drugs, nucleic acids, metabolites,

lipids, and proteins. MSI of individual cells (of a known cell-type) affords a unique insight into

normal and disease-related processes and is a prerequisite for combining the results of MSI and

other single-cell modalities (e.g. mass cytometry and next-generation sequencing). Technological

barriers have prevented the high-throughput assignment of MSI spectra from solid tissue

preparations to their cell-type. These barriers include obtaining a suitable cell-identifying image

(e.g. immunohistochemistry) and obtaining sufficiently accurate registration of the cell-

identifying and MALDI-MS images. This study introduces a technique that overcame these

barriers by assigning cell-type directly from mass spectra. We hypothesized that in MSI from

mice with a defined fluorescent protein expression pattern, the fluorescent protein’s molecular

ion could be used to identify cell cohorts. A method was developed for the purification of

enhanced yellow fluorescent protein (EYFP) from mice. To determine EYFP’s molecular mass for MSI studies, we performed intact mass analysis and characterized the protein’s primary structure and post-translational modifications through various techniques. MALDI-MSI methods were developed to enhance the detection of EYFP in situ, and by extracting EYFP’s molecular ion from MALDI-MS images, automated, whole-image assignment of cell-cohorts was achieved.

This method was validated using a well-characterized mouse line that expresses EYFP in motor and sensory neurons and should be applicable to hundreds of commercially available mice (and other animal) strains comprising a multitude of cell-specific fluorescent labels.


5.2 Introduction

Recent technological breakthroughs enable increasingly detailed analysis of individual

cells.198-199 In particular, the analysis of cell types that are amenable to cell-sorting (e.g. tissue dissociated, immune, and circulating tumor cells) is being revolutionized by RNA sequencing

(RNA Seq),200 liquid chromatography-tandem mass spectrometry (LC-MS/MS),201 and mass

cytometry-based proteomics.202 Unfortunately, cells within solid tissues—especially cells with

fragile projections (e.g. neurons)—are not amenable to high-throughput cell sorting. In addition,

many normal and pathological processes depend upon cell-cell interactions, necessitating in situ

analysis.203 For example, the death of motor neurons leads to paralysis, the predominant phenotype of Amyotrophic Lateral Sclerosis (ALS). Interactions of motor neurons with

microglia, Schwann cells, and astrocytes modulate the age of onset and severity (rate of

progression) of ALS.204 In ALS and other diseases of solid tissues, single-cell protein analysis has traditionally been limited to the targeted analyses of a few proteins using immunohistochemistry. This study addresses the unmet need for the unbiased, in situ proteomics characterization of specific cell-types, including motor neurons.

Single-cell mass spectrometry methods, including secondary-ion mass spectrometry

(SIMS), imaging mass cytometry (IMC), and matrix assisted laser desorption/ionization mass spectrometry imaging (MALDI-MSI) have been recently reviewed.205-206 Due to their unique

strengths and weaknesses, SIMS, IMC, and MALDI-MSI are considered complementary

techniques.207-208 IMC is currently the only technique that can analyze the proteome of single

cells to sufficient depth—as many as 36 proteins simultaneously—to potentially identify cell-

types in situ.209 IMC does not suffer from the low-mass bias of other MSI techniques, affording

superior sensitivity for large proteins. However, IMC is a targeted technique with many of the


drawbacks of immunohistochemistry (IHC), including the requirements for antigen presentation

and validated antibodies.

SIMS is the most prolific method for single-cell MSI 210-211 and can produce multiplexed

elemental (multiple stable-isotopes) and molecular images (> 20 molecules) at subcellular

resolution. A unique characteristic of SIMS is its focused ion beam, which can permit imaging of

biomolecules in 3D, notably at a depth resolution as high as 5 nm.212 Nano (magnetic sector)

SIMS can probe the location and half-lives of proteins,213 lipids,214 and neurotransmitters at lateral spatial resolutions of 50-100 nm.215 TOF-SIMS is employed for many single-cell analyses, including imaging of vitamin,216 metabolite,217 and lipid217-218 molecular ions at lateral

spatial resolutions as high as 2 μm.217 However, SIMS techniques generally cannot detect the

intact molecular ions of endogenous peptides and proteins, or other masses greater than 1,000

Da. Such molecular ions (detected by MALDI-TOF MS) have proven diagnostic utility, serving

as the basis for the clinical identification of bacterial species (with regulatory approval from the

US FDA, EMA, etc.).219

MALDI-MSI can detect hundreds of chemically diverse molecules per image including

drugs, metabolites,220 nucleic acids,221 lipids,222 peptides,223 and proteins.224 A few of the many applications of MALDI-MSI include: measuring the infiltration of pesticides into plant tissues,225

identifying the substances of abuse within a single human hair,226 characterizing bacterial

biofilms,227 augmenting the traditional tools used by pathologists to classify diseased tissue,228-229

and analyzing drug distribution230-231 in pre-clinical and clinical studies.232 MALDI-MSI

techniques, including advances in spatial resolution, have been reviewed.233-234 Laborious sample

preparation techniques, such as laser microdissection,235 tissue stretching,236 and microinjection

of matrix,237 enable single cell MALDI-TOF MS. Innovations including lasers with increased


repetition rates, improved laser optics, laser oversampling,238 and automated matrix deposition

enable high-throughput imaging at ~5-10 µm spatial resolution using commercial MALDI-MSI instrumentation, and c.a. 2.5 μm with custom-built instrumentation.224, 239

Assigning a mass spectrum to the cell it originated from remains a major challenge. In one technique, cells are dissociated from tissue, and unsupervised classification is used to cluster spectra and infer several cell-types.240 The goal of this study, however, is assigning mass spectra

derived from intact tissue sections to their cell type. Previous studies approached this goal by the

registration of MSI with a secondary cell-identifying image but encountered several limiting

factors. Obtaining a cell-identifying image can be difficult: histology stains such as hematoxylin

and eosin (H&E) are not cell-selective; immunohistochemistry can be cell-selective, but requires

validated antibodies; and only a limited number of cell types can currently be classified using

spectroscopic methods such as FTIR imaging.241 In the event a cell identifying image is

obtained, it cannot be registered to the MS image with cellular-scale accuracy.

As we demonstrate below, typical prospective MALDI-MSI methods result in registration errors of 200 µm or more. Improvements to the size and shape of the fiducial markers, the number of fiducials (c.a. 20), and registration (probabilistic non-rigid transformation)240 have been implemented in fit-for-purpose software.242 These innovations

improved experts’ average registration accuracy from 164 µm to 38 µm (range of registration

errors c.a. 10-140 µm), permitting targeted studies of dispersed, individual cells.240 McDonnell and colleagues developed an automated approach that can register MSI to histology for a variety

of tissue types and instrument platforms, which reduces retrospective registration errors to 40-80

µm.243 Previous techniques, however, are not sufficient for identifying the densely populated

cells in thin tissue sections.


The in situ MALDI-MS analysis of known mammalian cells is currently limited by the

requirements of cell-identifying images and image registration. Here, we present a technique

with potential to forgo both of these requirements. A fluorescent protein with cell-cohort specific expression is used as a “mass marker” to identify cell types directly from MALDI-MS images.

Prior to the present study, the mass of our intended mass marker, enhanced YFP (EYFP), was uncharacterized. We purified EYFP from mouse brains; characterized its mass and primary structure using MS; optimized the in situ MALDI detection of EYFP; and used EYFP detection as a proxy for neuron-type in MSI. The strains of mice used here to label specific cell types, and hundreds of others like them, are commercially available through The Jackson Laboratory.

5.3 Experimental Procedures

5.3.1 Chemicals

Sinapic acid (SA) matrix, α-cyano-4-hydroxycinnamic acid (CHCA) matrix, HPLC grade acetonitrile (ACN), HPLC grade water + 0.1% formic acid (FA), and HPLC grade water + 0.1% trifluoroacetic acid (TFA) were purchased from Sigma-Aldrich (St. Louis, MO, USA). HPLC grade ethanol was purchased from Fisher Scientific (Hampton, NH, USA).

5.3.2 Vertebrate Animal Subjects

Transgenic “thy-1 YFP-16” mice (B6.Cg-Tg(Thy1-YFP)16Jrs/J, Stock No. 003709)

(The Jackson Laboratory, Bar Harbor, ME, USA) were housed in groups at the Northeastern

University Animal Care Facility. The use of these animals was approved by the Animal Care and


Use Committee at Northeastern University (IACUC protocol #16-0303R) and was in accordance with the federal, local, and institutional guidelines.

5.3.3 Additional mice used

Transgenic “thy1 YFP-H” mice (B6.Cg-Tg(Thy1-YFP)HJrs/J, Stock No. 003782) (The Jackson Laboratory, Bar Harbor, ME, USA) were housed at the Brandeis University Animal Care Facility. Adult male and female mice were sacrificed at 197-250 days old via CO2 inhalation using standard methods. Brains were extracted, frozen in liquid nitrogen, and stored at -80 °C.

5.3.4 Ammonium Sulfate Fractionation

An LS50B Luminescence Spectrometer (Perkin Elmer, Waltham, MA, USA) was used to

monitor EYFP at 485 nm excitation and 527 nm emission, and a dilution series of recombinant

YFP standard (BioVision Incorporated, Milpitas, CA, USA) was used to determine the

approximate concentration of the EYFP. For purification experiments, one EYFP-expressing mouse brain was homogenized with a Dyna-Mix homogenizer (Fisher Scientific, Hampton, NH,

USA) in 10 mM Tris buffer (pH 8.0) in ten 30 second intervals on ice. This homogenate was centrifuged at 13,000 g for 10 minutes at 4 °C and the supernatant was found to contain EYFP fluorescence. Following optimization experiments at a variety of ammonium sulfate (AS) concentrations, fractionation using 20% w/v AS (centrifugation for 10 minutes, 4 °C at 13,000 g) was determined to result in optimal EYFP enrichment.


5.3.5 Fast Protein Liquid Chromatography (FPLC) Separation of EYFP

The 20% AS supernatant was solvent exchanged into water, then purified on a MonoQ

10/100 GL anion exchange column (GE Healthcare, Piscataway, NJ, USA) using an ÄKTA FPLC system with INV-907 Valve System. The gradient ran from 0 to 100% buffer B, where buffer A was 10 mM Tris, pH 8.0 and buffer B was 10 mM Tris with 1 M NaCl, pH 8.0, and 5 mL fractions were collected. In total, 41 fractions were obtained, concentrated to 1 mL, and cleaned via ultrafiltration.

5.3.6 Ultrafiltration

The supernatant was transferred to Amicon® Ultra 50 mL Centrifugal Filter with a regenerated cellulose membrane and a 3 kDa molecular weight cutoff (MWCO) (Millipore, Billerica,

Massachusetts, USA). The sample was diluted with 45 mL of 10 mM ammonium bicarbonate at pH 7 and centrifuged at 2700 RPM for 60 minutes three times, then exchanged into HPLC grade water three times.

5.3.7 LC-ESI-MS Intact Protein Analysis

Using standard procedures (see page S3), fractions with YFP fluorescence were analyzed

on a H-Class Acquity UPLC system coupled to a Xevo G2-S Q-ToF mass spectrometer (Waters

Corp, Milford, MA) using an Acquity UPLC Protein BEH C4 (300 Å pore size, 1.7 µm particle size, 2.1 mm ID x 100 mm) column (Waters Corp, Milford, MA).

5.3.8 Peptide Mass Fingerprinting

Select fractions from each purification were digested in solution with 1 µg/µL of trypsin

at 30 °C overnight. The peptides were spotted on a MALDI target with recrystallized CHCA


matrix in 60:40 HPLC grade ACN: HPLC grade water + 0.1% TFA. The data were collected on

a solariX XR 9.4T FTICR (Bruker Daltonics, Billerica, MA, USA) with a 1.6 ms TOF, sweep

excitation power set at 20%, 22% laser power on the small laser setting with 32 shots/spot. The

SNAP II algorithm was applied within DataAnalysis (Bruker Daltonics, Billerica, MA) to

deisotope and generate monoisotopic masses (0.7 quality factor threshold and an S/N threshold of

2). Mass determination and ID of purified EYFP came from MASCOT (Matrix Science, Boston,

MA, USA) and BioTools (Bruker Daltonics, Billerica, MA, USA). Digests were searched using

peptide mass fingerprinting with a 15 ppm error tolerance allowing for two partial missed

cleavages. Sequence modifications included N-terminal acetylation and a custom modification of

-20.026 Da (monoisotopic mass) on the chromophore region, residues 65-67 (G’Y’G’).
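The value of the custom modification can be rationalized with a short calculation. The sketch below assumes that the -20.026 Da shift corresponds to chromophore maturation by loss of one water (cyclization and dehydration) and two hydrogen atoms (oxidation); this interpretation is ours, since the search parameters above state only the mass. The atomic masses are standard monoisotopic values.

```python
H = 1.00782503   # Da, monoisotopic mass of 1H
O = 15.99491462  # Da, monoisotopic mass of 16O

water_loss = 2 * H + O   # dehydration during cyclization
hydrogen_loss = 2 * H    # oxidation removes two hydrogen atoms

chromophore_shift = -(water_loss + hydrogen_loss)
print(f"chromophore mass shift: {chromophore_shift:.4f} Da")
```

The computed shift rounds to -20.026 Da, matching the custom modification applied to residues 65-67.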

5.3.9 LC-ESI-MS

The YFP standard, diluted to 250 nM, 500 nM, and 1 µM stocks, was run as a control,

accounting for the His-tag on the N-terminus. The solvents were A: 95% HPLC grade water, 5%

ACN+0.2% formic acid and B: 5% HPLC grade water, 95% HPLC grade ACN+0.2% FA, and

were used with a gradient of 0-10 minutes at 5% B, 12 minutes at 15% B, 37 minutes at 55% B,

40-43 minutes at 95% B, and 45-60 minutes at 5% B. The method was run in sensitivity analyzer

mode with a 500 – 4,000 m/z mass range at 1.00 s/scan time at a 200 µL/min flow rate with the

capillary voltage set to 3 kV and the sample cone voltage set at 40 V. The source temperature

was kept at 150 °C and desolvation temperature at 350 °C with a gas flow of 800 L/h. All

analyses and processing were performed using Waters UNIFI 1.7.1 software.
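The gradient described above can be written as a table of breakpoints. The following Python sketch returns the solvent B percentage at any time point; it assumes linear ramps between the stated breakpoints, which is typical for such methods but is not stated explicitly in the text, and the names are illustrative.

```python
# (time in minutes, % B) breakpoints taken from the method description
GRADIENT = [(0, 5), (10, 5), (12, 15), (37, 55), (40, 95), (43, 95), (45, 5), (60, 5)]


def percent_b(t):
    """Percent solvent B at time t (minutes), assuming linear ramps."""
    if not GRADIENT[0][0] <= t <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-60 minute method")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


for minute in (5, 11, 24.5, 41, 44, 50):
    print(f"{minute:>5} min: {percent_b(minute):.1f}% B")
```

Expressing the method this way makes it easy to see that the main separation occurs on the shallow 15-55% B ramp between 12 and 37 minutes, followed by a wash at 95% B and re-equilibration at 5% B.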


5.3.10 Top-Down EYFP Characterization

Top-down protein characterization was performed on a Bruker 9.4 T solariX XR Mass

Spectrometer (Bruker, Billerica, MA) using ESI-FSD. Intact EYFP was infused at 3 µL/min in

50:50 acetonitrile:water with 0.1% formic acid and fragmented in the region between skimmer 1 and funnel 2 with a declustering potential of 132 V. Ions were accumulated for 0.085 seconds per scan and transferred to a paracell. Forty 1.2 second transients were summed at 2 megawords per scan. N-terminal sequence and modifications were determined using DataAnalysis and BioTools software from Bruker with an RMS error of 2.8 ppm at 1,000 m/z.

5.3.11 Matrix Deposition and Protein Extraction

Three different methods of matrix deposition were tested with sinapic acid (SA). For the

TM Sprayer (HTX Technologies, Carrboro, NC), SA was deposited at 20 mg/mL in 50:50 HPLC grade ACN:HPLC grade water + 0.2% TFA. The nozzle temperature was set at 30 °C with 50:50

ACN:HPLC grade water + 0.2% TFA as the pushing solvent (flow rate 0.200 mL/min). The sprayer velocity was set at 900 mm/min with 2 passes, 10 psi gas flow, resulting in a density of

0.0044 mg/mm2. After coating, the slide was rehydrated at 80 °C for 3.5 minutes with 1 mL of

5% acetic acid.

5.3.12 Additional Matrix Deposition

For the ImagePrep (Bruker Daltonics, Billerica, MA, USA), SA was deposited at 20 mg/mL in 60:40 HPLC grade ACN:HPLC grade water + 0.2% TFA across half of a slide at a time to ensure even coating. The settings were adjusted to account for a thick deposition with medium wetness. After coating, the slide was incubated for 2 minutes and then rehydrated for 3.5 minutes with 5% acetic acid at 80 °C.

5.3.13 Tissue Preparation for MSI

The tissue was thawed from -80 to -15 °C in a Microm HM525 Cryostat and affixed to the target with OCT media. Tissue was cryosectioned to 12 µm thickness, thaw mounted onto

ITO slides (Bruker Daltonics, Billerica, USA) and vacuum desiccated for ~30 minutes after removal from the cryostat. The slides were then washed with 70% ethanol two times for 30 seconds and once with 90% ethanol for 30 seconds followed by vacuum desiccation to dry.

5.3.14 Laser Scanning Cytometry (LSC) for Fluorescence Detection

Quantitative images of 12 µm sections of the YFP-H tissue were generated using an iCyte

Automated Imaging Cytometer with iGen Software (CompuCyte Corporation, Westwood MA,

USA). Hoechst stain 33342 (Thermo Fisher Scientific, Waltham, MA) was manually deposited onto the tissue at a final concentration of 10 µg/mL in 70:30 HPLC grade ethanol:HPLC grade water. The system used 405 nm and 488 nm filters, and raster arrays of fluorescence measurements were generated matching the spectra of the Hoechst and EYFP. Images at spatial resolutions ranging from 20 to 0.25 µm were generated by controlling the spacing between the scan lines obtained with a 40X objective lens, allowing consistent sensitivity levels at all magnifications.


5.3.15 Fluorescence Microscopy

A Zeiss Axio Observer Z1 fluorescence microscope with an AxioCam HRC color camera

(Carl Zeiss, Oberkochen, Germany) was used to collect fluorescence images of the mouse brain

tissue sections. A green fluorescent filter (507 nm emission) was used and a yellow pseudo-color

applied. An image of the whole slide, including fiducial marks, was collected using the 10x

fluorescent objective and stitching was applied with 2% overlap. Images were exported as high-

resolution JPEGs to be uploaded into flexImaging.

5.3.16 MALDI-MSI

Imaging experiments were conducted using an ultrafleXtreme MALDI TOF/TOF with a

1 kHz Smartbeam laser (Bruker Daltonics, Billerica, MA, USA). Intact proteins were analyzed from 4,000 – 40,000 m/z with the global attenuator offset at 25%, and the lens adjusted to 7 kV.

The digitizer was set to 0.08 Gs/s and the pulsed ion extraction was set to 50 ns. MSI was

collected at 25 and 50 µm spatial resolution at 30 shots/spot, 1000 Hz, on medium energy

distribution set to 75% laser power with hexagon measuring raster. Analysis of MALDI-MSI

data was conducted in flexImaging, and flexAnalysis (Bruker Daltonics, Billerica, MA, USA)

and SCiLS 2016b (Bremen, Germany).

5.3.17 Analysis of MALDI-MSI Data

MS images were processed in flexImaging (Bruker Daltonics, Billerica, MA, USA) as

follows. Spectra were subjected to the minimum possible data reduction and to RMS

normalization (other normalization methods were also tested but increased the chance of false-

positives). Mass filters were chosen to include the majority of the peak of interest (e.g. >80 %),

and signal intensity was integrated throughout this region. To assure that pixels depicted as YFP

were true-positives, a local minimum (i.e. no protein ions detected) adjacent to the YFP mass

filter (and of the same m/z window width) was used to establish a noise threshold for the YFP

mass filter. Thresholding was applied such that in a given image, the number of colored pixels in

the negative control region was < 5% of the number of pixels in the YFP region (i.e. greater than

95% of the pixels depicted as YFP, and far greater than 95% of the total YFP intensity, are true-positives). This same method was also applied to the analysis of Purkinje cell protein 4 and myelin basic protein as described in section 5.4.6.
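The thresholding rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the flexImaging procedure itself: the function name `pick_threshold`, the list-based inputs (one integrated intensity per pixel for the YFP region and for the negative control region), and the example intensities are all hypothetical.

```python
def pick_threshold(yfp_pixels, control_pixels, max_fraction=0.05):
    """Return the lowest threshold at which the number of control pixels
    above threshold is < max_fraction of the YFP-region pixels above it.

    Returns None if no candidate threshold satisfies the rule.
    """
    # Candidate thresholds: every observed intensity, lowest first.
    for t in sorted(set(yfp_pixels) | set(control_pixels)):
        n_yfp = sum(1 for v in yfp_pixels if v > t)
        n_ctrl = sum(1 for v in control_pixels if v > t)
        if n_yfp > 0 and n_ctrl < max_fraction * n_yfp:
            return t
    return None


# Hypothetical integrated intensities (arbitrary units), for illustration only.
yfp_region = list(range(10, 40))   # 30 pixels from the YFP-expressing region
control_region = [1, 2, 3, 4]      # 4 pixels from the negative control region
print(pick_threshold(yfp_region, control_region))
```

Choosing the lowest threshold that satisfies the 5% rule keeps as many YFP pixels as possible while still bounding the share of false positives.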

Individual spectra (e.g. summed or extracted) were subject to centroid peak detection with an S/N threshold of 2, top-hat baseline subtraction, and were calibrated in flexAnalysis

(Bruker Daltonics, Billerica, MA, USA) using internal calibration of known mouse brain

proteins, listed in Table 5-1.244-246 MSI data were analyzed in SCiLS 2016b (Bremen, Germany) applying manual segmentation of the motor cortex and corpus callosum, by reference to the

Mouse Brain Atlas in Stereotaxic Coordinates. Spectra exported from SCiLS were processed by manual linear baseline subtraction (c.a. five points) in Excel.

5.3.18 LC-MS/MS-based Proteomic Profiling of Mouse Brain Tissue Sections

Proteomic analysis was conducted according to experimental protocols described in Li et

al.68 with minor changes detailed below. For LC separation, 5 cm of 50 μm i.d. 1-decene

modified PS-DVB monolithic SPE precolumn was connected to a 4.2 m PLOT by a PicoClear

Tee (NewObjective, Woburn MA). Digested brain tissue lysates were first loaded on the

monolithic SPE precolumn at a flow rate of 200 nL/min using a NCS 3500 RS pump (Thermo

Fisher Scientific, Sunnyvale, CA). Then, the digest was eluted off the precolumn and separated

on the PLOT column using a linear solvent gradient at a 20 nL/min flow rate split from 400 nL/min using an NCS 3500 RS pump (Thermo Fisher Scientific). The separation was performed using a 2-hour gradient of 0 - 27% mobile phase B (mobile phase A: 0.1% FA in water; mobile phase B, 0.1% FA in ACN). After completion of the gradient, the SPE and PLOT columns were washed with 90% B for 10 min and re-equilibrated with mobile phase A for another 20 minutes.
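The gradient and flow-split figures quoted above can be checked with a short Python sketch. It assumes an ideal linear gradient with no dwell volume; the function names `percent_b` and `split_ratio` are illustrative and do not come from any instrument software.

```python
def percent_b(t_min, start=0.0, end=27.0, duration_min=120.0):
    """Mobile phase B (%) at time t_min for an ideal linear gradient."""
    t = min(max(t_min, 0.0), duration_min)  # clamp to the gradient window
    return start + (end - start) * t / duration_min


def split_ratio(pump_flow_nl_min=400.0, column_flow_nl_min=20.0):
    """Ratio of pump flow to the flow delivered to the PLOT column."""
    return pump_flow_nl_min / column_flow_nl_min


print(percent_b(60.0))   # %B halfway through the 2-hour gradient
print(split_ratio())     # pump flow relative to column flow
print(20.0 * 120.0)      # nL passed through the column during the gradient
```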

Nano-ESI spray used an electrospray voltage of 1.25 kV and a distal coated tip (FS360-

20-5-D-20, New Objective) butt-to-butt connected with the outlet of the PLOT column via a zero dead volume PicoClear union (New Objective). The ion transfer tube temperature in the MS was set to 275 °C. (Note: the electrospray source must be handled with caution to avoid contacts with heated surfaces and high voltage.)

MS analysis was performed using a top 12 MS/MS data-dependent scan protocol on a Q

Exactive (Thermo Fisher Scientific) mass spectrometer. Full MS scans were acquired over the range of m/z 400 - 1600 Thomson units with resolution set to 70,000 (at m/z 200) and an automatic gain control (AGC) target set to 1 × 10^6. The 12 most intense parent ions, excluding singly charged ions and ions with unassigned charges, were selected for higher-energy collisional dissociation (HCD) fragmentation with a normalized collision energy (NCE) of 28%. The

MS/MS spectra were analyzed in the Orbitrap mass analyzer using resolution of 17,500 and

AGC of 1 × 10^5. The isolation window was 2 m/z and the dynamic exclusion was 45 s. The maximum ion injection time was 80 ms for full MS scans and 120 ms for MS/MS scans.

LC-MS/MS raw data files were analyzed using Proteome Discoverer 2.1 (Thermo Fisher

Scientific) using a database search engine Sequest HT (Thermo Fisher Scientific) against the

UniProt mouse database (October 22, 2018, containing 54189 sequences) with appended database of common contaminants (275 common contaminants and EYFP protein sequence).

MS/MS spectra were searched for fully tryptic peptide matches using a false discovery rate

(FDR) ≤1% for filtering. Carbamidomethylation (57.021 Da) was set as a fixed modification, and

N-terminal acetylation, methionine oxidation and deamidation (asparagine and glutamine) were

variable modifications, as was a custom fluorescence modification of -20.026 Da on tyrosine.

The precursor peptide mass tolerance was 10 ppm and fragment tolerance at 0.02 Da. The results

of the searches were verified using the Percolator module with filters set to high peptide

identification confidence to achieve an FDR ≤1%. The protein identifications were reported as

protein groups according to the parsimony approach in cases where a peptide sequence could be

identified as a match for multiple proteins in the database.

5.4 Results and Discussion

Most published MALDI-MSI studies employ Bruker’s software, which limits the number

of fiducials to three and employs a rigid image registration. Considering the work of Sweedler

and colleagues242, these conditions extrapolate to a maximum (expert) registration accuracy of

185 µm.240 This accuracy is insufficient for assigning mass spectra to cell-defining features and,

in many cases, to an anatomical region (e.g. distinguishing the adjacent, functionally distinct cerebellum and locus coeruleus). Even when restricted to a single point pair (i.e. simulating the

image-guided analysis of one cell), cellular-resolution registration accuracy could not be achieved using instrument manufacturer’s software (Figure 5-1). An additional systematic registration error is introduced by laser misalignment, which for example, was 349 ± 19 µm and

265 ± 17 µm in the x and y axes, respectively, in our as-installed MALDI-FTICR source. Our previous studies worked around these limitations by the manual intensity-based (retrospective) registration of small areas (e.g. 200 µm²) or by microinjection of individual fluorescent neurons

with MALDI matrix.234, 237 The aftermarket software microMS242 enables additional fiducials (20

or more) and fluorescent cells to be targeted for MALDI-MS acquisition, provided the cells are

sufficiently dispersed (i.e. are not part of a thin tissue section). We reasoned the sources of error

in registration, the laborious workarounds, and even the need for an optical cell-identifying

image, could all be avoided by using the in situ mass of EYFP as a cell-specific mass marker.

Figure 5-1. The set-up of fiducial “teach points” in flexImaging introduces registration errors

that preclude the targeted analysis of individual cells. As recorded from the camera inside the

rapifleX, A) and C) show WiteOut™ marks deposited on the ITO coated glass slide. Unique

features are selected as fiducials and registered (at the red crosshairs) with the internal

coordinates of the rapifleX (and MALDI image). B and D) The same fiducials in register with a

fluorescent microscopy image (white crosshairs). Employing three teach points (i.e. the

maximum permitted by the software), flexImaging was used to register and fuse the internal

instrument coordinates (denoted with the red “+”) with the fluorescence image, and the fiducial

registration error (FRE) was calculated. These fiducial registration errors represent the expected

error in targeting one or two fluorescent cells for MALDI-MS analysis, but their effect upon whole-image registration error must be interpreted cautiously. In the present case (rigid transformation of well-defined fiducials positioned near the corners of the image) FREs should

be proportional to the image registration error.247
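For readers unfamiliar with the fiducial registration error, it is conventionally computed as the root-mean-square distance between corresponding fiducial positions after registration. The Python sketch below shows only that final calculation; it does not perform the rigid registration, and the function name and the coordinates are hypothetical values chosen for illustration, not measurements from this work.

```python
import math


def fiducial_registration_error(points_a, points_b):
    """RMS distance between corresponding fiducials in two registered images."""
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("need two non-empty point lists of equal length")
    squared = [
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in zip(points_a, points_b)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical teach-point coordinates in µm (three fiducials, as in flexImaging).
instrument = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
microscope = [(3.0, 4.0), (103.0, 4.0), (3.0, 104.0)]
print(fiducial_registration_error(instrument, microscope))
```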

5.4.1 EYFP as a Tool to Define Neuron Cohorts

The objective of this study is to identify the spectra within a MALDI-MS image that are

derived from a specific neuronal cohort, using the detection of a fluorescent protein’s molecular

ion. Fluorescent proteins derived from the green fluorescent protein (GFP) of Aequorea

victoria,245 are well-established tools for in vivo labeling.244, 246 Sanes and colleagues established lines of mice with red, green, yellow, and cyan fluorescent proteins (together termed XFP) expressed in distinct neuronal cohorts, under control of the neuron-selective Thy1 promoter. The expression pattern of each of these lines, which has been described in detail,248 covers many

parts of the peripheral and central nervous system, and favors motor and sensory neurons.

Our interest in motor neuron physiology led to the use of two mouse lines expressing enhanced

YFP (EYFP), a variant of GFP245, 249 that includes the fluorescent-yield enhancing T203Y

mutation. ‘Thy-1 YFP-16’ mice express EYFP in motor, sensory, and a subset of central neurons

in cortical layers 2-6, along their axons. ‘Thy-1 YFP-H’ mice express EYFP in the cells that are

selectively vulnerable to degeneration in ALS, layer 5 motor neurons (Figure 5-2).250 Laser

scanning cytometry (LSC), which quantifies fluorescence throughout an entire tissue section (i.e.

signal is not attenuated by depth in thin tissue sections), was used to identify all cells in a

Hoechst stained tissue section.251 Consistent with numerous qualitative histopathology studies,

quantitative LSC images confirm that other cell-types vastly outnumber neurons, and that

neuron-specific chemical profiles would be diluted by glial cell chemical profiles at spatial

resolutions above c.a. 25 µm.

Figure 5-2. Genetically-encoded fluorescence enables the detection of cell cohorts in situ.

Quantitative laser scanning cytometry (LSC) images of a 12 µm thick section from a ‘Thy-1

YFP-H’ mouse strain shown at varying magnifications illustrate neurons (yellow, entire neuron labeled) and all cells (blue, nuclei labeled with Hoechst stain) using 408 and 488 nm fluorescent filters. A) Entire coronal tissue section with (B) and (C) showing magnified regions within the motor cortex that contain Layer V motor neurons labeled with EYFP.

5.4.2 Purification of EYFP Proteoforms

The DNA sequences of the XFP variants, including the EYFP variants used to create the

YFP-H/16 mouse lines used here, were not included in the original manuscripts and could not be obtained. Regardless, the DNA sequence would not have accounted for post-transcriptional or post-translational processing. Therefore, the purification of EYFP from Thy-1 YFP-H mice was undertaken to enable purified protein characterization. EYFP fractionation was followed throughout the purification process using fluorescence spectroscopy. Hydrophobic interaction chromatography (HIC) was used previously for YFP and was therefore attempted.252 However, with mouse brain preparations, HIC exhibited insufficient retention of EYFP, and we therefore developed an alternative purification method. This method is described in detail above in the

experimental section. Briefly, high resolution (4% w/v intervals) ammonium sulfate fractionation

(Figure 5-3A) was followed by anion exchange chromatography (AEX) (Figure 5-3B). Special care was taken to avoid low pH because at pH 6.5 (the isoelectric point of YFP) 50% of total fluorescence activity is lost.253-254 Additionally, increasing the pH of the buffers from 6.5 to 7 resulted in improved AEX retention and fractionation. EYFP proteoforms were present in five of the 41 AEX fractions. Following buffer-exchange, these fractions were analyzed by LC-ESI-MS to determine the intact mass, and two proteoforms were detected at 26,882 and 26,754 Da

(Figure 5-3C). This intact mass difference and peptide mass fingerprinting (PMF, see below)

indicate that the less abundant proteoform (c.a. 25% of total YFP by MS or fluorescence intensity) results from the loss of the C-terminal lysine from the more abundant proteoform,

likely during purification by carboxypeptidase activity. Only the more abundant proteoform was

detected in situ (see below).
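The assignment of the lighter proteoform to loss of a single lysine can be checked arithmetically. The sketch below is a minimal illustration: the average residue mass of lysine (approximately 128.17 Da, C6H12N2O) is a standard value, while the 1 Da tolerance and the function name are assumptions made here for intact-mass data of this precision.

```python
LYS_RESIDUE_AVG_DA = 128.17  # approximate average residue mass of lysine


def matches_residue_loss(heavy_da, light_da, residue_da, tol_da=1.0):
    """True if the mass difference equals one residue within tol_da (assumed)."""
    return abs((heavy_da - light_da) - residue_da) <= tol_da


# Intact masses of the two proteoforms reported in the text.
delta = 26882 - 26754
print(delta)
print(matches_residue_loss(26882, 26754, LYS_RESIDUE_AVG_DA))
```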

Figure 5-3. Isolation of EYFP proteoforms from YFP-16 mouse brains. A) Fluorimetry at 527

nm confirms successful ammonium sulfate fractionation of EYFP. B) The EYFP-enriched

ammonium sulfate fraction was subjected to anion exchange chromatography. Subsequent fluorimetry at 527 nm indicated that fractions 7-11 contained EYFP. C) LC-ESI-MS analysis

(top: raw spectra, bottom: deconvoluted spectra) of fractions 8 (exemplar of fractions 7-9) and 11

(exemplar of fractions 10 and 11) assessed the intact mass and purity of EYFP. Two EYFP

proteoforms were detected at intact masses of 26,882 and 26,754 Da. The lighter proteoform was found to be the product of carboxypeptidase activity during purification and was not used for subsequent analysis.

5.4.3 Characterization of EYFP

To date, the purification and fluorometric analysis of YFP have been reported from the bioluminescent bacterium Vibrio fischeri.255 GFP from A. victoria and several of its variants have been characterized by mass spectrometry,256 but EYFP had not. In addition, there are multiple gene products referred to as EYFP as well as sequence ambiguity within certain gene products.

Employing various mass spectrometry methods, we determined that the EYFP present in these mice contained the following amino acid insertions and substitutions: M1_S2insV, S65G, V68L,

S72A, T203Y, and H231L as compared to GFP (Figure 5-4), consistent with a published EYFP

DNA sequence.51,254 Primary structure and modification state of EYFP were determined through

a combination of top-down protein characterization, peptide mass fingerprinting, and LC-

MS/MS peptide analysis (described in methods) of the EYFP expressed in the studied mouse neurons. The theoretical (26,882.1 Da) and experimentally determined (26,881.7 Da, Waters Q-ToF) average masses of the proposed EYFP proteoform differed by 15 ppm.
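The 15 ppm figure follows directly from the two masses quoted above. A minimal Python check, in which the function name `ppm_error` is an illustrative choice:

```python
def ppm_error(measured_da, theoretical_da):
    """Signed mass error in parts per million relative to the theoretical mass."""
    return (measured_da - theoretical_da) / theoretical_da * 1e6


# Experimental (Q-ToF) and theoretical average masses from the text.
error = ppm_error(26881.7, 26882.1)
print(round(error, 1))  # about -14.9, i.e. 15 ppm in magnitude
```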

Figure 5-4. Characterization of EYFP. Top-down protein characterization, peptide mass

fingerprinting, and LC-MS/MS analysis, together with the intact masses determined above, were

used to confirm the primary structure and modifications of EYFP. A) ESI-FTICR-MS using funnel-skimmer dissociation determined that the EYFP expressed in studied mice underwent N-terminal methionine excision followed by N-terminal acetylation, and also confirmed the N-terminal sequence up to residue twelve. B) Determined sequence and modifications of EYFP expressed in mouse neurons with peptides observed by PMF of neuron-purified EYFP and LC-

MS/MS of neuron protein extract shown. G’Y’G’ fixed modification of -20.026 Da

(monoisotopic) was used. C) Crystal structure of EYFP (PDB ID 2YFP) and chemical structure of the G’Y’G’ chromophore (5-imidazolinone ring and didehydrotyrosine).
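The -20.026 Da chromophore modification is consistent with the net chemistry of GFP-like chromophore maturation, which involves cyclization with loss of one water and an oxidation with loss of two hydrogens. The sketch below reproduces the value from monoisotopic atomic masses; treating the shift as exactly H2O plus 2 H is an assumption stated here for illustration.

```python
# Monoisotopic atomic masses (Da).
H = 1.00782503
O = 15.99491462


def chromophore_mass_shift():
    """Mass change for loss of H2O (dehydration) plus 2 H (oxidation)."""
    water = 2 * H + O
    two_hydrogens = 2 * H
    return -(water + two_hydrogens)


print(round(chromophore_mass_shift(), 3))  # -20.026
```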

5.4.4 Tissue Washing to Maintain Fluorescence and Minimize Protein Delocalization

Tissue washing is an essential step for protein MALDI-MSI sample preparation. Washing removes lipids, salts and other interfering substances that can suppress protein ionization, allowing the matrix crystals to coalesce with the proteins in situ.257 We determined that

commonly applied washing procedures, such as Carnoy’s wash, as well as the “6-step” wash

developed by Yang et al.,258 removed all traces of EYFP fluorescence. Experiments with

mixtures of washing solvents indicated optimal wash conditions for maintaining EYFP

localization and fluorescence as 70% ethanol (twice for 30 seconds each) followed by 30 seconds

of 90% ethanol (Figure 5-5).

Figure 5-5. Tissue washing techniques for proteins can delocalize EYFP, or otherwise eliminate its fluorescence. One of the most prevalent tissue washing methods for protein MSI involves a 6- step wash combination of organic and aqueous solvents. However, these washes remove the fluorescence. A) The mouse brain tissue before washing and B) after the 6-step wash: 70% ethanol and 100% ethanol for 30 seconds each, Carnoy’s wash (6:3:1 methanol: chloroform: acetic acid) for 2 minutes, 100% ethanol for 30 seconds, HPLC grade water for 30 seconds, and

100% ethanol again for 30 seconds. A combination of ethanol washes was established to be the best method to prevent the diffusion and removal of EYFP while still removing interfering lipids and salts. C) The mouse brain tissue before washing and D) after washing with 70% ethanol twice for 30 seconds each and 90% ethanol for 30 seconds.

5.4.5 Optimization for High Mass Protein Detection

The sensitivity of MALDI-MS analysis decreases in proportion to protein size.259 New

detectors, such as the CovalX high mass HM1 detector, are being developed to address this

need.260 Technical challenges associated with imaging of intact proteins of high mass are significant enough that many studies resort to on-tissue digestion.223 For proper

desorption/ionization, proteins need to be sufficiently extracted from the tissue

microenvironment to interact with the matrix, without becoming delocalized.261 The major

factors known to affect protein detection in situ262 were explored in preliminary experiments and

critical parameters are discussed. These included: detergent enhancement of signal (Figure 5-

6),263 temperature (Figure 5-7), the ratio of aqueous to organic solvents in the matrix solution

(i.e. wetness), matrix layer thickness and homogeneity, solvent saturation of the deposition

chamber, automated matrix deposition methods, and droplet size (Figure 5-8). ACN concentration was determined to be critical for manual and automated deposition techniques

(Figures 5-9 and 5-10).257, 264 A subsequent recrystallization or “rehydration” step265 was also

found to improve signal-to-noise. Optimized matrix deposition parameters for the ImagePrep and

TM Sprayers result in comparable protein spectra, including EYFP detection (Figure 5-11). In

control experiments with non-transgenic animals the signal identified as EYFP was not observed.

Figure 5-6. Detergent enhancement of matrix can improve protein ionization in MALDI-MS. SA in 60% ACN + 0.1% TFA was spiked with detergents, as described by Cohen and Chait262. A 1

µL droplet of 20 mg/mL SA was deposited in 70:30 HPLC grade ACN: HPLC grade water +

0.2% TFA onto the tissue in EYFP regions and left to dry under a small fan with no subsequent rehydration. A) 0.05% SDS, B) 0.1% SDS, C) 0.5% Tween 20, and D) 0.1% Tween 20 gave good protein extraction but did not enable EYFP detection. E) 0.05% Triton X-100 and F) 0.1% Triton X-100 enabled protein extraction and EYFP detection. This degree of improved YFP detection was observed with 1 µL droplets applied to tissue (which result in protein delocalization). A commensurate improvement was not observed with (relatively dry) automated matrix deposition.

Figure 5-7. Lowering the nozzle temperature on the TM Sprayer improved protein extraction and EYFP detection. We hypothesized that the rate of evaporation (via nebulization gas flow) was too high using default settings, preventing the proper integration of the matrix with the tissue. A) 35 °C and B) 45 °C temperatures enabled a “wetter” deposition similar to the

ImagePrep. Higher nozzle temperatures of C) 55 °C and D) 65 °C had reduced protein extraction and EYFP was not detected.

Figure 5-8. Protein extraction and spatial resolution are limited by the size of the matrix droplets. Using a DAPI filter, the matrix crystals can be resolved to determine the homogeneity of the deposition. A) Displays a 10x objective fluorescence microscopy image and B) a 20x objective fluorescence microscopy image of the SA deposited on tissue with 8 passes at 50

µL/min flow rate. C) The resulting MSI and D) the class average spectrum indicate that the lower flow rate results in smaller droplets and fewer fissures in the matrix layer, but with lower

protein extraction and no detection of EYFP. A higher flow rate on the TM Sprayer results in

larger droplets and more matrix fissures but improves protein extraction and EYFP is detected.

E) Displays a 10x objective fluorescence microscopy image and F) a 20x objective fluorescence microscopy image of the SA deposited on tissue with 200 µL/min flow rate and 2 passes. As seen in G) the resulting MSI, and H) the class average spectrum, EYFP is detected and the distribution of the EYFP mass filter corresponds to regions where it is expressed. With the exception of the matrix concentration and number of passes, all other settings in the TM Sprayer were the same, resulting in a density of 0.00444 mg/mm².

Figure 5-9. The solvent composition of the matrix solution influences the detection of proteins

in situ. Representative spectra of three different matrix solvents tested: acetonitrile, isopropyl

alcohol, and methanol. Ethanol, Carnoy’s solution, and methanol:ethanol:TFA:acetonitrile solvent combinations were also tested. A 1 µL droplet of 20 mg/mL SA was deposited in 70:30

HPLC grade ACN:HPLC grade water + 0.2% TFA on the tissue in EYFP regions and left to dry under a small fan with no subsequent rehydration. A) 60% ACN + 0.1% TFA, B) 60% IPA +

0.1% TFA, and C) 60% methanol + 0.1% TFA, were tested and all enabled detection of EYFP.

ACN and IPA gave comparable EYFP detection, and both gave higher EYFP detection than methanol.

Figure 5-10. Acetonitrile is the preferred organic solvent for automated matrix deposition techniques. A) Although a 1 µL droplet of 20 mg/mL SA in 60% IPA + 0.1% TFA deposited onto the tissue resulted in protein extraction and EYFP detection, B) the volatile IPA solvent did not produce the same result in the ImagePrep with 20 mg/mL SA. C) A 1 µL droplet of 20 mg/mL

SA in 60% ACN + 0.1% TFA deposited onto the tissue and D) 20 mg/mL SA deposited via

ImagePrep both resulted in protein extraction and EYFP detection.

Figure 5-11. Optimized in situ detection of EYFP in MALDI-MS using two different automated deposition systems. A) 60% ACN + 0.1% TFA is the most commonly used solvent for SA but was not effective with the TM Sprayer. B) By adjusting the solvent ratio to 50% ACN, protein

extraction improved and EYFP was detected. C) Protein sensitivity and EYFP detection can also

be achieved with the ImagePrep (Bruker). Rehydration with a 5% acetic acid solution showed

significant improvement in the extraction of proteins and the detection of EYFP. The ImagePrep

produced comparable results to the TM Sprayer but required nearly 10x the amount of time to

coat a slide.

5.4.6 Mass Spectrometry Imaging of EYFP

Following the optimization described above, using the TM Sprayer to deposit matrix

(described in detail in the Experimental section above) and the Bruker ultrafleXtreme MALDI

TOF/TOF to acquire MSI, detection of EYFP was achieved at both 50 and 25 µm spatial resolution (Figure 5-12). In addition to EYFP, other proteins such as Purkinje cell protein 4 and myelin basic protein (and many others not depicted) were localized with MSI in relevant brain regions (Figure 5-12). To improve mass accuracy, we performed external calibration using a high mass protein calibrant pre-acquisition, and internal calibration using known mouse brain proteins post-acquisition (Table 5-1). To account for solvent-induced tissue deformation, a post-wash fluorescence microscopy image of the ITO slide was obtained prior to MSI.

Figure 5-12. MALDI-MS images in register with the fluorescence image of a YFP-16 mouse

brain at 50 and 25 µm spatial resolution (5-100% max. peak intensity for all proteins shown in all images). A and F) Fluorescence microscopy images of mouse brain tissue sections displaying where EYFP is expressed. B and G) MSI of EYFP with a yellow mass filter (27,000 ± 1.2%) applied. C and H) MSI of Purkinje Cell Protein 4 (Teal, 6720 ± 0.12%), Myelin Basic Protein

(Red, 14,125 ± 0.25%), and EYFP (Yellow) displayed simultaneously. D, E, I, and J)

Magnifications of the regions from panels C (for D and E) and H (for I and J) outlined in white and magenta, showing a region of the mouse brain containing the striatum, corpus callosum and

sensory cortex. K) Overall average spectrum of the MSI displayed in panels F-J, with proteins of interest and their mass filters shown. MSI data processing is described in detail in the methods section.
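The percentage-based mass filters quoted in the caption of Figure 5-12 translate into m/z windows as follows. This is a minimal sketch; the helper name `mass_window` is illustrative and is not a flexImaging function.

```python
def mass_window(center, percent):
    """Lower and upper bounds of a mass filter given as center ± percent."""
    half_width = center * percent / 100.0
    return center - half_width, center + half_width


# Mass filters from the Figure 5-12 caption: (center m/z, ± percent).
filters = {
    "EYFP": (27000.0, 1.2),
    "Purkinje cell protein 4": (6720.0, 0.12),
    "Myelin basic protein": (14125.0, 0.25),
}
for name, (center, percent) in filters.items():
    low, high = mass_window(center, percent)
    print(f"{name}: {low:.1f} to {high:.1f}")
```

The EYFP filter spans roughly 26,676 to 27,324, which is far wider than the other two filters in absolute terms because both the center mass and the percentage are larger.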

Theoretical Mass (Da)    Protein
6,718                    Purkinje cell protein 4
7,064                    Myelin Basic Protein (MH2+)
8,565                    Ubiquitin
12,138                   Cytochrome C
14,126                   Myelin Basic Protein (MH+)
16,796                   Calmodulin
17,840                   PT isomerase
22,104                   Brain Acid Soluble Protein 1
23,478                   Glutathione transferase
33,197                   Syntaxin

Table 5-1: Known mouse brain proteins used for internal calibration of MALDI-MSI spectra.

Calibration can be improved post-acquisition to improve the accuracy of in situ EYFP detection

using the masses of known proteins. The calibration can often be performed within a range of

500 ppm error or ± 3 Da.

The localization of fluorescent cell bodies (neuronal soma) and the EYFP molecular ion

are in general agreement (Figure 5-12). Further analysis (e.g. changing laser spot size and raster

area) indicated that at the spatial resolutions required for single-neuron analysis, this technique is sample-limited and approaches the detection limit of current instrumentation. One shortcoming of our method was that the EYFP molecular ion was not detected as well in white matter (e.g.

corpus callosum and striatum) as in gray matter. This is evident in the MS images

and is depicted in the segmentation map produced via bisecting k-means clustering of the

MALDI-MSI data (Figure 5-13). Attempts to improve EYFP detection in white matter were

successful, but required wet-deposition or rehydration that resulted in delocalization of smaller

(e.g. Ub) proteins.
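With two clusters, bisecting k-means reduces to a single two-way split. The Python sketch below illustrates that split on one value per pixel (for example, the integrated EYFP mass-filter intensity). It is a simplified stand-in under stated assumptions, not the SCiLS implementation, which clusters whole spectra; the function name and the intensities are hypothetical.

```python
def two_means_1d(values, max_iter=100):
    """Split 1-D values into a low (0) and high (1) cluster by 2-means."""
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("values must not all be identical")
    labels = []
    for _ in range(max_iter):
        # Assign each value to the nearer of the two cluster centers.
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        new_low = sum(group0) / len(group0)
        new_high = sum(group1) / len(group1)
        if (new_low, new_high) == (low, high):
            break  # centers stopped moving, so the assignment is stable
        low, high = new_low, new_high
    return labels, (low, high)


# Hypothetical per-pixel EYFP intensities (arbitrary units).
intensities = [1.0, 2.0, 1.5, 10.0, 11.0, 9.5]
labels, centers = two_means_1d(intensities)
print(labels)
```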

Figure 5-13. Segmentation map (k = 2) of manually segmented gray and white matter regions, overlaid on a fluorescence image of EYFP. A segmentation map with 2 clusters demonstrated separation of pixels based on whether EYFP was detected at a given location. This enabled

average spectra from EYFP and non-EYFP locations to be produced and compared for different

brain structures. Green pixels indicate spectra containing the EYFP peak and black pixels indicate no or low intensity of EYFP. A) Average mass spectra from the motor cortex and B) the

sensory cortex regions show that EYFP is detectable in the gray matter. C) EYFP is detectable in

the corpus callosum (white matter) but the relative intensity is much lower. Sections of each

spectrum have been magnified to show key proteins in each cluster/anatomic region. The

numbers in blue represent: 1) Purkinje cell protein, 2) ubiquitin, 3) cytochrome c, 4) myelin basic

protein, and 5) PT isomerase, which are some of the proteins used for internal calibration (Table

5-1).

The differential EYFP intensity between gray and white matter was not explored in detail

in the present study, which focused on optimizing signal from cortical neurons (gray matter).

Further examination of this would address whether EYFP is being preferentially extracted from

lipid-rich white matter during the delipidating tissue wash, or if EYFP signal is suppressed by

components of white matter. Despite these shortcomings, the EYFP molecular ion could be

detected in thousands of neurons per brain section, and this signal generally correlated with

fluorescent foci (Figure 5-12). To determine whether EYFP-expressing cells happen to express a

protein with a mass similar to EYFP, which could be mis-identified in MS images, EYFP-

expressing cells were isolated, and their proteome was elucidated. From a thin tissue section,

within an area of the motor cortex with high EYFP expression, a region containing

approximately 100 cells was micro-dissected. This tissue sample was prepared and LC-MS/MS

was employed as previously described266, demonstrating EYFP was the predominant protein in

its mass range (Figure 5-14).

Figure 5-14. EYFP is the most abundant protein within its mass range. LC-MS/MS analysis was performed on extracts from a ca. 100 cell-area with abundant yellow fluorescence, which were microdissected from a tissue section of YFP-16 mice. Sixteen unique peptides were identified from EYFP. Only two other unique peptides were found from proteins in the mass range spanning ± 250 Da from the mass of EYFP. Details on the methods used are described in the methods section.

5.5 Conclusions

We demonstrate techniques that allow the detection of EYFP—and presumably, related

XFP variants—using MALDI-MSI. When combined with transgenic mice with cell-selective

YFP expression, this overcomes some of the challenges facing cell-cohort characterization using

MSI by eliminating the need for an optical image and for image registration. If preceded by tryptic digestion, this technique has the potential to be applied to additional MALDI-MS workflows223 and other MSI techniques such as LAESI267 and DESI.268 Spectra generated by

this technique could serve as a basis for generating a cell-type classifier that could be used to extract cell cohorts from unlabelled tissues.269-270 MSI at high spatial resolution operates near the detection limit for a number of proteins, but sensitivity is expected to improve as instrumentation advances.

Promoters with alternative expression patterns have been employed to create numerous XFP animal lines, labeling additional cell-types271, and have been combined, resulting in XFP mosaics exemplified by the Brainbow mouse.271-272 MSI could complement this technique, or with further instrumentation and method advancement, replace it, if cell-type-specific mass markers are employed.

Chapter 6

High Resolution Complex Organic Mixture Mass Spectrometry Method Development

Nicholas D. Schmitt

6.0 Statement of Contribution

Experimental contributions to this chapter by Nicholas D. Schmitt are as follows: Nicholas D.

Schmitt performed all Mass Spectrometry method development, optimization, and data analysis

shown. Nicholas D. Schmitt performed all aspects of dissolved organic matter analysis except for sample collection and sample pre-treatment, though he was trained in both of these procedures and did perform them for a small subset of samples. For the lipid analysis section (6.3), in addition to the mass spectrometry method development, optimization and data analysis previously mentioned, Nicholas D. Schmitt supported sample preparation by advising on protocol and testing various preparations. He also wrote all sections shown and created or

significantly contributed to the creation of all figures shown.

6.1 Introduction

This chapter serves to demonstrate method development for complex organic mixture

high-resolution mass spectrometry. I developed methods such as these for many applications and

collaborative efforts, but the two examples displayed here are for the analysis of dissolved

organic matter (DOM) and cellular lipid extracts. Mass spectrometry of these complex organic

mixtures can be performed in either positive or negative ionization mode depending on the exact species or class of molecules targeted. In the instances shown, all analyses were performed in

negative ionization mode. In DOM analysis, the focus was on observing as many molecules as possible at high mass accuracy (< 1 ppm error). The primary challenge is filling the ICR cell with an ideal number of ions (charges) consistently across samples of various ionization potential and concentration. Desolvation energy, collision cell parameters, time of flight, cell shimming, and ion accumulation time were all found to be critical parameters. Ramped excitation and selective accumulation assist greatly in ensuring consistent measurements between samples. Similar tactics are used in developing broad range lipid methods. For isotopic fine structure lipid analysis, continuous accumulation of selected ions (CASI) is employed to isolate a narrow frequency band for passage to the ICR cell, ensuring precise m/z measurements and separation in m/z space of various isotopologues of the same molecule, enabling the user to determine its elemental composition. The DOM method development was applied to several sample sets as described below, and the lipid methods were applied to two collaborative projects that reached publication as noted below.
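To give a sense of why isotopic fine structure demands such high resolving power, the sketch below estimates the resolving power (m/Δm) needed to separate the 13C and 15N isotopologues of the same singly charged ion. The isotope masses are standard values; the example m/z of 400 and the function name are assumptions chosen for illustration, and practical baseline separation requires more than this minimum.

```python
# Isotope mass differences (Da) from standard atomic masses.
C13_SHIFT = 13.0033548 - 12.0000000   # 13C minus 12C
N15_SHIFT = 15.0001089 - 14.0030740   # 15N minus 14N


def required_resolving_power(mz, delta_mass_da, charge=1):
    """Minimum m/delta-m needed to distinguish two peaks delta_mass_da apart."""
    return mz / (delta_mass_da / charge)


spacing = abs(C13_SHIFT - N15_SHIFT)  # about 6.3 mDa between the two peaks
print(round(spacing * 1000, 2))       # spacing in mDa
print(round(required_resolving_power(400.0, spacing)))
```

Narrowing the accumulated m/z range, as CASI does, allows more ions of interest to be trapped without overfilling the cell, which helps low-abundance isotopologue peaks this closely spaced to be observed.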

6.2 Dissolved Organic Matter FTICR-MS Method Development

6.2.1 Introduction

Dissolved organic matter (DOM), a key component of biogeochemical carbon cycling, is a class of diverse water-soluble carbon-containing compounds. It differs from particulate organic matter (POM) in that its molecules can pass through a filter, and are therefore small enough to be analyzed by advanced analytical methods such as mass spectrometry. The major elemental components of DOM are carbon, hydrogen, oxygen, nitrogen, phosphorus, and sulfur. This heterogeneous mixture varies widely among marine and soil ecosystems and can serve as a fingerprint of the organic molecules present in a particular sample.273 For these reasons, high

resolution mass spectrometry has become a popular method for analyzing a portion of the DOM

population of samples, primarily in marine chemistry and geochemistry. A wealth of information pertaining to the complex nature and cycling of DOM, and to the methods used to analyze it, has been reviewed multiple times in the literature, most comprehensively in a review in the Treatise on Geochemistry co-authored by my dissertation committee member Prof. Aron Stubbins (T. Dittmar, A. Stubbins, 2014).274

Due to the extensively heterogeneous nature of DOM within and between samples, the analysis method of choice will largely determine which molecules within the sample are primarily observed. When analyzing DOM by FTICR-MS, such choices are determined by the instrument acquisition parameters. In developing a method for comparing large sets of DOM samples, choices had to be made as to which portion of the DOM pool, already limited by the choice of mass spectrometry, would be compared. As a broad example, analysis is typically performed in negative ion mode, eliminating approximately half of the observable ions (assuming they are present as salts). As a more specific example, it is difficult to achieve high resolution across the entire mass range of molecules, so a choice must be made to prioritize the monitoring of ions within a particular mass range, typically spanning from ca. 150 m/z to 1,000 m/z. The exact parameters used, however, are less important than identifying as many ions as possible while introducing the least bias possible, as samples are typically compared using van Krevelen diagrams, which compare H/C, O/C, aromaticity, and general mass ranges.274
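As an illustration of the van Krevelen comparison mentioned above, the following minimal sketch computes the O/C and H/C coordinates of an assigned molecular formula. The function names and the example formula are assumptions introduced here for illustration only; they are not part of the method described in this chapter, and the parser handles only simple formulas without brackets or isotope labels.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C16H32O2' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def van_krevelen_point(formula):
    """Return (O/C, H/C), the x and y coordinates on a van Krevelen diagram."""
    counts = parse_formula(formula)
    carbon = counts["C"]
    if carbon == 0:
        raise ValueError("formula contains no carbon")
    return counts["O"] / carbon, counts["H"] / carbon


if __name__ == "__main__":
    # Hypothetical example formula, chosen only to exercise the functions
    o_c, h_c = van_krevelen_point("C16H32O2")
    print(f"O/C = {o_c:.3f}, H/C = {h_c:.3f}")
```

Plotting these two ratios for every assigned formula in a spectrum gives one point per molecule, which is how two samples can be compared even when the exact acquisition parameters differ.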

The problem of how different instruments and instrument settings may represent samples differently, often making it difficult to compare results between labs, has been examined in depth in a recent cross-laboratory methods validation study that I coauthored and contributed data and revisions to (Hawkes et al. 2020).275 The general conclusion of this inter-laboratory study was that ion count and sample representation varied considerably between labs, while van Krevelen trends were generally preserved; the variation nonetheless necessitates that standard control samples be run in all future DOM analyses.275 The following figures and narrative detail important points in the development of a DOM-analysis method that I developed with assistance from the Stubbins (Northeastern University) and Dittmar (Carl von Ossietzky Universität Oldenburg) labs, as well as undergraduate research assistants Khanh Vu and Sydney Geyer.

6.2.2 Method Development

In the early stages of method optimization, I looked at a wide variety of FTICR-MS parameters in an effort to resolve thousands of ions without overloading the cell to the point of causing peak splitting or coalescence. In FTICR-MS, there is a balance between the megawords of data that will be collected and the low mass limit (which actually represents the highest frequency, and therefore energy, that the instrument will try to detect). It can be seen then that lowering the low mass will increase the amount of time each measurement will take, and so will increasing the datafile size (i.e. there is relatively more data per m/z at low m/z). This balance must be struck without exceeding the charge limit of the FTICR cell. In the case of DOM, I wanted to observe the broadest selection of molecules possible, with enough resolution of each molecule to derive a molecular formula. This derivation is easier in the lower m/z range, as fewer molecular formulas fit each measured m/z in this range, so the lower mass range was slightly prioritized, as seen below in Figure 6-1. Some of the first parameters I optimized were the low mass and time of flight. Each figure displayed below represents a small portion of the entire optimization process, serving only to illustrate the sort of comparisons performed.
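The point that fewer molecular formulas fit a measured m/z in the lower mass range can be explored with a small enumeration. The sketch below is a simplified illustration, not the formula assignment software used for this work: it assumes singly charged [M-H]- ions, considers only C, H, and O, and applies assumed chemical constraints (O/C <= 1, even hydrogen count, H <= 2C + 2). All function names are hypothetical.

```python
# Monoisotopic masses in Da
MASS_C = 12.0
MASS_H = 1.00782503207
MASS_O = 15.99491461956
MASS_PROTON = 1.00727646688


def neutral_mass(c, h, o):
    """Monoisotopic mass of a neutral CcHhOo molecule."""
    return c * MASS_C + h * MASS_H + o * MASS_O


def cho_candidates(mz, ppm=1.0, max_carbon=80):
    """List (C, H, O) formulas whose [M-H]- ion lies within `ppm` of `mz`."""
    target = mz + MASS_PROTON          # neutral mass implied by a 1- [M-H]- ion
    tolerance = mz * ppm * 1e-6
    hits = []
    for c in range(1, max_carbon + 1):
        for o in range(0, c + 1):      # assumed constraint: O/C <= 1
            heavy = c * MASS_C + o * MASS_O
            if heavy > target + tolerance:
                break                  # adding more oxygen only overshoots further
            h = round((target - heavy) / MASS_H)
            # Closed-shell neutral CHO molecules: even H count and H <= 2C + 2
            if h < 0 or h % 2 or h > 2 * c + 2:
                continue
            if abs(neutral_mass(c, h, o) - target) <= tolerance:
                hits.append((c, h, o))
    return hits


if __name__ == "__main__":
    # Hypothetical test ion: the [M-H]- ion of C16H32O2
    mz = neutral_mass(16, 32, 2) - MASS_PROTON
    for tol in (1.0, 50.0):
        print(f"{tol:>5} ppm: {len(cho_candidates(mz, ppm=tol))} candidate(s)")
```

Repeating the search at higher m/z, at a looser tolerance, or with N, S, and P added generally lengthens the candidate list, which is why sub-ppm accuracy matters most at the upper end of the mass range.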

Figure 6-1. FTICR-MS spectra of DOM from Suwannee River, varying time of flight (ToF). Data (not shown) were also collected at a low mass setting of 154 m/z. Lowering ToF below 0.8 ms impedes observation of higher mass peaks in the DOM without offering any observable benefit to sensitivity. No additional peaks were observed using a low mass of 96 m/z instead of 154 m/z. Subsequent experiments demonstrated that raising ToF from 0.8 ms to 1.0 ms lost a great amount of sensitivity in the lower mass range.

Additional parameters optimized at this stage in the process include funnel RF, transfer RF, octopole RF, collision cell voltage, ramped excitation, and transfer frequency. Octopole and transfer RF values of 350, a funnel RF of 120, a collision cell voltage of 3 V, a ramped excitation of 7-18%, and a transfer frequency of 6 MHz were found to work well. The improvements to resolving power from these optimizations are shown in Figure 6-2.

Figure 6-2. Examination of a single m/z range in a DOM spectrum. Before optimization (top) and after optimization (bottom) are compared. Most of the differences observed between initial and optimized settings were seen in peak intensities and m/z range observed. Improvements in individual peaks within each peak group also were observed as seen here.

Further improvements in method development came from a) optimization of desolvation energy (Figure 6-3), where an optimal level of ionization without breakdown of molecules was determined, b) refinement of datafile size to improve resolving power without extending the free induction decay (FID) so long that regions with little signal are measured (Figure 6-4), c) determining where signal to noise begins to plateau, in order to stay at the beginning of the plateau region (Figure 6-5), and d) determining optimal DOM spray concentration (Figure 6-6) in conjunction with ion accumulation time, to allow optimal ionization at the source and an optimal population of ions trapped in the ICR cell for measurement (Figure 6-7).

Figure 6-3. Optimization of skimmer 1 voltage for DOM samples. Values seen on the right side of each spectrum are the skimmer 1 voltages used to collect those spectra. Lower skimmer 1 voltages allow most ionizable molecules in the sample to pass to the detector undamaged, including contaminant ions in the sample (peaks extending above the y-axis frame). Increasing skimmer 1 voltage reduces interfering detection of these ions but, beginning at 70 V, begins to degrade molecules in the DOM mixture. A skimmer 1 voltage of 55 V was selected for subsequent analyses.

Figure 6-4. Data acquisition file size (through increased FID length) and DOM signal. Resolution of each peak was observed as data acquisition file size was increased by extending the free induction decay observation length. DOM molecules appear to remain observable in the cell up to 8 megawords, corresponding to a transient length of 4.8 seconds. Improvements were also noticed when examining the full spectrum of ions observed (data not shown). Further optimization, through peak-matching to exemplar spectra and transient truncation using the FTMS Processing program, showed that a 4M acquisition (2.4 second transient) was adequate for identifying the DOM ions present.
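The relationship between the low mass limit, detected frequency, and transient length can be sketched numerically. The example below uses the textbook unperturbed cyclotron frequency, f = zeB / (2πm), and assumes the 9.4 T field quoted for this instrument in the lipid sections of this chapter. The transient scaling assumes a fixed digitizer rate and is anchored to the 4M / 2.4 s pair reported above; the instrument's actual sampling scheme is not described here, so this is an approximation and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # unified atomic mass unit, kg


def cyclotron_frequency_hz(mz, b_tesla):
    """Unperturbed cyclotron frequency for an ion of the given m/z (Da per charge)."""
    return E_CHARGE * b_tesla / (2 * math.pi * mz * DALTON)


def scaled_transient_s(points, ref_points=4 * 1024**2, ref_transient_s=2.4):
    """Transient length for `points` data points at a fixed digitizer rate,
    anchored to the reported 4M (4194304 words) / 2.4 s acquisition."""
    return ref_transient_s * points / ref_points


if __name__ == "__main__":
    field = 9.4  # T, assumed from the instrument description in this chapter
    for mz in (154.8, 1000.0):
        print(f"m/z {mz:7.1f}: {cyclotron_frequency_hz(mz, field) / 1e3:8.1f} kHz")
    for points in (4 * 1024**2, 8 * 1024**2):
        print(f"{points} words: {scaled_transient_s(points):.1f} s")
```

The low mass limit sets the highest frequency in the detected band, and at a fixed digitizer rate doubling the data size doubles the transient, consistent with the 4M and 8M values in the caption.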

Figure 6-5. Optimization of number of scans per sample. Using the Suwannee River control sample, the number of scans with optimized parameters was increased, and signal to noise ratios for 10 exemplar peaks were calculated and averaged per spectrum. In this particular experiment the ‘200 scan’ file became corrupted and would not open in Data Analysis, and for that reason was excluded from the analysis. Signal to noise was found to increase linearly with the square root of the number of scans, as expected. Depending on sample set, 150 – 200 scans per sample were used.
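The square-root dependence noted in this caption explains why returns diminish as scans are added. The following minimal sketch assumes uncorrelated noise and a stable signal from scan to scan; the single-scan signal to noise values in the example are hypothetical and are not measurements from this work.

```python
import math


def snr_after_n_scans(single_scan_snr, n_scans):
    """Expected S/N after combining n scans, assuming uncorrelated noise."""
    return single_scan_snr * math.sqrt(n_scans)


def relative_gain(n_from, n_to):
    """Factor by which S/N changes when going from n_from to n_to scans."""
    return math.sqrt(n_to / n_from)


def scans_needed(single_scan_snr, target_snr):
    """Smallest number of scans expected to reach the target S/N."""
    return math.ceil((target_snr / single_scan_snr) ** 2)


if __name__ == "__main__":
    print(f"150 -> 200 scans: x{relative_gain(150, 200):.3f} in S/N")
    print(f"Scans for S/N 20 from a single-scan S/N of 2: {scans_needed(2.0, 20.0)}")
```

Under these assumptions, going from 150 to 200 scans costs a third more acquisition time for roughly a 15% gain in signal to noise, which is the kind of trade-off behind choosing 150 – 200 scans depending on sample set.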


Figure 6-6. The impact of sample concentration on signal intensity, ions detected, and peak shape. Higher concentrations lead to more ions in the ICR cell and worsen peak shape. All previous tests were conducted at 20 mg / mL, as it was the standard concentration for DOM analysis to date.

Figure 6-7. Lowering ion accumulation time allows for use of a higher concentration of sample in the spray. This allowed us to overcome an issue where contaminant peaks in the solvents were overpowering the intensity of the analytes, by spraying at a lower accumulation time but a higher concentration of analytes to prioritize observation of the DOM.

After method optimization, the following method (only parameters unique to this method from default FTICR-MS method shown) was used for analysis of DOM samples for several projects from the Stubbins laboratory and collaborators of the Stubbins laboratory:


Section              Parameter                          Value
Acquisition          Acquired Scans                     200 scans
Acquisition          Detection Mode                     Broadband
Acquisition          Data Acquisition Size              4194304 words
Acquisition          No. of Cell Fills                  1 fill
Acquisition          Selective Accumulation             On
Acquisition          Broadband Low Mass                 154.8 m/z
Acquisition          Ion Accumulation Time*             0.024 sec
ESI                  Polarity                           Negative
ESI                  Capillary                          4300 V
ESI                  Flow Rate                          2 µL/min
ESI                  Drying Gas Temperature             180 °C
ESI                  Nebulizer Gas                      0.5 bar
Source Optics        Capillary Exit                     -220 V
Source Optics        Funnel 1                           -150 V
Source Optics        Skimmer 1 (differential)*          -55 V
Source Optics        Funnel RF Amplitude                120 Vpp
Quadrupole           Q1 Mass                            150 m/z
Quadrupole           Hexapole Frequency                 5000 kHz
Collision Cell       Collision Voltage (Entrance)       2 V
Collision Cell       Collision Gas Flow Rate            30%
Collision Cell       Collision Cell RF                  1600 Vpp
Ion Transfer Optics  Transfer Line RF                   350 Vpp
Ion Transfer Optics  Transfer Exit Lens                 20 V
Analyzer             Analyzer Entrance                  14 V
Analyzer             Front Trap Plate                   2 V
Analyzer             Back Trap Plate                    2 V
Analyzer             Shimming DC Bias 90                -1.6 V
Analyzer             Shimming DC Bias 270               -1.5 V
Analyzer             Shimming DC Bias 0                 -1.4 V
Analyzer             Shimming DC Bias 180               -1.9 V
Frequency            Excitation Mode                    Sweep
Frequency            Ramped Power Excite Ramp Style     linear in %
Frequency            Ramped Power Excite Mode           continuous
Frequency            Ramped Power Excite EXC_hi Power   18%
Frequency            Ramped Power Excite EXC_lo Power   7%
Frequency            Excitation Event Upper Limit       1000 m/z
Frequency            Excitation Event Lower Limit       154.8 m/z
Processing           Apodization                        Sine-Squared


Table 6-1. DOM Analysis Final Method Parameters. Displayed are parameters unique to this method, optimized for DOM analysis. *Parameters labeled with a star could vary between samples. These were the settings originally used for the ROMEO sample set provided by the Stubbins laboratory. It was found that some samples or sample sets ionized better than others or had fewer contaminants. Both of these factors required less ion accumulation time and lower skimmer 1 voltage differentials.

This method was integrated with an auto-sampling method, developed on a Waters ACQUITY UPLC system, that delivers samples to the FTICR-MS in blocks of pure sample, timing the acquisition to occur while sample is ionizing, with appropriate wash steps between samples. All samples were dissolved in 50/50 water/methanol, and the LC solvents were the same. When integrating a Waters ACQUITY UPLC and a Bruker solariX XR, HyStar must be used to synchronize the acquisition sequence, beginning with each sample change in the sample manager. However, if signal is acquired in this process the same way that LC-MS data is collected, individual spectra must be averaged to view a multi-scan spectrum. This signal averaging produces the unnaturally noise-smoothed baselines typical of most LC-MS experiments on most commercial mass spectrometers. This practice allows for only poor QC of spectral quality; spectral adding is preferred, so that the 200 scans are summed to better represent the complex DOM mixture and the noise concomitant with their spectra. As such, setting up the mass spectrometer to begin a 200 scan selective accumulation acquisition shortly after the sample reaches the source is a much better strategy (Figure 6-8). In this way, we were able to exactly replicate a direct infusion analysis in an automated fashion and analyze large sample sets totaling hundreds of samples with relative ease.


Figure 6-8. Auto-sampling method development. Selective accumulation begins when the sample hits the ion source and only saves scans that fall in a particular total ion current range.

This process is tuned such that samples can reach the desired number of scans in the duration of their ionization.
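The logic of selective accumulation described in this caption can be modeled in a few lines. The sketch below is a toy model under stated assumptions, not the vendor acquisition software: each scan is represented as a list of intensities, the total ion current (TIC) is taken as the simple sum of those intensities, and accepted scans are added rather than averaged. The function name, the TIC window, and the example data are all hypothetical.

```python
def selective_accumulation(scans, tic_low, tic_high, n_required):
    """Sum the first `n_required` scans whose TIC lies in [tic_low, tic_high].

    `scans` is an iterable of equal-length intensity lists.
    Returns (summed_spectrum, n_accepted); summed_spectrum is None if no
    scan was accepted.
    """
    summed = None
    accepted = 0
    for scan in scans:
        tic = sum(scan)
        if not (tic_low <= tic <= tic_high):
            continue  # reject scans outside the TIC window (e.g. wash or solvent front)
        if summed is None:
            summed = list(scan)
        else:
            summed = [a + b for a, b in zip(summed, scan)]
        accepted += 1
        if accepted == n_required:
            break  # desired number of scans reached; stop acquiring
    return summed, accepted


if __name__ == "__main__":
    # Hypothetical scans: blank, sample, sample, overloaded, sample
    scans = [[0, 1, 0], [5, 50, 5], [4, 52, 6], [40, 400, 40], [5, 48, 7]]
    spectrum, n = selective_accumulation(scans, tic_low=50, tic_high=70, n_required=3)
    print(n, spectrum)
```

If the sample block ends before the required number of scans is accepted, the returned count falls short of the target, which corresponds to the tuning requirement stated in the caption.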

Figures 6-9, 6-10, and 6-11 below show a representative SRFA standard spectrum at three levels of X-axis zoom using my finalized method.


Figure 6-9. Suwannee River fulvic acid (SRFA) spectrum with finalized DOM analysis method shown here at three magnification levels. Full spectrum displayed. 8,531 unique ions identified with average resolving power of 430,000 at 381 m/z.

Figure 6-10. SRFA spectrum with finalized DOM analysis method. Spectrum displayed from

345 m/z to 380 m/z.


Figure 6-11. SRFA spectrum with finalized DOM analysis method. Spectrum displayed from

353.00 m/z to 353.20 m/z.

6.2.3 DOM Method Development Conclusions

Developing FTICR-MS methods for DOM analysis, and likely for any complex organic mixture, is a challenging task that requires a thorough understanding of the many parameters set in such an instrument. It can be a tedious process, in which optimizing one parameter is found to negate the benefits gained from a previously tuned parameter. Continuous method refinement is required during this process, and understanding the physics of how the instrument works was essential for method optimization. Further, certain parameters cannot be held constant for all DOM samples. Parameters such as ion accumulation time and skimmer 1 voltage can be adjusted for samples that do not acquire well at the set method parameters; such samples required re-runs to optimize signal to noise ratios and ion populations. This method was subsequently applied to several sample sets including ROMEO (Stubbins), Thomas, Guillemette, Wagner, and the samples for the interlaboratory study previously mentioned.275


6.3 Lipid Mass Spectrometry Method Development

6.3.1 Lipid method development introduction

Lipid analysis is another application of mass spectrometry that often employs negative

mode acquisition. The most challenging aspect of lipid analysis is the isolation of the lipid

species one wishes to analyze. This can be thought of in three stages, 1) extraction of lipids from

cell culture, 2) enrichment of desired subspecies of lipid, 3) isolation of the target lipid molecule within the analysis system (LC-MS or MS alone).276 It is also often possible to avoid step 2 if the

lipid species being analyzed is prominent in the sample and the analysis system is powerful

enough to preferentially isolate the target lipid. In FTICR-MS, resolving power in tandem with quadrupole isolation is often sufficient to measure the accurate mass of a lipid

molecule even in a complex mixture, even achieving isotopic fine structure to determine an exact

formula and differentiate similar species. Using the FTICR-MS parameter-tuning skills

developed during DOM-analysis method development, I was able to develop a method

framework that allowed for high resolving power isotopic fine structure determination of a

variety of molecules. These methods were applied to lipid samples generated at Harvard Medical

School in collaborative projects with the Nathalie Agar and John Hanna laboratories, published as coauthored manuscripts in Analytical Chemistry277 and Molecular Biology of the Cell.278

6.3.2 High Resolution Mass Spectrometry of Ceramides

Ceramide confirmation using high resolution mass spectrometry (HRMS): What follows are excerpts from a coauthored study published in ‘Analytical Chemistry’ as a collaborative effort between the Nathalie Y. R. Agar, Deborah A. Dillon, and Jeffrey N. Agar laboratories. All


high-resolution mass spectrometry molecular confirmation experiments were performed by

Nicholas D. Schmitt in the Jeffrey N. Agar laboratory.

Excerpts here reproduced in part with permission from In Vitro Liquid Extraction Surface Analysis Mass Spectrometry (ivLESA-MS) for Direct Metabolic Analysis of Adherent Cells in Culture, Analytical Chemistry 2018 Apr 17;90(8):4987-4991. DOI: 10.1021/acs.analchem.8b00530 Copyright 2020 American Chemical Society.277

Lipid extraction

Lipid extracts from MDA-MB-231 cells were generated using a modified Folch extraction method.279 Briefly, cells from 6-well cell culture plates were washed 2X with PBS,

harvested by scraping, pelleted (1000g, 2 min), resuspended in 0.3 mL MeOH and vortexed with

0.6 mL CHCl3 for 10 min. Phase separation was achieved using 0.25 mL H2O, followed by

vortexing, incubation at RT for 10 min and centrifugation (1000g, 2 min). Following

centrifugation, the organic phase was carefully collected and diluted ten-fold in 2:1

CHCl3:MeOH.

ESI-FT-ICR MS analysis

Lipid extracts were introduced into the mass spectrometer at an infusion rate of 2 µL/min,

and ionized at atmospheric pressure using electrospray ionization (ESI) in negative ion mode.

Mass spectra were acquired using a 9.4 Tesla SolariX XR FT-ICR (Bruker Daltonics, Billerica,

MA), using ramped RF-excitation and a 4 MW dataset. After confirming an MS lipid profile

similar to that seen by ivLESA MS, HRMS measurements were performed in continuous accumulation of selected ions (CASI) mode at 572.5 m/z and 683.0 m/z with a 20 m/z window to obtain high mass accuracy of ions contained within the mass ranges of interest.

Figure 6-12. High resolution mass spectra generated using ESI-FT-ICR MS in CASI mode, from lipid extracts of MDA-MB-231 cells, highlighting m/z ranges corresponding to those elevated in

MDA-MB-231 cells by ivLESA (A, B). Resolved isotopic fine structure (C) permits the unambiguous assignment of the overlapping (in nominal mass) isotopologues of the M peak of saturated ceramides and the M+2 peaks of unsaturated ceramides. Green represents acquired spectrum and dashed black represents simulated spectrum.
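The overlap described in this caption can be made concrete with a short calculation of how closely the M+2 isotopologues of an unsaturated ceramide approach the monoisotopic peak of its saturated analogue. The sketch below uses the calculated [M+Cl]- m/z values listed in Table 6-2 and standard isotope mass differences. It considers only two M+2 isotopologues (two 13C atoms, or one 37Cl atom), and the resolving power estimate is the simple ratio m/Δm, which is a lower bound for separation rather than a statement about peak shape. The function names are hypothetical.

```python
DELTA_13C = 13.0033548378 - 12.0          # 13C minus 12C, Da
DELTA_37CL = 36.96590259 - 34.96885268    # 37Cl minus 35Cl, Da


def m_plus_2_isotopologues(mz_monoisotopic):
    """m/z of two M+2 isotopologues of a singly charged chloride adduct."""
    return {
        "13C2": mz_monoisotopic + 2 * DELTA_13C,
        "37Cl": mz_monoisotopic + DELTA_37CL,
    }


def required_resolving_power(mz, delta_mz):
    """Minimum m / delta-m needed to distinguish two peaks delta_mz apart."""
    return mz / abs(delta_mz)


if __name__ == "__main__":
    mz_unsaturated = 682.59105   # calc. [M+Cl]- of C42H81NO3 (Table 6-2)
    mz_saturated = 684.60670     # calc. [M+Cl]- of C42H83NO3 (Table 6-2)
    for label, mz in m_plus_2_isotopologues(mz_unsaturated).items():
        gap = mz_saturated - mz
        power = required_resolving_power(mz_saturated, gap)
        print(f"{label}: gap {gap * 1000:.2f} mDa, m/dm >= {power:,.0f}")
```

Both gaps are on the order of 10 to 20 mDa at m/z 684, so the peaks share a nominal mass but are separable at the resolving power available from FT-ICR MS in CASI mode.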


m/z meas    m/z calc    Molecular formula   Adduct    Molecular assignment     Delta (ppm)
572.48120   572.48150   C34H67NO3           [M+Cl]-   Ceramide (d18:1/16:0)    0.5
682.59060   682.59105   C42H81NO3           [M+Cl]-   Cer(d18:1/24:1(15z))     0.7
684.60623   684.60670   C42H83NO3           [M+Cl]-   Ceramide (d18:1/24:0)    0.7
684.60623   684.60670   C42H83NO3           [M+Cl]-   Cer(d18:0/24:1(15z))     0.7

Table 6-2. Molecular formula assignment and lipid identification were performed by querying the Human

Metabolome Database280 for ions with m/z values of 572.4812, 682.5906 and 684.6062, with

each query having a mass tolerance of ± 1 ppm. Each query returned only chloride adducts of ceramide molecules within the mass range specified (synonymous molecules have been omitted).

In the case of 684.6062, two different possible ceramide species were returned, though

physiologically, ceramide (d18:1/24:0) is more likely.
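The calculated m/z and mass error columns of Table 6-2 can be reproduced from monoisotopic masses. The sketch below is an illustration rather than the database query actually used: it assumes a singly charged chloride adduct, [M+Cl]-, and computes the signed mass error in ppm (the table reports its magnitude). The function names are hypothetical.

```python
# Monoisotopic masses in Da
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "Cl": 34.96885268,
}
ELECTRON = 0.00054857991


def chloride_adduct_mz(c, h, n, o):
    """Calculated m/z of the singly charged [M+Cl]- adduct of CcHhNnOo."""
    neutral = (c * MONOISOTOPIC["C"] + h * MONOISOTOPIC["H"]
               + n * MONOISOTOPIC["N"] + o * MONOISOTOPIC["O"])
    return neutral + MONOISOTOPIC["Cl"] + ELECTRON


def ppm_error(measured, calculated):
    """Signed mass error of a measurement in parts per million."""
    return (measured - calculated) / calculated * 1e6


if __name__ == "__main__":
    rows = [
        ("C34H67NO3", (34, 67, 1, 3), 572.48120),
        ("C42H81NO3", (42, 81, 1, 3), 682.59060),
        ("C42H83NO3", (42, 83, 1, 3), 684.60623),
    ]
    for name, composition, measured in rows:
        calculated = chloride_adduct_mz(*composition)
        print(f"{name}: calc {calculated:.5f}, error {ppm_error(measured, calculated):+.2f} ppm")
```

All three measured values fall within the ± 1 ppm query tolerance stated in the caption.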

6.3.3 Phospholipid Relative Quantitation

Phospholipid relative intensity determination: What follows are excerpts from a coauthored study published in ‘Molecular Biology of the Cell’ as a collaborative effort between the Nathalie Y. R. Agar, John W. Hanna, and Jeffrey N. Agar laboratories. All high-resolution

mass spectrometry lipidomic experiments were performed by Nicholas D. Schmitt in the Jeffrey

N. Agar laboratory. Reproduced here is the figure which Nicholas D. Schmitt contributed the most to, though he also contributed to other elements of the manuscript.

Reproduced in part with permission from Dysregulation of very-long-chain fatty acid metabolism causes membrane saturation and induction of the unfolded protein response, Molecular Biology of the Cell 2020 31:1, 7-17. DOI: 10.1091/mbc.E19-07-0392 Copyright 2020 The American Society for Cell Biology.278


Lipid Extraction

Total yeast lipid extracts were prepared as previously described281. Overnight cultures

grown at 30°C in rich media (OD600~5) were diluted into fresh media (starting OD600=0.1) and

cultured for 6 hours to a final OD600 of approximately 0.8. These logarithmic phase cells were

washed with water and resuspended in cold CHCl3/MeOH (2:1) solution. Cells were disrupted

with glass beads at 3200 rpm for 30 min at 4°C. MgCl2 was added to the suspension at a

concentration of 0.034% and mixed for 10 min at 4°C. Samples were centrifuged at 1000xg for 5

min at room temperature. The aqueous upper phase was discarded. MeOH/H2O/CHCl3

(48:47:3) solution was added to the suspension and mixed. Samples were centrifuged at 1000xg

for 5 min at room temperature. The aqueous upper phase was again discarded and the organic

lower phase containing the lipid fraction was transferred to a fresh tube. Collected lipid extracts

were desiccated in a vacuum chamber and stored at -80°C. Samples were dissolved in

CHCl3/MeOH (2:1) solution prior to lipidomic analysis.

Mass Spectrometry Lipidomic Analysis

Molecular identification was achieved with high resolution mass spectrometry using a 9.4

Tesla SolariX XR FT-ICR MS (Bruker Daltonics, Billerica, MA) employing ramped RF-

excitation and a 4 MW dataset. Lipids were introduced using syringe flow infusion at a rate of 2

μL/min. Electrospray ionization was performed in both negative and positive ion polarity modes,

with continuous accumulation of selected ions (CASI) employed when high mass accuracy for

lipid identification was needed. Spectra were analyzed using Data Analysis software (Bruker

Daltonics, version 4.2). Mass spectral peaks were assigned to particular lipids from the LIPID

MAPS Lipidomics Gateway (Wellcome Trust) Database using a < 2 ppm molecular mass cutoff.


Figure 6-13. Increased phospholipid saturation in the fat1Δ mutant. A) Relative intensity of long

(C16-C20) and very long chain (C22-C26) fatty acids in the wildtype and fat1Δ mutant, as determined by comprehensive FT-ICR mass spectrometry-based lipidomics. Relative intensity is expressed as a percentage of the total ion current (TIC). Similar results were obtained in more than three independent experiments. The mass to charge ratios are 291.2095 (16:0), 319.2408

(18:0), 347.2721 (20:0), 375.3033 (22:0), 403.3346 (24:0), and 431.3658 (26:0). B) Relative intensity of selected lipid species in the wild-type and fat1Δ mutant. The three ceramide (Cer)


species shown contain a VLCFA moiety, whereas sphingosine, sphinganine, and

phytosphingosine are precursors for ceramide synthesis. Relative intensity is expressed as a percentage of the TIC. Similar results were obtained in more than three independent experiments. The mass to charge ratios are 300.2891 (sphingosine), 302.3047 (sphinganine),

318.2995 (phytosphingosine), 680.6764 (Cer (d44:0)), 708.7663 (Cer(d46:0)), and 724.7021

(Cer(t46:0)). C) Raw positive ion mode spectra from lipidomic profiles of wild-type and fat1Δ cells. The relative intensities of the two most abundant phosphatidylcholine species in yeast (PC-32 and PC-34) are indicated. PC (32:2) and PC (34:2) represent the di-unsaturated forms; PC

(32:1) and PC (34:1) represent the monounsaturated forms. Relative intensity is expressed as

peak intensity normalized to the base peak of the displayed m/z window. Similar results were

obtained in more than 15 independent experiments (see also Fig. 4C-D). The mass to charge

ratios are 730.5381 (32:2), 732.5538 (32:1), 758.5694 (34:2) and 760.5809 (34:1). D) Relative

intensity of di-unsaturated and monounsaturated PC-32 (left panel) and PC-34 (right panel)

species in the wild-type and fat1Δ mutant. The fraction of each species as a total of the sum of

the two species is plotted. Similar results were obtained in more than 15 independent

experiments. E) Relative intensity of di-unsaturated and monounsaturated PE-32 (left panel) and

PE-34 (right panel) species in the wild-type and fat1Δ mutant. The fraction of each species as a

total of the sum of the two species is plotted. Similar results were obtained in more than ten

independent experiments.
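The three normalizations used in this figure (percentage of total ion current, normalization to the base peak of a window, and the fraction of each species within a pair) are simple ratios. The sketch below shows them side by side; the function names and all intensity values are hypothetical placeholders and are not data from the study.

```python
def percent_of_tic(peak_intensity, total_ion_current):
    """Relative intensity expressed as a percentage of the total ion current."""
    return 100.0 * peak_intensity / total_ion_current


def normalize_to_base_peak(intensities):
    """Scale intensities so the most intense peak in the window equals 1."""
    base = max(intensities)
    return [value / base for value in intensities]


def pair_fractions(intensity_a, intensity_b):
    """Fraction of each species relative to the sum of the two species."""
    total = intensity_a + intensity_b
    return intensity_a / total, intensity_b / total


if __name__ == "__main__":
    # Hypothetical intensities for a di-unsaturated / monounsaturated pair
    di_unsaturated, mono_unsaturated = 3.0e6, 1.0e6
    print(percent_of_tic(di_unsaturated, 8.0e7))
    print(normalize_to_base_peak([di_unsaturated, mono_unsaturated]))
    print(pair_fractions(di_unsaturated, mono_unsaturated))
```

The pairwise fraction is insensitive to run-to-run changes in overall signal, which makes it a convenient way to compare saturation between strains measured in separate infusions.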

6.3.4 Lipid Method Development Conclusion

Lipid isotopic fine structure and relative intensity quantitation are two elements of lipid mass spectrometry often essential to lipidomic experiments. Proper parameter optimization, as described in section 6.2.2, is essential for achieving both of these. In both of the publications for which I show excerpts here, high-resolution mass spectrometry was required for acceptance of the work to the journal. This demonstrates the importance of the technology in the field of biomedical research, adding essential value to the process of answering complex biological questions. More details on the biological significance of these findings, with which I was only minimally involved, can be found in the primary literature.277-278


Chapter 7

Conclusions and Future Directions

The work presented in this dissertation demonstrates how critical thought and new perspectives can change and improve how problems in analytical chemistry can be solved. For ease of reference, publications where I am an author have been underlined in the reference section. In the second chapter, we examined the value of taking perspective on a field of research periodically to evaluate what has been learned, what hypotheses have been developed, which of these hypotheses have been supported by the breadth of data acquired, and how that should inform the future directions of research in said field. We did this by examining SOD1-mediated

ALS and hypotheses that sporadic ALS may have similar mechanisms of onset to familial ALS, though caused by PTMs of the SOD1 protein rather than genetic mutations. This perspective that genetic mutations are essentially encoded PTMs at the level of protein translation could shed light on a number of diseases that have both familial and sporadic onsets. Further, we discovered several PTMs in the literature that make the exact same change as genetic mutations in a number of disease-associated proteins and believe these will be useful in informing future research directions.

In the third chapter, we then applied critical thinking to contradictory results that have persisted in the literature, produced by the use of SOD-100 as a pan-SOD1 antibody, which it is claimed to be. We show how potentially hundreds of publications could be compromised by this misleading assumption and draw attention to the importance of validating antibodies before assuming their mode of interaction with proteins of interest. Through hydrogen-deuterium exchange mass spectrometry and western blots, we saw how the conformation of proteins cannot be assumed from primary structure alone, necessitating these advanced techniques to better examine protein structure and form. We believe these results are the beginning of an effort to re-interpret past findings with modern insights to better inform future research.

In the fourth chapter, we examined why, in the field of top-down proteomics, the majority of fragments generated, namely internal fragments, were not being used. Among other reasons that we discussed, we found that this was partially due to an entire subset (ca. 20% or more) of product ions being unassigned or double-assigned. We elucidated new ways to assign these sequences without additional data. We also demonstrated our expertise in the literature,

developing a system to assign these fragments based on previous studies employing a scoring

system, and our expertise in mass spectrometry, demonstrating how MS3 can be used to assign

these previously-unassigned fragments. We demonstrated how taken together, these methods can

be used to expand characterization capabilities of top-down proteomics by significant factors.

Collectively, we showed how deep consideration of the factors complicating analysis can be used

to the analyst’s advantage, empowering future researchers in the field of top-down proteomics to

see PTMs they may have otherwise missed.

In the fifth chapter, we showed the development of a new technique in mass spectrometry

imaging that harnesses the power of intact protein analysis to push the field of MALDI-MSI in a

step towards single-cell resolution. Continuous technological breakthroughs paired with methodological advancements are the fuel of scientific progress. In this study we embraced recent innovations in mass spectrometry instrumentation and combined a variety of methods to

demonstrate high-throughput cell cohort assignment. We see this as the springboard upon which

future research can build, enabling signatures present, or genetically encoded into tissues, to


allow mass spectrometry to assign cell types in the same experiment that new data is being

collected without the need for relying on error-prone methods.

Lastly, in chapter six we chronicle the FTICR-MS method development process by which a DOM analysis method was developed, leading to key parameter adjustments that enabled efficient and automated sample analysis of complex organic mixtures. We demonstrate how

methods developed for one class of samples could be applied to an entirely different class, in the

case of lipidomics, as both sets of analytes shared the same analysis mode and general m/z range.

Mass spectrometry method development on an advanced and highly adjustable instrument is a

tedious yet rewarding task, plagued by many hours of progressive frustration followed by

ecstatic moments of triumph as acquisition begins and a signal begins to register on the

instrument graphical user interface. Often the mass spectrometrist wishes to be handed a

method to plug and play with, but the lessons and knowledge they forgo with this approach

should not be discarded lightly.


References

1. Deng, H. X.; Hentati, A.; Tainer, J. A.; Iqbal, Z.; Cayabyab, A.; Hung, W. Y.; Getzoff, E. D.; Hu, P.; Herzfeldt, B.; Roos, R. P.; et al., Amyotrophic lateral sclerosis and structural defects in Cu,Zn superoxide dismutase. Science 1993, 261 (5124), 1047.
2. Rosen, D. R.; Siddique, T.; Patterson, D.; Figlewicz, D. A.; Sapp, P.; Hentati, A.; Donaldson, D.; Goto, J.; O'Regan, J. P.; Deng, H. X.; et al., Mutations in Cu/Zn superoxide dismutase gene are associated with familial amyotrophic lateral sclerosis. Nature 1993, 362 (6415), 59-62.
3. Auclair, J. R.; Boggio, K. J.; Petsko, G. A.; Ringe, D.; Agar, J. N., Strategies for stabilizing superoxide dismutase (SOD1), the protein destabilized in the most common form of familial amyotrophic lateral sclerosis. Proceedings of the National Academy of Sciences 2010, 107 (50), 21394.
4. Auclair, J. R.; Johnson, J. L.; Liu, Q.; Salisbury, J. P.; Rotunno, M. S.; Petsko, G. A.; Ringe, D.; Brown, R. H.; Bosco, D. A.; Agar, J. N., Post-Translational Modification by Cysteine Protects Cu/Zn-Superoxide Dismutase from Oxidative Damage. Biochemistry 2013, 52 (36), 6137-6144.
5. Wang, Q.; Johnson, J. L.; Agar, N. Y. R.; Agar, J. N., Protein Aggregation and Protein Instability Govern Familial Amyotrophic Lateral Sclerosis Patient Survival. PLOS Biology 2008, 6 (7), e170.
6. Smith, L. M.; Kelleher, N. L.; Consortium for Top Down Proteomics, Proteoform: a single term describing protein complexity. Nature methods 2013, 10 (3), 186-187.
7. Walsh, C. T.; Garneau-Tsodikova, S.; Gatto, G. J., Protein Posttranslational Modifications: The Chemistry of Proteome Diversifications. Angewandte Chemie International Edition 2005, 44 (45), 7342-7372.
8. Bosco, D. A.; Morfini, G.; Karabacak, N. M.; Song, Y.; Gros-Louis, F.; Pasinelli, P.; Goolsby, H.; Fontaine, B. A.; Lemay, N.; McKenna-Yasek, D.; Frosch, M. P.; Agar, J. N.; Julien, J. P.; Brady, S. T.; Brown, R. H., Jr., Wild-type and mutant SOD1 share an aberrant conformation and a common pathogenic pathway in ALS. Nat Neurosci 2010, 13 (11), 1396-403.
9. Schmitt, N. D.; Agar, J. N., Parsing disease-relevant protein modifications from epiphenomena: perspective on the structural basis of SOD1-mediated ALS. Journal of mass spectrometry : JMS 2017, 52 (7), 480-491.
10. Roncador, G.; Engel, P.; Maestre, L.; Anderson, A. P.; Cordell, J. L.; Cragg, M. S.; Šerbec, V. Č.; Jones, M.; Lisnic, V. J.; Kremer, L.; Li, D.; Koch-Nolte, F.; Pascual, N.; Rodríguez-Barbosa, J.-I.; Torensma, R.; Turley, H.; Pulford, K.; Banham, A. H., The European antibody network's practical guide to finding and validating suitable antibodies for research. mAbs 2016, 8 (1), 27-36.
11. Begley, C. G.; Ellis, L. M., Raise standards for preclinical cancer research. Nature 2012, 483 (7391), 531-533.
12. Uhlen, M.; Bandrowski, A.; Carr, S.; Edwards, A.; Ellenberg, J.; Lundberg, E.; Rimm, D. L.; Rodriguez, H.; Hiltke, T.; Snyder, M.; Yamamoto, T., A proposal for validation of antibodies. Nature methods 2016, 13 (10), 823-7.
13. Bucur, O.; Pennarun, B.; Stancu, A. L.; Nadler, M.; Muraru, M. S.; Bertomeu, T.; Khosravi-Far, R., Poor antibody validation is a challenge in biomedical research: a case study for detection of c-FLIP. Apoptosis : an international journal on programmed cell death 2013, 18 (10), 1154-62.
14. Voskuil, J. L. A., The challenges with the validation of research antibodies. F1000Res 2017, 6, 161-161.
15. Brotherton, T. E.; Li, Y.; Cooper, D.; Gearing, M.; Julien, J. P.; Rothstein, J. D.; Boylan, K.; Glass, J. D., Localization of a toxic form of superoxide dismutase 1 protein to pathologically affected tissues in familial ALS. Proc Natl Acad Sci U S A 2012, 109 (14), 5505-10.
16. Guareschi, S.; Cova, E.; Cereda, C.; Ceroni, M.; Donetti, E.; Bosco, D. A.; Trotti, D.; Pasinelli, P., An over-oxidized form of superoxide dismutase found in sporadic amyotrophic lateral sclerosis with bulbar onset shares a toxic mechanism with mutant SOD1. Proceedings of the National Academy of Sciences 2012, 109 (13), 5074-5079.
17. Kerman, A.; Liu, H. N.; Croul, S.; Bilbao, J.; Rogaeva, E.; Zinman, L.; Robertson, J.; Chakrabartty, A., Amyotrophic lateral sclerosis is a non-amyloid disease in which extensive misfolding of SOD1 is unique to the familial form. Acta Neuropathol 2010, 119 (3), 335-44.
18. Catherman, A. D.; Skinner, O. S.; Kelleher, N. L., Top Down proteomics: facts and perspectives. Biochem Biophys Res Commun 2014, 445 (4), 683-693.
19. Toby, T. K.; Fornelli, L.; Kelleher, N. L., Progress in Top-Down Proteomics and the Analysis of Proteoforms. Annual Review of Analytical Chemistry 2016, 9 (1), 499-519.
20. Cobb, J. S.; Easterling, M. L.; Agar, J. N., Structural characterization of intact proteins is enhanced by prevalent fragmentation pathways rarely observed for peptides. Journal of the American Society for Mass Spectrometry 2010, 21 (6), 949-959.
21. Durbin, K. R.; Skinner, O. S.; Fellers, R. T.; Kelleher, N. L., Analyzing Internal Fragmentation of Electrosprayed Ubiquitin Ions During Beam-Type Collisional Dissociation. Journal of The American Society for Mass Spectrometry 2015, 26 (5), 782-787.
22. Karabacak, N. M.; Li, L.; Tiwari, A.; Hayward, L. J.; Hong, P.; Easterling, M. L.; Agar, J. N., Sensitive and Specific Identification of Wild Type and Variant Proteins from 8 to 669 kDa Using Top-down Mass Spectrometry. Mol Cell Proteomics 2009, 8 (4), 846-856.
23. Marshall, A. G.; Comisarow, M. B.; Parisod, G., Relaxation and spectral-line shape in Fourier-transform ion resonance spectroscopy. J Chem Phys 1979, 71.
24. Marshall, A. G.; Hendrickson, C. L.; Jackson, G. S., Fourier transform ion cyclotron resonance mass spectrometry: A primer. Mass Spectrom Rev 1998, 17.
25. Nikolaev, E. N.; Kostyukevich, Y. I.; Vladimirov, G. N., Fourier transform ion cyclotron resonance (FT ICR) mass spectrometry: Theory and simulations. Mass Spectrometry Reviews 2016, 35 (2), 219-258.
26. Smith, L. M.; Kelleher, N. L., Proteoform: a single term describing protein complexity. Nature methods 2013, 10 (3), 186-7.
27. Ségalat, L., Loss-of-function genetic diseases and the concept of pharmaceutical targets. Orphanet Journal of Rare Diseases 2007, 2, 30-30.
28. Sidransky, E.; Ginns, E. I., Genetic basis of Gaucher disease. J Pediatr 1995, 127 (3), 510.
29. Mair, B.; Konopka, T.; Kerzendorfer, C.; Sleiman, K.; Salic, S.; Serra, V.; Muellner, M. K.; Theodorou, V.; Nijman, S. M. B., Gain- and Loss-of-Function Mutations in the Breast Cancer Gene GATA3 Result in Differential Drug Sensitivity. PLOS Genetics 2016, 12 (9), e1006279.
30. Oren, M.; Rotter, V., Mutant p53 gain-of-function in cancer. Cold Spring Harb Perspect Biol 2010, 2 (2), a001107.


31. Human Mortality Database. University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany). Available at www.mortality.org or www.humanmortality.de (data downloaded on March 2, 2017).
32. Dickson, D. W., Neuropathology of Non-Alzheimer Degenerative Disorders. International Journal of Clinical and Experimental Pathology 2010, 3 (1), 1-23.
33. McCann, H.; Stevens, C. H.; Cartwright, H.; Halliday, G. M., alpha-Synucleinopathy phenotypes. Parkinsonism Relat Disord 2014, 20 Suppl 1, S62-7.
34. Cullen, V.; Sardi, S. P.; Ng, J.; Xu, Y. H.; Sun, Y.; Tomlinson, J. J.; Kolodziej, P.; Kahn, I.; Saftig, P.; Woulfe, J.; Rochet, J. C.; Glicksman, M. A.; Cheng, S. H.; Grabowski, G. A.; Shihabuddin, L. S.; Schlossmacher, M. G., Acid beta-glucosidase mutants linked to Gaucher disease, Parkinson disease, and Lewy body dementia alter alpha-synuclein processing. Ann Neurol 2011, 69 (6), 940-53.
35. Dickson, D. W., Chapter 7 Ubiquitinopathies. Blue Books of Neurology 2007, 30, 165-185.
36. Kabashi, E.; Durham, H. D., Failure of protein quality control in amyotrophic lateral sclerosis. Biochimica et Biophysica Acta (BBA) - Molecular Basis of Disease 2006, 1762 (11–12), 1038-1050.
37. Taylor, J. P.; Brown Jr, R. H.; Cleveland, D. W., Decoding ALS: from genes to mechanism. Nature 2016, 539 (7628), 197-206.
38. Auclair, J. R.; Boggio, K. J.; Petsko, G. A.; Ringe, D.; Agar, J. N., Strategies for stabilizing superoxide dismutase (SOD1), the protein destabilized in the most common form of familial amyotrophic lateral sclerosis. Proc Natl Acad Sci U S A 2010, 107 (50), 21394-9.
39. Molnar, K. S.; Karabacak, N. M.; Johnson, J. L.; Wang, Q.; Tiwari, A.; Hayward, L. J.; Coales, S. J.; Hamuro, Y.; Agar, J. N., A Common Property of Amyotrophic Lateral Sclerosis-associated Variants: DESTABILIZATION OF THE COPPER/ZINC SUPEROXIDE DISMUTASE ELECTROSTATIC LOOP. Journal of Biological Chemistry 2009, 284 (45), 30965-30973.
40. Shaw, B. F.; Durazo, A.; Nersissian, A. M.; Whitelegge, J. P.; Faull, K. F.; Valentine, J. S., Local unfolding in a destabilized, pathogenic variant of superoxide dismutase 1 observed with H/D exchange and mass spectrometry. The Journal of biological chemistry 2006, 281 (26), 18167-76.
41. Taylor, D. M.; Gibbs, B. F.; Kabashi, E.; Minotti, S.; Durham, H. D.; Agar, J. N., Tryptophan 32 Potentiates Aggregation and Cytotoxicity of a Copper/Zinc Superoxide Dismutase Mutant Associated with Familial Amyotrophic Lateral Sclerosis. Journal of Biological Chemistry 2007, 282 (22), 16329-16335.
42. Bredesen, D. E.; Ellerby, L. M.; Hart, P. J.; Wiedau-Pazos, M.; Valentine, J. S., Do posttranslational modifications of CuZnSOD lead to sporadic amyotrophic lateral sclerosis? Ann Neurol 1997, 42 (2), 135-7.
43. Ayers, J. I.; Diamond, J.; Sari, A.; Fromholt, S.; Galaleldeen, A.; Ostrow, L. W.; Glass, J. D.; Hart, P. J.; Borchelt, D. R., Distinct conformers of transmissible misfolded SOD1 distinguish human SOD1-FALS from other forms of familial and sporadic ALS. Acta Neuropathol 2016, 132 (6), 827-840.
44. Da Cruz, S.; Bui, A.; Saberi, S.; Lee, S. K.; Stauffer, J.; McAlonis-Downes, M.; Schulte, D.; Pizzo, D. P.; Parone, P. A.; Cleveland, D. W.; Ravits, J., Misfolded SOD1 is not a primary component of sporadic ALS. Acta Neuropathol 2017.


45. Abel, O.; Shatunov, A.; Jones, A. R.; Andersen, P. M.; Powell, J. F.; Al-Chalabi, A., Development of a Smartphone App for a Genetics Website: The Amyotrophic Lateral Sclerosis Online Genetics Database (ALSoD). JMIR mHealth and uHealth 2013, 1 (2), e18.
46. Pasinelli, P.; Brown, R. H., Molecular biology of amyotrophic lateral sclerosis: insights from genetics. Nat Rev Neurosci 2006, 7 (9), 710-723.
47. Siddique, T.; Deng, H. X., Genetics of amyotrophic lateral sclerosis. Hum Mol Genet 1996, 5 Spec No, 1465-70.
48. Shi, Y.; Rhodes, N. R.; Abdolvahabi, A.; Kohn, T.; Cook, N. P.; Marti, A. A.; Shaw, B. F., Deamidation of asparagine to aspartate destabilizes Cu, Zn superoxide dismutase, accelerates fibrillization, and mirrors ALS-linked mutations. J Am Chem Soc 2013, 135 (42), 15897-908.
49. Wilcox, K. C.; Zhou, L.; Jordon, J. K.; Huang, Y.; Yu, Y.; Redler, R. L.; Chen, X.; Caplow, M.; Dokholyan, N. V., Modifications of superoxide dismutase (SOD1) in human erythrocytes: a possible role in amyotrophic lateral sclerosis. The Journal of biological chemistry 2009, 284 (20), 13940-7.
50. Shimizu, T.; Fukuda, H.; Murayama, S.; Izumiyama, N.; Shirasawa, T., Isoaspartate formation at position 23 of amyloid beta peptide enhanced fibril formation and deposited onto senile plaques and vascular in Alzheimer's disease. Journal of neuroscience research 2002, 70 (3), 451-61.
51. Yan, S. D.; Chen, X.; Schmidt, A. M.; Brett, J.; Godman, G.; Zou, Y. S.; Scott, C. W.; Caputo, C.; Frappier, T.; Smith, M. A., Glycated tau protein in Alzheimer disease: a mechanism for induction of oxidant stress. Proceedings of the National Academy of Sciences of the United States of America 1994, 91 (16), 7787-7791.
52. Giasson, B. I.; Duda, J. E.; Murray, I. V.; Chen, Q.; Souza, J. M.; Hurtig, H. I.; Ischiropoulos, H.; Trojanowski, J. Q.; Lee, V. M., Oxidative damage linked to neurodegeneration by selective alpha-synuclein nitration in synucleinopathy lesions. Science 2000, 290 (5493), 985-9.
53. LaVoie, M. J.; Ostaszewski, B. L.; Weihofen, A.; Schlossmacher, M. G.; Selkoe, D. J., Dopamine covalently modifies and functionally inactivates parkin. Nat Med 2005, 11 (11), 1214-21.
54. Mackenzie, I. R.; Bigio, E. H.; Ince, P. G.; Geser, F.; Neumann, M.; Cairns, N. J.; Kwong, L. K.; Forman, M. S.; Ravits, J.; Stewart, H.; Eisen, A.; McClusky, L.; Kretzschmar, H. A.; Monoranu, C. M.; Highley, J. R.; Kirby, J.; Siddique, T.; Shaw, P. J.; Lee, V. M.; Trojanowski, J. Q., Pathological TDP-43 distinguishes sporadic amyotrophic lateral sclerosis from amyotrophic lateral sclerosis with SOD1 mutations. Ann Neurol 2007, 61 (5), 427-34.
55. Kabashi, E.; Valdmanis, P. N.; Dion, P.; Rouleau, G. A., Oxidized/misfolded superoxide dismutase-1: the cause of all amyotrophic lateral sclerosis? Ann Neurol 2007, 62 (6), 553-9.
56. Rakhit, R.; Cunningham, P.; Furtos-Matei, A.; Dahan, S.; Qi, X. F.; Crow, J. P.; Cashman, N. R.; Kondejewski, L. H.; Chakrabartty, A., Oxidation-induced misfolding and aggregation of superoxide dismutase and its implications for amyotrophic lateral sclerosis. The Journal of biological chemistry 2002, 277 (49), 47551-6.
57. Shibata, N.; Hirano, A.; Kobayashi, M.; Sasaki, S.; Kato, T.; Matsumoto, S.; Shiozawa, Z.; Komori, T.; Ikemoto, A.; Umahara, T.; et al., Cu/Zn superoxide dismutase-like immunoreactivity in Lewy body-like inclusions of sporadic amyotrophic lateral sclerosis. Neurosci Lett 1994, 179 (1-2), 149-52.
58. Ihara, Y.; Nukina, N.; Miura, R.; Ogawara, M., Phosphorylated tau protein is integrated into paired helical filaments in Alzheimer's disease. J Biochem 1986, 99 (6), 1807-10.


59. Kosik, K. S.; Joachim, C. L.; Selkoe, D. J., Microtubule-associated protein tau (tau) is a major antigenic component of paired helical filaments in Alzheimer disease. Proc Natl Acad Sci U S A 1986, 83 (11), 4044-8.
60. Shevchenko, A.; Wilm, M.; Vorm, O.; Mann, M., Mass spectrometric sequencing of proteins from silver-stained polyacrylamide gels. Analytical chemistry 1996, 68 (5), 850-8.
61. Kuljanin, M.; Dieters-Castator, D. Z.; Hess, D. A.; Postovit, L.-M.; Lajoie, G. A., Comparison of sample preparation techniques for large-scale proteomics. Proteomics 2017, 17 (1-2), 1600337-n/a.
62. Auclair, J. R.; Salisbury, J. P.; Johnson, J. L.; Petsko, G. A.; Ringe, D.; Bosco, D. A.; Agar, N. Y.; Santagata, S.; Durham, H. D.; Agar, J. N., Artifacts to avoid while taking advantage of top-down mass spectrometry based detection of protein S-thiolation. Proteomics 2014, 14 (10), 1152-7.
63. Zhang, Y.; Fonslow, B. R.; Shan, B.; Baek, M.-C.; Yates, J. R., Protein Analysis by Shotgun/Bottom-up Proteomics. Chemical reviews 2013, 113 (4), 2343-2394.
64. Sung, W. C.; Chang, C. W.; Huang, S. Y.; Wei, T. Y.; Huang, Y. L.; Lin, Y. H.; Chen, H. M.; Chen, S. F., Evaluation of disulfide scrambling during the enzymatic digestion of bevacizumab at various pH values using mass spectrometry. Biochim Biophys Acta 2016, 1864 (9), 1188-94.
65. Zhang, J.; Guy, M. J.; Norman, H. S.; Chen, Y.-C.; Xu, Q.; Dong, X.; Guner, H.; Wang, S.; Kohmoto, T.; Young, K. H.; Moss, R. L.; Ge, Y., Top-Down Quantitative Proteomics Identified Phosphorylation of Cardiac Troponin I as a Candidate Biomarker for Chronic Heart Failure. Journal of Proteome Research 2011, 10 (9), 4054-4065.
66. Li, L.; Karabacak, N. M.; Cobb, J. S.; Wang, Q.; Hong, P.; Agar, J. N., Memory-efficient calculation of the isotopic mass states of a molecule. Rapid Communications in Mass Spectrometry 2010, 24 (18), 2689-2696.
67. Li, L.; Kresh, J. A.; Karabacak, N. M.; Cobb, J. S.; Agar, J. N.; Hong, P., A Hierarchical Algorithm for Calculating the Isotopic Fine Structures of Molecules. Journal of the American Society for Mass Spectrometry 2008, 19 (12), 1867-1874.
68. Liu, Q.; Easterling, M. L.; Agar, J. N., Resolving Isotopic Fine Structure to Detect and Quantify Natural Abundance- and Hydrogen/Deuterium Exchange-Derived Isotopomers. Analytical chemistry 2014, 86 (1), 820-825.
69. Salisbury, J. P.; Liu, Q.; Agar, J. N., QUDeX-MS: hydrogen/deuterium exchange calculation for mass spectra with resolved isotopic fine structure. BMC Bioinformatics 2014, 15 (1), 403.
70. Tiwari, A.; Xu, Z.; Hayward, L. J., Aberrantly Increased Hydrophobicity Shared by Mutants of Cu,Zn-Superoxide Dismutase in Familial Amyotrophic Lateral Sclerosis. Journal of Biological Chemistry 2005, 280 (33), 29771-29779.
71. Wright, G. S.; Antonyuk, S. V.; Kershaw, N. M.; Strange, R. W.; Samar Hasnain, S., Ligand binding and aggregation of pathogenic SOD1. Nat Commun 2013, 4, 1758.
72. Cao, X.; Antonyuk, S. V.; Seetharaman, S. V.; Whitson, L. J.; Taylor, A. B.; Holloway, S. P.; Strange, R. W.; Doucette, P. A.; Valentine, J. S.; Tiwari, A.; Hayward, L. J.; Padua, S.; Cohlberg, J. A.; Hasnain, S. S.; Hart, P. J., Structures of the G85R Variant of SOD1 in Familial Amyotrophic Lateral Sclerosis. The Journal of biological chemistry 2008, 283 (23), 16169-16177.
73. Galaleldeen, A.; Strange, R.; Whitson, L. J.; Antonyuk, S.; Narayana, N.; Taylor, A. B.; Schuermann, J. P.; Holloway, S. P.; Hasnain, S. S.; Hart, P. J., Structural and Biophysical Properties of Metal-Free Pathogenic SOD1 Mutants A4V and G93A. Archives of Biochemistry and Biophysics 2009, 492 (1-2), 40-47.
74. Rakhit, R.; Chakrabartty, A., Structure, folding, and misfolding of Cu,Zn superoxide dismutase in amyotrophic lateral sclerosis. Biochimica et Biophysica Acta (BBA) - Molecular Basis of Disease 2006, 1762 (11–12), 1025-1037.
75. Schmidlin, T.; Kennedy, B. K.; Daggett, V., Structural Changes to Monomeric CuZn Superoxide Dismutase Caused by the Familial Amyotrophic Lateral Sclerosis-Associated Mutation A4V. Biophysical Journal 2009, 97 (6), 1709-1718.
76. Molnar, K. S.; Karabacak, N. M.; Johnson, J. L.; Wang, Q.; Tiwari, A.; Hayward, L. J.; Coales, S. J.; Hamuro, Y.; Agar, J. N., A common property of amyotrophic lateral sclerosis-associated variants: destabilization of the copper/zinc superoxide dismutase electrostatic loop. The Journal of biological chemistry 2009, 284.
77. Wales, T. E.; Engen, J. R., Hydrogen exchange mass spectrometry for the analysis of protein dynamics. Mass Spectrometry Reviews 2006, 25 (1), 158-170.
78. Lindberg, M. J.; Byström, R.; Boknäs, N.; Andersen, P. M.; Oliveberg, M., Systematically perturbed folding patterns of amyotrophic lateral sclerosis (ALS)-associated SOD1 mutants. Proceedings of the National Academy of Sciences of the United States of America 2005, 102 (28), 9754-9759.
79. Valentine, J. S.; Doucette, P. A.; Zittin Potter, S., Copper-zinc superoxide dismutase and amyotrophic lateral sclerosis. Annu Rev Biochem 2005, 74, 563-93.
80. Jha, P.; Ramasundarahettige, C.; Landsman, V.; Rostron, B.; Thun, M.; Anderson, R. N.; McAfee, T.; Peto, R., 21st-Century Hazards of Smoking and Benefits of Cessation in the United States. New England Journal of Medicine 2013, 368 (4), 341-350.
81. Sekhar, A.; Rumfeldt, J. A. O.; Broom, H. R.; Doyle, C. M.; Sobering, R. E.; Meiering, E. M.; Kay, L. E., Probing the free energy landscapes of ALS disease mutants of SOD1 by NMR spectroscopy. Proceedings of the National Academy of Sciences 2016, 113 (45), E6939-E6945.
82. Ma, Q.; Fan, J.-B.; Zhou, Z.; Zhou, B.-R.; Meng, S.-R.; Hu, J.-Y.; Chen, J.; Liang, Y., The Contrasting Effect of Macromolecular Crowding on Amyloid Fibril Formation. PLOS ONE 2012, 7 (4), e36288.
83. Prudencio, M.; Hart, P. J.; Borchelt, D. R.; Andersen, P. M., Variation in aggregation propensities among ALS-associated variants of SOD1: correlation to human disease. Hum Mol Genet 2009, 18 (17), 3217-26.
84. Speretta, E.; Jahn, T. R.; Tartaglia, G. G.; Favrin, G.; Barros, T. P.; Imarisio, S.; Lomas, D. A.; Luheshi, L. M.; Crowther, D. C.; Dobson, C. M., Expression in Drosophila of Tandem Amyloid β Peptides Provides Insights into Links between Aggregation and Neurotoxicity. Journal of Biological Chemistry 2012, 287 (24), 20748-20754.
85. Elam, J. S.; Taylor, A. B.; Strange, R.; Antonyuk, S.; Doucette, P. A.; Rodriguez, J. A.; Hasnain, S. S.; Hayward, L. J.; Valentine, J. S.; Yeates, T. O.; Hart, P. J., Amyloid-like filaments and water-filled nanotubes formed by SOD1 mutant proteins linked to familial ALS. Nat Struct Biol 2003, 10 (6), 461-7.
86. Ray, S. S.; Nowak, R. J.; Strokovich, K.; Brown, R. H., Jr.; Walz, T.; Lansbury, P. T., Jr., An intersubunit disulfide bond prevents in vitro aggregation of a superoxide dismutase-1 mutant linked to familial amytrophic lateral sclerosis. Biochemistry 2004, 43 (17), 4899-905.
87. Khare, S. D.; Caplow, M.; Dokholyan, N. V., FALS mutations in Cu, Zn superoxide dismutase destabilize the dimer and increase dimer dissociation propensity: a large-scale thermodynamic analysis. Amyloid 2006, 13 (4), 226-35.


88. Ivanova, M. I.; Sievers, S. A.; Guenther, E. L.; Johnson, L. M.; Winkler, D. D.; Galaleldeen, A.; Sawaya, M. R.; Hart, P. J.; Eisenberg, D. S., Aggregation-triggering segments of SOD1 fibril formation support a common pathway for familial and sporadic ALS. Proc Natl Acad Sci U S A 2014, 111 (1), 197-201.
89. Harman, D., Aging: A Theory Based on Free Radical and Radiation Chemistry. Journal of Gerontology 1956, 11 (3), 298-300.
90. Andrus, P. K.; Fleck, T. J.; Gurney, M. E.; Hall, E. D., Protein oxidative damage in a transgenic mouse model of familial amyotrophic lateral sclerosis. J Neurochem 1998, 71 (5), 2041-8.
91. Poon, H. F.; Hensley, K.; Thongboonkerd, V.; Merchant, M. L.; Lynn, B. C.; Pierce, W. M.; Klein, J. B.; Calabrese, V.; Butterfield, D. A., Redox proteomics analysis of oxidatively modified proteins in G93A-SOD1 transgenic mice--a model of familial amyotrophic lateral sclerosis. Free Radic Biol Med 2005, 39 (4), 453-62.
92. Davies, K. J., Protein damage and degradation by oxygen radicals. I. general aspects. The Journal of biological chemistry 1987, 262 (20), 9895-901.
93. Zhang, H.; Joseph, J.; Crow, J.; Kalyanaraman, B., Mass spectral evidence for carbonate-anion-radical-induced posttranslational modification of tryptophan to kynurenine in human Cu, Zn superoxide dismutase. Free Radic Biol Med 2004, 37 (12), 2018-26.
94. Coelho, F. R.; Iqbal, A.; Linares, E.; Silva, D. F.; Lima, F. S.; Cuccovia, I. M.; Augusto, O., Oxidation of the tryptophan 32 residue of human superoxide dismutase 1 caused by its bicarbonate-dependent peroxidase activity triggers the non-amyloid aggregation of the enzyme. The Journal of biological chemistry 2014, 289 (44), 30690-701.
95. Cozzolino, M.; Amori, I.; Grazia Pesaresi, M.; Ferri, A.; Nencini, M.; Teresa Carrì, M., Cysteine 111 Affects Aggregation and Cytotoxicity of Mutant Cu,Zn-superoxide Dismutase Associated with Familial Amyotrophic Lateral Sclerosis. The Journal of biological chemistry 2008, 283 (2), 866-874.
96. Nagano, S.; Takahashi, Y.; Yamamoto, K.; Masutani, H.; Fujiwara, N.; Urushitani, M.; Araki, T., A cysteine residue affects the conformational state and neuronal toxicity of mutant SOD1 in mice: relevance to the pathogenesis of ALS. Hum Mol Genet 2015, 24 (12), 3427-39.
97. Roberts, B. L. T.; Patel, K.; Brown, H. H.; Borchelt, D. R., Role of Disulfide Cross-Linking of Mutant SOD1 in the Formation of Inclusion-Body-Like Structures. PLOS ONE 2012, 7 (10), e47838.
98. Urushitani, M.; Ezzi, S. A.; Julien, J. P., Therapeutic effects of immunization with mutant superoxide dismutase in mice models of amyotrophic lateral sclerosis. Proc Natl Acad Sci U S A 2007, 104 (7), 2495-500.
99. Rotunno, M. S.; Auclair, J. R.; Maniatis, S.; Shaffer, S. A.; Agar, J.; Bosco, D. A., Identification of a Misfolded Region in Superoxide Dismutase 1 that is Exposed in Amyotrophic Lateral Sclerosis. Journal of Biological Chemistry 2014.
100. Fujiwara, N.; Nakano, M.; Kato, S.; Yoshihara, D.; Ookawara, T.; Eguchi, H.; Taniguchi, N.; Suzuki, K., Oxidative modification to cysteine sulfonic acid of Cys111 in human copper-zinc superoxide dismutase. The Journal of biological chemistry 2007, 282 (49), 35933-44.
101. Liu, H.; Zhu, H.; Eggers, D. K.; Nersissian, A. M.; Faull, K. F.; Goto, J. J.; Ai, J.; Sanders-Loehr, J.; Gralla, E. B.; Valentine, J. S., Copper(2+) binding to the surface residue cysteine 111 of His46Arg human copper-zinc superoxide dismutase, a familial amyotrophic lateral sclerosis mutant. Biochemistry 2000, 39 (28), 8125-32.


102. Redler, R. L.; Wilcox, K. C.; Proctor, E. A.; Fee, L.; Caplow, M.; Dokholyan, N. V., Glutathionylation at Cys-111 induces dissociation of wild type and FALS mutant SOD1 dimers. Biochemistry 2011, 50 (32), 7057-66.
103. Nakanishi, T.; Kishikawa, M.; Miyazaki, A.; Shimizu, A.; Ogawa, Y.; Sakoda, S.; Ohi, T.; Shoji, H., Simple and defined method to detect the SOD-1 mutants from patients with familial amyotrophic lateral sclerosis by mass spectrometry. J Neurosci Methods 1998, 81 (1-2), 41-4.
104. Auclair, J. R.; Brodkin, H. R.; D'Aquino, J. A.; Petsko, G. A.; Ringe, D.; Agar, J. N., Structural consequences of cysteinylation of Cu/Zn-superoxide dismutase. Biochemistry 2013, 52 (36), 6145-50.
105. Johnson, J. M.; Strobel, F. H.; Reed, M.; Pohl, J.; Jones, D. P., A rapid LC-FTMS method for analysis of cysteine, cystine and cysteine/cystine steady-state redox potential in human plasma. Clinica chimica acta; international journal of clinical chemistry 2008, 396 (1-2), 43-48.
106. Agar, J. N.; Salisbury, J., Tethering cysteine residues using cyclic disulfides. Google Patents: 2016.
107. Isim, S. Targeting Trp32 and Cys111 to Stabilize the ALS-Associated Protein Cu/Zn Superoxide Dismutase. Senior Thesis, Brandeis, 2014.
108. Friesner, R. A.; Murphy, R. B.; Repasky, M. P.; Frye, L. L.; Greenwood, J. R.; Halgren, T. A.; Sanschagrin, P. C.; Mainz, D. T., Extra Precision Glide: Docking and Scoring Incorporating a Model of Hydrophobic Enclosure for Protein−Ligand Complexes. Journal of Medicinal Chemistry 2006, 49 (21), 6177-6196.
109. Antonyuk, S.; Strange, R. W.; Hasnain, S. S., Structural Discovery of Small Molecule Binding Sites in Cu−Zn Human Superoxide Dismutase Familial Amyotrophic Lateral Sclerosis Mutants Provides Insights for Lead Optimization. Journal of Medicinal Chemistry 2010, 53 (3), 1402-1406.
110. Crook, R.; Ellis, R.; Shanks, M.; Thal, L. J.; Perez-Tur, J.; Baker, M.; Hutton, M.; Haltia, T.; Hardy, J.; Galasko, D., Early-onset Alzheimer's disease with a presenilin-1 mutation at the site corresponding to the Volga German presenilin-2 mutation. Ann Neurol 1997, 42 (1), 124-8.
111. Furu, L.; Onuchic, L. F.; Gharavi, A.; Hou, X.; Esquivel, E. L.; Nagasawa, Y.; Bergmann, C.; Senderek, J.; Avner, E.; Zerres, K.; Germino, G. G.; Guay-Woodford, L. M.; Somlo, S., Milder presentation of recessive polycystic kidney disease requires presence of amino acid substitution mutations. J Am Soc Nephrol 2003, 14 (8), 2004-14.
112. Waliany, S.; Das, A. K.; Gaben, A.; Wisniewski, K. E.; Hofmann, S. L., Identification of three novel mutations of the palmitoyl-protein thioesterase-1 (PPT1) gene in children with neuronal ceroid-lipofuscinosis. Hum Mutat 2000, 15 (2), 206-7.
113. Zelnik, N.; Mahajna, M.; Iancu, T. C.; Sharony, R.; Zeigler, M., A novel mutation of the CLN8 gene: is there a Mediterranean phenotype? Pediatr Neurol 2007, 36 (6), 411-3.
114. Levran, O.; Erlich, T.; Magdalena, N.; Gregory, J. J.; Batish, S. D.; Verlander, P. C.; Auerbach, A. D., Sequence variation in the Fanconi anemia gene FAA. Proc Natl Acad Sci U S A 1997, 94 (24), 13051-6.
115. Tassabehji, M.; Newton, V. E.; Liu, X. Z.; Brady, A.; Donnai, D.; Krajewska-Walasek, M.; Murday, V.; Norman, A.; Obersztyn, E.; Reardon, W.; et al., The mutational spectrum in Waardenburg syndrome. Hum Mol Genet 1995, 4 (11), 2131-7.
116. Gottlieb, B.; Trifiro, M.; Lumbroso, R.; Pinsky, L., The androgen receptor gene mutations database. Nucleic Acids Res 1997, 25 (1), 158-62.


117. Abel, O.; Powell, J. F.; Andersen, P. M.; Al-Chalabi, A., ALSoD: A user-friendly online bioinformatics tool for amyotrophic lateral sclerosis genetics. Hum Mutat 2012, 33 (9), 1345-51.
118. Levran, O.; Diotti, R.; Pujara, K.; Batish, S. D.; Hanenberg, H.; Auerbach, A. D., Spectrum of sequence variations in the FANCA gene: an International Fanconi Anemia Registry (IFAR) study. Hum Mutat 2005, 25 (2), 142-9.
119. Bruce, S. E.; Bjarnason, I.; Peters, T. J., Human jejunal transglutaminase: demonstration of activity, enzyme kinetics and substrate specificity with special relation to gliadin and coeliac disease. Clin Sci (Lond) 1985, 68 (5), 573-9.
120. Kim, C. Y.; Quarsten, H.; Bergseng, E.; Khosla, C.; Sollid, L. M., Structural basis for HLA-DQ2-mediated presentation of gluten epitopes in celiac disease. Proc Natl Acad Sci U S A 2004, 101 (12), 4175-9.
121. Molberg, O.; McAdam, S. N.; Korner, R.; Quarsten, H.; Kristiansen, C.; Madsen, L.; Fugger, L.; Scott, H.; Noren, O.; Roepstorff, P.; Lundin, K. E.; Sjostrom, H.; Sollid, L. M., Tissue transglutaminase selectively modifies gliadin peptides that are recognized by gut-derived T cells in celiac disease. Nat Med 1998, 4 (6), 713-7.
122. van de Wal, Y.; Kooy, Y.; van Veelen, P.; Pena, S.; Mearin, L.; Papadopoulos, G.; Koning, F., Selective deamidation by tissue transglutaminase strongly enhances gliadin-specific T cell reactivity. J Immunol 1998, 161 (4), 1585-8.
123. Cooke, W. T.; Smith, W. T., Neurological disorders associated with adult coeliac disease. Brain 1966, 89 (4), 683-722.
124. Hadjivassiliou, M.; Grunewald, R. A.; Chattopadhyay, A. K.; Davies-Jones, G. A.; Gibson, A.; Jarratt, J. A.; Kandler, R. H.; Lobo, A.; Powell, T.; Smith, C. M., Clinical, radiological, neurophysiological, and neuropathological characteristics of gluten ataxia. Lancet 1998, 352 (9140), 1582-5.
125. Hadjivassiliou, M.; Grunewald, R. A.; Kandler, R. H.; Chattopadhyay, A. K.; Jarratt, J. A.; Sanders, D. S.; Sharrack, B.; Wharton, S. B.; Davies-Jones, G. A., Neuropathy associated with gluten sensitivity. J Neurol Neurosurg Psychiatry 2006, 77 (11), 1262-6.
126. Turner, M. R.; Chohan, G.; Quaghebeur, G.; Greenhall, R. C.; Hadjivassiliou, M.; Talbot, K., A case of celiac disease mimicking amyotrophic lateral sclerosis. Nat Clin Pract Neurol 2007, 3 (10), 581-4.
127. Gadoth, A.; Nefussy, B.; Bleiberg, M.; Klein, T.; Artman, I.; Drory, V. E., Transglutaminase 6 antibodies in the serum of patients with amyotrophic lateral sclerosis. JAMA Neurology 2015, 72 (6), 676-681.
128. Fujita, K.; Honda, M.; Hayashi, R.; Ogawa, K.; Ando, M.; Yamauchi, M.; Nagata, Y., Transglutaminase activity in serum and cerebrospinal fluid in sporadic amyotrophic lateral sclerosis: a possible use as an indicator of extent of the motor neuron loss. Journal of the neurological sciences 1998, 158 (1), 53-57.
129. Oono, M.; Okado-Matsumoto, A.; Shodai, A.; Ido, A.; Ohta, Y.; Abe, K.; Ayaki, T.; Ito, H.; Takahashi, R.; Taniguchi, N.; Urushitani, M., Transglutaminase 2 accelerates neuroinflammation in amyotrophic lateral sclerosis through interaction with misfolded superoxide dismutase 1. Journal of Neurochemistry 2014, 128 (3), 403-418.
130. Prince, M. J.; Wimo, A.; Guerchet, M. M.; Ali, G. C.; Wu, Y.-T.; Prina, M., World Alzheimer Report 2015 - The Global Impact of Dementia. Alzheimer's Disease International: 2015.
131. Reiman, E. M., Alzheimer's disease: Attack on amyloid-β protein. Nature 2016, 537 (7618), 36-37.


132. Klingelhoefer, L.; Reichmann, H., Pathogenesis of Parkinson disease - the gut-brain axis and environmental factors. Nat Rev Neurol 2015, 11 (11), 625-636.
133. Aguzzi, A.; Nuvolone, M.; Zhu, C., The immunobiology of prion diseases. Nat Rev Immunol 2013, 13 (12), 888-902.
134. Bendotti, C.; Marino, M.; Cheroni, C.; Fontana, E.; Crippa, V.; Poletti, A.; De Biasi, S., Dysfunction of constitutive and inducible ubiquitin-proteasome system in amyotrophic lateral sclerosis: implication for protein aggregation and immune response. Prog Neurobiol 2012, 97 (2), 101-26.
135. Hong, L.; Huang, H.-C.; Jiang, Z.-F., Relationship between amyloid-beta and the ubiquitin–proteasome system in Alzheimer’s disease. Neurological Research 2014, 36 (3), 276-282.
136. Kabashi, E.; Agar, J. N.; Strong, M. J.; Durham, H. D., Impaired proteasome function in sporadic amyotrophic lateral sclerosis. Amyotroph Lateral Scler 2012, 13 (4), 367-71.
137. McNaught, K. S.; Jackson, T.; JnoBaptiste, R.; Kapustin, A.; Olanow, C. W., Proteasomal dysfunction in sporadic Parkinson's disease. Neurology 2006, 66 (10 Suppl 4), S37-49.
138. Trumbull, K. A.; Beckman, J. S., A Role for Copper in the Toxicity of Zinc-Deficient Superoxide Dismutase to Motor Neurons in Amyotrophic Lateral Sclerosis. Antioxidants & Redox Signaling 2009, 11 (7), 1627-1639.
139. Estevez, A. G.; Crow, J. P.; Sampson, J. B.; Reiter, C.; Zhuang, Y.; Richardson, G. J.; Tarpey, M. M.; Barbeito, L.; Beckman, J. S., Induction of nitric oxide-dependent apoptosis in motor neurons by zinc-deficient superoxide dismutase. Science 1999, 286 (5449), 2498-500.
140. Hilton, J. B.; White, A. R.; Crouch, P. J., Metal-deficient SOD1 in amyotrophic lateral sclerosis. Journal of Molecular Medicine (Berlin, Germany) 2015, 93 (5), 481-487.
141. Soon, C. P.; Donnelly, P. S.; Turner, B. J.; Hung, L. W.; Crouch, P. J.; Sherratt, N. A.; Tan, J. L.; Lim, N. K.; Lam, L.; Bica, L.; Lim, S.; Hickey, J. L.; Morizzi, J.; Powell, A.; Finkelstein, D. I.; Culvenor, J. G.; Masters, C. L.; Duce, J.; White, A. R.; Barnham, K. J.; Li, Q. X., Diacetylbis(N(4)-methylthiosemicarbazonato) copper(II) (CuII(atsm)) protects against peroxynitrite-induced nitrosative damage and prolongs survival in amyotrophic lateral sclerosis mouse model. The Journal of biological chemistry 2011, 286 (51), 44035-44.
142. Williams, J. R.; Trias, E.; Beilby, P. R.; Lopez, N. I.; Labut, E. M.; Bradford, C. S.; Roberts, B. R.; McAllum, E. J.; Crouch, P. J.; Rhoads, T. W.; Pereira, C.; Son, M.; Elliott, J. L.; Franco, M. C.; Estevez, A. G.; Barbeito, L.; Beckman, J. S., Copper delivery to the CNS by CuATSM effectively treats motor neuron disease in SOD(G93A) mice co-expressing the Copper-Chaperone-for-SOD. Neurobiol Dis 2016, 89, 1-9.
143. Ayers, J. I.; Fromholt, S. E.; O’Neal, V. M.; Diamond, J. H.; Borchelt, D. R., Prion-like propagation of mutant SOD1 misfolding and motor neuron disease spread along neuroanatomical pathways. Acta Neuropathologica 2016, 131 (1), 103-114.
144. Rakhit, R.; Robertson, J.; Vande Velde, C.; Horne, P.; Ruth, D. M.; Griffin, J.; Cleveland, D. W.; Cashman, N. R.; Chakrabartty, A., An immunological epitope selective for pathological monomer-misfolded SOD1 in ALS. Nat Med 2007, 13 (6), 754-9.
145. Parakh, S.; Atkin, J. D., Protein folding alterations in amyotrophic lateral sclerosis. Brain Res 2016, 1648 (Pt B), 633-649.
146. van Blitterswijk, M.; Gulati, S.; Smoot, E.; Jaffa, M.; Maher, N.; Hyman, B. T.; Ivinson, A. J.; Scherzer, C. R.; Schoenfeld, D. A.; Cudkowicz, M. E.; Brown, R. H., Jr.; Bosco, D. A., Anti-superoxide dismutase antibodies are associated with survival in patients with sporadic amyotrophic lateral sclerosis. Amyotroph Lateral Scler 2011, 12 (6), 430-8.
147. Liu, H. N.; Tjostheim, S.; Dasilva, K.; Taylor, D.; Zhao, B.; Rakhit, R.; Brown, M.; Chakrabartty, A.; McLaurin, J.; Robertson, J., Targeting of monomer/misfolded SOD1 as a therapeutic strategy for amyotrophic lateral sclerosis. J Neurosci 2012, 32 (26), 8791-9.
148. Broering, T. J.; Wang, H.; Boatright, N. K.; Wang, Y.; Baptista, K.; Shayan, G.; Garrity, K. A.; Kayatekin, C.; Bosco, D. A.; Matthews, C. R.; Ambrosino, D. M.; Xu, Z.; Babcock, G. J., Identification of human monoclonal antibodies specific for human SOD1 recognizing distinct epitopes and forms of SOD1. PLoS One 2013, 8 (4), e61210.
149. Takata, I.; Kawamura, N.; Myint, T.; Miyazawa, N.; Suzuki, K.; Maruyama, N.; Mino, M.; Taniguchi, N., Glycated Cu,Zn-Superoxide Dismutase in Rat lenses: Evidence for the Presence of Fragmentation in Vivo. Biochem Biophys Res Commun 1996, 219 (1), 243-248.
150. Choi, J.; Rees, H. D.; Weintraub, S. T.; Levey, A. I.; Chin, L. S.; Li, L., Oxidative modifications and aggregation of Cu,Zn-superoxide dismutase associated with Alzheimer and Parkinson diseases. The Journal of biological chemistry 2005, 280 (12), 11648-55.
151. Guareschi, S.; Cova, E.; Cereda, C.; Ceroni, M.; Donetti, E.; Bosco, D. A.; Trotti, D.; Pasinelli, P., An over-oxidized form of superoxide dismutase found in sporadic amyotrophic lateral sclerosis with bulbar onset shares a toxic mechanism with mutant SOD1. Proc Natl Acad Sci U S A 2012, 109 (13), 5074-9.
152. Shibata, N.; Hirano, A.; Kobayashi, M.; Siddique, T.; Deng, H. X.; Hung, W. Y.; Kato, T.; Asayama, K., Intense superoxide dismutase-1 immunoreactivity in intracytoplasmic hyaline inclusions of familial amyotrophic lateral sclerosis with posterior column involvement. J Neuropathol Exp Neurol 1996, 55 (4), 481-90.
153. Chou, S. M.; Wang, H. S.; Komai, K., Colocalization of NOS and SOD1 in neurofilament accumulation within motor neurons of amyotrophic lateral sclerosis: an immunohistochemical study. J Chem Neuroanat 1996, 10 (3-4), 249-58.
154. Rotunno, M. S.; Bosco, D. A., An emerging role for misfolded wild-type SOD1 in sporadic ALS pathogenesis. Front Cell Neurosci 2013, 7, 253.
155. Paré, B.; Lehmann, M.; Beaudin, M.; Nordström, U.; Saikali, S.; Julien, J.-P.; Gilthorpe, J. D.; Marklund, S. L.; Cashman, N. R.; Andersen, P. M.; Forsberg, K.; Dupré, N.; Gould, P.; Brännström, T.; Gros-Louis, F., Misfolded SOD1 pathology in sporadic Amyotrophic Lateral Sclerosis. Scientific Reports 2018, 8 (1), 14223.
156. Shinder, G. A.; Lacourse, M. C.; Minotti, S.; Durham, H. D., Mutant Cu/Zn-superoxide dismutase proteins have altered solubility and interact with heat shock/stress proteins in models of amyotrophic lateral sclerosis. The Journal of biological chemistry 2001, 276 (16), 12791-6.
157. Okado-Matsumoto, A.; Guan, Z.; Fridovich, I., Modification of Cysteine 111 in human Cu,Zn-superoxide dismutase. Free Radical Biology and Medicine 2006, 41 (12), 1837-1846.
158. Furukawa, Y.; O'Halloran, T. V., Posttranslational modifications in Cu,Zn-superoxide dismutase and mutations associated with amyotrophic lateral sclerosis. Antioxidants & redox signaling 2006, 8 (5-6), 847-867.
159. Stathopulos, P. B.; Rumfeldt, J. A.; Scholz, G. A.; Irani, R. A.; Frey, H. E.; Hallewell, R. A.; Lepock, J. R.; Meiering, E. M., Cu/Zn superoxide dismutase mutants associated with amyotrophic lateral sclerosis show enhanced formation of aggregates in vitro. Proc Natl Acad Sci U S A 2003, 100 (12), 7021-6.

184

160. Kayatekin, C.; Cohen, N. R.; Matthews, C. R., Enthalpic barriers dominate the folding and unfolding of the human Cu, Zn superoxide dismutase monomer. J Mol Biol 2012, 424 (3-4), 192-202.
161. Lepock, J. R.; Arnold, L. D.; Torrie, B. H.; Andrews, B.; Kruuv, J., Structural analyses of various Cu2+, Zn2+-superoxide dismutases by differential scanning calorimetry and Raman spectroscopy. Archives of Biochemistry and Biophysics 1985, 241 (1), 243-251.
162. Aebersold, R. H.; Leavitt, J.; Saavedra, R. A.; Hood, L. E.; Kent, S. B., Internal amino acid sequence analysis of proteins separated by one- or two-dimensional gel electrophoresis after in situ protease digestion on nitrocellulose. Proceedings of the National Academy of Sciences 1987, 84 (20), 6970-6974.
163. Schneider, C. A.; Rasband, W. S.; Eliceiri, K. W., NIH Image to ImageJ: 25 years of image analysis. Nature methods 2012, 9, 671.
164. Aebersold, R.; Mann, M., Mass spectrometry-based proteomics. Nature 2003, 422 (6928), 198-207.
165. Shevchenko, A.; Wilm, M.; Vorm, O.; Mann, M., Mass Spectrometric Sequencing of Proteins from Silver-Stained Polyacrylamide Gels. Analytical chemistry 1996, 68 (5), 850-858.
166. Strader, M. B.; VerBerkmoes, N. C.; Tabb, D. L.; Connelly, H. M.; Barton, J. W.; Bruce, B. D.; Pelletier, D. A.; Davison, B. H.; Hettich, R. L.; Larimer, F. W.; Hurst, G. B., Characterization of the 70S Ribosome from Rhodopseudomonas palustris Using an Integrated “Top-Down” and “Bottom-Up” Mass Spectrometric Approach. Journal of Proteome Research 2004, 3 (5), 965-978.
167. Donnelly, D. P.; Rawlins, C. M.; DeHart, C. J.; Fornelli, L.; Schachner, L. F.; Lin, Z.; Lippens, J. L.; Aluri, K. C.; Sarin, R.; Chen, B.; Lantz, C.; Jung, W.; Johnson, K. R.; Koller, A.; Wolff, J. J.; Campuzano, I. D. G.; Auclair, J. R.; Ivanov, A. R.; Whitelegge, J. P.; Pasa-Tolic, L.; Chamot-Rooke, J.; Danis, P. O.; Smith, L. M.; Tsybin, Y. O.; Loo, J. A.; Ge, Y.; Kelleher, N. L.; Agar, J. N., Best practices and benchmarks for intact protein analysis for top-down mass spectrometry. Nature methods 2019, 16 (7), 587-594.
168. Reid, G. E.; Wu, J.; Chrisman, P. A.; Wells, J. M.; McLuckey, S. A., Charge-State-Dependent Sequence Analysis of Protonated Ubiquitin Ions via Ion Trap Tandem Mass Spectrometry. Analytical chemistry 2001, 73 (14), 3274-3281.
169. Perkins, D. N.; Pappin, D. J.; Creasy, D. M.; Cottrell, J. S., Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20 (18), 3551-67.
170. Dongré, A. R.; Jones, J. L.; Somogyi, Á.; Wysocki, V. H., Influence of Peptide Composition, Gas-Phase Basicity, and Chemical Modification on Fragmentation Efficiency: Evidence for the Mobile Proton Model. Journal of the American Chemical Society 1996, 118 (35), 8365-8374.
171. Huang, Y.; Tseng, G. C.; Yuan, S.; Pasa-Tolic, L.; Lipton, M. S.; Smith, R. D.; Wysocki, V. H., A data-mining scheme for identifying peptide structural motifs responsible for different MS/MS fragmentation intensity patterns. Journal of proteome research 2008, 7 (1), 70-79.
172. Tabb, D. L.; Huang, Y.; Wysocki, V. H.; Yates, J. R., 3rd, Influence of basic residue content on fragment ion peak intensities in low-energy collision-induced dissociation spectra of peptides. Analytical chemistry 2004, 76 (5), 1243-1248.
173. Tsaprailis, G.; Nair, H.; Somogyi, Á.; Wysocki, V. H.; Zhong, W.; Futrell, J. H.; Summerfield, S. G.; Gaskell, S. J., Influence of Secondary Structure on the Fragmentation of Protonated Peptides. Journal of the American Chemical Society 1999, 121 (22), 5142-5154.
174. Wysocki, V. H.; Tsaprailis, G.; Smith, L. L.; Breci, L. A., Mobile and localized protons: a framework for understanding peptide dissociation. Journal of mass spectrometry : JMS 2000, 35 (12), 1399-406.
175. Haverland, N. A.; Skinner, O. S.; Fellers, R. T.; Tariq, A. A.; Early, B. P.; LeDuc, R. D.; Fornelli, L.; Compton, P. D.; Kelleher, N. L., Defining Gas-Phase Fragmentation Propensities of Intact Proteins During Native Top-Down Mass Spectrometry. J Am Soc Mass Spectrom 2017, 28 (6), 1203-1215.
176. Li, H.; Nguyen, H. H.; Ogorzalek Loo, R. R.; Campuzano, I. D. G.; Loo, J. A., An integrated native mass spectrometry and top-down proteomics method that connects sequence to structure and function of macromolecular complexes. Nature Chemistry 2018, 10 (2), 139-148.
177. Floris, F.; Chiron, L.; Lynch, A. M.; Barrow, M. P.; Delsuc, M.-A.; O'Connor, P. B., Application of Tandem Two-Dimensional Mass Spectrometry for Top-Down Deep Sequencing of Calmodulin. Journal of the American Society for Mass Spectrometry 2018, 29 (8), 1700-1705.
178. Srzentic, K.; Nagornov, K. O.; Fornelli, L.; Lobas, A. A.; Ayoub, D.; Kozhinov, A. N.; Gasilova, N.; Menin, L.; Beck, A.; Gorshkov, M. V.; Aizikov, K.; Tsybin, Y. O., Multiplexed Middle-Down Mass Spectrometry as a Method for Revealing Light and Heavy Chain Connectivity in a Monoclonal Antibody. Analytical chemistry 2018, 90 (21), 12527-12535.
179. Griaud, F.; Denefeld, B.; Kao-Scharf, C. Y.; Dayer, J.; Lang, M.; Chen, J. Y.; Berg, M., All Ion Differential Analysis Refines the Detection of Terminal and Internal Diagnostic Fragment Ions for the Characterization of Biologics Product-Related Variants and Impurities by Middle-down Mass Spectrometry. Analytical chemistry 2019, 91 (14), 8845-8852.
180. Savaryn, J. P.; Skinner, O. S.; Fornelli, L.; Fellers, R. T.; Compton, P. D.; Terhune, S. S.; Abecassis, M. M.; Kelleher, N. L., Targeted analysis of recombinant NF kappa B (RelA/p65) by denaturing and native top down mass spectrometry. Journal of proteomics 2016, 134, 76-84.
181. Rush, M. J. P.; Riley, N. M.; Westphall, M. S.; Coon, J. J., Top-Down Characterization of Proteins with Intact Disulfide Bonds Using Activated-Ion Electron Transfer Dissociation. Analytical chemistry 2018, 90 (15), 8946-8953.
182. Shaw, J. B.; Li, W.; Holden, D. D.; Zhang, Y.; Griep-Raming, J.; Fellers, R. T.; Early, B. P.; Thomas, P. M.; Kelleher, N. L.; Brodbelt, J. S., Complete Protein Characterization Using Top-Down Mass Spectrometry and Ultraviolet Photodissociation. Journal of the American Chemical Society 2013, 135 (34), 12646-12651.
183. Muhammad, Z.; Carter, L.; Taylor, P.; Janine, F.; Wonhyuek, J.; Rachel, R. O. L.; Joseph A., L., Internal Fragments Generated by Electron Ionization Dissociation Enhances Protein Top-down Mass Spectrometry. 2020.
184. Lyon, Y. A.; Riggs, D.; Fornelli, L.; Compton, P. D.; Julian, R. R., The Ups and Downs of Repeated Cleavage and Internal Fragment Production in Top-Down Proteomics. Journal of The American Society for Mass Spectrometry 2018, 29 (1), 150-157.
185. Smith, L. M.; Thomas, P. M.; Shortreed, M. R.; Schaffer, L. V.; Fellers, R. T.; LeDuc, R. D.; Tucholski, T.; Ge, Y.; Agar, J. N.; Anderson, L. C.; Chamot-Rooke, J.; Gault, J.; Loo, J. A.; Paša-Tolić, L.; Robinson, C. V.; Schlüter, H.; Tsybin, Y. O.; Vilaseca, M.; Vizcaíno, J. A.; Danis, P. O.; Kelleher, N. L., A five-level classification system for proteoform identifications. Nature methods 2019, 16 (10), 939-940.
186. Xiao, K.; Yu, F.; Fang, H.; Xue, B.; Liu, Y.; Li, Y.; Tian, Z., Are neutral loss and internal product ions useful for top-down protein identification? Journal of proteomics 2017, 160, 21-27.
187. Kou, Q.; Xun, L.; Liu, X., TopPIC: a software tool for top-down mass spectrometry-based proteoform identification and characterization. Bioinformatics (Oxford, England) 2016, 32 (22), 3495-3497.
188. Guner, H.; Close, P. L.; Cai, W.; Zhang, H.; Peng, Y.; Gregorich, Z. R.; Ge, Y., MASH Suite: a user-friendly and versatile software interface for high-resolution mass spectrometry data interpretation and visualization. Journal of the American Society for Mass Spectrometry 2014, 25 (3), 464-470.
189. Park, J.; Piehowski, P. D.; Wilkins, C.; Zhou, M.; Mendoza, J.; Fujimoto, G. M.; Gibbons, B. C.; Shaw, J. B.; Shen, Y.; Shukla, A. K.; Moore, R. J.; Liu, T.; Petyuk, V. A.; Tolić, N.; Paša-Tolić, L.; Smith, R. D.; Payne, S. H.; Kim, S., Informed-Proteomics: open-source software package for top-down proteomics. Nature methods 2017, 14 (9), 909-914.
190. Chen, B.; Brown, K. A.; Lin, Z.; Ge, Y., Top-Down Proteomics: Ready for Prime Time? Analytical chemistry 2018, 90 (1), 110-127.
191. Schaffer, L. V.; Millikin, R. J.; Miller, R. M.; Anderson, L. C.; Fellers, R. T.; Ge, Y.; Kelleher, N. L.; LeDuc, R. D.; Liu, X.; Payne, S. H.; Sun, L.; Thomas, P. M.; Tucholski, T.; Wang, Z.; Wu, S.; Wu, Z.; Yu, D.; Shortreed, M. R.; Smith, L. M., Identification and Quantification of Proteoforms by Mass Spectrometry. Proteomics 2019, 19 (10), e1800361-e1800361.
192. http://www.cheminfo.org/Spectra/Mass/Isotopic_mass_generator_with_peptides/index.html#.
193. Doucette, P. A.; Whitson, L. J.; Cao, X.; Schirf, V.; Demeler, B.; Valentine, J. S.; Hansen, J. C.; Hart, P. J., Dissociation of human copper-zinc superoxide dismutase dimers using chaotrope and reductant. Insights into the molecular basis for dimer stability. The Journal of biological chemistry 2004, 279 (52), 54558-66.
194. Matsumoto, A.; Okada, Y.; Nakamichi, M.; Nakamura, M.; Toyama, Y.; Sobue, G.; Nagai, M.; Aoki, M.; Itoyama, Y.; Okano, H., Disease progression of human SOD1 (G93A) transgenic ALS model rats. Journal of neuroscience research 2006, 83 (1), 119-33.
195. Vinsant, S.; Mansfield, C.; Jimenez-Moreno, R.; Del Gaizo Moore, V.; Yoshikawa, M.; Hampton, T. G.; Prevette, D.; Caress, J.; Oppenheim, R. W.; Milligan, C., Characterization of early pathogenesis in the SOD1(G93A) mouse model of ALS: part II, results and discussion. Brain and behavior 2013, 3 (4), 431-57.
196. Rakhit, R.; Chakrabartty, A., Structure, folding, and misfolding of Cu,Zn superoxide dismutase in amyotrophic lateral sclerosis. Biochimica et Biophysica Acta (BBA) - Molecular Basis of Disease 2006, 1762 (11), 1025-1037.
197. Schmitt, N. D.; Rawlins, C. M.; Randall, E. C.; Wang, X.; Koller, A.; Auclair, J. R.; Kowalski, J.-M.; Kowalski, P. J.; Luther, E.; Ivanov, A. R.; Agar, N. Y. R.; Agar, J. N., Genetically Encoded Fluorescent Proteins Enable High-Throughput Assignment of Cell Cohorts Directly from MALDI-MS Images. Analytical chemistry 2019, 91 (6), 3810-3817.
198. Armbrecht, L.; Dittrich, P. S., Recent Advances in the Analysis of Single Cells. Analytical chemistry 2017, 89 (1), 2-21.
199. Galler, K.; Brautigam, K.; Grosse, C.; Popp, J.; Neugebauer, U., Making a big thing of a small cell - recent advances in single cell analysis. The Analyst 2014, 139 (6), 1237-1273.
200. Wang, Z.; Gerstein, M.; Snyder, M., RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics 2009, 10, 57.
201. Altelaar, A. F. M.; Heck, A. J. R., Trends in ultrasensitive proteomics. Current opinion in chemical biology 2012, 16 (1), 206-213.
202. Bandura, D. R.; Baranov, V. I.; Ornatsky, O. I.; Antonov, A.; Kinach, R.; Lou, X.; Pavlov, S.; Vorobiev, S.; Dick, J. E.; Tanner, S. D., Mass cytometry: technique for real time single cell multitarget immunoassay based on inductively coupled plasma time-of-flight mass spectrometry. Analytical chemistry 2009, 81 (16), 6813-22.
203. Holzlechner, M.; Strasser, K.; Zareva, E.; Steinhauser, L.; Birnleitner, H.; Beer, A.; Bergmann, M.; Oehler, R.; Marchetti-Deschmann, M., In Situ Characterization of Tissue-Resident Immune Cells by MALDI Mass Spectrometry Imaging. Journal of proteome research 2017, 16 (1), 65-76.
204. Yamanaka, K.; Chun, S. J.; Boillee, S.; Fujimori-Tonou, N.; Yamashita, H.; Gutmann, D. H.; Takahashi, R.; Misawa, H.; Cleveland, D. W., Astrocytes as determinants of disease progression in inherited amyotrophic lateral sclerosis. Nature neuroscience 2008, 11 (3), 251-3.
205. Zhang, L.; Vertes, A., Single-Cell Mass Spectrometry Approaches to Explore Cellular Heterogeneity. Angewandte Chemie 2018, 57 (17), 4466-4477.
206. Qi, M.; Philip, M. C.; Yang, N.; Sweedler, J. V., Single Cell Neurometabolomics. ACS chemical neuroscience 2018, 9 (1), 40.
207. Winograd, N., Gas Cluster Ion Beams for Secondary Ion Mass Spectrometry. Annual review of analytical chemistry 2016.
208. McDonnell, L. A.; Piersma, S. R.; Altelaar, A. F. M.; Mize, T. H.; Luxembourg, S. L.; Verhaert, P. D. E. M.; van Minnen, J.; Heeren, R. M. A., Subcellular imaging mass spectrometry of brain tissue. Journal of Mass Spectrometry 2005, 40 (2), 160-168.
209. Giesen, C.; Wang, H. A.; Schapiro, D.; Zivanovic, N.; Jacobs, A.; Hattendorf, B.; Schuffler, P. J.; Grolimund, D.; Buhmann, J. M.; Brandt, S.; Varga, Z.; Wild, P. J.; Gunther, D.; Bodenmiller, B., Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry. Nature methods 2014, 11 (4), 417-22.
210. Gode, D.; Volmer, D. A., Lipid imaging by mass spectrometry - a review. The Analyst 2013, 138 (5), 1289-1315.
211. Chughtai, K.; Heeren, R., Mass Spectrometric Imaging for Biomedical Tissue Analysis. In Chem. Rev., 2010; Vol. 110, pp 3237-3277.
212. Körsgen, M.; Pelster, A.; Dreisewerd, K.; Arlinghaus, H., 3D ToF-SIMS Analysis of Peptide Incorporation into MALDI Matrix Crystals with Sub-micrometer Resolution. The official journal of The American Society for Mass Spectrometry 2016, 27 (2), 277-284.
213. Duan-Sun, Z.; Valeria, P.; Benjamin, J. P.; Agnieszka, K. R.; Poczatek, J. C.; Mei, W.; Haydn, M. P.; James, M. E.; David, P. C.; Claude, P. L., Multi-isotope imaging mass spectrometry reveals slow protein turnover in hair-cell stereocilia. Nature 2012, 481 (7382), 520.
214. Kraft, M. L.; Weber, P. K.; Longo, M. L.; Hutcheon, I. D.; Boxer, S. G., Phase separation of lipid membranes analyzed with high-resolution secondary ion mass spectrometry. Science (New York, N.Y.) 2006, 313 (5795), 1948.
215. Lovrić, J.; Dunevall, J.; Larsson, A.; Ren, L.; Andersson, S.; Meibom, A.; Malmberg, P.; Kurczy, M. E.; Ewing, A. G., Nano Secondary Ion Mass Spectrometry Imaging of Dopamine Distribution Across Nanometer Vesicles. ACS nano 2017, 11 (4), 3446.
216. Monroe, E. B.; Jurchen, J. C.; Lee, J.; Rubakhin, S. S.; Sweedler, J. V., Vitamin E Imaging and Localization in the Neuronal Membrane. Journal of the American Chemical Society 2005, 127 (35), 12152-12153.
217. Passarelli, M. K.; Pirkl, A.; Moellers, R.; Grinfeld, D.; Kollmer, F.; Havelund, R.; Newman, C. F.; Marshall, P. S.; Arlinghaus, H.; Alexander, M. R.; West, A.; Horning, S.; Niehuis, E.; Makarov, A.; Dollery, C. T.; Gilmore, I. S., The 3D OrbiSIMS-label-free metabolic imaging with subcellular lateral resolution and high mass-resolving power. Nature methods 2017, 14 (12), 1175-1183.
218. Sheng, L.; Cai, L.; Wang, J.; Li, Z.; Mo, Y.; Zhang, S.; Xu, J.-J.; Zhang, X.; Chen, H.-Y., Simultaneous imaging of newly synthesized proteins and lipids in single cell by TOF-SIMS. Int J Mass Spectrom 2017, 421, 238-244.
219. Sogawa, K.; Watanabe, M.; Sato, K.; Segawa, S.; Ishii, C.; Miyabe, A.; Murata, S.; Saito, T.; Nomura, F., Use of the MALDI BioTyper system with MALDI-TOF mass spectrometry for rapid identification of microorganisms. Analytical and bioanalytical chemistry 2011, 400 (7), 1905-11.
220. Dilillo, M.; Ait-Belkacem, R.; Esteve, C.; Pellegrini, D.; Nicolardi, S.; Costa, M.; Vannini, E.; Graaf, E. L. d.; Caleo, M.; McDonnell, L. A., Ultra-High Mass Resolution MALDI Imaging Mass Spectrometry of Proteins and Metabolites in a Mouse Model of Glioblastoma. Scientific reports 2017, 7 (1), 603.
221. Berkenkamp, S.; Kirpekar, F.; Hillenkamp, F., Infrared MALDI Mass Spectrometry of Large Nucleic Acids. Science 1998, 281 (5374), 260-262.
222. Mohammadi, A. S.; Phan, N. T.; Fletcher, J. S.; Ewing, A. G., Intact lipid imaging of mouse brain samples: MALDI, nanoparticle-laser desorption ionization, and 40 keV argon cluster secondary ion mass spectrometry. Analytical and bioanalytical chemistry 2016, 408 (24), 6857-68.
223. Heijs, B.; Tolner, E. A.; Bovee, J. V.; van den Maagdenberg, A. M.; McDonnell, L. A., Brain region-specific dynamics of on-tissue protein digestion using MALDI Mass Spectrometry Imaging. Journal of proteome research 2015.
224. Zavalin, A.; Yang, J.; Hayden, K.; Vestal, M.; Caprioli, R. M., Tissue protein imaging at 1 μm laser spot diameter for high spatial resolution and high imaging speed using transmission geometry MALDI TOF MS. Analytical and bioanalytical chemistry 2015, 407 (8), 2337-42.
225. Anderson, D. M.; Carolan, V. A.; Crosland, S.; Sharples, K. R.; Clench, M. R., Examination of the distribution of nicosulfuron in sunflower plants by matrix-assisted laser desorption/ionisation mass spectrometry imaging. Rapid communications in mass spectrometry : RCM 2009, 23 (9), 1321-7.
226. Porta, T.; Grivet, C.; Kraemer, T.; Varesio, E.; Hopfgartner, G., Single hair cocaine consumption monitoring by mass spectrometric imaging. Analytical chemistry 2011, 83 (11), 4266-72.
227. Lanni, E. J.; Masyuko, R. N.; Driscoll, C. M.; Aerts, J. T.; Shrout, J. D.; Bohn, P. W.; Sweedler, J. V., MALDI-guided SIMS: Multiscale Imaging of Metabolites in Bacterial Biofilms. Analytical chemistry 2014, 86 (18), 9139-9145.
228. Rauser, S.; Marquardt, C.; Balluff, B.; Deininger, S. O.; Albers, C.; Belau, E.; Hartmer, R.; Suckau, D.; Specht, K.; Ebert, M. P.; Schmitt, M.; Aubele, M.; Hofler, H.; Walch, A., Classification of HER2 receptor status in breast cancer tissues by MALDI imaging mass spectrometry. Journal of proteome research 2010, 9 (4), 1854-63.
229. Eberlin, L. S.; Norton, I.; Dill, A. L.; Golby, A. J.; Ligon, K. L.; Santagata, S.; Cooks, R. G.; Agar, N. Y., Classifying human brain tumors by lipid imaging with mass spectrometry. Cancer research 2012, 72 (3), 645-54.
230. Groseclose, M. R.; Castellino, S., A mimetic tissue model for the quantification of drug distributions by MALDI imaging mass spectrometry. Analytical chemistry 2013, 85 (21), 10099-106.
231. Jonas, O.; Calligaris, D.; Methuku, K. R.; Poe, M. M.; Francois, J. P.; Tranghese, F.; Changelian, A.; Sieghart, W.; Ernst, M.; Krummel, D. A.; Cook, J. M.; Pomeroy, S. L.; Cima, M.; Agar, N. Y.; Langer, R.; Sengupta, S., First In Vivo Testing of Compounds Targeting Group 3 Medulloblastomas Using an Implantable Microdevice as a New Paradigm for Drug Development. Journal of biomedical nanotechnology 2016, 12 (6), 1297-302.
232. Kim, A. J.; Basu, S.; Glass, C.; Ross, E. L.; Agar, N.; He, Q.; Calligaris, D., Unique Intradural Inflammatory Mass Containing Precipitated Morphine: Confirmatory Analysis by LESA-MS and MALDI-MS. Pain Practice 0 (0).
233. Spraggins, J. M.; Rizzo, D. G.; Moore, J. L.; Noto, M. J.; Skaar, E. P.; Caprioli, R. M., Next-generation technologies for spatial proteomics: Integrating ultra-high speed MALDI-TOF and high mass resolution MALDI FTICR imaging mass spectrometry for protein analysis. Proteomics 2016, 16 (11-12), 1678-1689.
234. Rawlins, C. M.; Salisbury, J. P.; Feldman, D. R.; Isim, S.; Agar, N. Y.; Luther, E.; Agar, J. N., Imaging and Mapping of Tissue Constituents at the Single-Cell Level Using MALDI MSI and Quantitative Laser Scanning Cytometry. Methods in molecular biology 2015, 1346, 133-49.
235. Longuespee, R.; Alberts, D.; Pottier, C.; Smargiasso, N.; Mazzucchelli, G.; Baiwir, D.; Kriegsmann, M.; Herfs, M.; Kriegsmann, J.; Delvenne, P.; De Pauw, E., A laser microdissection-based workflow for FFPE tissue microproteomics: Important considerations for small sample processing. Methods 2016, 104, 154-162.
236. Zimmerman, T. A.; Rubakhin, S. S.; Romanova, E. V.; Tucker, K. R.; Sweedler, J. V., MALDI Mass Spectrometric Imaging Using the Stretched Sample Method to Reveal Neuropeptide Distributions in Aplysia Nervous Tissue. Analytical chemistry 2009, 81 (22), 9402-9409.
237. Boggio, K. J.; Obasuyi, E.; Sugino, K.; Nelson, S. B.; Agar, N. Y.; Agar, J. N., Recent advances in single-cell MALDI mass spectrometry imaging and potential clinical impact. Expert review of proteomics 2011, 8 (5), 591-604.
238. Wiegelmann, M.; Dreisewerd, K.; Soltwisch, J., Influence of the Laser Spot Size, Focal Beam Profile, and Tissue Type on the Lipid Signals Obtained by MALDI-MS Imaging in Oversampling Mode. Journal of the American Society for Mass Spectrometry 2016, 27 (12), 1952-1964.
239. Kompauer, M.; Heiles, S.; Spengler, B., Atmospheric pressure MALDI mass spectrometry imaging of tissues and cells at 1.4-μm lateral resolution. Nature methods 2017, 14 (1), 90.
240. Jansson, E. T.; Comi, T. J.; Rubakhin, S. S.; Sweedler, J. V., Single Cell Peptide Heterogeneity of Rat Islets of Langerhans. ACS chemical biology 2016, 11 (9), 2588-2595.
241. Rabe, J. H.; D, A. S.; Schulz, S.; Munteanu, B.; Ott, M.; Ochs, K.; Hohenberger, P.; Marx, A.; Platten, M.; Opitz, C. A.; Ory, D. S.; Hopf, C., Fourier Transform Infrared Microscopy Enables Guidance of Automated Mass Spectrometry Imaging to Predefined Tissue Morphologies. Scientific reports 2018, 8 (1), 313.
242. Comi, T. J.; Neumann, E. K.; Do, T. D.; Sweedler, J. V., microMS: A Python Platform for Image-Guided Mass Spectrometry Profiling. Journal of the American Society for Mass Spectrometry 2017, 28 (9), 1919-1928.
243. Abdelmoula, W. M.; Škrášková, K.; Balluff, B.; Carreira, R. J.; Tolner, E. A.; Lelieveldt, B. P. F.; van der Maaten, L.; Morreau, H.; van den Maagdenberg, A. M. J. M.; Heeren, R. M. A.; McDonnell, L. A.; Dijkstra, J., Automatic Generic Registration of Mass Spectrometry Imaging Data to Histology Using Nonlinear Stochastic Embedding. Analytical chemistry 2014, 86 (18), 9204-9211.
244. Cristea, I. M.; Williams, R.; Chait, B. T.; Rout, M. P., Fluorescent proteins as proteomic probes. Mol Cell Proteomics 2005, 4 (12), 1933-41.
245. Tsien, R. Y., THE GREEN FLUORESCENT PROTEIN. Annual Review of Biochemistry 1998, 67 (1), 509-544.
246. Zhang, L.; Sevinsky, C. J.; Davis, B. M.; Vertes, A., Single-Cell Mass Spectrometry of Subpopulations Selected by Fluorescence Microscopy. Analytical chemistry 2018, 90 (7), 4626-4634.
247. Sonka, M.; Fitzpatrick, J. M., Image Registration. Handbook of Medical Imaging, Volume 2 - Medical Image Processing and Analysis: pp 447-513.
248. Feng, G.; Mellor, R. H.; Bernstein, M.; Keller-Peck, C.; Nguyen, Q. T.; Wallace, M.; Nerbonne, J. M.; Lichtman, J. W.; Sanes, J. R., Imaging neuronal subsets in transgenic mice expressing multiple spectral variants of GFP. Neuron 2000, 28 (1), 41-51.
249. Ormo, M.; Cubitt, A. B.; Kallio, K.; Gross, L. A.; Tsien, R. Y.; Remington, S. J., Crystal structure of the Aequorea victoria green fluorescent protein. Science 1996, 273 (5280), 1392-5.
250. Porrero, C.; Rubio-Garrido, P.; Avendano, C.; Clasca, F., Mapping of fluorescent protein-expressing neurons and axon pathways in adult and developing Thy1-eYFP-H transgenic mice. Brain Res 2010, 1345, 59-72.
251. Luther, E.; Kamentsky, L.; Henriksen, M.; Holden, E., Next-generation laser scanning cytometry. Methods in cell biology 2004, 75, 185-218.
252. Noor, S. S. M.; Tey, B. T.; Tan, W. S.; Ling, T. C.; Ramanan, R. N.; Ooi, C. W., Purification of Recombinant Green Fluorescent Protein from Escherichia Coli Using Hydrophobic Interaction Chromatography. J Liq Chromatogr R T 2014, 37 (13), 1873-1884.
253. Griesbeck, O.; Baird, G. S.; Campbell, R. E.; Zacharias, D. A.; Tsien, R. Y., Reducing the environmental sensitivity of yellow fluorescent protein. Mechanism and applications. The Journal of biological chemistry 2001, 276 (31), 29188-94.
254. Wachter, R. M.; Elsliger, M. A.; Kallio, K.; Hanson, G. T.; Remington, S. J., Structural basis of spectral shifts in the yellow-emission variants of green fluorescent protein. Structure 1998, 6 (10), 1267-77.
255. Daubner, S. C.; Astorga, A. M.; Leisman, G. B.; Baldwin, T. O., Yellow light emission of Vibrio fischeri strain Y-1: purification and characterization of the energy-accepting yellow fluorescent protein. Proceedings of the National Academy of Sciences of the United States of America 1987, 84 (24), 8912-6.
256. Cannon, J. R.; Kluwe, C.; Ellington, A.; Brodbelt, J. S., Characterization of green fluorescent proteins by 193 nm ultraviolet photodissociation mass spectrometry. Proteomics 2014, 14 (10), 1165-73.
257. Seeley, E. H.; Oppenheimer, S. R.; Mi, D.; Chaurand, P.; Caprioli, R. M., Enhancement of protein sensitivity for MALDI imaging mass spectrometry after chemical treatment of tissue sections. Journal of the American Society for Mass Spectrometry 2008, 19 (8), 1069-77.
258. Yang, J.; Caprioli, R. M., Matrix sublimation/recrystallization for imaging proteins by mass spectrometry at high spatial resolution. Analytical chemistry 2011, 83 (14), 5728-34.
259. Park, J.; Qin, H.; Scalf, M.; Hilger, R. T.; Westphall, M. S.; Smith, L. M.; Blick, R. H., A Mechanical Nanomembrane Detector for Time-of-Flight Mass Spectrometry. Nano letters 2011, 11 (9), 3681-3684.
260. van Remoortere, A.; van Zeijl, R. J.; van den Oever, N.; Franck, J.; Longuespee, R.; Wisztorski, M.; Salzet, M.; Deelder, A. M.; Fournier, I.; McDonnell, L. A., MALDI imaging and profiling MS of higher mass proteins from tissue. Journal of the American Society for Mass Spectrometry 2010, 21 (11), 1922-9.
261. Franck, J.; Longuespee, R.; Wisztorski, M.; Van Remoortere, A.; Van Zeijl, R.; Deelder, A.; Salzet, M.; McDonnell, L.; Fournier, I., MALDI mass spectrometry imaging of proteins exceeding 30,000 daltons. Medical science monitor : international medical journal of experimental and clinical research 2010, 16 (9), BR293-9.
262. Cohen, S. L.; Chait, B. T., Influence of matrix solution conditions on the MALDI-MS analysis of peptides and proteins. Analytical chemistry 1996, 68 (1), 31-7.
263. Mainini, V.; Angel, P. M.; Magni, F.; Caprioli, R. M., Detergent enhancement of on-tissue protein analysis by matrix-assisted laser desorption/ionization imaging mass spectrometry. Rapid communications in mass spectrometry : RCM 2011, 25 (1), 199-204.
264. Agar, N. Y. R.; Yang, H. W.; Carroll, R. S.; Black, P. M.; Agar, J. N., Matrix solution fixation: Histology-compatible tissue preparation for MALDI mass Spectrometry Imaging. Analytical chemistry 2007, 79 (19), 7416-7423.
265. Hankin, J. A.; Barkley, R. M.; Murphy, R. C., Sublimation as a method of matrix application for mass spectrometric imaging. Journal of the American Society for Mass Spectrometry 2007, 18 (9), 1646-1652.
266. Li, S.; Plouffe, B. D.; Belov, A. M.; Ray, S.; Wang, X.; Murthy, S. K.; Karger, B. L.; Ivanov, A. R., An Integrated Platform for Isolation, Processing, and Mass Spectrometry-based Proteomic Profiling of Rare Cells in Whole Blood. Molecular & cellular proteomics : MCP 2015, 14 (6), 1672-83.
267. Shrestha, B.; Patt, J. M.; Vertes, A., In Situ Cell-by-Cell Imaging and Analysis of Small Cell Populations by Mass Spectrometry. Analytical chemistry 2011, 83 (8), 2947-2955.
268. Eberlin, L. S.; Liu, X.; Ferreira, C. R.; Santagata, S.; Agar, N. Y.; Cooks, R. G., Desorption electrospray ionization then MALDI mass spectrometry imaging of lipid and protein distributions in single tissue sections. Analytical chemistry 2011, 83 (22), 8366-71.
269. Ong, T. H.; Kissick, D. J.; Jansson, E. T.; Comi, T. J.; Romanova, E. V.; Rubakhin, S. S.; Sweedler, J. V., Classification of Large Cellular Populations and Discovery of Rare Cells Using Single Cell Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry. Analytical chemistry 2015, 87 (14), 7036-7042.
270. Comi, T. J.; Do, T. D.; Rubakhin, S. S.; Sweedler, J. V., Categorizing Cells on the Basis of their Chemical Profiles: Progress in Single-Cell Mass Spectrometry. Journal of the American Chemical Society 2017, 139 (11), 3920-3929.
271. Livet, J.; Weissman, T. A.; Kang, H.; Draft, R. W.; Lu, J.; Bennis, R. A.; Sanes, J. R.; Lichtman, J. W., Transgenic strategies for combinatorial expression of fluorescent proteins in the nervous system. Nature 2007, 450 (7166), 56-62.
272. Weissman, T. A.; Pan, Y. A., Brainbow: New Resources and Emerging Biological Applications for Multicolor Genetic Labeling and Analysis. Genetics 2015, 199 (2), 293-306.
273. Hartnett, H. E., Dissolved Organic Matter (DOM). In Encyclopedia of Geochemistry: A Comprehensive Reference Source on the Chemistry of the Earth, White, W. M., Ed. Springer International Publishing: Cham, 2017; pp 1-3.
274. Dittmar, T.; Stubbins, A., 12.6 - Dissolved Organic Matter in Aquatic Systems. In Treatise on Geochemistry (Second Edition), Holland, H. D.; Turekian, K. K., Eds. Elsevier: Oxford, 2014; pp 125-156.
275. Hawkes, J. A.; D'Andrilli, J.; Agar, J. N.; Barrow, M. P.; Berg, S. M.; Catalán, N.; Chen, H.; Chu, R. K.; Cole, R. B.; Dittmar, T.; Gavard, R.; Gleixner, G.; Hatcher, P. G.; He, C.; Hess, N. J.; Hutchins, R. H. S.; Ijaz, A.; Jones, H. E.; Kew, W.; Khaksari, M.; Palacio Lozano, D. C.; Lv, J.; Mazzoleni, L. R.; Noriega-Ortega, B. E.; Osterholz, H.; Radoman, N.; Remucal, C. K.; Schmitt, N. D.; Schum, S. K.; Shi, Q.; Simon, C.; Singer, G.; Sleighter, R. L.; Stubbins, A.; Thomas, M. J.; Tolic, N.; Zhang, S.; Zito, P.; Podgorski, D. C., An international laboratory comparison of dissolved organic matter composition by high resolution mass spectrometry: Are we getting the same answer? Limnology and Oceanography: Methods 2020, 18 (6), 235-258.
276. Harkewicz, R.; Dennis, E. A., Applications of mass spectrometry to lipids and membranes. Annual review of biochemistry 2011, 80, 301-325.
277. Basu, S. S.; Randall, E. C.; Regan, M. S.; Lopez, B. G. C.; Clark, A. R.; Schmitt, N. D.; Agar, J. N.; Dillon, D. A.; Agar, N. Y. R., In Vitro Liquid Extraction Surface Analysis Mass Spectrometry (ivLESA-MS) for Direct Metabolic Analysis of Adherent Cells in Culture. Analytical chemistry 2018, 90 (8), 4987-4991.
278. Micoogullari, Y.; Basu, S. S.; Ang, J.; Weisshaar, N.; Schmitt, N. D.; Abdelmoula, W. M.; Lopez, B.; Agar, J. N.; Agar, N.; Hanna, J., Dysregulation of very-long-chain fatty acid metabolism causes membrane saturation and induction of the unfolded protein response. Molecular Biology of the Cell 2020, 31 (1), 7-17.
279. Folch, J.; Lees, M.; Sloane Stanley, G. H., A simple method for the isolation and purification of total lipides from animal tissues. The Journal of biological chemistry 1957, 226 (1), 497-509.
280. Wishart, D. S.; Tzur, D.; Knox, C.; Eisner, R.; Guo, A. C.; Young, N.; Cheng, D.; Jewell, K.; Arndt, D.; Sawhney, S.; Fung, C.; Nikolai, L.; Lewis, M.; Coutouly, M.-A.; Forsythe, I.; Tang, P.; Shrivastava, S.; Jeroncic, K.; Stothard, P.; Amegbey, G.; Block, D.; Hau, D. D.; Wagner, J.; Miniaci, J.; Clements, M.; Gebremedhin, M.; Guo, N.; Zhang, Y.; Duggan, G. E.; MacInnis, G. D.; Weljie, A. M.; Dowlatabadi, R.; Bamforth, F.; Clive, D.; Greiner, R.; Li, L.; Marrie, T.; Sykes, B. D.; Vogel, H. J.; Querengesser, L., HMDB: the Human Metabolome Database. Nucleic Acids Research 2007, 35 (suppl_1), D521-D526.
281. Knittelfelder, O. L.; Kohlwein, S. D. Lipid Extraction from Yeast Cells. Cold Spring Harb Protoc [Online], 2017. PubMed. http://europepmc.org/abstract/MED/28461651 https://doi.org/10.1101/pdb.prot085449 (accessed 2017/05//).