Development of Methods for the Analysis of Protein Post-translational
Modifications: IsoAspartic Acid and Protein Crosslinking
by Min Liu
A dissertation submitted to
The Faculty of the College of Science of Northeastern University in partial fulfillment of the requirements for the degree of Doctor of Philosophy
September 11, 2014
Dissertation directed by
Zhaohui Sunny Zhou
Professor of Chemistry and Chemical Biology
Da Ren
Amgen, Inc.
Acknowledgments
My journey into the fascinating world of protein analysis began with my advisors, Professor Zhaohui Sunny Zhou and Dr Da Ren. I take immense pleasure in thanking them for the opportunity to work on two challenging and interesting projects. Their passion and high standards for science will inspire me throughout my life and career. Their abundant guidance, valuable suggestions, endless encouragement and patience are greatly appreciated.
They helped me overcome my frustrations and made the past five years truly enjoyable and memorable.
I would like to express my deepest gratitude to Dr Janet Cheetham at Amgen for her endless support, advice and encouragement. Without her support, it would have been impossible for me to transfer my work experience from small molecules to large biomolecules. Without her encouragement, I could not have completed the work reported here.
I would like to express my great appreciation to Dr Zhongqi Zhang at Amgen for his generous help, strong scientific guidance, and very helpful discussions. By sharing his expertise, he has been a great mentor to me in protein characterization by mass spectrometry.
My special thanks go to my committee members, Professors Paul Vouros and William
Hancock, for reviewing my thesis and for their valuable suggestions.
I particularly thank my coworkers and lab mates: Chris Spahr, Drs David Brems, Judy
Ostovic, Nina Cauchon, Aleksander Swietlow, Peter Zhou, Wenqin Ni, Tianzhu Zang, Eddie
Zhou, Kevin Moulton, Aldina Mesic, Wanlu Qu, Nathaniel Kenton and Tianyi Bai for their help and scientific discussion. I would like to especially thank Professor Richard Duclos and Kalli
Catcott for their critical review and helpful suggestions on my manuscript preparations. I also
thank Drs Bin Ma, Dan Maloney, and Cassandra Wigmore at Bioinformatics Solutions for
helpful discussion on de novo sequencing.
Finally and most importantly, I would like to dedicate this thesis to my family: my
husband Chester Yuan, son Michael Yuan and daughter Emily Yuan. I am sincerely grateful for their generous understanding, endless love and selfless support. Their love, belief in me and encouragement enabled me to complete this five-year endeavor in my career and life.
Abstract
Analysis of protein post-translational modifications (PTMs) is pivotal to understanding their biological importance. Isoaspartic acid (isoAsp), the smallest PTM, is observed both in vivo and in vitro. The absence of a mass difference and the subtle differences in physicochemical properties between isoAsp and Asp pose a great challenge for the sensitive detection and unambiguous localization of isoAsp sites in complex samples. A novel isoAsp assay is presented that exploits the specificity of protein isoaspartate methyltransferase (PIMT) for methylating isoAsp and the subsequent 18O incorporation during methyl ester hydrolysis. The assay enabled sensitive detection and unambiguous site localization of several isoAsp residues in IgG1 (Anal Chem 2012, 84, 1056-1062).
The method can be applied to biological samples to understand isoAsp formation and to identify biomarkers.
Protein crosslinks are ubiquitous in biological systems and biopharmaceuticals and are reported to cause loss of bioactivity and to induce immunogenicity. However, they remain poorly characterized, especially when the crosslink chemistry is undefined, owing to their intrinsic structural complexity and the lack of a systematic analytical approach. A comprehensive methodology, XChem-Finder, has been developed to break down this analytical challenge via 18O labeling and mass spectrometry, leading to the discovery of a total of 14 cross-linked thioether peptides in IgG2, including some that had not been previously reported (Anal Chem 2013, 85, 5900-5908). Furthermore, a novel histidine-histidine (His-His) crosslink in IgG1 was discovered and characterized using XChem-Finder (Anal Chem 2014, 86, 4940-4948), which again demonstrates the broad applicability and utility of the methodology. Further improvements to XChem-Finder are discussed. XChem-Finder is expected to enable the discovery of additional novel crosslinks in proteins.
Table of Contents
Acknowledgements
Abstract
Table of Contents
List of Figures
List of Tables
List of Schemes
Abbreviations and Symbols
Chapter 1: Overview of Protein Post-translational Modifications and Their Analysis
1.1 Protein Post-translational Modifications
1.2 Biological and Biopharmaceutical Importance of PTMs
1.3 Analysis of PTMs
1.4 Deamidation and Isomerization
1.4.1 Isoaspartic Acid Formation
1.4.2 Biological and Biopharmaceutical Importance of Deamidation and Isomerization
1.4.3 Methods for Detection and Characterization of Deamidation and Isomerization
1.4.3.1 Edman Degradation
1.4.3.2 Protein-L-isoaspartyl Methyltransferase (PIMT)
1.4.3.3 Mass Spectrometry
1.4.3.4 18O-incorporation
1.4.3.5 Asp-N Protease Peptide Mapping
1.4.3.6 Chemo-enzymatic Derivatization and Affinity-based Method
1.4.3.7 Protein Isoaspartate Methyltransferase-mediated 18O-Labeling
1.4.3.8 Methods for Racemization Detection
1.5 Protein Crosslinking
1.5.1 Crosslink Formation
1.5.1.1 Crosslinks as degradants
1.5.2 Biological and Biopharmaceutical Importance of Crosslinks
1.5.3 Methods for Detection and Characterization of Crosslinks
1.5.3.1 HPLC with Fluorescence Detection
1.5.3.2 MS-based Method
1.5.3.2.1 C-Terminal 18O-Labeling
1.5.3.2.2 N-Terminal Modification
1.5.3.2.3 Chromatographic Sample Enrichment
1.5.3.3 Antibody-based Method
1.6 Conclusions
1.7 References
Chapter 2: Protein Isoaspartate Methyltransferase-Mediated 18O-Labeling of Isoaspartic Acid for Mass Spectrometry Analysis
2.1 Abstract
2.2 Introduction
2.3 Experimental Section
2.3.1 Chemicals
2.3.2 Generation of isoAsp
2.3.3 Reduction, Alkylation, and Tryptic Digestion of IgG1
2.3.4 Methylation Catalyzed by PIMT
2.3.5 18O-Labeling
2.3.6 HPLC
2.3.7 Mass Spectrometry
2.4 Results and Discussion
2.4.1 Methylation of isoAsp
2.4.2 Hydrolysis and 18O Incorporation
2.4.3 Screening of 18O-Labeled isoAsp by Mass Spectrometry
2.4.4 Co-elution of isoAsp and Asp and Overlapping of Isotope Patterns
2.4.5 Identification of isoAsp Sites in 18O-Labeled Peptides
2.5 Conclusions
2.6 References
Chapter 3: A Comprehensive Methodology for the Identification of Protein Crosslinks without a Prior Knowledge of Chemistry via 18O Labeling and Mass Spectrometry
3.1 Abstract
3.2 Introduction
3.3 Experimental Section
3.3.1 Chemicals
3.3.2 Generation of Stressed Sample
3.3.3 Reduction, Alkylation, Tryptic Digestion and 18O-Labeling of the IgG2
3.3.4 HPLC
3.3.5 Mass Spectrometry
3.4 Results and Discussion
3.4.1 Stage 1: Identification of Crosslinked Peptides
3.4.2 Stage 2: Deduce Partial Sequence for Each Chain
3.4.3 Stage 3: Inference of Full Sequence for Each Chain
3.4.4 Stage 4: Deduction of Crosslinking Chemistry and Site
3.4.5 Final Confirmation and Additional Support
3.4.6 Targeted Search Based on the Newly Established Crosslinking Chemistry
3.5 Formation of Thioether
3.6 Conclusions
3.7 References
Chapter 4: Discovery and Characterization of a Novel His-His crosslink in IgG1 Utilizing 18O-labeling and Mass Spectrometry
4.1 Abstract
4.2 Introduction
4.3 Experimental Section
4.3.1 Chemicals
4.3.2 Generation of Stressed Sample
4.3.3 Aggregates by Size Exclusion Chromatography
4.3.4 Reduction, Alkylation, Tryptic Digestion and 18O-Labeling of IgG1
4.3.5 HPLC
4.3.6 Mass Spectrometry
4.4 Results and Discussion
4.4.1 Detection of Crosslinked Protein
4.4.2 Detection of Crosslinked Peptide
4.4.3 Elucidation of Crosslinking Chemistry
4.4.4 Structural Confirmation by Mass Spectrometry
4.4.5 Mechanism of formation for His-His crosslink
4.4.6 Other Crosslinks
4.5 Conclusions
4.6 References
Chapter 5: Conclusion and Future Directions
5.1 isoAsp Project
5.2 Crosslink Project
5.3 References
List of Figures
Figure 1-1 MS/MS for mapping PTMs
Figure 1-2 Total isoAsp analysis via HPLC/UV at 260 nm for SAH measurement or radioactive detection of MeOH
Figure 1-3 Mechanism of fragmentation of the Asp and isoAsp peptides in ETD MS
Figure 1-4 Acid- and base-catalyzed deamidation
Figure 1-5 Detection of deamidation and isomerization via (A) isoaspartate methyltransferase (PIMT)-catalyzed methylation of isoaspartate and hydrazine trapping of methylester and succinimide and (B) enrichment by hydrazide-aldehyde affinity
Figure 1-6 Disulfide scrambling under a basic condition to form crosslink degradants
Figure 1-7 The formation of thioether and other related degradants via dehydroalanine followed by Michael addition
Figure 1-8 Non-disulfide crosslinking in insulin Asn deamidation followed by the reaction with the N-terminal amine
Figure 1-9 (A) Dityrosine and Tyr-Cys crosslinks formed via tyrosyl radical (B) Other Tyrosine related crosslink involving tyrosine oxidation followed by Michael addition with primary amines
Figure 1-10 Histidine-related crosslinks via photo-oxidation
Figure 1-11 Dimerization of human superoxide dismutase via a novel oxidative modification: ditryptophan crosslink
Figure 1-12 Formaldehyde-mediated cross-linking
Figure 1-13 RNase dimerization by a single amide bond between Lys66 and Glu9 under vacuum and 85 °C
Figure 1-14 Isotopic labeling at N-termini
Figure 2-1 Isotopic distribution of a singly charged DSIP peptide with/without 18O tag (A) and a triply charged tryptic peptide LC69-108 from the IgG1 sample with/without 18O tag (B)
Figure 2-2 Specificity of PIMT-mediated 18O-labeling shown in the Asp-DSIP and isoAsp-DSIP samples
Figure 2-3 The mixture of isoAsp-DSIP and DSIP peptide was analyzed by PIMT/18O-labeling method
Figure 2-4 The effects of pH (A) and incubation time (B) on the hydrolysis of the succinimide and methyl ester
Figure 2-5 Stability of Asp-DSIP during sample treatment
Figure 2-6 Guanidine HCl (Gnd-HCl, 1.25 M) quenched PIMT activity during hydrolysis and thereby minimized the incorporation of two 18O-atoms into isoAsp peptides
Figure 2-7 Identification of isoAsp site in a doubly charged tryptic peptide HC 271-284 from the stressed IgG1
Figure 2-8 Identification of isoAsp site in a triply charged tryptic peptide LC69-108 from the IgG1 sample
Figure 2-9 Identification of isoAsp site in a doubly charged tryptic peptide HC389-405 from the IgG1 sample
Figure 2-10 Identification of isoAsp site in the stressed Asp-DSIP peptide by tandem mass spectrometry
Figure 3-1 Isotopic distributions of the cross-linked peptide HC:G118-R129/HC:C215-K240
Figure 3-2 CID MS/MS spectrum of the triply charged precursor ions at m/z 1351.33 (16O-labeled C-termini) and 1354.00 (18O-labeled C-termini)
Figure 3-3a MS/MS data of the cross-linked peptide HC:G118-R129/HC:KΔ214-K240
Figure 3-3b MS3 for structure confirmation of the singly charged fragment ion m/z 1196 from the cross-linked peptide HC:G118-R129/HC:KΔ214-K240
Figure 3-3c MS3 for structure confirmation of the doubly charged fragment ion m/z 1521 from the cross-linked peptide HC:G118-R129/HC:KΔ214-K240
Figure 3-4 MS/MS data of the cross-linked peptide HC:C215-K240/HC:C215-K240
Figure 3-5 MS/MS data of the cross-linked peptide HC:K214-K240*/HC:KΔ214-K240
Figure 3-6 MS/MS data of the cross-linked peptide LC:T211-S218/HC:G118-R129
Figure 3-7 MS/MS data of the cross-linked peptide LC:T211-S218/HC:K214-K240
Figure 3-8 (A) Detection of cross-links in IgG2 by reducing SDS-PAGE and (B) Aggregation analysis by size exclusion chromatography
Figure 3-9 Major disulfide linkage isoforms in IgG2
Figure 4-1 Detection of crosslinking in IgG1 by reduced SDS-PAGE and size exclusion chromatography (SEC)
Figure 4-2 Isotopic distributions of the crosslinked peptide S215-K244/S215-K244 m/z 1673.54 (z=4) from tryptic digestion of IgG1
Figure 4-3 CID MS/MS spectra of the quadruply charged precursor ions m/z 1673.54 (16O-labeled C-termini) and 1675.54 (18O-labeled C-termini) of the crosslinked tryptic peptide S215-K244/S215-K244
Figure 4-4 MS3 spectrum of the doubly charged fragment ion m/z 1488.35 obtained from MS/MS of the precursor ion m/z 1673.54 in Figure 4-3
Figure 4-5 CID MS/MS spectrum of the triply charged precursor ion m/z 1178.77 of the crosslinked S215-E229/S215-E229 peptide generated from combined trypsin and GluC digestion
Figure 4-6 CID MS/MS spectrum of the quadruply charged precursor ions m/z 1549.53 of the crosslinked peptide D217-K244/D217-K244 from Asp-N digestion
Figure 4-7 CID MS/MS spectrum of the triply charged precursor ions at m/z 1013.41 of the crosslinked peptide D217-E229/D217-E229 from digestion with Asp-N and GluC
Figure 4-8 ETD MS/MS spectrum of the precursor ion m/z 1339.70 (z=5) of the crosslinked tryptic peptide S215-K244/S215-K244
Figure 4-9 ETD MS/MS spectrum of the precursor ion m/z 1033.35 (z=6) of the crosslinked peptide D217-K244/D217-K244 from Asp-N digestion
Figure 4-10 ETD MS/MS spectrum of the precursor ion m/z 884.33 (z=4) of the crosslinked peptide S215-E229/S215-E229 from digestion with trypsin and GluC
Figure 4-11 ETD MS/MS spectrum of the precursor ion m/z 760.31 (z=4) of the crosslinked peptide D217-E229/D217-E229 from digestion with Asp-N and GluC
Figure 4-12 CID MS/MS spectrum of the tryptic peptide containing the 2-oxo-His (+14 Da) intermediate
Figure 4-13 CID MS/MS spectrum of the tryptic peptide containing the His+32 intermediate
Figure 4-14 (A) Space filling illustration of the hinge region of IgG1 antibody (DKTHTCPPCP) (B) Three-dimensional (3D) structure of an IgG1
Figure 4-15 ETD MS/MS spectrum of the quadruply charged precursor ion m/z 821.09 of the crosslinked peptide D217-E229/S215-E229 generated by limited Asp-N digestion of fully digested IgG1 by trypsin and GluC
Figure 4-16 CID MS/MS spectrum of the triply charged precursor ions at m/z 1096.09 of the crosslinked peptide D217-E229/S215-E229 from limited digestion by Asp-N of the fully digested IgG1 by GluC and trypsin
List of Tables
Table 2-1 Representative isoAsp containing peptides detected in IgG1
Table 3-1 Fragmentation ions of a cross-linked peptide
Table 3-2 The cross-linked peptide candidates identified by MassAnalyzer algorithm
Table 3-3 Peptides in IgG2 that match with the mass of fragment ions of the triply charged ion m/z 1351.33 via FindPept (sorted in the order of primary sequence number)
Table 3-4 Peptides in IgG2 that match with the mass of fragment ions of the triply charged ion m/z 1351.33 via FindPept (grouped in the order of m/z value)
Table 3-5 Partial sequences determined from the mass of fragmentation ions for a triply charged precursor ion at the retention time of 91.17 min with m/z 1351.33
Table 3-6 De novo sequencing for sequence tag using y-ions from the cross-linked fragments (group 4 in Table 3-1) in the cross-linked peptide G118-R129/C215-K240
Table 3-7 Elemental formula with mass of 149.9987 Da
Table 3-8 Crosslinked peptides identified in IgG2
Table 3-9 The cross-linked peptides identified in the IgG2 via a targeted search for cysteinyl thioether
Table 3-10 Quantification of the cross-linked peptides in the IgG2
Table 4-1 Partial sequences that match the mass of fragmentation ions for the precursor ion m/z 1673.54 (z=4) eluted at 112.48 min
Table 4-2 Deduction of elemental formula for the crosslinked S215-K244/S215-K244 peptide
Table 4-3 Crosslinked peptides obtained from digestion of IgG1 by various proteases and the combination thereof
Table 4-4 Peptides containing the 2-oxo-His (+14 Da) and His+32 (+32 Da) intermediates observed in the stressed IgG1
Table 4-5 Thioether crosslinks detected in IgG1
List of Schemes
Scheme 1-1 Deamidation, isomerization, racemization and PIMT-dependent methylation
Scheme 2-1 Formation of isoAsp from the isomerization of aspartic acid (Asp) or the deamidation of asparagine (Asn)
Scheme 2-2 Isotopic labeling of isoaspartic acid via protein isoaspartyl methyltransferase (PIMT)-catalyzed S-adenosyl-methionine (SAM or AdoMet)-dependent methylation and hydrolysis of the resulting methyl ester and succinimide in 18O-water
Scheme 2-3 Identification of isoAsp peptides by mass spectrometry using the mass increase of 2 Da imparted by 18O-labeling
Scheme 3-1 Flow chart of XChem-Finder in four main stages
Scheme 3-2 Establishment of crosslink chemistry based on formula C4H6O4S obtained from elemental composition analysis of 149.9987 Da
Scheme 4-1 Proposed mechanism for the formation of His-His crosslink via photo-oxidation intermediates
Scheme 5-1 The use of combining sample enrichment and Lys-N digestion for detection of crosslinks
Scheme 5-2 Isotopic labeling at N-termini via 1) trypsin digestion; 2) protection of ε-amino group of lysine by reductive methylation; 3) specific derivatization of N-terminal amino group with a 1:1 mixture of DNFB (2,4-dinitrofluorobenzene) and [2H3]DNFB at pH 7.0
Scheme 5-3 N-Terminal Succinylation via two-step chemical derivatizations
Abbreviations and Symbols
ACN acetonitrile
Asn (N) asparagine
Asp (D) aspartic acid
Asu aspartyl succinimide
CDR complementarity-determining region of IgG
CID collision induced dissociation
°C degree Celsius
CEX cation exchange chromatography
CHO Chinese hamster ovary
Cys cysteine
Da Dalton
DSIP β-delta sleep-inducing peptide
DTT dithiothreitol
ECD electron capture dissociation
EDTA ethylenediaminetetraacetic acid
EGFR epidermal growth factor receptor
ELISA enzyme-linked immunosorbent assay
ESI electrospray ionization
ETD electron transfer dissociation
FT-MS/MS Fourier transform tandem mass spectrometry
GndHCl guanidine hydrochloride
HC heavy chain of IgG
HCD
His (H) histidine
His-His histidine-histidine crosslink
HPLC high performance liquid chromatography
IAA iodoacetic acid
ICH The international conference on harmonization of technical requirements
for registration of pharmaceuticals for human use
IgG immunoglobulin gamma
isoAsp (isoD) isoaspartic acid
kD kilodalton
kV kilovolt
LC-MS liquid chromatography coupled with mass spectrometry
LC-MS/MS liquid chromatography coupled with tandem mass spectrometry
LC liquid chromatography, light chain of IgG
m milli (10^-3); meter(s)
M molarity
mAb monoclonal antibody
MALDI matrix-assisted laser desorption ionization
mg milligram
mg/mL milligram per milliliter
min minute(s)
µ micro (10^-6)
µL microliter
mL milliliter
MS mass spectrometry, mass spectrum, mass spectroscopy
MS/MS (MS2) tandem mass spectrometry
MSn multiple stage fragmentation
m/z mass-to-charge ratio
nm nanometer
% percentage
pI isoelectric point
PIMT protein isoaspartate methyltransferase
ppm parts per million
HNMR proton nuclear magnetic resonance
PTM post-translational modification
RP-HPLC reversed phase high performance liquid chromatography
RT retention time
SAH (AdoHcy) S-adenosyl-homocysteine
SAM (AdoMet) S-adenosyl-L-methionine
SDS-PAGE sodium dodecyl sulfate polyacrylamide gel electrophoresis
SEC size exclusion chromatography
TFA trifluoroacetic acid
Tris tris(hydroxymethyl)aminomethane
UV ultraviolet
UPLC ultra performance liquid chromatography
v/v volume-to-volume ratio
XIC extracted ion chromatogram
Chapter 1: Overview of Protein Post-Translational Modifications and Their Analysis
1.1 Protein Post-Translational Modifications
Post-translational modifications (PTMs) are enzymatic or chemical modifications occurring on the amino acid side chains or the amino and carboxyl termini of proteins[1]. Protein backbone cleavage, commonly referred to as proteolysis, is also considered a PTM in many cases[1]. Over 300 PTMs have been reviewed in the literature[1, 2]. Common types of PTMs include oxidation, deamidation, isomerization, protein crosslinking, glycosylation, phosphorylation, ubiquitination (attachment of the 76-amino acid residue ubiquitin protein), nitrosylation, pyroglutamic acid formation, methylation, acetylation, and lipidation. This thesis focuses on Asp isomerization and non-reducible protein crosslinks.
1.2 Biological and Biopharmaceutical Importance of PTMs
PTMs are ubiquitous in biological systems and biopharmaceuticals. PTMs can potentially change a protein's physical or chemical properties, conformation, activity, cellular location or stability, and they serve as one of the most important regulatory mechanisms for fine-tuning protein function. Therefore, PTMs influence almost all aspects of normal cell biology and pathogenesis. In biopharmaceuticals, PTMs can occur during manufacturing and storage, leading to structural changes and loss of efficacy, triggering immunogenic responses, and raising safety concerns[3-5]. As examples, the biological significance of Asp isomerization and protein crosslinks is discussed in sections 1.4.2 and 1.5.2, respectively.
The potential damage to proteins caused by solution pH, temperature and light exposure has been recognized for years, as reflected in ICH Guidelines Q1B and Q5C (the International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use, www.ICH.org) [6-10]. As such, stress testing studies must be conducted in order to assess the stability of products and to facilitate improvements in the manufacturing process, formulation, packaging, and storage conditions. Typical stress conditions include high and low pH, elevated temperature, photolysis, oxidation, freeze-thaw cycles and shear. Photodegradation studies cover direct long-term exposure of the product to sunlight and to the common light sources used for indoor lighting and UV sterilization in industry. In this thesis, the acidic and basic conditions leading to isomerization and thioether crosslinks in IgGs are described in Chapters Two and Three, respectively. The formation of histidine-histidine (His-His) crosslinks in IgG resulting from light exposure is presented in Chapter Four. A more detailed overview of the formation and biological importance of isoAsp and protein crosslinks is presented in sections 1.4 and 1.5, respectively.
1.3 Analysis of PTMs
Identifying and understanding PTMs is critical in the study of cell biology and in disease treatment and prevention. However, it is a great challenge to study a specific modified form in a largely heterogeneous protein pool because 1) it is often present at low abundance; 2) most post-translational modifications (e.g. isoAsp formation) alter the protein in subtle ways that are not easily detected; 3) a protein can carry more than one type of PTM; and 4) a protein can be modified multiple times by the same PTM at different residues. This challenge will continue to drive the development of methods for protein separation and detection as well as of better instrumentation. Among the ongoing efforts to develop novel, highly sensitive and sophisticated PTM identification techniques, MS-based protein analysis holds great potential. The currently reported methods for the analysis of isoAsp and protein crosslinks are reviewed in sections 1.4.3 and 1.5.3, respectively. This section gives an overview of the determination of PTMs via MS-based approaches.
A widely used strategy for protein identification is to cleave the protein with highly specific proteases, followed by LC/MS analysis. PTMs can either increase or decrease the molecular weight of peptides and result in modification-specific signals in MS/MS. In other words, the modification shifts not only the peptide's molecular weight but also the mass of every fragment ion containing the modified amino acid residue (Figure 1-1). However, precise identification of the modification type and site can be very challenging due to the following factors: 1) the mass shift in the peptide molecular weight: the difficulty increases for small mass shifts, which require higher-resolution MS, whereas particularly large modifications may shift the total mass outside the range suitable for MS/MS sequencing; 2) the overall abundance of the modified peptide: most PTMs are of low abundance and/or substoichiometric; 3) the stability of the modification and the gas-phase dissociation behavior of the modified peptide: some PTMs are labile during MS and MS/MS; 4) the effect of PTMs on protease digestion: the presence of PTMs may affect the cleavage efficiency of proteases; 5) the effect of PTMs on the peptide's ionization efficiency: the detectability of a peptide is a function of its sequence and modifications; 6) multiple-site PTMs, which may generate very complicated MS and MS/MS data that are difficult to interpret; 7) the effect of PTMs on peptide chromatographic behavior; 8) sample handling: sample preparation may introduce artificial modifications to a protein, such as isoaspartic acid formation; and 9) the complex matrix of formulation samples or biological samples (such as urine and plasma).
Therefore, several strategies have been developed for PTM analysis: sample separation and enrichment to reduce sample complexity and minimize ion suppression; digestion with multiple proteases of different cleavage specificity to generate complementary and redundant sets of overlapping peptides; isotope labeling; multiple-stage fragmentation (MSn); and the use of different fragmentation techniques (e.g. collision induced dissociation, CID, versus electron transfer dissociation, ETD).
Figure 1-1. Mass spectrometry for mapping PTMs[11]. The mass shift from PTM can be detected in MS and the modification site can be located in MS/MS.
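As an illustration of the fragment mass shifts described above, the following minimal Python sketch computes singly charged b- and y-ion m/z values for a peptide with and without a modification and reports which ions shift. The peptide SLNGAK, the function names, and the choice of Asn deamidation (+0.98402 Da) as the modification are assumptions made for illustration only; they are not taken from the experiments in this thesis.

```python
# Monoisotopic residue masses (Da) for the residues used in this sketch.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "K": 128.09496,
}
WATER = 18.01056   # H2O, added to y ions
PROTON = 1.00728   # charge carrier for singly charged ions


def fragment_ions(sequence, mod_index=None, mod_mass=0.0):
    """Return singly charged b- and y-ion m/z values keyed by ion number."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    if mod_index is not None:
        masses[mod_index] += mod_mass  # place the PTM on one residue
    n = len(masses)
    b = {i: sum(masses[:i]) + PROTON for i in range(1, n)}
    y = {i: sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)}
    return b, y


def shifted_ions(sequence, mod_index, mod_mass, tol=1e-6):
    """List the b- and y-ion numbers whose m/z changes due to the PTM."""
    b0, y0 = fragment_ions(sequence)
    b1, y1 = fragment_ions(sequence, mod_index, mod_mass)
    b_shift = [i for i in b0 if abs(b1[i] - b0[i]) > tol]
    y_shift = [i for i in y0 if abs(y1[i] - y0[i]) > tol]
    return b_shift, y_shift


if __name__ == "__main__":
    peptide = "SLNGAK"        # hypothetical peptide
    deamidation = 0.98402     # Asn -> Asp/isoAsp mass shift (Da)
    b_shift, y_shift = shifted_ions(peptide, mod_index=2, mod_mass=deamidation)
    print("shifted b ions:", b_shift)
    print("shifted y ions:", y_shift)
```

Only the fragments that contain the modified residue shift, so the boundary between shifted and unshifted ions in each series brackets the modification site. Note that this logic cannot distinguish isoAsp from Asp, which have identical masses.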
The use of specific proteases cleaves proteins into appropriately sized peptides that can
be identified easily in MS/MS experiments. It is important to note that larger peptides (>4 kDa)
have poor recovery and are difficult to characterize by MS/MS on most commercial instruments
(the optimal m/z range is 500-4000 Da for MALDI and 300-1500 Da for ESI), whereas very
small peptides (2-3 residues) are often lost due to their poor retention on reversed-phase
columns. The choice of a proper protease for digestion will be determined by the nature of
amino acid sequence of the protein being analyzed. However, Trypsin and Endoproteinase LysC
are the most commonly used in proteomics because they cleave Arg/Lys or Lys at the C-termini
of digested peptides which aid in identification via tandem mass spectrometry. Based on the
average occurrence of lysine (5.7%) and arginine (5.5%) in proteins in the Swiss-Prot database,
digestion with LysC would be expected to produce peptides with an average length of about 17
residues while Trypsin would produce peptides with an average length of about 8 residues [12].
The size of those peptides produced by LysC and Trypsin can be optimally separated by reversed phase HPLC and are very suitable for mass spectrometric analysis. In some instances tryptic peptides may be too small or too large for LC MS/MS analysis. Furthermore, the proximity of post-translational modifications (PTMs) to proteolytic sites can interfere with protease efficiency, which can potentially hinder detection of the PTM. For example, the missed cleavage by trypsin at Lys residue was reported due to its close proximity to the crosslinking site[13, 14].
As such, there are cases where other proteases (such as GluC, AspN, and chymotrypsin) with
different cleavage specificity are useful. A disadvantage of GluC and AspN is that they often yield longer peptides containing one or more internal basic residues, which fragment poorly by CID. However, alternative fragmentation strategies such as ETD and ECD are known to improve identification of long, highly charged peptides containing basic residues.
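The average peptide lengths quoted above can be approximated with a simple reciprocal estimate. The sketch below assumes that cleavage sites are distributed independently along the sequence and that digestion is complete; it ignores missed cleavages, the effect of neighboring residues, and chain termini, so it is only a rough check on the cited figures. The function name is illustrative.

```python
# Rough estimate of the mean peptide length after complete digestion,
# assuming cleavage sites occur independently along the sequence.
LYS_FREQ = 0.057  # lysine frequency quoted in the text (Swiss-Prot)
ARG_FREQ = 0.055  # arginine frequency quoted in the text (Swiss-Prot)


def mean_peptide_length(site_frequency: float) -> float:
    """Expected residues per peptide when a given fraction of residues are cleavage sites."""
    if not 0 < site_frequency <= 1:
        raise ValueError("site_frequency must be in (0, 1]")
    return 1.0 / site_frequency


lysc_length = mean_peptide_length(LYS_FREQ)                # cleaves after Lys
trypsin_length = mean_peptide_length(LYS_FREQ + ARG_FREQ)  # cleaves after Lys and Arg

print(f"LysC:    ~{lysc_length:.1f} residues")
print(f"Trypsin: ~{trypsin_length:.1f} residues")
```

Under these assumptions the estimate gives roughly 17.5 residues for LysC and 8.9 residues for trypsin, in reasonable agreement with the cited values.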
The use of non-specific proteases has also been reported[15], but these can decrease experimental reproducibility and complicate the separation and identification of proteolytic peptides.
It is important not to introduce extraneous reagents or contaminants during sample preparation, since these can inhibit enzyme activity and generate artifacts such as carbamylation. In addition, enzyme activity depends strongly on digestion conditions such as pH, temperature,
and enzyme-to-substrate ratio.
It has become a common practice to use a multiple protease strategy to generate
complementary and redundant sets of overlapping peptides for the improvement of protein
identification and sequence coverage [16-19]. In Swaney’s study, trypsin, LysC, ArgC, AspN,
and GluC were used with two dissociation methods (CID, ETD) in a decision tree-driven fashion
for complex protein samples[16]. All digestions were performed under optimized conditions for
each protease. They observed a modest boost in protein identifications (~20%) over the use of a single protease, but a more than two-fold improvement in proteome sequence coverage. The optimum digestion pH of each protease and the carry-over of proteases into the next digestion are important considerations in a tandem approach. Buffer exchange and other
clean-up steps may be necessary to simplify downstream data interpretation.
In addition, mass spectrometry instrumentation has advanced greatly in recent
years. Instruments offering high resolution, good mass accuracy, high scan speed, and wide
dynamic range (for example, Orbitrap and Fourier transform ion cyclotron resonance
mass spectrometers) are becoming available. In this thesis, the use of 18O-labeling, complementary
protease digestion, multiple-stage fragmentation (MSn), and different ion activation methods (CID and ETD)
on an Orbitrap mass spectrometer to detect and characterize isomerization and protein crosslinks will be discussed in detail in Chapters Two to Four.
1.4 Deamidation and Isomerization
1.4.1 Isoaspartic Acid Formation
Isoaspartic acid (isoAsp, isoD), aspartic acid in a beta-peptide linkage, is a ubiquitous post-translational modification observed both in vivo and in vitro. IsoAsp can be generated spontaneously through the non-enzymatic deamidation of asparagine (Asn) or isomerization of aspartic acid (Asp) during manufacturing and storage, and it is of great concern in protein pharmaceutical product development. The loss of ammonia from Asn or the dehydration of Asp leads to the formation of a labile succinimide intermediate (Asu), which readily hydrolyzes to isoAsp and Asp in about a 3:1 ratio (Scheme 1-1)[20].
Many factors can influence the rate of isoAsp formation, such as pH, temperature, protein sequence, secondary structure, and local three-dimensional structure. Basic conditions
(pH>7) favor deamidation, while isomerization happens more readily under acidic conditions. In general, the half-times of aspartyl and asparaginyl peptide degradation under physiological conditions (pH 7.5, 37 °C) vary between about 1 and 1000 days[21]. In many cases, the deamidation rate is determined by the residues immediately adjacent in the peptide chain and by higher-order protein structure[22]. The fastest deamidation was reported for the asparagine-glycine (NG) sequence, followed by the asparagine-serine (NS) sequence; the slowest is the asparagine-proline (NP) sequence. Compared to Asn deamidation, Asp isomerization is about 10 times slower under physiological conditions[23]. In 1992, Kroon et al reported that OKT-3, the first marketed monoclonal antibody (MAb) product, undergoes deamidation[24]. There are a number of reports on deamidation and isomerization in MAbs and other proteins of pharmaceutical interest [25, 26].
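To give a sense of what a half-time between about 1 and 1000 days means in practice, the following sketch computes the fraction of an Asn or Asp site that remains unmodified after a given time. It assumes simple first-order kinetics at constant pH and temperature, which is an idealization; the 30-day time point and the three half-times are hypothetical illustration values, not data from the cited studies.

```python
import math


def fraction_remaining(t_days: float, half_time_days: float) -> float:
    """Fraction of the unmodified residue left after t_days, assuming first-order decay."""
    if half_time_days <= 0:
        raise ValueError("half_time_days must be positive")
    rate_constant = math.log(2) / half_time_days  # per day
    return math.exp(-rate_constant * t_days)


# Hypothetical sites with fast, intermediate and slow half-times (days)
for half_time in (1, 30, 1000):
    remaining = fraction_remaining(30, half_time)
    print(f"half-time {half_time:>4} d: {remaining:.4f} unmodified after 30 d")
```

Under this assumption, a site at the fast end of the range is essentially fully converted within a month, whereas a site at the slow end is about 98% intact over the same period.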
Spontaneous direct hydrolysis of asparagine residues by water attack on the side-chain amide group at pH below 3 can also result in aspartyl residue formation[27, 28]. This direct
hydrolysis of Asn under acidic conditions gives Asp as the only product. At
neutral pH, the rate of this reaction appears to be much slower than that of the succinimide pathway. The deamidation rate reaches a minimum at approximately pH 5.
The succinimide is also racemization-prone and can generate the D-succinimidyl, D-aspartyl, and D-isoaspartyl forms (Scheme 1-1)[29, 30]. Zhang et al reported simultaneous isomerization and racemization of Asp in an Asp-Asp motif of a therapeutic protein[31]. Young et
al demonstrated racemization of Asp-25 in mammalian Histone H2B[32]. Asp residues do not
racemize uniformly and specific Asp residues have a greater tendency to racemize than others
depending on the neighboring residue of the Asp residue as well as the higher-order structure
around the Asp residues in the protein[29, 30]. UV radiation and oxidative stress can
promote/induce the racemization of Asp residues[29, 33]. Among the reaction products
(including L/D-succinimidyl, L/D-Asp, L/D-isoAsp), L-isoAsp is typically the predominant form.
Similar degradation reactions were also reported for glutamine (Gln) and glutamic acid
(Glu) residues, but these reactions, which proceed via a six-membered ring intermediate, are much slower
(2 orders of magnitude for Gln) than those of asparagine and aspartic acid residues, which proceed via the five-membered
ring succinimide intermediate[5, 34].
Isoaspartic acid in a damaged protein can be partially repaired by protein L-isoaspartyl/D-aspartyl O-methyltransferase (PIMT, EC 2.1.1.77), a repair enzyme that
initiates the conversion of L-isoAsp or D-Asp residues to L-Asp residues (Scheme 1-1)[20, 35].
PIMT recognizes L-isoAsp or D-Asp and transfers to it the methyl group from S-adenosyl-L-methionine,
forming the methyl ester. The labile methyl ester is rapidly converted back to succinimide, and subsequent hydrolysis generates Asp and isoAsp.
Scheme 1-1. Deamidation, isomerization, racemization and PIMT-dependent methylation[29, 30,
32]. This spontaneous intramolecular rearrangement occurs most readily at Asn-Gly, Asn-Ser and Asp-Gly sequences in flexible regions of polypeptides. The L-isoAsp form typically accounts for
70-85% of the succinimide hydrolysis product. Protein-L-isoaspartyl methyltransferase (PIMT) catalyzes the methylation of L-isoAsp and D-Asp in the presence of S-adenosyl-L-methionine
(AdoMet).
1.4.2 Biological and Biopharmaceutical Importance of Deamidation and Isomerization
Deamidation of asparagine generates aspartate, which fundamentally changes the amino
acid composition and charge of the polypeptide post-translationally[20]: the residue changes from neutral to negatively charged. Formation of isoAsp from either deamidation of Asn or isomerization of Asp results in the insertion of a methylene group into the protein backbone and can introduce the D-configuration of Asp[20]. These changes may dramatically alter protein structure/conformation[36], stability[37], bioactivity[37, 38], aggregation[39, 40], and function[41, 42], and have been linked to aging[40], cancers[43], Alzheimer’s disease[44] and immunogenicity[45-47]. The enzymatic conversion by PIMT of abnormal isoAsp residues to normal Asp residues prevents the accumulation of potentially dysfunctional proteins in vivo as cells and tissues age. Studies show increased isoAsp accumulation in tissues (e.g. brain) and fluids (e.g. urine) of PIMT-deficient mice compared to wild-type mice[35, 48]. Furthermore, the average lifespan of PIMT-deficient mice is 42 days, much shorter than the 22-26 months of wild-type mice. These studies clearly demonstrate the harmful consequences of isoAsp accumulation[35, 48]. D-Asp residues have been detected in various proteins from diverse tissues of elderly individuals and related to age-related diseases such as cataract and Alzheimer’s disease[29, 30, 33, 49]. Fujii and coworkers reported high levels of D-isoAsp at positions 58 and 151 of αA-crystallin from aged human lenses, which undergoes abnormal aggregation, leading to reduced chaperone activity[29, 30]. The same research group reported that human skin samples exposed to UV light exhibited significant accumulation of D-Asp compared with sun-protected skin[50]. This evidence led to the proposal that an assay of D-amino acid accretion could serve as an indicator of sun-induced damage. In the next section, the detection of deamidation and isomerization will be discussed.
1.4.3 Methods for Detection and Characterization of Deamidation and Isomerization
Methods for detection of deamidation are usually based on charge-sensitive techniques or mass spectrometry[51]. Deamidation introduces negative charges to a protein, which shift its isoelectric point (pI). It also results in a +0.984 Da mass increase from Asn to Asp/isoAsp, which can be detected and quantified by mass spectrometry. For isomerization, on the other hand, there is no difference in charge or molecular mass between Asp and isoAsp, so the two cannot be reliably distinguished by mass spectrometry, the method of choice for analyzing almost all other PTMs. Until recently there was no suitable method for Asp isomerization; several chemical, instrumental and enzymatic approaches have now been developed. In this section, these methods are briefly reviewed and the limitations of each are discussed. The specific focus of this thesis is the determination of deamidation and isomerization using liquid chromatography-mass spectrometry (LC-MS) methods.
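The +0.984 Da shift follows directly from the elemental change on deamidation: the side-chain amide (-CONH2) becomes a carboxylic acid (-COOH), a net gain of one O and loss of one N and one H. The sketch below reproduces this value from standard monoisotopic atomic masses and shows how the shift appears in m/z for a multiply charged peptide. The helper name is illustrative.

```python
# Standard monoisotopic atomic masses (Da)
MONOISOTOPIC = {"H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Asn (-CONH2) -> Asp or isoAsp (-COOH): net change is +O, -N, -H
DEAMIDATION_SHIFT = MONOISOTOPIC["O"] - MONOISOTOPIC["N"] - MONOISOTOPIC["H"]

# Asp <-> isoAsp is a rearrangement with no change in elemental composition
ISOMERIZATION_SHIFT = 0.0


def mz_shift(mass_shift: float, charge: int) -> float:
    """Observed m/z shift of a peptide ion carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return mass_shift / charge


print(f"Deamidation mass shift: {DEAMIDATION_SHIFT:+.4f} Da")
print(f"m/z shift for a 2+ ion: {mz_shift(DEAMIDATION_SHIFT, 2):+.4f}")
```

Because the isomerization shift is zero, a mass measurement alone cannot separate Asp from isoAsp, which is why the additional approaches reviewed in this section are needed.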
1.4.3.1 Edman Degradation
Edman degradation stops at isoAsp residues and has been widely used to detect and identify isoAsp residues[52, 53]. Di Donato and coworkers used chromatographic separation of tryptic peptides from ribonuclease A (RNase A) followed by Edman degradation sequencing to identify Asn67 as the site of deamidation[53]. The Asn67 deamidation was found to affect the catalytic properties and refolding rate of RNase A[53]. The
Edman sequencing method requires a purified sample and is not suitable when the N-terminal amino group is modified or blocked. In addition, a relatively large quantity of protein is required for
Edman sequencing.
1.4.3.2 Protein-L-isoaspartyl Methyltransferase (PIMT)
Protein-L-isoaspartyl methyltransferase (PIMT, EC 2.1.1.77) specifically transfers a
methyl group from S-adenosyl-L-methionine (SAM or AdoMet) to isoAsp, generating S-adenosyl-homocysteine
(SAH or AdoHcy) and the corresponding isoaspartate methyl ester
(Figure 1-2)[26, 32, 54, 55]. The methyl ester has a longer retention time than isoAsp
on RP-HPLC due to its increased hydrophobicity, but it is labile and readily
converts to succinimide. Therefore, the methyl ester cannot be used to quantify isoAsp. Instead,
the methylation by-products, SAH and methanol (MeOH), are measured to achieve a
global analysis of the isoAsp content of protein samples in the commercially available
IsoQuant kit. In the radioactive format of IsoQuant, the methyl group donor SAM is isotopically labeled with tritium and the resulting by-product, [3H]methanol, is used for isoAsp quantitation
(Figure 1-2). In the HPLC format of IsoQuant, the by-product SAH is separated by reversed-phase
HPLC and then quantified against a standard by UV absorbance at 260 nm (Figure 1-2). The major
limitation of this method is that only the total isoAsp content is measured; the site of
deamidation/isomerization cannot be located.
Figure 1-2. Total isoAsp analysis via HPLC/UV at 260 nm for SAH measurement or radioactive detection of MeOH.
1.4.3.3 Mass Spectrometry
Over the past decade, peptide identification by CID has become the method of choice in mass spectrometry-based proteomics. In neutral and basic solution, deamidation and isomerization proceed via the succinimide intermediate (Asu) (Scheme 1-1). While the Asu intermediate can often be detected as a degradation product (mass decrease of 17 Da), it is readily hydrolyzed in aqueous solution to form the Asp and isoAsp products. The deamidation of Asn to isoAsp and Asp results in a 1 Da mass increase, which can be readily detected by modern mass spectrometers, and the deamidation sites can be localized by tandem mass spectrometry. In addition, mass spectrometry is a sensitive technique that requires only femtomole to attomole quantities of sample.
Unlike Asn deamidation, Asp isomerization presents a significant challenge for mass spectrometry, as there is no mass change between Asp and isoAsp. Fragment ion intensities
in CID have been studied as a means to differentiate isoAsp from Asp. Lehmann et al noticed that replacement of L-Asp by L-isoAsp resulted in 1) a decrease in the intensity ratio of the complementary b and y ions generated by cleavage N- and C-terminal to the isoAsp, and 2) a decrease in the abundance of the Asp immonium ion at m/z 88[56]. However, the b/y ion intensity ratio and the immonium ion intensity vary considerably with peptide sequence and instrument settings, so these abundance changes are difficult to use in practice for detection of isoAsp. The development of alternative fragmentation techniques has extended the possibilities of tandem mass spectrometry for isomerization detection. More recently, electron capture dissociation (ECD) and electron transfer dissociation (ETD) have been used to differentiate isoaspartic acid and aspartic acid residues by the reporter ions c+57 and z·-57
(Figure 1-3)[57-62]. Dai et al reported the identification of major isoAsp-containing proteins in the urine of PIMT-deficient mice via ETD analysis of LysC digests[63]. The limitations of this approach are that 1) it relies on only one pair of signature ions; 2) the abundance of the reporter ions is low,
typically less than 20% of that of the corresponding c and z ions; and
3) a higher peptide charge state is required for ETD fragmentation.
Like other PTMs, deamidation and isomerization are often present at low abundance, so sample enrichment to improve detection and reduce sample complexity is desirable. Since Asn deamidation to Asp/isoAsp changes the charge of peptides/proteins, cation exchange chromatography (CEX) has been reported to separate intact molecules from their charge variants[64, 65]. However, the mobile phase in CEX is usually not compatible with mass spectrometry, so CEX fractions must be collected for subsequent mass spectrometric identification.
Asp and isoAsp have identical mass and similar pI, thus their analysis remains challenging. Fortunately, the structural change induced by isomerization usually changes the retention time of the peptide in reversed-phase liquid chromatography (RPLC)[26, 66]. A typical peptide elution order is isoAsp, Asn, Asp, and succinimide[67]. However, caution should be taken with identification based on retention time alone, since the separation sometimes varies with chromatography conditions[61]. As such, RPLC coupled with ETD-MS provides a powerful tool to differentiate Asp from isoAsp[57, 61].
Figure 1-3. Mechanism of fragmentation of the Asp and isoAsp peptides in ETD MS. (a) formation of c and z fragment ions of the Asp peptides, which is the same for isoAsp peptide; (b) formation of the c+57 and z-57 diagnostic ions of the isoAsp peptides[61].
1.4.3.4 18O-Incorporation
As described in section 1.4.1, isoAsp can be generated from Asn
deamidation or Asp isomerization via a common succinimide intermediate. The identification of
the succinimide intermediate in proteins is challenging because it hydrolyzes rapidly under
neutral to basic pH conditions, the typical conditions of protease digestion. Xiao et al
developed an 18O-labeling method for identification and quantification of succinimide in proteins[68]. The method uses 18O-water in the hydrolysis of succinimide, followed by tryptic
digestion and LC/MS analysis, to unambiguously identify the sites of deamidation and
isomerization via mass increases of 3 and 2 Da, respectively, compared with their 16O
counterparts[31, 68, 69].
Since Asn deamidation to the isomeric products (isoAsp and Asp residues) can
readily occur even under the mild conditions used to digest proteins for LC/MS analysis, the measured level often
overestimates the original level of deamidation. Inherent deamidation and deamidation introduced
by sample preparation can be differentiated by preparing the sample in 18O-water[70-72].
Artificial deamidation from sample preparation shows a 3 Da mass shift, while intrinsic
deamidation has a 1 Da mass shift, compared to the non-deamidated peptide. When sample
preparation is conducted in 18O-water, however, the protease can simultaneously catalyze the incorporation of
up to two 18O atoms at the peptide C-terminal carboxyl group, resulting in complicated mass spectra. This limitation can be overcome by a multiple-step calculation procedure[72, 73]. In this method, b ions are used to calculate the Asn deamidation that occurred prior to or during sample preparation, which eliminates the complexity introduced by protease-catalyzed C-terminal 18O-labeling.
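The 1 Da and 3 Da shifts described above can be sketched as a simple classification of an observed mass shift relative to the non-deamidated species. This is a minimal illustration, not the multiple-step calculation procedure of the cited work: it assumes the shift is measured on a fragment free of C-terminal 18O incorporation (for example a b ion), and the 0.02 Da tolerance is a hypothetical value that would depend on the instrument.

```python
# Standard monoisotopic masses (Da)
O16 = 15.99491462
O18 = 17.99915961

DEAMIDATION_16O = 0.98401558                    # Asn -> Asp/isoAsp in normal water (~1 Da)
DEAMIDATION_18O = DEAMIDATION_16O + (O18 - O16)  # same reaction in 18O-water (~3 Da)


def classify_deamidation(observed_shift: float, tolerance: float = 0.02) -> str:
    """Assign a shift (Da, relative to the Asn form) to intrinsic or preparation-induced deamidation."""
    if abs(observed_shift - DEAMIDATION_16O) <= tolerance:
        return "intrinsic"
    if abs(observed_shift - DEAMIDATION_18O) <= tolerance:
        return "sample-preparation artifact"
    return "unassigned"


for shift in (0.984, 2.988, 2.000):
    print(f"{shift:.3f} Da -> {classify_deamidation(shift)}")
```

In this sketch the nominal 1 Da and 3 Da shifts correspond to exact values of about 0.984 Da and 2.988 Da.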
Recently, Wang et al took advantage of the different deamidation mechanisms under acidic
and basic conditions to introduce isomer-specific mass tags into 18O-labeled aspartyl- and
isoaspartyl-containing peptides (Figure 1-4)[74]. Deamidation under basic conditions generates both aspartyl- and isoaspartyl-containing peptides, while acid-catalyzed deamidation leads only to aspartyl-containing peptides. When 18O-water is used under these deamidation conditions, different
levels of 18O-incorporation in aspartyl- and isoaspartyl-containing peptides can be achieved. In
the acid-catalyzed labeling, deamidation results in a mass increment of 4n+9 Da (4 Da from each
acidic residue, 4 Da from the C-terminus and 5 Da from the deamidation-formed Asp residue), where n
is the number of acidic residues and carboxymethylated Cys residues in the peptide[74]. In contrast, only
one 18O atom is incorporated during the hydrolysis of the succinimide intermediate under basic
conditions, resulting in a mass shift of 3 Da. These different mass shifts from 18O-labeling can be
exploited for unambiguous assignment of aspartyl- and isoaspartyl-containing peptides by mass
spectrometry.
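The 4n+9 Da rule can be written out as a short calculation. The sketch below works in nominal masses and takes the peptide sequence before deamidation, so the Asn that becomes Asp is counted through the fixed 5 Da term rather than in n. The function name, the example sequences and the way carboxymethylated Cys residues are supplied are illustrative assumptions.

```python
BASE_CATALYZED_SHIFT = 3  # Da: deamidation with one 18O from succinimide hydrolysis


def acid_catalyzed_shift(peptide: str, carboxymethyl_cys: int = 0) -> int:
    """Nominal mass increment (Da) after acid-catalyzed deamidation in 18O-water: 4n + 9.

    `peptide` is the one-letter sequence before deamidation; n counts Asp, Glu
    and carboxymethylated Cys residues already present in it.
    """
    n = peptide.count("D") + peptide.count("E") + carboxymethyl_cys
    per_acidic_residue = 4 * n  # 4 Da from each acidic residue
    c_terminus = 4              # 4 Da from the C-terminal carboxyl group
    new_asp = 5                 # 5 Da from the deamidation-formed Asp
    return per_acidic_residue + c_terminus + new_asp


# Hypothetical peptide with one Glu and one Asp besides the deamidating Asn
print(acid_catalyzed_shift("GENGDK"))  # n = 2
print(BASE_CATALYZED_SHIFT)
```

For the hypothetical peptide above, n = 2 and the acid-catalyzed increment is 17 Da, compared with 3 Da for the base-catalyzed route; this large difference is what allows the two products to be assigned.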
Figure 1-4. Acid- and base-catalyzed deamidation[74]. Here n is the number of acidic residues and carboxymethylated Cys residues in a peptide.
1.4.3.5 Selective Cleavage of isoAsp Peptides with the Asp-N Protease
Endoproteinase Asp-N, which selectively cleaves at Asp but not at isoAsp, has been used to enrich isoAsp-containing peptides for MS analysis[61, 75]. As a result, an isoAsp peptide can be differentiated from the corresponding Asp-containing peptide. Zhang et al employed this finding to identify isoAsp formation at Asp45, Asp47 and Asn47 of recombinant human interleukin-11
(rhIL-11)[76]. Rehder et al used Asp-N peptide mapping to identify the isomerization of Asp92 residues of an anti-epidermal growth factor receptor (EGFR) immunoglobulin γ2 antibody, which contributes to decreased potency in binding EGFR as measured by a cell proliferation assay[75]. The limitation of this approach is that Asp-N digestion may not be suitable for some proteins, yielding very long peptides that are difficult to identify by mass spectrometry.
1.4.3.6 Chemo-enzymatic Derivatization and Affinity-based Method
Alfaro and co-workers reported a chemo-enzymatic method for detecting protein isoaspartate that uses protein isoaspartate methyltransferase (PIMT) to selectively convert isoaspartates into the corresponding methyl esters, followed by hydrazine trapping and then aldehyde affinity enrichment (Figure 1-5)[77]. Hydrazides bind to aldehyde resins under mildly acidic conditions (pH 3-6), and the trapped protein isoaspartate can be released at pH 10. The mass increase of 14 Da from isoAsp to hydrazide can be readily detected by standard mass spectrometry. This method can be used not only for site identification, but also for the detection of low-abundance isoAsp peptides/proteins. The limitation of this method is that hydrazine trapping is sub-stoichiometric and competes with hydrolysis.
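The 14 Da increase can be checked from the elemental change: converting the free carboxylic acid of isoAsp (-COOH) into a hydrazide (-CONHNH2) replaces OH with NHNH2, a net gain of two N and two H and loss of one O. The sketch below computes the exact shift from standard monoisotopic masses; the helper assumes a single derivatized site and an unchanged charge state, and its name is illustrative.

```python
# Standard monoisotopic atomic masses (Da)
H = 1.00782503
N = 14.00307401
O = 15.99491462

# -COOH -> -CONHNH2: net change is +2N, +2H, -O
HYDRAZIDE_SHIFT = 2 * N + 2 * H - O


def hydrazide_mz(peptide_mz: float, charge: int) -> float:
    """m/z of the hydrazide-derivatized peptide, assuming one site and the same charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return peptide_mz + HYDRAZIDE_SHIFT / charge


print(f"Hydrazide mass shift: {HYDRAZIDE_SHIFT:+.4f} Da")
print(f"2+ ion at m/z 500.0000 -> {hydrazide_mz(500.0, 2):.4f}")
```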
Figure 1-5. Detection of deamidation and isomerization via (A) isoaspartate methyltransferase
(PIMT)-catalyzed methylation of isoaspartate and hydrazine trapping of the methyl ester and succinimide, and (B) enrichment by hydrazide-aldehyde affinity[77].
1.4.3.7 Protein Isoaspartate Methyltransferase-mediated 18O-Labeling
Recently, a novel approach for the detection and characterization of Asp isomerization
in IgG1 via protein isoaspartate methyltransferase-mediated 18O-labeling followed by mass
spectrometry analysis was developed in our lab[26]. In this approach, under mildly basic conditions, 18O is incorporated via the succinimide generated from PIMT-mediated
methylation of isoAsp. Several isoAsp sites in IgG1 have been identified, which will be detailed
in Chapter Two.
1.4.3.8 Methods for Racemization Detection
Racemization is much more difficult to detect because L- and D-amino acids have
identical polarity, charge, and molecular weight. However, there are several methods for
detection and quantification of racemization, including chromatography, enzyme-linked
immunosorbent assay (ELISA), and enzymatic assays.
In a typical chromatographic protocol, the protein of interest is acid hydrolyzed under
very harsh conditions to release individual amino acids for either achiral separation after chiral derivatization or direct separation on a chiral column[25, 78]. The limitations of this method are that 1) the harsh acid hydrolysis can itself induce racemization; and 2) the site of racemization remains unknown. However, by combining Edman sequencing or peptide mapping with chiral derivatization, both the sequence and the stereo-configuration of a peptide can be determined. Fujii et al digested αA-crystallin with trypsin and then used reversed-phase
HPLC-mass spectrometry to analyze the resulting peptides[79]. After the peptides were identified by mass and sequence analysis, they were hydrolyzed and derivatized with o-phthaldialdehyde
(OPA) for fluorometric detection and N-tert-butyloxycarbonyl-L-cysteine
(Boc-L-Cys) for chiral specificity. They found that the Asp58 and Asp151 residues in aged human
αA-crystallin were highly inverted to D-isomers. Inoue et al also reported racemization and
isomerization of N-terminal amyloid-β in Alzheimer’s brain tissues by UPLC-MS/MS analysis with covalent chiral
derivatization[80].
As an alternative to HPLC-based techniques, racemized amino acids can be identified
and quantified using a stereo-selective enzyme. Protein isoaspartyl methyltransferase (PIMT) selectively recognizes L-isoAsp and D-Asp, as described in section 1.4.3.2.
Again, this method does not locate the racemization site. In addition, because it recognizes only L-isoAsp and D-Asp, not D-isoAsp, it can underestimate total damage to the protein or overestimate one specific form of modification.
Currently, the most promising method for detection of racemization exploits the sequence specificity and stereospecificity of antibodies[81, 82]. The antibody-based method is a highly sensitive, high-throughput assay, but its development is rather arduous.
1.5 Protein Crosslinking
1.5.1 Crosslink Formation
Protein crosslinks, as one type of PTM, can arise naturally or through degradation. A few protein crosslinks have been reported so far. Biological crosslinks (e.g. crosslinks formed via transglutaminases, and pentosidine and glucosepane crosslinks) are reported in the literature[83-92]; therefore, the focus here is on crosslinks formed through protein degradation[10, 24, 83].
1.5.1.1 Crosslinks as Degradants
Disulfide exchange Disulfide scrambling, especially under basic conditions, forms abnormal disulfide bonds. High pH deprotonates thiols to form thiolate anions, which initiate thiol-disulfide exchange (Figure 1-6)[93]. Therefore, lowering the pH (e.g. to a typical formulation pH of
~5) can minimize disulfide bond scrambling. A few scrambled disulfides in stressed therapeutic
IgG1s, anti-HER2 and anti-CD11a, were characterized by LC/MS with ETD[94]. Disulfide scrambling is a common issue for proteins containing disulfide bonds; therefore, care should be taken during the protein manufacturing process.
Figure 1-6. Disulfide scrambling under basic conditions to form crosslink degradants[93].
Thioethers Thioether crosslinks were reported to form as protein degradation products, especially under basic conditions, via a dehydroalanine intermediate followed by Michael addition
(Figure 1-7)[10, 13, 14, 83, 95-99]. During thioether formation, cysteine racemization on IgG heavy and light chains was observed[100]. The light chain sequence was reported to affect the rate of thioether formation: rates were faster for IgG1 containing λ light chains than for those containing κ light chains[101]. Mozziconacci et al also observed photolytic
conversion of a disulfide bond in IgG1 to a thioether crosslink via a thiyl radical-dependent mechanism[98]. As such, thioether crosslinks may be a potential issue for the production and formulation of therapeutic disulfide-containing proteins. In this thesis, thioether crosslinks in
IgG2 were formed under basic conditions, and their characterization will be presented in Chapter
Three.
Figure 1-7. The formation of thioether and other related degradants via dehydroalanine followed by Michael addition[10, 13, 14, 83, 95-98].
Succinimide-mediated intermolecular transamidation Covalent dimer formation in insulin has been observed in both aqueous and lyophilized formulations[102-104]. Evidence suggests that it involves rate-limiting formation of a cyclic anhydride intermediate at the C-terminal
AsnA-21, followed by partitioning of the intermediate to form AsnA-21-PheB-1 and AsnA-21-GlyA-1
crosslinks (Figure 1-8)[102-104]. Similar intermolecular amide-linked crosslinking was also
observed in hen egg-white lysozyme[105].
Figure 1-8. Non-disulfide crosslinking in insulin, which arises from the initial formation of a
cyclic anhydride intermediate at the C-terminal Asn followed by the reaction with the N-terminal
free amine of another insulin molecule[103].
Tyrosine-related crosslinks The dityrosine crosslink, a biomarker of oxidative/nitrative stress, is a fluorescent species detected as a photo-degradation product in many proteins (such as insulin and calmodulin)[106, 107]. The dityrosine crosslink formed by UV irradiation of bovine brain calmodulin was believed to be an intermolecular crosslink between Tyr99 and Tyr138[108]. The
mechanism of dityrosine formation begins with the generation of a tyrosyl radical, which then crosslinks to form dityrosine (Figure 1-9A)[106]. In addition, Mozziconacci et al studied the
photodegradation of recombinant human insulin in the solid state[109]. They found
dithiohemiacetal and Tyr-Cys crosslinks by GluC digestion of UV-irradiated human insulin followed by mass spectrometry analysis. UV exposure of solid human insulin results in
photodissociation of the C-terminal intrachain disulfide bond, leading to the formation of a thiyl
radical pair that reacts with a proximal Tyr radical to form a Tyr-Cys crosslink (Figure 1-9A)[109].
Thus, in order to be crosslinked, the amino acid residues involved must be within a certain distance
of each other.
Other tyrosine-related crosslinks were also detected in metal-catalyzed oxidized
recombinant human interferon β-1a and recombinant human insulin (Figure 1-9B)[110, 111].
Tyrosine residues were first oxidized in the presence of copper(II) and ascorbate. The tyrosine
oxidation products then undergo Michael addition initiated by a primary amine group
(Figure 1-9B). This results in the formation of crosslinks, which are most likely responsible for
aggregation[111].
Figure 1-9. (A) Dityrosine and Tyr-Cys crosslinks formed via tyrosyl radical under oxidation
conditions[106, 109]. (B) Other tyrosine-related crosslinks in metal-catalyzed oxidized interferon
β-1a involving tyrosine oxidation followed by Michael addition with primary amines of Lys side chains or N-termini[110, 111].
Histidine-histidine crosslink A histidine-histidine crosslink formed by photo-oxidation was reported in free histidine and histidine-containing peptides (Figure 1-10)[112, 113]. The evidence supports a role for singlet oxygen in the formation of a reactive peroxide
intermediate on exposure of His-containing peptides to light, which leads to the final histidine-histidine
crosslink (Figure 1-10). Most recently, histidine-histidine crosslinks in the hinge region of light-stressed IgG1 have been discovered as protein photodegradation products and characterized in our lab. The details will be discussed in Chapter Four.
Figure 1-10. Histidine-related crosslinks via photo-oxidation[112].
Ditryptophan crosslink A non-disulfide covalent dimer of human superoxide dismutase
1 (hSod1), produced during its bicarbonate-dependent peroxidase activity in vitro, was recently isolated and characterized by coupling 18O-labeling with mass spectrometry
analysis[114]. This covalent dimer was found to consist of two hSod1 subunits crosslinked by a
ditryptophan, which contains a bond between C3 and N1 of the respective Trp32 residues (Figure
1-11)[114]. Carbonate radical was believed to promote the ditryptophan crosslink[114].
Figure 1-11. Dimerization of human superoxide dismutase via a novel oxidative modification, the
ditryptophan crosslink[114].
Formaldehyde-mediated crosslinking Formaldehyde-mediated crosslinking was also
reported to cause significant aggregation of lyophilized tetanus and diphtheria toxoids during
storage[115]. Formaldehyde, used to prepare the toxoid from the native toxin, reacts with Lys
residues to give reactive electrophiles, which react with nucleophiles of a second vaccine molecule to form intermolecular crosslinks (Figure 1-12).
Figure 1-12. Formaldehyde-mediated cross-linking in vaccines where a formaldehyde-modified
electrophile is attacked by nucleophiles to form intermolecular crosslinks[115].
Amide crosslink A crosslinked ribonuclease A (RNase A) dimer composed of monomeric units covalently linked by a single amide bond between the side chains of Lys66 and
Glu9 was generated without chemical reagents and characterized by mass spectrometry (Figure 1-13)[116, 117].
Notably, this dimer shows a two-fold increase in activity over monomeric RNase A[116].
Figure 1-13. RNase dimerization by a single amide bond between Lys66 and Glu9 under vacuum
at 85 °C[117].
1.5.2 Biological and Biopharmaceutical Importance of Crosslinks
Because of the technical challenges in analyzing crosslinks (as described in section
1.5.3), only a limited number of crosslinks (such as thioethers) have been discovered, often serendipitously, and
characterized with painstaking effort. As a result, biological knowledge of protein
crosslinks is limited. This section gives a brief review of the biological and biopharmaceutical importance of crosslinks.
Protein crosslinks play a significant role in protein structure and stability. For example, disulfide bonds between cysteine residues separated in the primary sequence often help to stabilize tertiary structure and thereby affect biological activity and stability[51, 118]. On the other hand, abnormal crosslinks often lead to protein stability problems and immunogenicity.
Aggregation of type I soluble tumor necrosis factor receptor upon photoirradiation was reported to occur via disulfide formation[119]. Aberrant disulfide linkages have also been associated with disease. The mutation of Cys470 to Arg in recombinant human arylsulfatase A (rhASA), which disrupts a disulfide linkage, has been reported in patients with metachromatic leukodystrophy (MLD), an autosomal recessive disease[120].
Dityrosine crosslinking is likely responsible for the dimerization and decreased bioactivity of insulin[107]. Elevated levels of urinary dityrosine have been demonstrated in aging animals and in patients with systemic inflammation[106].
A thioether-crosslinked variant in commercial recombinant human growth hormone (r-hGH) exhibited significantly reduced in vivo biopotency and altered receptor-binding properties compared with a control[96].
Covalent insulin dimers formed through transamidation reactions of AsnA21 and PheB1 accumulate in the circulation of type I diabetic patients undergoing prolonged insulin therapy, accounting for significantly reduced insulin biologic activity[121].
1.5.3 Methods for Detection and Characterization of Crosslinks
1.5.3.1 HPLC with Fluorescence Detection
Dityrosine crosslinks have been detected and quantified by LC with fluorescence detection after acid hydrolysis of proteins[106, 122]. This method provides no structural information on the crosslink site. In addition, it is not suitable for crosslinks that are unstable under conventional acid hydrolysis conditions. Furthermore, HPLC results can be distorted by other molecules that coelute with the target molecule, so additional characterization of the molecule of interest (e.g., by MS) is required.
1.5.3.2 MS-based Method
Despite the excellence of mass spectrometry as an analytical tool for PTMs, identifying crosslinked peptides after proteolytic digestion of crosslinked proteins is very challenging, because crosslinked peptides are often present at substoichiometric levels and are therefore missed during data-dependent LC/MS analysis. Even when crosslinked peptides have been detected, it remains challenging to assign their sequences and locate the sites of crosslinking.
This is because (1) tandem mass spectra of crosslinked peptides are complicated by the presence of two sets of fragment ions; and (2) the masses of crosslinked peptides are not in the database if the crosslink chemistry is unknown. Therefore, traditional database search algorithms and de novo sequencing cannot be used to interpret their tandem mass spectra. In chemical crosslinking, which is widely used to probe protein structure and interactions, the crosslink chemistry is known. A database of the intact masses and tandem mass spectra for all possible combinations of crosslinked peptides can be computed and then correlated with observed spectra to identify both the peptide sequences and the sites of modification. However, this approach fails if the crosslink chemistry is unknown, which is why very few crosslinks have been discovered.
Interpretation of the tandem mass spectra is greatly aided if fragment ions with and without the crosslink site can be distinguished isotopically or chemically. The linear fragment ions can then be used for database searching as well as de novo sequencing[13]. This is described in more detail in our XChem-Finder workflow in Chapters Three and Four. The remainder of this section briefly summarizes the techniques used to facilitate MS analysis of crosslinks.
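As an illustration of the database approach described above for known crosslink chemistry, the following Python sketch enumerates candidate peptide pairs and matches an observed intact mass against them. It is a minimal sketch, not the workflow used in this thesis: the function names, the example peptides, the 10 ppm tolerance, and the restriction to Cys-containing peptides are assumptions made for illustration, and the thioether mass change is taken here as the loss of H2S relative to two free-thiol peptides.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

# Assumed mass change for a thioether link between two Cys residues:
# loss of H2S relative to the two free-thiol peptides (illustrative).
THIOETHER_DELTA = -33.98772


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def crosslink_candidates(peptides, linker_delta, reactive="C"):
    """Map every unordered pair of reactive peptides to its crosslinked mass."""
    eligible = [p for p in peptides if reactive in p]
    return {
        (a, b): peptide_mass(a) + peptide_mass(b) + linker_delta
        for a, b in combinations_with_replacement(eligible, 2)
    }


def match_intact_mass(observed_mass, candidates, tol_ppm=10.0):
    """Return the peptide pairs whose mass agrees with the observed mass."""
    return [
        pair for pair, mass in candidates.items()
        if abs(observed_mass - mass) / mass * 1e6 <= tol_ppm
    ]
```

The lookup depends entirely on `linker_delta`. When the crosslink chemistry is unknown, this value is not available, so a candidate table of this kind cannot be built in advance.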
1.5.3.2.1 C-Terminal 18O-Labeling
Various stable isotope labeling techniques based on metabolic labeling, chemical tagging, or proteolytic 18O-labeling have been developed and used for relative quantitation of changes in protein abundance between two samples, and also for qualitative characterization of differentially labeled proteomes[123-125]. This section focuses on the simple enzyme-catalyzed 18O-labeling.
Protease-catalyzed 18O-labeling relies on proteases (such as trypsin, Lys-C, and Glu-C) to exchange two 16O atoms for two 18O atoms at the C-terminal carboxyl group of each newly formed peptide, resulting in a mass shift of 4 Da[123, 126]. The optimum pH values for the carboxyl oxygen exchange reaction catalyzed by Lys-C and trypsin were reported to be 5 and 6, respectively[126]. Other proteases, such as endoproteinase Lys-N, were reported to incorporate only one 18O atom and to yield spectra in which the isotope peak overlap cannot be resolved[127]. Different peptides incorporate 18O atoms at different rates, which can complicate data analysis and limit applications in quantitative proteomics[128]. In fact, trypsin-catalyzed 18O exchange at the carboxyl terminus is in many instances inhomogeneous or incomplete[129]. In addition, back exchange of the carboxyl oxygens of an 18O-labeled peptide to 16O can occur in 16O-water. Therefore, several approaches, including a high enzyme-to-protein ratio[130], low pH[131], heating[130], and immobilized trypsin[132], were developed to optimize the exchange reaction.
Incorporation of 18O at the newly created C-termini during proteolytic digestion represents a clever approach to specifically detect crosslinked peptides[133]. Four and two 18O atoms are incorporated at the C-termini of a crosslinked peptide and a linear peptide, respectively, resulting in a specific isotopic signature: a mass shift of 8 Da for crosslinked peptides and 4 Da for linear peptides. Because isotope labeling occurs concurrently with the proteolytic digestion of proteins, proteolytic 18O-labeling is simple to perform. In this thesis, incorporation of 18O at the C-termini of peptides during tryptic digestion was used to facilitate the identification of thioether and histidine-histidine crosslinks, and the optimized 18O-labeling method is described in more detail in Chapters Three and Four.
For crosslinks of large molecular weight, 18O-labeling may not produce a sufficient mass difference between the light and heavy forms, which complicates detection. Moreover, since protease-catalyzed 18O-labeling applies only to newly created C-termini, a crosslink containing the C-terminus of the protein has a mass shift of only 4 Da and cannot be distinguished from linear peptides. To overcome this, other strategies, such as using proteases with different specificities or N-terminal modification, can be explored.
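The mass-shift arithmetic in this section can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the 0.02 Da tolerance are assumptions, and the calculation assumes complete exchange of both carboxyl oxygens at every newly created C-terminus. It uses the exact 18O/16O mass difference, so the nominal 4 Da and 8 Da shifts appear as about 4.008 Da and 8.017 Da.

```python
# Monoisotopic mass difference between 18O and 16O (Da).
O18_MINUS_O16 = 17.999160 - 15.994915


def o18_mass_shift(new_c_termini, atoms_per_terminus=2):
    """Mass shift (Da) for a peptide with the given number of new C-termini."""
    return new_c_termini * atoms_per_terminus * O18_MINUS_O16


def mz_spacing(mass_shift, charge):
    """Spacing between light and heavy peaks on the m/z axis."""
    return mass_shift / charge


def classify_shift(light_mass, heavy_mass, tol_da=0.02):
    """Assign a light/heavy pair by the number of labeled C-termini."""
    shift = heavy_mass - light_mass
    if abs(shift - o18_mass_shift(2)) <= tol_da:
        return "two new C-termini"
    if abs(shift - o18_mass_shift(1)) <= tol_da:
        return "one new C-terminus"
    return "unassigned"
```

A crosslinked peptide that includes the original C-terminus of the protein has only one newly created C-terminus, so `o18_mass_shift(1)` applies and the pair is classified the same way as a linear peptide, which is the limitation noted above.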
1.5.3.2.2 N-Terminal Modification
Since crosslinked peptides contain two amino termini, isotopic labeling of α-amino groups with N-terminal modification reagents incorporates two isotopically coded groups in crosslinked peptides versus one in linear peptides (Figure 1-14)[134]. Modification of crosslinked peptides with an equimolar mixture of light and heavy isotopic forms of an amine-reactive reagent, e.g., [2H3]-2,4-dinitrofluorobenzene ([2H3]DNFB), results in a specific triplet of signals separated by the mass difference between the light and heavy isotopic forms of the reagent[134]. The resulting 1:2:1 intensity ratio of these peaks is due to the possible combinations of the different isoforms of the product (LL, LH+HL, HH, where L and H are the light and heavy forms, respectively). In contrast, modification of the single N-terminal amino group of a linear peptide results in a doublet of signals in a 1:1 ratio, corresponding to the L and H isoforms[134]. One complication of this approach is the possible modification of the ε-amino groups of lysine residues, which produces a false-positive isotopic signature for lysine-containing linear peptides. Blocking lysine residues by reductive
methylation and then selectively hydrolyzing to release α-amino termini for derivatization with [2H3]DNFB prior to enzymatic digestion was proposed (Figure 1-14)[134]. The utility of this approach was demonstrated in the characterization of the unique crosslinks of polyubiquitin[134]. However, lysine protection adds a step to sample preparation and also leads to tryptic crosslinked peptides of higher molecular mass because of missed cleavage sites at the modified lysine residues.
Figure 1-14. Isotopic labeling at N-termini via 1) protection of the ε-amino group of lysine; 2) limited proteolysis to generate peptides; 3) specific derivatization of the liberated α-amino group with [2H3]DNFB[134].
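The 1:1 doublet and 1:2:1 triplet described in this section follow from binomial counting of light and heavy labels, which the following Python sketch illustrates. The function names are hypothetical, and the sketch assumes an exactly equimolar light/heavy reagent, complete derivatization of every free amine, one free α-amino group per peptide chain, and a heavy label that differs from the light label by three deuterium atoms, as the name [2H3]DNFB implies.

```python
from math import comb

# Mass difference per label between the heavy (2H3) and light (1H3) reagent (Da).
D3_MINUS_H3 = 3 * (2.014102 - 1.007825)


def reactive_amines(chains, lysines_blocked=True):
    """Count amines that can be labeled: one N-terminus per chain,
    plus every lysine side chain if the lysines are not blocked."""
    n_termini = len(chains)
    if lysines_blocked:
        return n_termini
    return n_termini + sum(chain.count("K") for chain in chains)


def label_pattern(n_amines):
    """Relative intensities for 0..n heavy labels with a 1:1 reagent mixture."""
    return [comb(n_amines, k) for k in range(n_amines + 1)]


def label_offsets(n_amines):
    """Mass offsets (Da) of each peak relative to the all-light species."""
    return [k * D3_MINUS_H3 for k in range(n_amines + 1)]
```

With lysines blocked, a crosslinked peptide with two chains gives `label_pattern(2)`, the 1:2:1 triplet, and a linear peptide gives `label_pattern(1)`, the 1:1 doublet. With lysines unblocked, a linear peptide containing one lysine also has two reactive amines and yields the same triplet, which is the false positive discussed above.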
1.5.3.2.3 Chromatographic Sample Enrichment
Sample enrichment can greatly improve the identification of protein crosslinks by mass spectrometry. Separation based on protein size, e.g., size exclusion chromatography (SEC), was successfully used to enrich crosslinked protein for the characterization of thioether crosslinks in monoclonal antibodies by LC-MS analysis[14, 95].
Also, crosslinked peptides, which have higher charge states than linear peptides, can be enriched by cation exchange chromatography (CEX)[135].
1.5.3.3 Antibody-based Method
Kato and colleagues have developed rabbit polyclonal and mouse monoclonal antibodies to detect dityrosine immunohistochemically in lipofuscin granules in aged human brain and in atherosclerotic lesions in mice[136, 137]. However, immunodetection is semiquantitative, requires extensive sample preparation and antibody purification, and may be confounded by the presence of cross-reacting proteins in the sample of interest.
1.6 Conclusions
Detection and characterization of PTMs remain challenging, especially at low abundance. Highly sensitive MS-based methods are the methods of choice for the analysis of PTMs. Various techniques, as described in this chapter, have been developed to enrich PTMs and facilitate their detection. The demand to detect and characterize low-level PTMs, in order to understand protein degradation pathways, control the quality of therapeutic proteins, and understand the biological importance of PTMs, will drive the continuing effort to develop novel, highly sensitive, and sophisticated PTM identification techniques.
1.7 References
[1] Walsh CT, Garneau-Tsodikova S, Gatto GJ, Jr. Protein posttranslational modifications: the chemistry of proteome diversifications. Angew Chem Int Ed Engl 2005;44:7342-72.
[2] Walsh C. Posttranslational modifications of proteins: Expanding nature's inventory: Roberts and Company Publishers; 2006.
[3] Liu H, Gaza‐Bulseco G, Faldu D, Chumsae C, Sun J. Heterogeneity of monoclonal antibodies. J Pharm Sci 2008;97:2426-47.
[4] Zhang Z, Pan H, Chen X. Mass spectrometry for structural characterization of therapeutic antibodies. Mass Spectrom Rev 2009;28:147-76.
[5] Manning MC, Chou DK, Murphy BM, Payne RW, Katayama DS. Stability of protein pharmaceuticals: an update. Pharm Res 2010;27:544-75.
[6] Kerwin BA, Remmele RL, Jr. Protect from light: photodegradation and protein biologics. J Pharm Sci 2007;96:1468-79.
[7] Volkin DB, Mach H, Middaugh CR. Degradative covalent reactions important to protein stability. Mol Biotechnol 1997;8:105-22.
[8] Manning MC, Patel K, Borchardt RT. Stability of protein pharmaceuticals. Pharm Res 1989;6:903-18.
[9] Pattison DI, Rahmanto AS, Davies MJ. Photo-oxidation of proteins. Photochem Photobiol Sci 2012;11:38-53.
[10] Wang W, Singh S, Zeng DL, King K, Nema S. Antibody structure, instability, and formulation. J Pharm Sci 2007;96:1-26.
[11] Larsen MR, Trelle MB, Thingholm TE, Jensen ON. Analysis of posttranslational modifications of proteins by tandem mass spectrometry. Biotechniques 2006;40:790-8.
[12] Stone KL, Williams KR. Enzymatic digestion of proteins in gels for mass spectrometric identification and structural analysis. Curr Protoc Protein Sci 2004;Chapter 11:Unit 11 3.
[13] Liu M, Zhang Z, Zang T, Spahr C, Cheetham J, Ren D, et al. Discovery of undefined protein cross-linking chemistry: a comprehensive methodology utilizing 18O-labeling and mass spectrometry. Anal Chem 2013;85:5900-8.
[14] Tous GI, Wei Z, Feng J, Bilbulian S, Bowen S, Smith J, et al. Characterization of a novel modification to monoclonal antibodies: thioether cross-link of heavy and light chains. Anal Chem 2005;77:2675-82.
[15] Wang B, Malik R, Nigg EA, Korner R. Evaluation of the low-specificity protease elastase for large-scale phosphoproteome analysis. Anal Chem 2008;80:9526-33.
[16] Swaney DL, Wenger CD, Coon JJ. Value of using multiple proteases for large-scale mass spectrometry-based proteomics. J Proteome Res 2010;9:1323-9.
[17] Mohammed S, Lorenzen K, Kerkhoven R, Breukelen Bv, Vannini A, Cramer P, et al. Multiplexed proteomics mapping of yeast RNA polymerase II and III allows near-complete sequence coverage and reveals several novel phosphorylation sites. Anal Chem 2008;80:3584-92.
[18] Choudhary G, Wu S-L, Shieh P, Hancock WS. Multiple enzymatic digestion for enhanced sequence coverage of proteins in complex proteomic mixtures using capillary LC with ion trap MS/MS. J Proteome Res 2003;2:59-67.
[19] An HJ, Peavy TR, Hedrick JL, Lebrilla CB. Determination of N-glycosylation sites and site heterogeneity in glycoproteins. Anal Chem 2003;75:5628-37.
[20] Yang H, Zubarev RA. Mass spectrometric analysis of asparagine deamidation and aspartate isomerization in polypeptides. Electrophoresis 2010;31:1764-72.
[21] Brennan TV, Clarke S. Effect of adjacent histidine and cysteine residues on the spontaneous degradation of asparaginyl- and aspartyl-containing peptides. Int J Pept Protein Res 1995;45:547-53.
[22] Radkiewicz JL, Zipse H, Clarke S, Houk KN. Neighboring side chain effects on asparaginyl and aspartyl degradation: an ab initio study of the relationship between peptide conformation and backbone NH acidity. J Am Chem Soc 2001;123:3499-506.
[23] Stephenson RC, Clarke S. Succinimide formation from aspartyl and asparaginyl peptides as a model for the spontaneous degradation of proteins. J Biol Chem 1989;264:6164-70.
[24] Kroon DJ, Baldwin-Ferro A, Lalan P. Identification of sites of degradation in a therapeutic monoclonal antibody by peptide mapping. Pharm Res 1992;9:1386-93.
[25] McCudden CR, Kraus VB. Biochemistry of amino acid racemization and clinical application to musculoskeletal disease. Clin Biochem 2006;39:1112-30.
[26] Liu M, Cheetham J, Cauchon N, Ostovic J, Ni W, Ren D, et al. Protein isoaspartate methyltransferase-mediated 18O-labeling of isoaspartic acid for mass spectrometry analysis. Anal Chem 2012;84:1056-62.
[27] Robinson NE, Robinson AB. Deamidation of human proteins. Proc Natl Acad Sci U S A 2001;98:12409-13.
[28] Robinson NE. Protein deamidation. Proc Natl Acad Sci U S A 2002;99:5283-8.
[29] Fujii N, Kaji Y, Fujii N. D-Amino acids in aged proteins: analysis and biological relevance. J Chromatogr B Analyt Technol Biomed Life Sci 2011;879:3141-7.
[30] Fujii N, Kaji Y, Fujii N, Nakamura T, Motoie R, Mori Y, et al. Collapse of homochirality of amino acids in proteins from various tissues during aging. Chem Biodivers 2010;7:1389-97.
[31] Zhang J, Yip H, Katta V. Identification of isomerization and racemization of aspartate in the Asp-Asp motifs of a therapeutic protein. Anal Biochem 2011;410:234-43.
[32] Young GW, Hoofring SA, Mamula MJ, Doyle HA, Bunick GJ, Hu Y, et al. Protein L-isoaspartyl methyltransferase catalyzes in vivo racemization of Aspartate-25 in mammalian histone H2B. J Biol Chem 2005;280:26094-8.
[33] Mori Y, Aki K, Kuge K, Tajima S, Yamanaka N, Kaji Y, et al. UV B-irradiation enhances the racemization and isomerizaiton of aspartyl residues and production of Nepsilon-carboxymethyl lysine (CML) in keratin of skin. J Chromatogr B Analyt Technol Biomed Life Sci 2011;879:3303-9.
[34] Won JI, Meagher RJ, Barron AE. Characterization of glutamine deamidation in a long, repetitive protein polymer via bioconjugate capillary electrophoresis. Biomacromolecules 2004;5:618-27.
[35] Kim E, Lowenson JD, MacLaren DC, Clarke S, Young SG. Deficiency of a protein-repair enzyme results in the accumulation of altered proteins, retardation of growth, and fatal seizures in mice. Proc Natl Acad Sci U S A 1997;94:6132-7.
[36] Noguchi S. Structural changes induced by the deamidation and isomerization of asparagine revealed by the crystal structure of Ustilago sphaerogena ribonuclease U2B. Biopolymers 2010;93:1003-10.
[37] Friedman AR, Ichhpurani AK, Brown DM, Hillman RM, Krabill LF, Martin RA, et al. Degradation of growth hormone releasing factor analogs in neutral aqueous solution is related to deamidation of asparagine residues. Replacement of asparagine residues by serine stabilizes. Int J Pept Protein Res 1991;37:14-20.
[38] Huang L, Lu J, Wroblewski VJ, Beals JM, Riggin RM. In vivo deamidation characterization of monoclonal antibody by LC/MS/MS. Anal Chem 2005;77:1432-9.
[39] Nilsson MR, Driscoll M, Raleigh DP. Low levels of asparagine deamidation can have a dramatic effect on aggregation of amyloidogenic peptides: implications for the study of amyloid formation. Protein Sci 2002;11:342-9.
[40] Takata T, Oxford JT, Demeler B, Lampi KJ. Deamidation destabilizes and triggers aggregation of a lens protein, betaA3-crystallin. Protein Sci 2008;17:1565-75.
[41] Charache S, Fox J, McCurdy P, Kazazian H, Jr., Winslow R, Hathaway P, et al. Postsynthetic deamidation of hemoglobin Providence (beta 82 Lys replaced by Asn, Asp) and its effect on oxygen transport. J Clin Invest 1977;59:652-8.
[42] Curnis F, Longhi R, Crippa L, Cattaneo A, Dondossola E, Bachi A, et al. Spontaneous formation of L-isoaspartate and gain of function in fibronectin. J Biol Chem 2006;281:36466-76.
[43] Takehara T, Takahashi H. Suppression of Bcl-xL deamidation in human hepatocellular carcinomas. Cancer Res 2003;63:3054-7.
[44] Shimizu T, Watanabe A, Ogawara M, Mori H, Shirasawa T. Isoaspartate formation and neurodegeneration in Alzheimer's disease. Arch Biochem Biophys 2000;381:225-34.
[45] Doyle HA, Gee RJ, Mamula MJ. Altered immunogenicity of isoaspartate containing proteins. Autoimmunity 2007;40:131-7.
[46] Yang ML, Doyle HA, Gee RJ, Lowenson JD, Clarke S, Lawson BR, et al. Intracellular protein modification associated with altered T cell functions in autoimmunity. J Immunol 2006;177:4541-9.
[47] Mamula MJ, Gee RJ, Elliott JI, Sette A, Southwood S, Jones PJ, et al. Isoaspartyl post-translational modification triggers autoimmune responses to self-proteins. J Biol Chem 1999;274:22321-7.
[48] Lowenson JD, Kim E, Young SG, Clarke S. Limited accumulation of damaged proteins in l-isoaspartyl (D-aspartyl) O-methyltransferase-deficient mice. J Biol Chem 2001;276:20695-702.
[49] Sadakane Y, Yamazaki T, Nakagomi K, Akizawa T, Fujii N, Tanimura T, et al. Quantification of the isomerization of Asp residue in recombinant human alpha A-crystallin by reversed-phase HPLC. J Pharm Biomed Anal 2003;30:1825-33.
[50] Fujii N, Tajima S, Tanaka N, Fujimoto N, Takata T, Shimo-Oka T. The presence of D-beta-aspartic acid-containing peptides in elastic fibers of sun-damaged skin: a potent marker for ultraviolet-induced skin aging. Biochem Biophys Res Commun 2002;294:1047-51.
[51] Beck A, Wagner-Rousset E, Ayoub D, Van Dorsselaer A, Sanglier-Cianferani S. Characterization of therapeutic antibodies and related products. Anal Chem 2013;85:715-36.
[52] Edman P. Sequence determination. Mol Biol Biochem Biophys 1970;8:211-55.
[53] Di Donato A, Ciardiello MA, de Nigris M, Piccoli R, Mazzarella L, D'Alessio G. Selective deamidation of ribonuclease A. Isolation and characterization of the resulting isoaspartyl and aspartyl derivatives. J Biol Chem 1993;268:4745-51.
[54] Johnson BA, Shirokawa JM, Hancock WS, Spellman MW, Basa LJ, Aswad DW. Formation of isoaspartate at two distinct sites during in vitro aging of human growth hormone. J Biol Chem 1989;264:14262-71.
[55] Schurter BT, Aswad DW. Analysis of isoaspartate in peptides and proteins without the use of radioisotopes. Anal Biochem 2000;282:227-31.
[56] Lehmann WD, Schlosser A, Erben G, Pipkorn R, Bossemeyer D, Kinzel V. Analysis of isoaspartate in peptides by electrospray tandem mass spectrometry. Protein Sci 2000;9:2260-8.
[57] Cournoyer JJ, Lin C, O'Connor PB. Detecting deamidation products in proteins by electron capture dissociation. Anal Chem 2006;78:1264-71.
[58] Sargaeva NP, Lin C, O'Connor PB. Identification of aspartic and isoaspartic acid residues in amyloid beta peptides, including Abeta1-42, using electron-ion reactions. Anal Chem 2009;81:9778-86.
[59] Cournoyer JJ, Pittman JL, Ivleva VB, Fallows E, Waskell L, Costello CE, et al. Deamidation: Differentiation of aspartyl from isoaspartyl products in peptides by electron capture dissociation. Protein Sci 2005;14:452-63.
[60] Chan WY, Chan TW, O'Connor PB. Electron transfer dissociation with supplemental activation to differentiate aspartic and isoaspartic residues in doubly charged peptide cations. J Am Soc Mass Spectrom 2010;21:1012-5.
[61] Ni W, Dai S, Karger BL, Zhou ZS. Analysis of isoaspartic Acid by selective proteolysis with Asp-N and electron transfer dissociation mass spectrometry. Anal Chem 2010;82:7485-91.
[62] Sargaeva NP, Lin C, O'Connor PB. Differentiating N-terminal aspartic and isoaspartic acid residues in peptides. Anal Chem 2011;83:6675-82.
[63] Dai S, Ni W, Patananan AN, Clarke SG, Karger BL, Zhou ZS. Integrated proteomic analysis of major isoaspartyl-containing proteins in the urine of wild type and protein L-isoaspartate O-methyltransferase-deficient mice. Anal Chem 2013;85:2423-30.
[64] Harris RJ, Kabakoff B, Macchi FD, Shen FJ, Kwong M, Andya JD, et al. Identification of multiple sources of charge heterogeneity in a recombinant antibody. J Chromatogr B Biomed Sci Appl 2001;752:233-45.
[65] Zhang W, Czupryn MJ. Analysis of isoaspartate in a recombinant monoclonal antibody and its charge isoforms. J Pharm Biomed Anal 2003;30:1479-90.
[66] Sargaeva NP, Goloborodko AA, O'Connor PB, Moskovets E, Gorshkov MV. Sequence-specific predictive chromatography to assist mass spectrometric analysis of asparagine deamidation and aspartate isomerization in peptides. Electrophoresis 2011;32:1962-9.
[67] Chelius D, Rehder DS, Bondarenko PV. Identification and characterization of deamidation sites in the conserved regions of human immunoglobulin gamma antibodies. Anal Chem 2005;77:6004-11.
[68] Xiao G, Bondarenko PV, Jacob J, Chu GC, Chelius D. 18O labeling method for identification and quantification of succinimide in proteins. Anal Chem 2007;79:2714-21.
[69] Chu GC, Chelius D, Xiao G, Khor HK, Coulibaly S, Bondarenko PV. Accumulation of succinimide in a recombinant monoclonal antibody in mildly acidic buffers under elevated temperatures. Pharm Res 2007;24:1145-56.
[70] Li X, Cournoyer JJ, Lin C, O'Connor PB. Use of 18O labels to monitor deamidation during protein and peptide sample processing. J Am Soc Mass Spectrom 2008;19:855-64.
[71] Gaza-Bulseco G, Li B, Bulseco A, Liu HC. Method to differentiate asn deamidation that occurred prior to and during sample preparation of a monoclonal antibody. Anal Chem 2008;80:9491-8.
[72] Liu H, Wang F, Xu W, May K, Richardson D. Quantitation of asparagine deamidation by isotope labeling and liquid chromatography coupled with mass spectrometry analysis. Anal Biochem 2013;432:16-22.
[73] Du Y, Wang F, May K, Xu W, Liu H. Determination of deamidation artifacts introduced by sample preparation using 18O-labeling and tandem mass spectrometry analysis. Anal Chem 2012;84:6355-60.
[74] Wang S, Kaltashov IA. An 18O-labeling assisted LC/MS method for assignment of aspartyl/isoaspartyl products from Asn deamidation and Asp isomerization in proteins. Anal Chem 2013;85:6446-52.
[75] Rehder DS, Chelius D, McAuley A, Dillon TM, Xiao G, Crouse-Zeineddini J, et al. Isomerization of a single aspartyl residue of anti-epidermal growth factor receptor immunoglobulin gamma2 antibody highlights the role avidity plays in antibody activity. Biochemistry 2008;47:2518-30.
[76] Zhang W, Czupryn JM, Boyle PT, Jr., Amari J. Characterization of asparagine deamidation and aspartate isomerization in recombinant human interleukin-11. Pharm Res 2002;19:1223-31.
[77] Alfaro JF, Gillies LA, Sun HG, Dai S, Zang T, Klaene JJ, et al. Chemo-enzymatic detection of protein isoaspartate using protein isoaspartate methyltransferase and hydrazine trapping. Anal Chem 2008;80:3882-9.
[78] Waite ER, Collins MJ, Ritz-Timme S, Schutz HW, Cattaneo C, Borrman HI. A review of the methodological aspects of aspartic acid racemization analysis for use in forensic science. Forensic Sci Int 1999;103:113-24.
[79] Fujii N, Satoh K, Harada K, Ishibashi Y. Simultaneous stereoinversion and isomerization at specific aspartic acid residues in alpha A-crystallin from human lens. J Biochem 1994;116:663-9.
[80] Inoue K, Hosaka D, Mochizuki N, Akatsu H, Tsutsumiuchi K, Hashizume Y, et al. Simultaneous Determination of Post-Translational Racemization and Isomerization of N-Terminal Amyloid-beta in Alzheimer's Brain Tissues by Covalent Chiral Derivatized Ultraperformance Liquid Chromatography Tandem Mass Spectrometry. Anal Chem 2014;86:797-804.
[81] Sakai K, Homma H, Lee JA, Fukushima T, Santa T, Tashiro K, et al. Localization of D-aspartic acid in elongate spermatids in rat testis. Arch Biochem Biophys 1998;351:96-105.
[82] Sakai K, Homma H, Lee JA, Fukushima T, Santa T, Tashiro K, et al. D-aspartic acid localization during postnatal development of rat adrenal gland. Biochem Biophys Res Commun 1997;235:433-6.
[83] Liu H, May K. Disulfide bond structures of IgG molecules: structural variations, chemical modifications and possible impacts to stability and biological function. MAbs 2012;4:17-23.
[84] Heck T, Faccio G, Richter M, Thony-Meyer L. Enzyme-catalyzed protein crosslinking. Appl Microbiol Biotechnol 2013;97:461-75.
[85] Griffin M, Casadio R, Bergamini CM. Transglutaminases: nature's biological glues. Biochem J 2002;368:377-96.
[86] Lucero HA, Kagan HM. Lysyl oxidase: an oxidative enzyme and effector of cell function. Cell Mol Life Sci 2006;63:2304-16.
[87] Spasser L, Brik A. Chemistry and biology of the ubiquitin signal. Angew Chem Int Ed Engl 2012;51:6840-62.
[88] Baynes JW. Role of oxidative stress in development of complications in diabetes. Diabetes 1991;40:405-12.
[89] Nagaraj RH, Shipanova IN, Faust FM. Protein cross-linking by the Maillard reaction. Isolation, characterization, and in vivo detection of a lysine-lysine cross-link derived from methylglyoxal. J Biol Chem 1996;271:19338-45.
[90] Sell DR, Monnier VM. Structure elucidation of a senescence cross-link from human extracellular matrix. Implication of pentoses in the aging process. J Biol Chem 1989;264:21597-602.
[91] Sell DR, Biemel KM, Reihl O, Lederer MO, Strauch CM, Monnier VM. Glucosepane is a major protein cross-link of the senescent human extracellular matrix. Relationship with diabetes. J Biol Chem 2005;280:12310-5.
[92] Monnier VM, Mustata GT, Biemel KL, Reihl O, Lederer MO, Zhenyu D, et al. Cross-linking of the extracellular matrix by the maillard reaction in aging and diabetes: an update on "a puzzle nearing resolution". Ann N Y Acad Sci 2005;1043:533-44.
[93] Trivedi MV, Laurence JS, Siahaan TJ. The role of thiols and disulfides on protein stability. Curr Protein Pept Sci 2009;10:614-25.
[94] Wang Y, Lu Q, Wu SL, Karger BL, Hancock WS. Characterization and comparison of disulfide linkages and scrambling patterns in therapeutic monoclonal antibodies: using LC-MS with electron transfer dissociation. Anal Chem 2011;83:3133-40.
[95] Cohen SL, Price C, Vlasak J. Beta-elimination and peptide bond hydrolysis: two distinct mechanisms of human IgG1 hinge fragmentation upon storage. J Am Chem Soc 2007;129:6976-7.
[96] Lispi M, Datola A, Bierau H, Ceccarelli D, Crisci C, Minari K, et al. Heterogeneity of commercial recombinant human growth hormone (r-hGH) preparations containing a thioether variant. J Pharm Sci 2009;98:4511-24.
[97] Datola A, Richert S, Bierau H, Agugiaro D, Izzo A, Rossi M, et al. Characterisation of a novel growth hormone variant comprising a thioether link between Cys182 and Cys189. ChemMedChem 2007;2:1181-9.
[98] Mozziconacci O, Kerwin BA, Schoneich C. Exposure of a monoclonal antibody, IgG1, to UV-light leads to protein dithiohemiacetal and thioether cross-links: a role for thiyl radicals? Chem Res Toxicol 2010;23:1310-2.
[99] Fradkin AH, Mozziconacci O, Schoneich C, Carpenter JF, Randolph TW. UV photodegradation of murine growth hormone: chemical analysis and immunogenicity consequences. Eur J Pharm Biopharm 2014;87:395-402.
[100] Zhang Q, Flynn GC. Cysteine racemization on IgG heavy and light chains. J Biol Chem 2013;288:34325-35.
[101] Zhang Q, Schenauer MR, McCarter JD, Flynn GC. IgG1 thioether bond formation in vivo. J Biol Chem 2013;288:16371-82.
[102] Darrington RT, Anderson BD. Evidence for a common intermediate in insulin deamidation and covalent dimer formation: effects of pH and aniline trapping in dilute acidic solutions. J Pharm Sci 1995;84:275-82.
[103] Strickley RG, Anderson BD. Solid-state stability of human insulin. II. Effect of water on reactive intermediate partitioning in lyophiles from pH 2-5 solutions: stabilization against covalent dimer formation. J Pharm Sci 1997;86:645-53.
[104] Brange J, Hallund O, Sorensen E. Chemical stability of insulin. 5. Isolation, characterization and identification of insulin transformation products. Acta Pharm Nord 1992;4:223-32.
[105] Desfougeres Y, Jardin J, Lechevalier V, Pezennec S, Nau F. Succinimidyl residue formation in hen egg-white lysozyme favors the formation of intermolecular covalent bonds without affecting its tertiary structure. Biomacromolecules 2011;12:156-66.
[106] DiMarco T, Giulivi C. Current analytical methods for the detection of dityrosine, a biomarker of oxidative stress, in biological samples. Mass Spectrom Rev 2007;26:108-20.
[107] Correia M, Neves-Petersen MT, Jeppesen PB, Gregersen S, Petersen SB. UV-light exposure of insulin: pharmaceutical implications upon covalent insulin dityrosine dimerization and disulphide bond photolysis. PLoS One 2012;7:e50733.
[108] Malencik DA, Anderson SR. Dityrosine formation in calmodulin. Biochemistry 1987;26:695-704.
[109] Mozziconacci O, Haywood J, Gorman EM, Munson E, Schoneich C. Photolysis of recombinant human insulin in the solid state: formation of a dithiohemiacetal product at the C-terminal disulfide bond. Pharm Res 2012;29:121-33.
[110] Torosantucci R, Mozziconacci O, Sharov V, Schoneich C, Jiskoot W. Chemical modifications in aggregates of recombinant human insulin induced by metal-catalyzed oxidation: covalent cross-linking via michael addition to tyrosine oxidation products. Pharm Res 2012;29:2276-93.
[111] Torosantucci R, Sharov VS, van Beers M, Brinks V, Schoneich C, Jiskoot W. Identification of oxidation sites and covalent cross-links in metal catalyzed oxidized interferon Beta-1a: potential implications for protein aggregation and immunogenicity. Mol Pharm 2013;10:2311-22.
[112] Agon VV, Bubb WA, Wright A, Hawkins CL, Davies MJ. Sensitizer-mediated photooxidation of histidine residues: evidence for the formation of reactive side-chain peroxides. Free Radic Biol Med 2006;40:698-710.
[113] Kang P, Foote CS. Photosensitized oxidation of 13C,15N-labeled imidazole derivatives. J Am Chem Soc 2002;124:9629-38.
[114] Medinas DB, Gozzo FC, Santos LF, Iglesias AH, Augusto O. A ditryptophan cross-link is responsible for the covalent dimerization of human superoxide dismutase 1 during its bicarbonate-dependent peroxidase activity. Free Radic Biol Med 2010;49:1046-53.
[115] Schwendeman SP, Costantino HR, Gupta RK, Siber GR, Klibanov AM, Langer R. Stabilization of tetanus and diphtheria toxoids against moisture-induced aggregation. Proc Natl Acad Sci U S A 1995;92:11234-8.
[116] Simons BL, Kaplan H, Fournier SM, Cyr T, Hefford MA. A novel cross-linked RNase A dimer with enhanced enzymatic properties. Proteins 2007;66:183-95.
[117] Simons BL, King MC, Cyr T, Hefford MA, Kaplan H. Covalent cross-linking of proteins without chemical reagents. Protein Sci 2002;11:1558-64.
[118] Zhang T, Zhang J, Hewitt D, Tran B, Gao X, Qiu ZJ, et al. Identification and characterization of buried unpaired cysteines in a recombinant monoclonal IgG1 antibody. Anal Chem 2012;84:7112-23.
[119] Roy S, Mason BD, Schoneich CS, Carpenter JF, Boone TC, Kerwin BA. Light-induced aggregation of type I soluble tumor necrosis factor receptor. J Pharm Sci 2009;98:3182- 99.
[120] Coulter-Mackie MB, Gagnier L. Spectrum of mutations in the arylsulfatase A gene in a Canadian DNA collection including two novel frameshift mutations, a new missense mutation (C488R) and an MLD mutation (R84Q) in cis with a pseudodeficiency allele. Mol Genet Metab 2003;79:91-8.
[121] Robbins DC, Hirshman M, Wardzala LJ, Horton ES. High-molecular-weight aggregates of therapeutic insulin. In vitro measurements of receptor binding and bioactivity. Diabetes 1988;37:56-9.
[122] Malencik DA, Anderson SR. Dityrosine as a product of oxidative stress and fluorescent probe. Amino Acids 2003;25:233-47.
[123] Miyagi M, Rao KC. Proteolytic 18O-labeling strategies for quantitative proteomics. Mass Spectrom Rev 2007;26:121-36.
[124] Wang YK, Ma Z, Quinn DF, Fu EW. Inverse 18O labeling mass spectrometry for the rapid identification of marker/target proteins. Anal Chem 2001;73:3742-50.
49
[125] Ye X, Luke B, Andresson T, Blonder J. 18O stable isotope labeling in MS-based proteomics. Brief Funct Genomic Proteomic 2009;8:136-44.
[126] Hajkova D, Rao KC, Miyagi M. pH dependency of the carboxyl oxygen exchange reaction catalyzed by lysyl endopeptidase and trypsin. J Proteome Res 2006;5:1667-73.
[127] Rao KC, Palamalai V, Dunlevy JR, Miyagi M. Peptidyl-Lys metalloendopeptidase- catalyzed 18O labeling for comparative proteomics: application to cytokine/lipolysaccharide-treated human retinal pigment epithelium cell line. Mol Cell Proteomics 2005;4:1550-7.
[128] Ramos-Fernandez A, Lopez-Ferrer D, Vazquez J. Improved method for differential expression proteomics using trypsin-catalyzed 18O labeling with a correction for labeling efficiency. Mol Cell Proteomics 2007;6:1274-86.
[129] Ye X, Luke BT, Johann DJ, Jr., Ono A, Prieto DA, Chan KC, et al. Optimized method for computing (18)O/(16)O ratios of differentially stable-isotope labeled peptides in the context of postdigestion (18)O exchange/labeling. Anal Chem 2010;82:5878-86.
[130] Petritis BO, Qian WJ, Camp DG, 2nd, Smith RD. A simple procedure for effective quenching of trypsin activity and prevention of 18O-labeling back-exchange. J Proteome Res 2009;8:2157-63.
[131] Stewart, II, Thomson T, Figeys D. 18O labeling: a tool for proteomics. Rapid Commun Mass Spectrom 2001;15:2456-65.
[132] Sevinsky JR, Brown KJ, Cargile BJ, Bundy JL, Stephenson JL, Jr. Minimizing back exchange in 18O/16O quantitative proteomics experiments by incorporation of immobilized trypsin into the initial digestion step. Anal Chem 2007;79:2158-62.
[133] Back JW, Notenboom V, de Koning LJ, Muijsers AO, Sixma TK, de Koster CG, et al. Identification of cross-linked peptides for protein interaction studies using mass spectrometry and 18O labeling. Anal Chem 2002;74:4417-22.
[134] Chen X, Chen YH, Anderson VE. Protein cross-links: universal isolation and characterization by isotopic derivatization and electrospray ionization mass spectrometry. Anal Biochem 1999;273:192-203.
[135] Rinner O, Seebacher J, Walzthoeni T, Mueller LN, Beck M, Schmidt A, et al. Identification of cross-linked peptides from large sequence databases. Nat Methods 2008;5:315-8.
[136] Kato Y, Maruyama W, Naoi M, Hashizume Y, Osawa T. Immunohistochemical detection of dityrosine in lipofuscin pigments in the aged human brain. FEBS Lett 1998;439:231-4.
50
[137] Kato Y, Wu X, Naito M, Nomura H, Kitamoto N, Osawa T. Immunochemical detection of protein dityrosine in atherosclerotic lesion of apo-E-deficient mice using a novel monoclonal antibody. Biochem Biophys Res Commun 2000;275:11-5.
51
Chapter 2: Protein Isoaspartate Methyltransferase-Mediated 18O-Labeling of
Isoaspartic Acid for Mass Spectrometry Analysis
Reproduced with permission from “Min Liu, Janet Cheetham, Nina Cauchon, Judy
Ostovic, Wenqin Ni, Da Ren, and Zhaohui Sunny Zhou. Protein Isoaspartate
Methyltransferase-Mediated 18O-Labeling of Isoaspartic Acid for Mass Spectrometry
Analysis. Analytical Chemistry, 2012, 84, 1056-1062”. Copyright [2012] American
Chemical Society.
Co-authors’ work in this chapter: Min Liu: experimental design and execution, data analysis,
manuscript writing and revision; Janet Cheetham: manuscript writing and revision, grant support;
Nina Cauchon: manuscript writing and revision; Judy Ostovic: manuscript writing and revision;
Wenqin Ni: PIMT purification, manuscript writing and revision; Da Ren: idea contribution,
experimental design, manuscript writing and revision; Zhaohui Sunny Zhou: idea contribution,
experimental design, data analysis, manuscript writing and revision and grant support.
2.1 Abstract
Arising from spontaneous aspartic acid (Asp) isomerization or asparagine (Asn)
deamidation, isoaspartic acid (isoAsp, isoD or beta-Asp) is a ubiquitous non-enzymatic
modification of proteins and peptides. Because isoaspartyl and aspartyl species have identical
masses, sensitive and specific detection of isoAsp, particularly in complex samples,
remains challenging. Here we report a novel assay for Asp isomerization by isotopic labeling
with 18O via a two-step process: the isoAsp peptide is first specifically methylated by protein
isoaspartate methyltransferase (PIMT, EC 2.1.1.77) to the corresponding methyl ester, which is
subsequently hydrolyzed in 18O-water to regenerate isoAsp. The specific replacement of 16O with 18O at isoAsp leads to a mass shift of 2 Da, which can be automatically and unambiguously
recognized using standard mass spectrometry, such as collision-induced dissociation (CID), and
data analysis algorithms. Detection and site identification of several isoAsp peptides in a
monoclonal antibody and the β-delta sleep-inducing peptide (DSIP) are demonstrated.
2.2 Introduction
The non-enzymatic post-translational formation of isoaspartic acid (isoAsp, isoD or beta-
Asp) in oligopeptides (Scheme 2-1), arising from either the isomerization of aspartic acid (Asp)
or the deamidation of asparagine (Asn) via a common succinimide intermediate, is one of the major chemical degradation pathways both in vivo and in vitro. The formation of isoAsp via Asp isomerization and Asn deamidation in a protein alters its structure by inserting an extra methylene group into the peptide backbone[1] and, in doing so, may change protein activity or trigger an immunologic response[2, 3]. As such, this protein post-translational modification
(PTM) plays critical roles in biological processes, human diseases and protein pharmaceutical
development. For example, isoAsp level is elevated in amyloid-beta peptides in Alzheimer’s
disease[4]. IsoAsp accumulates over time and thus is associated with aging, perhaps acting as a
molecular clock[5-9]. In addition, significant amounts of isoAsp are also commonly observed in protein pharmaceuticals and represent a major contributor to heterogeneity, particularly after
long-term storage[10-12]. Typically, pH is one of the critical factors affecting Asn
deamidation and Asp isomerization; both processes are also reported to depend on the primary
sequence, higher-order structure and formulation[10, 11, 13-15].
[Scheme 2-1 graphic: asparagine (Asn) loses NH3 (deamidation) and aspartic acid (Asp) loses H2O (isomerization) to give a common succinimide intermediate; hydrolysis of the succinimide (H2O) yields isoaspartic acid (isoAsp).]
Scheme 2-1. Formation of isoAsp from the isomerization of aspartic acid (Asp) or the
deamidation of asparagine (Asn).
For analysis, it is challenging to differentiate isoAsp from Asp or Asn (particularly the former), as isoAsp and Asp have identical mass and bear similar charge and structure. Current approaches for isoAsp analysis include chemical (e.g., Edman degradation), immunological, enzymatic (e.g., isoQuant), and instrumental methods (e.g., chromatography and mass spectrometry)[16, 17]. High performance liquid chromatography (HPLC) coupled with mass spectrometry/collision induced dissociation (LC-MS/CID) is commonly used for the characterization of protein modifications; however, MS/CID often fails to differentiate isoAsp
and Asp[17]. Recently, O’Connor and others have demonstrated that electron transfer
dissociation (ETD)/electron capture dissociation (ECD) mass spectrometry is able to distinguish
isoAsp from Asp peptides based on a pair of characteristic reporter ions of isoAsp (c•+58 and
z−57)[16-20]. However, the peak intensity of this single pair of diagnostic ions of isoAsp may
vary under different conditions, often requiring manual inspection of the spectral data and
making assignment ambiguous when isoAsp is present at low abundance; moreover, multiply
charged precursor ions are required as well, limiting the scope of this method.
Another commonly used assay is based on protein isoaspartate O-methyltransferase
(PIMT or PCMT, EC 2.1.1.77). This enzyme specifically transfers a methyl group from S-adenosyl-L-methionine (SAM or AdoMet) to isoAsp, generating S-adenosyl-homocysteine
(SAH or AdoHcy) and the corresponding isoaspartate methyl esters (Scheme 2-2)[5, 21-24]. As
a result, the amount of isoAsp can be deduced from quantifying AdoHcy, the byproduct of
methylation. The other methylation product, isoAsp methyl ester, is labile and spontaneously
cyclizes to aspartyl succinimide (Asu), which is also labile under most conditions used for the analysis of peptides and proteins[25, 26]. Therefore, the information on the specific location of isoAsp in peptides is often lost, as the methyl esters and succinimides typically hydrolyze rapidly
back to isoAsp and Asp. To overcome these limitations, trapping the labile isoaspartyl methyl esters and succinimides with hydrazines or hydroxylamines has been developed for isoAsp detection[27, 28]. However, the conversion of methyl esters to hydrazides or hydroxamic acids is not stoichiometric, as water present in the reagent solutions competes with the trapping reaction. On the other hand, as discussed below, the hydrolysis reaction in 18O-water represents an attractive method for isoAsp labeling.
Scheme 2-2. Isotopic labeling of isoaspartic acid via protein isoaspartyl methyltransferase
(PIMT)-catalyzed S-adenosyl-methionine (SAM or AdoMet)-dependent methylation and hydrolysis of the resulting methyl ester and succinimide in 18O-water.
Scheme 2-3. Identification of isoAsp peptides by mass spectrometry using the mass increase of 2
Da imparted by 18O-labeling.
Stable isotope labeling combined with mass spectrometry analysis is a powerful tool for identification and quantification because isotope labeling causes no detectable change in retention time, ionization efficiency or fragmentation pattern[29-37]. For example, Fenselau and coworkers have developed a general strategy to label C-termini of peptides using proteases and 18O-water for peptide quantification and identification[30, 35, 36].
The use of 18O to quantitate succinimide and to track deamidation during sample handling has
also been reported[29, 31-34]. Herein we describe a novel isoAsp assay that couples PIMT-
mediated methylation with 18O-labeling followed by LC-MS analysis (Scheme 2-2 & 2-3). In
the first step, PIMT specifically methylates isoAsp to form isoaspartate methyl ester.
Subsequently, the labile methyl ester spontaneously cyclizes to the succinimide intermediate
(Asu) which then hydrolyzes in 18O-water to produce 18O-labeled isoAsp[29, 31-34, 38]. The
incorporation of 18O shifts the mass of the modified residue by 2 and 3 Da relative to Asp and Asn, respectively. This allows facile screening and site determination of isoAsp using standard mass spectrometry techniques (such as CID) and data analysis algorithms. Using our method, several isoAsp peptides in a recombinant monoclonal antibody and a synthetic peptide were detected and
the sites of isoAsp were identified. In addition, aspartyl succinimide (Asu) and isoAsp can be
distinguished by 18O-labeling in the presence and absence of PIMT. Because mass spectrometry
with reasonably high resolution can distinguish isoAsp from Asn (a mass increase of 0.984 Da),
our method focuses on the analysis of isoAsp from Asp (no change in mass)—the smallest
protein post-translational modification.
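The nominal mass differences quoted above can be checked from monoisotopic atomic masses. The following is a minimal sketch for illustration only; the helper name and the dictionaries are not part of the original work, and the residue compositions are the standard ones for Asp/isoAsp (C4H5NO3) and Asn (C4H6N2O2).

```python
# Monoisotopic masses (Da) of the isotopes involved.
MASS = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.00307401,
    "O": 15.99491462,    # 16O
    "O18": 17.99915961,  # 18O
}

def residue_mass(composition):
    """Sum the monoisotopic masses for an elemental composition dict."""
    return sum(MASS[element] * count for element, count in composition.items())

# Asp and isoAsp are isomers, so they share one residue composition.
ASP = {"C": 4, "H": 5, "N": 1, "O": 3}
ASN = {"C": 4, "H": 6, "N": 2, "O": 2}
# isoAsp after one 16O has been replaced by 18O.
ISOASP_18O = {"C": 4, "H": 5, "N": 1, "O": 2, "O18": 1}

shift_vs_asp = residue_mass(ISOASP_18O) - residue_mass(ASP)  # about 2.004 Da
shift_vs_asn = residue_mass(ISOASP_18O) - residue_mass(ASN)  # about 2.988 Da
deamidation = residue_mass(ASP) - residue_mass(ASN)          # about 0.984 Da

print(f"18O-isoAsp vs Asp: {shift_vs_asp:.4f} Da")
print(f"18O-isoAsp vs Asn: {shift_vs_asn:.4f} Da")
print(f"Asn -> Asp/isoAsp: {deamidation:.4f} Da")
```

The 2 and 3 Da values in the text are therefore nominal; the exact shifts are slightly larger than 2 Da and slightly smaller than 3 Da.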
2.3 Experimental Section
2.3.1 Chemicals
All chemicals were reagent grade or above. Guanidine hydrochloride (GndHCl) and S-
adenosyl-methionine hydrochloride (AdoMet or SAM) were purchased from Sigma (St. Louis,
MO, USA). 18O-water (97%) was obtained from Cambridge Isotope Laboratories (Andover,
MA, USA). β-Delta sleep-inducing peptides (Asp-DSIP, Trp-Ala-Gly-Gly-Asp-Ala-Ser-Gly-
Glu, and isoAsp-DSIP, Trp-Ala-Gly-Gly-isoAsp-Ala-Ser-Gly-Glu) trifluoroacetate salts were
purchased from Bachem America (King of Prussia, PA, USA). Immobilized trypsin was
purchased from Thermo Scientific (Rockford, IL, USA). Recombinant protein L-isoaspartyl-O-
methyltransferase (PIMT) was obtained as previously described[39]. Recombinant monoclonal antibody, anti-streptavidin immunoglobulin gamma 1 (IgG1), was produced in Chinese hamster ovary (CHO) cells, purified according to standard manufacturing procedures, formulated at a concentration of 20 mg/mL in 50 mM acetate buffer at pH 5.2, and stored at -70 °C at Amgen
(Thousand Oaks, CA, USA).
2.3.2 Generation of isoAsp
The Asp-DSIP peptides were dissolved to a final concentration of 1 mg/mL in 0.1 M acetate buffer at pH 4.8 and stored at 50 °C for 3 days. After being exchanged into 0.1 M acetate buffer at pH 4, the antibody IgG1 was incubated at 45 °C for 1 month.
2.3.3 Reduction, Alkylation, and Tryptic Digestion of IgG1
IgG1 (20 mg/mL) was diluted to 1 mg/mL in the denaturation buffer (7.5 M guanidine
HCl, 2 mM EDTA and 0.25 M Tris-HCl, pH 7.5) to a final volume of 0.5 mL. Reduction was
accomplished with the addition of 3 μL of 0.5 M dithiothreitol (DTT) followed by 30 min
incubation at room temperature. S-Carboxymethylation was achieved with the addition of 7 μL
of 0.5 M iodoacetic acid (IAA); the reaction was carried out in the dark for 15 min at room
temperature. Excess iodoacetic acid was quenched with the addition of 4 μL of 0.5 M DTT.
Reduced and alkylated IgG1 samples were exchanged into the digestion buffer (0.1 M Tris-HCl
at pH 7.5) using a NAP-5 size-exclusion column (GE Healthcare, Piscataway, NJ, USA). Then,
150 µL of immobilized trypsin suspension was centrifuged at 900 x g for 10 seconds and the
supernatant was removed followed by three washes, each with 1 mL of 0.1 M Tris-HCl at pH
7.5. The washed immobilized trypsin was mixed with 300 μL of the reduced, alkylated, and buffer-exchanged antibody to achieve a 1:2 (v/v) enzyme/substrate ratio before incubation at 37
°C for 45 min. Subsequently, 300 μL of acetonitrile was added and the supernatant was collected after centrifuging at 900 x g for 10 seconds. Then, 200 μL of the supernatant was dried and then reconstituted into 98 μL of 18O-water for methylation as described next.
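The volumes in this procedure imply the following approximate working concentrations. This is an illustrative sketch, assuming that volumes are additive and that each reagent is added to the full 0.5 mL sample; the function names are hypothetical.

```python
def stock_volume_ul(stock_conc, final_conc, final_volume_ul):
    """Volume of stock needed for a dilution (C1*V1 = C2*V2)."""
    return final_conc * final_volume_ul / stock_conc

def final_conc_mm(stock_molar, added_ul, volume_before_ul):
    """Final concentration (mM) after adding a small volume of stock."""
    return 1000.0 * stock_molar * added_ul / (volume_before_ul + added_ul)

# 20 mg/mL IgG1 diluted to 1 mg/mL in 0.5 mL of denaturation buffer.
igg_ul = stock_volume_ul(20.0, 1.0, 500.0)

# 3 uL of 0.5 M DTT added to 500 uL, then 7 uL of 0.5 M IAA added to 503 uL.
dtt_mm = final_conc_mm(0.5, 3.0, 500.0)
iaa_mm = final_conc_mm(0.5, 7.0, 503.0)

print(f"IgG1 stock: {igg_ul:.1f} uL, DTT: {dtt_mm:.2f} mM, IAA: {iaa_mm:.2f} mM")
```

Under these assumptions the reduction step runs at roughly 3 mM DTT and the alkylation step at roughly 7 mM iodoacetic acid.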
2.3.4 Methylation Catalyzed by PIMT
Methylation reactions were carried out in a final volume of 100 μL containing 0.1 M
Tris-HCl at pH 7.5, 7 μM of DSIP peptides or the tryptic digest of IgG1, 120 μM of AdoMet and
15.9 μM of PIMT at 37 °C for 30 min. To quench methylation, guanidine HCl (GndHCl, 10 M)
was added to a final concentration of 2.3 M. Note that the above buffer and reagents
were prepared in 18O-water instead of normal water.
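The volume of 10 M guanidine HCl needed to reach the stated quench concentration follows from a simple mass balance. The sketch below is illustrative only; it assumes additive volumes and a 100 μL reaction, and the function name is hypothetical.

```python
def quench_volume_ul(stock_molar, target_molar, sample_ul):
    """Stock volume that brings the mixture to target_molar.

    Solves stock * v / (sample + v) = target for v.
    """
    if target_molar >= stock_molar:
        raise ValueError("target must be below the stock concentration")
    return target_molar * sample_ul / (stock_molar - target_molar)

gnd_ul = quench_volume_ul(10.0, 2.3, 100.0)
check_molar = 10.0 * gnd_ul / (100.0 + gnd_ul)

print(f"Add {gnd_ul:.1f} uL of 10 M GndHCl (final {check_molar:.2f} M)")
```

For a 100 μL reaction this corresponds to about 30 μL of the 10 M stock.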
2.3.5 18O-Labeling
Sodium bicarbonate buffer at pH 8.5 in 18O-water was prepared by drying 100 μL of 1
M sodium bicarbonate solution at pH 8.5 followed by reconstitution into the same volume of
18O-water. To initiate the hydrolysis of succinimide, 60 μL of 1 M sodium bicarbonate-18O-
water was added to the methylated peptide solution described above. The reaction was then
conducted at 37 °C for 30 min.
2.3.6 HPLC
The separation of the DSIP peptides was carried out on an XBridge C18 column (150 x
2.1 mm, 3.5 µm, Waters, Milford, MA, USA) at column temperature of 50 °C with a flow rate of
200 μL/min. Mobile phase A consisted of 0.1% formic acid in water while mobile phase B
contained 0.085% formic acid in 90% acetonitrile. A linear gradient was applied by increasing
mobile phase B from 0 to 50% in 60 min. The injection volume was 25 μL. Chromatographic
profiles were monitored by UV absorption at 215 nm.
Tryptic digests of IgG1 sample were separated on a Polaris Ether C18 column (250 x 2.1
mm, 3 μm, Varian, Palo Alto, CA, USA) at column temperature of 50 °C with a flow rate of 200
μL/min. Mobile phase A was 0.1% trifluoroacetic acid in water while mobile phase B contained
0.085% trifluoroacetic acid in 90% acetonitrile. A linear gradient was applied by increasing
mobile phase B from 0 to 50% in 195 min. The injection volume was 100 μL. Elution profiles
were monitored by UV absorption at 215 nm.
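For orientation, the nominal mobile-phase composition at a given time in either linear gradient can be estimated as follows. This is a sketch only: it ignores the delay (dwell) volume of the HPLC system, and the function names are hypothetical.

```python
def percent_b(t_min, gradient_min, start_b=0.0, end_b=50.0):
    """Nominal %B at time t_min for a linear gradient."""
    if not 0.0 <= t_min <= gradient_min:
        raise ValueError("time outside the programmed gradient")
    return start_b + (end_b - start_b) * t_min / gradient_min

def percent_acetonitrile(pct_b, acn_in_b=90.0):
    """Acetonitrile content of the mixture; mobile phase B is 90% acetonitrile."""
    return pct_b * acn_in_b / 100.0

# Midpoint of the 60 min DSIP gradient.
b_dsip = percent_b(30.0, 60.0)
acn_dsip = percent_acetonitrile(b_dsip)

# A peptide eluting near 100 min in the 195 min IgG1 gradient.
b_igg = percent_b(100.07, 195.0)

print(f"DSIP, 30 min: {b_dsip:.1f}% B ({acn_dsip:.1f}% acetonitrile)")
print(f"IgG1, 100.07 min: {b_igg:.1f}% B")
```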
2.3.7 Mass Spectrometry
LXQ and LTQ Orbitrap mass spectrometers (ThermoFisher Scientific, San Jose, CA,
USA) were used in-line with HPLC systems (Agilent 1100, Palo Alto, CA, USA) for the stressed
Asp-DSIP and IgG1 samples, respectively. LXQ was operated with a full scan, zoom scan and
data-independent MS/MS scan. The spray voltage was 5 kV, and the capillary temperature was
280 °C. For LTQ Orbitrap, a high resolution full MS scan at 60,000 resolution (at m/z 400), followed by data-dependent MS/MS scans of the top three most abundant ions, was set up to
acquire both the mass and the sequence information. The spray voltage was 5 kV, and the
capillary temperature was 300 °C. Both instruments were tuned using the doubly charged ion of
the synthetic peptide, Bradykinin. The MS/MS spectra were obtained using normalized collision
energy of 35%. Mass Analyzer software developed in-house was used for peptide identification
and sequencing[40, 41]. Extracted ion chromatograms (XIC) were used to quantify the relative
amount of isoAsp peptide and its Asp isomer.
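Relative quantification from extracted ion chromatograms amounts to normalizing the integrated peak areas. A minimal sketch follows; the peak areas are hypothetical values chosen for illustration, and equal response of the two isomers is assumed.

```python
def relative_percent(areas):
    """Convert XIC peak areas into percentages of their total."""
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in areas.items()}

# Hypothetical integrated XIC areas (arbitrary units).
xic_areas = {"isoAsp": 1.84e6, "Asp": 1.816e7}
percentages = relative_percent(xic_areas)

for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```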
2.4 Results and Discussion
As expected from the general approach outlined in Scheme 2-2 & 2-3, isoAsp peptides
were observed to be completely labeled with 18O via the sequential methylation and hydrolysis in 18O-water, as evident from the mass shift of 2 Da in the isotopic distribution (see Figure 2-1). By searching for potential 2 Da mass increases for each Asp residue using a standard data mining algorithm, e.g., MassAnalyzer, several 18O-labeled isoAsp peptides from the IgG1 sample were
automatically identified. Furthermore, the precise locations of isoAsp were readily and
unambiguously established by tandem mass spectrometry using CID.
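Why a residue-specific label localizes the site can be seen by listing which CID fragments contain the labeled residue: only those b and y ions carry the 2 Da increase. The sketch below is an illustration, not the MassAnalyzer algorithm; the function name is hypothetical, and the example is the heavy chain peptide 271-284 with the label at its sixth residue (D276).

```python
def ions_carrying_label(sequence, site_index):
    """Return the b- and y-ion numbers that contain the labeled residue.

    site_index is the 1-based position of the labeled residue in the peptide.
    b_i holds the first i residues and y_i holds the last i residues.
    """
    n = len(sequence)
    if not 1 <= site_index <= n:
        raise ValueError("site_index outside the peptide")
    b_ions = [i for i in range(1, n) if i >= site_index]
    y_ions = [i for i in range(1, n) if i >= n - site_index + 1]
    return b_ions, y_ions

# Heavy chain peptide 271-284; D276 is residue 6 of the peptide.
b_shifted, y_shifted = ions_carrying_label("FNWYVDGVEVHNAK", 6)

print("b ions shifted by 2 Da:", b_shifted)
print("y ions shifted by 2 Da:", y_shifted)
```

Fragments below these thresholds (b1 to b5 and y1 to y8 in this example) keep their unlabeled masses, so the transition between unshifted and shifted ions pinpoints the modified residue.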
[Figure 2-1 image: mass spectra, relative abundance vs. m/z. (A) isoAsp-DSIP (with 18O tag), most abundant peak at m/z 851.36; Asp-DSIP (without 18O tag), most abundant peak at m/z 849.36. (B) isoAsp-LC69-108 (with 18O tag), labeled peaks include m/z 1405.9721, 1406.3067, 1406.6407 and 1406.9748; Asp-LC69-108 (without 18O tag), labeled peaks include m/z 1405.3058, 1405.6402, 1405.9743 and 1406.3083.]
Figure 2-1. Isotopic distribution of a singly charged DSIP peptide with/without 18O tag (A) and a
triply charged tryptic peptide LC69-108 from the IgG1 sample with/without 18O tag (B). Their
sequences with alkylated cysteines are
69SGTSASLAITGLQAEDEADYYCQSYisoDSSLSGLYVFGTGTK108 and
69SGTSASLAITGLQAEDEADYYCQSYDSSLSGLYVFGTGTK108, respectively.
2.4.1 Methylation of isoAsp
The specificity of the PIMT-catalyzed methylation of isoAsp residues has been
extensively investigated, indicating that only isoAsp residues, but not Asp, are recognized[2, 5,
9, 10, 24]. In agreement with the literature, no methyl ester or succinimide was observed for the
Asp-DSIP peptide (Figure 2-2).
[Figure 2-2 image: chromatograms, relative abundance vs. time (14-20 min). Top trace, Asp-DSIP sample: Asp (m/z 849.37). Bottom trace, isoAsp-DSIP sample: succinimide (m/z 831.42), 18O-isoAsp (m/z 851.45), 18O-Asp (m/z 851.43) and methyl ester (m/z 863.33).]
Figure 2-2. Specificity of PIMT-mediated 18O-labeling shown in the Asp-DSIP and isoAsp-DSIP
samples. Methylation via PIMT specifically occurs at isoAsp peptide (bottom trace), not Asp
peptide (top trace). The corresponding methyl ester is labile and converts to the succinimide
intermediate spontaneously. Hydrolysis of succinimide in 18O-water forms 18O-isoAsp (major
product) and 18O-Asp (minor product).
As illustrated in Scheme 2-2, the degree of methylation of isoAsp peptides dictates the
overall yield of 18O incorporation into isoAsp residues. Therefore, methylation efficiency was optimized before attempting 18O-labeling. As with a typical enzymatic transformation,
methylation is faster at higher PIMT concentration, so a relatively high concentration of
PIMT (16 μM) was employed. In addition, PIMT is known to be sensitive to feedback inhibition
from the product, S-adenosyl-homocysteine (AdoHcy or SAH)[42], so an excess of AdoMet (the
methyl donor) was also used. If necessary, product inhibition can be further alleviated by the
addition of AdoHcy nucleosidase (EC 3.2.2.9), as has been previously demonstrated[43-45]. In
addition, pH, temperature, detergents (e.g., SDS) or chaotropic reagents (e.g., guanidine-HCl) may
affect PIMT enzyme activity; therefore, care was taken during sample preparation. Under our
conditions, methylation of isoAsp was complete, as evidenced by the disappearance of the isoAsp peptide peak and the concomitant appearance of peaks for its methyl ester and succinimide peptide (Figure 2-3). Moreover, complete methylation is also supported by the near stoichiometric incorporation of 18O into each isoAsp residue in isoAsp-DSIP and the tryptic
peptides from the IgG1 antibody described next (Figure 2-1 and Figure 2-3).
[Figure 2-3 image. (A) Chromatograms, relative abundance vs. time (11-20 min), with relative peak amounts:
Mixture prior to PIMT/18O labeling: 81.4% Asp, 18.6% isoAsp
No hydrolysis: 75.3% Asp, 15.3% methyl ester, 9.4% succinimide
Hydrolysis at pH 9/30 min: 88.2% Asp, 11.8% 18O-isoAsp
Hydrolysis at pH 9/3 hrs: 88.3% Asp, 11.7% 18O-isoAsp
Hydrolysis at pH 9/24 hrs: 87.9% Asp, 12.1% 18O-isoAsp
(B) Overlaid isotopic distributions (m/z 847-857) of unlabeled isoAsp (m/z 849.30) and 18O-isoAsp (m/z 851.34) after hydrolysis at pH 9 for 30 min, 3 hrs and 24 hrs.]
Figure 2-3. The mixture of isoAsp-DSIP and DSIP peptide was analyzed by the PIMT/18O-labeling method. Complete conversion of isoAsp to its corresponding methyl ester and succinimide was evidenced by the disappearance of the isoAsp peak. Complete hydrolysis in the next step was supported by the disappearance of the methyl ester and succinimide peaks in buffer at pH 9 after 30 min. No significant change in isoAsp was observed when the sample was exposed to buffer at pH 9 and
37 °C for 30 min, 3 hrs and 24 hrs (A). No isotopic distortion was observed when the sample was exposed for 24 hrs (B).
To ensure all isoAsp residues are accessible to PIMT, the IgG1 was first digested by trypsin prior to methylation. Immobilized trypsin was used and removed from the peptides after digestion for several reasons: first and foremost, to prevent 18O-labeling at the C-termini of peptides[30, 35, 36]; second, to prevent proteolysis of PIMT. For comparison, the reduced and alkylated protein was methylated/labeled in 18O-water first and then digested with trypsin.
Similar results were found (data not shown), suggesting that all isoAsp residues in this particular protein are accessible to the PIMT enzyme.
It is worth noting that isoAsp residues at C- or N-termini are not methylated by PIMT, thus precluding their detection by our approach. On the other hand, it has been shown that isoAsp is refractory to the proteolytic digestion by most endoproteases (such as Asp-N and trypsin)[16, 46, 47], so isoAsp is unlikely to be at the termini of the digested peptides. If such peptides do exist, N-terminal isoAsp can be detected by other methods, such as ETD mass spectrometry, as recently described[48].
2.4.2 Hydrolysis and 18O Incorporation
As illustrated in Scheme 2-2, the isoaspartate methyl ester spontaneously converts to
succinimide which is hydrolyzed in 18O-water into isoAsp and Asp peptides, resulting in 18O incorporation. The mass spectra of 18O-labeled isoAsp-DSIP and IgG1 peptides showed greater
than 95% isotope incorporation (Figure 2-1 and Figure 2-3), which is crucial to the subsequent
mass spectrometric analysis. Considering that 18O-water contained ~ 3% 16O-water, our results
again indicate that isoAsp was completely methylated under these conditions.
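The reasoning in this paragraph can be made explicit: the observed 18O incorporation cannot exceed the isotopic enrichment of the water, so the ratio of the two gives a lower bound on the methylated fraction. The sketch below is illustrative; it assumes one oxygen is introduced per hydrolysis and that the 18O-water is not diluted by residual normal water, and the function name is hypothetical.

```python
def min_methylated_fraction(observed_18o_fraction, water_enrichment=0.97):
    """Lower bound on the methylated fraction of isoAsp.

    Assumes the labeled fraction equals the methylated fraction multiplied
    by the 18O enrichment of the water.
    """
    if not 0.0 < water_enrichment <= 1.0:
        raise ValueError("enrichment must be in (0, 1]")
    return min(1.0, observed_18o_fraction / water_enrichment)

# More than 95% incorporation observed with 97% 18O-water.
lower_bound = min_methylated_fraction(0.95)

print(f"At least {100 * lower_bound:.1f}% of isoAsp was methylated")
```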
Buffer pH and hydrolysis time are critical parameters for completely converting succinimide
to 18O-isoAsp/Asp with minimal artifacts. As shown in Figure 2-4, the hydrolysis of methyl
ester and succinimide was faster at higher pH, though we suspect that the
abundance of deamidation artifacts from sample treatment might be increased at higher pH.
Therefore, the hydrolysis conditions were examined, and pH 8.5-9.0, 37 °C and 30 min were
found to be optimal. Under these conditions, both isoaspartate methyl ester and succinimide
peaks disappeared with the concomitant appearance of 18O-labeled isoAsp and Asp peptide peaks
(see Figure 2-3). As with other peptides, the isoAsp species was the preferred hydrolysis
product over its Asp counterpart, in a ratio of about 3:1. The presence of 18O-labeled Asp
peptides does not affect the detection of 18O-labeled isoAsp peptides and, in fact, may provide
secondary confirmation, as 18O in both isoAsp and Asp species serves as a telltale sign of the
existence of isoAsp in the original samples.
[Figure 2-4 image: chromatograms, relative abundance vs. time (11-20 min). (A) Hydrolysis for 30 min at pH 8.1, 8.5 and 9.0; at pH 8.1 the succinimide (m/z 831.51) and methyl ester (m/z 865.57) peaks remain alongside 18O-isoAsp (m/z 851.47) and 18O-Asp (m/z 851.49). (B) Hydrolysis at pH 8.5 for 15, 30 and 45 min; at 15 min the succinimide (m/z 831.44) and methyl ester (m/z 863.58) peaks remain alongside 18O-isoAsp (m/z 851.53) and 18O-Asp (m/z 851.39).]
Figure 2-4. The effects of pH (A) and incubation time (B) on the hydrolysis of the succinimide and methyl ester. Complete hydrolysis was observed at pH 8.5-9.0 at 37 °C after 30 min.
Since deamidation of Asn and isomerization of Asp may happen spontaneously during the sample handling process, the degree of background reactions should be measured[31, 32].
Prolonged incubation and harsh conditions should be avoided. Under our conditions, no isoAsp
formation from Asp was detected when the samples were incubated at pH 9 and 37 °C for 30 min or 24 hrs
(Figure 2-5). Additionally, deamidation of Asn (even at the NG “hot spots”) occurred only to a small degree within 30 min, thus having little practical effect on the identification of isoAsp from Asp via tandem mass spectrometry.
[Figure 2-5 image. (A) Chromatograms, relative abundance vs. time (12-19 min): marker trace with Asp (m/z 849.44), isoAsp (m/z 849.52) and a peak at m/z 831.43; DSIP/pH9/30min (m/z 849.40); DSIP/pH9/24hrs (m/z 849.47). (B) Overlaid isotopic distributions (m/z 847-857) of DSIP STD, DSIP/pH9/30min and DSIP/pH9/24hrs, with peaks at m/z 849.28, 850.26, 851.28 and 852.24.]
Figure 2-5. Stability of Asp-DSIP during sample treatment. Neither isoAsp-DSIP formation nor
18O-incorporation was observed when Asp-DSIP was exposed to pH 9 buffer at 37 °C for 24 hrs.
The 18O-labeled isoAsp peptides may be methylated by PIMT again[38], leading to the
incorporation of two 18O atoms. This phenomenon is similar to the protease-catalyzed 18O-labeling of C-termini
of peptides that Fenselau’s group has developed[30, 35, 36]. This, however, does not affect the
database search and site identification of labeled isoAsp. To simplify data analysis,
incorporation of two 18O atoms can be minimized by quenching PIMT activity (e.g., by adding guanidine) after the methylation step, as demonstrated in Figure 2-6. Because the stoichiometry may not be precisely controlled, such analysis should be treated not as absolute but rather as semi-quantitative.
[Figure 2-6 image. (A) Chromatograms, relative abundance vs. time (11-20 min): marker trace with 18O-isoAsp (m/z 851.47), 18O-Asp (m/z 851.49), succinimide (m/z 833.58) and methyl ester (m/z 865.41); 0.24 M GndHCl; 1.25 M GndHCl. (B) Mass spectra (m/z 848-856): unlabeled isoAsp (most abundant peak m/z 849.34); 18O-isoAsp with 0.24 M GndHCl (most abundant peak m/z 851.36, with additional peaks at m/z 852.38, 853.40 and 854.36); 18O-isoAsp with 1.25 M GndHCl (most abundant peak m/z 851.40).]
Figure 2-6. Guanidine HCl (Gnd-HCl, 1.25 M) quenched PIMT activity during hydrolysis and thereby minimized the incorporation of two 18O-atoms into isoAsp peptides. The blue arrow indicates that two 18O-atoms were incorporated into some peptides when 0.24 M guanidine HCl was used.
2.4.3 Screening of 18O-Labeled isoAsp by Mass Spectrometry
Compared to unlabeled isoAsp, 18O-labeling at isoAsp results in a 2 Da mass increase that
can be easily detected with standard mass spectrometry (for work flow, see Scheme 2-3). As
shown in Figure 2-1A, the 18O-labeled isoAsp-DSIP peptide was detected as a singly charged ion
at m/z 851, 2 Da higher than that from the unlabeled Asp-DSIP peptide (m/z 849). Similarly, a
shift of 0.6663 m/z was observed for the triply charged peptide LC 69-108 from IgG1 (Figure 2-
1B). The high 18O-labeling efficiency (near completion) resulted in a clean shift of the isotopic pattern, thereby enabling automatic recognition of the 18O-labeled isoAsp peptides using a
standard data analysis algorithm. For example, multiple isoAsp peptides in IgG1 were readily identified in this manner, some of which are listed in Table 2-1. Therefore, the method reported
here is suitable for automatic, high throughput screening of isoAsp. 17O-water can also be used
for labeling, but is less desirable due to its higher cost and smaller mass shift (1 Da) on the
labeled peptides.
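The screening step can be illustrated with the observed values in Table 2-1: for a peptide of charge z, one 18O label moves the peak by 2.0042/z in m/z. The sketch below is a simplified stand-in, not the MassAnalyzer algorithm; the pairing function and the 0.005 m/z tolerance are assumptions made for illustration.

```python
DELTA_18O = 17.99915961 - 15.99491462  # mass difference 18O - 16O, in Da

def expected_shift(charge):
    """Expected m/z shift for one 18O label at the given charge state."""
    return DELTA_18O / charge

def find_labeled_pairs(features, tol=0.005):
    """Pair features of equal charge that differ by one 18O label.

    features is a list of (m/z, charge) tuples; each returned tuple is
    (unlabeled m/z, labeled m/z, charge).
    """
    pairs = []
    for mz_light, z_light in features:
        for mz_heavy, z_heavy in features:
            if z_light != z_heavy:
                continue
            if abs((mz_heavy - mz_light) - expected_shift(z_light)) <= tol:
                pairs.append((mz_light, mz_heavy, z_light))
    return pairs

# Observed m/z values and charges from Table 2-1.
observed = [
    (840.4055, 2), (839.4036, 2),
    (938.4655, 2), (937.4632, 2),
    (1405.9721, 3), (1405.3058, 3),
]
pairs = find_labeled_pairs(observed)

for light, heavy, z in pairs:
    print(f"z={z}: {light} -> {heavy} (shift {heavy - light:.4f})")
```

For the triply charged LC 69-108 peptide the expected shift is about 0.668 m/z, in line with the observed 0.6663.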
Table 2-1. Representative isoAsp containing peptides detected in IgG1.

Retention    Charge  Obs. m/z    Cal. m/z    IsoD site*       Peak ratio in the     Peptide sequence
time (min)                                                    stressed sample (%)
98.41        2       840.4055    840.4073    HC, D276         9.2                   271FNWYVisoDGVEVHNAK284
100.07       2       839.4036    839.4052    HC, D276         90.8                  271FNWYVDGVEVHNAK284
114.75       2       938.4655    938.4672    HC, D397         0.6                   389TTPPVLDSisoDGSFFLYSK405
117.12       2       937.4632    937.4651    HC, D397         99.4                  389TTPPVLDSDGSFFLYSK405
131.24       3       1405.9721   1405.9707   LC, D94 (CDR)    1.4                   69SGTSASLAITGLQAEDEADYYCQSYisoDSSLSGLYVFGTGTK108
135.25       3       1405.3058   1405.3026   LC, D94 (CDR)    98.6                  69SGTSASLAITGLQAEDEADYYCQSYDSSLSGLYVFGTGTK108

*Note: LC, HC, and CDR are light chain, heavy chain, and complementary-determining region, respectively. Cysteine was alkylated.
2.4.4 Co-elution of isoAsp and Asp and Overlapping of Isotope Patterns
As shown in Table 2-1 and Figures 2-2 to 2-6, isoAsp peptide and its Asp counterpart are
fully or partially resolved by liquid chromatography. However, occasionally they may co-
elute[16, 49]. Under such a scenario, the isotopic envelope of the unlabeled (16O) Asp peptide
peak overlaps with that of the labeled (18O) isoAsp species, potentially complicating the analysis
of the isotope pattern. Although the mixed isotope patterns can always be deconvoluted[30, 34,
50], as has been done in the analysis of succinimide and deamidation by 18O-labeling[29, 32], the practical issue is whether the intensity from the 18O-species is sufficiently high (e.g., when
isoAsp is in low abundance) so that the 2 Da mass shift can be automatically recognized by data analysis algorithms. In addition, the deconvolution of the mixed isotope patterns complicates
isoAsp analysis. Remediation includes changing chromatographic conditions for peptide
separation or using different proteases to generate isoAsp and Asp peptides with different
sequences. A more direct and surefire approach, as we have reported, is to treat the sample with
endoprotease Asp-N (EC 3.4.24.33), which cleaves peptides at the N-terminal side of Asp but
not isoAsp residues[16, 46, 47]. As such, sample treatment with Asp-N has been used as an
effective method to selectively remove Asp peptides from their isoAsp counterparts, enriching
isoAsp species for subsequent analysis[16].
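To make the co-elution problem concrete, the sketch below splits a mixed isotope pattern into an unlabeled (16O) component and an 18O-labeled component by linear least squares, treating the labeled envelope as the unlabeled envelope shifted by two isotope peaks. It is a minimal illustration and not the algorithm of refs [30, 34, 50]; the envelope intensities are hypothetical, and a real analysis would use the theoretical isotope distribution computed from the peptide's elemental composition.

```python
def deconvolve_18o(observed, envelope, shift=2):
    """Least-squares fit of observed = a * envelope + b * (envelope shifted by `shift`).

    Returns (a, b): the unlabeled and 18O-labeled contributions.
    """
    n = len(observed)
    u = [envelope[i] if i < len(envelope) else 0.0 for i in range(n)]
    lab = [u[i - shift] if i >= shift else 0.0 for i in range(n)]
    # Normal equations for the two-parameter linear model.
    suu = sum(x * x for x in u)
    sll = sum(x * x for x in lab)
    sul = sum(x * y for x, y in zip(u, lab))
    suo = sum(x * y for x, y in zip(u, observed))
    slo = sum(x * y for x, y in zip(lab, observed))
    det = suu * sll - sul * sul
    a = (suo * sll - sul * slo) / det
    b = (suu * slo - sul * suo) / det
    return a, b


# Hypothetical relative intensities of the M, M+1, ... peaks of an unlabeled peptide.
envelope = [1.00, 0.90, 0.50, 0.20, 0.06]

# Synthetic co-eluting mixture: 90% Asp (unlabeled) and 10% isoAsp (18O-labeled).
padded = envelope + [0.0, 0.0]
shifted = [0.0, 0.0] + envelope
mixed = [0.9 * p + 0.1 * s for p, s in zip(padded, shifted)]

a, b = deconvolve_18o(mixed, envelope)
print(f"unlabeled: {a:.3f}, labeled: {b:.3f}, isoAsp fraction: {b / (a + b):.3f}")
```

With a low-abundance labeled species, the fitted labeled contribution rests on small intensity differences at the M+2 peak and beyond, which is why removing the Asp peptide with Asp-N is the more robust route.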
2.4.5 Identification of isoAsp Sites in 18O-Labeled Peptides
In addition to screening for isoAsp in a given peptide, the precise location of the modification can be readily deduced by tandem mass spectrometry with high confidence, because the specific 18O
incorporation imparts a 2 Da mass increase on the fragment ions (such as b and y ions in
CID mode) that contain isoAsp (Figures 2-7 to 2-10). For example, the isoAsp modification
site in a tryptic peptide from the IgG1 sample was identified as isoAsp276 from its MS/MS
data (Figure 2-7). The mass increment of 2 Da corresponding to 18O incorporation was evident
in the y9-y12 and b6-b13 ion series, with intensities comparable to those of the corresponding peaks from the unlabeled (16O)
peptide, leading to unambiguous identification of the modification site. In comparison,
ETD/ECD mass spectrometry distinguishes isoAsp from Asp peptides based on only a single
pair of characteristic reporter ions of isoAsp (c•+58 and z−57), whose intensities also vary
under different conditions[16-20]. As another example, the third peptide in Table 2-1 is located in the light chain (LC) of IgG1 at amino acid positions 69-108 and contains 40 amino acids with three closely positioned Asp residues, posing a challenge for site identification. Nevertheless, the modification site was automatically detected and unambiguously established to be isoAsp94 based on the isotopic patterns conferred by 18O incorporation, again exemplifying the utility of
our method.
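The site-localization logic can be sketched in a few lines: given the peptide length and the position of the labeled residue, the b-ions from that position onward and the y-ions long enough to reach it should carry the +2 Da shift. The helper below is an illustrative assumption (not part of the original data analysis) and considers only ions up to length n−1.

```python
def labeled_fragments(peptide_length, site):
    """Return indices of b- and y-ions that contain residue `site` (1-based)."""
    # b_k contains residues 1..k, so it is labeled when k >= site.
    b_ions = list(range(site, peptide_length))
    # y_k contains the last k residues, so it is labeled when k >= n - site + 1.
    y_ions = list(range(peptide_length - site + 1, peptide_length))
    return b_ions, y_ions


peptide = "FNWYVDGVEVHNAK"   # HC 271-284 (Table 2-1)
site = 276 - 271 + 1          # position of D276 within the peptide
assert peptide[site - 1] == "D"

b_ions, y_ions = labeled_fragments(len(peptide), site)
print("labeled b-ions:", b_ions)
print("labeled y-ions:", y_ions)
```

For this peptide the predicted labeled series are b6 to b13 and y9 to y13, which covers the shifted b6-b13 and y9-y12 ions reported above; a predicted ion such as y13 need not be observed in every spectrum.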
[Figure 2-7 image: CID MS/MS spectra, panels (A) and (B), with annotated b- and y-ion peaks; x-axis m/z, 400 to 1600.]
Figure 2-7. Identification of isoAsp site in a doubly charged tryptic peptide HC 271-284 from the
stressed IgG1. The MS/MS spectra with (A) and without 18O tag (B) were obtained by collision-induced dissociation (CID) of the (M+2H)2+ precursor ions, m/z 840.41 and 839.40, for the top and bottom traces, respectively. Cysteine was alkylated.
[Figure 2-8 image: CID MS/MS spectra, panels (A) and (B), with annotated b- and y-ion peaks; x-axis m/z, 500 to 1500.]
Figure 2-8. Identification of isoAsp site in a triply charged tryptic peptide LC69-108 from the
IgG1 sample. The MS/MS spectra with (A) and without 18O tag (B) were obtained by collision-induced dissociation (CID) of the (M+3H)3+ precursor ions, m/z 1406.64 and 1405.97, for the top and bottom traces, respectively. Cysteine was alkylated.
[Figure 2-9 image: CID MS/MS spectra with annotated b- and y-ion peaks; panel (A) full range, x-axis m/z 400 to 1800; panel (B) zoomed-in view, m/z 400 to 800; panel (C) zoomed-in view, m/z 900 to 1800.]
Figure 2-9. Identification of isoAsp site in a doubly charged tryptic peptide HC389-405 from the
IgG1 sample. (A) The MS/MS spectra with (top) and without 18O tag (bottom) were obtained by collision-induced dissociation (CID) of the (M+2H)2+ precursor ions, m/z 938.97 and 937.97, for
the top and bottom traces, respectively. (B) and (C) are the zoomed-in views.
[Figure 2-10 image: CID MS/MS spectra, panels (A) and (B), with annotated b- and y-ion peaks; x-axis m/z, 300 to 800.]
Figure 2-10. Identification of isoAsp site in the stressed Asp-DSIP peptide by tandem mass spectrometry. The MS/MS data of isoAsp-DSIP with 18O tag (A) and Asp-DSIP without 18O tag (B) were obtained by collision-induced dissociation (CID) of the (M+H)+ precursor ions,
m/z 851 and 849, for the top and bottom traces, respectively.
2.5 Conclusions
We present herein an approach combining chemo-enzymatic transformations to specifically label isoAsp with 18O for facile analysis by standard mass spectrometry and routine data analysis algorithms. The near-complete incorporation of 18O into isoAsp makes quantitative analysis feasible; this application is currently under development in our laboratories.
2.6 References
[1] Noguchi S. Structural changes induced by the deamidation and isomerization of asparagine revealed by the crystal structure of Ustilago sphaerogena ribonuclease U2B. Biopolymers 2010;93:1003-10.
[2] Doyle HA, Gee RJ, Mamula MJ. Altered immunogenicity of isoaspartate containing proteins. Autoimmunity 2007;40:131-7.
[3] Moss CX, Matthews SP, Lamont DJ, Watts C. Asparagine deamidation perturbs antigen presentation on class II major histocompatibility complex molecules. J Biol Chem 2005;280:18498-503.
[4] Shimizu T, Matsuoka Y, Shirasawa T. Biological significance of isoaspartate and its repair system. Biol Pharm Bull 2005;28:1590-6.
[5] Clarke S. Aging as war between chemical and biochemical processes: protein methylation and the recognition of age-damaged proteins for repair. Ageing Res Rev 2003;2:263-85.
[6] Curnis F, Longhi R, Crippa L, Cattaneo A, Dondossola E, Bachi A, et al. Spontaneous formation of L-isoaspartate and gain of function in fibronectin. J Biol Chem 2006;281:36466-76.
[7] Robinson NE, Robinson AB. Molecular Clocks: Deamidation of Asparaginyl and Glutaminyl Residues in Peptides and Proteins. Cave Junction, Oregon, USA: Althouse Press; 2004.
[8] Desrosiers RR, Fanelus I. Damaged proteins bearing L-isoaspartyl residues and aging: a dynamic equilibrium between generation of isomerized forms and repair by PIMT. Curr Aging Sci 2011;4:8-18.
[9] Reissner KJ, Aswad DW. Deamidation and isoaspartate formation in proteins: unwanted alterations or surreptitious signals? Cell Mol Life Sci 2003;60:1281-95.
[10] Manning MC, Chou DK, Murphy BM, Payne RW, Katayama DS. Stability of protein pharmaceuticals: an update. Pharm Res 2010;27:544-75.
[11] Wakankar AA, Borchardt RT. Formulation considerations for proteins susceptible to asparagine deamidation and aspartate isomerization. J Pharm Sci 2006;95:2321-36.
[12] Liu H, Gaza-Bulseco G, Faldu D, Chumsae C, Sun J. Heterogeneity of monoclonal antibodies. J Pharm Sci 2008;97:2426-47.
[13] Geiger T, Clarke S. Deamidation, isomerization, and racemization at asparaginyl and aspartyl residues in peptides. Succinimide-linked reactions that contribute to protein degradation. J Biol Chem 1987;262:785-94.
[14] Liu YD, van Enk JZ, Flynn GC. Human antibody Fc deamidation in vivo. Biologicals 2009;37:313-22.
[15] Rehder DS, Chelius D, McAuley A, Dillon TM, Xiao G, Crouse-Zeineddini J, et al. Isomerization of a single aspartyl residue of anti-epidermal growth factor receptor immunoglobulin gamma2 antibody highlights the role avidity plays in antibody activity. Biochemistry 2008;47:2518-30.
[16] Ni W, Dai S, Karger BL, Zhou ZS. Analysis of isoaspartic acid by selective proteolysis with Asp-N and electron transfer dissociation mass spectrometry. Anal Chem 2010;82:7485-91.
[17] Yang H, Zubarev RA. Mass spectrometric analysis of asparagine deamidation and aspartate isomerization in polypeptides. Electrophoresis 2010;31:1764-72.
[18] Chan WY, Chan TW, O'Connor PB. Electron transfer dissociation with supplemental activation to differentiate aspartic and isoaspartic residues in doubly charged peptide cations. J Am Soc Mass Spectrom 2010;21:1012-5.
[19] Cournoyer JJ, Pittman JL, Ivleva VB, Fallows E, Waskell L, Costello CE, et al. Deamidation: Differentiation of aspartyl from isoaspartyl products in peptides by electron capture dissociation. Protein Sci 2005;14:452-63.
[20] O'Connor PB, Cournoyer JJ, Pitteri SJ, Chrisman PA, McLuckey SA. Differentiation of aspartic and isoaspartic acids using electron transfer dissociation. J Am Soc Mass Spectrom 2006;17:15-9.
[21] Aswad DW, Paranandi MV, Schurter BT. Isoaspartate in peptides and proteins: formation, significance, and analysis. J Pharm Biomed Anal 2000;21:1129-36.
[22] Schurter BT, Aswad DW. Analysis of isoaspartate in peptides and proteins without the use of radioisotopes. Anal Biochem 2000;282:227-31.
[23] McFadden PN, Clarke S. Methylation at D-aspartyl residues in erythrocytes: possible step in the repair of aged membrane proteins. Proc Natl Acad Sci U S A 1982;79:2460-4.
[24] O'Connor CM. Protein L-isoaspartyl, D-aspartyl O-methyltransferases: catalysts for protein repair. In: Clarke SG, Tamanoi F, editors. Enzymes: Protein Methyltransferases. Amsterdam: Elsevier; 2006. p. 385-433.
[25] Johnson BA, Aswad DW. Enzymatic protein carboxyl methylation at physiological pH: cyclic imide formation explains rapid methyl turnover. Biochemistry 1985;24:2581-6.
[26] Barber JR, Clarke S. Demethylation of protein carboxyl methyl esters: a nonenzymatic process in human erythrocytes? Biochemistry 1985;24:4867-71.
[27] Alfaro JF, Gillies LA, Sun HG, Dai S, Zang T, Klaene JJ, et al. Chemo-enzymatic detection of protein isoaspartate using protein isoaspartate methyltransferase and hydrazine trapping. Anal Chem 2008;80:3882-9.
[28] Zhu JX, Aswad DW. Selective cleavage of isoaspartyl peptide bonds by hydroxylamine after methyltransferase priming. Anal Biochem 2007;364:1-7.
[29] Chu GC, Chelius D, Xiao G, Khor HK, Coulibaly S, Bondarenko PV. Accumulation of succinimide in a recombinant monoclonal antibody in mildly acidic buffers under elevated temperatures. Pharm Res 2007;24:1145-56.
[30] Fenselau C, Yao X. 18O2-labeling in quantitative proteomic strategies: a status report. J Proteome Res 2009;8:2140-3.
[31] Gaza-Bulseco G, Li B, Bulseco A, Liu HC. Method to differentiate asn deamidation that occurred prior to and during sample preparation of a monoclonal antibody. Anal Chem 2008;80:9491-8.
[32] Li X, Cournoyer JJ, Lin C, O'Connor PB. Use of 18O labels to monitor deamidation during protein and peptide sample processing. J Am Soc Mass Spectrom 2008;19:855-64.
[33] Terashima I, Koga A, Nagai H. Identification of deamidation and isomerization sites on pharmaceutical recombinant antibody using H(2)(18)O. Anal Biochem 2007;368:49-60.
[34] Xiao G, Bondarenko PV, Jacob J, Chu GC, Chelius D. 18O labeling method for identification and quantification of succinimide in proteins. Anal Chem 2007;79:2714-21.
[35] Yao X, Afonso C, Fenselau C. Dissection of proteolytic 18O labeling: endoprotease-catalyzed 16O-to-18O exchange of truncated peptide substrates. J Proteome Res 2003;2:147-52.
[36] Yao X, Freas A, Ramirez J, Demirev PA, Fenselau C. Proteolytic 18O labeling for comparative proteomics: model studies with two serotypes of adenovirus. Anal Chem 2001;73:2836-42.
[37] Ye X, Luke B, Andresson T, Blonder J. 18O stable isotope labeling in MS-based proteomics. Brief Funct Genomic Proteomic 2009;8:136-44.
[38] Lindquist JA, McFadden PN. Incorporation of two 18O atoms into a peptide during isoaspartyl repair reveals repeated passage through a succinimide intermediate. J Protein Chem 1994;13:553-60.
[39] Villa ST, Xu Q, Downie AB, Clarke SG. Arabidopsis Protein Repair L-Isoaspartyl Methyltransferases: Predominant Activities at Lethal Temperatures. Physiol Plant 2006;128:581-92.
[40] Zhang Z. De novo peptide sequencing based on a divide-and-conquer algorithm and peptide tandem spectrum simulation. Anal Chem 2004;76:6374-83.
[41] Zhang Z. Prediction of low-energy collision-induced dissociation spectra of peptides. Anal Chem 2004;76:3908-22.
[42] Clarke S, Banfield K. Homocysteine in Health & Disease. In: Carmel R, Jacobsen DW, editors. New York: Cambridge University Press; 2001. p. 63-78.
[43] Cannon LM, Butler FN, Wan W, Zhou ZS. A stereospecific colorimetric assay for (S,S)-adenosylmethionine quantification based on thiopurine methyltransferase-catalyzed thiol methylation. Anal Biochem 2002;308:358-63.
[44] Dorgan KM, Wooderchak WL, Wynn DP, Karschner EL, Alfaro JF, Cui Y, et al. An enzyme-coupled continuous spectrophotometric assay for S-adenosylmethionine-dependent methyltransferases. Anal Biochem 2006;350:249-55.
[45] Hendricks CL, Ross JR, Pichersky E, Noel JP, Zhou ZS. An enzyme-coupled colorimetric assay for S-adenosylmethionine-dependent methyltransferases. Anal Biochem 2004;326:100-5.
[46] Bohme L, Bar JW, Hoffmann T, Manhart S, Ludwig HH, Rosche F, et al. Isoaspartate residues dramatically influence substrate recognition and turnover by proteases. Biol Chem 2008;389:1043-53.
[47] Kameoka D, Ueda T, Imoto T. A method for the detection of asparagine deamidation and aspartate isomerization of proteins by MALDI/TOF-mass spectrometry using endoproteinase Asp-N. J Biochem 2003;134:129-35.
[48] Sargaeva NP, Lin C, O'Connor PB. Differentiating N-terminal aspartic and isoaspartic acid residues in peptides. Anal Chem 2011;83:6675-82.
[49] Krokhin OV, Antonovici M, Ens W, Wilkins JA, Standing KG. Deamidation of -Asn-Gly- sequences during sample preparation for proteomics: Consequences for MALDI and HPLC-MALDI analysis. Anal Chem 2006;78:6645-50.
[50] Mason CJ, Therneau TM, Eckel-Passow JE, Johnson KL, Oberg AL, Olson JE, et al. A method for automatically interpreting mass spectra of 18O-labeled isotopic clusters. Mol Cell Proteomics 2007;6:305-18.
Chapter 3: Discovery of Undefined Protein Crosslinking Chemistry: A Comprehensive
Methodology Utilizing 18O-labeling and Mass Spectrometry
Reproduced with permission from “Min Liu, Zhongqi Zhang, Tianzhu Zang, Chris
Spahr, Janet Cheetham, Da Ren, and Zhaohui Sunny Zhou. Discovery of Undefined Protein
Crosslinking Chemistry: A Comprehensive Methodology Utilizing 18O-labeling and Mass
Spectrometry. Analytical Chemistry 2013, 85, 5900-5908.” Copyright [2013] American
Chemical Society.
Co-authors’ work in this chapter: Min Liu: experimental design and execution, data analysis, manuscript writing and revision; Zhongqi Zhang: new 18O-screening function in the MassAnalyzer algorithm, data analysis, manuscript writing and revision; Tianzhu Zang: data analysis, manuscript writing and revision; Chris Spahr: data analysis, manuscript writing and revision; Janet Cheetham: idea contribution, manuscript writing and revision, grant support; Da Ren: idea contribution, experimental design, manuscript writing and revision; Zhaohui Sunny Zhou: idea contribution, experimental design, data analysis, manuscript writing and revision, and grant support.
3.1 Abstract
Characterization of protein crosslinking, particularly without prior knowledge of the
chemical nature and site of crosslinking, poses a significant challenge due to the intrinsic
structural complexity of crosslinks and the lack of a comprehensive analytical approach. Toward this end,
we have developed a generally applicable workflow—XChem-Finder that involves four stages.
(1) Detection of crosslinked peptides via 18O-labeling at C-termini. (2) Determination of the putative partial sequences of each crosslinked peptide pair using a fragment ion mass database search against known protein sequences coupled with a de novo sequence tag search. (3)
Extension to full sequences based on protease specificity, the unique combination of mass, and
other constraints. (4) Deduction of crosslinking chemistry and site. The mass difference
between the sum of two putative full-length peptides and the crosslinked peptide provides the
formulas (elemental composition analysis) for the functional groups involved in each cross-
linking. Combined with sequence restraint from MS/MS data, plausible crosslinking chemistry
and site were inferred, and ultimately, confirmed by matching with all data. Applying our
approach to a stressed IgG2 antibody, ten cross-linked peptides were discovered and found to be
connected via thioether originating from disulfides at locations that had not been previously
recognized. Furthermore, once the crosslink chemistry was revealed, a targeted crosslink search
yielded four additional crosslinked peptides that all contain the C-terminus of the light chain.
3.2 Introduction
Protein crosslinking exists in a myriad of biological systems and protein pharmaceuticals,
such as collagen, ubiquitylated proteins, and monoclonal antibodies[1-6]. Rich and diverse
chemistry is involved as well, including disulfide[1], dityrosine[2], lysinoalanine[7, 8], lanthionine[7, 8], etc. Additionally, chemical crosslinking is widely used to probe protein structures and interactions[9, 10]. Due to their intrinsic structural complexity, characterization of
crosslinked peptides is complex, but nonetheless tractable if the crosslink chemistry is pre-
defined. For example, a database of the intact mass (precursor ion) and the tandem mass spectra
(fragmentation ions) for all possible combinations of crosslinked peptides (e.g., two cysteines to
form a disulfide bond) can be generated computationally, and subsequently, correlated with
observed spectra to identify both the sequences and sites of modification. Such a database search
strategy is the cardinal principle behind many common algorithms, including ASAP[11],
X!link[12], BLink[13], Xlink-Identifier[14, 15] and MassAnalyzer[16]. Moreover, clever
experimental tricks, such as judicious isotope labeling[17, 18], can markedly simplify the
process and enhance the confidence level for assignment with the assistance of software tools
such as Pro-Cross-link[19, 20], PepLynx[21], xQuest[22], and iXLink/doXLink/XlinkViewer[23]. To
date, the rapid advancements in mass spectrometers, data analysis algorithms, and computational
capacity have made analyses of crosslinking with known chemistry much more accessible if not
routine (for recent reviews, see [9, 10]).
Yet the aforementioned approaches are futile if the crosslink chemistry is unknown or not
pre-defined; for one thing, no theoretical mass or spectrum can be simulated. Even if crosslinked
peptides have been identified, it remains a tall order to deduce the sequences and sites of
crosslinking. Conceptually, de novo sequencing should provide at least partial sequences for
crosslinked peptides (see review paper[24]). Under typical fragmentation conditions, however, a
crosslinked peptide gives rise to at least five sets of b- and y-ions that are intertwined and indistinguishable. In addition, crosslinked
peptides typically carry high charge states (≥3+), which produce multiply charged fragment ions (e.g., 2+ or 3+) and further complicate data
interpretation[12, 22, 25]. High-resolution mass spectrometers (e.g., Orbitrap), which can
determine fragment ion charge states, have become widely available only recently. As
a result, the drastically increased complexity of the tandem spectra renders de novo sequencing
ineffective in most cases. Unknown or undefined crosslinks are typically discovered
serendipitously, requiring isolation of the crosslinked peptides and “old-fashioned” protein
chemistry. Even so, full characterization remains elusive for many cases. For instance, the non-
reducible crosslinks between an IgG heavy chain and a light chain in a murine monoclonal
antibody, OKT3, and between two heavy chains of IgG2 could not be elucidated even after
intensive efforts[26, 27].
To facilitate systematic and unbiased discovery of unknown crosslinks, we have
developed a generally applicable workflow—XChem-Finder (Scheme 3-1). First, crosslinked
peptides were isotopically labeled at the C-termini to facilitate their detection[19-21, 28].
Proteins were digested in 18O-(heavy) and 16O-(light) water, respectively, followed by
LC/MS/MS analysis. At full scan, the distinct isotope pattern of the crosslinked peptides (a mass
increase of 8 Da) compared to the non-crosslinked linear species (a mass increase of 4 Da) was
readily detected by a spectral analysis algorithm[19-21]. The second and more challenging part
is to determine the sequences, chemical nature and site of crosslink. The workflow breaks down
the challenge into workable sub-steps. (a) The candidate ions of crosslinked peptides underwent
high resolution MS/MS analysis. Based on their isotope patterns, linear and crosslinked
fragment ions were divided into different groups (Table 3-1). (b) The masses of linear fragment ions
were searched against the protein sequence, yielding partial sequences (often sequence ladders)
of each chain of the crosslinked peptides. In parallel, de novo sequencing of crosslinked
fragment ions affords sequence tags. (c) Combining the partial sequences and sequence tags,
putative full-length sequences of each chain were deduced based on protease specificity, the
unique combination of mass, and other constraints. (d) The difference between the combined
mass of the two putative full-length peptides and the observed mass of a crosslinked peptide
provides the formula for the functional group involved in the crosslink (mass to formula).
Combined with sequence restraint from MS/MS data, the crosslink chemistry and site were
inferred, and ultimately, confirmed by matching with all data.
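Step (d), going from a mass difference to a crosslink formula, can be illustrated with a short sketch. The candidate list, the 10 ppm tolerance, and the example peptide masses below are hypothetical and chosen only to show the arithmetic; the masses are assumed to be monoisotopic masses of the unmodified (free-thiol) linear peptides. For a thioether formed between two cysteines, the net change relative to the two free peptides is the loss of H2S (about 33.988 Da).

```python
# Monoisotopic atomic masses (Da).
H, N, O, S = 1.00782503, 14.00307401, 15.99491462, 31.97207117

# Hypothetical candidate list: net elemental change when two free peptides are linked.
CANDIDATES = {
    "disulfide (-2H)": -2 * H,
    "thioether (-H2S)": -(2 * H + S),
    "condensation (-H2O)": -(2 * H + O),
    "condensation (-NH3)": -(N + 3 * H),
}


def match_crosslink(mass_a, mass_b, mass_crosslinked, tol_ppm=10.0):
    """Compare the observed mass difference with each candidate composition."""
    delta = mass_crosslinked - (mass_a + mass_b)
    tol = mass_crosslinked * tol_ppm * 1e-6
    hits = [name for name, d in CANDIDATES.items() if abs(delta - d) <= tol]
    return delta, hits


# Hypothetical masses for two linear chains and their crosslinked product.
chain_a, chain_b = 1200.6000, 2800.4000
crosslinked = chain_a + chain_b - 33.98772

delta, hits = match_crosslink(chain_a, chain_b, crosslinked)
print(f"mass difference: {delta:.4f} Da, matches: {hits}")
```

A mass match alone only proposes a formula; as described above, the chemistry and site are accepted only after they are shown to agree with the sequence restraints from the MS/MS data.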
Applying our XChem-Finder approach to a stressed IgG2, ten crosslinked peptides were
discovered and found to be linked via thioether that originated from disulfides at locations that
had not been reported. Furthermore, once the crosslinking chemistry was revealed, a targeted
search yielded four additional crosslinked peptides that all contain the C-terminus of the light chain.
3.3 Experimental Section
3.3.1 Chemicals
All chemicals were reagent grade or above. Guanidine hydrochloride (GndHCl),
dithiothreitol (DTT), iodoacetic acid (IAA), trifluoroacetic acid (TFA), acetonitrile (ACN),
HPLC-grade water, and bradykinin were from Sigma-Aldrich (St. Louis, MO, USA).
Sequencing grade trypsin was obtained from Roche (Indianapolis, IN, USA). 18O-water (97%) was obtained from Cambridge Isotope Laboratories (Andover, MA, USA). Recombinant monoclonal antibody anti-streptavidin immunoglobulin gamma 2 (IgG2) was produced in
Chinese hamster ovary (CHO) cells (Amgen, Thousand Oaks, CA, USA), purified according to standard manufacturing procedures, formulated at a concentration of 20 mg/mL in 50 mM sodium acetate pH 5.2, and stored at -70 °C.
Stage 1: Detection of Crosslinked Peptides
  Tryptic digestion in 16O- and 18O-water -> Mass shift of 8 Da
Stage 2: Determination of Partial Sequences
  Group fragment ions via mass shift:
  0 or +4 Da (linear fragment ions) -> Match mass of fragment ions with peptide sequences (FindPept) -> Partial peptide sequences; often sequence ladders
  +8 Da (crosslinked fragment ions) -> de novo sequencing (manual) -> Sequence tag
Stage 3: Inference of Full Sequences
  Protease specificity & other constraints -> Putative full sequences of both chains
Stage 4: Deduction of Crosslink Site & Chemistry
  Elemental composition analysis of mass difference (combined native linear chains vs crosslinked peptide) -> Confirmed structure (Thioether)
Scheme 3-1. Flow chart of XChem-Finder in four main stages: (1) detection of crosslink by 18O-labeling; (2) determination of partial sequences of crosslinked peptides from the mass of fragmentation ions; (3) inference of full sequence of each chain; (4) determination of crosslink site and chemistry by elemental composition analysis and chemical intuition.
Table 3-1. Fragmentation ions of a cross-linked peptide. 18O-labeling at C-termini is shown in red. Each letter (e.g., abc and XYZ) represents
one amino acid residue. The symbol (?) denotes unknown cross-link chemistry. A cross-linked peptide gives rise to five intertwined sets of b- and y-ions.
Ions arising from consecutive bond dissociations and internal fragments are excluded because of their lower abundance in CID.
[Structure drawings of the precursor ion and of each fragment group are not reproduced.]
Group number | 18O mass shift (Da) | Linear or cross-linked | Searchable (match partial sequence) | Sequence tags (de novo sequence) | Notes
1 | 0 | Linear | Yes | Yes | b-ions, single chain
2 | +4 | Linear | Yes | Yes | y-ions, single chain
3 | +4 | Cross-linked | No | Yes | b-ions, cross-linked
4 | +8 | Cross-linked | No | Yes | y-ions, cross-linked
5 | +4 | Linear | No | No | Modified single chain from cleavage of the cross-linking
3.3.2 Generation of Stressed Sample
After being buffer exchanged into 100 mM Tris at pH 8.5, the IgG2 antibody was
incubated at 50 °C for 7 days in the dark.
3.3.3 Reduction, Alkylation, Tryptic Digestion and 18O-Labeling of the IgG2
Tryptic digestion of the stressed IgG2 was performed similarly to the procedure described
by Ren et al.[29]. Briefly, IgG2 (20 mg/mL) was diluted to 1 mg/mL in a denaturing buffer (7.5
M GndHCl, 2 mM EDTA and 0.25 M Tris-HCl, pH 7.5) to a final volume of 0.5 mL. Reduction
was accomplished with the addition of 3 μL of 0.5 M DTT followed by 30 min incubation at room temperature. S-Carboxymethylation was achieved with the addition of 7 μL of 0.5 M IAA; the reaction was carried out in the dark for 15 min at room temperature. Excess IAA was quenched with the addition of 4 μL of 0.5 M DTT. The reduced and alkylated IgG2 samples were subsequently exchanged into the digestion buffer (0.1 M Tris-HCl at pH 7.5) using a NAP-
5 size-exclusion column (GE Healthcare, Piscataway, NJ, USA). After two aliquots (200 µL each) of the above buffer-exchanged antibody were completely dried via Speed Vac and reconstituted separately into the same volume of 18O-water or 16O-water, 6 µL of 1 mg/mL
trypsin in 18O-water or 16O-water solution, respectively, was added to achieve a 1:25 (w/w)
enzyme/substrate ratio. The reaction mixtures were incubated at 37 °C for 30 min.
3.3.4 HPLC
Tryptic digests of the IgG2 (25 μL) were separated on a Jupiter C5 column (250 × 2.0 mm, 5 μm, 300 Å, Phenomenex, Torrance, CA, USA) at a temperature of 50 °C with a flow rate of 200 μL/min on an HPLC system (Agilent 1100, Palo Alto, CA, USA). Mobile phase A was
0.1% TFA in water (v/v) while mobile phase B contained 0.085% TFA / 90% ACN / 10% water.
A gradient was applied by holding at 2% B for 2 min, increasing to 22% B in 38 min, then 42%
B in 80 min, then 100% B in 25 min followed by holding at 100% B for 5 min. The column was re-equilibrated at 2% B for 30 min before the next injection.
3.3.5 Mass Spectrometry
An LTQ-Orbitrap mass spectrometer (Thermo Fisher Scientific, San Jose, CA, USA) was used in-line with the HPLC system for the analyses of the IgG2 tryptic digests. A full MS scan
(with 60,000 resolution at m/z 400 and an automatic gain control (AGC) target value of 2 × 10^5) followed by data-dependent MS/MS scans of the three most abundant precursor ions was set up to acquire both the peptide mass and sequence information. The spray voltage was 5.5 kV, and the capillary temperature was 250 °C. The instrument was tuned using the doubly charged ion of a synthetic peptide, bradykinin. The MS/MS spectra were obtained using collision-induced dissociation (CID) with a normalized collision energy of 35%. For MS/MS with ion detection in the Orbitrap, the AGC target was set to 3 × 10^6, the resolution to 7,500, and the precursor isolation width to 4 m/z units. Peptides were identified by MassAnalyzer by comparing experimental
MS/MS spectra to theoretically predicted MS/MS spectra[16, 30-32]. Peak alignment between the 16O- and 18O-digest
runs was automatically performed by MassAnalyzer[33]. A new function was
implemented in MassAnalyzer to calculate the level of 18O-labeling in each peptide. The number
of incorporated 18O in a peptide is calculated from the following equation:
$$N_{^{18}\mathrm{O}} = \frac{M_{\mathrm{labeled}} - M_{\mathrm{unlabeled}}}{2.004} \qquad \text{(Eq. 1)}$$
where $M_{\mathrm{labeled}}$ and $M_{\mathrm{unlabeled}}$ are the average masses of the 18O-labeled and unlabeled peptides,
respectively, as calculated by the centroids of their respective isotope envelopes. The value of
2.004 Da is the mass difference between an 18O atom and an 16O atom.
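Eq. 1 is simple enough to sketch directly. The example below first computes an intensity-weighted centroid for each isotope envelope and then applies Eq. 1; the function names and the peak lists are hypothetical illustrations, not the MassAnalyzer implementation.

```python
DELTA_18O_16O = 2.004  # Da, mass difference between 18O and 16O used in Eq. 1


def centroid_mass(peaks):
    """Intensity-weighted average mass of an isotope envelope.

    `peaks` is a list of (mass, intensity) pairs.
    """
    total = sum(intensity for _, intensity in peaks)
    return sum(mass * intensity for mass, intensity in peaks) / total


def n18o(avg_mass_labeled, avg_mass_unlabeled):
    """Number of incorporated 18O atoms according to Eq. 1."""
    return (avg_mass_labeled - avg_mass_unlabeled) / DELTA_18O_16O


# Hypothetical isotope envelopes (neutral mass in Da, relative intensity).
unlabeled = [(4050.0, 40.0), (4051.0, 100.0), (4052.0, 90.0), (4053.0, 50.0)]
labeled = [(m + 7.8, i) for m, i in unlabeled]  # hypothetical 7.8 Da average shift

value = n18o(centroid_mass(labeled), centroid_mass(unlabeled))
print(f"N18O = {value:.2f}")
```

Because average (centroid) masses are used, incomplete labeling shows up as a non-integer value slightly below the ideal 2 or 4, as in this example.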
3.4 Results and Discussion
3.4.1 Stage 1: Identification of Crosslinked Peptides.
18O-labeling combined with mass spectrometry is commonly used to identify crosslinked
peptides [19-21, 28]. As shown in Scheme 3-1, newly created C-termini of tryptic peptides from
digestion in 18O-water were completely labeled by 18O. The distinct isotope pattern for the
labeled crosslinked peptides (a mass increase of 8 Da) compared to linear (non-crosslinked)
species (a mass increase of 4 Da) can be automatically detected by common spectral analysis
algorithms, such as an in-house isotopic screening algorithm (MassAnalyzer[16, 30-32]).
18O-Labeling. Protease-catalyzed labeling of peptide C-termini in 18O-water is a well-documented general strategy for peptide identification and quantification[34-40].
Under our experimental conditions, near-complete incorporation of four 18O atoms into crosslinked
peptides was evident from the isotopic distributions (see Figure 3-1). The small amount of 16O-water (3%) in the 18O-water had no significant impact on the isotopic patterns or the subsequent data analysis. 18O-labeling during tryptic digestion occurs only at newly created C-termini, not at the C-termini of proteins. Hence, a crosslinked tryptic peptide that contains the C-terminus of the protein has a mass shift of only 4 Da and therefore cannot be differentiated from linear peptides. This limitation can be overcome by using proteases with different substrate specificity or by labeling N-termini (e.g., with formaldehyde-d2 and sodium cyanoborohydride or with succinic anhydride-d4)[41-43]. In this paper, this was satisfactorily addressed via a targeted mass search after the crosslink chemistry was elucidated. In addition, the deamidation of asparagine and isomerization of aspartic acid could potentially introduce 18O into peptides[44-46]; under our conditions, no isoaspartic acid was detected in the candidate peptides.
[Figure 3-1 spectra: relative abundance vs m/z (1350-1356). Top panel, 16O-water: most intense labeled peak at m/z 1351.3303. Bottom panel, 18O-water: most intense labeled peak at m/z 1354.0024 (8 Da mass shift).]
Figure 3-1. Isotopic distributions of the cross-linked peptide HC:G118-R129/HC:C215-K240
(RT at 91.17 min). A mass shift of 8 Da was observed, indicating incorporation of four 18O atoms into the crosslinked peptide.
Screening of Crosslinked Peptides in Full Scan. An 18O incorporation value (Eq. 1) of
4.0 ± 0.3 was set as the cut-off in our screening. The initial screening results for the stressed IgG2 are shown in Table 3-2. Each peak was evaluated for false positives. For instance, gas phase dimerization, commonly observed in mass spectrometry[47, 48], was readily determined based on retention time (same as the monomers) and mass (exactly double that of the monomers). In addition, weak precursor ions (typically with a peak intensity of 50,000 counts or lower) with poor or no MS/MS data were excluded. Based on these criteria, ten candidates shown in red in Table
3-2 were selected for subsequent high resolution MS/MS analysis.
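The screening criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Eq. 1 is assumed to take the form N = (M18 - M16) / 2.004, where M16 and M18 are the centroid neutral masses from the 16O- and 18O-digests; the function names, the example masses, and the dimer tolerances are hypothetical and do not represent the MassAnalyzer implementation.

```python
MASS_DIFF_18O_16O = 2.004  # Da, mass difference between 18O and 16O (from the text)


def n18o(centroid_mass_16o, centroid_mass_18o):
    """Assumed form of Eq. 1: number of incorporated 18O atoms."""
    return (centroid_mass_18o - centroid_mass_16o) / MASS_DIFF_18O_16O


def passes_cutoff(n, target=4.0, window=0.3):
    """Cut-off of 4.0 +/- 0.3 used for crosslinked peptide candidates."""
    return abs(n - target) <= window


def is_gas_phase_dimer(mass, rt, monomer_mass, monomer_rt,
                       mass_tol=0.01, rt_tol=0.1):
    """Flag a hit whose mass is double that of a co-eluting monomer.

    mass_tol (Da) and rt_tol (min) are hypothetical tolerances.
    """
    same_rt = abs(rt - monomer_rt) <= rt_tol
    double_mass = abs(mass - 2.0 * monomer_mass) <= mass_tol
    return same_rt and double_mass


if __name__ == "__main__":
    # Hypothetical centroid masses for a crosslinked and a linear peptide.
    print(passes_cutoff(n18o(4051.60, 4059.34)))  # crosslinked: N is close to 4
    print(passes_cutoff(n18o(1000.00, 1004.00)))  # linear: N is close to 2
```

A linear peptide carries two 18O atoms (4 Da shift), so it falls outside the window, while a peptide with two labeled C-termini falls inside it.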
Table 3-2. The cross-linked peptide candidates identified by the MassAnalyzer algorithm. False hits (gas phase dimers) and weak ions (with poor or no MS/MS data) were excluded from the subsequent analysis. Ten cross-linked peptide candidates shown in red were selected for high resolution MS/MS analysis in the next step. For ions with multiple charge states, the one with the highest intensity was examined. N(18O) is the number of incorporated 18O atoms.
Charge  Retention Time (RT, min)  m/z  Average Mass (Da)  Monoisotopic Mass (Da)  Intensity  N(18O)  Comments
3 39.93 1119.24 3355.58 3353.699 575968 3.86 Q(-17)SVVTQPPSVSGAPGQR(dimer)
3 60.37 1163.58 3487.76 3485.707 511960 3.84 YAASSYLSLTPEQWK(dimer)
3 76.76 1655.45 4963.26 4960.313 216540 3.82 NQFSLELTSVTAADTAVYYCAR(dimer)
4 87.91 1045.77 4179.76 4177.057 93846 3.78 Same as m/z1394.03
3 87.91 1394.03 4179.80 4177.059 386639 3.81 Cross-link candidate
3 88.56 1413.36 4237.87 4235.064 796803 3.82 Cross-link candidate
4 88.57 1060.53 4237.79 4235.065 175997 3.85 Same as m/z1413.36
5 88.59 848.42 4237.77 4235.065 85712 3.85 Same as m/z1413.36
3 91.17 1351.33 4051.60 4048.963 205489 3.85 Cross-link candidate
4 96.15 1605.54 6418.27 6414.117 159092 3.77 Cross-link candidate, similar to m/z1620.04(-58)
4 97.39 1620.04 6476.45 6472.120 168467 3.86 Cross-link candidate
4 97.88 1484.22 5932.92 5928.854 424306 3.79 Cross-link candidate
4 98.49 1475.73 5898.65 5894.856 71227 3.92 Cross-link candidate
4 98.99 1498.47 5990.95 5986.858 303560 3.82 Cross-link candidate
6 99.00 999.32 5990.99 5986.854 53191 3.87 Same as m/z1498.47
4 99.76 1452.20 5804.77 5800.765 401159 3.78 Cross-link candidate
3 99.78 1935.93 5804.71 5800.758 176055 3.80 Same as m/z1452.20
3 101.56 1893.23 5676.57 5673.664 58750 3.83 Miss monoisotopic peak. same as m/z1419.92
4 101.59 1419.92 5676.47 5672.670 66658 3.89 Cross-link candidate
3.4.2 Stage 2: Deduce Partial Sequence for Each Chain.
As illustrated in Scheme 3-1, this stage involves (a) grouping fragment ions based on
their isotope patterns imparted by their corresponding structural features (e.g., linear or cross-
linked), (b) deducing partial peptide sequences via a database search (match mass with partial
peptide sequences using FindPept) and de novo sequencing, and (c) determining most likely
candidate peptides.
Deconvolution of Fragment Ions. Most precursor ions for cross-linked peptides were
highly charged (e.g., 3+ or 4+), thus doubly and triply charged fragment ions abound, e.g. ion
m/z 839.49 (2+) and 1300.47 (3+) in Figure 3-2. The high resolution of the tandem mass
spectrum allowed us to measure the isotope envelope and hence determine the charge state. Taking
the fragment ion type (b- vs y-ion) into account, the monoisotopic neutral mass of each fragment ion
from a crosslinked peptide was calculated manually. For example, +17.0033 Da (the mass of
OH-) and -1.0073 Da (the mass of H+) were added to a singly charged b- and y-ion, respectively,
to obtain their neutral peptide mass. The high resolution of the tandem spectra was crucial in determining the correct charge state and hence the neutral mass; otherwise, an incorrect monoisotopic mass would lead to false hits and even erroneous assignments.
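The conversion from an observed fragment m/z to a neutral peptide mass can be written as a short sketch. The singly charged corrections (+17.0033 Da for b-ions, -1.0073 Da for y-ions) are those quoted above; the extension to higher charge states (remove one proton per charge, then add water for b-ions) is the conventional rule and is an assumption here, as is the function name.

```python
PROTON = 1.0073   # Da, mass of H+ as quoted in the text
WATER = 18.0106   # Da, so that WATER - PROTON = 17.0033 (the b-ion correction)


def neutral_peptide_mass(mz, charge, ion_type):
    """Monoisotopic neutral peptide mass from a b- or y-ion m/z value."""
    mass = charge * (mz - PROTON)  # strip the charge-carrying protons
    if ion_type == "b":
        mass += WATER              # b-ions lack the C-terminal OH and one H
    elif ion_type != "y":
        raise ValueError("ion_type must be 'b' or 'y'")
    return mass


if __name__ == "__main__":
    # Fragment ions of the precursor at m/z 1351.33 (values from Table 3-3).
    print(round(neutral_peptide_mass(769.4172, 1, "b"), 3))  # GPSVFPLA
    print(round(neutral_peptide_mass(566.3647, 1, "y"), 3))  # PPKPK
    print(round(neutral_peptide_mass(839.4851, 2, "y"), 3))  # PPVAGPSVFLFPPKPK
```

The three results agree with the "User mass" values listed for these ions in Table 3-3 (786.420, 565.357 and 1676.956 Da).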
Grouping Fragment Ions by 18O Incorporation. The fragment ions containing zero,
one, and two C-termini displayed a mass shift of 0, 4, and 8 Da, respectively, in the
corresponding MS/MS spectra obtained from 18O-water vs 16O-water (referred to as the 18O/16O rule in this paper) and, accordingly, are divided into different groups (Table 3-1). For each crosslinked peptide, two sets of linear fragment ions that contain no crosslink site exist for each chain.
One set is the b-ions prior to the crosslink site, which show no mass shifts with 18O-labeling and
thus are separated from other fragment ions (group 1 in Table 3-1). Another set is the y-ions to
the C-terminal side of the crosslink site (group 2 in Table 3-1), which contain two 18O atoms and thus show a mass shift of 4 Da. Essentially, these linear fragments are searchable in a standard database, i.e., their masses can be matched with the corresponding peptide fragments. The freely available
FindPept (web.expasy.org/findpept/) was used for the search in this study. Each observed mass value of these linear fragment ions should match a partial sequence of the crosslinked peptides, but may also match unrelated sequences (false hits). High mass accuracy (typically 10 ppm in our
FT MS/MS experiments) greatly limits false positives. Furthermore, multiple fragment ions collectively, and in combination with de novo sequencing as described below, narrow the hits to a select few, if not one, candidate peptides.
Isotope pattern (8 Da mass shift with 18O-labeling) can also be readily used to isolate a
set of fragmentation ions that contain two C-termini (y-ions containing the crosslink site, see
group 4 in Table 3-1). First, these ions were excluded from database search (which is for linear
peptides), reducing false hits. Second and more importantly, this markedly simplified set of
tandem spectra could be used for de novo sequencing to yield sequence tags, as was indeed the
case in our study.
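The 18O/16O rule lends itself to a simple classification step, sketched below. The function name and the 1.0 Da tolerance are hypothetical; the m/z pairs in the usage example are illustrative values of the kind read from paired 16O/18O MS/MS spectra, not a reanalysis of the data.

```python
def labeled_c_termini(mz_16o, mz_18o, charge, tol_da=1.0):
    """Return 0, 1 or 2 labeled C-termini from the 18O mass shift (0, 4 or 8 Da).

    The shift in Da is the m/z shift multiplied by the charge state.
    Returns None if the shift matches none of the expected values.
    """
    shift_da = (mz_18o - mz_16o) * charge
    for n_termini in (0, 1, 2):
        if abs(shift_da - 4.0 * n_termini) <= tol_da:
            return n_termini
    return None


if __name__ == "__main__":
    pairs = [
        (769.48, 769.49, 1),    # no shift: linear b-ion before the crosslink
        (566.46, 570.50, 1),    # 4 Da shift: linear y-ion after the crosslink
        (839.72, 841.78, 2),    # 4 Da shift observed as about 2 m/z units at 2+
        (1237.90, 1240.42, 3),  # 8 Da shift: y-ion containing the crosslink
    ]
    for mz16, mz18, z in pairs:
        print(mz16, z, labeled_c_termini(mz16, mz18, z))
```

Ions returning 0 or 1 would go to the linear database search, whereas ions returning 2 would be reserved for de novo sequencing.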
Partial Sequence Search via FindPept. The neutral peptide monoisotopic masses
(obtained from the fragment ion bins with mass shifts of 0 and 4 Da, i.e., linear peptides, as shown in
Scheme 3-1) were searched against the known IgG2 sequence using FindPept with user-defined
mass error (10 ppm for the resolution of 7500 in FT-MS/MS in our experiments). FindPept also
allows users to define the residue modifications, for example, alkylation at all cysteine residues
(+58.005 Da for reaction with iodoacetic acid in our experiments). FindPept outputs a list of peptides that match the neutral peptide masses, and naturally, some are false hits. As such,
several complementary steps (constraints) were taken to confirm the actual sequences (higher
probability and confidence level) and rule out false hits. It is worth noting that this is an iterative
process, so the steps can be taken in a different order based on individual situation.
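To make the mass-matching step concrete, the sketch below mimics the kind of search performed with FindPept: every contiguous subsequence of a protein sequence is compared with a query mass within a ppm tolerance. It is not FindPept itself, and the function names are hypothetical. The residue masses are standard monoisotopic values, every cysteine is assumed to carry the +58.005 Da alkylation quoted above, and the tryptic peptide G118-R129 stands in for the full IgG2 sequence.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
CYS_ALKYLATION = 58.005  # Da per cysteine (reaction with iodoacetic acid)


def peptide_mass(sequence):
    """Monoisotopic neutral mass of a linear peptide with alkylated Cys."""
    total = WATER
    for residue in sequence:
        total += RESIDUE_MASS[residue]
        if residue == "C":
            total += CYS_ALKYLATION
    return total


def search_fragments(protein, query_mass, tol_ppm=10.0):
    """List (subsequence, theoretical mass, delta ppm) within the tolerance."""
    hits = []
    for start in range(len(protein)):
        for end in range(start + 1, len(protein) + 1):
            theo = peptide_mass(protein[start:end])
            delta_ppm = (theo - query_mass) / theo * 1e6
            if abs(delta_ppm) <= tol_ppm:
                hits.append((protein[start:end], round(theo, 3),
                             round(delta_ppm, 1)))
    return hits


if __name__ == "__main__":
    for hit in search_fragments("GPSVFPLAPCSR", 786.420):
        print(hit)
```

Searching a whole protein in this way also returns unrelated subsequences of matching mass, which is why the additional constraints described here are needed.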
As an example, the process is demonstrated using a triply charged crosslinked peptide
m/z 1351.33 (retention time at 91.17 min, G118-R129/C215-K240). The corresponding neutral
monoisotopic mass of its fragment ions were searched against the IgG2 sequence. The full list of
fragment ion peptides is shown in Table 3-3 and 3-4 and some are highlighted in Table 3-5.
A rewarding first step is to sort the peptides according to their positions in the full protein
sequence. As illustrated in Table 3-3 and 3-5, typically, at least one sequence ladder could be
readily identified. For example, the overwhelmingly large numbers of fragment ions (eight
peptides #3-10 in Table 3-5) that share C-terminal sequences (CPPCPAPPVAGPSVFLFPPKPK) were found, essentially affirming this is part of the true sequence. An immediate implication is that the largest observed fragment ion (2361.190 Da) sets an upper limit for the mass of the other chain (1687 Da) by subtraction from the observed total crosslink peptide mass (4048.963 Da).
Based on this criterion, fourteen peptides in Table 3-3 with a mass significantly larger than
1687 Da were excluded.
Another two powerful constraints that can be applied to data analysis are based on the
protease specificity (referred to as the tryptic rule) and mass shift conferred by 18O-labeling
(18O/16O rule). For instance, the above mentioned eight peptides (peptides 3-10 in Table 3-5) are
likely from a tryptic peptide as they all end with lysine, and indeed, a mass shift of 4 Da was
observed for all the fragmentation ions from digestion in heavy and light water. Similarly, the
other two overlapping partial sequences are likely the N-terminal fragments of a single tryptic
peptide containing GPSVFPLA; and again, as expected, no mass shift was observed from 18O-labeling. Conversely, false hits can be ruled out; for example, the doubly charged fragment ion m/z 1101.0885 (Table 3-4) matches four peptide sequences:
(W)GQGTLVTVSSASTKGPSVFPLAP(C), (T)APKLLIYGNSNRPSGVPDRF(S),
(Y)WGQGTLVTVSSASTKGPSVFPL, (C)PPCPAPPVAGPSVFLFPPKPK/(D). Since a mass shift from 18O-labeling was observed, an internal fragment was ruled out, and therefore, this
leaves only the last sequence with a C-terminal lysine as the only plausible choice.
At this point, the fragment ion mass search data indicated that the peptide at m/z 1351.33 very likely contains CPPCPAPPVAGPSVFLFPPKPK. For the second chain, although the mass search of two b-ions suggests the presence of GPSVFPLA, additional data were warranted for a higher confidence level in the assignment, as described next.
De Novo Sequencing. This complements the database search nicely and affords sequence tags[25]. As shown in Scheme 3-1, the identification of the sequence tags was conducted using the crosslinked y-ions (8 Da mass shift in 18O-digest, group 4 in Table 3-1),
which obviously would not match any single chain peptides in the database. In Table 3-6, the
observed m/z value is from the most abundant isotopic peak in each isotopic envelope because the monoisotopic peak is weak for large ions at low level. The mass difference between a pair of
adjacent y-ions was calculated and compared manually to the mass of single amino acids and
dipeptides within a mass error of 0.05 Da. Matching single amino acid residues or dipeptides are
shown in red. The sequence tag SVFPLA was confirmed in the crosslink peptide chain G118-R129,
lending strong support to the existence of peptide chain G118-R129 as a component of
the crosslinked peptide (Table 3-6).
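The manual comparison of y-ion mass differences with residue and dipeptide masses can be sketched as follows. The 0.05 Da tolerance is the one stated above; the function name is hypothetical, the residue masses are standard monoisotopic values, and cysteine is treated as unmodified for simplicity.

```python
from itertools import combinations_with_replacement

# Standard monoisotopic residue masses (Da); Cys is unmodified here.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def match_mass_difference(delta, tol=0.05):
    """Residues and dipeptides (order ignored) whose mass matches delta."""
    hits = [aa for aa, mass in RESIDUE_MASS.items()
            if abs(mass - delta) <= tol]
    for a, b in combinations_with_replacement(sorted(RESIDUE_MASS), 2):
        if abs(RESIDUE_MASS[a] + RESIDUE_MASS[b] - delta) <= tol:
            hits.append(a + b)  # letters in alphabetical order, e.g. "PS"
    return hits


if __name__ == "__main__":
    # Mass differences between adjacent crosslinked y-ions (Table 3-6).
    for delta in (147.0739, 184.0776, 186.1085):
        print(delta, match_mass_difference(delta))
```

Because several compositions can share a mass within 0.05 Da, each difference yields a short list of options; only those consistent with the putative chain sequences are retained as part of the tag.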
In summary, the peptide chains C219-K240 (219CPPCPAPPVAGPSVFLFPPKPK240) and
G118-A125 (118GPSVFPLA125) were identified at the end of Stage 2 as parts of the crosslinked
peptide of m/z 1351.33.
3.4.3 Stage 3: Inference of Full Sequence for Each Chain
Extension to the Putative Full Sequences. Because the peptides were generated by
trypsin digestion, the putative partial sequences of crosslinked peptide chains were extended to
their corresponding full tryptic peptides (G118-R129, C215-K240, V294-K314) with mass of
1287.6282, 2911.3305, and 2502.2941 Da, respectively (Table 3-5). The mass difference
between the observed intact crosslinked peptide (4048 Da) and the first tryptic peptide C215-
K240 (2911 Da) is 1137 Da. This narrowed down the second crosslink chain to G118-R129
(1287.6282 Da) while a putative tryptic peptide V294-K314 (2502.2941 Da) is too large
(combined mass) to be the second chain. This leaves peptide G118-R129 as the only plausible
choice to pair with C215-K240. We were mindful that mis-cleavage might happen, which would
be considered if the initial inference did not yield a correct assignment.
Table 3-3. Peptides in IgG2 that match the masses of fragment ions of the triply charged ion m/z 1351.33 (RT at 91.17 min) via FindPept. All Cys are alkylated with IAA. The adjoining residues before cleavage are in parentheses. NA means not available. The peptides are sorted in the order of primary sequence number. Amino acid positions 1-439 and 440-657 correspond to HC and LC, respectively.
m/z  Charge  Mass shift in 18O-water (m/z)  User mass (Da)  Theor. mass (Da)  Δmass (ppm)  Peptide  Position  Notes
1181.5930 2 2 2361.171 2361.173 0.9 (Q)ESGPGLVKPSGTLSLTCAVS GGSIS(S) 6-30 Exclude (Tryptic rule, 18O/16O rule, too large as 2nd chain)
1188.6063 3 NA 3580.807 3580.817 2.9 (W)SWVRQPPGKGLEWIGEISHS GTTNYNPSLKSR/(V) 36-67 Exclude (too large as 2nd chain)
1188.6063 3 NA 3580.807 3580.776 -8.7 (I)GEISHSGTTNYNPSLKSRVT ISGDKSKNQFSLE(L) 50-82 Exclude (Tryptic rule, too large as 2nd chain)
1101.0885 2 2 2218.173 2218.163 -4.3 (Y)WGQGTLVTVSSASTKGPSVF PL(A) 103-124 Exclude (Tryptic rule, 18O/16O rule, too large as 2nd chain)
1101.0885 2 2 2200.162 2200.174 5.4 (W)GQGTLVTVSSASTKGPSVFPLAP(C) 104-126 Exclude (Tryptic rule, 18O/16O rule, too large as 2nd chain)
1181.5930 2 2 2361.171 2361.189 7.4 (W)GQGTLVTVSSASTKGPSVFPLAPC(S) 104-127 Exclude (Tryptic rule, 18O/16O rule. Too large as 2nd chain)
470.2368 NA 0 505.251 505.254 5.1 (K)/GPSVF(P) 118-122 -
769.4172 1 0 786.420 786.428 9.6 (K)/GPSVFPLA(P) 118-125 -
1333.6070 2 2 2683.210 2683.218 3.1 (V)TVPSSNFGTQTYTCNVDHKP SNTK/(V) 183-206 Exclude (too large as 2nd chain)
1341.1364 2 NA 2680.258 2680.258 0.1 (T)YTCNVDHKPSNTKVDKTVERKC(C) 194-215 Exclude (Tryptic rule, too large as 2nd chain)
769.4172 1 0 804.431 804.434 3.8 (S)NTKVDKT(V) 204-210 Exclude (Tryptic rule)
1181.5930 2 2 2361.171 2361.19 8.0 (E)CPPCPAPPVAGPSVFLFPPK PK/(D) 219-240 -
1101.0885 2 2 2200.162 2200.175 6.0 (C)PPCPAPPVAGPSVFLFPPKPK/(D) 220-240 -
1846.0661 1 4 1845.059 1845.055 -2.0 (C)PAPPVAGPSVFLFPPKPK/(D) 223-240 -
923.5283 2 2 1845.042 1845.055 7.1 (C)PAPPVAGPSVFLFPPKPK/(D) 223-240 -
839.4851 2 2 1676.956 1676.965 5.5 (A)PPVAGPSVFLFPPKPK/(D) 225-240 -
1384.7951 1 4 1383.788 1383.791 2.4 (V)AGPSVFLFPPKPK/(D) 228-240 -
470.2368 NA 0 505.251 505.254 5.1 (A)GPSVF(L) 229-233 Exclude (Tryptic rule)
1313.7528 1 4 1312.745 1312.754 7.0 (A)GPSVFLFPPKPK/(D) 229-240 -
566.3647 1 4 565.357 565.359 3.1 (F)PPKPK/(D) 236-240 -
1688.7689 1 4 1705.772 1705.788 9.4 (V)HQDWLNGKEYKCK/(V) 302-314 -
470.2368 NA 0 505.251 505.254 5.2 (E)PQVY(T) 338-341 Exclude (Tryptic rule)
470.2368 NA 0 487.240 487.239 -1.9 (P)PSRE(E) 345-348 Exclude (Tryptic rule)
1676.7706 2 NA 3351.527 3351.503 -7.1 (E)WESNGQPENNYKTTPPMLDS DGSFFLYSK/(L) 373-401 Exclude (too large as 2nd chain)
1181.5930 2 2 2361.171 2361.175 1.8 (S)DGSFFLYSKLTVDKSRWQQG(N) 393-412 Exclude (Tryptic rule, 18O/16O rule, too large as 2nd chain)
1181.5930 2 2 2361.171 2361.183 4.8 (M)HEALHNHYTQKSLSLSPGKQ S(V) 421-441 Exclude (Tryptic rule, 18O/16O rule, too large as 2nd chain, HC-LC)
1676.7706 2 NA 3369.537 3369.532 -1.4 (P)SVSGAPGQRVTISCTGSSSN IGAGYDVHWYQQ(L) 448-479 Exclude (Tryptic rule, too large as 2nd chain)
923.5283 2 2 1863.053 1863.041 -6.6 (D)VHWYQQLPGTAPKLLI(Y) 474-489 Exclude (Tryptic rule, 18O/16O rule)
1101.0885 2 2 2200.162 2200.175 6.0 (T)APKLLIYGNSNRPSGVPDRF(S) 484-503 Exclude (Tryptic rule, 18O/16O rule, too large as 2nd chain)
1188.6063 3 NA 3580.807 3580.801 -1.6 (A)TLVCLISDFYPGAVTVAWKA DSSPVKAGVETTTP(S) 576-609 Exclude (Tryptic rule, too large as 2nd chain)
470.2368 NA 0 505.251 505.254 5.1 (F)YPGAV(T) 585-589 Exclude (Tryptic rule)
751.4051 NA 0 786.419 786.412 -8.5 (Y)LSLTPEQ(W) 623-629 Exclude (Tryptic rule)
769.4172 1 0 786.420 786.412 -9.7 (Y)LSLTPEQ(W) 623-629 Exclude (Tryptic rule)
Table 3-4. Peptides in IgG2 that match the masses of fragment ions of the triply charged ion m/z 1351.33 (RT at 91.17 min) via FindPept. All Cys are alkylated with IAA. The adjoining residues before cleavage are in parentheses. NA means not available. The table is grouped in the order of m/z value.
m/z  Charge  Mass shift in 18O-water (m/z)  User mass (Da)  Theor. mass (Da)  Δmass (ppm)  Peptide  Notes
470.2368 NA 0 487.240 487.239 -1.9 (P)PSRE(E) Exclude (Tryptic rule)
470.2368 NA 0 505.251 505.254 5.1 (K)/GPSVF(P) -
470.2368 NA 0 505.251 505.254 5.1 (A)GPSVF(L) Exclude (Tryptic rule)
470.2368 NA 0 505.251 505.254 5.1 (F)YPGAV(T) Exclude (Tryptic rule)
470.2368 NA 0 505.251 505.254 5.2 (E)PQVY(T) Exclude (Tryptic rule)
566.3647 1 4 565.357 565.359 3.1 (F)PPKPK/(D)
751.4051 NA 0 786.419 786.412 -8.5 (Y)LSLTPEQ(W) Exclude (Tryptic rule)
769.4172 1 0 786.420 786.412 -9.7 (Y)LSLTPEQ(W) Exclude (Tryptic rule)
769.4172 1 0 786.420 786.428 9.6 (K)/GPSVFPLA(P) -
769.4172 1 0 804.431 804.434 3.8 (S)NTKVDKT(V) Exclude (Tryptic rule)
839.4851 2 2 1676.956 1676.965 5.5 (A)PPVAGPSVFLFPPKPK/(D) -
923.5283 2 2 1845.042 1845.055 7.1 (C)PAPPVAGPSVFLFPPKPK/(D) -
923.5283 2 2 1863.053 1863.041 -6.6 (D)VHWYQQLPGTAPKLLI(Y) Exclude (Tryptic rule, 18O/16O rule)
1101.0885 2 2 2200.162 2200.174 5.4 (W)GQGTLVTVSSASTKGPSVFPLAP(C) Exclude (Tryptic rule, 18O/16O rule)
1101.0885 2 2 2200.162 2200.175 6.0 (T)APKLLIYGNSNRPSGVPDRF(S) Exclude (Tryptic rule, 18O/16O rule)
1101.0885 2 2 2218.173 2218.163 -4.3 (Y)WGQGTLVTVSSASTKGPSVF PL(A) Exclude (Tryptic rule, 18O/16O rule)
1101.0885 2 2 2200.162 2200.175 6.0 (C)PPCPAPPVAGPSVFLFPPKPK/(D) -
1181.5930 2 2 2361.171 2361.173 0.9 (Q)ESGPGLVKPSGTLSLTCAVS GGSIS(S) Exclude (Tryptic rule, 18O/16O rule)
1181.5930 2 2 2361.171 2361.175 1.8 (S)DGSFFLYSKLTVDKSRWQQG(N) Exclude (Tryptic rule, 18O/16O rule)
1181.5930 2 2 2361.171 2361.183 4.8 (M)HEALHNHYTQKSLSLSPGKQ S(V) Exclude (Tryptic rule, 18O/16O rule)
1181.5930 2 2 2361.171 2361.189 7.4 (W)GQGTLVTVSSASTKGPSVFPLAPC(S) Exclude (Tryptic rule, 18O/16O rule)
1181.5930 2 2 2361.171 2361.190 8.0 (E)CPPCPAPPVAGPSVFLFPPK PK/(D) -
1188.6063 3 NA 3580.807 3580.776 -8.7 (I)GEISHSGTTNYNPSLKSRVT ISGDKSKNQFSLE(L) Exclude (Tryptic rule, too large as 2nd chain)
1188.6063 3 NA 3580.807 3580.801 -1.6 (A)TLVCLISDFYPGAVTVAWKA DSSPVKAGVETTTP(S) Exclude (Tryptic rule, too large as 2nd chain)
1188.6063 3 NA 3580.807 3580.817 2.9 (W)SWVRQPPGKGLEWIGEISHS GTTNYNPSLKSR/(V) Exclude (Too large as 2nd chain)
1313.7528 1 4 1312.745 1312.754 7.0 (A)GPSVFLFPPKPK/(D) -
1333.6070 2 2 2683.210 2683.218 3.1 (V)TVPSSNFGTQTYTCNVDHKP SNTK/(V) Exclude (too large as 2nd chain)
1341.1364 2 NA 2680.258 2680.258 0.1 (T)YTCNVDHKPSNTKVDKTVERKC(C) Exclude (Tryptic rule)
1384.7951 1 4 1383.788 1383.791 2.4 (V)AGPSVFLFPPKPK/(D) -
1676.7706 2 NA 3351.527 3351.503 -7.1 (E)WESNGQPENNYKTTPPMLDS DGSFFLYSK/(L) Exclude (too large as 2nd chain)
1676.7706 2 NA 3369.537 3369.532 -1.4 (P)SVSGAPGQRVTISCTGSSSN IGAGYDVHWYQQ(L) Exclude (Tryptic rule, too large as 2nd chain)
1688.7689 1 4 1705.772 1705.788 9.4 (V)HQDWLNGKEYKCK/(V) -
1846.0661 1 4 1845.059 1845.055 -2.0 (C)PAPPVAGPSVFLFPPKPK/(D) -
Table 3-5. Partial sequences determined from the masses of fragment ions for a triply charged precursor ion at a retention time of 91.17 min with m/z 1351.33 (molecular mass of 4048.963 Da). The longest b- and y-ions observed are underlined. All Cys are alkylated with IAA. The adjoining residues before cleavage are in parentheses. 18O-labeling indicated that ions 1 and 2 are b-ions and ions 3 to 11 are y-ions. The crosslinked peptides were determined to be HC:G118-R129/HC:C215-K240.
#  m/z  Charge  User mass (Da)  Theor. mass (Da)  Δmass (ppm)  Peptide  Corresponding tryptic peptide sequence  Tryptic peptide mass (Da)  Notes
1 470.2368 1 505.251 505.254 5.1 (K)/GPSVF(P) (K)/118GPSVFPLAPCSR129/(S) 1287.6282 Chain 2
2 769.4172 1 786.420 786.428 9.6 (K)/GPSVFPLA(P)
3 566.3647 1 565.357 565.359 3.1 (F)PPKPK/(D) (K)/215CCVECPPCPAPPVAGPSVFLFPPKPK240/(D) 2911.3305 Chain 1
4 1313.7528 1 1312.745 1312.754 7.0 (A)GPSVFLFPPKPK/(D)
5 1384.7951 1 1383.788 1383.791 2.4 (V)AGPSVFLFPPKPK/(D)
6 839.4851 2 1676.956 1676.965 5.5 (A)PPVAGPSVFLFPPKPK/(D)
7 1846.0661 1 1845.059 1845.055 -2.0 (C)PAPPVAGPSVFLFPPKPK/(D)
8 923.5283 2 1845.042 1845.055 7.1 (C)PAPPVAGPSVFLFPPKPK/(D)
9 1101.0885 2 2200.162 2200.175 6.0 (C)PPCPAPPVAGPSVFLFPPKPK/(D)
10 1181.5930 2 2361.171 2361.190 8.0 (E)CPPCPAPPVAGPSVFLFPPKPK/(D)
11 1688.7689 1 1705.772 1705.788 9.4 (V)HQDWLNGKEYKCK/(V) (R)/294VVSVLTVVHQDWLNGKEYKCK314/(V) 2502.2941 Ruled out for chain 2, see text
Table 3-6. De novo sequencing for sequence tag using y-ions from the cross-linked fragments (group 4 in Table 3-1) in the cross-link peptide G118-R129/C215-K240. The observed m/z value is the most abundant isotopic peak in each isotopic envelope. The mass difference between a pair of adjacent y-ions was calculated and compared to the mass of single amino acids and dipeptides within mass error of 0.05 Da. The amino acid residues or dipeptides found in the putative cross-linked peptide chains are shown in red. The sequence tag SVFPLA is in the cross-linked peptide chain G118-R129.
Obs. m/z  Charge State  Mass Shift in 18O-water  Obs. Mass (Da)  Mass Diff. (Da)  Amino Acid Residue or Dipeptide (Mass error)
1633.2705 2 4 3264.5265 17.9980 H2O=18.0106(0.0126)
1642.2695 2 4 3282.5244 184.0776 IA, LA=184.1212(0.0436); PS=184.0848(0.0072)
1734.3083 2 4 3466.6020 79.0643 P-H2O=97.0528-18.0106=79.0422(-0.0221)
1773.8405 2 4 3545.6663 18.0336 H2O=18.0106(-0.0230)
1782.8573 2 4 3563.7000 147.0739 F=147.0684(-0.0055)
1237.6012 3 2.5 3709.7818 186.1085 W=186.0793(-0.0292); AD,GE=186.0641(-0.0444); SV=186.1004(-0.0081)
1299.6374 3 2.5 3895.8903 - -
3.4.4 Stage 4: Deduction of Crosslinking Chemistry and Site
Toward this end, two pieces of information are particularly useful: (1) elemental composition of the functional group involved and (2) peptide fragments not observed in tandem mass spectrometry.
Deducing Crosslinking Chemistry: Mass to Formula (Elemental Composition). An example of the elemental composition calculation is illustrated in Table 3-7. Once the peptides
G118-R129 and C215-K240 were established as the components of the cross-linked peptide, the difference between the observed mass of the crosslinked peptide (4048.9600 Da) and the combined mass of the two chains devoid of modifications (1287.6282+2911.3305=4198.9587
Da) was calculated to be 149.9987 Da. Four potential elemental compositions for the mass of
149.9987 Da were obtained via Thermo-Fisher Scientific Xcalibur (Table 3-7). The last two were eliminated based on their high delta ppm relative to the FT MS mass accuracy (typically ≤
5 ppm). The high RDB (ring and double bond) value makes the second one unlikely too. This leaves the first one as the only plausible choice.
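The entries of Table 3-7 can be checked with a short calculation, sketched below. The monoisotopic atomic masses are standard values, and the RDB is computed with the conventional formula C - H/2 + N/2 + 1; the function names are hypothetical and this is not the Xcalibur algorithm.

```python
# Standard monoisotopic atomic masses (Da).
ATOM_MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074,
             "O": 15.994915, "S": 31.972071}


def formula_mass(composition):
    """Monoisotopic mass of a composition such as {"C": 4, "H": 6, ...}."""
    return sum(ATOM_MASS[element] * count
               for element, count in composition.items())


def ring_double_bond(composition):
    """Ring and double bond equivalents: C - H/2 + N/2 + 1."""
    return (composition.get("C", 0) - composition.get("H", 0) / 2
            + composition.get("N", 0) / 2 + 1)


def delta_ppm(target_mass, calc_mass):
    """Deviation of a candidate formula from the target mass, in ppm."""
    return (target_mass - calc_mass) / target_mass * 1e6


if __name__ == "__main__":
    target = (1287.6282 + 2911.3305) - 4048.9600  # 149.9987 Da
    candidates = {
        "C4H6O4S": {"C": 4, "H": 6, "O": 4, "S": 1},
        "C10ON": {"C": 10, "O": 1, "N": 1},
        "C5H2N4S": {"C": 5, "H": 2, "N": 4, "S": 1},
        "C2H4O3N3S": {"C": 2, "H": 4, "O": 3, "N": 3, "S": 1},
    }
    for name, comp in candidates.items():
        mass = formula_mass(comp)
        print(name, round(mass, 4), round(delta_ppm(target, mass)),
              ring_double_bond(comp))
```

Note that the ppm deviation here is taken relative to the 150 Da mass difference, which is the convention that reproduces the values in Table 3-7.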
The elemental composition of C4H6O4S contains sulfur, which is present only in cysteine and methionine. Each of the putative crosslinked peptide pair G118-R129 and C215-K240, and particularly the fragments that were not observed by tandem mass spectrometry (those not underlined in Table 3-5), contains cysteine but not methionine. During sample preparation, cysteine residues were reduced and alkylated by iodoacetic acid (IAA), so the masses of all peptides were calculated assuming that cysteines are alkylated. Hence, removal of two alkyl groups
(two C2H3O2) and a sulfur atom exactly matches the determined elemental composition. The
mass of the observed crosslinked peptide (4048.963 Da) and that of the theoretical thioether peptide
(4048.960 Da) are practically identical (with a mass error of 0.74 ppm, see Table 3-8). Altogether, we surmised that the crosslinking chemistry is a thioether originating from a pair of cysteine residues (Scheme 3-2).
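The arithmetic behind the thioether assignment can be reproduced as a brief sketch: the theoretical crosslinked mass is the sum of the two alkylated chains minus two C2H3O2 groups and one sulfur atom. The atomic masses are standard monoisotopic values; the variable names are hypothetical.

```python
# Standard monoisotopic atomic masses (Da).
C, H, O, S = 12.0, 1.007825, 15.994915, 31.972071

ALKYL_GROUP = 2 * C + 3 * H + 2 * O   # C2H3O2, about 59.0133 Da
LOSS = 2 * ALKYL_GROUP + S            # C4H6O4S, about 149.9987 Da

CHAIN_G118_R129 = 1287.6282           # Da, alkylated Cys (Table 3-5)
CHAIN_C215_K240 = 2911.3305           # Da, alkylated Cys (Table 3-5)
OBSERVED = 4048.963                   # Da, observed crosslinked peptide

theoretical = CHAIN_G118_R129 + CHAIN_C215_K240 - LOSS
error_ppm = (OBSERVED - theoretical) / theoretical * 1e6

if __name__ == "__main__":
    print(round(theoretical, 3), round(error_ppm, 2))
```

The result agrees with the theoretical mass and mass error listed for peptide #1 in Table 3-8.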
Table 3-7. Elemental formula with mass of 149.9987 Da (the mass difference between the sum of unmodified peptides and crosslinked peptide). RDB means ring and double bond; ppm is part-per-million.
Formula  Cal. Mass (Da)  Delta ppm  RDB  Proposed Structure
C4H6O4S 149.9987 0 2.0 S+CH2COOH+CH2COOH
C10ON 149.9980 5 11.5 Excluded
C5H2N4S 150.0000 -9 7.0 Excluded
C2H4O3N3S 149.9973 9 2.5 Excluded
Locating Crosslink Site. Typically, the crosslink site can be localized by the largest b- and y-ions observed. For the crosslinked peptide m/z 1351.33, the largest b- and y-ions are
GPSVFPLA/(PCSR, not observed), (CCVE, not observed)/CPPCPAPPVAGPSVFLFPPKPK for the chain HC:G118-R129 and HC:C215-K240, respectively. This indicates the crosslink site is in the corresponding PCSR and CCVE region (Figure 3-2). Compared to the highly stable
valine and proline, cysteine and glutamic acid are chemically reactive, so they are more likely
candidates for crosslinking.
The elemental analysis described in the previous section indicates that a sulfur atom was
removed, suggesting that a cysteine is involved. In addition, functional groups with a combined
composition of C4H6O4 are eliminated from the theoretical peptide pairs, in which cysteinyl
residues were assumed to be alkylated with IAA (C2H3O2). Taken together, our data indicated that the crosslink site is highly likely at HC:Cys127-HC:Cys215 or HC:Cys127-HC:Cys216 as shown in Figure 3-2. Because the two cysteines (Cys215 and Cys216) in the heavy chain are adjacent to
each other, the exact crosslink site could not be determined unambiguously here.
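The localization logic can be expressed as a small sketch: the residues covered by neither the longest linear b-ion nor the longest linear y-ion form the unobserved region, and the cysteines within it are the candidate sites. The function names are hypothetical; restricting candidates to cysteine follows from the elemental analysis above.

```python
def unobserved_region(sequence, first_position, longest_b, longest_y):
    """Residues not covered by the longest linear b- and y-ions.

    longest_b and longest_y are the numbers of residues in those ions.
    Returns a list of (position, residue) tuples.
    """
    start = longest_b
    end = len(sequence) - longest_y
    return [(first_position + i, sequence[i]) for i in range(start, end)]


def candidate_cysteines(sequence, first_position, longest_b, longest_y):
    """Positions of cysteines inside the unobserved region."""
    region = unobserved_region(sequence, first_position, longest_b, longest_y)
    return [position for position, residue in region if residue == "C"]


if __name__ == "__main__":
    # Chain HC:G118-R129, longest b-ion GPSVFPLA (8 residues), no linear y-ions.
    print(candidate_cysteines("GPSVFPLAPCSR", 118, 8, 0))
    # Chain HC:C215-K240, longest y-ion of 22 residues, no linear b-ions.
    print(candidate_cysteines("CCVECPPCPAPPVAGPSVFLFPPKPK", 215, 0, 22))
```

For the first chain the only candidate is Cys127, while the second chain leaves both Cys215 and Cys216, which mirrors the ambiguity noted above.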
[Figure 3-2 spectra: relative abundance vs m/z (about 600-1800). Top panel, 16O-water; bottom panel, 18O-water. Fragment peaks are labeled as b- and y-ions with their charge states.]
Figure 3-2. CID MS/MS spectrum of the triply charged precursor ions at m/z 1351.33 (16O-labeled C-termini) and 1354.00 (18O-labeled C-termini). Singly and doubly charged fragment ions
that contain the individual chain of the crosslinked peptides (G118-R129/C215-K240, RT at
91.17 min) are highlighted in blue and red, respectively. Characteristic mass shift imparted by
the heavier isotope 18O was observed (e.g., the mass shift of 4 Da for y5 in 16O- vs 18O-water).
3.4.5 Final Confirmation and Additional Support
Confirmation by Data Matching. Once the putative crosslinking chemistry and site
have been proposed, theoretical fragmentation spectra were calculated and compared with the
observed spectra. The assignment is shown in Figure 3-2 and is highly consistent with the
deduced structure. A handful of fragment peaks from a few crosslinks were not assigned initially and hence were subjected to further analysis as described next.
MS3 Analysis. MS3 analysis may provide additional structural information, especially
for fragment ions that are difficult to assign in the MS/MS. For example, in Figure 3-3a, two
high intensity fragment ions at m/z 1196.50 (singly charged) and 1520.90 (doubly charged) were
observed for the triply charged crosslinked peptide m/z 1413.37 (G118-R129/K214-K240), but
could not be assigned to typical b- or y-ions. To identify them, MS3 analysis of these two unassigned
ions revealed the sequences to be 118GPSVFPLAPC*SR129 and
214KCC*VECPPCPAPPVAGPSVFLFPPKPK240, in which a dehydroalanine replaces C127 in
peptide G118-R129 and a free cysteine replaces the thioether at 216 in peptide K214-K240,
respectively (Figure 3-3b & 3-3c). These data further supported the proposed sequence and
crosslink sites. Alkylation at Lys and Met as an artifact from sample preparation in peptide
mapping was reported[49]. The alkylation at K214 in the crosslink peptide G118-R129/K214-
K240 is in agreement with the literature [49].
Additional Peptides. Following the same work flow, full sequences, crosslinking
chemistry and sites have been established for all ten candidate crosslinked peptides shown in red
in Table 3-2. The final results of all identified crosslinked peptides are summarized in Table 3-8.
To evaluate the sensitivity of our method, the peak intensity from LC-MS analysis for each crosslinked peptide and its related (not crosslinked) peptides was used to estimate the degree of crosslinking as described by Zhang[16], ranging from 0.2% to 5.0% (half less than 1%; see Table 3-10 for details). Comparable data were observed based on reducing SDS-PAGE, which indicated about 8% of total crosslinked species (see Figure 3-8A). It is also worth noting that no enrichment or separation was performed on the IgG2 samples prior to tryptic digestion
(the first step of our work flow); in other words, the crosslinked peptides were analyzed in the presence of large excess of native peptides. Of course, considerably higher sensitivity can be achieved if the crosslinked proteins are separated or enriched prior to analysis.
[Figure 3-3a spectra: relative abundance vs m/z (400-2000). Top panel, 16O-water, with the unassigned ions labeled Ion 1196 (m/z 1196.50) and Ion 1521++ (m/z 1520.90); bottom panel, 18O-water, with the same ions at m/z 1200.51 and 1522.79. Annotations on the figure refer to Fig. S3-4 and Fig. S3-5.]
Figure 3-3a. MS/MS data of the cross-link peptide HC:G118-R129/HC:KΔ214-K240 (RT at
88.56 min, m/z 1413.36, charge of 3). Lys214 was found alkylated by IAA in this peptide.
[Figure 3-3b panels: A) isotopic distribution, relative abundance vs m/z (1196.5-1201.0); B) MS3 spectrum, relative abundance vs m/z (400-1200), with peaks labeled as b- and y-ions.]
Figure 3-3b. MS3 for structure confirmation of the singly-charged fragment ion m/z 1196 from the cross-link peptide HC:G118-R129/HC:KΔ214-K240 (RT at 88.56 min, m/z 1413.36, charge of 3).
Lys214 was found alkylated in this peptide. A) Isotopic distribution (observed m/z 1196.6369, calculated m/z 1196.6423, mass error of 4.5 ppm). B) MS3 spectrum.
[Figure 3-3c panels: A) isotopic distribution, relative abundance vs m/z (1520.5-1523.5); B) MS3 spectrum, relative abundance vs m/z (600-2000), with peaks labeled as b- and y-ions.]
Figure 3-3c. MS3 for structure confirmation of the doubly charged fragment ion m/z 1521 from the cross-linked peptide HC:G118-R129/HC:KΔ214-K240 (RT at 88.56 min, m/z 1413.36, charge of
3). Lys214 was found alkylated in this peptide. A) Isotopic distribution (observed m/z 1520.7060, calculated m/z 1520.7200, mass error of 9.2 ppm). B) MS3 spectrum.
Table 3-8. Crosslinked peptides identified in IgG2. Thioether crosslink sites are labeled in red and bold-face. The exact site, at either C215 or C216, cannot be differentiated with the available data.
Unless noted, all cysteine side chains are alkylated with IAA. Alkylated Lys is shown in blue. The longest b- or y-ions observed are underlined. Peptides are shown with the adjoining amino acid residues before cleavage in parentheses. Peptide #8 contains a thioether (in red) and a dehydroalanine (in green and with an asterisk) at C215-C216, as shown in Figure S5.
#  Name  m/z (charge)  RT (min)  Observed Mass (Da)  Theoretical Mass (Da)  Mass Error (ppm)  Sequence  Cross-linking Site in Heavy Chain
1 G118-R129/ C215-K240 1351.33(3+) 91.17 4048.963 4048.960 0.74 (K)118GPSVFPLAPCSR129(S) 127
(K)215CCVECPPCPAPPVAGPSVFLFPPKPK240(D) 215 or 216
2 G118-R129/ K214-K240 1394.03(3+) 87.91 4177.059 4177.055 0.96 (K)118GPSVFPLAPCSR129(S) Same
(R)214KCCVECPPCPAPPVAGPSVFLFPPKPK240(D)
3 G118-R129/ K214-K240 1413.36(3+) 88.56 4235.064 4235.060 0.94 (K)118GPSVFPLAPCSR129(S) Same
(R)214KCCVECPPCPAPPVAGPSVFLFPPKPK240(D)
4 C215-K240/ C215-K240 1419.92(4+) 101.59 5672.670 5672.662 1.41 (K)215CCVECPPCPAPPVAGPSVFLFPPKPK240(D) 215 or 216
(K)215CCVECPPCPAPPVAGPSVFLFPPKPK240(D) 215 or 216
5 C215-K240/ K214-K240 1452.20(4+) 99.76 5800.765 5800.757 1.38 (K)215CCVECPPCPAPPVAGPSVFLFPPKPK240(D) Same
(R)214KCCVECPPCPAPPVAGPSVFLFPPKPK240(D)
6 K214-K240/ K214-K240 1484.22(4+) 97.88 5928.854 5928.852 0.34 (R)214KCCVECPPCPAPPVAGPSVFLFPPKPK240(D) Same
(R)214KCCVECPPCPAPPVAGPSVFLFPPKPK240(D)
7 K214-K240/ K214-K240 1498.47(4+) 98.99 5986.858 5986.858 0.00 (R)214KCCVECPPCPAPPVAGPSVFLFPPKPK240(D) Same
(R)214KCCVECPPCPAPPVAGPSVFLFPPKPK240(D)
8 K214-K240*/ K214-K240 1475.73(4+) 98.49 5894.860 5894.856 0.68 (R)214KCC*VECPPCPAPPVAGPSVFLFPPKPK240(D) Same
(R)214KCCVECPPCPAPPVAGPSVFLFPPKPK240(D)
9 T210-K240/ K214-K240 1605.54(4+) 96.15 6414.117 6414.112 0.78 (K)210TVERKCCVECPPCPAPPVAGPSVFLFPPKPK240(D) Same
(R)214KCCVECPPCPAPPVAGPSVFLFPPKPK240(D)
10 T210-K240/ K214-K240 1620.04(4+) 97.39 6472.120 6472.117 0.46 (K)210TVERKCCVECPPCPAPPVAGPSVFLFPPKPK240(D) Same
(R)214KCCVECPPCPAPPVAGPSVFLFPPKPK240(D)
3.4.6 Targeted Search Based on the Newly Established Crosslinking Chemistry
After the thioether crosslink chemistry was established, a targeted search for this particular modification was performed following well-established protocols. First, a theoretical database was built for all combinations of a thioether crosslink between any two cysteinyl residues. Then, all observed precursor ions were searched against the database. When a hit was found in the targeted mass search, the corresponding MS/MS data from both the 18O-water and 16O-water digests were examined for further structural confirmation. By this approach, four additional thioether peptides were found (Table 3-9, Figures 3-6 and 3-7). All contain a light chain C-terminal peptide, so each has only one newly created C-terminus (two 18O incorporations) and therefore was not discriminated from single-chain peptides in the initial screening stage. Again, these results showcase the utility of our approach for identifying crosslinks in macromolecules that arise from previously unknown crosslinking chemistry.
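The targeted search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in this work: the function names, the 5 ppm tolerance, and the three-peptide input are hypothetical. It assumes that a thioether link corresponds to the sum of the two free-thiol peptide masses minus H2S, and that every cysteine not involved in the crosslink is carboxymethylated; lysine alkylation is not modeled.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses (Da) for the amino acids in the peptides below.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "E": 129.04259, "K": 128.09496, "F": 147.06841, "R": 156.10111,
}
WATER = 18.01056            # H2O
H2S = 33.98772              # assumed loss when two Cys thiols form one thioether
CARBOXYMETHYL = 58.00548    # IAA alkylation (C2H2O2) per remaining Cys


def peptide_mass(seq):
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE[aa] for aa in seq) + WATER


def thioether_mass(seq_a, seq_b):
    """Mass of two peptides joined by one Cys-S-Cys thioether.

    One Cys per peptide is consumed by the crosslink; every other Cys
    is assumed to be carboxymethylated.
    """
    other_cys = seq_a.count("C") + seq_b.count("C") - 2
    return (peptide_mass(seq_a) + peptide_mass(seq_b) - H2S
            + other_cys * CARBOXYMETHYL)


def build_database(peptides):
    """All pairwise thioether combinations of Cys-containing peptides."""
    cys = {name: seq for name, seq in peptides.items() if "C" in seq}
    return {
        (a, b): thioether_mass(cys[a], cys[b])
        for a, b in combinations_with_replacement(sorted(cys), 2)
    }


def search(observed_mass, database, tol_ppm=5.0):
    """Return database entries within tol_ppm of an observed mass."""
    return [
        pair for pair, mass in database.items()
        if abs(observed_mass - mass) / mass * 1e6 <= tol_ppm
    ]


# Hypothetical mini-input: three tryptic peptides named as in Table 3-9.
PEPTIDES = {
    "LC:T211-S218": "TVAPTECS",
    "HC:G118-R129": "GPSVFPLAPCSR",
    "HC:C215-K240": "CCVECPPCPAPPVAGPSVFLFPPKPK",
}
DATABASE = build_database(PEPTIDES)
HITS = search(2001.984, DATABASE)  # observed mass of peptide #11
```

With these assumptions, the computed masses for the LC:T211-S218/HC:G118-R129 and LC:T211-S218/HC:C215-K240 pairs agree with the theoretical masses listed for peptides #11 and #12 in Table 3-9 to within a few mDa. A hit from such a mass search is only a candidate; as described above, the MS/MS data still have to be examined for confirmation.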
Table 3-9. Cross-linked peptides identified in the IgG2 via a targeted search for cysteinyl thioether (labeled in red and bold-face). The exact site (C215 or C216) could not be differentiated with the available data. Unless noted, all cysteine side chains are alkylated with IAA. Alkylated Lys (KΔ) is shown in blue. Peptides are shown with the adjoining amino acid residues before cleavage in parentheses.
#   Name                    m/z (charge)  RT (min)  Observed   Theoretical  Mass Error  Sequence                                      Cross-linking Site
                                                    Mass (Da)  Mass (Da)    (ppm)
11  T211-S218/G118-R129     1002.50 (2+)  52.25     2001.984   2001.983     0.40        (K)211TVAPTECS218                             LC217
                                                                                        (K)118GPSVFPLAPCSR129(S)                      HC127
12  T211-S218/C215-K240     1209.90 (3+)  78.29     3625.689   3625.685     1.10        (K)211TVAPTECS218                             LC217
                                                                                        (K)215CCVECPPCPAPPVAGPSVFLFPPKPK240(D)        HC215 or 216
13  T211-S218/K214-K240     1252.94 (3+)  76.12     3753.784   3753.780     1.00        (K)211TVAPTECS218                             Same
                                                                                        (R)214KCCVECPPCPAPPVAGPSVFLFPPKPK240(D)
14  T211-S218/KΔ214-K240    1272.27 (3+)  77.69     3811.789   3811.786     0.70        (K)211TVAPTECS218                             Same
                                                                                        (R)214KΔCCVECPPCPAPPVAGPSVFLFPPKPK240(D)
[Figure 3-4: MS/MS spectra from the 16O-water digest (top) and the 18O-water digest (bottom); relative abundance versus m/z, 400 to 2000.]
Figure 3-4. MS/MS data of the cross-linked peptide HC:C215-K240/HC:C215-K240 (RT at 101.59 min, m/z 1419.92, charge of 4).
[Figure 3-5: MS/MS spectra from the 16O-water digest (top) and the 18O-water digest (bottom); relative abundance versus m/z, 600 to 2000.]
Figure 3-5. MS/MS data of the cross-linked peptide HC:K214-K240*/HC:KΔ214-K240 (RT at 98.49 min, m/z 1475.73, charge of 4).
[Figure 3-6: MS/MS spectra from the 16O-water digest (top) and the 18O-water digest (bottom); relative abundance versus m/z, 400 to 2000.]
Figure 3-6. MS/MS data of the cross-linked peptide LC:T211-S218/HC:G118-R129 (RT at 52.25 min, m/z 1002.50, charge of 2).
[Figure 3-7: MS/MS spectra from the 16O-water digest (top) and the 18O-water digest (bottom); relative abundance versus m/z, 400 to 2000.]
Figure 3-7. MS/MS data of the cross-linked peptide LC:T211-S218/HC:K214-K240 (RT at 76.12 min, m/z 1252.94, charge of 3).
Table 3-10. Quantification of the cross-linked peptides in the IgG2.
#   Peptide                   Level (%)
1   G118-R129/C215-K240       0.5
2   G118-R129/K214-K240       1.0
3   G118-R129/KΔ214-K240      1.9
4   C215-K240/C215-K240       0.3
5   C215-K240/K214-K240       2.6
6   K214-K240/K214-K240       2.4
7   K214-K240/KΔ214-K240      1.7
8   K214-K240*/KΔ214-K240     0.2
9   T210-K240/K214-K240       0.6
10  T210-K240/KΔ214-K240      0.5
11  T211-S218/G118-R129       5.0
12  T211-S218/C215-K240       0.2
13  T211-S218/K214-K240       0.2
14  T211-S218/KΔ214-K240      0.4
[Figure 3-8, panel B: size exclusion chromatograms (mAU versus time, 0 to 50 min) of IgG2-Control and IgG2-pH8.5/50C/1wk, with peak annotations of 6%, 1%, and 3%.]
Figure 3-8. (A) Detection of cross-links in IgG2 by reducing SDS-PAGE. Lanes 1, 2, 3, and 4 are the molecular weight marker, control, acid-stressed sample (pH 3.0 at 50 °C for 1 week), and base-stressed sample (pH 8.5 at 50 °C for 1 week), respectively. About 8% (combined) non-reducible high molecular weight bands were observed in the base-stressed IgG2, as quantified by the software TotalLab Quant version 12.4. (B) Aggregation analysis by size exclusion chromatography. Seven percent (7%) aggregation was observed in the base-stressed IgG2. Column: TSKgel G3000 SWXL, 7.8 x 300 mm, 5 µm (two in tandem); mobile phase: 150 mM NaCl in 100 mM sodium phosphate buffer, pH 6.9; column temperature: ambient; flow rate: 0.5 mL/min; detection: 215 nm; injection: 10 µL of 2.5 mg/mL sample.
3.5 Formation of Thioether
Thioether is a known modification of proteins[50-55]. For IgG1, a thioether crosslink was located at the disulfide bond between the light chain C-terminus and the heavy chain hinge region[52, 55]. A generally accepted mechanism involves β-elimination of the disulfide to generate dehydroalanine, followed by Michael addition of another cysteinyl thiol[52-54, 56, 57]. Basic conditions and structural flexibility generally favor its formation[50-55]. In addition, radical intermediates have been postulated for desulfurization[57, 58]. The hinge region of IgG2 is highly flexible and solvent exposed, and therefore very susceptible to this transformation. Indeed, our results indicate that it occurs more frequently at the light chain C-termini and in the hinge region of IgG2. Interestingly, the disulfide bonds between heavy chain C127 and heavy chain C215 (or C216) in the IgG2 A/B form (or B form) are also reactive (Figure 3-9). The thioether crosslinks at HC:Cys127-HC:Cys215 (or HC:Cys127-HC:Cys216), HC:Cys215-HC:Cys215 (or HC:Cys216-HC:Cys216), LC:Cys217-HC:Cys127, and LC:Cys217-HC:Cys215 (or LC:Cys217-HC:Cys216) originated from native disulfides, shown in red in Figure 3-9. These thioether linkages agree with the previous reports on IgG2 disulfide bond pairing[59-61]. In Table 3-8, the crosslinked peptide #8 (HC:K214-K240*/HC:KΔ214-K240) contains a thioether and a dehydroalanine. The corresponding linear peptide K214-K240* (214KCC*VECPPCPAPPVAGPSVFLFPPKPK240; the dehydroalanine at C215 or C216 is denoted by an asterisk) was also found. Altogether, these data are consistent with thioether formation via dehydroalanine intermediates.
Scheme 3-2. Establishment of crosslink chemistry based on formula C4H6O4S obtained from elemental composition analysis of 149.9987 Da.
Figure 3-9. Major disulfide linkage isoforms in IgG2. Those labeled in red were found to
convert into thioethers in IgG2.
3.6 Conclusions
The utility of our XChem-Finder strategy for the characterization of protein crosslinking with undefined chemistry is exemplified by the discovery of fourteen thioether peptides in IgG2. Essential to our approach is 18O-isotope labeling: it allows the facile detection of crosslinked peptides and, most significantly, divides the complex tandem mass spectra into subsets that can be processed by standard database search (FindPept, which matches fragment ions with partial peptide sequences) and de novo sequencing (sequence tags). High-resolution spectral data also dramatically improve the confidence of assignment and, moreover, reveal the chemical nature of the crosslinking. While the reported work was processed manually, most steps can be automated. Hence our XChem-Finder strategy should be generally applicable for the discovery of crosslinked proteins, without prior defined chemistry, in both biological systems and biopharmaceuticals.
3.7 References
[1] Liu H, Gaza-Bulseco G, Faldu D, Chumsae C, Sun J. Heterogeneity of monoclonal antibodies. J Pharm Sci 2008;97:2426-47.
[2] DiMarco T, Giulivi C. Current analytical methods for the detection of dityrosine, a biomarker of oxidative stress, in biological samples. Mass Spectrom Rev 2007;26:108-20.
[3] Srivastava OP, Kirk MC, Srivastava K. Characterization of covalent multimers of crystallins in aging human lenses. J Biol Chem 2004;279:10901-9.
[4] Wilhelmus MM, Grunberg SC, Bol JG, van Dam AM, Hoozemans JJ, Rozemuller AJ, et al. Transglutaminases and transglutaminase-catalyzed cross-links colocalize with the pathological lesions in Alzheimer's disease brain. Brain Pathol 2009;19:612-22.
[5] Lopez B, Gonzalez A, Hermida N, Valencia F, de Teresa E, Diez J. Role of lysyl oxidase in myocardial fibrosis: from basic science to clinical aspects. Am J Physiol Heart Circ Physiol 2010;299:H1-9.
[6] Nemes Z, Devreese B, Steinert PM, Van Beeumen J, Fesus L. Cross-linking of ubiquitin, HSP27, parkin, and alpha-synuclein by gamma-glutamyl-epsilon-lysine bonds in Alzheimer's neurofibrillary tangles. FASEB J 2004;18:1135-7.
[7] Friedman M. Chemistry, biochemistry, nutrition, and microbiology of lysinoalanine, lanthionine, and histidinoalanine in food and other proteins. J Agric Food Chem 1999;47:1295-319.
[8] Nashef AS, Osuga DT, Lee HS, Ahmed AI, Whitaker JR, Feeney RE. Effects of alkali on proteins. Disulfides and their products. J Agric Food Chem 1977;25:245-51.
[9] Leitner A, Walzthoeni T, Kahraman A, Herzog F, Rinner O, Beck M, et al. Probing native protein structures by chemical cross-linking, mass spectrometry, and bioinformatics. Mol Cell Proteomics 2010;9:1634-49.
[10] Singh P, Panchaud A, Goodlett DR. Chemical cross-linking and mass spectrometry as a low-resolution protein structure determination technique. Anal Chem 2010;82:2636-42.
[11] Young MM, Tang N, Hempel JC, Oshiro CM, Taylor EW, Kuntz ID, et al. High throughput protein fold identification by using experimental constraints derived from intramolecular cross-links and mass spectrometry. Proc Natl Acad Sci U S A 2000;97:5802-6.
[12] Lee YJ, Lackner LL, Nunnari JM, Phinney BS. Shotgun cross-linking analysis for studying quaternary and tertiary protein structures. J Proteome Res 2007;6:3908-17.
[13] Hoopmann MR, Weisbrod CR, Bruce JE. Improved strategies for rapid identification of chemically cross-linked peptides using protein interaction reporter technology. J Proteome Res 2010;9:6323-33.
[14] Du X, Chowdhury SM, Manes NP, Wu S, Mayer MU, Adkins JN, et al. Xlink-identifier: an automated data analysis platform for confident identifications of chemically cross-linked peptides using tandem mass spectrometry. J Proteome Res 2011;10:923-31.
[15] Chowdhury SM, Du X, Tolic N, Wu S, Moore RJ, Mayer MU, et al. Identification of cross-linked peptides after click-based enrichment using sequential collision-induced dissociation and electron transfer dissociation tandem mass spectrometry. Anal Chem 2009;81:5524-32.
[16] Zhang Z. Large-scale identification and quantification of covalent modifications in therapeutic proteins. Anal Chem 2009;81:8354-64.
[17] Zang T, Lee BW, Cannon LM, Ritter KA, Dai S, Ren D, et al. A naturally occurring brominated furanone covalently modifies and inactivates LuxS. Bioorg Med Chem Lett 2009;19:6200-4.
[18] Wan W, Zhao G, Al-Saad K, Siems WF, Zhou ZS. Rapid screening for S-adenosylmethionine-dependent methylation products by enzyme-transferred isotope patterns analysis. Rapid Commun Mass Spectrom 2004;18:319-24.
[19] Gao Q, Xue S, Doneanu CE, Shaffer SA, Goodlett DR, Nelson SD. Pro-CrossLink. Software tool for protein cross-linking and mass spectrometry. Anal Chem 2006;78:2145-9.
[20] Gao Q, Xue S, Shaffer SA, Doneanu CE, Goodlett DR, Nelson SD. Minimize the detection of false positives by the software program DetectShift for 18O-labeled cross-linked peptide analysis. Eur J Mass Spectrom (Chichester, Eng) 2008;14:275-80.
[21] Zelter A, Hoopmann MR, Vernon R, Baker D, MacCoss MJ, Davis TN. Isotope signatures allow identification of chemically cross-linked peptides by mass spectrometry: a novel method to determine interresidue distances in protein structures through cross-linking. J Proteome Res 2010;9:3583-9.
[22] Rinner O, Seebacher J, Walzthoeni T, Mueller LN, Beck M, Schmidt A, et al. Identification of cross-linked peptides from large sequence databases. Nat Methods 2008;5:315-8.
[23] Seebacher J, Mallick P, Zhang N, Eddes JS, Aebersold R, Gelb MH. Protein cross-linking analysis using mass spectrometry, isotope-coded cross-linkers, and integrated computational data processing. J Proteome Res 2006;5:2270-82.
[24] Seidler J, Zinn N, Boehm ME, Lehmann WD. De novo sequencing of peptides by MS/MS. Proteomics 2010;10:634-49.
[25] Singh P, Shaffer SA, Scherl A, Holman C, Pfuetzner RA, Larson Freeman TJ, et al. Characterization of protein cross-links via mass spectrometry and an open-modification search strategy. Anal Chem 2008;80:8799-806.
[26] Kroon DJ, Baldwin-Ferro A, Lalan P. Identification of sites of degradation in a therapeutic monoclonal antibody by peptide mapping. Pharm Res 1992;9:1386-93.
[27] Van Buren N, Rehder D, Gadgil H, Matsumura M, Jacob J. Elucidation of two major aggregation pathways in an IgG2 antibody. J Pharm Sci 2009;98:3013-30.
[28] Back JW, Notenboom V, de Koning LJ, Muijsers AO, Sixma TK, de Koster CG, et al. Identification of cross-linked peptides for protein interaction studies using mass spectrometry and 18O labeling. Anal Chem 2002;74:4417-22.
[29] Ren D, Pipes GD, Liu D, Shih LY, Nichols AC, Treuheit MJ, et al. An improved trypsin digestion method minimizes digestion-induced modifications on proteins. Anal Biochem 2009;392:12-21.
[30] Zhang Z. Prediction of low-energy collision-induced dissociation spectra of peptides with three or more charges. Anal Chem 2005;77:6364-73.
[31] Zhang Z. Prediction of collision-induced-dissociation spectra of peptides with post-translational or process-induced modifications. Anal Chem 2011;83:8642-51.
[32] Zhang Z. Prediction of low-energy collision-induced dissociation spectra of peptides. Anal Chem 2004;76:3908-22.
[33] Zhang Z. Retention time alignment of LC/MS data by a divide-and-conquer algorithm. J Am Soc Mass Spectrom 2012;23:764-72.
[34] Schnolzer M, Jedrzejewski P, Lehmann WD. Protease-catalyzed incorporation of 18O into peptide fragments and its application for protein sequencing by electrospray and matrix-assisted laser desorption/ionization mass spectrometry. Electrophoresis 1996;17:945-53.
[35] Ye X, Luke B, Andresson T, Blonder J. 18O stable isotope labeling in MS-based proteomics. Brief Funct Genomic Proteomic 2009;8:136-44.
[36] Fenselau C, Yao X. 18O2-labeling in quantitative proteomic strategies: a status report. J Proteome Res 2009;8:2140-3.
[37] Yao X, Afonso C, Fenselau C. Dissection of proteolytic 18O labeling: endoprotease-catalyzed 16O-to-18O exchange of truncated peptide substrates. J Proteome Res 2003;2:147-52.
[38] Yao X, Freas A, Ramirez J, Demirev PA, Fenselau C. Proteolytic 18O labeling for comparative proteomics: model studies with two serotypes of adenovirus. Anal Chem 2001;73:2836-42.
[39] Bantscheff M, Dumpelfeld B, Kuster B. Femtomol sensitivity post-digest (18)O labeling for relative quantification of differential protein complex composition. Rapid Commun Mass Spectrom 2004;18:869-76.
[40] Stewart II, Thomson T, Figeys D. 18O labeling: a tool for proteomics. Rapid Commun Mass Spectrom 2001;15:2456-65.
[41] Koehler CJ, Arntzen MO, de Souza GA, Thiede B. An Approach for Triplex-Isobaric Peptide Termini Labeling (Triplex-IPTL). Anal Chem 2013.
[42] Koehler CJ, Arntzen MO, Strozynski M, Treumann A, Thiede B. Isobaric peptide termini labeling utilizing site-specific N-terminal succinylation. Anal Chem 2011;83:4775-81.
[43] Nakazawa T, Yamaguchi M, Okamura TA, Ando E, Nishimura O, Tsunasawa S. Terminal proteomics: N- and C-terminal analyses for high-fidelity identification of proteins using MS. Proteomics 2008;8:673-85.
[44] Liu M, Cheetham J, Cauchon N, Ostovic J, Ni W, Ren D, et al. Protein isoaspartate methyltransferase-mediated 18O-labeling of isoaspartic acid for mass spectrometry analysis. Anal Chem 2012;84:1056-62.
[45] Ni W, Dai S, Karger BL, Zhou ZS. Analysis of isoaspartic acid by selective proteolysis with Asp-N and electron transfer dissociation mass spectrometry. Anal Chem 2010;82:7485-91.
[46] Alfaro JF, Gillies LA, Sun HG, Dai S, Zang T, Klaene JJ, et al. Chemo-enzymatic detection of protein isoaspartate using protein isoaspartate methyltransferase and hydrazine trapping. Anal Chem 2008;80:3882-9.
[47] Gururaja TL, Payan DG, Anderson DC. Gas phase dimerization of neuropeptide head activator analogs useful for the noncovalent constraint of peptides. Biopolymers 2007;88:55-63.
[48] Banerjee S, Mazumdar S. Non-covalent dimers of the lysine containing protonated peptide ions in gaseous state: electrospray ionization mass spectrometric study. J Mass Spectrom 2010;45:1212-9.
[49] Gurd FR. [34a] carboxymethylation. Methods Enzymol 1972;25:424-38.
[50] Datola A, Richert S, Bierau H, Agugiaro D, Izzo A, Rossi M, et al. Characterisation of a novel growth hormone variant comprising a thioether link between Cys182 and Cys189. ChemMedChem 2007;2:1181-9.
[51] Lispi M, Datola A, Bierau H, Ceccarelli D, Crisci C, Minari K, et al. Heterogeneity of commercial recombinant human growth hormone (r-hGH) preparations containing a thioether variant. J Pharm Sci 2009;98:4511-24.
[52] Cohen SL, Price C, Vlasak J. Beta-elimination and peptide bond hydrolysis: two distinct mechanisms of human IgG1 hinge fragmentation upon storage. J Am Chem Soc 2007;129:6976-7.
[53] Florence TM. Degradation of protein disulphide bonds in dilute alkali. Biochem J 1980;189:507-20.
[54] Galande AK, Trent JO, Spatola AF. Understanding base-assisted desulfurization using a variety of disulfide-bridged peptides. Biopolymers 2003;71:534-51.
[55] Tous GI, Wei Z, Feng J, Bilbulian S, Bowen S, Smith J, et al. Characterization of a novel modification to monoclonal antibodies: thioether cross-link of heavy and light chains. Anal Chem 2005;77:2675-82.
[56] Zhao G, Zhou ZS. Vinyl sulfonium as novel proteolytic enzyme inhibitor. Bioorg Med Chem Lett 2001;11:2331-5.
[57] Wang Z, Rejtar T, Zhou ZS, Karger BL. Desulfurization of cysteine-containing peptides resulting from sample preparation for protein characterization by mass spectrometry. Rapid Commun Mass Spectrom 2010;24:267-75.
[58] Zhou ZS, Smith AE, Matthews RG. L-Selenohomocysteine: one-step synthesis from L-selenomethionine and kinetic analysis as substrate for methionine synthases. Bioorg Med Chem Lett 2000;10:2471-5.
[59] Dillon TM, Ricci MS, Vezina C, Flynn GC, Liu YD, Rehder DS, et al. Structural and functional characterization of disulfide isoforms of the human IgG2 subclass. J Biol Chem 2008;283:16206-15.
[60] Wypych J, Li M, Guo A, Zhang Z, Martinez T, Allen MJ, et al. Human IgG2 antibodies display disulfide-mediated structural isoforms. J Biol Chem 2008;283:16194-205.
[61] Zhang B, Harder AG, Connelly HM, Maheu LL, Cockrill SL. Determination of Fab-hinge disulfide connectivity in structural isoforms of a recombinant human immunoglobulin G2 antibody. Anal Chem 2010;82:1090-9.
Chapter 4: Discovery and Characterization of a Novel Photo-Oxidative Histidine-Histidine Crosslink in IgG1 Antibody Utilizing 18O-labeling and Mass Spectrometry
Reproduced with permission from “Min Liu, Zhongqi Zhang, Janet Cheetham, Da Ren, and Zhaohui Sunny Zhou. Discovery and Characterization of a Novel Photo-Oxidative Histidine-Histidine Crosslink in IgG1 Antibody Utilizing 18O-labeling and Mass Spectrometry. Analytical Chemistry 2014, 86, 4940-4948.” Copyright 2014 American Chemical Society.
Co-authors’ work in this chapter: Min Liu: experimental design and execution, data analysis, manuscript writing and revision; Zhongqi Zhang: data analysis, manuscript writing and revision; Janet Cheetham: idea contribution, manuscript writing and revision, and grant support; Da Ren: manuscript writing and revision; Zhaohui Sunny Zhou: idea contribution, data analysis, manuscript writing and revision, and grant support.
4.1 Abstract
A novel photo-oxidative crosslink between two histidines (His-His) has been discovered and characterized in an IgG1 antibody via the XChem-Finder workflow, which combines 18O labeling and mass spectrometry (Anal Chem 2013, 85, 5900-5908). Its structure was elucidated by peptide mapping with multiple proteases of various specificities (e.g., trypsin, Asp-N, and GluC combined with trypsin or Asp-N) and mass spectrometry with complementary fragmentation modes (e.g., collision-induced dissociation (CID) and electron-transfer dissociation (ETD)). Our data indicated that crosslinking occurred across two identical conserved histidine residues on two separate heavy chains in the hinge region, which is highly flexible and solvent accessible. Based on model studies with short peptides, it has been proposed that singlet oxygen reacts with the histidyl imidazole ring to form an endoperoxide, which is then converted to the 2-oxo-histidine (2-oxo-His) and His+32 intermediates; the latter is subject to nucleophilic attack by an unmodified histidine, and elimination of a water molecule then gives the final adduct with a net mass increase of 14 Da. Our findings are consistent with this mechanism. The successful discovery of the His-His crosslink again demonstrates the broad applicability and utility of our XChem-Finder approach in the discovery and elucidation of protein cross-linking, particularly without a priori knowledge of the chemical nature and site of crosslinking.
4.2 Introduction
Protein crosslinks are ubiquitous in biological systems and biopharmaceuticals. They are also involved in disease pathologies such as Alzheimer's disease[1-3] and cataractogenesis[2, 4]. As one of the post-translational modifications and degradations that occur during the production, processing, and storage of biopharmaceutical proteins, crosslinks have been reported to result in aggregation, loss of bioactivity, and immunogenicity[5-7].
Despite the rapid advancements in mass spectrometry and data analysis algorithms, characterization of protein crosslinks remains challenging due to their structural complexity[8]. Whereas a limited set of crosslinked structures (e.g., thioether[7, 9-12]) have been characterized, most remain unknown; examples include the non-disulfide covalent crosslinking in crystallin[4, 13, 14], collagen[15], ubiquitylated proteins[3], ribonuclease A[16] and monoclonal antibodies[17, 18]. It is particularly challenging to characterize protein crosslinking without prior knowledge of the chemical nature and sites of crosslinking, as no theoretical mass or spectrum can be predicted.
In contrast, numerous chemical crosslinkers with well-established crosslinking chemistry have been used in the investigation of protein structures and protein-protein interactions[19-25]. Since pre-defined crosslinking chemistry is involved, various specialized algorithms have been developed for data analysis for each incorporated crosslinker. Naturally, these approaches are less amenable to the identification of crosslinks with undefined crosslinking chemistry. Recently, we developed a workflow, XChem-Finder, that is generally applicable for protein crosslinking. It involves, first, the detection of cross-linked peptides via the unique isotope patterns imparted by 18O-labeling of their two C-termini (compared with one C-terminus for a linear peptide), and then integrated mass spectrometric and data analysis[8].
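The isotope-pattern criterion can be illustrated with a short Python sketch. It assumes complete labeling, with two 18O atoms incorporated at each protease-generated C-terminus, so a linear peptide shifts by about 4.008 Da and a crosslinked pair by about 8.017 Da between the 16O-water and 18O-water digests. The function names and the 0.02 Da tolerance are hypothetical and are not part of the XChem-Finder workflow itself.

```python
# Mass difference between 18O and 16O (Da).
O18_MINUS_O16 = 17.999160 - 15.994915
# Assumption: two 18O atoms are incorporated at each labeled C-terminus.
SHIFT_PER_TERMINUS = 2 * O18_MINUS_O16


def labeled_termini(mass_16o, mass_18o, tol_da=0.02):
    """Estimate how many C-termini carry the full two-18O label.

    Returns None when the shift is not close to a whole multiple of
    SHIFT_PER_TERMINUS (for example, with incomplete labeling).
    """
    shift = mass_18o - mass_16o
    n = round(shift / SHIFT_PER_TERMINUS)
    if n < 0 or abs(shift - n * SHIFT_PER_TERMINUS) > tol_da:
        return None
    return n


def classify(mass_16o, mass_18o):
    """Label a matched 16O/18O mass pair by its number of labeled termini."""
    n = labeled_termini(mass_16o, mass_18o)
    if n == 1:
        return "linear (one labeled C-terminus)"
    if n == 2:
        return "crosslink candidate (two labeled C-termini)"
    return "unassigned"
```

For example, `classify(4048.963, 4048.963 + 8.017)` returns the crosslink label. A crosslinked pair that contains the original protein C-terminus carries only one labeled terminus and would be classified as linear by this simple rule, so such peptides need a separate search.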
IgG1 and IgG2 are the most popular therapeutic monoclonal antibodies on the market[26]. Applying our XChem-Finder workflow, we have discovered and characterized a novel histidine-histidine (His-His) crosslink in an IgG1 antibody. High molecular weight species in the light-irradiated IgG1 were detected by reducing SDS-PAGE and size exclusion chromatography (SEC). Our LC-MS analysis indicated that crosslinking occurred across two identical conserved histidine residues (His220) on two separate heavy chains in the hinge region, which is highly flexible and solvent accessible. The crosslinking chemistry is consistent with the mechanism proposed from model peptides under photo-oxidative conditions (see Scheme 4-1)[16, 27-29]. The successful discovery of the His-His crosslink in IgG1 further demonstrates the general applicability and power of our XChem-Finder workflow. To the best of our knowledge, the work reported herein is the first example of such crosslinking in a protein.
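The net mass change implied by the proposed mechanism can be checked with simple arithmetic. The sketch below assumes that the His+32 intermediate corresponds to the addition of two oxygen atoms and that one water molecule is eliminated after the nucleophilic attack; the constant and function names are illustrative only.

```python
# Monoisotopic atomic masses (Da).
H = 1.007825
O = 15.994915

HIS_PLUS_32 = 2 * O                # assumed: His+32 is the addition of O2
WATER = 2 * H + O                  # H2O eliminated in the final step
NET_SHIFT = HIS_PLUS_32 - WATER    # net change for the crosslinked pair


def his_his_crosslink_mass(mass_a, mass_b):
    """Expected mass of two peptides joined by the proposed His-His adduct."""
    return mass_a + mass_b + NET_SHIFT
```

Under these assumptions `NET_SHIFT` is about 13.979 Da, that is, the nominal 14 Da increase (one oxygen gained, two hydrogens lost) relative to the sum of the two unmodified peptides.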
4.3 Experimental Section
4.3.1 Chemicals
All chemicals were reagent grade or above. Guanidine hydrochloride (GndHCl), ethylenediaminetetraacetic acid (EDTA), dithiothreitol (DTT), iodoacetic acid (IAA), trifluoroacetic acid (TFA), acetonitrile (ACN), HPLC-grade water, and bradykinin were from Sigma-Aldrich (St. Louis, MO, USA). Sequencing grade trypsin, GluC, and Asp-N were from Roche (Indianapolis, IN, USA). 18O-water (97%) was from Cambridge Isotope Laboratories (Andover, MA, USA). Recombinant monoclonal IgG1 antibody (anti-streptavidin immunoglobulin gamma 1) was produced in Chinese hamster ovary (CHO) cells (Amgen, Thousand Oaks, CA, USA), purified according to standard manufacturing procedures, formulated at a concentration of 30 mg/mL in 50 mM sodium acetate at pH 5.2, and stored at -70 °C.
4.3.2 Generation of Stressed Sample
After being exchanged into various buffers of biopharmaceutical interest (50 mM sodium acetate at pH 4.8, 50 mM sodium phosphate at pH 7.4, 50 mM sodium bicarbonate at pH 9.0, or water), the IgG1 antibody at a concentration of 5 mg/mL in a clear 3 mL glass vial was placed in a light chamber (Atlas Suntest CPS+ with Xenon Lamp and ID65 solar filter, controlled irradiance at 300-800 nm, light intensity at 765 W/m2) and exposed to light irradiation for 7, 14, and 22 h. These conditions represent light exposures of 1 x, 2 x, and 3 x ICH (International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use), respectively.
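As a rough check on the exposure levels, the cumulative radiant exposure is the irradiance multiplied by the irradiation time. The following sketch uses only the values stated above; the variable names are illustrative, and no ICH threshold values are assumed.

```python
IRRADIANCE_W_PER_M2 = 765.0      # chamber setting, 300-800 nm
EXPOSURE_HOURS = (7, 14, 22)     # the three irradiation times


def radiant_exposure_wh_per_m2(irradiance, hours):
    """Cumulative radiant exposure: irradiance (W/m2) times time (h)."""
    return irradiance * hours


DOSES = {h: radiant_exposure_wh_per_m2(IRRADIANCE_W_PER_M2, h)
         for h in EXPOSURE_HOURS}
# Exposure relative to the shortest (7 h) condition.
RELATIVE = {h: DOSES[h] / DOSES[7] for h in EXPOSURE_HOURS}
```

The three conditions correspond to 5355, 10710, and 16830 Wh/m2 over the 300-800 nm range; the 22 h exposure is slightly more than three times the 7 h exposure.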
4.3.3 Aggregates by Size Exclusion Chromatography
Size exclusion chromatography (SEC) analysis of reduced IgG1 was carried out as described[30]. Briefly, IgG1 was diluted to 1 mg/mL in a denaturing buffer (7.5 M GndHCl, 2 mM EDTA and 0.25 M Tris-HCl, pH 7.5) at room temperature. Reduction was accomplished with 10 mM DTT at room temperature for 30 min. Then 50 µL of the reduced sample was injected onto a TSKgel G3000 SWXL column (7.8 x 300 mm, 5 µm) with an isocratic mobile phase of 0.1% TFA/H2O:ACN (80:20) at a flow rate of 0.2 mL/min. The column was held at room temperature, and UV detection was at 280 nm.
4.3.4 Reduction, Alkylation, Tryptic Digestion and 18O-Labeling of IgG1
IgG1 was digested with trypsin following a procedure similar to that described by Ren et al[31]. Briefly, IgG1 was diluted to 1 mg/mL in a denaturing buffer (7.5 M GndHCl, 2 mM EDTA and 0.25 M Tris-HCl, pH 7.5) to a final volume of 0.5 mL. Reduction was accomplished with the addition of 3 μL of 0.5 M DTT followed by 30 min incubation at room temperature. S-Carboxymethylation was achieved with the addition of 7 μL of 0.5 M IAA, and the resulting mixture was incubated at room temperature in the dark for 15 min. Excess IAA was quenched with the addition of 4 μL of 0.5 M DTT. The reduced and alkylated IgG1 samples were subsequently exchanged into the digestion buffer (0.1 M Tris-HCl at pH 7.5) using a NAP-5 size-exclusion column (GE Healthcare, Piscataway, NJ, USA). Next, two aliquots (200 µL each) were completely dried via SpeedVac and reconstituted separately into the same volume of 18O-water or 16O-water; then 6 µL of 1 mg/mL trypsin in 18O-water or 16O-water solution, respectively, was added to achieve a 1:25 (w/w) enzyme/substrate ratio. The reaction mixtures were incubated at 37 °C for 30 min.
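The reagent amounts in the procedure above can be converted to approximate working concentrations with a simple dilution calculation. The sketch below is illustrative only: the function and variable names are hypothetical, volumes are assumed to be additive, and the substrate mass is inferred from the stated 1:25 ratio rather than given in the text.

```python
def final_conc_mM(stock_M, added_uL, total_uL):
    """Final concentration (mM) of a stock after dilution to total_uL."""
    return stock_M * 1000.0 * added_uL / total_uL


START_UL = 500.0                       # denatured IgG1 at 1 mg/mL
DTT_UL, IAA_UL, QUENCH_UL = 3.0, 7.0, 4.0
STOCK_M = 0.5                          # both DTT and IAA stocks are 0.5 M

dtt_mM = final_conc_mM(STOCK_M, DTT_UL, START_UL + DTT_UL)
iaa_mM = final_conc_mM(STOCK_M, IAA_UL, START_UL + DTT_UL + IAA_UL)
total_dtt_umol = STOCK_M * (DTT_UL + QUENCH_UL)   # M x uL = umol
total_iaa_umol = STOCK_M * IAA_UL

# Substrate mass implied by 6 uL of 1 mg/mL trypsin at 1:25 (w/w).
trypsin_ug = 6.0 * 1.0
substrate_ug = trypsin_ug * 25
```

Under these assumptions the reduction step is at about 3 mM DTT and the alkylation step at about 7 mM IAA, and the total DTT added (3.5 µmol, reduction plus quench) equals the IAA added. The 1:25 ratio implies roughly 150 µg of IgG1 in each 200 µL aliquot.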
Other proteolytic digestions of IgG1 (Asp-N, trypsin combined with GluC, and Asp-N combined with GluC) were performed in 16O-water only. Proteases were added to 100 µL of the above buffer-exchanged antibody to achieve a 1:25 (w/w) enzyme/substrate ratio. The reaction mixtures were incubated at 37 °C overnight.
Limited Asp-N digestion was performed by adding 6 µg of Asp-N to 300 µL of the digest obtained with trypsin combined with GluC and incubating at 37 °C for 1.5 h for LC/CID-MS analysis. An aliquot of 200 µL of the above digest was dried via SpeedVac and reconstituted into 40 µL of water for LC/ETD-MS analysis.
4.3.5 HPLC
The proteolytic digests of IgG1 (25 μL) were separated on a Jupiter C5 column (250 x
2.0 mm, 5 μm, 300Å, Phenomenex, Torrance, CA, USA) at 50 °C with a flow rate of 200
μL/min on a HPLC system (Agilent 1100, Palo Alto, CA, USA). Mobile phase A was 0.1%
TFA in water (v/v) while mobile phase B contained 0.085% TFA in 90% ACN / 10% water. A
gradient was applied by holding at 2% B for 2 min, increasing to 22% B in 38 min, then 42% B
in 80 min, then 100% B in 25 min followed by holding at 100% B for 5 min. The column was
re-equilibrated at 2% B for 30 min prior to next injection.
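As a minimal sketch of the gradient program described above, the following Python snippet encodes the breakpoints and interpolates the nominal %B at a given time. It assumes that each "in N min" denotes the duration of a segment (so the program ends at 150 min) and it ignores the dwell volume of the system; the names GRADIENT and percent_b are illustrative and do not belong to any instrument software.

```python
from bisect import bisect_right

# (time in min, %B) breakpoints; segment durations are an assumption
GRADIENT = [
    (0.0, 2.0),      # start of run
    (2.0, 2.0),      # hold at 2% B for 2 min
    (40.0, 22.0),    # ramp to 22% B over 38 min
    (120.0, 42.0),   # ramp to 42% B over 80 min
    (145.0, 100.0),  # ramp to 100% B over 25 min
    (150.0, 100.0),  # hold at 100% B for 5 min
]


def percent_b(t_min: float) -> float:
    """Linearly interpolate the programmed %B at t_min minutes."""
    times = [t for t, _ in GRADIENT]
    if t_min <= times[0]:
        return GRADIENT[0][1]
    if t_min >= times[-1]:
        return GRADIENT[-1][1]
    i = bisect_right(times, t_min)          # first breakpoint after t_min
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

Under these assumptions, a peptide eluting at 112.48 min (the retention time listed for the crosslinked tryptic peptide in Table 4-3) corresponds to a programmed composition of about 40% B.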
For ETD analysis, digests of IgG1 (6 μL) were separated on a PROTO C4 column (150 x
1.0 mm, 5 μm, 300Å, Higgins Analytical, Mountain View, CA, USA) at 50 °C with a flow rate
of 60 μL/min on a HPLC system (Agilent 1100, Palo Alto, CA, USA). Mobile phase A was
0.1% FA / 0.02% TFA in water (v/v) while mobile phase B contained 0.1% FA / 0.02% TFA in
90% ACN / 10% water. The same gradient as described above was applied.
4.3.6 Mass Spectrometry
An LTQ-Orbitrap mass spectrometer (Thermo Fisher Scientific, San Jose, CA, USA) was
used in-line with a HPLC system for the analyses of the IgG1 proteolytic digests. A full MS
scan (with 60,000 resolution at m/z 400 and an automatic gain control (AGC) target value of
2 × 10^5) followed by data-dependent MS/MS scans of the three most abundant precursor ions was set up to acquire both the peptide mass and sequence information. The spray voltage was 5.5 kV, and the capillary temperature was 250 °C. The instrument was tuned using the doubly charged ion of a synthetic peptide, bradykinin. The MS/MS spectra were obtained using CID with normalized collision energy of 35%. For MS/MS with ion detection in the Orbitrap, the
AGC target was set to 3 × 10^6, resolution to 7,500, and the precursor isolation width to 4 m/z units.
Under our experimental conditions, the typical mass accuracy in full MS scan and FT MS/MS is
5 and 10 ppm, respectively.
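The ppm figures used throughout this chapter follow the usual relative-error definition. The sketch below is illustrative only: the function names and the idea of applying the 5 ppm full-scan figure as a tolerance are assumptions, while the example masses are the observed and theoretical values reported for the crosslinked tryptic peptide.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed: float, theoretical: float, tol_ppm: float) -> bool:
    """True if the absolute ppm error does not exceed tol_ppm."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


# Crosslinked S215-K244/S215-K244 peptide: observed vs. theoretical mass (Da)
err = ppm_error(6687.149, 6687.153)
ok_full_scan = within_tolerance(6687.149, 6687.153, tol_ppm=5.0)
```

For this peptide the absolute error is about 0.6 ppm, well inside the 5 ppm full-scan accuracy quoted above.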
ETD spectra were acquired on a Thermo-Scientific LXQ-XL mass spectrometer in
centroid mode with isolation width of 5, reaction time of 75 ms and reagent target value of
1 × 10^5, using singly charged fluoranthene anions as the ETD reagent. Both CID and ETD data
were analyzed for peptide identification, using a custom-written algorithm MassAnalyzer and verified manually[32-35].
4.4 Results and Discussion
A novel His-His crosslink in proteins has been discovered via our XChem-Finder workflow, without pre-defined crosslinking chemistry. Peptide mapping with mass spectrometry has established that the crosslink occurred across two identical His220 on each of the two heavy chains in the hinge region.
4.4.1 Detection of Crosslinked Protein.
Photo-induced non-reducible high molecular weight species were detected by reducing
SDS-PAGE; their intensities increased with longer light exposure (Figure 4-1A). Their formation was pH-dependent: less favorable under acidic conditions, such as pH ~5 for typical formulation of proteins (Figure 4-1B); and more favorable in neutral or basic buffers that are commonly used in protein production and purification (Figure 4-1B). The crosslinked species were also quantified by size exclusion chromatography (SEC) (Figure 4-1C and D). Mobile phase of 0.1% TFA/H2O:ACN (80:20) was used to avoid hydrophobic interaction with stationary phase[30]. The results from SEC and SDS-PAGE were consistent. The total amounts of the early elution peaks observed were at the level of 0.2, 4.5, 9.5, & 16.5% by peak area in the control sample and samples exposed to 1x, 2x, 3x ICH irradiation, respectively (Figure 4-1C).
The crosslinks were also observed to increase to 25.8% in 50 mM NaHCO3 pH 9.0, 15.7% in
50 mM sodium phosphate pH 7.4, and 6.3% in 50 mM sodium acetate pH 4.8 (Figure 4-1D). It is interesting to note that the control sample (without light stress) already contained a small yet detectable amount of crosslinking (0.4%, Figure 4-1D), suggesting such modifications could
occur during routine protein production and processing. The chemical nature and site of crosslinking were discovered by our XChem-Finder workflow as detailed next[8].
[Figure 4-1, panels C and D: SEC chromatograms (absorbance vs. retention time in minutes) with the aggregate, heavy chain and light chain peaks labeled. Panel C, total crosslinks by peak area: control 0.2%, 1x light 4.5%, 2x light 9.5%, 3x light 16.5%. Panel D, total crosslinks by peak area: control 0.4%, pH 4.8 (3x light) 6.3%, pH 7.4 (3x light) 15.7%, pH 9.0 (3x light) 25.8%.]
Figure 4-1. Detection of crosslinking in IgG1 by reduced SDS-PAGE (A & B) and size exclusion chromatography
(SEC) (C & D). Samples with the corresponding lane numbers in parenthesis are molecular ladder (1), IgG1-control
(without stress) (2), IgG1-water-1xLight (3), IgG1-water-2xLight (4) & IgG1-water-3xLight (5); molecular ladder
(6), IgG1-control (7), IgG1-pH4.8-3xLight (8), IgG1-pH7.4-3xLight (9), & IgG1-pH 9.0-3xLight (10), respectively.
Formation of the high molecular weight bands increased with longer light exposure and was more favorable under basic than acidic conditions. His-His crosslinking of two heavy chains is likely to contribute to the band at ~100 kDa. The thioether crosslink between heavy chain and light chain probably corresponds to the band at ~92 kDa.
Other high molecular weight bands could be due to other unknown crosslinking.
4.4.2 Detection of Crosslinked Peptides.
Tryptic digestion in 18O-water results in the incorporation of two 18O atoms in each of the
newly generated C-termini[36, 37]; hence, two 18O atoms for a linear tryptic peptide (with one
C-terminus) and four 18O atoms for a crosslinked peptide (with two C-termini)[38-40]. As
shown in Figure 4-2, the isotopic distribution of the peptide at m/z 1673.54 (quadruply charged,
monoisotopic mass 6687.149 Da) shows a mass shift of 8 Da (i.e., four 18O) in 18O-water compared
to that from 16O-water (Figure 4-2), indicating it contains two C-termini and is a crosslinked
peptide.
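The expected shift can be worked out directly from the isotope masses. The sketch below assumes complete incorporation of two 18O atoms per newly generated C-terminus, as stated above; the function names are illustrative.

```python
# Monoisotopic mass difference between 18O and 16O, in Da
DELTA_18O = 17.9991610 - 15.9949146


def expected_shift_da(n_c_termini: int, o18_per_terminus: int = 2) -> float:
    """Mass increase of a peptide digested in 18O-water (full labeling assumed)."""
    return n_c_termini * o18_per_terminus * DELTA_18O


def expected_mz_shift(n_c_termini: int, charge: int) -> float:
    """Corresponding shift on the m/z axis for a given charge state."""
    return expected_shift_da(n_c_termini) / charge


linear_shift = expected_shift_da(1)        # linear tryptic peptide
crosslinked_shift = expected_shift_da(2)   # crosslinked peptide, two C-termini
mz_shift_z4 = expected_mz_shift(2, 4)      # quadruply charged precursor
```

A linear peptide is therefore expected to shift by about 4 Da and a crosslinked peptide by about 8 Da, which at z = 4 corresponds to roughly 2 m/z units, consistent with the precursor moving from m/z 1673.54 to 1675.54.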
[Figure 4-2: isotopic envelopes (relative abundance vs. m/z, 1673.0 to 1677.0) of the crosslinked peptide from digestion in 16O-water (top) and 18O-water (bottom); the 18O envelope is shifted by +8 Da.]
Figure 4-2. Isotopic distributions of the crosslinked peptide S215-K244/S215-K244 m/z 1673.54
(z=4) from tryptic digestion of IgG1. A mass increase of 8 Da was observed when the sample was digested in 18O-water instead of regular 16O-water.
4.4.3 Elucidation of Crosslinking Chemistry.
The crosslinked peptide m/z 1673.54 underwent FT MS/MS analysis. As described in our previous paper[8], the fragment ions obtained were searched against the amino acid sequence of the IgG1 via FindPept to match all possible peptide fragments, see Table 4-1. Based on the peptide ladders observed, a partial sequence K218-K244
(KTHTCPPCPAPELLGGPSVFLFPPKPK, see Table 4-1) was identified. Then, the partial sequence was extended to a putative full-length tryptic peptide S215-K244
(SCDKTHTCPPCPAPELLGGPSVFLFPPKPK, 3336.587 Da). Since the fragment ions only matched this single peptide, we surmised that crosslinking occurred across the two identical peptides. The combined mass of the two unmodified (native) peptides is 6673.174 Da, which also satisfies the mass limitation conferred by the observed mass of the crosslinked peptide
(6687.149 Da, see Table 4-2).
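The calculated mass quoted above can be reproduced from standard monoisotopic residue masses. The sketch below assumes that every cysteine is S-carboxymethylated (+58.005 Da each, in line with the IAA alkylation described in the sample preparation); the function name peptide_mass is illustrative, and only the residues needed for this sequence are tabulated.

```python
# Standard monoisotopic residue masses (Da) for the residues in S215-K244
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "H": 137.05891,
    "F": 147.06841,
}
WATER = 18.010565          # H2O added for the free N- and C-termini
CARBOXYMETHYL = 58.005479  # C2H2O2 added to each alkylated Cys (assumption)


def peptide_mass(seq: str, alkylated: bool = True) -> float:
    """Monoisotopic mass of a linear peptide, Cys optionally carboxymethylated."""
    mass = WATER
    for aa in seq:
        mass += RESIDUE[aa]
        if alkylated and aa == "C":
            mass += CARBOXYMETHYL
    return mass


single_chain = peptide_mass("SCDKTHTCPPCPAPELLGGPSVFLFPPKPK")
two_chains = 2 * single_chain
```

With these inputs the single chain comes out at about 3336.587 Da and the two unmodified chains at about 6673.174 Da, matching the values used in Table 4-2.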
In order to elucidate the crosslinking chemistry, elemental composition analysis of the crosslink was performed as illustrated in Table 4-2. The mass difference between the sum of the two native peptide chains and the observed mass of the crosslinked peptide is 13.975 Da, for which three potential formulas (O-2H, N, or CH2) were proposed. From a chemistry perspective, it is difficult to add just one nitrogen atom or a CH2 group. On the other hand, addition of one oxygen atom coupled with the loss of two hydrogen atoms (O-2H) indicates oxidation. The putative peptide chain K218-K244 contains His, of which oxidation and crosslinking have been reported[28, 29]. In addition, the formula O-2H gives the lowest mass error (0.004 Da).
Therefore, a potential His-His crosslinking structure is proposed as illustrated in Table 4-2 and
Figure 4-3 and verified as described next.
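The comparison of candidate formulas can be sketched in a few lines. This is an illustration of the arithmetic only, using standard monoisotopic element masses; the candidate list is the one proposed above, and the variable names are hypothetical.

```python
# Monoisotopic element masses (Da)
MONO = {"H": 1.00782503, "C": 12.0, "N": 14.0030740, "O": 15.9949146}

# Candidate compositions for the +13.975 Da difference (negative = atoms lost)
CANDIDATES = {
    "O-2H": {"O": 1, "H": -2},
    "N": {"N": 1},
    "CH2": {"C": 1, "H": 2},
}


def formula_mass(composition: dict) -> float:
    """Monoisotopic mass of an elemental composition."""
    return sum(MONO[element] * count for element, count in composition.items())


observed_delta = 6687.149 - 2 * 3336.587   # crosslinked peptide minus two chains

# Absolute mass error (Da) of each candidate against the observed difference
errors = {
    name: abs(formula_mass(comp) - observed_delta)
    for name, comp in CANDIDATES.items()
}
ranked = sorted(errors, key=errors.get)    # best-matching candidate first
```

The ranking reproduces the order in Table 4-2, with O-2H giving the smallest error (about 0.004 Da) and CH2 the largest (about 0.041 Da). Mass alone does not settle the assignment; the chemical argument above is still needed.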
Table 4-1. Partial sequences that match the mass of fragmentation ions for the precursor ion m/z 1673.54 (z=4) (molecular mass at 6687.149 Da) eluted at
112.48 min. The longest peptide fragments for the observed b- and y-ions are underlined. All Cys are alkylated with IAA. The adjoining residues before cleavage are in parenthesis. Lys443 is the C-terminus of heavy chain. The crosslinked peptides were proposed to be HC:S215-K244/HC:S215-K244.
#   m/z       Charge  User mass (Da)  Theor. mass (Da)  Δmass (ppm)  Matched peptide                     Tryptic peptide  Notes
1   566.364   1       565.356         565.359            4.8         (F)PPKPK/(D)                        A                chain 1
2   1256.736  1       1255.729        1255.733           3.0         (G)PSVFLFPPKPK/(D)                  A
3   1370.777  1       1369.770        1369.776           4.2         (L)GGPSVFLFPPKPK/(D)                A
4   1483.863  1       1482.856        1482.86            2.5         (L)LGGPSVFLFPPKPK/(D)               A
5   1823.042  1       1822.034        1822.039           2.8         (A)PELLGGPSVFLFPPKPK/(D)            A
6   1991.144  1       1990.136        1990.129          -3.4         (C)PAPELLGGPSVFLFPPKPK/(D)          A
7   1173.626  2       2345.238        2345.249           4.8         (C)PPCPAPELLGGPSVFLFPPKPK/(D)       A
8   1304.665  2       2607.315        2607.312          -1.2         (H)TCPPCPAPELLGGPSVFLFPPKPK/(D)     A
9   1487.750  2       2973.485        2973.513           9.4         (D)KTHTCPPCPAPELLGGPSVFLFPPKPK/(D)  A
10  1478.766  2       2973.528        2973.513          -4.9         (D)KTHTCPPCPAPELLGGPSVFLFPPKPK/(D)  A
11  992.341   1       1009.344        1009.348           4.0         (K)/SCDKTHTC(P)                     A                Same as above
12  1515.547  1       1532.55         1532.558           5.3         (K)/SCDKTHTCPPCPA(P)                A                Same as above
13  1605.255  2       3244.517        3244.493          -7.3         (K)/SRWQQGNVFSCSVMHEALHNHYTQKSL(S)  B                too large as chain 2

Corresponding tryptic peptides:
A   (K)/215SCDKTHTCPPCPAPELLGGPSVFLFPPKPK244/(D)     3336.587 Da
B   (K)/411SRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK443      3813.810 Da
Table 4-2. Deduction of elemental formula for the crosslinked S215-K244/S215-K244 peptide.
Name                                           Mass (Da)
Calculated mass of S215-K244 (single chain)    3336.587
Sum of the mass of two unmodified chains       6673.174
Observed mass of the crosslinked peptide       6687.149
Mass difference                                +13.975

Proposed formula   Calculated mass (Da)   Mass error (Da)   Comments
+O-2H              13.979                 0.004             Most likely
+N                 14.003                 0.028             Unlikely
+CH2               14.016                 0.041             Unlikely

Proposed structure: [structure drawing of the His-His crosslink, not recoverable from the extracted text]
4.4.4 Structural Confirmation by Mass Spectrometry
Full Scan and MS/MS Analysis. First, the calculated mass of the His-His crosslinked
peptide (6687.153 Da) is in good agreement with the observed mass (6687.149 Da, mass error
0.6 ppm, see Table 4-3). Second, the series of b- and y-ions is highly consistent with the
proposed structure (Figure 4-3). The observed y-ions from y5 to y24 and the b5 ion correspond
to fragment ions with no crosslinking site, while the y-ions from y27 onward and the b-ions from b13 to b24 are from fragments that contain the crosslinked histidine residues. These data support crosslinking at His220. Moreover, the y27* ion (in blue) and the b8* ion (in blue) are peptide fragments resulting from cleaving the bond connecting the two crosslinked histidine residues
(see Figure 4-3). The missed cleavage by trypsin at Lys218 is likely due to its close proximity to the crosslinking site at His220, reminiscent of similarly missed cleavages in the case of thioether crosslinking[9]. The second missed cleavage at Lys242 is likely due to the presence of adjacent proline residues. The two missed tryptic cleavages in the crosslinked peptide would have been especially challenging to handle by traditional database-dependent algorithms, again highlighting the utility of isotope labeling and our XChem-Finder workflow[8, 41, 42].
Table 4-3. Crosslinked peptides obtained from digestion of IgG1 by various proteases and the combination thereof. The crosslinking
sites are labeled in red and bold-face. All cysteines are alkylated with IAA. Peptides are shown with the amino acid residue position
in IgG1 in superscript and the adjoining amino acid residues before cleavage in parenthesis.
Name                  Proteases                  Crosslinked peptides                         RT (min)  m/z (charge)  Obs. mass (Da)  Theor. mass (Da)  Mass error (ppm)
S215-K244/S215-K244   Trypsin                    (K)215SCDKTHTCPPCPAPELLGGPSVFLFPPKPK244(D)   112.48    1673.54 (4+)  6687.149        6687.153          0.6
                                                 (K)215SCDKTHTCPPCPAPELLGGPSVFLFPPKPK244(D)
D217-K244/D217-K244   Asp-N                      (C)217DKTHTCPPCPAPELLGGPSVFLFPPKPK244(D)     113.07    1549.53 (4+)  6191.064        6191.060          0.6
                                                 (C)217DKTHTCPPCPAPELLGGPSVFLFPPKPK244(D)
S215-E229/S215-E229   Trypsin + GluC             (K)215SCDKTHTCPPCPAPE229(L)                  41.19     1178.77 (3+)  3531.289        3531.286          0.6
                                                 (K)215SCDKTHTCPPCPAPE229(L)
D217-E229/D217-E229   Asp-N + GluC               (C)217DKTHTCPPCPAPE229(L)                    40.53     1013.41 (3+)  3035.193        3035.193          0.0
                                                 (C)217DKTHTCPPCPAPE229(L)
D217-E229/S215-E229   Trypsin + GluC; then Asp-N (C)217DKTHTCPPCPAPE229(L)                    40.31     1096.09 (3+)  3283.234        3283.240          1.9
                                                 (K)215SCDKTHTCPPCPAPE229(L)
[Figure 4-3: CID MS/MS spectra (relative abundance vs. m/z, 600 to 2000) of the crosslinked tryptic peptide digested in 16O-water (top) and 18O-water (bottom), with b- and y-ion assignments; for example, b5 at 593.38 vs. 593.40 and y5 at 566.53 vs. 570.53.]
Figure 4-3. CID MS/MS spectra of the quadruply charged precursor ions m/z 1673.54 (16O- labeled C-termini) and 1675.54 (18O-labeled C-termini) of the crosslinked tryptic peptide S215-
K244/S215-K244. Characteristic mass shift imparted by the heavier isotope 18O was observed
(e.g., the mass shift of 4 Da for y5 ions in 16O- vs 18O-water, 566.53 vs 570.53). The y27* ion results from cleavage of the His-His bond while the y27 ion contains the crosslinking site. MS3 spectrum of the y27* ion (m/z 1488.35) is shown in Figure 4-4.
Additional confirmation by 18O/16O-isotope fragment ions pattern. Since the
fragment ions containing no (zero), one, or two C-termini of the crosslinked peptides displayed a mass shift of 0, 4, and 8 Da, respectively, in the corresponding MS/MS spectra obtained from
18O- and 16O-water, the examination of mass shift of fragment ions can lend further support for the assignment of fragment ions. For example, as shown in Figure 4-3, b-ions prior to the crosslinking site (e.g., b5) have no mass shift between the 18O-water and 16O-water digests. On the other hand, the y-ions without the crosslinking site (e.g., y5) gave mass shift of 4 Da. All assignments were verified by their distinct mass shift in 18O, depending on the number of C-
termini they contain.
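The rule above can be turned into a simple check on paired peaks. The sketch below assumes full 18O labeling (two 18O atoms per C-terminus) and uses m/z values taken from the caption and peak labels of Figure 4-3; the function name is illustrative.

```python
# Mass added per labeled C-terminus: two 18O atoms replacing two 16O atoms (Da)
SHIFT_PER_C_TERMINUS = 2 * (17.9991610 - 15.9949146)


def c_termini_from_shift(mz_16o: float, mz_18o: float, charge: int) -> int:
    """Number of labeled C-termini implied by a 16O/18O peak pair."""
    shift_da = (mz_18o - mz_16o) * charge
    return round(shift_da / SHIFT_PER_C_TERMINUS)


b5 = c_termini_from_shift(593.38, 593.40, charge=1)           # N-terminal fragment
y5 = c_termini_from_shift(566.53, 570.53, charge=1)           # one C-terminus
precursor = c_termini_from_shift(1673.54, 1675.54, charge=4)  # both C-termini
```

Here b5, y5 and the precursor map to zero, one and two C-termini, respectively. The rounding step tolerates the limited mass accuracy of ion-trap fragment peaks, but it would need revisiting if labeling were incomplete.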
Additional confirmation by MS3 analysis. Several abundant fragment ions shown in
Figure 4-3 were selected for MS3 analysis which simplified and further confirmed data
interpretation. For example, the fragment ion m/z 1488.35 shown in Figure 4-3 could not be
assigned initially, so it was selected for MS3 analysis (Figure 4-4). The analysis established that it was the y27* ion (in blue) generated from cleaving the bond connecting the two crosslinked histidine residues.
[Figure 4-4: MS3 spectrum (relative abundance vs. m/z, 600 to 2000) with b- and y-ion assignments.]
Figure 4-4. MS3 spectrum of the doubly charged fragment ion m/z 1488.35 obtained from
MS/MS of the precursor ion m/z 1673.54 in Figure 4-3.
Peptide mapping with multiple proteases. Since this is the first report of His-His crosslinking in a protein, peptide mapping with additional proteases was carried out to glean complementary data[43, 44]. In addition to trypsin, proteases with different sequence specificity (e.g., Asp-N[45-47] or GluC[48]) and combined proteases (e.g., trypsin with GluC, and Asp-N with GluC) were employed. Additional crosslinked peptides containing His220 were detected and analyzed: D217-K244/D217-K244 from Asp-N, S215-E229/S215-E229 from trypsin with GluC, and D217-E229/D217-E229 from Asp-N with GluC.
In each case, the observed mass was in good agreement with its theoretical mass with mass errors ranging from 0.0-0.6 ppm (see Table 4-3). The y- and b-ions were also consistent with the corresponding structure (Figures 4-5, 4-6, & 4-7). Similar to the tryptic peptide, the
crosslinking site and chemistry were further supported by the presence of several ions generated from cleaving the bond connecting the two crosslinked histidine residues, such as the doubly charged ion at m/z 880.67, the singly charged ion at m/z 1773.77, and the singly charged ion at m/z 992.52 (b8*) shown in Figure 4-5 (all highlighted in blue).
[Figure 4-5: CID MS/MS spectrum (relative abundance vs. m/z, 400 to 2000) with b- and y-ion assignments and the His-His bond cleavage ions labeled Ion 1759(2+) at 880.67, Ion 1773 at 1773.77 and b8* at 992.52.]
Figure 4-5. CID MS/MS spectrum of the triply charged precursor ion m/z 1178.77 of the crosslinked S215-E229/S215-E229 peptide generated from combined trypsin and GluC digestion. The b8* ion results from cleavage of the His-His bond while the b8 ion contains the crosslinking site.
[Figure 4-6: CID MS/MS spectrum (relative abundance vs. m/z, 600 to 2000) with b- and y-ion assignments.]
Figure 4-6. CID MS/MS spectrum of the quadruply charged precursor ions m/z 1549.53 of the crosslinked peptide D217-K244/D217-K244 from Asp-N digestion.
[Figure 4-7: fragmentation map of the two crosslinked DKTHTCPPCPAPE chains (b6, b9, b11, y4, y7, y12; His-His bond cleavage ions 1511 and 1525) and the CID MS/MS spectrum (relative abundance vs. m/z).]
Figure 4-7. CID MS/MS spectrum of the triply charged precursor ions at m/z 1013.41 of the
crosslinked peptide D217-E229/D217-E229 from digestion with Asp-N and GluC.
ETD MS/MS analysis. As an alternative fragmentation technique, ETD provides sequence information complementary to that obtained from CID by cleaving the peptide backbone in a less selective manner than CID[34, 49, 50]. Higher charge state ions usually generate more effective ETD fragmentation[50]; therefore, formic acid instead of TFA was used in the mobile phase to increase the charge state and to minimize ion
suppression. All ETD MS/MS spectra were collected with supplemental activation and
were dominated by charge-reduced species. The charge states of 5, 6, 4, and 4 for the peptides S215-K244/S215-K244, D217-K244/D217-K244, S215-E229/S215-E229, and D217-E229/D217-E229, respectively, offered optimal ETD fragmentation for each crosslinked peptide (Figures 4-8,
4-9, 4-10 & 4-11). While different from those from CID, the fragmentation patterns from ETD also support our proposed crosslinking site and chemistry. For instance, the c5 and c8* ions in
Figure 4-8 narrow the site to within the HTC motif; the c4 ion in Figure 4-9, the c5 and c6 ions in
Figure 4-10, and the c3 and z8 ions in Figure 4-11 pinpoint the crosslink at His220.
[Figure 4-8: ETD MS/MS spectrum (relative abundance vs. m/z, 400 to 1600) with c-, y- and z-ion assignments.]
Figure 4-8. ETD MS/MS spectrum of the precursor ion m/z 1339.70 (z=5) of the crosslinked tryptic peptide S215-K244/S215-K244.
[Figure 4-9: ETD MS/MS spectrum (relative abundance vs. m/z, 400 to 1800) with b-, c-, y- and z-ion assignments and charge-reduced precursor species.]
Figure 4-9. ETD MS/MS spectrum of the precursor ion m/z 1033.35 (z=6) of the crosslinked peptide D217-K244/D217-K244 from Asp-N digestion.
[Figure 4-10: fragmentation map of the two crosslinked SCDKTHTCPPCPAPE chains (c4, c5, c6, c9 to c12, c14; z10, z11, z13) and ETD MS/MS spectrum (relative abundance vs. m/z): full range (A) and expanded views of m/z 400 to 1100 (B) and m/z 1450 to 1650 (C).]
Figure 4-10. ETD MS/MS spectrum of the precursor ion m/z 884.33 (z=4) of the crosslinked peptide S215-E229/S215-E229 from digestion with trypsin and GluC.
[Figure 4-11: ETD MS/MS spectrum (relative abundance vs. m/z): full range (A) and expanded views of m/z 300 to 900 (B) and m/z 1350 to 1550 (C), with c-, y- and z-ion assignments.]
Figure 4-11. ETD MS/MS spectrum of the precursor ion m/z 760.31 (z=4) of the crosslinked
peptide D217-E229/D217-E229 from digestion with Asp-N and GluC.
4.4.5 Mechanism of formation for His-His crosslink.
Photo-oxidation and crosslinking between histidine residues have been studied using both free histidine and model peptides. The commonly accepted mechanism is depicted in Scheme 4-1. Singlet oxygen (e.g., generated from the photoactivated dye rose bengal[51]) reacts with histidine
to form a highly reactive and labile endoperoxide intermediate, which converts into a hydroperoxide intermediate and then 2-oxo-histidine (2-oxo-His) and His+32 intermediates.
Subsequently, the His+32 intermediate can be attacked by the nucleophilic imidazole of another histidine residue, followed by the elimination of a water molecule to give the final crosslinking product (Scheme 4-1)[16, 27-29, 52, 53]. As discussed below, our results are consistent with this mechanism.
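[Scheme 4-1: His reacts with singlet oxygen (1O2, generated from O2 and hν) to give an endoperoxide, then a hydroperoxide, then the 2-oxo-His and His+32 intermediates; nucleophilic addition of a second His to His+32 followed by loss of H2O gives the His-His crosslink. Structure drawings not reproduced.]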
Scheme 4-1. Proposed mechanism for the formation of His-His crosslink via photo-oxidation intermediates.
First, oxygen was present in all buffers and water in which IgG1 was exposed to light
irradiation. Second, several photo-oxidation intermediates were observed. The endoperoxide
intermediate is unstable and has only been observed by low-temperature NMR study[16, 27], so
we are not surprised that it was not detected by our LC-MS analysis. However, the subsequent
oxidation intermediates, 2-oxo-His (+14 Da) and His+32 species (+32 Da), were detected. The peptides with masses 14 and 32 Da greater than the unmodified peptide, S215-K244
(SCDKTHTCPPCPAPELLGGPSVFLFPPKPK), were observed in the light-stressed samples but not in the control sample (Table 4-4). Tandem mass spectra confirmed their structures to be the peptides modified at His220 (Figures 4-12 and 4-13). Third, the reported model studies showed the crosslinking was favored at higher pH, as the neutral (deprotonated) imidazole in histidine
(pKa ~6) is more reactive for nucleophilic attack and thus results in a higher yield of crosslinking[54]. Similar pH dependence was observed in our case as discussed above (Figure
4-1). Lastly, the two His220 residues are juxtaposed in the hinge region, which is highly exposed to solvent and flexible, as illustrated in Figure 4-14. In fact, in most crystal structures, the side chains of residues in the hinge region could not be located, indicating a high degree of flexibility. In this illustrative structure (PDB 1HZH), side chain of only one histidine residue was observed.
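The pH argument can be made semi-quantitative with the Henderson-Hasselbalch relation. The sketch below uses the approximate imidazole pKa of 6 quoted above and the three buffer pH values from Figure 4-1D; it is a simplified illustration that treats His220 as an isolated titratable group and ignores local environment effects.

```python
def fraction_deprotonated(pH: float, pKa: float = 6.0) -> float:
    """Fraction of a monoprotic acid in its deprotonated form at a given pH."""
    return 1.0 / (1.0 + 10 ** (pKa - pH))


# Buffers used in the light-stress study (pH values from Figure 4-1D)
neutral_imidazole = {pH: fraction_deprotonated(pH) for pH in (4.8, 7.4, 9.0)}
```

Under these assumptions only about 6% of the imidazole is neutral at pH 4.8, compared with about 96% at pH 7.4 and more than 99% at pH 9.0. This accounts for the low yield in acetate buffer, whereas the further increase between pH 7.4 and 9.0 is not captured by this simple model.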
Table 4-4. Peptides containing the 2-oxo-His (+14 Da) and His+32 (+32 Da) intermediates observed in the stressed IgG1. They were not detected
in the control. His220 residues are labeled in red. All cysteines are alkylated with IAA. Peptides are shown with the amino acid residue position
in IgG1 in superscript and the adjoining amino acid residues before cleavage in parentheses. The level of each peptide is determined by the peak area
of the modified peptide over that of the normal tryptic peptide (T219-K244), with the consideration of all charge states.
#  Peptide name          Peptide sequence                                    RT (min)  m/z (charge)  Obs. mass (Da)  Theor. mass (Da)  Mass error (ppm)  Level by peak area (%)
1  T219-K244             (K)219THTCPPCPAPELLGGPSVFLFPPKPK244(D)              93.98     949.82 (3+)   2845.422        2845.418          1.3               100
2  S215-K244             (K)215SCDKTHTCPPCPAPELLGGPSVFLFPPKPK244(D)          92.17     1113.87 (3+)  3336.591        3336.587          1.1               83.1
3  2-oxo-His             (K)215SCDKTH(+14)TCPPCPAPELLGGPSVFLFPPKPK244(D)     93.48     1118.20 (3+)  3350.570        3350.566          1.1               2.3
4  His+32                (K)215SCDKTH(+32)TCPPCPAPELLGGPSVFLFPPKPK244(D)     93.00     1124.20 (3+)  3368.578        3368.577          0.2               0.5
5  S215-K244/S215-K244   (K)215SCDKTHTCPPCPAPELLGGPSVFLFPPKPK244(D)          112.48    1673.54 (4+)  6687.149        6687.153          0.6               4.4
                         (K)215SCDKTHTCPPCPAPELLGGPSVFLFPPKPK244(D)
[Figure 4-12: CID MS/MS spectrum (relative abundance vs. m/z, 400 to 2000) with b- and y-ion assignments.]
Figure 4-12. CID MS/MS spectrum of the tryptic peptide containing the 2-oxo-His (+14 Da) intermediate (m/z=1118.20, z=3, mass=3350.570 Da).
[Figure 4-13: CID MS/MS spectrum (relative abundance vs. m/z, 400 to 2000) with b- and y-ion assignments.]
Figure 4-13. CID MS/MS spectrum of the tryptic peptide containing the His+32 intermediate
(m/z=1124.20, z=3, mass=3368.578 Da).
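[Figure 4-14: panel A, space-filling model of the hinge region with His237 labeled; panel B, three-dimensional structure of IgG1. Images not reproduced.]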
Figure 4-14. (A) Space filling illustration of the hinge region of IgG1 antibody (DKTHTCPPCP); the underlined and bold residues are shown in color; the atoms are shown in color: oxygen in red, nitrogen in dark blue, carbon in light blue and sulfur in yellow. The image is rendered based on PDB 1HZH using the VMD software (Visual Molecular Dynamics). In this structure, His237 is equivalent to the His220 described in the paper. (B) Three-dimensional (3D) structure of an IgG1 (PDB entry 1HZH[55]). The characteristic hinge region sequences (SCDKTHTCPPC) of two heavy chains of IgG1 are circled. The first cysteine is disulfide bonded with the C-terminal cysteine in light chain. The other two cysteines form inter-heavy chain disulfide bridges. Two heavy chain histidines (His237) are located in the hinge region which is very flexible and highly solvent accessible. The model of 1HZH is being used for illustrative purposes only; in this structure, His237 is equivalent to the His220 discussed in our paper.
Based on the reaction pathway and protein structure, crosslinking of lysine with the
oxidized histidine via nucleophilic addition is also plausible[16, 27], and Lys218 is in the
vicinity of His220. Therefore, great effort was made to determine whether the crosslinking is
His220-His220 or Lys218-His220. This is particularly challenging due to the pseudo-symmetry
in the crosslinked peptide; in other words, when the two chains share identical sequence (e.g., in
Figure 4-3, 4-5, and 4-8), any fragment ion could come from either one chain or both. For
example, c4 and c5 ions in Figure 4-8 indicated the existence of unmodified Lys218 and Thr219,
but could not unambiguously establish whether they were from one chain or both. To address
this issue, an asymmetric crosslinked peptide (i.e., two chains of different length) was generated
via limited digestion. IgG1 fully digested by trypsin and GluC was treated with Asp-N for a
limited time to obtain a crosslinked peptide with two different chains D217-E229/S215-E229
(Figure 4-15 and 4-16). Its precursor ion m/z 1096.09 (z=3) has an observed mass of 3283.234
Da, which is in agreement with the theoretical mass of 3283.240 Da (Table 4-3). As shown in
Figures 4-15 and 4-16, cleavage of the His-His bond resulted in ions m/z 1511 and m/z 1773,
indicating that the oxidized His residue is on the long chain highlighted in red. Moreover, the c2
and c3 ions from the short chain (highlighted in blue) together with the c2 to c5 ions from the
long chain (highlighted in red) indicate the absence of modification for all residues N-terminal to
His220 on both chains, thus ruling out crosslinking between Lys218 and His220. This is not
unexpected, as at the pH for our studies, the amine on the lysine side chain is mostly protonated
and thus renders it unreactive[56, 57]. And, of course, other factors such as local environment
and solvation are known to modulate reactivities in enzymes and antibodies[58-60]. Taken
together, our data have firmly established that the crosslinking is between the two heavy chain
His220 residues.
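The logic of the asymmetric-peptide experiment can be illustrated by computing the singly protonated ions expected when the His-His bond is cleaved. The sketch below assumes that cleavage leaves one chain unmodified and the other carrying the O-2H modification, ignores any hydrogen rearrangement, and treats all cysteines as S-carboxymethylated; the function and variable names are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the residues in both chains
RESIDUE = {
    "A": 71.03711, "S": 87.03203, "P": 97.05276, "T": 101.04768,
    "C": 103.00919, "D": 115.02694, "K": 128.09496, "E": 129.04259,
    "H": 137.05891,
}
WATER = 18.010565
PROTON = 1.007276
CARBOXYMETHYL = 58.005479                 # per alkylated Cys (assumption)
O_MINUS_2H = 15.9949146 - 2 * 1.00782503  # net change from the crosslink


def peptide_mass(seq: str) -> float:
    """Monoisotopic mass of a linear peptide with carboxymethylated Cys."""
    return WATER + sum(
        RESIDUE[aa] + (CARBOXYMETHYL if aa == "C" else 0.0) for aa in seq
    )


def cleavage_ions(seq: str) -> tuple:
    """Singly protonated m/z of a chain without and with the O-2H modification."""
    unmodified = peptide_mass(seq) + PROTON
    return unmodified, unmodified + O_MINUS_2H


short_unmod, short_ox = cleavage_ions("DKTHTCPPCPAPE")    # D217-E229
long_unmod, long_ox = cleavage_ions("SCDKTHTCPPCPAPE")    # S215-E229

# Nominal ion pairs expected for each possible location of the oxidized His
scenarios = {
    "long": {int(short_unmod), int(long_ox)},
    "short": {int(short_ox), int(long_unmod)},
}
observed = {1511, 1773}
oxidized_chain = [name for name, ions in scenarios.items() if ions == observed]

# Intact asymmetric crosslinked peptide: both chains plus O-2H
dimer_mass = (
    peptide_mass("DKTHTCPPCPAPE") + peptide_mass("SCDKTHTCPPCPAPE") + O_MINUS_2H
)
```

With these assumptions the observed pair (m/z 1511 and 1773) matches only the scenario in which the oxidized His sits on the long chain, and the intact mass comes out at about 3283.240 Da, in agreement with Table 4-3.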
[Figure 4-15: ETD MS/MS spectrum (relative abundance vs. m/z): full range, 300 to 2000 (A), and expanded views of m/z 300 to 800 (B) and m/z 1350 to 1650 (C), with c-, y- and z-ion assignments and the His-His bond cleavage ions labeled Ion 1511 and Ion 1773.]
Figure 4-15. ETD MS/MS spectrum of the quadruply charged precursor ion m/z 821.09 of the
crosslinked peptide D217-E229/S215-E229 generated by limited Asp-N digestion of fully digested IgG1 by trypsin and GluC.
[Figure 4-16 graphic: CID MS/MS spectrum, relative abundance versus m/z (400-2000), with b and y fragment ions and the ions 1511 and 1773 annotated.]
Figure 4-16. CID MS/MS spectrum of the triply charged precursor ions at m/z 1096.09 of the
crosslinked peptide D217-E229/S215-E229 from limited digestion by Asp-N of the fully
digested IgG1 by GluC and trypsin.
4.4.6 Other Crosslinks.
As reported in the literature[9-11], crosslinking via thioether between the heavy chain hinge region and the light chain C-termini (HC:S215-K218/LC:T211-S218, SCDK/TVAPTECS) was
also observed in the photo-irradiated IgG1 (Table 4-5). The cleavage and formation of carbon-sulfur (C-S) bonds may occur via either homolytic (e.g., radical or photo-induced) or heterolytic
(e.g., elimination and addition) mechanisms[10, 11, 61, 62]. These additional crosslinks may also
account for the multiple non-reducible higher-molecular-weight species detected by SDS-PAGE and
SEC described above (Figure 4-1). In the SDS-PAGE gel (Figure 4-1), the first band (with an
apparent molecular weight of about 92 kDa) for the sample pH9-3xLight (lane 10) was not observed
in the samples pH4.8-3xLight (lane 8) and pH7-3xLight (lane 9). It is likely that this band
corresponds to the LC-HC thioether crosslink, whose formation is favored under basic conditions[9, 10].
Although a thioether crosslink between two heavy chains (e.g., Cys222-Cys222) has been
reported after a higher dose of photo-irradiation[11], it was not detected in our sample by MS.
The His-His crosslink of two heavy chains may contribute to the band with an apparent
molecular weight of about 100 kDa. The bands with apparent molecular weights of about 150 and 200
kDa are probably due to the crosslinking of more than two chains.
Table 4-5. Thioether crosslinks detected in IgG1. The second thioether crosslink has two
missed trypsin cleavages at Lys218 and Lys242 probably due to nearby thioether crosslink site
and proline amino acid residues, respectively.
Crosslink                   Sequence                                              m/z (charge)   RT (min)   Obs. Mass (Da)   Theo. Mass (Da)   Mass Error (ppm)   Level (%)
LC:T211-S218/HC:S215-K218   211TVAPTECS218/215SCDK218                             612.77 (2+)    22.11      1223.535         1223.534          0.7                0.1
LC:T211-S218/HC:S215-K244   211TVAPTECS218/215SCDKTHTCPPCPAPELLGGPSVFLFPPKPK244   1351.99 (3+)   90.97      4050.947         4050.942          1.2                10.1
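As a plausibility check on the first entry of Table 4-5, the theoretical mass of the SCDK/TVAPTECS thioether can be recomputed from standard monoisotopic residue masses. The sketch below is not the software used in this work. It assumes that the thioether corresponds to the two free-thiol peptides minus H2S (two Cys residues sharing a single sulfur atom) and that neither peptide carries another modification. The helper names are hypothetical. The second entry is not recomputed, because that would require the modification state of the other hinge cysteines, which is not given in this section.

```python
# Monoisotopic residue masses (Da) for the residues present in the two peptides.
RESIDUE_MASS = {
    "A": 71.03711, "C": 103.00919, "D": 115.02694, "E": 129.04259,
    "K": 128.09496, "P": 97.05276, "S": 87.03203, "T": 101.04768,
    "V": 99.06841,
}
WATER = 18.010565
H2S = 33.987721   # 2 x H + S; assumed net loss when two Cys form one thioether
PROTON = 1.007276


def peptide_mass(sequence: str) -> float:
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def thioether_crosslink_mass(seq_a: str, seq_b: str) -> float:
    """Mass of two peptides joined Cys-to-Cys through a single sulfur atom."""
    return peptide_mass(seq_a) + peptide_mass(seq_b) - H2S


def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def mz(mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion."""
    return (mass + charge * PROTON) / charge


theo = thioether_crosslink_mass("TVAPTECS", "SCDK")
print(f"theoretical mass : {theo:.3f} Da")
print(f"[M+2H]2+ m/z     : {mz(theo, 2):.2f}")
print(f"error vs observed: {ppm_error(1223.535, theo):.1f} ppm")
```

The ppm value printed here is computed from the rounded observed mass in the table, so it can differ slightly from the tabulated error, which was presumably derived from unrounded data.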
4.5 Conclusions
Our XChem-Finder workflow again led to the discovery of an undefined and novel
protein His-His crosslink, demonstrating its broad applicability and utility. Since the His-His
crosslink is found in the highly conserved hinge region of IgG1, this modification most likely exists in other IgG1 molecules. As discussed above, a low level of crosslinking was present even without light stress, suggesting that protein crosslinking in therapeutic proteins is perhaps more common than has been appreciated. Such a drastic modification of proteins is likely to affect product quality, clinical efficacy, and, even at low abundance, immunogenicity. To the best of our knowledge, there is no alternative systematic approach that can be generally used to fully characterize protein crosslinking without a priori knowledge of the chemistry and site. With the rapid advancement of mass spectrometric techniques (e.g., high resolution and complementary fragmentation mechanisms), we expect that the discovery and elucidation of other new protein crosslinks by our XChem-Finder approach will be equally successful.
4.6 References
[1] Wilhelmus MM, Grunberg SC, Bol JG, van Dam AM, Hoozemans JJ, Rozemuller AJ, et al. Transglutaminases and transglutaminase-catalyzed cross-links colocalize with the pathological lesions in Alzheimer's disease brain. Brain Pathol 2009;19:612-22.
[2] Wang SS, Wu JW, Yamamoto S, Liu HS. Diseases of protein aggregation and the hunt for potential pharmacological agents. Biotechnol J 2008;3:165-92.
[3] Nemes Z, Devreese B, Steinert PM, Van Beeumen J, Fesus L. Cross-linking of ubiquitin, HSP27, parkin, and alpha-synuclein by gamma-glutamyl-epsilon-lysine bonds in Alzheimer's neurofibrillary tangles. FASEB J 2004;18:1135-7.
[4] Balasubramanian D, Du X, Zigler JS, Jr. The reaction of singlet oxygen with proteins, with special reference to crystallins. Photochem Photobiol 1990;52:761-8.
[5] Liu H, Gaza-Bulseco G, Faldu D, Chumsae C, Sun J. Heterogeneity of monoclonal antibodies. J Pharm Sci 2008;97:2426-47.
[6] Beck A, Wagner-Rousset E, Ayoub D, Van Dorsselaer A, Sanglier-Cianferani S. Characterization of therapeutic antibodies and related products. Anal Chem 2013;85:715-36.
[7] Lispi M, Datola A, Bierau H, Ceccarelli D, Crisci C, Minari K, et al. Heterogeneity of commercial recombinant human growth hormone (r-hGH) preparations containing a thioether variant. J Pharm Sci 2009;98:4511-24.
[8] Liu M, Zhang Z, Zang T, Spahr C, Cheetham J, Ren D, et al. Discovery of undefined protein cross-linking chemistry: a comprehensive methodology utilizing 18O-labeling and mass spectrometry. Anal Chem 2013;85:5900-8.
[9] Tous GI, Wei Z, Feng J, Bilbulian S, Bowen S, Smith J, et al. Characterization of a novel modification to monoclonal antibodies: thioether cross-link of heavy and light chains. Anal Chem 2005;77:2675-82.
[10] Cohen SL, Price C, Vlasak J. Beta-elimination and peptide bond hydrolysis: two distinct mechanisms of human IgG1 hinge fragmentation upon storage. J Am Chem Soc 2007;129:6976-7.
[11] Mozziconacci O, Kerwin BA, Schoneich C. Exposure of a monoclonal antibody, IgG1, to UV-light leads to protein dithiohemiacetal and thioether cross-links: a role for thiyl radicals? Chem Res Toxicol 2010;23:1310-2.
[12] Wang Z, Rejtar T, Zhou ZS, Karger BL. Desulfurization of cysteine-containing peptides resulting from sample preparation for protein characterization by mass spectrometry. Rapid Commun Mass Spectrom 2010;24:267-75.
[13] Fujimori E. Crosslinking and photoreaction of ozone-oxidized calf-lens alpha-crystallin. Invest Ophthalmol Vis Sci 1982;22:402-5.
[14] Srivastava OP, Kirk MC, Srivastava K. Characterization of covalent multimers of crystallins in aging human lenses. J Biol Chem 2004;279:10901-9.
[15] Lopez B, Gonzalez A, Hermida N, Valencia F, de Teresa E, Diez J. Role of lysyl oxidase in myocardial fibrosis: from basic science to clinical aspects. Am J Physiol Heart Circ Physiol 2010;299:H1-9.
[16] Shen HR, Spikes JD, Kopeckova P, Kopecek J. Photodynamic crosslinking of proteins. II. Photocrosslinking of a model protein-ribonuclease A. J Photochem Photobiol B 1996;35:213-9.
[17] Kroon DJ, Baldwin-Ferro A, Lalan P. Identification of sites of degradation in a therapeutic monoclonal antibody by peptide mapping. Pharm Res 1992;9:1386-93.
[18] Van Buren N, Rehder D, Gadgil H, Matsumura M, Jacob J. Elucidation of two major aggregation pathways in an IgG2 antibody. J Pharm Sci 2009;98:3013-30.
[19] Sinz A. Investigation of protein-protein interactions in living cells by chemical crosslinking and mass spectrometry. Anal Bioanal Chem 2010;397:3433-40.
[20] Petrotchenko EV, Borchers CH. Crosslinking combined with mass spectrometry for structural proteomics. Mass Spectrom Rev 2010;29:862-76.
[21] Walzthoeni T, Leitner A, Stengel F, Aebersold R. Mass spectrometry supported determination of protein complex structure. Curr Opin Struct Biol 2013;23:252-60.
[22] Singh P, Panchaud A, Goodlett DR. Chemical cross-linking and mass spectrometry as a low-resolution protein structure determination technique. Anal Chem 2010;82:2636-42.
[23] Tang X, Bruce JE. Chemical cross-linking for protein-protein interaction studies. Methods Mol Biol 2009;492:283-93.
[24] Bruce JE. In vivo protein complex topologies: sights through a cross-linking lens. Proteomics 2012;12:1565-75.
[25] Singh P, Shaffer SA, Scherl A, Holman C, Pfuetzner RA, Larson Freeman TJ, et al. Characterization of protein cross-links via mass spectrometry and an open-modification search strategy. Anal Chem 2008;80:8799-806.
[26] Wang X, Das TK, Singh SK, Kumar S. Potential aggregation prone regions in biotherapeutics: A survey of commercial monoclonal antibodies. MAbs 2009;1:254-67.
[27] Shen HR, Spikes JD, Kopecekova P, Kopecek J. Photodynamic crosslinking of proteins. I. Model studies using histidine- and lysine-containing N-(2-hydroxypropyl)methacrylamide copolymers. J Photochem Photobiol B 1996;34:203-10.
[28] Agon VV, Bubb WA, Wright A, Hawkins CL, Davies MJ. Sensitizer-mediated photooxidation of histidine residues: evidence for the formation of reactive side-chain peroxides. Free Radic Biol Med 2006;40:698-710.
[29] Pattison DI, Rahmanto AS, Davies MJ. Photo-oxidation of proteins. Photochem Photobiol Sci 2012;11:38-53.
[30] Liu H, Gaza-Bulseco G, Chumsae C. Analysis of reduced monoclonal antibodies using size exclusion chromatography coupled with mass spectrometry. J Am Soc Mass Spectrom 2009;20:2258-64.
[31] Ren D, Pipes GD, Liu D, Shih LY, Nichols AC, Treuheit MJ, et al. An improved trypsin digestion method minimizes digestion-induced modifications on proteins. Anal Biochem 2009;392:12-21.
[32] Zhang Z. Prediction of collision-induced-dissociation spectra of peptides with post-translational or process-induced modifications. Anal Chem 2011;83:8642-51.
[33] Zhang Z. Prediction of low-energy collision-induced dissociation spectra of peptides. Anal Chem 2004;76:3908-22.
[34] Zhang Z. Prediction of electron-transfer/capture dissociation spectra of peptides. Anal Chem 2010;82:1990-2005.
[35] Zhang Z. Prediction of low-energy collision-induced dissociation spectra of peptides with three or more charges. Anal Chem 2005;77:6364-73.
[36] Yao X, Freas A, Ramirez J, Demirev PA, Fenselau C. Proteolytic 18O labeling for comparative proteomics: model studies with two serotypes of adenovirus. Anal Chem 2001;73:2836-42.
[37] Yao X, Afonso C, Fenselau C. Dissection of proteolytic 18O labeling: endoprotease-catalyzed 16O-to-18O exchange of truncated peptide substrates. J Proteome Res 2003;2:147-52.
[38] Gao Q, Xue S, Shaffer SA, Doneanu CE, Goodlett DR, Nelson SD. Minimize the detection of false positives by the software program DetectShift for 18O-labeled cross-linked peptide analysis. Eur J Mass Spectrom (Chichester, Eng) 2008;14:275-80.
[39] Gao Q, Xue S, Doneanu CE, Shaffer SA, Goodlett DR, Nelson SD. Pro-CrossLink. Software tool for protein cross-linking and mass spectrometry. Anal Chem 2006;78:2145-9.
[40] Back JW, Notenboom V, de Koning LJ, Muijsers AO, Sixma TK, de Koster CG, et al. Identification of cross-linked peptides for protein interaction studies using mass spectrometry and 18O labeling. Anal Chem 2002;74:4417-22.
[41] Liu M, Cheetham J, Cauchon N, Ostovic J, Ni W, Ren D, et al. Protein isoaspartate methyltransferase-mediated 18O-labeling of isoaspartic acid for mass spectrometry analysis. Anal Chem 2012;84:1056-62.
[42] Wan W, Zhao G, Al-Saad K, Siems WF, Zhou ZS. Rapid screening for S-adenosylmethionine-dependent methylation products by enzyme-transferred isotope patterns analysis. Rapid Commun Mass Spectrom 2004;18:319-24.
[43] Swaney DL, Wenger CD, Coon JJ. Value of using multiple proteases for large-scale mass spectrometry-based proteomics. J Proteome Res 2010;9:1323-9.
[44] Ni W, Lin M, Salinas P, Savickas P, Wu SL, Karger BL. Complete mapping of a cystine knot and nested disulfides of recombinant human arylsulfatase A by multi-enzyme digestion and LC-MS analysis using CID and ETD. J Am Soc Mass Spectrom 2013;24:125-33.
[45] Ingrosso D, Fowler AV, Bleibaum J, Clarke S. Specificity of endoproteinase Asp-N (Pseudomonas fragi): cleavage at glutamyl residues in two proteins. Biochem Biophys Res Commun 1989;162:1528-34.
[46] Tetaz T, Morrison JR, Andreou J, Fidge NH. Relaxed specificity of endoproteinase Asp-N: this enzyme cleaves at peptide bonds N-terminal to glutamate as well as aspartate and cysteic acid residues. Biochem Int 1990;22:561-6.
[47] Ni W, Dai S, Karger BL, Zhou ZS. Analysis of isoaspartic Acid by selective proteolysis with Asp-N and electron transfer dissociation mass spectrometry. Anal Chem 2010;82:7485-91.
[48] Sorensen SB, Sorensen TL, Breddam K. Fragmentation of proteins by S. aureus strain V8 protease. Ammonium bicarbonate strongly inhibits the enzyme but does not improve the selectivity for glutamic acid. FEBS Lett 1991;294:195-7.
[49] Kim MS, Pandey A. Electron transfer dissociation mass spectrometry in proteomics. Proteomics 2012;12:530-42.
[50] Syka JE, Coon JJ, Schroeder MJ, Shabanowitz J, Hunt DF. Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry. Proc Natl Acad Sci U S A 2004;101:9528-33.
[51] Tomita M, Irie M, Ukita T. Sensitized photooxidation of histidine and its derivatives. Products and mechanism of the reaction. Biochemistry 1969;8:5149-60.
[52] Kang P, Foote CS. Photosensitized oxidation of 13C,15N-labeled imidazole derivatives. J Am Chem Soc 2002;124:9629-38.
[53] Nilsson R, Merkel PB, Kearns DR. Unambiguous evidence for the participation of singlet oxygen (1Δ) in photodynamic oxidation of amino acids. Photochem Photobiol 1972;16:117-24.
[54] Verweij H, Dubbelman TM, Van Steveninck J. Photodynamic protein cross-linking. Biochim Biophys Acta 1981;647:87-94.
[55] Saphire EO, Parren PW, Pantophlet R, Zwick MB, Morris GM, Rudd PM, et al. Crystal structure of a neutralizing human IGG against HIV-1: a template for vaccine design. Science 2001;293:1155-9.
[56] Alfaro JF, Gillies LA, Sun HG, Dai S, Zang T, Klaene JJ, et al. Chemo-enzymatic detection of protein isoaspartate using protein isoaspartate methyltransferase and hydrazine trapping. Anal Chem 2008;80:3882-9.
[57] Zang T, Dai S, Chen D, Lee BW, Liu S, Karger BL, et al. Chemical methods for the detection of protein N-homocysteinylation via selective reactions with aldehydes. Anal Chem 2009;81:9065-71.
[58] Zhou ZS, Flohr A, Hilvert D. An Antibody-Catalyzed Allylic Sulfoxide-Sulfenate Rearrangement. J Org Chem 1999;64:8334-41.
[59] Zhao G, Zhou ZS. Vinyl sulfonium as novel proteolytic enzyme inhibitor. Bioorg Med Chem Lett 2001;11:2331-5.
[60] Zhou ZS, Jiang N, Hilvert D. An Antibody-Catalyzed Selenoxide Elimination. J Am Chem Soc 1997;119:3623-4.
[61] Zhou ZS, Smith AE, Matthews RG. L-Selenohomocysteine: one-step synthesis from L-selenomethionine and kinetic analysis as substrate for methionine synthases. Bioorg Med Chem Lett 2000;10:2471-5.
[62] Matthews RG, Smith AE, Zhou ZS, Taurog RE, Bandarian V, Evans JC, et al. Cobalamin-Dependent and Cobalamin-Independent Methionine Synthases: Are There Two Solutions to the Same Chemical Problem? Helvetica Chimica Acta 2003;86:3939-54.
Chapter 5: Conclusion and Future Directions
In this thesis, LC/MS-based methods have been developed for the detection and
characterization of Asp isomerization and protein crosslinks in monoclonal antibodies. As
described in Chapter Two, protein isoaspartate methyltransferase-mediated 18O-labeling
followed by LC/MS analysis was used to detect one of the most challenging post-translational modifications, the isomerization of Asp, which is hard to detect because isoaspartic acid and aspartic acid differ only subtly. Several isoAsp peptides in IgG1 were characterized and the isomerization sites were unambiguously identified.
In Chapter Three, a comprehensive methodology for the identification of protein crosslinks without a priori knowledge of the chemistry, via 18O-labeling and LC/MS analysis, is
presented. Owing to the intrinsic structural complexity of crosslinks, it is very challenging to detect
and characterize them, especially when the crosslink chemistry is unknown. The utility of our
XChem-Finder workflow has been demonstrated by the detection of thioether
crosslinks in IgG2 and the discovery of a novel histidine-histidine crosslink in IgG1 in Chapters
Three and Four, respectively. Both thioether crosslinks and histidine-histidine crosslinks are
found as degradation products in the constant region of IgG; therefore, these modifications are
most likely common to IgG.
Despite considerable efforts to understand the relevance of post-translational
modifications such as Asp isomerization and protein crosslinking in the cellular context, we are
still in the process of unraveling the complexity of these modifications and their tremendous impact. Sophisticated technologies such as powerful separation techniques and high-resolution mass spectrometry are now increasingly available for the identification and characterization of these site-specific protein modifications. The remainder of this chapter lists future work to extend the utility of our methods for each project. For the crosslink project, improvements to our current XChem-Finder workflow are discussed in more detail.
5.1 isoAsp Project
Biological Samples. The method described in Chapter Two can be useful for monitoring isoAsp formation not only in IgG but also in other therapeutic proteins during production and storage to ensure their quality. Most importantly, the method can be applied to biological samples such as plasma, serum, urine and tissues to identify potential isoAsp proteins, to understand the pathways of isoAsp formation in vivo, and to identify disease-associated biomarkers.
As discussed in section 1.4.1, artifactual deamidation or isomerization can be significant during sample preparation. Exposure to extreme pH and high temperature should be avoided. Sample preparation for biological samples typically takes longer because interfering proteins must be removed; therefore, particular caution should be taken to minimize artifactual deamidation or isomerization. Inherent isoAsp and isoAsp introduced by sample preparation can be differentiated by preparing the sample in 18O-water and quantified by the b-ion intensity calculation procedure (see the detailed discussion in section
1.4.3.4)[1-3].
D-isoAsp Detection. It is difficult to identify peptides containing D-isoAsp, and so far no sensitive method for D-isoAsp has been reported. The lack of suitable methods has prevented
its biological study. Antibodies can be highly specific for their modified target proteins[4,
5], but there are very few reports on antibody-based methods so far. This area is worth exploring further in the future.
5.2 Crosslink Project
Other Crosslinks in Proteins. Our XChem-Finder workflow has been used successfully to detect thioether and histidine-histidine crosslinks in IgG. We expect that the discovery and elucidation of other new protein crosslinks in pharmaceutical products and biological samples by our XChem-Finder approach will be equally successful. Dityrosine crosslinks in proteins have been proposed and studied since the 1980s, but their analysis most often employs reversed-phase HPLC with fluorometric detection because of their complicated structure and the difficulty of interpreting their MS data[6-8]. Calmodulin contains two tyrosyl residues and no cysteine or tryptophan, so it is a good model protein for studying dityrosine crosslinks[8]. Our XChem-Finder workflow might be applied to directly detect and characterize the dityrosine crosslink in calmodulin.
This might shed light on the formation of dityrosine crosslinks in other proteins (for example, on the required proximity of the two tyrosine residues).
Sample Enrichment. Our current XChem-Finder process detects crosslinks by directly analyzing tryptic digests prepared in 16O- and 18O-water by LC/MS/MS. Chromatographic fractionation of the digests may sometimes be necessary to avoid suppression of the ionization of some peptides. Various methods to achieve such enrichment are now being explored. One approach that has been employed for the enrichment of chemically crosslinked peptides makes use of the generally higher charge state that distinguishes chemically crosslinked peptides from linear peptides[9]. Chemically crosslinked peptides elute in the late fractions in cation-exchange chromatography[9]. In addition, peptides with high charge states are selected for fragmentation in MS analysis[9]. Taouatas and colleagues combined Lys-N proteolytic digestion, strong cation exchange enrichment, and mass spectrometry (MALDI-MS/MS by CID and LC-MS/MS by CID or ETD) to achieve an optimal targeted strategy for proteome analysis[10-12]. The lack of an enrichment step in the current workflow means that only the most abundant crosslinks are likely to be found. In the case of photo-degraded IgG1, only the histidine-histidine crosslink was discovered in our study, although other crosslinks formed via radical mechanisms are very likely present in the sample. Development of specific and efficient enrichment strategies may help to discover new crosslinks. For example, Lys-N digestion of SEC (size exclusion chromatography) high-molecular-weight fractions followed by CEX (cation exchange chromatography) enrichment for LC/MS analysis will be explored in the future (Scheme 5-1)[10-12].
Scheme 5-1. Combining sample enrichment and Lys-N digestion for the detection of crosslinks[10-12]. Two enrichment steps, SEC at the protein level and CEX at the peptide level, are used to reduce sample complexity. Lys-N peptides that do or do not contain a basic amino acid
(e.g., His or Arg; His is used as the example here) are shown with their charges in the scheme. The
Lys-N peptides give dominant b-ions in MALDI-CID (matrix-assisted laser desorption/ionization-collision-induced dissociation) and c-ions in ETD, facilitating MS/MS data interpretation.
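The charge-based enrichment idea can be illustrated with an in silico Lys-N digest and a crude count of protonatable sites. The sketch below is a simplification under stated assumptions: Lys-N is taken to cleave N-terminal to every Lys with no missed cleavages, and the charge at acidic pH is approximated as the N-terminal amine plus the number of Lys, Arg and His residues. The function names are hypothetical, and the hinge sequence is the heavy-chain S215-K244 stretch listed in Table 4-5.

```python
def lys_n_digest(sequence: str) -> list[str]:
    """Cleave N-terminal to every Lys (idealized Lys-N, no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue == "K" and i > start:
            peptides.append(sequence[start:i])
            start = i
    peptides.append(sequence[start:])
    return peptides


def basic_site_count(peptide: str) -> int:
    """Crude count of protonatable sites: N-terminal amine plus Lys, Arg, His."""
    return 1 + sum(peptide.count(aa) for aa in "KRH")


def crosslinked_basic_site_count(peptide_a: str, peptide_b: str) -> int:
    """A crosslinked pair keeps the basic sites of both constituent peptides."""
    return basic_site_count(peptide_a) + basic_site_count(peptide_b)


hinge = "SCDKTHTCPPCPAPELLGGPSVFLFPPKPK"  # HC S215-K244, from Table 4-5
peptides = lys_n_digest(hinge)
for pep in peptides:
    print(f"{pep:<26} basic sites: {basic_site_count(pep)}")

his_peptide = peptides[1]  # the Lys-N peptide that contains His220
print("His-His crosslinked pair:",
      crosslinked_basic_site_count(his_peptide, his_peptide))
```

Because a crosslinked pair carries roughly twice as many basic sites as either linear peptide, it would be expected in the later cation-exchange fractions, which is the basis of the enrichment outlined in Scheme 5-1.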
N-Terminal Labeling. In this thesis, the incorporation of 18O at the C-terminus of each newly created peptide was used to distinguish crosslinked peptides from linear peptides.
However, this approach failed to detect crosslinked peptides that contain the C-terminus of the protein. Isotope
labeling at the N-termini with [2H3]-2,4-dinitrofluorobenzene ([2H3]DNFB), as described in section
1.5.3.2.2, might be useful. In the original protocol, methylation of the ε-amino group of lysine and
N-terminal tagging were conducted before protease digestion, which results in missed cleavages because
dimethylated lysines resist protease digestion[13]. This can be overcome by simply
switching the order of sample preparation: protease digestion first, followed by protection of the ε-amino
group of Lys and then specific derivatization of the N-terminal amino group with [2H3]DNFB
(Scheme 5-2). This will increase the consumption and cost of the [2H3]DNFB reagent, but it minimizes the missed cleavages that yield overly large crosslinked peptides.
[Scheme 5-2 graphic: protein, trypsin digestion, reductive methylation with NaCNBH3, then labeling with DNFB:[2H3]DNFB (1:1); crosslinked peptides give a 1:2:1 triplet and linear peptides give a 1:1 doublet.]
Scheme 5-2. Isotopic labeling at N-termini via 1) trypsin digestion; 2) protection of the ε-amino group of lysine by reductive methylation; 3) specific derivatization of the N-terminal amino group
with a 1:1 mixture of DNFB (2,4-dinitrofluorobenzene) and [2H3]DNFB at pH 7.0. The resulting crosslinked and linear peptides can be identified by their isotopic patterns (a 1:2:1 triplet and a 1:1 doublet, respectively, with a spacing of 3 Da)[13].
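The 1:1 doublet and 1:2:1 triplet follow from binomial statistics: each N-terminus independently carries either the light or the heavy label. The sketch below is a minimal illustration, assuming complete labeling of every N-terminus, a heavy fraction strictly between 0 and 1, and no overlap with the natural isotope envelope. The function name is hypothetical.

```python
from math import comb

# Mass difference between three deuterium and three protium atoms (Da).
D3_SHIFT = 3 * (2.014102 - 1.007825)


def label_pattern(n_termini: int, heavy_fraction: float = 0.5):
    """Return (mass offset in Da, relative intensity) for n labeled N-termini.

    Intensities are scaled so that the weakest peak equals 1.
    Assumes 0 < heavy_fraction < 1.
    """
    pattern = []
    for k in range(n_termini + 1):  # k = number of heavy labels carried
        weight = (comb(n_termini, k)
                  * heavy_fraction ** k
                  * (1 - heavy_fraction) ** (n_termini - k))
        pattern.append((k * D3_SHIFT, weight))
    weakest = min(w for _, w in pattern)
    return [(round(shift, 3), round(w / weakest, 3)) for shift, w in pattern]


print("linear peptide (1 N-terminus):     ", label_pattern(1))
print("crosslinked peptide (2 N-termini): ", label_pattern(2))
```

The same function also predicts what a deviation from an exactly 1:1 reagent mixture would do to the peak ratios, which may be useful when judging whether an observed triplet is a genuine crosslink signature.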
In addition, succinylation is known to modify peptides at the N-terminal and ε-amino groups[14-17]. In an effort to develop a protein quantification method based on isobaric peptide termini labeling, 2-methoxy-4,5-dihydro-1H-imidazole was first reacted with the ε-amino group, followed by N-terminal labeling with succinic anhydride and tetradeuterated succinic anhydride-d4 (Scheme 5-3)[14, 17]. Koehler et al. recently reported that succinylation occurs selectively
at the N-terminal amino group in sodium acetate buffer at pH 7.6 (Scheme 5-3)[15, 16].
In principle, site-specific N-terminal succinylation can be used for the detection of crosslinked
peptides. Succinic anhydride specifically labels the N-termini, which results in a mass increase
of 100 Da for single peptides and 200 Da for crosslinked peptides (Scheme 5-3).
We plan to exploit succinylation to improve the detection of crosslinked peptides in the future.
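The nominal 100 Da and 200 Da increases can be refined to monoisotopic values. The sketch below assumes that each N-terminus gains exactly one succinyl group (C4H4O3 added per amine) and that lysine side chains are blocked or unreactive under the chosen conditions. The constant and function names are hypothetical.

```python
# Monoisotopic atomic masses (Da).
C, H, D, O = 12.0, 1.007825, 2.014102, 15.994915

SUCCINYL_LIGHT = 4 * C + 4 * H + 3 * O  # C4H4O3 added per labeled amine
SUCCINYL_HEAVY = 4 * C + 4 * D + 3 * O  # C4D4O3 from succinic anhydride-d4


def succinylation_shift(n_termini: int, per_label: float = SUCCINYL_LIGHT) -> float:
    """Total mass increase when every N-terminus carries one succinyl group."""
    return n_termini * per_label


for name, n in (("linear peptide", 1), ("crosslinked peptide", 2)):
    light = succinylation_shift(n)
    heavy = succinylation_shift(n, SUCCINYL_HEAVY)
    print(f"{name}: +{light:.3f} Da (light), +{heavy:.3f} Da (d4)")
```

A crosslinked peptide would therefore differ from a linear one both in the absolute shift and, if light and d4 reagents are mixed, in the spacing and number of peaks in the resulting pattern.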
Scheme 5-3. N-terminal succinylation via two-step chemical derivatization[14, 17] and site-specific N-terminal succinylation in sodium acetate buffer at pH 7.6[15, 16] can be used to differentiate crosslinked peptides from linear peptides.
Digestion under Acidic Conditions. Trypsin digestion is often performed under slightly basic
conditions, which may make some crosslinks unstable. Protease digestion under acidic conditions may
be explored.
Deglycosylation. The heterogeneity and relatively poor ionization efficiency of
glycopeptides make it more difficult to determine crosslinks near the site of glycan attachment.
Deglycosylation before 18O-labeling and protease digestion is worth exploring in the future.
Complicated Crosslinks. Proteins are known to degrade upon light exposure by a
number of mechanisms, which is of concern for products manufactured for the clinic. Protein
degradation in the light can involve multiple amino acid residues and a combination of multiple degradation pathways. This gives very complicated mass spectra, which pose a great challenge to current methodologies, including our XChem-Finder workflow.
Intra-crosslinks. In our XChem-Finder workflow, 18O-labeling followed by tryptic
digestion and LC/MS analysis is used to differentiate single peptides from crosslinked peptides.
This approach may not be suitable for intra-crosslinks when the two crosslinked amino acid
residues are so close that no tryptic cleavage occurs between them. This can be addressed by
combining digestion with multiple proteases (e.g., GluC/trypsin) with 18O-labeling, N-terminal
labeling or a chemical tag. We are going to explore this area in the future.
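The limitation for intra-crosslinks follows from how the 18O mass shift scales with the number of protease-generated C-termini. The sketch below assumes complete incorporation of two 18O atoms per newly created C-terminus and none at the original protein C-terminus. Both are idealizations, and the function and dictionary names are hypothetical.

```python
O18_MINUS_O16 = 17.999160 - 15.994915  # Da per exchanged oxygen atom


def o18_shift(new_c_termini: int, oxygens_per_terminus: int = 2) -> float:
    """Mass shift between digests prepared in 18O-water and in 16O-water.

    Only C-termini created by the protease are counted; complete double
    incorporation is assumed.
    """
    return new_c_termini * oxygens_per_terminus * O18_MINUS_O16


cases = {
    "linear peptide": 1,
    "crosslink joining two peptides": 2,
    "intra-crosslink, no cleavage between the sites": 1,
    "crosslink in which one peptide is the protein C-terminus": 1,
}
for label, n in cases.items():
    print(f"{label}: +{o18_shift(n):.3f} Da")
```

Under these assumptions, the last two cases show the same shift as a linear peptide, so they cannot be flagged by the 18O shift alone. Cleaving between the crosslinked residues with a second protease would create an additional labeled C-terminus.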
Others. As mentioned before, some limitations of XChem-Finder (e.g., crosslinks containing
the C-terminus of the protein, the need for high-quality MS/MS spectra, large crosslinks, etc.) can be addressed by
the use of proteases with different selectivity (e.g., GluC instead of trypsin), N-terminal labeling and
different ion activation methods (CID and ETD). Chemical tags may provide a solution as well and
might be studied in the future. In addition, MALDI often generates singly charged ions, while ESI
gives multiply charged ions for large peptides and proteins. As such, the use of MALDI instead of ESI may simplify the data interpretation of crosslinks. Next, we are going to exploit MALDI for the
characterization of crosslinks.
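The simplification expected from MALDI can be seen by listing the m/z values that a single crosslinked peptide can produce. The sketch below uses a mass of 4050.942 Da taken from Table 4-5 purely as an example and assumes protonated ions only (no adducts or radical species). The choice of charge states 2 to 5 for ESI is illustrative, not measured.

```python
PROTON = 1.007276  # Da


def mz(mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion for a neutral monoisotopic mass."""
    return (mass + charge * PROTON) / charge


mass = 4050.942  # example mass of a crosslinked peptide, from Table 4-5

print("MALDI-like (z = 1):", round(mz(mass, 1), 3))
print("ESI-like (z = 2-5):", {z: round(mz(mass, z), 3) for z in range(2, 6)})
```

One species thus appears at one m/z value when singly charged, but is distributed over several charge states in ESI, each of which also generates its own set of multiply charged fragment ions.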
5.3 References
[1] Li X, Cournoyer JJ, Lin C, O'Connor PB. Use of 18O labels to monitor deamidation during protein and peptide sample processing. J Am Soc Mass Spectrom 2008;19:855-64.
[2] Gaza-Bulseco G, Li B, Bulseco A, Liu HC. Method to differentiate asn deamidation that occurred prior to and during sample preparation of a monoclonal antibody. Anal Chem 2008;80:9491-8.
[3] Liu H, Wang F, Xu W, May K, Richardson D. Quantitation of asparagine deamidation by isotope labeling and liquid chromatography coupled with mass spectrometry analysis. Anal Biochem 2013;432:16-22.
[4] Sakai K, Homma H, Lee JA, Fukushima T, Santa T, Tashiro K, et al. Localization of D-aspartic acid in elongate spermatids in rat testis. Arch Biochem Biophys 1998;351:96-105.
[5] Sakai K, Homma H, Lee JA, Fukushima T, Santa T, Tashiro K, et al. D-aspartic acid localization during postnatal development of rat adrenal gland. Biochem Biophys Res Commun 1997;235:433-6.
[6] DiMarco T, Giulivi C. Current analytical methods for the detection of dityrosine, a biomarker of oxidative stress, in biological samples. Mass Spectrom Rev 2007;26:108-20.
[7] Correia M, Neves-Petersen MT, Jeppesen PB, Gregersen S, Petersen SB. UV-light exposure of insulin: pharmaceutical implications upon covalent insulin dityrosine dimerization and disulphide bond photolysis. PLoS One 2012;7:e50733.
[8] Malencik DA, Anderson SR. Dityrosine formation in calmodulin. Biochemistry 1987;26:695-704.
[9] Rinner O, Seebacher J, Walzthoeni T, Mueller LN, Beck M, Schmidt A, et al. Identification of cross-linked peptides from large sequence databases. Nat Methods 2008;5:315-8.
[10] Boersema PJ, Taouatas N, Altelaar AF, Gouw JW, Ross PL, Pappin DJ, et al. Straightforward and de novo peptide sequencing by MALDI-MS/MS using a Lys-N metalloendopeptidase. Mol Cell Proteomics 2009;8:650-60.
[11] Taouatas N, Altelaar AF, Drugan MM, Helbig AO, Mohammed S, Heck AJ. Strong cation exchange-based fractionation of Lys-N-generated peptides facilitates the targeted analysis of post-translational modifications. Mol Cell Proteomics 2009;8:190-200.
[12] Taouatas N, Mohammed S, Heck AJ. Exploring new proteome space: combining Lys-N proteolytic digestion and strong cation exchange (SCX) separation in peptide-centric MS-driven proteomics. Methods Mol Biol 2011;753:157-67.
[13] Chen X, Chen YH, Anderson VE. Protein cross-links: universal isolation and characterization by isotopic derivatization and electrospray ionization mass spectrometry. Anal Biochem 1999;273:192-203.
[14] Arntzen MO, Koehler CJ, Treumann A, Thiede B. Quantitative proteome analysis using isobaric peptide termini labeling (IPTL). Methods Mol Biol 2011;753:65-76.
[15] Koehler CJ, Arntzen MO, Strozynski M, Treumann A, Thiede B. Isobaric peptide termini labeling utilizing site-specific N-terminal succinylation. Anal Chem 2011;83:4775-81.
[16] Koehler CJ, Arntzen MO, Treumann A, Thiede B. A rapid approach for isobaric peptide termini labeling. Methods Mol Biol 2012;893:129-41.
[17] Koehler CJ, Strozynski M, Kozielski F, Treumann A, Thiede B. Isobaric peptide termini labeling for MS/MS-based quantitative proteomics. J Proteome Res 2009;8:4333-41.