bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
A multi-modal proteomics strategy for characterizing posttranslational modifications of tumor suppressor p53 reveals many sites but few modified forms
Caroline J. DeHart1, Luca Fornelli1, Lissa C. Anderson2, Ryan T. Fellers1, Dan Lu3, Christopher L. Hendrickson2,4, Galit Lahav3, Jeremy Gunawardena3, and Neil L. Kelleher1,5*
Summary
Post-translational modifications (PTMs) are found on most proteins, particularly on “hub” proteins
like the tumor suppressor p53, which has over 100 possible PTM sites. Substantial crosstalk between PTM
sites underlies the ability of such proteins to integrate diverse signals and coordinate downstream
responses. However, disentangling the combinatorial explosion in global PTM patterns across an entire
protein (“modforms”) has been challenging, as conventional peptide-based mass spectrometry strategies
(so-called “bottom-up” MS) destroy such global correlations. Alternatively, direct analysis of intact and
modified proteins using “top-down” MS retains global information. Here, we applied both strategies to
recombinant p53 phosphorylated in vitro with Chk1 kinase, which exhibited 41 modified sites by bottom-
up MS, but no more than 8 modified sites per molecule detected by top-down MS. This observation that
many low-abundance modifications comprise relatively few modforms above a 1% threshold indicates
that endogenous p53 PTM complexity may be more definable than previously thought.
Keywords: p53, top-down proteomics, post-translational modifications, modforms, 21 T FT-ICR
1 National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, IL, 60208, USA 2 Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Tallahassee, FL, 32310, USA 3 Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA 4 Department of Chemistry and Biochemistry, Florida State University, Tallahassee, FL, 32310, USA 5 Departments of Chemistry, Molecular Biosciences and the Division of Hematology-Oncology, Northwestern University, Evanston, IL, 60208, USA * Correspondence: [email protected]
1 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Introduction The development of nucleic acid sequencing technologies has domesticated the genome and
revealed the unexpected richness of RNA biology. Proteins, however, remain central players in most
cellular processes as the vast complexity of the proteome becomes better appreciated through the
application of emerging technologies [1]. Aside from somatic mutations in coding sequences and
alternative splicing, most proteins are also post-translationally modified, often on multiple sites. This is
particularly the case for “hub” proteins of key cellular pathways, which integrate diverse input signals and
help orchestrate multiple downstream responses.
The tumor suppressor p53 provides a potent example of proteomic complexity. p53 is a master
cellular regulator and frequently mutated in human cancers [2-4]. It integrates input signals from myriad
stressors, including genotoxic damage, oncogenesis, hypoxia, and viral infection [5, 6] into highly
orchestrated downstream responses, such as apoptosis, mitosis, and metabolic changes [7-9]. It does so
in a highly dynamic [10, 11] and stressor- and cell-type-specific manner [12]. The activity and stability of
p53 is associated with hundreds of individual post-translational modifications (PTMs) reported to date
[13-15], which are known to be regulated by numerous enzymes [13, 14] within stressed and unstressed
cells. These PTMs have been identified on over 100 sites within the N-terminal transactivation domains
(AA 1-92), central DNA-binding domain (AA 102-292), and C-terminal tetramerization and basic domains
(AA 292-393) of the protein (Figure 1A) [3, 13, 14].
For p53 and similar proteins, it is likely that extensive crosstalk exists between PTM sites on different
domains to integrate upstream signals and downstream responses. Such PTM crosstalk offers a way to
encode information about context and function in the pattern of modifications across the whole protein,
as suggested by the concept of combinatorial “PTM codes”, of which the histone code is best known [16].
Crosstalk between modified sites could arise through protein folding, which can bring widely separated
parts of a peptide sequence together in space, as well as through allosteric conformational changes that
2 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
could alter binding partners [17]. If such crosstalk underlies information encoding and integrative
function, then it becomes essential to know the patterns of modification across the entire protein, as well
as the stoichiometric abundance and distribution of these patterns within a given biological context. This
presents a formidable challenge, especially for extensively modified proteins such as p53, which could
theoretically exhibit over 1030 modified forms. While these cannot all exist, given the limitation of protein
copy number in cells or entire organisms [1], the sheer scale of combinatorial PTMs highlights the
challenge of determining which modified p53 forms are actually present within a given biological context.
The term “proteoform” covers all changes of a given protein molecule arising after transcription,
including alternative spliceforms, chemical adducts, proteolytic cleavages, and PTMs [1, 18]. In the context
of information encoding by PTMs, it is more helpful to focus on “modforms” [19], or those proteoforms
arising from reversible enzymatic modification and de-modification, such as phosphorylation, acetylation
and ubiquitinylation. The number of modforms that a protein can exhibit grows exponentially with the
number of modification sites. An individual protein molecule will exhibit one modform, while a population
of protein molecules will exhibit a distribution of modforms in varying proportions. This “modform
distribution” offers the most complete information about global PTM crosstalk and a protein's capability
to encode information.
Post-translational modifications of p53 are typically analyzed by PTM-specific antibodies, which
provide high sensitivity but can only identify a few modifications. Traditional proteomic approaches
employing proteolytic digestion and analysis of the resulting peptides offer another approach for PTM
detection and are commonly referred to as “bottom-up” (BU) mass spectrometry (MS). This latter strategy
excels at high-throughput, high-confidence PTM characterization of p53 populations, most notably in a
pair of recent studies which performed comprehensive PTM cataloging of p53 isolated from normal
human fibroblasts and identified 230 PTMs across 101 amino acid residues of p53 [15, 20]. However,
proteolytic digestion severs the linkage of combinatorial PTM patterns, losing information on PTM
3 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
crosstalk between functional domains and making accurate determination of complex modform
distributions impossible [21].
An alternative strategy, analysis of intact modforms by “top-down” (TD) MS [22], offers a
complementary approach. By keeping the complete protein primary structure intact, the PTM landscape
of a proteoform can be characterized in precise molecular detail, and correlations between widely
separated modification sites can be maintained [23-25]. In recent work, we showed that TD MS offers a
significant advantage for estimating modform distributions [19, 21], especially when used in combination
with BU MS and other “middle-down” (MD) MS approaches, which employ partial enzymatic digestion to
provide ~5-25 kDa protein fragments [26-28]. However, analysis of intact endogenous human p53
modforms by TD MS presents a formidable technical challenge, due to both the molecular weight (43.6
kDa) and the sheer number of potential modforms. A method by which the number and identity of PTMs
on the most abundant p53 modforms could be characterized within a specific biological context would
therefore be immensely valuable to the fields of p53 biology and translational proteomics.
To explore the capabilities of TD MS for p53 modform characterization, we focused on recombinant
human p53 (hereafter, “rp53”), phosphorylated in vitro by the kinase Chk1, which is known to target p53
in vivo [29-32]. Despite showing far less PTM complexity than endogenous p53, this representative model
reflects the patterns of modification produced by typical regulatory enzymes. Although previous studies
of Chk1 action on p53 have shown only 8 phosphorylation sites, we mapped a total of 41 phosphorylation
sites to rp53 by combined BU and MD MS. However, TD MS revealed no more than 8 phosphorylation
sites on each intact molecule, within the limits of detection. Moreover, as rp53 was digested into smaller
peptides, the number of abundant phosphorylation sites observed on each peptide was always vastly
lower than the theoretical maximum. Despite the potential complexity of 241, or over 2 x 1012, modforms,
the actual observed complexity is far less when viewed by TD and MD MS. In other words, we found many
modification sites but few modforms. As we show here, TD MS offers a new approach to quantifying global
4 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
patterns of PTMs and suggests that the actual PTM complexity generated and maintained by cells may be
more definable than previously thought.
Results Method optimization for analysis of intact p53 modforms
To the best of our knowledge, a method for the separation of intact p53 modform populations into
isobaric peaks by high-performance liquid chromatography and subsequent characterization by tandem
mass spectrometry (LC-MS/MS) has yet to be established. Therefore, in order to perform this critical
aspect of our multimodal proteomics strategy, we first needed to determine which sample,
chromatographic, instrument, and data analysis parameters would be required for rapid and high-
confidence identification of intact p53 protein. To develop an LC-MS/MS method, we isolated
recombinant full-length human p53 (rp53) (Figure 1B) from an E. coli-based system [33] by cobalt affinity
chromatography (Figure S1A) and introduced it into two different Fourier-transform (FT) mass
spectrometers for analysis by LC-MS/MS.
We first obtained a successful medium-resolution intact mass (MS1) spectrum of purified rp53 on
an Orbitrap Fusion Lumos mass spectrometer, utilizing a short transient acquisition of the FT signal that
enhances signals of charge states produced by electrospray ionization (ESI) using the lowest possible
resolving power (7,500 at 200 m/z) (Figure S2A) [34]. After performing charge state deconvolution using
the ReSpect™ algorithm, the resulting average mass of 45,683.75 Da was within 1.6 Da of that calculated
from in silico translation of the rp53 plasmid sequence [33]. We confirmed our identification of rp53 by
targeting selected charge states for fragmentation by collision-induced dissociation (CID), resulting in the
generation of protein fragment ions (MS2). The monoisotopic masses of these fragment ions could then
be calculated using the Xtract algorithm and compared against the rp53 protein sequence using ProSight
Lite data analysis software [35, 36], with high-confidence matches determined by probability-based
scoring metrics (e.g. p-score) (Figure S3A).
5 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
As we believed that a high-resolution MS1 spectrum would be required to accurately identify and
characterize multiple PTMs on intact p53, we next performed top-down LC-MS/MS analysis of intact rp53
on the 21 Tesla (T) Fourier Transform (FT) – Ion Cyclotron Resonance (ICR) mass spectrometer at the
National High Magnetic Field Laboratory (Tallahassee, FL), which holds the current world record for FT
resolving power [37]. By doing so, we successfully obtained a high-resolution MS1 spectrum of intact rp53
(150,000 at 400 m/z) on a chromatographic timescale (Figure S2B), with charge states isotopically
resolved to the baseline (Figure S2B, inset), a feat not yet achievable on any other MS platform. By
performing charge state deconvolution using the Xtract algorithm, we calculated the monoisotopic mass
of intact rp53 to be 45,656.45 Da, only 0.13 Da away from the theoretical mass. We obtained additional
improvements in rp53 sequence coverage by employing electron-transfer dissociation (ETD) as the MS2
fragmentation method (Figure S3B), thus proving this strategy capable of rapid, high-confidence rp53
modform characterization by TD LC-MS/MS.
Characterization of representative p53 modforms
With a workflow for accurate identification and characterization of intact rp53 by TD MS, we next
analyzed a population of rp53 modforms created in vitro by a single enzyme. We selected purified human
Chk1 kinase for this purpose because this enzyme had been previously described to phosphorylate human
p53 in vitro and/or in vivo at 8 residues (Ser 15, Thr 18, Ser 20, Ser 37, Ser 313, Ser 314, Ser 378, Thr 387)
[29-32], and could therefore provide up to 256 potential rp53 modforms from a single enzymatic reaction.
Our rationale was to visualize the number of phosphorylated sites present on the most abundant rp53
modforms by TD, while employing complementary BU analysis to confirm the number and locations of all
phosphorylated rp53 residues. For the TD approach, we subjected replicate preparations of Chk1-
phosphorylated rp53 (phospho-rp53) (Figure S1B), in addition to kinase-free rp53 controls prepared in
parallel, to LC-MS/MS analysis on an Orbitrap Fusion Lumos mass spectrometer, employing the LC and
“medium-resolution” MS parameters previously optimized for unmodified rp53. Upon performing charge
6 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
state deconvolution using the ReSpect™ algorithm, we observed that intact Chk1-treated rp53 modforms
bore up to 8 phosphorylated sites (Figure 2A, S4A, and S5), consistent with the number predicted by
previous studies. We further confirmed the number of detectable phosphorylated sites on rp53 modforms
by performing high-resolution MS1 analysis of unmodified and Chk1-treated rp53 on the 21 T FT-ICR mass
spectrometer. By isolating the 44+ charge state of phospho-rp53, we were able to maximize our ion signal
and visualize peaks for each of the 8 rp53 phospho-modforms at near-baseline isotopic resolution (Figure
S4B). However, due to the size and signal complexity of the target species, MS2 fragmentation of intact
phospho-rp53 modforms could not be successfully obtained by LC-MS/MS on either instrument platform.
To localize and assign phosphorylated sites within phospho-rp53 modforms, we subjected parallel
preparations of Chk1-phosphorylated and kinase-free rp53 to concurrent trypsin and LysC digestion,
followed by high-resolution BU LC-MS/MS analysis on an Orbitrap Fusion Lumos mass spectrometer. We
then used Mascot software to search the resulting peptide and accompanying fragment ion monoisotopic
masses against the rp53 sequence, as well as the UniProt E. coli database to rule out any false positive
matches from contaminating proteins, using a series of increasingly strict parameters (detailed in
Experimental Methods). We also performed replicate searches using the MS Amanda [38] and Target
Decoy PSM Validator nodes in Proteome Discoverer, as this workflow provided orthogonal probability-
based scoring metrics and allowed us to manually validate each identified phosphorylation site within the
raw data. By accepting only the highest-scoring identifications from both search strategies across three
independent experiments, we identified a total of 39 phosphorylated residues on rp53 resulting from in
vitro phosphorylation by Chk1 (Figure 2B, Table S1). This striking increase in the number of potential
phosphorylation sites from the 8 predicted by the literature proportionally increased the number of
potential modforms within the sample from 256 to 5 x 1011 and indicated that each isobaric modform
peak visualized by TD might bear far higher complexity than anticipated, thus partially explaining why MS2
based characterization of intact phospho-rp53 could not be achieved.
7 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Integration of proteomic approaches to estimate p53 modform distributions
As our TD and BU datasets indicated two highly divergent degrees of phosphorylation exhibited by
phospho-rp53 modforms, we sought to find a middle ground in which we could better visualize the
number and combinations of phosphorylated sites present within specific regions of the rp53 sequence.
Our rationale was that partial proteolytic digestion of phospho-rp53 would provide both smaller target
modforms and reduced signal complexity, thus facilitating the identification and localization of the most
abundant phosphorylation sites on each intact digestion product. For this “middle-down” (MD)-MS
approach, we subjected parallel preparations of Chk1-phosphorylated and kinase-free rp53 to partial
digestion with LysC and analyzed the resulting peptides by high-resolution MD LC-MS/MS on an Orbitrap
Fusion Lumos mass spectrometer.
For the smaller peptide fragments (≤10 kDa), we searched the resulting data against a custom
UniProt-style database, comprising the rp53 sequence and all phosphorylation sites confirmed by BU,
using ProSightPC 4.0 and a three-pronged search strategy (detailed in Experimental Methods) [39]. We
then used TDCollider software to collate the results of each high-throughput search into a single output
file comprising a protein sequence coverage map and accompanying MS2 graphical fragment maps of all
rp53 digestion products identified by MD. To rule out the possibility of false identifications, both positive
and negative, we subjected each identified phospho-rp53 sub-region to manual phosphorylation site
validation with ProSight Lite (examples in Figure S6D-F). In doing so, we identified two additional rp53
phosphorylation sites (Ser 390, Ser 395) (Figure 2B, Figure S6E, Table S1) within a region that received
poor peptide coverage by BU analysis (AA 389-398), bringing the total number of sites identified to 41
(and the total number of potential modforms to 2 x 1012).
For the larger (≥10 kDa) peptide fragments, we first obtained a medium-resolution MS1 spectrum
of each species (Figure S7, Figure S8) to determine both the MW and the number of potential
phosphorylation sites by performing charge state deconvolution using the ReSpect™ algorithm (Table S2).
8 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
We then performed a series of targeted MS2 experiments employing HCD or ETD fragmentation of
individual charge states, with method parameters tailored to obtain the best possible sequence coverage
of each phospho-rp53 peptide. Finally, we verified that we had identified the correct peptide fragment by
calculating MS2 fragment ion monoisotopic masses with the Xtract algorithm and comparing them against
the respective regions of rp53 using ProSight Lite (Figure S7, Figure S8).
To integrate the phospho-rp53 modform characterization performed by TD-, BU-, and MD-MS into
a constrained estimate of the total number of potential modforms present, we first calculated the relative
abundance of each phosphorylated form of intact rp53 and MD digestion products spanning the entire
length of the rp53 protein sequence (Figure 3, Figure S6). By overlaying the final relative abundance
histograms against a coverage map of the rp53 sequence and all phosphorylation sites identified therein
(Figure 3, Figure S6), we were able to localize subsets of potential modform complexity to specific regions
of phosphorylated rp53. For example, of the 8 total phosphorylated sites detected on intact rp53 (Figure
3A), up to 8 sites could be ascribed to the N-terminal half of the protein (Figure 3B, 3C; Figure S7), and 5
to the C-terminal half (Figure 3B, 3D; Figure S8) with the number of abundant sites appearing to decrease
closer to the termini (Figure 3E, 3F; Figure S6A-C).
By calculating the numbers of potential combinations resulting from the addition of up to 8
phosphorylated sites to intact rp53, multiplying each number by the abundance ratio calculated for the
respective isobaric modform peak in BioPharma Finder, and summing the results together, we estimated
a total of 68 potential phospho-rp53 modforms when considering the TD data alone (Figure 4, Table S3).
By comparing the number of phosphorylation sites mapped by BU with the maximum number of sites
observed within each region by MD, we could further refine our estimation of overall modform
complexity. For example, the presence of 12 sites identified within AA 296-412 by BU, observed as a
maximum of 4 co-occurring sites by MD, indicated the presence of 495 potential combinations (Figure
3F). However, only 6 sites were observed in a total of 11 combinations within that region by MD. Similarly,
9 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
of the 19 sites localized to AA 1-173 by BU, with 3,874 potential combinations, only 6 sites in a total of 13
combinations were observed by MD. As we consistently observed that fewer than 50% of the
phosphorylation sites predicted by BU within a specific peptide sequence were detected by MD, and that
these sites were detected within a small subset of all potential combinations by MD, even for the smallest
peptide fragments (Figure S6A-C), it is more than likely that the actual intact rp53 phospho-modform
distribution occupies a similarly small subset of our BU-based estimate.
Discussion Advancements in multimodal proteomic methodology
Single charge states of intact p53 protein are challenging to analyze by TD MS, even when abundant
and unmodified. The relatively large MW of p53 (43.6 kDa for endogenous, 45.6 kDa for recombinant) is
at or above the upper limit of most commercially available MS instrumentation for high-quality MS1 and
MS2 fragmentation data on a timescale compatible with chromatography. This is due to the MW-
dependent decrease in signal-to-noise ratio from the concomitant increase in peaks generated by
electrospray ionization and chemical adducts [40], a situation exacerbated by PTMs and other sources of
biological variation (e.g. mutations). Special measures, such as the “medium-resolution” method
employed on the Orbitrap Fusion Lumos or the superior signal-to-noise ratio provided by the 21 T FT-ICR,
must therefore be taken to obtain high mass accuracy measurements and confident identification of intact
p53 protein by TD MS. Moreover, the biochemical properties of rp53 present an additional challenge,
namely rapid and reproducible degradation outside of the sample parameters detailed in the
Experimental Methods. Finally, the addition of PTMs at variable sites and at widely variable
stoichiometries provides a significant increase in both sample and signal complexity, thus rendering the
chromatographic resolution and ensuing accurate mass determination of individual rp53 modforms a
formidable proposition.
10 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
After surmounting these challenges to develop a workflow by which intact rp53 could be
successfully analyzed and characterized by TD LC-MS/MS on two platforms, the Fusion Lumos (Figure S2A,
S3A) and the 21 T FT-ICR (Figure S2B, S3B), we were able to analyze intact and phosphorylated rp53
modforms generated by Chk1 kinase in vitro. While complementary PTM cataloging by high-resolution BU
and MD reproducibly showed that 41 phosphorylation sites were present on Chk1-treated rp53 (Figure
2B, Table S1), substantially more than the literature had indicated, we were able to directly visualize by
intact mass that the most abundant modforms within our sample population bore only up to 8
phosphorylated sites with a threshold of ~2-3% relative abundance for modform detection (Figure 2A,
Figure S4). As viewed by whole-protein mass spectrometry, phosphorylated rp53 greatly constrained our
estimate of how many rp53 modforms were present above an abundance threshold and vastly diminished
the degree of potential modform complexity predicted from the BU data alone. Moreover, in establishing
the parameter requirements for the analysis of high-MW, highly-modified intact proteins by LC-MS/MS,
we have laid a path for the characterization of complicated modform populations from a broad range of
biological contexts.
The intermediate nature of the MD approach also provides numerous benefits when attempting to
characterize a complex modform population. Foremost, peptides from ~5-20 kDa are smaller and less
extensively modified than whole modforms, thus facilitating localization of abundant modification sites
by standard MS2 fragmentation techniques. While partial digestion products > 20 kDa may not be as
accessible to MS2 characterization (Figure S7, Figure S8), due to the number of potential fragment ion
channels exceeding the capacity of the mass analyzer (Orbitrap Fusion Lumos), they can still indicate the
number of abundant modification sites detectable within a given region of the protein (Figure 3). This
latter point proved especially helpful when estimating the total number of rp53 phospho-modforms, as
the number of abundant modifications observed per region was always significantly less (50% or below)
than the number predicted by BU (Figure 2B, Figure 3), even for the smallest pieces analyzed (Figure S6).
11 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
While our final rp53 modform estimate was somewhat impacted by lack of complete MS2 sequence
coverage within the largest phosphorylated rp53 peptides, which spanned the central 100 AA of rp53, MD
analysis indicated that most of the phosphorylation sites were likely to exist closer to the N- and C-termini
(Figure 3), from which we were able to obtain high sequence coverage by MS2 fragmentation and
therefore complete localization of multiple phosphorylation sites.
Biological implications on p53: many modifications, few abundant modforms
p53 is a transcription factor and nodal signal conduit protein responsive to a diverse repertoire of
upstream cellular stresses (e.g. hypoxia, oncogenic signaling, DNA damage) while regulating myriad
downstream effector pathways encompassing thousands of target genes [41] collectively coordinating
the final cellular response (e.g. DNA damage repair, apoptosis, cell cycle arrest). An explanation for how
a single protein can perform such a wide range of roles in a context specific manner has been provided by
the discovery of additional PTMs present on p53. Indeed, several specific enzymatic phosphorylation or
acetylation events have been shown to alter the cell fate selection by p53 [42-45]. More recently,
endogenous human p53 has been shown to bear more than 300 PTMs at individual sites with unknown
stoichiometries resulting from stressor-specific cellular response pathways [13-15]. This has led to a
prevailing speculation of the existence of a highly complex PTM-based combinatorial logic, or “PTM
barcodes” [12]. Patterns of PTMs are thought to direct the ability of p53 to activate a response to a given
stressor in a cell by localizing to the nucleus and binding to a small subset of over 3500 transcriptional
targets [7, 46], often by interacting with only a few protein binding partners out of potential thousands
[13, 41]. Given the sheer number of possible PTM combinations (estimated at 1030), it is difficult to
envision a feasible evolution of an equal number of ‘readers’ for every PTM combination, due to the
metabolic burden this would impose on any given cell. However, questions remain: just how complex is
the p53 PTM code and how many p53 modforms are there?
12 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Pre-existing techniques such as immunoblots or even BU proteomics have not been able to reveal
the true scope of PTM-based p53 regulation over decades of dedicated research. Analysis of p53
populations by modification-specific antibodies or standard BU proteomics methods only indicate which
PTMs might be present; elucidation of PTM crosstalk between different regions of the protein, the
modform distribution across a p53 population, or the PTM logic correlating to p53 modform function
within any given biological context is far more challenging. Indeed, when comprehensive peptide-based
PTM catalogs of equivalent p53 populations resulting from different stressors were compared, there were
no significant differences observed in the location or identity of any PTMs, despite each population
exhibiting distinct functional phenotypes (e.g. ability to activate transcription of downstream target
genes) [20]. Without a method to determine which p53 modforms are expressed at stoichiometric
abundance as the result of a specific cellular stressor, elucidation of the underlying PTM logic governing
even a single p53-dependent response pathway will likely remain elusive. Our TD MS workflow for intact
p53 analysis therefore provides a first attempt to decipher modforms directly on whole p53 molecules.
In order to reduce the complexity and competition of multi-enzyme modifications on p53, we
limited our study to in vitro phosphorylation by a single DNA damage responsive kinase, Chk1. While we
do not know if the 33 new phosphorylation sites on rp53 reported here are physiologically relevant, our
collective BU, MD, and TD MS results nevertheless provide new insight into the readout of PTM-based
logic. Firstly, we detected a range of individual rp53 molecules containing combined totals of 0 through 8
phosphorylation events by TD. This reinforces the hypothesis that PTMs modulate p53 activity in a graded,
rheostat-like response [47], wherein the activity of p53 is more critically ascribed by the number of
phosphorylation sites, rather than by their precise locations. Secondly, by constraining our estimated
modform distribution to the empirical data provided by our integrated proteomic approach, we were able
to narrow our estimate to 68 potential rp53 modforms, approximately 11 orders of magnitude lower than
the 2 x 1012 potential modforms resulting from a total occupancy of all 41 phosphorylation sites (Figure
13 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
4). This emphasizes that the actual number of modforms present within a population is substantially less
than the total number of theoretically possible combinations, thus significantly reducing the potential
complexity of the regulatory PTM barcode.
The next step, which we have already began to investigate, will be to employ mathematical
algorithms to compute the precise abundance of all empirically observed modforms within a given
population and thereby constrain the distribution of all potential p53 modforms. This ability to constrain
and quantify p53 modform distributions using an integrated MS approach will facilitate the identification
of potential redundancies or linkages between modification sites, providing greater insight into the
underlying logic of signal pathways and p53 PTM codes. Until TD MS technology is capable of performing
complete PTM localization of complex p53 modforms, we must rely on complementary proteomic
strategies to provide the most conservative estimate of p53 modform distributions and constrain the
complexity of the p53 PTM landscape. While the kinase-treated samples studied herein bear dozens of
low-abundance PTM sites from promiscuous activity in vitro, the modform distribution created in vivo is
likely to be quite different. In vivo, one could expect tighter control over modification site utilization, but
also that some degree of site promiscuity and biochemical “noise” is inevitable. Advancements in
measurement science should aim to quantify endogenous p53 modform distributions, clarifying the
spatiotemporal crosstalk among modifications on p53 [14] and PTM alterations over different temporal
phases of p53 expression in response to stress (e.g. the pulsatile expression of p53 in response to ionizing
radiation in MCF7 cells [10, 11]). These insights will in turn significantly advance our understanding of how
these PTMs may change as a result of oncogenic p53 mutations, as well as the impact of the resulting p53
onco-modforms in human cancers.
Acknowledgements This work was supported by the National Institute of General Medical Sciences R01GM105375 (J.G.,
D.L. and G.L.), and P41GM108569, for the National Resource for Translational and Developmental
14 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Proteomics (NRTDP) based at Northwestern University (N.L.K.), as well as by the National Science
Foundation Cooperative Agreement No. DMR-1157490 and the State of Florida (L.C.A. and C.L.H.). Further
support was provided by the National Center for Research Resources award #S10RR027842 to the Keck
Biophysics Facility at Northwestern University. The content is solely the responsibility of the authors and
does not necessarily represent the official views of the National Institutes of Health.
Author Contributions C.J.D., L.F., and L.C.A. designed and conducted the experiments. C.J.D., L.F., L.C.A., and R.T.F. analyzed the data. C.L.H. and N.L.K. contributed analytical tools. C.J.D., L.F., D.L., G.L., J.G., and N.L.K. wrote the paper.
Declaration of Interests R.T.F., J.G., and N.L.K. are involved in the production of freeware and commercialized software, so therefore declare a conflict of interest.
Experimental Methods Expression and isolation of recombinant human p53
A pET15-based plasmid allowing IPTG-inducible expression of intact human p53 bearing an N-
terminal 6xHis tag [33] was purchased from Addgene (Plasmid # 24859), amplified in DH5α E. coli by liquid
culture in LB medium, isolated via Qiagen midiprep kit (Qiagen, Hilden, Germany), verified to contain the
correct p53 sequence by Sanger sequencing at the Northwestern University Genomics Core Facility
(Evanston, IL), and transformed into BL21-DE3 competent E. coli (New England Biolabs, Ipswich, MA).
Intact recombinant human p53 (rp53) was then expressed and isolated using a method modified from
[33]. Briefly, transformed cells were amplified in 5 mL LB cultures overnight at 37 °C, diluted 1:50 into 150
mL LB cultures, amplified for 8 h at 37 °C, induced with IPTG at a final concentration of 0.5 mM, and
transferred to 25 °C for overnight rp53 expression (typically 16-18 h). Cells were pelleted by centrifugation
at 5 krpm for 30 min. at 4 °C, frozen in LN2 for 30s, and lysed into 25 mM Tris-HCl, pH 7.5, containing 500
mM NaCl, 10 mM imidazole, 1% Triton X-100, 4 μM leupeptin, 1.7 μM antipain, 1.5 μM pepstatin, and 1
15 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
mM PMSF. Lysates were incubated on ice for 20 min., followed by addition of 750 U of benzonase nuclease
(Sigma-Aldrich Corp., St. Louis, MO) and MgCl2 to a final concentration of 1 mM. Lysates were incubated
for 20 min. at 20 °C, clarified by centrifugation at 7.5 krpm for 15 min. at 4 °C, and incubated with 6 mL of
Pierce HisPur cobalt affinity resin (Thermo Fisher Scientific, San Jose, CA) by batch rotation for 4 h at 4 °C.
Protein-bound resin was poured into a leveled 25 mL glass column (Bio-Rad Laboratories, Hercules,
CA) by gravity flow and washed successively with 3 column volumes of ice-cold 20 mM Tris-HCl, pH 7.5,
containing 250 mM NaCl, 4 μM leupeptin, 1.7 μM antipain, 1.5 μM pepstatin, 1 mM PMSF, and a step
gradient of 10 mM, 25 mM, or 50 mM imidazole. Bound rp53 was eluted into 1 mL fractions with 1.5
column volumes of ice-cold 50 mM Tris, pH 7.5, containing 400 mM NaCl, 150 mM imidazole, 4
μM leupeptin, 1.7 μM antipain, 1.5 μM pepstatin, and 1 mM PMSF. Aliquots from each eluted fraction
were subjected to SDS-PAGE and subsequent silver nitrate stain [48] to evaluate rp53 quantity and purity
(Figure S1A). Those fractions containing the bulk of the purified rp53 were concentrated and exchanged
into ice-cold 50 mM Tris-HCl, pH 7.5, containing 150 mM NaCl, 4 μM leupeptin, 1.7 μM antipain, 1.5
μM pepstatin, and 1 mM PMSF by centrifugation in a 5 mL 30 kDa MWCO filter (Merck Millipore, Billerica,
MA) at 7.5 krpm for a total of 20 min. at 4 °C. Concentrated rp53 protein was then snap-frozen in LN2 and
stored at -80 °C for a maximum of 14 days prior to analysis.
Phosphorylation of recombinant p53 in vitro by Chk1 kinase
Aliquots of concentrated rp53 were phosphorylated in vitro by recombinant Chk1 kinase (R&D
Technologies, North Kingstown, RI) using a method modified from [30]. Briefly, 96 pmol aliquots of rp53
(~ 4.4 μg, as determined by BCA assay) were incubated with 10 pmol of Chk1 (544 ng) in 20 mM Tris-HCl,
pH 7.5, containing 10 mM MgCl2, 2 mM MnCl2, 1 mM DTT, and 25 μM ATP for 40 min. at 30 °C.
Phosphorylation was then halted by addition of DTT to a final concentration of 50 mM, followed by
incubation at RT for 10 min. and for a subsequent 30 min. on ice. Typical reaction volumes were 50 μL,
with kinase-free controls prepared in parallel. Aliquots from both preparations were subjected to SDS-
16 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
PAGE and subsequent staining by Pro-Q Diamond phosphoprotein gel stain (Thermo Fisher Scientific) and
SYPRO Ruby fluorescent protein stain (Thermo Fisher Scientific) (Figure S1B). Stained gel bands were
visualized at a wavelength of 580 nm on a Typhoon 9400 phosphorimager (Amersham Biosciences, Little
Chalfont, UK) at the Northwestern University Keck Biophysics Facility (Evanston, IL), or at 610 nm on a
ChemiDoc XRS+ molecular imager (Bio-Rad Laboratories), respectively. Fluorescent gel images were
analyzed and colorized using ImageJ [49].
Sample preparation for denaturing top-down mass spectrometry
Aliquots of unmodified, phosphorylated, and kinase-free control rp53 were subjected to
MeOH:CHCl3:H2O precipitation directly prior to LC-MS/MS analysis using a method modified from [50].
Briefly, proteins were precipitated in 8 volumes of methanol (Optima-grade, Thermo Fisher Scientific or
LC-MS grade, Honeywell-Burdick & Jackson, Muskegon, MI), 2 volumes of ACS-grade chloroform (Sigma-
Aldrich Corp.), and 6 volumes of water (Optima-grade, Themo Fisher Scientific or LC-MS grade, Honeywell-
Burdick & Jackson), followed by phase separation via centrifugation at 13.2 krpm for 15 min. at RT. Pellets
were washed twice in 8 volumes of ice-cold methanol with repeated centrifugation and immediately
resuspended in 25 μL of ice-cold Mobile phase A (4.8 % ACN and 0.2% FA in Optima or LC-MS grade H2O).
Samples were centrifuged for 5 min. at 13.2 krpm and 4 °C, transferred into pre-chilled vials, and
immediately carried on ice for MS analysis. All samples were maintained on ice between LC-MS runs.
Replicate samples were prepared at NRTDP (Evanston, IL) or shipped frozen to NHMFL (Tallahassee, FL)
for on-site preparation and analysis.
Sample preparation for middle-down mass spectrometry
Aliquots of phosphorylated and kinase-free rp53 samples previously analyzed by or prepared in
parallel to those analyzed by denaturing top-down MS were transferred to LoBind tubes (Eppendorf,
Hauppauge, NY) and precipitated in 6 volumes of acetone overnight at -80 °C. Proteins were pelleted by
centrifugation at 13.2 krpm for 10 min. at 4 °C, washed in an equivalent volume of ice-cold acetone with
17 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
repeated centrifugation, and dried for 1 min. at RT. Pellets were resuspended in 20 μL 8M urea, reduced
with DTT (10 mM final concentration) for 20 min. at RT, alkylated with IAA (36 mM final concentration)
for 30 min. at RT in the dark, diluted to 150 μL in 0.1M NH4HCO3, pH 7.5, digested with 500 ng LysC
(Promega) for 2 h at RT, and quenched with 3 μL formic acid (Thermo Fisher Scientific). The resulting
peptides were subjected to concentration and desalting by C4 ZipTip (Merck Millipore), with activation
performed in 100% MeOH, equilibration and washing performed in 0.1% LC-MS grade trifluoracetic acid
(Sigma-Aldrich Corp.), and elution into 80% Optima-grade ACN and 0.1% trifluoracetic acid (all solutions
prepared in Optima-grade H2O, Thermo Fisher Scientific). Eluted peptides from both preparations were
diluted 1:5 in Mobile phase A and transferred to vials for analysis by middle-down LC-MS/MS.
Sample preparation for bottom-up mass spectrometry
Aliquots of unmodified, phosphorylated, and kinase-free rp53 samples previously analyzed or
prepared in parallel to those analyzed by top-down MS were transferred to LoBind tubes and precipitated
in 6 volumes of acetone overnight at -80 °C. Proteins were pelleted by centrifugation at 13.2 krpm for 10
min. at 4 °C, washed in an equivalent volume of ice-cold acetone with repeated centrifugation, and dried
for 1 min. at RT. Pellets were resuspended in 20 μL 8M urea, reduced with DTT (10 mM final concentration)
for 20 min. at RT, alkylated with IAA (36 mM final concentration) for 30 min. at RT in the dark, diluted to
150 μL in 0.1M NH4HCO3, pH 7.5, digested with 500 ng Trypsin-LysC (Promega, Madison, WI) overnight at
30 °C, and quenched with 3 μL formic acid (Thermo Fisher Scientific). The resulting peptides were
subjected to concentration and desalting by C18 ZipTip (Merck Millipore), with activation performed in
90% Optima-grade ACN and 0.2% LC-MS grade formic acid, equilibration and washing performed in 0.2%
LC-MS grade formic acid, and elution into 70% Optima-grade ACN and 0.2% formic acid (all solutions
prepared in Optima-grade H2O, Thermo Fisher scientific). Eluted peptides from both preparations were
dried via speedvac at RT, resuspended in 24 μL Mobile phase A, centrifuged at 13.2 krpm for 5 min. at RT
to remove debris, and transferred to vials for analysis by tandem MS/MS.
18 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
LC-MS/MS on an Orbitrap Fusion Lumos mass spectrometer
Top-down proteomics – Intact rp53 was analyzed by liquid chromatography on-line coupled to mass
spectrometry (LC-MS/MS). Reversed-phase liquid chromatography was performed on a Dionex Ultimate
3000 chromatographic system (Thermo Fisher Scientific) using a vented tee setup described previously
[34]. Mobile phase A was composed of 4.8% acetonitrile (ACN) in H2O with 0.2% formic acid (FA), while
Mobile phase B consisted of 4.8% H2O in ACN and 0.2% FA. Top-down experiments were performed using
in-house packed PLRP-S (5 μm, 1000 Å, Agilent, Santa Clara, CA) trap (150 µm i.d., 25 mm) and analytical
columns (75 µm i.d., 250 mm length). The samples were first loaded onto the trap column at 3 µL/min.
flow rate. After 10 min. of wash, proteins were separated at 300 nL/min. using the analytical column, held
at 20 °C, over a 30 min. gradient of Mobile phase B from 20 to 35% followed by column wash at 95% B
and re-equilibration at 10% B (total run time: 50 min.). The outlet of the analytical column was coupled to
a homemade nanoelectrospray (nanoESI) source, where ionization was facilitated using a pulled capillary
emitter (15 µm i.d., New Objective, Woburn, MA) connected to a high voltage union (potential set at 1.8-
2 kV). The Orbitrap Fusion Lumos was operated in “intact protein mode”, with ion transfer tube
temperature set at 320 °C. MS1 scans were recorded over a 500-2000 m/z window at 7,500 resolution (at
200 m/z) with 8 ms transients [34], averaging 20 microscans and with AGC target value of 4e5, applying a
4 V source-induced dissociation (SID). MS2 scans were acquired by targeting a given charge state or by
selecting the top 2 most abundant precursors, isolated using the quadrupole over a 3 m/z window.
Dynamic exclusion was set at 60 s with ±1.5 m/z unit tolerance. For high-energy collisional dissociation
(HCD) runs, precursor ions were activated using normalized collision energy (NCE) of 23% and a stepped
collision energy option (±5 steps). For collision-induced dissociation (CID) runs, precursor ions were
activated using NCE = 28% and an 0.25 activation q. For high capacity electron transfer dissociation (ETD)
runs [51], ion-ion interaction was set at 8 ms, with the reagent AGC target set at 7e5. All MS2 scans were
19 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
acquired at 60,000 resolution (at 200 m/z) over a 400-2000 m/z window, averaging 4 microscans and
setting the precursor AGC target value to 2e5.
Middle-down proteomics – Middle-down experiments were performed using in-house packed PLRP-
S (5 μm, 1000 Å, Agilent) trap (150 µm i.d., 25 mm) and analytical columns (75 µm i.d., 250 mm length).
The aforementioned vented tee setup was used, with the LC gradient being modified starting at 5% Mobile
phase B and washing the column at 90% B. Flow rates were not changed. The Orbitrap Fusion Lumos was
operated in “intact protein mode”, with ion transfer tube temperature set at 320 °C. MS1 scans were
recorded over a 500-2000 m/z window using 120,000 resolution (at 200 m/z), averaging 20 microscans
and with AGC target value of 2e5, applying a 4 V source-induced dissociation (SID). Data-dependent MS2
scans were acquired by selecting the top 2 most abundant precursors, isolated using the quadrupole over
a 3 m/z window. Dynamic exclusion was set at 60 s with ±1.5 m/z unit tolerance. For HCD runs, precursor
ions were activated using NCE=24% and a stepped collision energy option (±5% steps). For ETD runs, ion-
ion interaction was set at 20 ms, with the reagent AGC target set at 5e5. For electron transfer dissociation
– higher-energy collisional dissociation (EThcD) runs [52], ETD duration was reduced to 8 ms and
supplemented with HCD fragmentation applying a 15 V potential to the HCD cell. All MS2 scans were
acquired at 50,000 or 60,000 resolution (at 200 m/z) over a 400-2000 m/z window, averaging 4 microscans
and setting the precursor AGC target value to 2e5.
Bottom-up proteomics – For bottom-up proteomics, both trap (150 µm i.d., 25 mm length) and
analytical (75 µm i.d., 230 mm length) columns were in-house packed with C18 stationary phase
(ProntoSIL C18AQ, 3 µm, 120 Å, Bischoff Chromatography). The aforementioned vented tee setup was
used, with the LC gradient being modified starting at 2% Mobile phase B and washing the column at 85%
B. Flow rates were not changed. The Orbitrap Fusion Lumos was operated as follows: survey scans (MS1)
were collected at 120,000 resolution (at 200 m/z), over a 375-1800 m/z window. The MS1 AGC target was
set at 4e5 charges, with max injection time of 50 ms. Precursor ions were quadrupole isolated (isolation
20 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
window of 1.7 m/z) and subjected to two different types of tandem MS (MS2). Certain runs used EThcD to
fragment peptide cations, with ETD followed by HCD activation of ETD products at 15 V potential. ETD was
performed applying a base value of 100 ms for ion-ion interaction time (modulated using the precursor
charge-dependent function), a maximum injection time of 80 ms and a cation AGC target of 2e5 charges.
The reagent AGC was set at 5e5 charges. Alternatively, a decision tree [53] was used, where peptides
fragmented by HCD (NCE set at 29%, cation AGC set at 1e5, maximum injection time 70 ms) could undergo
a second isolation and MS2 fragmentation event performed via ETD if a neutral loss corresponding to the
detachment of a phosphate group was detected during the HCD MS2 event. ETD was performed with the
same parameters used for EThcD. All MS2 scans were recorded in the Orbitrap at 15,000 resolution (at
200 m/z) over a 110-2000 m/z window. All data-dependent scans were queued using a “total cycle time”
logic limiting the duty cycle to 2 s. The dynamic exclusion was applied, with 45 s duration and 10 ppm
precursor tolerance.
LC-MS/MS on a 21 Tesla FT-ICR mass spectrometer
Top-down proteomics – For analysis of intact unmodified, phosphorylated, and kinase-free rp53 by
denaturing top-down MS on the 21 T FT-ICR mass spectrometer at NHMFL [37, 54], reversed-phase liquid
chromatography was performed on an ACQUITY M-class chromatographic system (Waters, Milford, MA)
using the vented tee setup described above. Mobile phase A was composed of 4.8% acetonitrile (ACN) in
H2O with 0.3% formic acid (FA), while Mobile phase B consisted of 4.8% H2O in 47.5% ACN, 47.5% IPA, and
0.3% FA. Top-down experiments were performed using in-house packed PLRP-S (5 μm, 1000 Å, Agilent)
trap (150 µm i.d., 20 mm) and analytical columns (75 µm i.d., 250 mm length). The samples were first
loaded onto the trap column at 2.5 µL/min. flow rate, and after 10 min. of wash proteins were separated
at 300 nL/min. using the analytical column, maintained at room temperature, over gradient of Mobile
phase B from 5 to 50% in 2.5 min., 50% to 60% in 22.5 min., 60% to 90% in 25 min., and 90% to 5% in 5
min. (total run time: 55 min.). The outlet of the column was coupled to a homemade nanoESI source,
21 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
where ionization was reached using a pulled capillary emitter (15 µm i.d., New Objective) connected to a
high voltage union (potential set at 1.8-2 kV). The 21 T FT-ICR was operated in “proteomics mode”, with
ion transfer tube temperature set at 320 °C. Broadband MS1 scans were recorded over a 700-2000 m/z
window using 150,000-300,000 resolution (at 400 m/z), averaging 2-4 microscans and with AGC target
value of 3e6, applying a 15 V source-induced dissociation (SID). Isolated MS1 scans were recorded over an
m/z window of 950-1250 (centered around the rp53 44+ charge state), using 300,000 resolution (at 400
m/z), averaging 4 microscans and with AGC target value of 2.25e6, applying a 15 V source-induced
dissociation (SID). MS2 scans were acquired by targeting a given charge state or selecting the top-2 most
abundant precursors within a 15 m/z isolation window. Dynamic exclusion was set at 120 s with ±3 m/z
unit tolerance. For CID runs, precursor ions were activated using NCE=35% and an 0.25 activation q. For
front-end ETD (fETD) runs [55], ion-ion interaction time was set at 7.5-10 ms, with 10-15 multiple fills
within the multipole storage device (MSD, [37]) and the reagent AGC target set at 4e5. All MS2 scans were
acquired at 150,000 resolution (at 400 m/z) over a 300-2000 m/z window, averaging 2 microscans and
setting the precursor AGC target value to 6e5.
Data analysis
Top-down proteomics – Data acquired from intact rp53 by denaturing top-down LC-MS/MS were
analyzed as follows: for the medium-resolution FTMS1 spectra of unmodified or phosphorylated rp53
acquired at NRTDP, precursor average mass was determined from the MS1 spectrum by BioPharma Finder
(v3.0, Thermo Fisher Scientific) using the ReSpect™ algorithm with a target mass of 46 kDa, a maximum
charge of 100+, and a minimum of 6 adjacent charges over an m/z range of 500-2000. For the high-
resolution FTMS1 spectra of unmodified or phosphorylated rp53 acquired at NHMFL, precursor
monoisotopic masses were determined by the Xtract algorithm in Xcalibur (v. 3.0.57, Thermo Fisher
Scientific), with S/N ratio of 3 and a maximum allowed charge of 60+. MS2 fragment ion monoisotopic
masses from both experiments were obtained by use of Xtract, with S/N ratio of 3 and a maximum allowed
22 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
charge of 30+. Fragment ion masses were then compared against the rp53 sequence by ProSight Lite (v1.4,
http://prosightlite.northwestern.edu/, [35, 36]), with a global 10 ppm fragment ion mass tolerance and
(f)ETD (c,z), HCD (b,y), or CID (b,y) fragmentation specified as applicable.
Bottom-up proteomics – Data acquired from unmodified or Chk1-phosphorylated rp53 peptides by
tandem MS/MS were converted from .raw to Mascot .mgf files using the DTA Generator function in
COMPASS (v1.0.4.5, [56]), with precursor cleaning enabled and allowed precursor charges ranging from
2+ to 6+. The resulting files were searched against both the UniProt Escherichia coli database and a
database comprising the rp53 sequence using Mascot (v2.5.1, Matrix Science, www.matrixscience.com)
and the following parameters: tryptic peptides, up to 3 missed cleavages, precursor ion charges of 2-4+,
10 ppm precursor mass tolerance, and 0.2 Da fragment ion mass tolerance, with both fixed
(Carbamidomethyl (C, + 57.02 Da)) and variable (Phosphorylation (S/T/Y, + 79.96 Da), Oxidation (M, +
15.99 Da), Deamidation (N/Q, +0.98 Da), and Gln -> pyro-Glu (N-term Q, -17.03 Da)) modifications
specified. Replicate searches were also performed on each .raw file against the rp53 sequence database
using the MS Amanda [38] and Target Decoy PSM Validator nodes in Proteome Discoverer 1.4.1.14
(Thermo Scientific). Parameters were as follows: tryptic peptides, up to 3 missed cleavages, peak S/N
threshold of 3, precursor ion charges of 2-7+, 10 ppm precursor and fragment ion mass tolerances, strict
target FDR of 0.01, relaxed target FDR of 0.05, and both fixed (Carbamidomethyl (C, + 57.02 Da)) and
variable (Phosphorylation (S/T/Y, + 79.96 Da), Oxidation (M, + 15.99 Da), Deamidation (N/Q, +0.98 Da),
and Gln -> pyro-Glu (N-term Q, -17.03 Da)) modifications specified. The results from both search
workflows were interrogated manually for accuracy and duplication across replicate injections and
preparations. Those rp53 phosphorylation sites deemed to be high-confidence identifications were then
collated and incorporated into a custom search database generated by modifying the UniProt XML
sequence, variation, and post-translational modification database for human p53
23 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
(http://www.uniprot.org/uniprot/P04637) to incorporate the rp53 sequence and only the mapped Chk1
phosphorylation sites with Notepad ++ (v6.9, https://notepad-plus-plus.org/).
Middle-down proteomics –The rp53 + Chk1 XML file was used to generate a search database in
ProSightPC 4.0 (Thermo Fisher Scientific), against which the .raw files from the middle-down MS
experiments could be analyzed using the High Throughput Wizard function. Each .raw file was processed
using the Xtract algorithm and a custom set of parameters including a precursor mass tolerance of 0.5
m/z, a retention time tolerance of 2.0 min., a precursor and fragment ion S/N ratio of 3, and precursor
and fragment ion maximum charges of 80+ and 30+, respectively. The resulting .puf files were searched
against the rp53 + Chk1 database using a three-pronged search comprising an absolute mass search
(monoisotopic masses, 2.2 Da precursor mass tolerance, 10 ppm fragment mass tolerance), followed by
a biomarker search (monoisotopic masses, 10 ppm precursor and fragment ion mass tolerances, all
phosphorylation sites considered), followed by the same biomarker search with ‘delta m’ mode enabled
(as in [39]). Data acquired from unmodified rp53 were also searched against the rp53 + Chk1 database
using an identical strategy. The search results from each individual .raw file were saved as .puf files and
collated into a single .tdReport file using TDCollider v1.0 (http://proteinaceous.net/product/tdcollider/)
(Proteinaceous, Inc.). In addition to filtering the results by E-value (with those above 1 x 10-4 excluded),
each hit was manually validated for accuracy prior to inclusion in the final list of phospho-rp53 modforms.
Fragment ion coverage maps of each modform were created manually by exporting each hit to ProSight
Lite. Modform abundance histograms were generated by exporting the averaged MS1 spectrum of each
partial digest product (most abundant region of the respective chromatographic peak) from Xcalibur
QualBrowser (Thermo Fisher Scientific), performing mass deconvolution and relative modform
determination in BioPharma Finder (Thermo Fisher Scientific) using the Respect™ algorithm with target
mass range adjusted for each respective species, exporting the resulting average masses and relative
abundances to Microsoft Excel, normalizing to the most abundant peak, and manually generating
24 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
histograms from the final data. Modform distribution estimations for intact rp53 and the two largest
peptides analyzed by MD MS were estimated by multiplying the percent relative abundance of each
species by the potential number of modforms (2^N) for 0-8 modification sites, then summing the results
(Table S3).
Experimental Design and Statistical Rationale
For the analysis of intact rp53 by denaturing top-down LC-MS/MS, a total of 5 biological replicate
preparations of purified protein were analyzed, with up to 6 technical replicates per preparation. This
number was required due to the pronounced instability of rp53 in MS-compatible buffers, which required
substantial optimization of the top-down MS workflow. For the analysis of in vitro phosphorylated rp53,
in addition to the kinase-free controls prepared in parallel to account for any chemical adducts due to the
kinase reaction, a total of 3 biological replicate preparations and up to 5 technical replicates per
preparation were analyzed by top-down, middle-down, and bottom-up mass spectrometry. For the latter
two methods, partial or complete proteolytic digestion was performed on samples which had previously
been subjected to top-down analysis, in addition to technical replicate samples generated in parallel. This
was done to determine the degree of variation in Chk1-derived recombinant p53 phosphorylation
patterns between biological and technical replicates, as well as to directly identify which phosphorylation
sites were present within a specific phosphorylated rp53 modform population. For analysis of the bottom-
up data, only those peptides with a Mascot ion score of 30 or greater, an MSAmanda score of 150 or
greater, an E-value of 1 x 10-16 or below, and one or more manually verified localized phosphorylation
sites were included in the list of identified rp53 phosphorylation sites and the custom search database.
For the analysis of the middle-down data, only those species identified with an E-value 1 x 10-4 or below
and manually verified to have fragment ion coverage of sufficient quality to identify each rp53
phosphorylation site were considered.
25 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 1
Figure 1. Sequence of endogenous and recombinant human p53. A. Endogenous human p53 sequence, with functional domains and theoretical monoisotopic mass indicated. B. Recombinant human p53 (rp53) sequence, comprising the full-length endogenous human p53 sequence with an N-terminal 20-residue 6xHis tag and linker sequence, with corresponding changes in residue numbers and theoretical monoisotopic mass indicated.
26 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 2
Figure 2. Initial characterization of representative p53 modforms. A. MS1 spectrum of in vitro phosphorylated recombinant human p53 (rp53) obtained by TD LC-MS/MS on an Orbitrap Fusion Lumos mass spectrometer using an FT RP of 7,500 at 200 m/z and a custom short-transient method. Inset: Average masses of the 8 rp53 phospho-modforms deconvolved using the ReSpect™ algorithm in BioPharma Finder. B. Phosphorylation sites mapped by BU and MD LC-MS/MS on an Orbitrap Fusion Lumos mass spectrometer.
27 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 3
28 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 3. Integration of proteomic approaches to estimate p53 modform distributions. A. Relative abundance of all modforms of intact phospho-rp53 detected by top-down mass spectrometry. B. Top: Coverage map of phosphorylation sites identified by bottom-up and middle-down mass spectrometry; sites indicated with vertical lines. Bottom: Coverage map of the phosphorylation sites located within the largest N-terminal and C-terminal rp53 peptides identified by middle-down analysis. C. Relative abundance of all modforms of the N-terminal 269 amino acid residues of phospho-rp53 detected by middle-down mass spectrometry. D. Relative abundance of all modforms of the C-terminal 206 amino acid residues of phospho-rp53 detected by middle-down mass spectrometry. E. Relative abundance of all modforms of the N-terminal 138 amino acid residues of phospho-rp53 detected by middle-down mass spectrometry. F. Relative abundance of all modforms of the C-terminal 116 amino acid residues of phospho-rp53 detected by middle-down mass spectrometry.
29 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure 4
Figure 4. A combined strategy to characterize and constrain complex modform distributions. Using a combination of top-down (TD), middle-down (MD), and bottom-up (BU) strategies, one obtains different estimations of the maximal number of modforms present within an in vitro modified recombinant p53 population. One can reconcile the different estimates by applying an abundance filter and realizing that occupancies for individual PTMs can range from 100% to <0.01%; the detection of multiple sites by larger peptides creates a fractional multiplication which has PTM sites with very low occupancies contributing negligibly to the relative abundance of modforms as viewed by MS of the whole protein (which typically has a detection limit of modforms present at ~1% relative abundance or higher).
30 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
References
1. Aebersold, R., et al., How many human proteoforms are there? Nat Chem Biol, 2018. 14(3): p. 206-214. 2. Hollstein, M., et al., p53 mutations in human cancers. Science, 1991. 253(5015): p. 49-53. 3. Joerger, A.C. and A.R. Fersht, Structural biology of the tumor suppressor p53 and cancer- associated mutants. Adv Cancer Res, 2007. 97: p. 1-23. 4. Freed-Pastor, W.A. and C. Prives, Mutant p53: one name, many proteins. Genes Dev, 2012. 26(12): p. 1268-86. 5. Meek, D.W., Regulation of the p53 response and its relationship to cancer. Biochem J, 2015. 469(3): p. 325-46. 6. Horn, H.F. and K.H. Vousden, Coping with stress: multiple ways to activate p53. Oncogene, 2007. 26(9): p. 1306-16. 7. Menendez, D., A. Inga, and M.A. Resnick, The expanding universe of p53 targets. Nat Rev Cancer, 2009. 9(10): p. 724-37. 8. Vousden, K.H. and C. Prives, Blinded by the Light: The Growing Complexity of p53. Cell, 2009. 137(3): p. 413-31. 9. Gottlieb, E. and K.H. Vousden, p53 regulation of metabolic pathways. Cold Spring Harb Perspect Biol, 2010. 2(4): p. a001040. 10. Loewer, A., et al., Basal dynamics of p53 reveal transcriptionally attenuated pulses in cycling cells. Cell, 2010. 142(1): p. 89-100. 11. Purvis, J.E., et al., p53 dynamics control cell fate. Science, 2012. 336(6087): p. 1440-4. 12. Murray-Zmijewski, F., E.A. Slee, and X. Lu, A complex barcode underlies the heterogeneous response of p53 to stress. Nat Rev Mol Cell Biol, 2008. 9(9): p. 702-12. 13. Kruse, J.P. and W. Gu, Modes of p53 regulation. Cell, 2009. 137(4): p. 609-22. 14. Gu, B. and W.G. Zhu, Surf the post-translational modification network of p53 regulation. Int J Biol Sci, 2012. 8(5): p. 672-84. 15. DeHart, C.J., et al., Extensive post-translational modification of active and inactivated forms of endogenous p53. Mol Cell Proteomics, 2014. 13(1): p. 1-17. 16. Prabakaran, S., et al., Post-translational modification: nature's escape from genetic imprisonment and the basis for dynamic information encoding. Wiley Interdiscip Rev Syst Biol Med, 2012. 4(6): p. 565-83. 17. Leney, A.C., et al., Elucidating crosstalk mechanisms between phosphorylation and O- GlcNAcylation. 2017. 114(35): p. E7255-E7261. 18. Smith, L.M., N.L. Kelleher, and P. Consortium for Top Down, Proteoform: a single term describing protein complexity. Nat Methods, 2013. 10(3): p. 186-7. 19. Prabakaran, S., et al., Comparative analysis of Erk phosphorylation suggests a mixed strategy for measuring phospho-form distributions. Mol Syst Biol, 2011. 7: p. 482. 20. DeHart, C.J., D.H. Perlman, and S.J. Flint, Impact of the adenoviral E4 Orf3 protein on the activity and posttranslational modification of p53. J Virol, 2015. 89(6): p. 3209-20. 21. Compton, P.D., N.L. Kelleher, and J. Gunawardena, Estimating the Distribution of Protein Post- Translational Modification States by Mass Spectrometry. J Proteome Res, 2018. 17(8): p. 2727- 2734. 22. Toby, T.K., L. Fornelli, and N.L. Kelleher, Progress in Top-Down Proteomics and the Analysis of Proteoforms. Annu Rev Anal Chem (Palo Alto Calif), 2016. 9(1): p. 499-519. 23. Ntai, I., et al., Precise characterization of KRAS4b proteoforms in human colorectal cells and tumors reveals mutation/modification cross-talk. Proc Natl Acad Sci U S A, 2018. 115(16): p. 4140-4145.
31 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
24. Wildburger, N.C., et al., Diversity of Amyloid-beta Proteoforms in the Alzheimer’s Disease Brain. Scientific Reports, 2017. 7(1): p. 9520. 25. Toby, T.K., et al., Proteoforms in Peripheral Blood Mononuclear Cells as Novel Rejection Biomarkers in Liver Transplant Recipients. Am J Transplant, 2017. 17(9): p. 2458-2467. 26. Fornelli, L., et al., Accurate Sequence Analysis of a Monoclonal Antibody by Top-Down and Middle-Down Orbitrap Mass Spectrometry Applying Multiple Ion Activation Techniques. Anal Chem, 2018. 90(14): p. 8421-8429. 27. Wu, C., et al., A protease for 'middle-down' proteomics. Nature Methods, 2012. 9: p. 822. 28. Cannon, J., et al., High-throughput middle-down analysis using an orbitrap. J Proteome Res, 2010. 9(8): p. 3886-90. 29. Goudelock, D.M., et al., Regulatory interactions between the checkpoint kinase Chk1 and the proteins of the DNA-dependent protein kinase complex. J Biol Chem, 2003. 278(32): p. 29940-7. 30. Shieh, S.Y., et al., The human homologs of checkpoint kinases Chk1 and Cds1 (Chk2) phosphorylate p53 at multiple DNA damage-inducible sites. Genes Dev, 2000. 14(3): p. 289-300. 31. Craig, A.L., et al., The MDM2 ubiquitination signal in the DNA-binding domain of p53 forms a docking site for calcium calmodulin kinase superfamily members. Mol Cell Biol, 2007. 27(9): p. 3542-55. 32. Ou, Y.H., et al., p53 C-terminal phosphorylation by CHK1 and CHK2 participates in the regulation of DNA-damage-induced C-terminal acetylation. Mol Biol Cell, 2005. 16(4): p. 1684-95. 33. Ayed, A., et al., Latent and active p53 are identical in conformation. Nat Struct Biol, 2001. 8(9): p. 756-60. 34. Fornelli, L., et al., Advancing Top-down Analysis of the Human Proteome Using a Benchtop Quadrupole-Orbitrap Mass Spectrometer. J Proteome Res, 2017. 16(2): p. 609-618. 35. Fellers, R.T., et al., ProSight Lite: graphical software to analyze top-down mass spectrometry data. Proteomics, 2015. 15(7): p. 1235-8. 36. DeHart, C.J., et al., Bioinformatics Analysis of Top-Down Mass Spectrometry Data with ProSight Lite. Methods Mol Biol, 2017. 1558: p. 381-394. 37. Hendrickson, C.L., et al., 21 Tesla Fourier Transform Ion Cyclotron Resonance Mass Spectrometer: A National Resource for Ultrahigh Resolution Mass Analysis. J Am Soc Mass Spectrom, 2015. 26(9): p. 1626-32. 38. Dorfer, V., et al., MS Amanda, a Universal Identification Algorithm Optimized for High Accuracy Tandem Mass Spectra. Journal of Proteome Research, 2014. 13(8): p. 3679-3684. 39. Zamdborg, L., et al., ProSight PTM 2.0: improved protein identification and characterization for top down mass spectrometry. Nucleic Acids Res, 2007. 35(Web Server issue): p. W701-6. 40. Compton, P.D., et al., On the scalability and requirements of whole protein mass spectrometry. Anal Chem, 2011. 83(17): p. 6868-74. 41. Kastenhuber, E.R. and S.W. Lowe, Putting p53 in Context. Cell, 2017. 170(6): p. 1062-1078. 42. Knights, C.D., et al., Distinct p53 acetylation cassettes differentially influence gene-expression patterns and cell fate. J Cell Biol, 2006. 173(4): p. 533-44. 43. Roy, S. and M. Tenniswood, Site-specific acetylation of p53 directs selective transcription complex assembly. J Biol Chem, 2007. 282(7): p. 4765-71. 44. Smeenk, L., et al., Role of p53 serine 46 in p53 target gene regulation. PLoS One, 2011. 6(3): p. e17574. 45. Sykes, S.M., et al., Acetylation of the p53 DNA-binding domain regulates apoptosis induction. Mol Cell, 2006. 24(6): p. 841-51. 46. Fischer, M., Census and evaluation of p53 target genes. Oncogene, 2017. 36(28): p. 3943-3956. 47. Lee, C.W., et al., Graded enhancement of p53 binding to CREB-binding protein (CBP) by multisite phosphorylation. Proc Natl Acad Sci U S A, 2010. 107(45): p. 19290-5.
32 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
48. Shevchenko, A., et al., Mass spectrometric sequencing of proteins silver-stained polyacrylamide gels. Anal Chem, 1996. 68(5): p. 850-8. 49. Schneider, C.A., W.S. Rasband, and K.W. Eliceiri, NIH Image to ImageJ: 25 years of image analysis. Nat Methods, 2012. 9(7): p. 671-5. 50. Wessel, D. and U.I. Flugge, A method for the quantitative recovery of protein in dilute solution in the presence of detergents and lipids. Anal Biochem, 1984. 138(1): p. 141-3. 51. Riley, N.M., et al., Enhanced Dissociation of Intact Proteins with High Capacity Electron Transfer Dissociation. J Am Soc Mass Spectrom, 2016. 27(3): p. 520-31. 52. Brunner, A.M., et al., Benchmarking multiple fragmentation methods on an orbitrap fusion for top-down phospho-proteoform characterization. Anal Chem, 2015. 87(8): p. 4152-8. 53. Bailey, D.J., et al., Instant spectral assignment for advanced decision tree-driven mass spectrometry. Proc Natl Acad Sci U S A, 2012. 109(22): p. 8411-6. 54. Anderson, L.C., et al., Identification and Characterization of Human Proteoforms by Top-Down LC-21 Tesla FT-ICR Mass Spectrometry. J Proteome Res, 2017. 16(2): p. 1087-1096. 55. Earley, L., et al., Front-end electron transfer dissociation: a new ionization source. Anal Chem, 2013. 85(17): p. 8385-90. 56. Wenger, C.D., et al., COMPASS: a suite of pre- and post-search proteomics software tools for OMSSA. Proteomics, 2011. 11(6): p. 1064-74.
33 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Supplemental Figures
Figure S1. Preparing a representative p53 modform population. A. Silver nitrate stained gel image showing the elution profile of recombinant human p53 (rp53) purified from an E. coli expression system by cobalt affinity chromatography. B. Stained and colorized gel image showing the presence of phosphorylation on rp53 after in vitro modification with Chk1 kinase. ProQ (green), phosphorylated protein; SYPRO Ruby (red), total protein.
34 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure S2. Analysis of intact recombinant human p53 by top-down LC-MS/MS. A. Medium resolution intact mass (MS1) spectrum of rp53 obtained on an Orbitrap Fusion Lumos mass spectrometer using an FT Resolving Power (RP) of 7,500 at 200 m/z and a custom short-transient method. Inset: Expansion of the 48+ charge state of rp53. B. High-resolution MS1 spectrum of rp53 obtained on a 21 Tesla FT-ICR mass spectrometer using an FT RP of 150,000 at 400 m/z. Inset: Expansion of the 48+ charge state of rp53, showing baseline isotopic resolution obtained at 21 Tesla.
35 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure S3. MS2 characterization of intact recombinant human p53 on a chromatographic timescale. A. Fragment ion (MS2) spectrum (top) and graphical fragment ion map (bottom) obtained from intact rp53 by collision-induced dissociation (CID) activation on an Orbitrap Fusion Lumos mass spectrometer. B. MS2 spectrum (top) and graphical fragment ion map (bottom) obtained from intact rp53 by front-end ETD (fETD) activation on a 21 Tesla FT-ICR mass spectrometer.
36 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure S4. Visualization of the rp53 phospho-modform distribution by whole protein mass spectrometry. A. Zoomed-in view of the 44+ charge state of in vitro phosphorylated rp53 MS1 spectra obtained at medium resolution on an Orbitrap Fusion Lumos mass spectrometer. B. Isolated MS1 spectra of the 44+ charge state of in vitro phosphorylated rp53 obtained at high resolution on a 21 T FT-ICR mass spectrometer. For both MS1 spectra, peaks resulting from the addition of up to 8 phosphorylations are colored blue and labeled according to the number of modifications.
37 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure S5. Sequential elution of intact phosphorylated p53 modforms. Zoomed-in view of the 44+ charge state of in vitro phosphorylated rp53 MS1 spectra acquired at medium resolution on an Orbitrap Fusion Lumos, sampled at sequential points along the chromatographic elution peak. RT, retention time.
38 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
39 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure S6. Phosphorylation sites localized to the rp53 termini by middle-down mass spectrometry. A. Top: Coverage map of phosphorylation sites identified by bottom-up and middle-down mass spectrometry; sites indicated with vertical lines. Bottom: Coverage map of the phosphorylation sites located within the smallest N-terminal and C-terminal rp53 partial digestion products identified by middle- down analysis. B. Relative abundance of all modforms of the N-terminal 42 amino acid residues of phospho-rp53 detected by middle-down mass spectrometry. C. Relative abundance of all modforms of the N-terminal 42 amino acid residues of phospho-rp53 detected by middle-down mass spectrometry. C. Relative abundance of all modforms of the C-terminal 87 amino acid residues of phospho-rp53 detected by middle-down mass spectrometry. D. Phosphorylation sites and combinations detected within the N- terminal 42 amino acid residues of phospho-rp53 by middle-down mass spectrometry. E, F. Phosphorylation sites and combinations detected within the C-terminal 87 amino acid residues of phospho-rp53 by middle-down mass spectrometry.
40 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
41 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure S7. Visualizing phosphorylated sites within N-terminal regions of rp53 by middle-down mass spectrometry. A. MS1 spectrum of the 29.8 kDa partial digestion product comprising the N-terminal 269 amino acid residues of Chk1 phosphorylated rp53 (top), with MS2 fragment ion map of the corresponding sequence and a hypothetical phosphorylation site (bottom). Inset: Zoomed-in view of the 53+ charge state, showing evidence for the presence of up to 8 phosphorylated sites. B. MS1 spectrum of the 14.8 kDa partial digestion product comprising the N-terminal 138 amino acid residues of Chk1 phosphorylated rp53 (top), with MS2 fragment ion map of the corresponding sequence and an example phosphorylation site (bottom). Inset: Zoomed-in view of the 18+ charge state, showing evidence for the presence of up to 4 phosphorylated sites.
42 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
43 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Figure S8. Visualizing phosphorylated sites within C-terminal regions of rp53 by middle-down mass spectrometry. A. MS1 spectrum of the 23.4 kDa partial digestion product comprising the C- terminal 206 amino acid residues of Chk1 phosphorylated rp53 (top), with MS2 fragment map showing partial characterization and example phosphorylation sites (bottom). Inset: Zoomed-in view of the 36+ charge state, showing evidence for the presence of up to 5 phosphorylated sites. B. MS1 spectrum of the 13.2 kDa partial digestion product comprising the C-terminal 116 amino acid residues of Chk1 phosphorylated rp53 (top), with MS2 fragment ion map of the corresponding sequence (bottom). Inset: Zoomed-in view of the 20+ charge state, showing evidence for the presence of up to 4 phosphorylated sites.
44 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Supplemental Tables
Residue Position Residue Position Ser 11 Thr 190 Ser 19 Ser 203 Ser 26 Ser 205 Ser 29 Tyr 225 Ser 35 Thr 231 Thr 38 Ser 235 Ser 40 Thr 273 Ser 57 Thr 276 Ser 110 Ser 289 Ser 114 Ser 323 Ser 119 Thr 324 Thr 122 Thr 332 Ser 126 Ser 333 Ser 136 Ser 334 Thr 138 Thr 349 Ser 141 Ser 382 Thr 143 Ser 386 Thr 160 Thr 407 Ser 169 Ser 412 Ser 186
Table S1. Phosphorylation sites mapped to rp53 by bottom-up mass spectrometry.
45 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Species Observed Mass (Da) Theoretical Mass (Da) Error (Da) Error (ppm) Intact rp53 45684.86 45685.31 0.45 9.84999336 Intact rp53 + 1 Pi 45764.08 45765.29 1.21 26.4392512 Intact rp53 + 2 Pi 45843.84 45845.27 1.43 31.1918765 Intact rp53 + 3 Pi 45923.43 45925.25 1.82 39.6296155 Intact rp53 + 4 Pi 46003.48 46005.23 1.75 38.0391534 Intact rp53 + 5 Pi 46083.25 46085.21 1.96 42.5299136 Intact rp53 + 6 Pi 46162.52 46165.19 2.67 57.8357849 Intact rp53 + 7 Pi 46242.82 46245.17 2.35 50.8161177 Intact rp53 + 8 Pi 46320.47 46323.13 2.66 57.4227173
Species Observed Mass (Da) Theoretical Mass (Da) Error (Da) Error (ppm) 29.8 kDa peptide (AA 1-270) 29728.86 29731.24 2.38 79.9401152 29.8 kDa peptide + 1 Pi 29808.51 29811.22 2.71 90.9787489 29.8 kDa peptide + 2 Pi 29889.58 29891.2 1.62 54.3246215 29.8 kDa peptide + 3 Pi 29968.24 29971.18 2.94 97.9560823 29.8 kDa peptide + 4 Pi 30047.35 30051.16 3.81 126.79679 29.8 kDa peptide + 5 Pi 30128.71 30131.14 2.43 80.8108115 29.8 kDa peptide + 6 Pi 30208.41 30211.12 2.71 89.6322488 29.8 kDa peptide + 7 Pi 30288.83 30291.1 2.27 74.9369245 29.8 kDa peptide + 8 Pi 30369.81 30371.08 1.27 41.6694722
Species Observed Mass (Da) Theoretical Mass (Da) Error (Da) Error (ppm) 23.4 kDa peptide (AA 206-412) 23336.09 23338.24 2.15 92.1234849 23.4 kDa peptide + 1 Pi 23415.09 23418.22 3.13 133.656614 23.4 kDa peptide + 2 Pi 23497.03 23498.2 1.17 49.7910478 23.4 kDa peptide + 3 Pi 23578.05 23580.2 2.15 91.1781919 23.4 kDa peptide + 4 Pi 23657.21 23660.18 2.97 125.527363 23.4 kDa peptide + 5 Pi 23737.46 23740.16 2.7 113.731331
Species Observed Mass (Da) Theoretical Mass (Da) Error (Da) Error (ppm) 14.8 kDa peptide (AA 1-139) 14883.31 14884.42 1.11 74.8659294 14.8 kDa peptide + 1 Pi 14962.80 14964.4 1.60 107.194512 14.8 kDa peptide + 2 Pi 15042.71 15044.38 1.67 111.267152 14.8 kDa peptide + 3 Pi 15121.39 15124.36 2.97 196.524328 14.8 kDa peptide + 4 Pi 15202.10 15204.34 2.24 147.480506
Species Observed Mass (Da) Theoretical Mass (Da) Error (Da) Error (ppm) 13.2 kDa peptide (AA 296-412) 13159.84 13159.69 -0.15 -11.7576145 13.2 kDa peptide + 1 Pi 13239.38 13239.67 0.29 21.6176791 13.2 kDa peptide + 2 Pi 13319.47 13319.65 0.18 13.2411278 13.2 kDa peptide + 3 Pi 13398.66 13399.63 0.97 72.305518 13.2 kDa peptide + 4 Pi 13478.09 13479.61 1.52 112.702051
Table S2. Deconvolution results from TD and MD MS of phospho-rp53 modforms
46 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Number of Estimated Relative Intact rp53 Number of Potential Modform Modform Modification Modforms Distribution Abundance (%) Sites (N) (2^N) (%)*(2^N) 0.03 0 0 0 0.45 1 2 1 1.00 2 4 4 0.93 3 8 7 0.50 4 16 8 0.32 5 32 10 0.18 6 64 12 0.12 7 128 15 0.04 8 256 10 Total: 68
Number of Estimated Relative 29.8 kDa Number of Potential Modform Peptide Modform Modification Modforms Distribution Abundance (%) Sites (N) (2^N) (%)*(2^N) 0.06 0 0 0 0.64 1 2 1 0.59 2 4 2 1.00 3 8 8 0.67 4 16 11 0.94 5 32 30 0.34 6 64 22 0.26 7 128 33 0.05 8 256 13 Total: 120
Number of Estimated Relative 23.4 kDa Number of Potential Modform Peptide Modform Modification Modforms Distribution Abundance (%) Sites (N) (2^N) (%)*(2^N) 0.04 0 0 0 0.21 1 2 0 0.63 2 4 3 1 3 8 8 0.6 4 16 10 0.15 5 32 5 Total: 25
Table S3. Estimation of phospho-rp53 modform distributions
47