1 a Multi-Modal Proteomics Strategy for Characterizing Posttranslational Modifications of Tumor Suppressor P53 Reveals Many Site

bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. A multi-modal proteomics strategy for characterizing posttranslational modifications of tumor suppressor p53 reveals many sites but few modified forms Caroline J. DeHart1, Luca Fornelli1, Lissa C. Anderson2, Ryan T. Fellers1, Dan Lu3, Christopher L. Hendrickson2,4, Galit Lahav3, Jeremy Gunawardena3, and Neil L. Kelleher1,5* Summary Post-translational modifications (PTMs) are found on most proteins, particularly on “hub” proteins like the tumor suppressor p53, which has over 100 possible PTM sites. Substantial crosstalk between PTM sites underlies the ability of such proteins to integrate diverse signals and coordinate downstream responses. However, disentangling the combinatorial explosion in global PTM patterns across an entire protein (“modforms”) has been challenging, as conventional peptide-based mass spectrometry strategies (so-called “bottom-up” MS) destroy such global correlations. Alternatively, direct analysis of intact and modified proteins using “top-down” MS retains global information. Here, we applied both strategies to recombinant p53 phosphorylated in vitro with Chk1 kinase, which exhibited 41 modified sites by bottom- up MS, but no more than 8 modified sites per molecule detected by top-down MS. This observation that many low-abundance modifications comprise relatively few modforms above a 1% threshold indicates that endogenous p53 PTM complexity may be more definable than previously thought. Keywords: p53, top-down proteomics, post-translational modifications, modforms, 21 T FT-ICR 1 National Resource for Translational and Developmental Proteomics, Northwestern University, Evanston, IL, 60208, USA 2 Ion Cyclotron Resonance Program, National High Magnetic Field Laboratory, Tallahassee, FL, 32310, USA 3 Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA 4 Department of Chemistry and Biochemistry, Florida State University, Tallahassee, FL, 32310, USA 5 Departments of Chemistry, Molecular Biosciences and the Division of Hematology-Oncology, Northwestern University, Evanston, IL, 60208, USA * Correspondence: [email protected] 1 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. Introduction The development of nucleic acid sequencing technologies has domesticated the genome and revealed the unexpected richness of RNA biology. Proteins, however, remain central players in most cellular processes as the vast complexity of the proteome becomes better appreciated through the application of emerging technologies [1]. Aside from somatic mutations in coding sequences and alternative splicing, most proteins are also post-translationally modified, often on multiple sites. This is particularly the case for “hub” proteins of key cellular pathways, which integrate diverse input signals and help orchestrate multiple downstream responses. The tumor suppressor p53 provides a potent example of proteomic complexity. p53 is a master cellular regulator and frequently mutated in human cancers [2-4]. It integrates input signals from myriad stressors, including genotoxic damage, oncogenesis, hypoxia, and viral infection [5, 6] into highly orchestrated downstream responses, such as apoptosis, mitosis, and metabolic changes [7-9]. It does so in a highly dynamic [10, 11] and stressor- and cell-type-specific manner [12]. The activity and stability of p53 is associated with hundreds of individual post-translational modifications (PTMs) reported to date [13-15], which are known to be regulated by numerous enzymes [13, 14] within stressed and unstressed cells. These PTMs have been identified on over 100 sites within the N-terminal transactivation domains (AA 1-92), central DNA-binding domain (AA 102-292), and C-terminal tetramerization and basic domains (AA 292-393) of the protein (Figure 1A) [3, 13, 14]. For p53 and similar proteins, it is likely that extensive crosstalk exists between PTM sites on different domains to integrate upstream signals and downstream responses. Such PTM crosstalk offers a way to encode information about context and function in the pattern of modifications across the whole protein, as suggested by the concept of combinatorial “PTM codes”, of which the histone code is best known [16]. Crosstalk between modified sites could arise through protein folding, which can bring widely separated parts of a peptide sequence together in space, as well as through allosteric conformational changes that 2 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. could alter binding partners [17]. If such crosstalk underlies information encoding and integrative function, then it becomes essential to know the patterns of modification across the entire protein, as well as the stoichiometric abundance and distribution of these patterns within a given biological context. This presents a formidable challenge, especially for extensively modified proteins such as p53, which could theoretically exhibit over 1030 modified forms. While these cannot all exist, given the limitation of protein copy number in cells or entire organisms [1], the sheer scale of combinatorial PTMs highlights the challenge of determining which modified p53 forms are actually present within a given biological context. The term “proteoform” covers all changes of a given protein molecule arising after transcription, including alternative spliceforms, chemical adducts, proteolytic cleavages, and PTMs [1, 18]. In the context of information encoding by PTMs, it is more helpful to focus on “modforms” [19], or those proteoforms arising from reversible enzymatic modification and de-modification, such as phosphorylation, acetylation and ubiquitinylation. The number of modforms that a protein can exhibit grows exponentially with the number of modification sites. An individual protein molecule will exhibit one modform, while a population of protein molecules will exhibit a distribution of modforms in varying proportions. This “modform distribution” offers the most complete information about global PTM crosstalk and a protein's capability to encode information. Post-translational modifications of p53 are typically analyzed by PTM-specific antibodies, which provide high sensitivity but can only identify a few modifications. Traditional proteomic approaches employing proteolytic digestion and analysis of the resulting peptides offer another approach for PTM detection and are commonly referred to as “bottom-up” (BU) mass spectrometry (MS). This latter strategy excels at high-throughput, high-confidence PTM characterization of p53 populations, most notably in a pair of recent studies which performed comprehensive PTM cataloging of p53 isolated from normal human fibroblasts and identified 230 PTMs across 101 amino acid residues of p53 [15, 20]. However, proteolytic digestion severs the linkage of combinatorial PTM patterns, losing information on PTM 3 bioRxiv preprint doi: https://doi.org/10.1101/455527; this version posted October 29, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. crosstalk between functional domains and making accurate determination of complex modform distributions impossible [21]. An alternative strategy, analysis of intact modforms by “top-down” (TD) MS [22], offers a complementary approach. By keeping the complete protein primary structure intact, the PTM landscape of a proteoform can be characterized in precise molecular detail, and correlations between widely separated modification sites can be maintained [23-25]. In recent work, we showed that TD MS offers a significant advantage for estimating modform distributions [19, 21], especially when used in combination with BU MS and other “middle-down” (MD) MS approaches, which employ partial enzymatic digestion to provide ~5-25 kDa protein fragments [26-28]. However, analysis of intact endogenous human p53 modforms by TD MS presents a formidable technical challenge, due to both the molecular weight (43.6 kDa) and the sheer number of potential modforms. A method by which the number and identity of PTMs on the most abundant p53 modforms could be characterized within a specific biological context would therefore be immensely valuable to the fields of p53 biology and translational proteomics. To explore the capabilities of TD MS for p53 modform characterization, we focused on recombinant human p53 (hereafter, “rp53”), phosphorylated in vitro by the kinase Chk1, which is known to target p53 in vivo [29-32]. Despite showing far less PTM complexity than endogenous p53, this representative model reflects the patterns of modification produced by typical regulatory enzymes. Although previous studies of Chk1 action on p53 have shown only 8 phosphorylation sites, we mapped a total of 41 phosphorylation sites to rp53 by combined BU and MD MS. However, TD MS revealed no more than 8 phosphorylation sites on each intact molecule, within the

Load more