<<

Journal of Research

This document is confidential and is proprietary to the American Chemical Society and its authors. Do not copy or disclose without written permission. If you have received this item in error, notify the sender and delete all copies.

Progress on Identifying and Characterizing the Human Proteome: 2018-2019 Metrics from the HUPO

Journal: Journal of Proteome Research

Manuscript ID Draft

Manuscript Type: Perspective

Date Submitted by the n/a Author:

Complete List of Authors: Omenn, Gilbert; University of Michican Medical School, Center for Computational Medicine and Lane, Lydie; Swiss Institute of Bioinformatics, Overall, Christopher; The University of British Columbia, Departments of Oral Biological & Medical Sciences, and Biochemistry & Molecular ; The University of British Columbia, Departments of Oral Biological & Medical Sciences, and Biochemistry & Corrales, Fernando; Centro Nacional de Biotecnologia, Functional Schwenk, Jochen; KTH - Royal Institute of Technology, for Life Laboratory Paik, Young-Ki; Yonsei University, Yonsei Proteome Research Center Van Eyk, Jennifer; Cedars-Sinai Medical Center, Medicine Liu, Siqi; BGI-Shenzhen, Pennington, Stephen; University College Dublin, UCD Conway Insitute Snyder, Michael; Stanford University, Genetics Baker, Mark; Macquarie University, Biomedical Sciences Deutsch, Eric; Institute for ,

ACS Paragon Plus Environment Page 1 of 47 Journal of Proteome Research

1 2 3 4 5 6 7 PERSPECTIVE 8 9 10 11 12 13 Progress on Identifying and Characterizing the 14 15 16 17 18 Human Proteome: 2018-2019 Metrics from the 19 20 21 22 HUPO Human Proteome Project 23 24 25 26 27 Gilbert S. Omenn* × π, Lydie Lane∞, Christopher M. Overall, Fernando J. CorralesA, 28 29 30 Jochen M. SchwenkB, Young-Ki PaikC, Jennifer E. Van EykD, Siqi LiuE, Stephen 31 32 33 34 PenningtonH, Michael P. SnyderF, Mark S. BakerG, Eric W. Deutschπ 35 36 37 38 × Department of Computational Medicine and Bioinformatics, University of Michigan, 39 40 41 42 100 Washtenaw Avenue, Ann Arbor, Michigan 48109-2218, United States 43 44 45 ∞ CALIPHO Group, SIB Swiss Institute of Bioinformatics and Department of 46 47 48 49 Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CMU, 50 51 52 Michel-Servet 1, 1211 Geneva 4, Switzerland 53 54 55 56 57 58 1 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 2 of 47

1 2 3  4 Life Sciences Institute, Faculty of Dentistry, University of British Columbia, 2350 Health 5 6 7 Sciences Mall, Room 4.401, Vancouver, BC Canada V6T 1Z3 8 9 10 A 11 Centro Nacional de Biotecnologia (CSIC), Darwin 3, 28049, Madrid 12 13 14 B Science for Life Laboratory, KTH Royal Institute of Technology, Tomtebodavägen 15 16 17 18 23A, 17165 Solna, Sweden 19 20 21 22 C Yonsei Proteome Research Center, Room 425, Building #114, Yonsei University, 23 24 25 26 50 Yonsei-ro, Seodaemoon-ku, Seoul 120-749, South Korea 27 28 29 30 D Advanced Clinical BioSystems Research Institute, Cedars Sinai Precision 31 32 33 34 Laboratories, Barbra Streisand Women’s Heart Center, Cedars-Sinai Medical Center, 35 36 37 Los Angeles, CA 90048, United States 38 39 40 41 E 42 BGI Group-Shenzhen, Yantian District, Shenzhen, China 518083 43 44 45 46 H School of Medicine, University College Dublin, Conway Institute Belfield Dublin 4 47 48 49 50 F 51 Department of Genetics, Stanford University, Alway Building, 300 Pasteur Drive and 52 53 54 3165 Porter Drive, Palo Alto, 94304, United States 55 56 57 58 2 59 60 ACS Paragon Plus Environment Page 3 of 47 Journal of Proteome Research

1 2 3 G 4 Department of Biomedical Sciences, Faculty of Medicine & Health Sciences, 5 6 7 Macquarie University, 75 Talavera Rd, North Ryde, 2109 NSW 2109 Australia 8 9 10 11 π 12 Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109- 13 14 15 5263, United States 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 3 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 4 of 47

1 2 3 4 5 6 7 8 Corresponding Author: Gilbert S. Omenn, Department of Computational Medicine and 9 10 11 12 Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 13 14 15 48109-2218, United States, [email protected], 734-763-7583. 16 17 18 19 20 21 22 23 24 ABSTRACT: The Human Proteome Project (HPP) annually reports on progress made 25 26 27 28 throughout the field in credibly identifying and characterizing the complete human 29 30 31 parts list and making proteomics an integral part of multi- studies in medicine and 32 33 34 35 the life sciences. NeXtProt release 2019-01-11 contains 17 694 with strong 36 37 38 protein-level evidence (PE1), compliant with HPP Guidelines for Interpretation of MS Data 39 40 41 42 v2.1; these represent 89% of all 19 823 neXtProt predicted coding (all PE1, 2, 3, 43 44 45 4 proteins), up from 17 470 one year earlier. Conversely, the number of neXtProt PE2, 46 47 48 49 3, 4 proteins, termed the “missing proteins” (MPs), has been reduced from 2949 to 2129 50 51 52 since 2016 through efforts throughout the community, including the chromosome-centric 53 54 55 56 HPP. PeptideAtlas is the source of uniformly re-analyzed raw data 57 58 4 59 60 ACS Paragon Plus Environment Page 5 of 47 Journal of Proteome Research

1 2 3 4 for neXtProt; PeptideAtlas added 495 canonical proteins between 2018 and 2019, 5 6 7 especially from studies designed to detect hard-to-identify proteins. Meanwhile, the 8 9 10 11 has released version 18.1 with immunohistochemical evidence of 12 13 14 expression of 17 000 proteins and survival plots as part of the Pathology Atlas. Many 15 16 17 18 investigators apply multiplexed SRM-targeted proteomics for quantitation of organ- 19 20 21 specific popular proteins in studies of various human diseases. The 19 teams of the 22 23 24 25 Biology and Disease-driven B/D-HPP published a total of 160 publications in 2018, 26 27 28 bringing proteomics to a broad array of biomedical research. 29 30 31 32 33 KEYWORDS: neXtProt Protein Existence metrics, missing proteins (MPs), unannotated 34 35 36 37 protein existence 1 (uPE1), neXt-MP50 and CP50 Challenges, Human Proteome Project 38 39 40 (HPP) Guidelines 3.0, Chromosomal-HPP (C-HPP), Biology & Disease-HPP (B/D-HPP), 41 42 43 44 PeptideAtlas, Human Protein Atlas 45 46 47 48 49 50 51 52 53 54 55 56 57 58 5 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 6 of 47

1 2 3 4 Progress on the Human Proteome Parts List: neXtProt and PeptideAtlas 5 6 7 8 9 Since its launch in September 2010, the HUPO Human Proteome Project (HPP) has 10 11 12 provided a productive framework for international communication, collaboration, quality 13 14 15 16 assurance, guideline generation, data sharing and re-analysis, and acceleration of 17 18 19 progress in building and utilizing proteomic knowledge globally. The HPP has 50 20 21 22 23 collaborating research teams organized by chromosome (1-22, X, Y) and mitochondria 24 25 26 (C-HPP), biological processes and disease categories (B/D-HPP), and resource pillars 27 28 29 30 for antibody-based protein localization, mass spectrometry, knowledgebases, and 31 32 33 pathology. 34 35 36 Remarkable progress has been documented on the chromosome-centric human 37 38 39 40 proteome parts list, as officially curated by neXtProt1-7. Table 1 shows the increase from 41 42 43 13,975 PE1 proteins in 2012 to 17,694 in release 2019-01. This PE1 total now represents 44 45 46 47 89% of the PE1, 2, 3, 4 proteins predicted to be coded in the latest version of the human 48 49 50 (GRCh38). Conversely, the number of yet-to-be detected proteins, termed 51 52 53 54 missing proteins, are those whose evidence is limited to transcript expression (PE2), 55 56 57 58 6 59 60 ACS Paragon Plus Environment Page 7 of 47 Journal of Proteome Research

1 2 3 4 homology from other species (PE3), or models (PE4) (MPs), which have declined 5 6 7 from 5511 in 2012 to 2129 in 2019. Splice variants, sequence variants, and some post- 8 9 10 8 11 translational modifications (PTMs) are tabulated for each protein entry . We acknowledge 12 13 14 that neXtProt continues to carry a category PE5, which represents dubious or uncertain 15 16 17 18 genes, including many . However, the HPP in 2013 removed PE5 from the 19 20 21 denominator of all predicted proteins to be found3 and excluded these from the MP-50 22 23 24 7 25 Challenge . 26 27 28 As shown at the bottom of Table 1, there has been a corresponding increase in the 29 30 31 32 number of proteins in the human build of PeptideAtlas. PeptideAtlas provides an 33 34 35 essential HPP function by uniformly processing raw MS data/metadata registered in 36 37 38 9-10 39 ProteomeXchange through its Trans-Proteomic Pipeline (TPP) and applying the 40 41 42 communally-agreed HPP Guidelines 2.1 for Interpretation of Mass Spectrometry Data11. 43 44 45 46 The number of proteins in PeptideAtlas that are assessed as canonical has increased 47 48 49 from 12 509 to 16 293, despite the implementation of more stringent guidelines, which 50 51 52 accounts for the observed dip between 2014 and 2016. We present the main sources of 53 54 55 56 the increment of 495 canonical proteins from 2018 to 2019 in Figure 3, below. 57 58 7 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 8 of 47

1 2 3 4 5 6 7 Table 1. neXtProt Protein Evidence Levels from 2012 to 2019: Progress in Identifying 8 9 10 a b 11 PE1 Proteins and PE2,3,4 Missing Proteins 12 13 14 15 PE Level 2012-02 2013-09 2014-10 2016-01 2017-01 2018-01 2019-01 16 1: Evidence at protein level 13 975 15 646 16 491 16 518 17 008 17 470 17 694 17 2: Evidence at transcript 18 level 5205 3570 2647 2290 1939 1660 1548 19 3: Inferred from homology 218 187 214 565 563 452 510 20 4: Predicted 88 87 87 94 77 74 71 21 5: Uncertain or dubious 622 638 616 588 572 574 576 22 23 24 Human PeptideAtlas 12 509 13 377 14 928 14 569 15 173 15 798 16 293 25 canonical proteins 26 a PE1/PE1+2+3+4 = 17 694/19 823 = 89.3%; b PE 2+3+4 = 2129 “missing proteins” as of 2019-01. 27 28 29 30 31 32 In Figure 1 we account for the types of experimental evidence upon which neXtProt has 33 34 35 36 classified the human proteome. Of all 17 694 PE1 proteins, 16 600 are now based on 37 38 39 validated mass spectrometry results (Figure 1, green). Of these, 98% represent 40 41 42 43 canonical proteins in the human PeptideAtlas build. In addition, 1094 PE1 proteins 44 45 46 (yellow) were identified based upon protein-protein interactions (388), PTMs or proteolytic 47 48 49 50 processing (158), disease (123), Edman (90), 3D structure (50), 51 52 53 Ab-based data (46), or other biochemical studies (239). The 2129 PE2,3,4 MPs are 54 55 56 57 58 8 59 60 ACS Paragon Plus Environment Page 9 of 47 Journal of Proteome Research

1 2 3 4 comprised of 1688 with no MS data (red) plus 441 with insufficient or unconfirmed MS 5 6 7 data (orange) that were not compliant with HPP Guidelines 2.1; the full list can be 8 9 10 11 retrieved by query NXQ_00204. 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 Fig. 1. Pie charts showing distribution of human proteins based on the type of protein 38 39 40 evidence data. Shown here are the numbers of PE1 proteins (green plus yellow), PE2,3,4 41 42 43 44 MPs, and PE5 entries as of neXtProt releases 2018−01 and 2019-01. 45 46 47 48 The year-to-year changes in neXtProt HPP metrics are quite complex to track, due to 49 50 51 such factors as revisions of total gene entries in UniProt/SwissProt, reassessment of 52 53 54 55 evidence level in neXtProt upon review of additional data, and policy changes about 56 57 58 9 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 10 of 47

1 2 3 4 potential inclusion of claimed products of small open reading frames (smORFs), long non- 5 6 7 coding , or immunoglobulin genes. The flow chart illustrating all PE level transitions 8 9 10 11 from neXtProt release 2018-01 to release 2019-01 is presented in Figure 2 (right), along 12 13 14 with the previously published transition from 2017 to 2018 for comparison (left).7 15 16 17 18 During 2017, 431 MPs were promoted to PE1 and 44 new SwissProt proteins were 19 20 21 added as PE1s, while 3 PE1s were demoted to MPs and 10 PE1s were deleted. During 22 23 24 25 2018, 213 MPs and 5 PE5 entries were promoted to PE1 along with 78 new proteins, 26 27 28 while 65 PE1s were demoted and 7 PE1s deleted. This results in a net gain of 224 PE1 29 30 31 32 proteins and a net reduction of 57 PE2, 3, 4 MPs in the HPP neXtProt baseline for 2019. 33 34 35 Note especially that the number of MPs was inflated by 40 new entries in 2017 and 116 36 37 38 39 in 2018. Uncertainties in the reference produce a significant undertow 40 41 42 which impacts the global efforts to reduce the number of the MPs. 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 10 59 60 ACS Paragon Plus Environment Page 11 of 47 Journal of Proteome Research

1 2 3 4 5 6 7 8 Fig 2. These flowcharts depict the changes in neXtProt PE1−5 categories from release 9 10 11 12 2017−01 to 2018−01 (left) and from 2018-01 to 2019-01 (right). 13 14 15 16 Table 2 presents a detailed chromosome-by-chromosome accounting of the status of 17 18 19 the MPs search, elaborated from neXtProt 2019-01 20 21 22 23 [https://www.nextprot.org/about/protein-existence and 24 25 26 ftp://ftp..org/pub/current_release/Chr_reports/]. Also tabulated are the numbers 27 28 29 30 of functionally unannotated PE1 proteins (uPE1) of the human proteome12. Together the 31 32 33 PE2,3,4 MPs (NXQ_00204) and the uPE1 proteins (NXQ_00022) constitute a large part 34 35 36 37 of what has been termed the “Dark Proteome” (DP)13. 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 11 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 12 of 47

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 Table 2. Chromosome-by-Chromosome Status of Predicted Proteins in neXtProt 2019- 46 47 48 49 01 50 51 52 53 54 55 56 57 58 12 59 60 ACS Paragon Plus Environment Page 13 of 47 Journal of Proteome Research

1 2 3 4 Progress from the Chromosome-centric C-HPP 5 6 7 The C-HPP initiated a Next-MP50 challenge at the Sun Moon Lake Workshop following 8 9 10 11 the Taipei HUPO Congress in 2016. The aim was to galvanize the 24 chromosome teams 12 13 14 to each find 50 missing proteins as a step toward completing the human proteome parts 15 16 17 18 list. As shown in Table 2, that 50 MP target has been reached for Chromosomes 19, 1, 19 20 21 5, and 17. Chromosomes 2, 3, 10, and X have 43 to 47 net reduction of MPs. The Chr 22 23 24 25 17 team has published detailed analyses of the paths to reduce PE2,3,4 and increase 26 27 28 PE114-15. As depicted for the whole proteome in Figure 2 flowcharts, the dynamics of 29 30 31 32 promoting, demoting, gaining, and deleting protein entries in neXtProt is quite complex. 33 34 35 More detailed chromosome-by-chromosome analyses are planned for the coming year. 36 37 38 39 C-HPP investigators generated 10 articles for the 6th annual HPP Special Issue of J 40 41 42 Proteome Research (2018) that reported evidence for detection and characterization of 43 44 45 16 17 18 46 a total of 108 MPs : 13 from cerebrospinal fluid , 1 from mesenchymal stem cells , 5 47 48 49 from olfactory epithelium19, 1 from HeLa cells20, 3 from mitochondria21, 14 with the use of 50 51 52 multiple proteases for digestion22, 2 using LysargiNase11 which mirrors trypsin 53 54 55 56 specificity23, 6 from embryonic stem cells24, 30 membrane proteins25, and 9 from other 57 58 13 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 14 of 47

1 2 3 26 4 unusual/rare/stressed tissues . In addition, evidence for detection of 107 MPs was put 5 6 7 forward from re-analyses by MassIVE27, many of which are included in the 2019-01 8 9 10 11 PeptideAtlas build. 12 13 14 The limited numbers of MPs found each year by the C-HPP teams and the global 15 16 17 18 proteomics community reflect the increasing difficulty in devising and executing deep 19 20 21 discovery of MPs in the human proteome. More sensitive methods (including enrichment 22 23 24 25 techniques) and targeted analyses of understudied tissues and cells are needed. 26 27 28 Examples include organ-specific endothelium, hard connective tissues with cartilaginous 29 30 31 28 32 structures and cortical versus trabecular versus membranous bones , dental tissues like 33 34 35 dentin and odontoblasts29, and brain regions. In embryonic tissues, limited bursts of 36 37 38 39 factor expression and specific hydrolases likely account for many 40 41 42 differentiating aspects of each and tissue in the body. Identifying such temporally or 43 44 45 46 spatially rare proteins will require dedicated searches at precise developmental 47 48 49 windows—an ethical and practical challenge. Likewise, tissue responses to bacterial, 50 51 52 viral, and parasitic infections, or after injury may harbor MPs key for regeneration or repair 53 54 55 56 57 58 14 59 60 ACS Paragon Plus Environment Page 15 of 47 Journal of Proteome Research

1 2 3 4 of these specific tissues. Integration with the B/D-HPP teams in this task is highly 5 6 7 desirable. 8 9 10 11 Meanwhile, the C-HPP initiated the neXt-CP50 challenge for the functional 12 13 14 characterization of the dark proteome, that is, human proteins with no known or inferred 15 16 17 18 function. These proteins often have antibody-based tissue distribution and intracellular 19 20 21 localization in Human Protein Atlas, mass spectrometry-identified peptides in 22 23 24 25 PeptideAtlas, and other information. At the March 2018 launch of the neXt-CP50 26 27 28 challenge there were 1937 PE1,2,3,4 proteins with no known function, of which 1260 were 29 30 31 30 32 uPE1 proteins (1254 in Table 2 as of 2019-01). 33 34 35 An interesting example of functional annotation and unusual cell types is MALT1 in B 36 37 38 39 and T cells. MALT1 deficiency causes a rare immunodeficiency with ~50% reduced NFkB 40 41 42 activation. The function of this mutant protease was uncovered by TAILS N-terminal 43 44 45 46 positional proteomics. A highly selective molecular connector therapeutic compound was 47 48 49 able to graft together two domains of the mutant MALT1 to restore molecular stability in 50 51 52 the patient’s B and T cells, rescue NFkB activation, and provide MALT1 cleavage activity 53 54 55 56 in B and T cell antigen receptor activation31. Meanwhile, a computational approach to 57 58 15 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 16 of 47

1 2 3 4 predict functions of uPE1 proteins using I-TASSER and COFACTOR was introduced by 5 6 7 the Chromosome 17 team32 and subjected to blinded analyses of newly annotated 8 9 10 33 11 proteins in CAFA3 and in neXtProt-2019 . 12 13 14 Newly Identified Canonical Proteins in Peptide Atlas 15 16 17 18 PeptideAtlas34-35 reprocessed and incorporated data from 120 new human sample 19 20 21 22 datasets between the 2018-01 and 2019-01 builds. Not all newly deposited and released 23 24 25 MS datasets are automatically captured and incorporated into PeptideAtlas because, with 26 27 28 29 each additional dataset, the stringency applied to all data must be increased to maintain 30 31 32 a constant false discovery rate (FDR). Thus, only datasets that are sufficiently novel are 33 34 35 36 subjected to TPP reanalysis and are incorporated. These additional MS data resulted in 37 38 39 an increase in the number of PeptideAtlas proteins for which there are at least two 40 41 42 43 uniquely-mapping, non-nested 9+ amino acid peptides by 495 to a 2019-01 count of 16 44 45 46 293 (Table 1). Figure 3 depicts the contributions of the top 10 datasets to this increase 47 48 49 50 in the number of proteins, accounting for 292 (59%) of the 495. The three datasets 51 52 53 contributing the most canonical proteins (a total of 161) were PXD00973736, 54 55 56 57 58 16 59 60 ACS Paragon Plus Environment Page 17 of 47 Journal of Proteome Research

1 2 3 37 24 th 4 PXD010630 , and PXD009840 , all resulting from the 6 HPP special issue in this 5 6 7 journal (2018). It is impressive that major progress has been made with membrane 8 9 10 11 proteins. The PeptideAtlas reanalysis of data from the use of multiple proteases by Sun 12 13 14 et al36 yielded 73 new canonical proteins, far more than the 14 new PE1 proteins validated 15 16 17 18 by synthetic peptides and proposed in their original article. Note the important distinction 19 20 21 between new PE1 proteins and new canonical proteins in PeptideAtlas (which may 22 23 24 25 already be PE1 via other means). Neither neXtProt nor PeptideAtlas requires 26 27 28 confirmatory synthetic peptide MS results for their annual re-analyses of community MS 29 30 31 32 data for the HPP. Also, it is notable that the raw data for PXD009840 were accessed at 33 34 35 jPOST38 along with one other new canonical protein from a second jPOST dataset. No 36 37 38 39 39 MS datasets from iProX have been included in the current PeptideAtlas or neXtProt 40 41 42 releases, but there will be in the next release (see B/D-HPP, below). Also, PeptideAtlas 43 44 45 46 does not yet incorporate SRM results. 47 48 49 50 51 52 53 54 55 56 57 58 17 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 18 of 47

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 Figure 3. Top 10 newly-analyzed deposited ProteomeXchange datasets contributing to 27 28 29 the increased number of canonical proteins found in PeptideAtlas between 2018 and 30 31 32 33 2019 builds. Each dataset is labeled with the number of new canonical proteins, 34 35 36 ProteomeXchange identifier (PXD), reference citation22, 25, 40-47, and methods highlighted. 37 38 39 40 41 From the point of view of PeptideAtlas, the progress in pursuit of a complete human 42 43 44 proteome is depicted in Figure 4, using data from Table 1 since 2016, when the HPP 45 46 47 48 Guidelines 2.0 had been applied. In green on the left are the PeptideAtlas canonical 49 50 51 proteins, progressing to 16 293 in 2019. These proteins have at least two uniquely- 52 53 54 55 mapping, non-nested, peptides of 9 or more amino acids in PeptideAtlas. In yellow in the 56 57 58 18 59 60 ACS Paragon Plus Environment Page 19 of 47 Journal of Proteome Research

1 2 3 4 middle are the proteins classified as PE1 in neXtProt based on evidence from other 5 6 7 techniques (see Figure 1 and associated text); this number has been steadily decreasing 8 9 10 11 over the past four years. In red on the right are the PE2,3,4 MPs. The total has been 12 13 14 increasing at a small rate as understanding and annotation of the human proteome 15 16 17 18 improves each year. A simple extrapolation of increases in green and decreases in yellow 19 20 21 and red (based on 2016 to 2019 data) yields a convergence of all three on the year 2026. 22 23 24 25 Clearly, additional strategies need to be developed to accelerate progress to reach near- 26 27 28 complete human proteome coverage at a stringency required by the HPP Mass 29 30 31 32 Spectrometry Data Interpretation Guidelines 3.0 (see below). 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 19 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 20 of 47

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 Figure 4. Summary of the since 2016 of the proteome categories of PE1 28 29 30 proteins based on PeptideAtlas (PA) MS evidence (green), PE1 proteins based on non- 31 32 33 34 MS evidence (yellow), and PE2,3,4 MPs (red). Progress has been remarkably linear 35 36 37 since 2016. 38 39 40 41 HPP Mass Spectrometry Data Interpretation Guidelines 3.0 42 43 44 45 46 At a full-day workshop prior to the 21st Saint-Malo C-HPP Workshop in May 2019, the 47 48 49 HPP leadership gathered for an extended discussion of 25 open questions from the HPP 50 51 52 53 Knowledgebase Resource Pillar and other HPP teams relating to updates of the HPP 54 55 56 57 58 20 59 60 ACS Paragon Plus Environment Page 21 of 47 Journal of Proteome Research

1 2 3 11 4 Mass Spectrometry Data Interpretation Guidelines 2.1 and implementation of the 5 6 7 workflow by which the HPP confirms the of potential coding genes. Each of 8 9 10 11 these questions was debated, potential solutions listed, and consensus decisions 12 13 14 achieved by the participants. The result will be an update of the Guidelines from version 15 16 17 48 18 2.1 to version 3.0, with a manuscript describing the set of questions, potential solutions, 19 20 21 proposed consensus decisions, and the logic behind those proposals. For the checklist 22 23 24 25 required for submitted manuscripts, numbered items will be re-factored into logical 26 27 28 subgroups and authors will be required to provide page numbers on the checklist so 29 30 31 32 reviewers can see where specific guidelines are addressed in the manuscripts. The term 33 34 35 “extraordinary detection claims,” which appears to have caused some confusion, will be 36 37 38 39 replaced with “new PE1 claims.” The revised guidelines will refine how 40 41 42 peptide nesting is defined and specify how sequence-identical protein entries should be 43 44 45 46 handled. 47 48 49 Two new guidelines will be added: (1) the provision of Universal Spectrum Identifiers 50 51 52 (USI; http://psidev.info/USI), a feature developed by the HUPO Proteomics Standards 53 54 55 56 Initiative (PSI) that enables the unique identification of a particular spectrum being held 57 58 21 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 22 of 47

1 2 3 4 up as evidence for a new PE1 protein detection claim across proteomics repositories, 5 6 7 suitable for searching; and (2) a guideline for handling HPP datasets that use data- 8 9 10 49 11 independent acquisition (DIA) workflows including SWATH-MS . These guideline 12 13 14 changes are expected to take effect for contributions to the 2020 JPR HPP Special Issue. 15 16 17 18 At the workshop, several changes to the overall pipeline for tracking detections of MPs 19 20 21 were considered. The current pipeline begins with deposition to ProteomeXchange, 22 23 24 25 reprocessing of datasets by PeptideAtlas, and final mapping and incorporation by 26 27 28 neXtProt. It was agreed that there would be no substantial change to the basic set of 29 30 31 32 guidelines for calling a protein successfully identified by mass spectrometry. However, 33 34 35 the meaning of the terms non-nested and uniquely-mapping has been interpreted and 36 37 38 39 implemented in slightly different ways among the different components of the pipeline. A 40 41 42 consensus interpretation was clarified and will be documented. 43 44 45 46 For proteins that come close to meeting the guidelines, but do not meet them due to 47 48 49 their extreme sequence composition (e.g., very short or very hydrophobic), it was decided 50 51 52 that complex exception rules are not warranted. Rather, a panel of researchers including 53 54 55 56 neXtProt curators will be established to review evidence for special cases and classify 57 58 22 59 60 ACS Paragon Plus Environment Page 23 of 47 Journal of Proteome Research

1 2 3 4 proteins as PE1 if the available evidence is compelling but falls short of the Guidelines 5 6 7 3.0 due to valid physiochemical reasons. No proteins would be declared too difficult to 8 9 10 11 detect yet, although there was substantial discussion about the extreme difficulties of 12 13 14 detecting olfactory receptors and other categories of membrane-bound proteins, let alone 15 16 17 18 those proteins predicted to come from purported genes lacking measurable transcripts. 19 20 21 Discussion sections of previous JPR HPP Special Issue Metrics papers have addressed 22 23 24 7 25 these challenges . Finally, a plan was initiated for incorporating the dataset reprocessing 26 27 28 results of MassIVE-KB ProteinExplorer27, 50, including BioPlex data (while excluding bait) 29 30 31 32 through the PeptideAtlas TPP to feed into the 2020 neXtProt HPP release. Refinements 33 34 35 to the overall HPP pipeline for tracking high stringency identification of MPs should 36 37 38 39 facilitate confident completion of the attainable protein parts list over the next several 40 41 42 years. 43 44 45 46 ProteomeXchange (PX) 47 48 49 50 The ProteomeXchange Consortium51 was founded a decade ago by the European 51 52 53 Bioinformatics Institute (EBI) and HUPO based on PRIDE and PeptideAtlas. As of 23 54 55 56 57 58 23 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 24 of 47

1 2 3 4 May 2019, ProteomeXchange (PX) contained 8134 publicly-released data submissions, 5 6 7 of which 3395 are human datasets. PRIDE52-53 (EMBL-EBI, Cambridge, UK) and 8 9 10 11 PeptideAtlas (ISB, Seattle, WA, USA) are the founding members, joined now by MassIVE 12 13 14 (UCSD, San Diego, CA, USA), jPOST (various institutions, Japan), iProX (Phoenix 15 16 17 54 18 National Center for Protein Sciences, Beijing, China), and Panorama Public (targeted 19 20 21 proteomic datasets from Skyline, University of Washington, Seattle, USA). Each assigns 22 23 24 25 ProteomeXchange identifiers (PXD) to its datasets. The PRIDE (PRoteomics 26 27 28 IDEntification) database had 12 585 projects [https://www.ebi.ac.uk/pride/archive/]. 29 30 31 32 Peptide Atlas has 1547 human data sets (120 new in 2018) with 16 303 canonical PE1- 33 34 35 4 proteins and 2949 uncertain or redundant protein entries (www.peptideatlas.org/hupo/c- 36 37 38 39 hpp/). MassIVE-KB [Mass Spectrometry Interactive Virtual Environment; 40 41 42 https://massive.ucsd.edu/ProteoSAFe/static/massive.jsp] comprised 9286 public 43 44 45 46 datasets, with 20 116 proteins, 11M peptide variants, and 505 modifications. iProX 47 48 49 [www.iprox.org] contained 553 projects (247 public), and 112 608 data files (14 404 added 50 51 52 in the past year). The large dataset for the 2019 early-stage hepatocellular carcinoma 53 54 55 56 paper from Jiang et al55 will be included in 2020 HPP updates; this dataset was uploaded 57 58 24 59 60 ACS Paragon Plus Environment Page 25 of 47 Journal of Proteome Research

1 2 3 4 via ProteomeXchange to PRIDE as well as to iProX. JPOSTdb [https://globe.jpostdb.org/] 5 6 7 held 66 human, 25 mouse and 12 bacterial datasets. 8 9 10 11 12 13 Highlights of Progress from the B/D-HPP 14 15 16 17 The Biology and Disease-driven Human Proteome Project (B/D-HPP) currently 18 19 20 21 encompasses the efforts of 19 independent initiatives (https://www.hupo.org/B/D-HPP). 22 23 24 The main aims are to elucidate the molecular basis of human biology, to uncover protein 25 26 27 28 drivers of human disease, and to promote the development of novel proteomics-based 29 30 31 tools to improve the clinical management of patients. Collaborative research is conducted 32 33 34 35 in close interaction with specific biomedical and clinical communities. Overall, the 36 37 38 research work done in 2018 by B/D-HPP teams resulted in 160 published papers and 39 40 41 42 participation in 82 international congresses, among which 40% were organized by 43 44 45 scientific and clinical societies focused on specific topics, including cardiovascular, 46 47 48 49 , infectious diseases, gastroenterology, and nutrition. Also, there were 28 50 51 52 educational activities, including the first Summer School in Immunopeptidomics 53 54 55 56 (https://hupo.org/B/D-HPP/). 57 58 25 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 26 of 47

1 2 3 4 Web tools for computational bibliometric analyses provide researchers prioritized lists 5 6 7 of proteins in relation to particular organ systems56 or disease categories57, and 8 9 10 58 11 metabolites or chemicals . In the annual survey of B/D teams, 13 responded that they 12 13 14 are using (11) or planning to use (2) such popular proteins methodology and databases 15 16 17 18 in their research. 19 20 21 B/D-HPP teams have made substantial contributions to the elucidation of molecular 22 23 24 25 mechanisms driving human disorders, moving toward “precision medicine.” For example, 26 27 28 pathological conformations of TDP-43, a nucleic acid binding protein that regulates 29 30 31 32 splicing and expression of CFTR, can distinguish four pathological subtypes of 33 34 35 frontotemporal lobar degeneration59. The human arterial proteome has been extensively 36 37 38 39 analyzed for proteins associated with early atherosclerosis; a subset of plasma proteins 40 41 42 emerged as efficient predictors of angiographically-defined coronary disease60. 43 44 45 61 46 Biomarker identification has been pursued in urine of diabetic patients , peptides from 47 48 49 mucins, fibrinogen, and collagen fragments in ~1000 cancer patients62, and co-regulated 50 51 52 groups of circulating proteins correlating to past, current and future disease states in 5500 53 54 55 56 individuals in Iceland63. Studies from the Beijing Proteome Research Center addressed 57 58 26 59 60 ACS Paragon Plus Environment Page 27 of 47 Journal of Proteome Research

1 2 3 64 4 human HBV-associated HCC and a mouse model of metabolic syndrome and fatty 5 6 7 liver65. A major analysis of patients with early-stage hepatocellular delineated 8 9 10 11 three subtypes, notably S-III with high expression of sterol-O-acyl transferase (SOAT1), 12 13 14 which is well-suited for targeted therapy, as demonstrated in xenograft models55. 15 16 17 18 PTMs represent a functional regulatory level that is central to understanding 19 20 21 22 progression of diseases and to define novel therapeutic targets. Cross-talk across 23 24 25 different PTMs in the context of cardiovascular physiopathology has been addressed66. 26 27 28 29 A total of 1655 proteins with 3324 oxidized cysteine sites were identified in a mouse 30 31 32 cardiac hypertrophy model67. Changes in protein patterns have been 33 34 35 36 associated with HCC68 and ovarian tumors69. Protein profiles have been 37 38 39 explored in plasma from 300 cancer patients combining N-glycosite enrichment and 40 41 42 43 SWATH; some glycoproteins, notably those related to blood platelets, are common 44 45 46 changes across several cancers, while others are highly cancer-type specific70. Finally, 47 48 49 50 Murray et al. have demonstrated how dynamic changes in protein acetylation participate 51 52 53 in the herpes human cytomegalovirus replication and in the host cell defense71. 54 55 56 57 58 27 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 28 of 47

1 2 3 4 Many efforts have been devoted to development and standardization of new 5 6 7 proteomics-based applications. MALDI-TOF profiling has been used to discriminate 8 9 10 72 11 different infant milk formula in pediatric clinical settings . Piazza et al have defined 12 13 14 protein-metabolite interaction networks and identified 1700 interactions and 7000 15 16 17 18 interaction sites, revealing principles of chemical communication, mechanisms of enzyme 19 20 21 promiscuity, and estimates of metabolite binding at proteome-wide scale73. Reference 22 23 24 74 25 have been reported for clinical studies of cerebrospinal fluid and the NCI-7 26 27 28 cell line panel75. Gut have been investigated increasingly as a complex 29 30 31 32 community that influences many aspects of human physiology and disease; the platform 33 34 35 iMetaLab (http://imetalab.ca) facilitates functional studies of microbiota76. A key need in 36 37 38 39 proteomic workflows is format and data analysis standardization; an example is the 40 41 42 Minimal Information About an Immuno-Peptidomics Experiment (MIAIPE) guidelines77. 43 44 45 46 Finally, an interlaboratory study to assess the performance of glycoproteomics software 47 48 49 for automated intact N- and O-glycopeptide identification from high resolution MS/MS 50 51 52 spectral data is ongoing. 53 54 55 56 57 58 28 59 60 ACS Paragon Plus Environment Page 29 of 47 Journal of Proteome Research

1 2 3 4 The functional annotation of the human proteome requires a collaborative effort. 5 6 7 Fertilization of interactions across C-HPP and B/D-HPP teams21, 78 to share resources 8 9 10 11 and knowledge will help decipher the code of life and set the basis for future molecular 12 13 14 precision medicine. 15 16 17 18 19 20 21 Highlights from the Human Protein Atlas 22 23 24 25 During 2018, the Human Protein Atlas (HPA)79 released its version 18.1 that 26 27 28 29 summarizes data from 26 000 antibodies targeting proteins from 17 000 human protein- 30 31 32 coding genes. There is continuously growing global interest with 150 000 web visitors 33 34 35 36 per month. The latest update included interactive survival scatter plots as part of the 37 38 39 Pathology Atlas80. The Cell Atlas81 is being recognized in other international projects, 40 41 42 43 such as the Human Cell Atlas82, as well as deep learning and citizen science projects83. 44 45 46 HPA continues to enhance its work on antibody validation84 and on annotation of protein 47 48 49 50 expression in currently under-explored tissues85. Fredolini et al have applied a magnetic 51 52 53 bead-assisted workflow and immunoprecipitation-MS/MS to assess antibody selectivity 54 55 56 57 58 29 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 30 of 47

1 2 3 4 for enrichment and detection of 120 proteins in human EDTA-plasma; most of the 5 6 7 antibodies co-enriched other proteins besides the intended target, due to sequence 8 9 10 86 11 homology . 12 13 14 During 2019, the HPA plans to release three new components focusing on blood, the 15 16 17 18 brain, and metabolism. In addition, HPA is combining mRNA expression data obtained 19 20 21 from human tissues that are currently derived from different types of sequencing data; the 22 23 24 25 harmonized RNA expression data will be a valuable reference resource for neXtProt 26 27 28 Protein Evidence curation. A quantitative paired analysis of the proteome and 29 30 31 32 abundance for 29 healthy human tissues identified 13,640 proteins, but 33 34 35 only 37 without prior protein level evidence. Hundreds of proteins were not detected even 36 37 38 39 when the corresponding mRNAs were highly expressed, particularly in testis 40 41 42 (PXD010154)87. These findings, and many others, will be captured for standardized 43 44 45 46 reanalysis with the PeptideAtlas TransProteomicPipeline to be included in PeptideAtlas 47 48 49 2020-01 and incorporated into neXtProt release 2020-01. 50 51 52 53 54 55 56 57 58 30 59 60 ACS Paragon Plus Environment Page 31 of 47 Journal of Proteome Research

1 2 3 4 CONCLUSIONS 5 6 7 8 The Human Proteome Project continues to provide a quality-assured framework for 9 10 11 12 capturing sustained progress on the protein parts list and the characterization of the 13 14 15 features and functions of human proteins. PeptideAtlas identified 495 new canonical 16 17 18 proteins during the past year, including many neXtProt PE1 proteins that previously were 19 20 21 22 classified based on non-MS data. NeXtProt had a net increase of 224 PE1 proteins. As 23 24 25 of the baseline for the 2019 studies (neXtProt release 2019-01), there were still 2129 26 27 28 29 “Missing Proteins” (PE2,3,4) awaiting the application of advanced techniques and 30 31 32 analysis of under-studied specimen types. Since 2016 there has been a surprisingly 33 34 35 36 linear decrease in the number of missing proteins, which is likely to continue, though a 37 38 39 hard core of perhaps 1000 predicted proteins may be so low in abundance, so unusual in 40 41 42 43 the sites or conditions of expression, or so unsuited to detection that they will be “out of 44 45 46 reach.” We suspected that we were approaching the limit 2-3 years ago, but the pace of 47 48 49 50 progress has not diminished. Over the next few years we will clarify that question while 51 52 53 54 55 56 57 58 31 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 32 of 47

1 2 3 4 enhancing our knowledge of the proteins and stimulating the broader and broader use of 5 6 7 proteomics in biomedical research. 8 9 10 11 12 13 14 15 ACKNOWLEDGEMENTS 16 17 18 19 We appreciate the guidance from the HPP Executive Committee and the participation 20 21 22 23 of all HPP investigators. We thank the UniProt groups at SIB, EBI, and PIR for providing 24 25 26 high-quality annotations for the human proteins in UniProtKB/Swiss-Prot. The neXtProt 27 28 29 30 server is hosted by VitalIT in Switzerland. The PeptideAtlas server is hosted at the 31 32 33 Institute for Systems Biology in Seattle. G.S.O. acknowledges grant support from 34 35 36 37 National Institutes of Health grants P30ES017885-01A1 and NIH U24CA210967; E.W.D. 38 39 40 from the National Institutes of Health grants, National Institute of General Medical 41 42 43 44 Sciences grants: R01GM087221, R24GM127667, U54EB020406, and the 45 46 47 U19AG023122; L.L. and neXtProt from the SIB Swiss Institute of Bioinformatics; C.M.O. 48 49 50 51 by a Canadian Institutes of Health Research 7-year Foundation Grant and a Canada 52 53 54 Research Chair in Protease Proteomics and Systems Biology; M.S.B. by NHMRC Project 55 56 57 58 32 59 60 ACS Paragon Plus Environment Page 33 of 47 Journal of Proteome Research

1 2 3 4 Grant APP1010303; J.M.S. by the Knut and Alice Wallenberg Foundation for the Human 5 6 7 Protein Atlas; and Y.-K.P. by grants from the Korean Ministry of Health and Welfare 8 9 10 11 HI13C22098 and HI16C0257. 12 13 14 15 16 17 18 ORCID 19 20 21 22 Gilbert S. Omenn: 0000-0002-8976-6074 23 24 25 Lydie Lane: 0000-0002-9818-3030 26 27 28 29 Christopher M. Overall: 0000-0001-5844-2731 30 31 32 Fernando J. Corrales: 0000-0002-0231-5159 33 34 35 36 Jochen M. Schwenk: 0000-0001-8141-8449 37 38 39 Young-Ki Paik: 0000-0002-8146-1751 40 41 42 43 Siqi Liu: 0000-0001-9744-3681 44 45 46 Stephen Pennington: 0000-0001-7529-1015 47 48 49 50 Mark S. Baker: 0000-0001-5858-4035 51 52 53 Eric W. Deutsch: 0000-0001-8732-0928 54 55 56 57 58 33 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 34 of 47

1 2 3 4 REFERENCES 5 6 7 8 1. Legrain, P.; Aebersold, R.; Archakov, A.; Bairoch, A.; Bala, K.; Beretta, L.; 9 10 Bergeron, J.; Borchers, C. H.; Corthals, G. L.; Costello, C. E.; Deutsch, E. W.; Domon, 11 12 B.; Hancock, W.; He, F.; Hochstrasser, D.; Marko-Varga, G.; Salekdeh, G. H.; Sechi, S.; 13 Snyder, M.; Srivastava, S.; Uhlen, M.; Wu, C. H.; Yamamoto, T.; Paik, Y. K.; Omenn, G. 14 15 S., The Human Proteome Project: current state and future direction. Mol Cell 16 17 Proteomics 2011, 10 (7), M111 009993. 18 2. Marko-Varga, G.; Omenn, G. S.; Paik, Y. K.; Hancock, W. S., A first step toward 19 20 completion of a genome-wide characterization of the human proteome. J. Proteome 21 22 Res. 2013, 12 (1), 1-5. 23 24 3. Lane, L.; Bairoch, A.; Beavis, R. C.; Deutsch, E. W.; Gaudet, P.; Lundberg, E.; 25 Omenn, G. S., Metrics for the human proteome project 2013-2014 and strategies for 26 27 finding missing proteins. J. Proteome Res. 2014, 13 (1), 15-20. 28 29 4. Omenn, G. S.; Lane, L.; Lundberg, E. K.; Beavis, R. C.; Nesvizhskii, A. I.; 30 31 Deutsch, E. W., Metrics for the Human Proteome Project 2015: progress on the human 32 proteome and guidelines for high-confidence protein identification. J. Proteome Res. 33 34 2015, 14 (9), 3452-60. 35 36 5. Omenn, G. S.; Lane, L.; Lundberg, E. K.; Beavis, R. C.; Overall, C. M.; Deutsch, 37 38 E. W., Metrics for the Human Proteome Project 2016: Progress on Identifying and 39 Characterizing the Human Proteome, Including Post-Translational Modifications. 40 41 Journal of proteome research 2016, 15 (11), 3951-3960. 42 43 6. Omenn, G. S.; Lane, L.; Lundberg, E. K.; Overall, C. M.; Deutsch, E. W., 44 45 Progress on the HUPO draft human proteome: 2017 metrics of the Human Proteome 46 Project. Journal of proteome research 2017, 16 (12), 4281-4287. 47 48 7. Omenn, G. S.; Lane, L.; Overall, C. M.; Corrales, F. J.; Schwenk, J. M.; Paik, Y. 49 50 K.; Van Eyk, J. E.; Liu, S.; Snyder, M.; Baker, M. S.; Deutsch, E. W., Progress on 51 52 identifying and characterizing the human proteome: 2018 metrics from the HUPO 53 Human Proteome Project. Journal of proteome research 2018. 54 55 56 57 58 34 59 60 ACS Paragon Plus Environment Page 35 of 47 Journal of Proteome Research

1 2 3 4 8. Gaudet, P.; Michel, P. A.; Zahn-Zabal, M.; Britan, A.; Cusin, I.; Domagalski, M.; 5 Duek, P. D.; Gateau, A.; Gleizes, A.; Hinard, V.; Rech de Laval, V.; Lin, J.; Nikitin, F.; 6 7 Schaeffer, M.; Teixeira, D.; Lane, L.; Bairoch, A., The neXtProt knowledgebase on 8 9 human proteins: 2017 update. Nucleic Acids Res 2017, 45 (D1), D177-d182. 10 11 9. Keller, A.; Eng, J.; Zhang, N.; Li, X. J.; Aebersold, R., A uniform proteomics 12 MS/MS analysis platform utilizing open XML file formats. Molecular systems biology 13 14 2005, 1, 2005.0017. 15 16 10. Deutsch, E. W.; Mendoza, L.; Shteynberg, D.; Slagel, J.; Sun, Z.; Moritz, R. L., 17 18 Trans-proteomic pipeline, a standardized data processing pipeline for large-scale 19 reproducible proteomics informatics. Proteomics Clin. Appl. 2015, 9 (7-8), 745-54. 20 21 11. Deutsch, E. W.; Overall, C. M.; Van Eyk, J. E.; Baker, M. S.; Paik, Y. K.; 22 23 Weintraub, S. T.; Lane, L.; Martens, L.; Vandenbrouck, Y.; Kusebauch, U.; Hancock, W. 24 25 S.; Hermjakob, H.; Aebersold, R.; Moritz, R. L.; Omenn, G. S., Human Proteome Project 26 mass spectrometry data interpretation guidelines 2.1. Journal of proteome research 27 28 2016, 15 (11), 3961-3970. 29 30 12. Duek, P.; Gateau, A.; Bairoch, A.; Lane, L., Exploring the uncharacterized human 31 32 proteome using neXtProt. Journal of proteome research 2018, 17 (12), 4211-4226. 33 13. Paik, Y. K.; Lane, L.; Overall, C. M., neXt-CP50, the C-HPP pilot project for 34 35 functional characterization of identified proteins with no known function. J. Proteome 36 37 Res. 2018. 38 39 14. Siddiqui, O.; Zhang, H.; Guan, Y.; Omenn, G. S., Chromosome 17 missing 40 proteins: Recent progress and future directions as part of the neXt-MP50 challenge. 41 42 Journal of proteome research 2018, 17 (12), 4061-4071. 43 44 15. Zhang, H.; Siddiqui, O.; Guan, Y.; Omenn, G. S., Chromosome 17: neXt-MP50 45 46 challenge completed! Journal of proteome research, 2019, submitted. 47 16. Paik, Y.-K.; Overall, C. M.; Corrales, F.; Deutsch, E. W.; Lane, L.; Omenn, G. S., 48 49 Toward completion of the human proteome parts list: Progress uncovering proteins that 50 51 are missing or have unknown function and developing analytical methods. Journal of 52 proteome research 2018, 17 (12), 4023-4030. 53 54 55 56 57 58 35 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 36 of 47

1 2 3 4 17. Macron, C.; Lane, L.; Núñez Galindo, A.; Dayon, L., Deep dive on the proteome 5 of human cerebrospinal fluid: A valuable data resource for and 6 7 missing protein identification. Journal of proteome research 2018, 17 (12), 4113-4126. 8 9 18. Clemente, L. F.; Hernáez, M. L.; Ramos-Fernández, A.; Ligero, G.; Gil, C.; 10 11 Corrales, F. J.; Marcilla, M., Identification of the missing protein hyaluronan synthase 1 12 in human mesenchymal stem cells derived from adipose tissue or umbilical cord. 13 14 Journal of proteome research 2018, 17 (12), 4325-4328. 15 16 19. Hwang, H.; Jeong, J. E.; Lee, H. K.; Yun, K. N.; An, H. J.; Lee, B.; Paik, Y.-K.; 17 18 Jeong, T. S.; Yee, G. T.; Kim, J. Y.; Yoo, J. S., Identification of missing proteins in 19 human olfactory epithelial tissue by liquid . 20 21 Journal of proteome research 2018, 17 (12), 4320-4324. 22 23 20. Robin, T.; Bairoch, A.; Müller, M.; Lisacek, F.; Lane, L., Large-scale reanalysis of 24 25 publicly available HeLa cell proteomics data in the context of the human proteome 26 project. Journal of proteome research 2018, 17 (12), 4160-4170. 27 28 21. Ronci, M.; Pieroni, L.; Greco, V.; Scotti, L.; Marini, F.; Carregari, V. C.; Cunsolo, 29 30 V.; Foti, S.; Aceto, A.; Urbani, A., Sequential fractionation strategy identifies three 31 32 missing proteins in the mitochondrial proteome of commonly used cell lines. Journal of 33 proteome research 2018, 17 (12), 4307-4314. 34 35 22. Sun, J.; Shi, J.; Wang, Y.; Chen, Y.; Li, Y.; Kong, D.; Chang, L.; Liu, F.; Lv, Z.; 36 37 Zhou, Y.; He, F.; Zhang, Y.; Xu, P., Multiproteases combined with high-pH reverse- 38 39 phase separation strategy verified fourteen missing proteins in human testis tissue. 40 Journal of proteome research 2018, 17 (12), 4171-4177. 41 42 23. Huesgen, P. F.; Lange, P. F.; Rogers, L. D.; Solis, N.; Eckhard, U.; Kleifeld, O.; 43 44 Goulas, T.; Gomis-Ruth, F. X.; Overall, C. M., LysargiNase mirrors trypsin for protein C- 45 46 terminal and methylation-site identification. Nat Methods 2015, 12 (1), 55-8. 47 24. Weldemariam, M. M.; Han, C.-L.; Shekari, F.; Kitata, R. B.; Chuang, C.-Y.; Hsu, 48 49 W.-T.; Kuo, H.-C.; Choong, W.-K.; Sung, T.-Y.; He, F.-C.; Chung, M. C. M.; Salekdeh, 50 51 G. H.; Chen, Y.-J., Subcellular proteome landscape of human embryonic stem cells 52 revealed missing membrane proteins. Journal of proteome research 2018, 17 (12), 53 54 4138-4151. 55 56 57 58 36 59 60 ACS Paragon Plus Environment Page 37 of 47 Journal of Proteome Research

1 2 3 4 25. Zhang, Y.; Lin, Z.; Hao, P.; Hou, K.; Sui, Y.; Zhang, K.; He, Y.; Li, H.; Yang, H.; 5 Liu, S.; Ren, Y., Improvement of peptide separation for exploring the missing proteins 6 7 localized on membranes. Journal of proteome research 2018, 17 (12), 4152-4159. 8 9 26. Sjöstedt, E.; Sivertsson, Å.; Hikmet Noraddin, F.; Katona, B.; Näsström, Å.; Vuu, 10 11 J.; Kesti, D.; Oksvold, P.; Edqvist, P.-H.; Olsson, I.; Uhlén, M.; Lindskog, C., Integration 12 of transcriptomics and antibody-based proteomics for exploration of proteins expressed 13 14 in specialized tissues. Journal of proteome research 2018, 17 (12), 4127-4137. 15 16 27. Pullman, B. S.; Wertz, J.; Carver, J.; Bandeira, N., ProteinExplorer: A repository- 17 18 scale resource for exploration of protein detection in public mass spectrometry data 19 sets. Journal of proteome research 2018, 17 (12), 4227-4234. 20 21 28. Bell, P.; Solis, N.; Kizhakkedathu, J.; Matthew, I.; Overduin, B., Proteomic and N- 22 23 terminomic TAILS analyses of human alveolar bone proteins: Improved protein 24 25 extraction methodology and LysargiNase digestion increases proteome coverage 26 missing protein identification. Journal of Proteome Research, 2019, submitted. 27 28 29. Abbey, S. R.; Eckhard, U.; Solis, N.; Marino, G.; Matthew, I.; Overall, C. M., The 29 30 human odontoblast cell layer and dental pulp proteomes and N-terminomes. Journal of 31 32 Dental Research 2018, 97 (3), 338-346. 33 30. Paik, Y. K.; Lane, L.; Kawamura, T.; Chen, Y. J.; Cho, J. Y.; LaBaer, J.; Yoo, J. 34 35 S.; Domont, G.; Corrales, F.; Omenn, G. S.; Archakov, A.; Encarnacion-Guevara, S.; 36 37 Lui, S.; Salekdeh, G. H.; Cho, J. Y.; Kim, C. Y.; Overall, C. M., Launching the C-HPP 38 39 neXt-CP50 pilot project for functional characterization of identified proteins with no 40 known function. Journal of proteome research 2018, 17 (12), 4042-4050. 41 42 31. Quancard, J.; Klein, T.; Fung, S. Y.; Renatus, M.; Hughes, N.; Israel, L.; Priatel, 43 44 J. J.; Kang, S.; Blank, M. A.; Viner, R. I.; Blank, J.; Schlapbach, A.; Erbel, P.; 45 46 Kizhakkedathu, J.; Villard, F.; Hersperger, R.; Turvey, S. E.; Eder, J.; Bornancin, F.; 47 Overall, C. M., An allosteric MALT1 inhibitor is a molecular corrector rescuing function in 48 49 an immunodeficient patient. chemical biology 2019, 15 (3), 304-313. 50 51 32. Zhang, C.; Omenn, G. S.; Zhang, Y., Structure and protein interaction-based 52 gene ontology annotations reveal likely functinos of uncharacterized proteins of human 53 54 chromosome 17. Journal of Proteome Res. 2018, submitted. 55 56 57 58 37 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 38 of 47

1 2 3 4 33. Zhang, C.; Lane, L.; Omenn, G. S.; Zhang, Y., A blinded testing of function 5 annotation for uPE1 proteins by the I-TASSER/COFACTOR pipeline using the 2018- 6 7 2019 additions to neXtProt and CAFA3 challenge. Journal of proteome research 2019, 8 9 submitted. 10 11 34. Desiere, F.; Deutsch, E. W.; Nesvizhskii, A. I.; Mallick, P.; King, N. L.; Eng, J. K.; 12 Aderem, A.; Boyle, R.; Brunner, E.; Donohoe, S.; Fausto, N.; Hafen, E.; Hood, L.; Katze, 13 14 M. G.; Kennedy, K. A.; Kregenow, F.; Lee, H.; Lin, B.; Martin, D.; Ranish, J. A.; 15 16 Rawlings, D. J.; Samelson, L. E.; Shiio, Y.; Watts, J. D.; Wollscheid, B.; Wright, M. E.; 17 18 Yan, W.; Yang, L.; Yi, E. C.; Zhang, H.; Aebersold, R., Integration with the human 19 genome of peptide sequences obtained by high-throughput mass spectrometry. 20 21 Genome Biol 2005, 6 (1), R9. 22 23 35. Deutsch, E. W.; Sun, Z.; Campbell, D.; Kusebauch, U.; Chu, C. S.; Mendoza, L.; 24 25 Shteynberg, D.; Omenn, G. S.; Moritz, R. L., State of the human proteome in 2014/2015 26 as viewed through PeptideAtlas: enhancing accuracy and coverage through the 27 28 AtlasProphet. Journal of proteome research 2015, 14 (9), 3461-73. 29 30 36. Sun, J.; Shi, J.; Wang, Y.; Chen, Y.; Li, Y.; Kong, D.; Chang, L.; Liu, F.; Lv, Z.; 31 32 Zhou, Y.; He, F.; Zhang, Y.; Xu, P., Multiproteases combined with high-pH reverse- 33 phase separation strategy verified fourteen missing proteins in human testis tissue. 34 35 Journal of proteome research 2018, 17 (12), 4171-4177. 36 37 37. Zhang, Y.; Lin, Z.; Hao, P.; Hou, K.; Sui, Y.; Zhang, K.; He, Y.; Li, H.; Yang, H.; 38 39 Liu, S., Ren Y., Improvement of peptide separation for exploring the missing proteins 40 localized on membranes. J. Proteome Res. 2018(17(12):4152-4159. 41 42 38. Moriya, Y.; Kawano, S.; Okuda, S.; Watanabe, Y.; Matsumoto, M.; Takami, T.; 43 44 Kobayashi, D.; Yamanouchi, Y.; Araki, N.; Yoshizawa, A. C.; Tabata, T.; Iwasaki, M.; 45 46 Sugiyama, N.; Tanaka, S.; Goto, S.; Ishihama, Y., The jPOST environment: an 47 integrated proteomics data repository and database. Nucleic Acids Res 2019, 47 (D1), 48 49 D1218-d1224. 50 51 39. Ma, J.; Chen, T.; Wu, S.; Yang, C.; Bai, M.; Shu, K.; Li, K.; Zhang, G.; Jin, Z.; He, 52 F.; Hermjakob, H.; Zhu, Y., iProX: an integrated proteome resource. Nucleic Acids Res 53 54 2019, 47 (D1), D1211-d1217. 55 56 57 58 38 59 60 ACS Paragon Plus Environment Page 39 of 47 Journal of Proteome Research

1 2 3 4 40. Muller, M. M.; Lehmann, R.; Klassert, T. E.; Reifenstein, S.; Conrad, T.; Moore, 5 C.; Kuhn, A.; Behnert, A.; Guthke, R.; Driesch, D.; Slevogt, H., Global analysis of 6 7 glycoproteins identifies markers of endotoxin tolerant monocytes and GPR84 as a 8 9 modulator of TNFalpha expression. Scientific reports 2017, 7 (1), 838. 10 11 41. Deeb, S. J.; Cox, J.; Schmidt-Supprian, M.; Mann, M., N-linked glycosylation 12 enrichment for in-depth cell surface proteomics of diffuse large B-cell lymphoma 13 14 subtypes. Mol Cell Proteomics 2014, 13 (1), 240-51. 15 16 42. Bassani-Sternberg, M.; Pletscher-Frankild, S.; Jensen, L. J.; Mann, M., Mass 17 18 spectrometry of human leukocyte antigen class I peptidomes reveals strong effects of 19 protein abundance and turnover on antigen presentation. Mol Cell Proteomics 2015, 14 20 21 (3), 658-73. 22 23 43. Bassani-Sternberg, M.; Braunlein, E.; Klar, R.; Engleitner, T.; Sinitcyn, P.; 24 25 Audehm, S.; Straub, M.; Weber, J.; Slotta-Huspenina, J.; Specht, K.; Martignoni, M. E.; 26 Werner, A.; Hein, R.; D, H. B.; Peschel, C.; Rad, R.; Cox, J.; Mann, M.; Krackhardt, A. 27 28 M., Direct identification of clinically relevant neoepitopes presented on native human 29 30 melanoma tissue by mass spectrometry. Nature communications 2016, 7, 13404. 31 32 44. Kulak, N. A.; Geyer, P. E.; Mann, M., Loss-less Nano-fractionator for High 33 Sensitivity, High Coverage Proteomics. Mol Cell Proteomics 2017, 16 (4), 694-705. 34 35 45. Klaeger, S.; Heinzlmeir, S.; Wilhelm, M.; Polzer, H.; Vick, B.; Koenig, P. A.; 36 37 Reinecke, M.; Ruprecht, B.; Petzoldt, S.; Meng, C.; Zecha, J.; Reiter, K.; Qiao, H.; 38 39 Helm, D.; Koch, H.; Schoof, M.; Canevari, G.; Casale, E.; Depaolini, S. R.; Feuchtinger, 40 A.; Wu, Z.; Schmidt, T.; Rueckert, L.; Becker, W.; Huenges, J.; Garz, A. K.; Gohlke, B. 41 42 O.; Zolg, D. P.; Kayser, G.; Vooder, T.; Preissner, R.; Hahne, H.; Tonisson, N.; Kramer, 43 44 K.; Gotze, K.; Bassermann, F.; Schlegl, J.; Ehrlich, H. C.; Aiche, S.; Walch, A.; Greif, P. 45 46 A.; Schneider, S.; Felder, E. R.; Ruland, J.; Medard, G.; Jeremias, I.; Spiekermann, K.; 47 Kuster, B., The target landscape of clinical kinase drugs. Science 2017, 358 (6367). 48 49 46. Doll, S.; Dressen, M.; Geyer, P. E.; Itzhak, D. N.; Braun, C.; Doppler, S. A.; 50 51 Meier, F.; Deutsch, M. A.; Lahm, H.; Lange, R.; Krane, M.; Mann, M., Region and cell- 52 type resolved quantitative proteomic map of the human heart. Nature communications 53 54 2017, 8 (1), 1469. 55 56 57 58 39 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 40 of 47

1 2 3 4 47. Weldemariam, M. M.; Han, C. L.; Shekari, F.; Kitata, R. B.; Chuang, C. Y.; Hsu, 5 W. T.; Kuo, H. C.; Choong, W. K.; Sung, T. Y.; He, F. C.; Chung, M. C. M.; Salekdeh, G. 6 7 H.; Chen, Y. J., Subcellular Proteome Landscape of Human Embryonic Stem Cells 8 9 Revealed Missing Membrane Proteins. Journal of proteome research 2018, 17 (12), 10 11 4138-4151. 12 48. Deutsch, E. W.; Lane, L.; Overall, C. M.; Baker, M. S.; Pineau, C.; Mortiz, R. L.; 13 14 Bandeira, N.; Corrales, F.; Orchard, S.; Van Eyk, J.; Paik, Y.-K.; Weintraub, S. T.; 15 16 Vandenbrouck, Y.; Omenn, G. S., Human Proteome Project Mass Spectrometry Data 17 18 Interpretation Guidelines 3.0. Journal of proteome research 2019, submitted. 19 49. Gillet, L. C.; Navarro, P.; Tate, S.; Rost, H.; Selevsek, N.; Reiter, L.; Bonner, R.; 20 21 Aebersold, R., Targeted data extraction of the MS/MS spectra generated by data- 22 23 independent acquisition: a new concept for consistent and accurate proteome analysis. 24 25 Mol Cell Proteomics 2012, 11 (6), O111.016717. 26 50. Deutsch, E. W.; Csordas, A.; Sun, Z.; Jarnuczak, A.; Perez-Riverol, Y.; Ternent, 27 28 T.; Campbell, D. S.; Bernal-Llinares, M.; Okuda, S.; Kawano, S.; Moritz, R. L.; Carver, J. 29 30 J.; Wang, M.; Ishihama, Y.; Bandeira, N.; Hermjakob, H.; Vizcaino, J. A., The 31 32 ProteomeXchange consortium in 2017: supporting the cultural change in proteomics 33 public data deposition. Nucleic Acids Res 2017, 45 (D1), D1100-d1106. 34 35 51. Vizcaino, J. A.; Deutsch, E. W.; Wang, R.; Csordas, A.; Reisinger, F.; Rios, D.; 36 37 Dianes, J. A.; Sun, Z.; Farrah, T.; Bandeira, N.; Binz, P. A.; Xenarios, I.; Eisenacher, M.; 38 39 Mayer, G.; Gatto, L.; Campos, A.; Chalkley, R. J.; Kraus, H. J.; Albar, J. P.; Martinez- 40 Bartolome, S.; Apweiler, R.; Omenn, G. S.; Martens, L.; Jones, A. R.; Hermjakob, H., 41 42 ProteomeXchange provides globally coordinated proteomics data submission and 43 44 dissemination. Nat Biotechnol 2014, 32 (3), 223-6. 45 46 52. Martens, L.; Hermjakob, H.; Jones, P.; Adamski, M.; Taylor, C.; States, D.; 47 Gevaert, K.; Vandekerckhove, J.; Apweiler, R., PRIDE: the proteomics identifications 48 49 database. Proteomics 2005, 5 (13), 3537-45. 50 51 53. Perez-Riverol, Y.; Csordas, A.; Bai, J.; Bernal-Llinares, M.; Hewapathirana, S.; 52 Kundu, D. J.; Inuganti, A.; Griss, J.; Mayer, G.; Eisenacher, M.; Perez, E.; Uszkoreit, J.; 53 54 Pfeuffer, J.; Sachsenberg, T.; Yilmaz, S.; Tiwary, S.; Cox, J.; Audain, E.; Walzer, M.; 55 56 Jarnuczak, A. F.; Ternent, T.; Brazma, A.; Vizcaino, J. A., The PRIDE database and 57 58 40 59 60 ACS Paragon Plus Environment Page 41 of 47 Journal of Proteome Research

1 2 3 4 related tools and resources in 2019: improving support for quantification data. Nucleic 5 Acids Res 2019, 47 (D1), D442-d450. 6 7 54. Sharma, V.; Eckels, J.; Schilling, B.; Ludwig, C.; Jaffe, J. D.; MacCoss, M. J.; 8 9 MacLean, B., Panorama Public: A Public Repository for Quantitative Data Sets 10 11 Processed in Skyline. Mol Cell Proteomics 2018, 17 (6), 1239-1244. 12 55. Jiang, Y.; Sun, A.; Zhao, Y.; Ying, W.; Sun, H.; Yang, X.; Xing, B.; Sun, W.; Ren, 13 14 L.; Hu, B.; Li, C.; Zhang, L.; Qin, G.; Zhang, M.; Chen, N.; Zhang, M.; Huang, Y.; Zhou, 15 16 J.; Zhao, Y.; Liu, M.; Zhu, X.; Qiu, Y.; Sun, Y.; Huang, C.; Yan, M.; Wang, M.; Liu, W.; 17 18 Tian, F.; Xu, H.; Zhou, J.; Wu, Z.; Shi, T.; Zhu, W.; Qin, J.; Xie, L.; Fan, J.; Qian, X.; He, 19 F., Proteomics identifies new therapeutic targets of early-stage hepatocellular 20 21 carcinoma. Nature 2019, 567 (7747), 257-261. 22 23 56. Lam, M. P.; Venkatraman, V.; Xing, Y.; Lau, E.; Cao, Q.; Ng, D. C.; Su, A. I.; Ge, 24 25 J.; Van Eyk, J. E.; Ping, P., Data-Driven Approach To Determine Popular Proteins for 26 Targeted Proteomics Translation of Six Organ Systems. Journal of proteome research 27 28 2016, 15 (11), 4126-4134. 29 30 57. Yu, K. H.; Lee, T. M.; Wang, C. S.; Chen, Y. J.; Ré, C.; Kou, S. C.; Chiang, J. H.; 31 32 Kohane, I. S.; Snyder, M., Systematic protein prioritization for targeted proteomics 33 studies through literature mining. Journal of proteome research 2018, 17(4):1383-1396. 34 35 58. Yu, K. H.; Lee, T. M.; Chen, Y. J.; Re, C.; Kou, S. C.; Chiang, J. H.; Snyder, M.; 36 37 Kohane, I. S., A Cloud-Based Metabolite and Chemical Prioritization System for the 38 39 Biology/Disease-Driven Human Proteome Project. Journal of proteome research 2018, 40 17 (12), 4345-4357. 41 42 59. Laferriere, F.; Maniecka, Z.; Perez-Berlanga, M.; Hruska-Plochan, M.; Gilhespy, 43 44 L.; Hock, E. M.; Wagner, U.; Afroz, T.; Boersema, P. J.; Barmettler, G.; Foti, S. C.; Asi, 45 46 Y. T.; Isaacs, A. M.; Al-Amoudi, A.; Lewis, A.; Stahlberg, H.; Ravits, J.; De Giorgi, F.; 47 Ichas, F.; Bezard, E.; Picotti, P.; Lashley, T.; Polymenidou, M., TDP-43 extracted from 48 49 frontotemporal lobar degeneration subject brains displays distinct aggregate assemblies 50 51 and neurotoxic effects reflecting disease progression rates. Nature neuroscience 2019, 52 22 (1), 65-77. 53 54 60. Herrington, D. M.; Mao, C.; Parker, S. J.; Fu, Z.; Yu, G.; Chen, L.; Venkatraman, 55 56 V.; Fu, Y.; Wang, Y.; Howard, T. D.; Jun, G.; Zhao, C. F.; Liu, Y.; Saylor, G.; Spivia, W. 57 58 41 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 42 of 47

1 2 3 4 R.; Athas, G. B.; Troxclair, D.; Hixson, J. E.; Vander Heide, R. S.; Wang, Y.; Van Eyk, J. 5 E., Proteomic Architecture of Human Coronary and Aortic Atherosclerosis. Circulation 6 7 2018, 137 (25), 2741-2756. 8 9 61. Hirao, Y.; Saito, S.; Fujinaka, H.; Miyazaki, S.; Xu, B.; Quadery, A. F.; Elguoshy, 10 11 A.; Yamamoto, K.; Yamamoto, T., Proteome Profiling of Diabetic Mellitus Patient Urine 12 for Discovery of by Comprehensive MS-Based Proteomics. Proteomes 13 14 2018, 6 (1). 15 16 62. Belczacka, I.; Latosinska, A.; Siwy, J.; Metzger, J.; Merseburger, A. S.; Mischak, 17 18 H.; Vlahou, A.; Frantzi, M.; Jankowski, V., Urinary CE-MS peptide marker pattern for 19 detection of solid tumors. Scientific reports 2018, 8 (1), 5227. 20 21 63. Emilsson, V.; Ilkov, M.; Lamb, J. R.; Finkel, N.; Gudmundsson, E. F.; Pitts, R.; 22 23 Hoover, H.; Gudmundsdottir, V.; Horman, S. R.; Aspelund, T.; Shu, L.; Trifonov, V.; 24 25 Sigurdsson, S.; Manolescu, A.; Zhu, J.; Olafsson, O.; Jakobsdottir, J.; Lesley, S. A.; To, 26 J.; Zhang, J.; Harris, T. B.; Launer, L. J.; Zhang, B.; Eiriksdottir, G.; Yang, X.; Orth, A. 27 28 P.; Jennings, L. L.; Gudnason, V., Co-regulatory networks of human serum proteins link 29 30 genetics to disease. Science 2018, 361 (6404), 769-773. 31 32 64. Cao, P.; Yang, A.; Wang, R.; Xia, X.; Zhai, Y.; Li, Y.; Yang, F.; Cui, Y.; Xie, W.; 33 Liu, Y.; Liu, T.; Jia, W.; Jiang, Z.; Li, Z.; Han, Y.; Gao, C.; Song, Q.; Xie, B.; Zhang, L.; 34 35 Zhang, H.; Zhang, J.; Shen, X.; Yuan, Y.; Yu, F.; Wang, Y.; Xu, J.; Ma, Y.; Mo, Z.; Yu, 36 37 W.; He, F.; Zhou, G., Germline Duplication of SNORA18L5 Increases Risk for HBV- 38 39 related Hepatocellular Carcinoma by Altering Localization of Ribosomal Proteins and 40 Decreasing Levels of p53. Gastroenterology 2018, 155 (2), 542-556. 41 42 65. Wei, J.; Yuan, Y.; Chen, L.; Xu, Y.; Zhang, Y.; Wang, Y.; Yang, Y.; Peek, C. B.; 43 44 Diebold, L.; Yang, Y.; Gao, B.; Jin, C.; Melo-Cardenas, J.; Chandel, N. S.; Zhang, D. D.; 45 46 Pan, H.; Zhang, K.; Wang, J.; He, F.; Fang, D., ER-associated ubiquitin ligase HRD1 47 programs metabolism by targeting multiple metabolic enzymes. Nature 48 49 communications 2018, 9 (1), 3659. 50 51 66. Fert-Bober, J.; Murray, C. I.; Parker, S. J.; Van Eyk, J. E., Precision profiling of 52 the cardiovascular post-translationally modified proteome: Where there is a will, there is 53 54 a way. Circulation research 2018, 122 (9), 1221-1237. 55 56 57 58 42 59 60 ACS Paragon Plus Environment Page 43 of 47 Journal of Proteome Research

1 2 3 4 67. Wang, J.; Choi, H.; Chung, N. C.; Cao, Q.; Ng, D. C. M.; Mirza, B.; Scruggs, S. 5 B.; Wang, D.; Garlid, A. O.; Ping, P., Integrated dissection of cysteine oxidative post- 6 7 translational modification proteome during cardiac hypertrophy. Journal of proteome 8 9 research 2018, 17(12):4243-4257. 10 11 68. Ren, L.; Li, C.; Wang, Y.; Teng, Y.; Sun, H.; Xing, B.; Yang, X.; Jiang, Y.; He, F., 12 In vivo phosphoproteome analysis reveals kinome reprogramming in hepatocellular 13 14 carcinoma. Mol Cell Proteomics 2018, 17 (6), 1067-1083. 15 16 69. Song, G.; Chen, L.; Zhang, B.; Song, Q.; Yu, Y.; Moore, C.; Wang, T. L.; Shih, I. 17 18 M.; Zhang, H.; Chan, D. W.; Zhang, Z.; Zhu, H., Proteome-wide tyrosine 19 phosphorylation analysis reveals dysregulated signaling pathways in ovarian tumors. 20 21 Mol Cell Proteomics 2019, 18 (3), 448-460. 22 23 70. Sajic, T.; Liu, Y.; Arvaniti, E.; Surinova, S.; Williams, E. G.; Schiess, R.; 24 25 Huttenhain, R.; Sethi, A.; Pan, S.; Brentnall, T. A.; Chen, R.; Blattmann, P.; Friedrich, 26 B.; Nimeus, E.; Malander, S.; Omlin, A.; Gillessen, S.; Claassen, M.; Aebersold, R., 27 28 Similarities and differences of blood N-Glycoproteins in five solid carcinomas at 29 30 localized clinical stage analyzed by SWATH-MS. Cell reports 2018, 23 (9), 2819- 31 32 2831.e5. 33 71. Murray, L. A.; Sheng, X.; Cristea, I. M., Orchestration of protein acetylation as a 34 35 toggle for cellular defense and virus replication. Nature communications 2018, 9 (1), 36 37 4967. 38 39 72. Di Francesco, L.; Di Girolamo, F.; Mennini, M.; Masotti, A.; Salvatori, G.; Rigon, 40 G.; Signore, F.; Pietrantoni, E.; Scapaticci, M.; Lante, I.; Goffredo, B. M.; Mazzina, O.; 41 42 Elbousify, A. I.; Roncada, P.; Dotta, A.; Fiocchi, A.; Putignani, L., A MALDI-TOF MS 43 44 approach for mammalian, human, and formula milks' profiling. Nutrients 2018, 10 (9). 45 46 73. Piazza, I.; Kochanowski, K.; Cappelletti, V.; Fuhrer, T.; Noor, E.; Sauer, U.; 47 Picotti, P., A map of protein-metabolite interactions reveals principles of chemical 48 49 communication. Cell 2018, 172 (1-2), 358-372.e23. 50 51 74. Barkovits, K.; Linden, A.; Galozzi, S.; Schilde, L.; Pacharra, S.; Mollenhauer, B.; 52 Stoepel, N.; Steinbach, S.; May, C.; Uszkoreit, J.; Eisenacher, M.; Marcus, K., 53 54 Characterization of cerebrospinal fluid via data-independent acquisition mass 55 56 spectrometry. Journal of proteome research 2018, 17 (10), 3418-3430. 57 58 43 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 44 of 47

1 2 3 4 75. Clark, D. J.; Hu, Y.; Bocik, W.; Chen, L.; Schnaubelt, M.; Roberts, R.; Shah, P.; 5 Whiteley, G.; Zhang, H., Evaluation of NCI-7 cell line panel as a reference material for 6 7 clinical proteomics. Journal of proteome research 2018, 17 (6), 2205-2215. 8 9 76. Liao, B.; Ning, Z.; Cheng, K.; Zhang, X.; Li, L.; Mayne, J.; Figeys, D., iMetaLab 10 11 1.0: a web platform for data analysis. Bioinformatics (Oxford, England) 12 2018, 34 (22), 3954-3956. 13 14 77. Lill, J. R.; van Veelen, P. A.; Tenzer, S.; Admon, A.; Caron, E.; Elias, J. E.; Heck, 15 16 A. J. R.; Marcilla, M.; Marino, F.; Muller, M.; Peters, B.; Purcell, A.; Sette, A.; Sturm, T.; 17 18 Ternette, N.; Vizcaino, J. A.; Bassani-Sternberg, M., Minimal information about an 19 immuno-peptidomics experiment (MIAIPE). Proteomics 2018, 18 (12), e1800110. 20 21 78. Kopylov, A. T.; Ponomarenko, E. A.; Ilgisonis, E. V.; Pyatnitskiy, M. A.; Lisitsa, A. 22 23 V.; Poverennaya, E. V.; Kiseleva, O. I.; Farafonova, T. E.; Tikhonova, O. V.; Zavialova, 24 25 M. G.; Novikova, S. E.; Moshkovskii, S. A.; Radko, S. P.; Morukov, B. V.; Grigoriev, A. 26 I.; Paik, Y. K.; Salekdeh, G. H.; Urbani, A.; Zgoda, V. G.; Archakov, A. I., 200+ protein 27 28 concentrations in healthy human : targeted quantitative SRM SIS 29 30 screening of Chromosomes 18, 13, Y, and the mitochondrial chromosome encoded 31 32 proteome. Journal of proteome research 2019, 18 (1), 120-129. 33 79. Uhlen, M.; Fagerberg, L.; Hallstrom, B. M.; Lindskog, C.; Oksvold, P.; 34 35 Mardinoglu, A.; Sivertsson, A.; Kampf, C.; Sjostedt, E.; Asplund, A.; Olsson, I.; Edlund, 36 37 K.; Lundberg, E.; Navani, S.; Szigyarto, C. A.; Odeberg, J.; Djureinovic, D.; Takanen, J. 38 39 O.; Hober, S.; Alm, T.; Edqvist, P. H.; Berling, H.; Tegel, H.; Mulder, J.; Rockberg, J.; 40 Nilsson, P.; Schwenk, J. M.; Hamsten, M.; von Feilitzen, K.; Forsberg, M.; Persson, L.; 41 42 Johansson, F.; Zwahlen, M.; von Heijne, G.; Nielsen, J.; Ponten, F., Proteomics. Tissue- 43 44 based map of the human proteome. Science 2015, 347 (6220), 1260419. 45 46 80. Uhlen, M.; Zhang, C.; Lee, S.; Sjostedt, E.; Fagerberg, L.; Bidkhori, G.; Benfeitas, 47 R.; Arif, M.; Liu, Z.; Edfors, F.; Sanli, K.; von Feilitzen, K.; Oksvold, P.; Lundberg, E.; 48 49 Hober, S.; Nilsson, P.; Mattsson, J.; Schwenk, J. M.; Brunnstrom, H.; Glimelius, B.; 50 51 Sjoblom, T.; Edqvist, P. H.; Djureinovic, D.; Micke, P.; Lindskog, C.; Mardinoglu, A.; 52 Ponten, F., A pathology atlas of the human cancer transcriptome. Science 2017, 357 53 54 (6352). 55 56 57 58 44 59 60 ACS Paragon Plus Environment Page 45 of 47 Journal of Proteome Research

1 2 3 4 81. Thul, P. J.; Akesson, L.; Wiking, M.; Mahdessian, D.; Geladaki, A.; Ait Blal, H.; 5 Alm, T.; Asplund, A.; Bjork, L.; Breckels, L. M.; Backstrom, A.; Danielsson, F.; 6 7 Fagerberg, L.; Fall, J.; Gatto, L.; Gnann, C.; Hober, S.; Hjelmare, M.; Johansson, F.; 8 9 Lee, S.; Lindskog, C.; Mulder, J.; Mulvey, C. M.; Nilsson, P.; Oksvold, P.; Rockberg, J.; 10 11 Schutten, R.; Schwenk, J. M.; Sivertsson, A.; Sjostedt, E.; Skogs, M.; Stadler, C.; 12 Sullivan, D. P.; Tegel, H.; Winsnes, C.; Zhang, C.; Zwahlen, M.; Mardinoglu, A.; Ponten, 13 14 F.; von Feilitzen, K.; Lilley, K. S.; Uhlen, M.; Lundberg, E., A subcellular map of the 15 16 human proteome. Science 2017, 356 (6340). 17 18 82. Regev, A.; Teichmann, S. A.; Lander, E. S.; Amit, I.; Benoist, C.; Birney, E.; 19 Bodenmiller, B.; Campbell, P.; Carninci, P.; Clatworthy, M.; Clevers, H.; Deplancke, B.; 20 21 Dunham, I.; Eberwine, J.; Eils, R.; Enard, W.; Farmer, A.; Fugger, L.; Gottgens, B.; 22 23 Hacohen, N.; Haniffa, M.; Hemberg, M.; Kim, S.; Klenerman, P.; Kriegstein, A.; Lein, E.; 24 25 Linnarsson, S.; Lundberg, E.; Lundeberg, J.; Majumder, P.; Marioni, J. C.; Merad, M.; 26 Mhlanga, M.; Nawijn, M.; Netea, M.; Nolan, G.; Pe'er, D.; Phillipakis, A.; Ponting, C. P.; 27 28 Quake, S.; Reik, W.; Rozenblatt-Rosen, O.; Sanes, J.; Satija, R.; Schumacher, T. N.; 29 30 Shalek, A.; Shapiro, E.; Sharma, P.; Shin, J. W.; Stegle, O.; Stratton, M.; Stubbington, 31 32 M. J. T.; Theis, F. J.; Uhlen, M.; van Oudenaarden, A.; Wagner, A.; Watt, F.; Weissman, 33 J.; Wold, B.; Xavier, R.; Yosef, N., The Human Cell Atlas. eLife 2017, 6. 34 35 83. Sullivan, D. P.; Winsnes, C. F.; Akesson, L.; Hjelmare, M.; Wiking, M.; Schutten, 36 37 R.; Campbell, L.; Leifsson, H.; Rhodes, S.; Nordgren, A.; Smith, K.; Revaz, B.; 38 39 Finnbogason, B.; Szantner, A.; Lundberg, E., Deep learning is combined with massive- 40 scale citizen science to improve large-scale image classification. Nat Biotechnol 2018, 41 42 36 (9), 820-828. 43 44 84. Edfors, F.; Hober, A.; Linderback, K.; Maddalo, G.; Azimi, A.; Sivertsson, A.; 45 46 Tegel, H.; Hober, S.; Szigyarto, C. A.; Fagerberg, L.; von Feilitzen, K.; Oksvold, P.; 47 Lindskog, C.; Forsstrom, B.; Uhlen, M., Enhanced validation of antibodies for research 48 49 applications. Nature communications 2018, 9 (1), 4130. 50 51 85. Sjostedt, E.; Sivertsson, A.; Hikmet Noraddin, F.; Katona, B.; Nasstrom, A.; Vuu, 52 J.; Kesti, D.; Oksvold, P.; Edqvist, P. H.; Olsson, I.; Uhlen, M.; Lindskog, C., Integration 53 54 of transcriptomics and antibody-based proteomics for exploration of proteins expressed 55 56 in specialized tissues. Journal of proteome research 2018, 17 (12), 4127-4137. 57 58 45 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 46 of 47

1 2 3 4 86. Fredolini, C.; Bystrom, S.; Sanchez-Rivera, L.; Ioannou, M.; Tamburro, D.; 5 Ponten, F.; Branca, R. M.; Nilsson, P.; Lehtio, J.; Schwenk, J. M., Systematic 6 7 assessment of antibody selectivity in plasma based on a resource of enrichment 8 9 profiles. Scientific reports 2019, 9 (1), 8324. 10 11 87. Wang, D.; Eraslan, B.; Wieland, T.; Hallstrom, B.; Hopf, T.; Zolg, D. P.; Zecha, J.; 12 Asplund, A.; Li, L. H.; Meng, C.; Frejno, M.; Schmidt, T.; Schnatbaum, K.; Wilhelm, M.; 13 14 Ponten, F.; Uhlen, M.; Gagneur, J.; Hahne, H.; Kuster, B., A deep proteome and 15 16 transcriptome abundance atlas of 29 healthy human tissues. Molecular systems biology 17 18 2019, 15 (2), e8503. 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 46 59 60 ACS Paragon Plus Environment Page 47 of 47 Journal of Proteome Research

1 2 3 4 TOC graphic 5 6 7 8 9 10 11 12 13 14 15 16 17 Human Proteome Protein Existence Metrics (neXtProt Release 2019-01-11) 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 47 59 60 ACS Paragon Plus Environment