2018-2019 Metrics from the HUPO Human Proteome Project

2018-2019 Metrics from the HUPO Human Proteome Project

Journal of Proteome Research This document is confidential and is proprietary to the American Chemical Society and its authors. Do not copy or disclose without written permission. If you have received this item in error, notify the sender and delete all copies. Progress on Identifying and Characterizing the Human Proteome: 2018-2019 Metrics from the HUPO Human Proteome Project Journal: Journal of Proteome Research Manuscript ID Draft Manuscript Type: Perspective Date Submitted by the n/a Author: Complete List of Authors: Omenn, Gilbert; University of Michican Medical School, Center for Computational Medicine and Bioinformatics Lane, Lydie; Swiss Institute of Bioinformatics, Overall, Christopher; The University of British Columbia, Departments of Oral Biological & Medical Sciences, and Biochemistry & Molecular Biology; The University of British Columbia, Departments of Oral Biological & Medical Sciences, and Biochemistry & Molecular Biology Corrales, Fernando; Centro Nacional de Biotecnologia, Functional Proteomics Schwenk, Jochen; KTH - Royal Institute of Technology, Science for Life Laboratory Paik, Young-Ki; Yonsei University, Yonsei Proteome Research Center Van Eyk, Jennifer; Cedars-Sinai Medical Center, Medicine Liu, Siqi; BGI-Shenzhen, Pennington, Stephen; University College Dublin, UCD Conway Insitute Snyder, Michael; Stanford University, Genetics Baker, Mark; Macquarie University, Biomedical Sciences Deutsch, Eric; Institute for Systems Biology, ACS Paragon Plus Environment Page 1 of 47 Journal of Proteome Research 1 2 3 4 5 6 7 PERSPECTIVE 8 9 10 11 12 13 Progress on Identifying and Characterizing the 14 15 16 17 18 Human Proteome: 2018-2019 Metrics from the 19 20 21 22 HUPO Human Proteome Project 23 24 25 26 27 Gilbert S. Omenn* × π, Lydie Lane∞, Christopher M. Overall, Fernando J. CorralesA, 28 29 30 Jochen M. SchwenkB, Young-Ki PaikC, Jennifer E. Van EykD, Siqi LiuE, Stephen 31 32 33 34 PenningtonH, Michael P. SnyderF, Mark S. BakerG, Eric W. Deutschπ 35 36 37 38 × Department of Computational Medicine and Bioinformatics, University of Michigan, 39 40 41 42 100 Washtenaw Avenue, Ann Arbor, Michigan 48109-2218, United States 43 44 45 ∞ CALIPHO Group, SIB Swiss Institute of Bioinformatics and Department of 46 47 48 49 Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, CMU, 50 51 52 Michel-Servet 1, 1211 Geneva 4, Switzerland 53 54 55 56 57 58 1 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 2 of 47 1 2 3 4 Life Sciences Institute, Faculty of Dentistry, University of British Columbia, 2350 Health 5 6 7 Sciences Mall, Room 4.401, Vancouver, BC Canada V6T 1Z3 8 9 10 A 11 Centro Nacional de Biotecnologia (CSIC), Darwin 3, 28049, Madrid 12 13 14 B Science for Life Laboratory, KTH Royal Institute of Technology, Tomtebodavägen 15 16 17 18 23A, 17165 Solna, Sweden 19 20 21 22 C Yonsei Proteome Research Center, Room 425, Building #114, Yonsei University, 23 24 25 26 50 Yonsei-ro, Seodaemoon-ku, Seoul 120-749, South Korea 27 28 29 30 D Advanced Clinical BioSystems Research Institute, Cedars Sinai Precision Biomarker 31 32 33 34 Laboratories, Barbra Streisand Women’s Heart Center, Cedars-Sinai Medical Center, 35 36 37 Los Angeles, CA 90048, United States 38 39 40 41 E 42 BGI Group-Shenzhen, Yantian District, Shenzhen, China 518083 43 44 45 46 H School of Medicine, University College Dublin, Conway Institute Belfield Dublin 4 47 48 49 50 F 51 Department of Genetics, Stanford University, Alway Building, 300 Pasteur Drive and 52 53 54 3165 Porter Drive, Palo Alto, 94304, United States 55 56 57 58 2 59 60 ACS Paragon Plus Environment Page 3 of 47 Journal of Proteome Research 1 2 3 G 4 Department of Biomedical Sciences, Faculty of Medicine & Health Sciences, 5 6 7 Macquarie University, 75 Talavera Rd, North Ryde, 2109 NSW 2109 Australia 8 9 10 11 π 12 Institute for Systems Biology, 401 Terry Avenue North, Seattle, Washington 98109- 13 14 15 5263, United States 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 3 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 4 of 47 1 2 3 4 5 6 7 8 Corresponding Author: Gilbert S. Omenn, Department of Computational Medicine and 9 10 11 12 Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, Michigan 13 14 15 48109-2218, United States, [email protected], 734-763-7583. 16 17 18 19 20 21 22 23 24 ABSTRACT: The Human Proteome Project (HPP) annually reports on progress made 25 26 27 28 throughout the field in credibly identifying and characterizing the complete human protein 29 30 31 parts list and making proteomics an integral part of multi-omics studies in medicine and 32 33 34 35 the life sciences. NeXtProt release 2019-01-11 contains 17 694 proteins with strong 36 37 38 protein-level evidence (PE1), compliant with HPP Guidelines for Interpretation of MS Data 39 40 41 42 v2.1; these represent 89% of all 19 823 neXtProt predicted coding genes (all PE1, 2, 3, 43 44 45 4 proteins), up from 17 470 one year earlier. Conversely, the number of neXtProt PE2, 46 47 48 49 3, 4 proteins, termed the “missing proteins” (MPs), has been reduced from 2949 to 2129 50 51 52 since 2016 through efforts throughout the community, including the chromosome-centric 53 54 55 56 HPP. PeptideAtlas is the source of uniformly re-analyzed raw mass spectrometry data 57 58 4 59 60 ACS Paragon Plus Environment Page 5 of 47 Journal of Proteome Research 1 2 3 4 for neXtProt; PeptideAtlas added 495 canonical proteins between 2018 and 2019, 5 6 7 especially from studies designed to detect hard-to-identify proteins. Meanwhile, the 8 9 10 11 Human Protein Atlas has released version 18.1 with immunohistochemical evidence of 12 13 14 expression of 17 000 proteins and survival plots as part of the Pathology Atlas. Many 15 16 17 18 investigators apply multiplexed SRM-targeted proteomics for quantitation of organ- 19 20 21 specific popular proteins in studies of various human diseases. The 19 teams of the 22 23 24 25 Biology and Disease-driven B/D-HPP published a total of 160 publications in 2018, 26 27 28 bringing proteomics to a broad array of biomedical research. 29 30 31 32 33 KEYWORDS: neXtProt Protein Existence metrics, missing proteins (MPs), unannotated 34 35 36 37 protein existence 1 (uPE1), neXt-MP50 and CP50 Challenges, Human Proteome Project 38 39 40 (HPP) Guidelines 3.0, Chromosomal-HPP (C-HPP), Biology & Disease-HPP (B/D-HPP), 41 42 43 44 PeptideAtlas, Human Protein Atlas 45 46 47 48 49 50 51 52 53 54 55 56 57 58 5 59 60 ACS Paragon Plus Environment Journal of Proteome Research Page 6 of 47 1 2 3 4 Progress on the Human Proteome Parts List: neXtProt and PeptideAtlas 5 6 7 8 9 Since its launch in September 2010, the HUPO Human Proteome Project (HPP) has 10 11 12 provided a productive framework for international communication, collaboration, quality 13 14 15 16 assurance, guideline generation, data sharing and re-analysis, and acceleration of 17 18 19 progress in building and utilizing proteomic knowledge globally. The HPP has 50 20 21 22 23 collaborating research teams organized by chromosome (1-22, X, Y) and mitochondria 24 25 26 (C-HPP), biological processes and disease categories (B/D-HPP), and resource pillars 27 28 29 30 for antibody-based protein localization, mass spectrometry, knowledgebases, and 31 32 33 pathology. 34 35 36 Remarkable progress has been documented on the chromosome-centric human 37 38 39 40 proteome parts list, as officially curated by neXtProt1-7. Table 1 shows the increase from 41 42 43 13,975 PE1 proteins in 2012 to 17,694 in release 2019-01. This PE1 total now represents 44 45 46 47 89% of the PE1, 2, 3, 4 proteins predicted to be coded in the latest version of the human 48 49 50 genome (GRCh38). Conversely, the number of yet-to-be detected proteins, termed 51 52 53 54 missing proteins, are those whose evidence is limited to transcript expression (PE2), 55 56 57 58 6 59 60 ACS Paragon Plus Environment Page 7 of 47 Journal of Proteome Research 1 2 3 4 homology from other species (PE3), or gene models (PE4) (MPs), which have declined 5 6 7 from 5511 in 2012 to 2129 in 2019. Splice variants, sequence variants, and some post- 8 9 10 8 11 translational modifications (PTMs) are tabulated for each protein entry . We acknowledge 12 13 14 that neXtProt continues to carry a category PE5, which represents dubious or uncertain 15 16 17 18 genes, including many pseudogenes. However, the HPP in 2013 removed PE5 from the 19 20 21 denominator of all predicted proteins to be found3 and excluded these from the MP-50 22 23 24 7 25 Challenge . 26 27 28 As shown at the bottom of Table 1, there has been a corresponding increase in the 29 30 31 32 number of proteins in the human build of PeptideAtlas. PeptideAtlas provides an 33 34 35 essential HPP function by uniformly processing raw MS data/metadata registered in 36 37 38 9-10 39 ProteomeXchange through its Trans-Proteomic Pipeline (TPP) and applying the 40 41 42 communally-agreed HPP Guidelines 2.1 for Interpretation of Mass Spectrometry Data11. 43 44 45 46 The number of proteins in PeptideAtlas that are assessed as canonical has increased 47 48 49 from 12 509 to 16 293, despite the implementation of more stringent guidelines, which 50 51 52 accounts for the observed dip between 2014 and 2016.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    48 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us