The Chromosome-Centric Human Proteome Project for Cataloging Proteins Encoded in the Genome


CORRESPONDENCE

NATURE BIOTECHNOLOGY VOLUME 30 NUMBER 3 MARCH 2012 221
© 2012 Nature America, Inc. All rights reserved.

To the Editor:

The Chromosome-Centric Human Proteome Project (C-HPP) aims to define the full set of proteins encoded in each chromosome through development of a standardized approach for analyzing the massive proteomic data sets currently being generated from dedicated efforts of national and international teams. The initial goal of the C-HPP is to identify at least one representative protein encoded by each of the approximately 20,300 human genes (refs. 1,2). The proteins will be characterized for tissue localization and major isoforms, including post-translational modifications (PTMs), using quantitative mass spectrometry and antibody reagents. Our rationale is that effective integration of proteomics data into a genomic framework will lead to improved knowledge of complex biological systems and facilitate access to protein level data. Although the intent to engage in a C-HPP program has been noted (refs. 1–3), our objective here is to define the goals and process for its development as a multinational program.

Over the past three years, the Human Proteome Organization (HUPO) has developed a strategy for the first phase of the Human Proteome Project (HPP; http://thehpp.org/; Supplementary Fig. 1). HPP1 goals will be achieved through cooperation with the C-HPP to characterize the human proteome on a chromosome-by-chromosome basis and with the biology- and disease-driven projects (B/D-HPP). Human genome studies, such as the 1000 Genomes Project and ENCODE, and transcriptome sequencing provide a basis for identification of protein isoforms generated by alternative splicing transcripts (ASTs) and by nonsynonymous single-nucleotide polymorphisms (nsSNPs; Supplementary Fig. 2). Additional protein forms will be identified through characterization of post-translational modifications. A basic premise of the HPP is that C-HPP data sets will have substantial utility for biological and disease studies. With development of new tools for in-depth characterization of the transcriptome and proteome, the HPP is well positioned to have a strategic role in addressing the complexity of human phenotypes. With this in mind, the HUPO has organized national chromosome teams that will collaborate with well-established laboratories building complementary proteotypic peptides, antibodies and informatics resources.

An important C-HPP goal is to encourage capture and open sharing of proteomic data sets from diverse samples to enhance a gene- and chromosome-centric display. This will display several layers of biological information on a common reference platform comparable to a genome browser. Such context will effectively integrate transcriptomics data such as RNA-Seq with proteomic data sets (Fig. 1).

Although the C-HPP program has some similarities to the Human Genome Project (HGP; ref. 4) in its quest for complete coverage across the genome, the C-HPP has the added challenge of characterizing protein expression at the tissue, cellular and subcellular levels, as well as PTMs, ASTs and protease-processed protein variants. An example of protein variation is shown for six selected genes on chromosome 13 (BRCA2, 3 ASTs and 54 SNPs in protein-coding regions (nsSNPs); RB1, 2 ASTs and 3 nsSNPs; and IRS2, 1 AST and 3 nsSNPs) and chromosome 17 (BRCA1, 24 ASTs and 24 nsSNPs; ERBB2, 6 ASTs and 13 nsSNPs; and TP53, 14 ASTs and 5 nsSNPs; Table 1 and Supplementary Table 1).

Table 1 Features of salient genes on chromosomes 13 and 17

Gene(a)          AST   nsSNPs
Chromosome 13
  BRCA2            3       54
  RB1              2        3
  IRS2             1        3
Chromosome 17
  BRCA1           24       24
  ERBB2            6       13
  TP53            14        5

(a) Ensembl protein and AST information can be found at http://www.ensembl.org/Homo_sapiens/. AST, alternative splicing transcript; nsSNP, nonsynonymous single-nucleotide polymorphism assembled from data from the 1000 Genomes Project.

The C-HPP will build on the three HPP pillars that provide both technology and resources for mapping the human proteome: mass spectrometry–based SRMAtlas, antibody reagents in the Human Protein Atlas and bioinformatics knowledge linked by ProteomeXchange, specifically the proteomics identification database (PRIDE), Tranche, PeptideAtlas, the global proteome machine database (GPMDB), UniProt and neXtProt (Supplementary Fig. 3).

The C-HPP does not propose any alteration in the workflow of a typical proteomics laboratory; instead, it seeks more effective use of data encompassed in existing bioinformatics resources, which will be combined with targeted studies to generate a robust list of observed protein isoforms (Supplementary Fig. 3). A potential challenge to data collection from different laboratories is the diversity of instrument and bioinformatics platforms and quality criteria. The C-HPP will work closely with proteomics journals, and use existing data (GPMDB and PeptideAtlas), literature curation (UniProt and neXtProt) and standardization programs (PSI, CPTAC, Unimod, ABRF and ASMS) to ensure that the data collection is efficient, with consistent quality assurance and quality control. Journal mandates for deposition of raw data upon publication will reinforce this process (ref. 5). The C-HPP has already encouraged formation of chromosome-formatted databases (http://www.nextprot.org/; http://www.gpm.org/) in which new data sets are integrated with existing ones. In this manner the C-HPP will capture the protein evidence emerging from the hundreds of laboratories worldwide engaged in hypothesis-driven […]

[…] example of such a global view for selected regions of chromosomes 13 and 17 (Fig. 1) summarizes the following extensive data sets on the basis of existing data compilations: protein evidence, mass spectrometry data, antibody availability, major PTMs, disease information and transcript level, including ASTs from three different samples in a format viewable for associations between data sets and information gaps in specific chromosome regions.

In phase 1 (~6 years), the C-HPP plans to map all proteins currently lacking high-quality mass spectrometry evidence, three major classes of PTMs, many representative AST products (ref. 6) and many nsSNP sequence variants. The characterizations will be followed by antibody-based detection in selected tissues and cell lines. In phase 2 (~4 years), identified proteins will be characterized and validated with additional proteomic and antibody measurements. Throughout this 10-year project, the C-HPP aims to generate information useful in the search for new biomarkers and drug targets and also in the study of disease gene families clustered in each chromosome (for example, the cytokeratin gene family in chromosome 17). C-HPP outputs will be integrated with output from the parallel B/D-HPP project.

The C-HPP has selected the UniProt protein list (based on Ensembl genome builds) as the starting point for identified proteins. Individual chromosome teams will use information collected in well-annotated databases (for example, GPMDB, PeptideAtlas and neXtProt) to develop a list of missing or poorly identified proteins for a particular chromosome. A plot of

[Figure 1: panels a–d; each panel lists gene annotations under the column headings Molecular function, Biological process and Cellular component.]

Figure 1 Genomic, transcriptomic and protein information for the set of genes present in selected regions of chromosomes 13 and 17. (a,b) The information provides a comprehensive landscape with respect to protein evidence, quality of mass spectrometry–based protein identification, availability of antibody, disease relationship, and phosphorylation, acetylation, glycosylation and transcriptomic information. It shows the degree of protein annotation on two important regions on chromosomes 13 (a) and 17 (b) and regions with little annotated protein information on chromosomes 13 (c) and 17 (d). PE, protein evidence from UniProt; Mq, mass quality from GPMDB; Mo, number of mass spectrometry data sets in GPMDB; Ab, antibody availability; Di, disease information; Ph, Ac and Gl, phosphoryl, acetyl and glyco, respectively; Pt, placenta transcript; Pa, placenta