Proteogenomic analysis and global discovery of posttranslational modifications in prokaryotes

Ming-kun Yang(a,1), Yao-hua Yang(a,1), Zhuo Chen(a), Jia Zhang(a), Yan Lin(a), Yan Wang(a), Qian Xiong(a), Tao Li(a,2), Feng Ge(a,2), Donald A. Bryant(b), and Jin-dong Zhao(a)

(a) Key Laboratory of Algal Biology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan 430072, China; and (b) Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802

Edited* by Jiayang Li, Chinese Academy of Sciences, Beijing, China, and approved November 13, 2014 (received for review July 6, 2014)

PNAS | Published online December 15, 2014 | E5633–E5642 | www.pnas.org/cgi/doi/10.1073/pnas.1412722111

We describe an integrated workflow for proteogenomic analysis and global profiling of posttranslational modifications (PTMs) in prokaryotes and use the model cyanobacterium Synechococcus sp. PCC 7002 (hereafter Synechococcus 7002) as a test case. We found more than 20 different kinds of PTMs, and a holistic view of PTM events in this organism grown under different conditions was obtained without specific enrichment strategies. Among 3,186 predicted protein-coding genes, 2,938 gene products (>92%) were identified. We also identified 118 previously unidentified proteins and corrected 38 predicted gene-coding regions in the Synechococcus 7002 genome. This systematic analysis not only provides comprehensive information on protein profiles and the diversity of PTMs in Synechococcus 7002 but also provides some insights into photosynthetic pathways in cyanobacteria. The entire proteogenomics pipeline is applicable to any sequenced prokaryotic organism, and we suggest that it should become a standard part of genome annotation projects.

proteogenomics | post-translational modifications | cyanobacteria | photosynthesis | Synechococcus

Significance

Proteogenomics is the application of mass spectrometry-derived proteomic data for testing and refining predicted genetic models. Cyanobacteria, the only prokaryotes capable of oxygenic photosynthesis, are the ancestor of chloroplasts in plants and play crucial roles in global carbon and nitrogen cycles. An integrated proteogenomic workflow was developed, and we tested this system on a model cyanobacterium, Synechococcus 7002, grown under various conditions. We obtained a nearly complete genome translational profile of this model organism. In addition, a holistic view of posttranslational modification (PTM) events is provided using the same dataset, and the results provide insights into photosynthesis. The entire proteogenomics pipeline is applicable to any sequenced prokaryotes and could be applied as a standard part of genome annotation projects.

Proteogenomics refers to the correlation of mass spectrometry-derived proteomic data to refine genome annotation (1) and has been applied to the identification of previously unidentified genes and the correction and validation of predicted genes in various organisms (2–4). It is an important tool for integrating protein-level information into the genome annotation process and can greatly improve genome annotation quality. The same experimental proteomic datasets are also useful in identifying posttranslational modifications (PTMs) on a proteome-wide level (5, 6). Many cellular proteins undergo appreciable amounts of PTM in response to certain stimuli, and this dynamic process occurs in various cell compartments to dictate the fate and activity of the modified proteins (7). Identification and mapping of PTMs in proteins have been improved dramatically, mainly due to increases in the sensitivity, speed, accuracy, and resolution of mass spectrometry (MS). However, system-wide identification of multiple PTMs remains a highly challenging task, especially in situations where some reversible PTMs are induced by a particular stimulus and are present for only a short period (8). To the best of our knowledge, very few reports of proteogenomic datasets have presently been used to analyze PTM events comprehensively in a genome sequence (9, 10).

In this study, we developed a proteogenomic approach to carry out genome annotation and whole-proteome analysis of PTMs in prokaryotes by using high resolution and high accuracy MS data and the cyanobacterium Synechococcus sp. PCC 7002 (hereafter Synechococcus 7002) as a test case. Cyanobacteria are a morphologically diverse group of Gram-negative bacteria and are the only prokaryotes capable of oxygenic photosynthesis (11). It is estimated that more than half of the photosynthetic activity on Earth is contributed by cyanobacteria (12). Cyanobacteria make substantial contributions to global CO2 assimilation, O2 production, and N2 fixation and are the progenitors of chloroplasts in higher plants (13). Cyanobacterial habitats are highly diverse, and cyanobacterial cells adjust their cellular activities in response to a wide range of environmental cues and stimuli. Recently, cyanobacteria have attracted great interest due to their crucial roles in global carbon and nitrogen cycles and their ability to produce clean and renewable biofuels such as hydrogen (14–16).

Synechococcus 7002 is a unicellular, marine cyanobacterium and a model organism for studying photosynthetic carbon fixation and the development of biofuels (17, 18). However, whereas the genome of Synechococcus 7002 is fully sequenced, it is annotated only by in silico methods (www.ncbi.nlm.nih.gov/), with a large portion (1,210 out of 3,186) of protein-coding genes annotated as hypothetical proteins (17). Therefore, a comprehensive analysis is needed to provide experimental support for the genome annotation so as to facilitate systems-level analysis. Using our method, we performed the validation of the predicted protein-coding genes, identified previously unidentified genes, and corrected gene initiation and stop-codon positions in Synechococcus 7002, and directional RNA-Seq was used to determine the existence of a number of previously unidentified genes identified in this study. More importantly, we characterized PTM features on a proteome-wide level using the same experimental proteomic datasets and identified many different PTM types that may play important roles in cellular functions. Our proteogenomic data provided significant information for revising the genome annotation of Synechococcus 7002 and offered insights into the physiology of this model organism. The method and approach can also be used to study genome annotation and cellular protein PTMs in other organisms.

Author contributions: T.L., F.G., and J.-d.Z. designed research; M.-k.Y., Y.-h.Y., Z.C., Y.L., Y.W., T.L., and F.G. performed research; D.A.B. contributed new reagents/analytic tools; M.-k.Y., Y.-h.Y., Z.C., J.Z., Y.L., Q.X., T.L., F.G., and J.-d.Z. analyzed data; and M.-k.Y., Z.C., T.L., F.G., D.A.B., and J.-d.Z. wrote the paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

Freely available online through the PNAS open access option.

Data deposition: MS raw data files, processed data files, annotated peptide spectral matches for all accepted single peptide hits, annotated peptide spectrum matches of identified peptides with stop codon read-through, annotated spectra of all identified peptides with specific PTMs, snapshots of genome browser for all the previously unidentified genes along with the corrections to the existing gene models identified in this study, Tables S1–S9, RNA-Seq data, and in-house integrated proteogenomic analysis software have been deposited in the PeptideAtlas repository (www.peptideatlas.org) (accession no. PASS00285).

(1) M.-k.Y. and Y.-h.Y. contributed equally to this work.

(2) To whom correspondence may be addressed. Email: [email protected] or [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1412722111/-/DCSupplemental.

Results

Proteogenomic Strategy for the Analysis of Synechococcus [...]

Fig. 1. Experimental and bioinformatic workflow of the proteogenomic analysis. Protein extracts were prepared from Synechococcus 7002 cultures grown to exponential phase (flask 1) and stationary phase (flask 2) and exposed to stresses, including iron deficiency (flask 3), phosphate deficiency (flask 4), nitrogen deficiency (flask 5), A5 deficiency (flask 6), high light (flask 7), and high salt concentration (flask 8) (2.5 M). After protein extraction, proteins are subjected to Glu-C and Trypsin digestion, producing peptide mixture. The mixture was analyzed by means of nano-LC-MS/MS with an LTQ-Orbitrap Elite mass spectrometer. MS/MS peptide spectra were searched against specific organism genome sequences, validating and correcting genomic annotations, as well as identifying previously unidentified protein-coding genes and diverse PTMs.

[...] identified from the two fractionation methods applied for analyzing the cell-lysate proteins. Sequest, Mascot, MaxQuant, pFind, and X!Tandem searches identified 238,918; 239,779; 252,696; 241,601; and 268,763 peptides, respectively, resulting in the identification of 55,862 unique peptides.

Protein Expression Analysis of Synechococcus. The CyanoBase database (genome.microbedb.jp/cyanobase/) lists 3,186 protein-coding genes in the Synechococcus 7002 genome (3.41 Mb, released 2012). Unique peptides were mapped onto the genome-translated database, and proteins identified by at least two unique peptides or [...]
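The peptide bookkeeping described above (pooling peptide identifications from several search engines into one set of unique sequences, mapping them onto a protein database, and counting the proteins they support) can be illustrated with a minimal Python sketch. It is not the authors' in-house software: the peptide and protein sequences are invented toy data, exact substring matching stands in for the real mapping step, and `min_peptides=2` covers only the "at least two unique peptides" part of the acceptance criterion, whose remainder is cut off in the text. All function names are hypothetical.

```python
from collections import defaultdict

# Toy, invented data: peptides reported by each search engine (assumption).
ENGINE_HITS = {
    "Sequest": {"MKTAYIAK", "LLGDFFR", "VATVSLPR"},
    "Mascot": {"MKTAYIAK", "VATVSLPR", "GQELLK"},
    "X!Tandem": {"LLGDFFR", "GQELLK", "AAEDR"},
}

# Toy, invented protein database: identifier -> amino acid sequence (assumption).
PROTEINS = {
    "P1": "MKTAYIAKQRLLGDFFRK",
    "P2": "VATVSLPRGQELLK",
    "P3": "AAEDRSSK",
}


def merge_unique_peptides(hits_by_engine):
    """Pool the peptides from all engines into one set of distinct sequences."""
    unique = set()
    for peptides in hits_by_engine.values():
        unique |= peptides
    return unique


def map_peptides(peptides, database):
    """Assign each peptide to every protein that contains it as an exact substring."""
    mapping = defaultdict(set)
    for peptide in peptides:
        for protein_id, sequence in database.items():
            if peptide in sequence:
                mapping[protein_id].add(peptide)
    return mapping


def identified_proteins(mapping, min_peptides=2):
    """Keep proteins supported by at least `min_peptides` distinct peptides."""
    return sorted(pid for pid, peps in mapping.items() if len(peps) >= min_peptides)


def coverage_percent(identified, total):
    """Share of annotated genes with an identified product, in percent."""
    return 100.0 * identified / total


unique_peptides = merge_unique_peptides(ENGINE_HITS)
peptide_map = map_peptides(unique_peptides, PROTEINS)
print(len(unique_peptides), identified_proteins(peptide_map))
# Figures from the abstract: 2,938 of 3,186 predicted genes were identified.
print(round(coverage_percent(2938, 3186), 1))
```

Because the engines report overlapping peptides, the merged set is much smaller than the sum of the per-engine counts, which is why the five searches quoted above collapse to 55,862 unique peptides. The last line reproduces the ">92%" coverage figure from the abstract.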