Proteomics-Based Refinement of Deinococcus
Total Page:16
File Type:pdf, Size:1020Kb
Proteomics-based Refinement of Deinococcus deserti Genome Annotation Reveals an Unwonted Use of Non-canonical Translation Initiation Codons Mathieu Baudet, Philippe Ortet, Jean-Charles Gaillard, Bernard Fernandez, Philippe Guérin, Christine Enjalbal, Gilles Subra, Arjan de Groot, Mohamed Barakat, Alain Dedieu, et al. To cite this version: Mathieu Baudet, Philippe Ortet, Jean-Charles Gaillard, Bernard Fernandez, Philippe Guérin, et al.. Proteomics-based Refinement of Deinococcus deserti Genome Annotation Reveals an Unwonted Useof Non-canonical Translation Initiation Codons. Molecular and Cellular Proteomics, American Society for Biochemistry and Molecular Biology, 2010, 9, pp.415-426. 10.1074/mcp.M900359-MCP200. hal- 00581589 HAL Id: hal-00581589 https://hal.archives-ouvertes.fr/hal-00581589 Submitted on 8 Jun 2021 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Distributed under a Creative Commons Attribution| 4.0 International License Research Proteomics-based Refinement of Deinococcus deserti Genome Annotation Reveals an Unwonted Use of Non-canonical Translation Initiation Codons*□S Mathieu Baudet‡, Philippe Ortet§¶ʈ, Jean-Charles Gaillard‡, Bernard Fernandez‡, Philippe Gue´ rin‡, Christine Enjalbal**, Gilles Subra**, Arjan de Groot§¶ʈ, Mohamed Barakat§¶ʈ, Alain Dedieu‡, and Jean Armengaud‡ ‡‡ Deinococcaceae are a family of extremely radiation-toler- from non-canonical codons (ATC and CTG, respectively). ant bacteria that are currently subjected to numerous Such use may be the basis of specific regulation mecha- studies aimed at understanding the molecular mecha- nisms affecting replication and translation. We also report nisms for such radiotolerance. To achieve a comprehen- the use of non-conventional translation initiation codons sive and accurate annotation of the Deinococcus deserti for two other genes: Deide_03051 and infC. Whether such genome, we performed an N terminus-oriented charac- use of non-canonical translation initiation codons is much terization of its proteome. For this, we used a labeling more frequent than for other previously reported bacterial reagent, N-tris(2,4,6-trimethoxyphenyl)phosphonium acetyl phyla or restricted to Deinococcaceae remains to be in- succinimide, to selectively derivatize protein N termini. vestigated. Our results demonstrate that predicting trans- The large scale identification of N-tris(2,4,6-trimethoxy- lation initiation codons is still difficult for some bacteria phenyl)phosphonium acetyl succinimide-modified N-ter- and that proteomics-based refinement of genome anno- minal-most peptides by shotgun liquid chromatography- tations may be helpful in such cases. Molecular & Cel- tandem mass spectrometry analysis led to the validation lular Proteomics 9:415–426, 2010. of 278 and the correction of 73 translation initiation codons in the D. deserti genome. In addition, four new genes were detected, three located on the main chromo- some and one on plasmid P3. We also analyzed signal A comprehensive parts list is required prior to accurate peptide cleavages on a genome-wide scale. Based on global modeling of the complex whole cellular unit. For this comparative proteogenomics analysis, we propose a set reason, the quality of genome annotation is a very important of 137 corrections to improve Deinococcus radiodurans parameter for an efficient system biology study of the cell. and Deinococcus geothermalis gene annotations. Some Besides the combination of gene prediction tools, homology- of these corrections affect important genes involved in based inference, and manual curation (1), the use of experi- DNA repair mechanisms such as polA, ligA, and ddrB. mental data either from massive gene transcript sequencing Surprisingly, experimental evidences were obtained indi- (2) or massive protein sequence determination by mass spec- cating that DnaA (the protein involved in the DNA replica- tion initiation process) and RpsL (the S12 ribosomal con- trometry is currently considered with great interest (3–5). Pro- served protein) translation is initiated in Deinococcaceae teogenomics consists in high throughput identification of pro- teins, their accurate partial sequencing by nano-LC-MS/MS This is an open access article under the CC BY license. shotgun approaches, and the integration of these data with From the ‡Laboratoire de Biochimie des Syste` mes Perturbe´ s, Service de Biochimie et Toxicologie Nucle´ aire, Institut de Biologie Environne- the genome (6). Proteogenomics allows validation of pre- mentale et Biotechnologie (iBEB), Direction des Sciences du Vivant dicted genes and, more importantly, correction of genome (DSV), Commissariat a` l’Energie Atomique et aux Energies Alternatives annotation errors such as discovery of unannotated genes; (CEA), F-30207 Bagnols-sur-Ce` ze, France, §Laboratoire d’Ecologie Mi- reversal of reading frames; identification of translational start crobienne de la Rhizosphe` re et d’Environnements extreˆ mes, Service de sites, stop codon read-throughs, or programmed frameshifts; Biologie Ve´ge´ tale et de Microbiologie Environnementale, iBEB, DSV, CEA, F-13108 Saint-Paul-lez-Durance, France, ¶Unite´ mixte de re- and detection of signal peptide processing and other matu- cherche Biologie Ve´ge´ tale et Microbiologie Environnementales ration events at the protein level. Several studies dedicated to UMR6191, CNRS, F-13108 Saint-Paul-lez-Durance, France, ʈUni- genome reannotation based on experimental proteomics ev- versite´ Aix-Marseille, F-13108 Saint-Paul-lez-Durance, France, and idences have been reported (7–15) and have paved the way **Institut des Biomole´ cules Max Mousseron UMR5247, CNRS-Uni- for the proteogenomics approach. For example, the Myco- versite´ s Montpellier 1 et 2, F-34095 Montpellier, Cedex 5, France Received, August 5, 2009, and in revised form, October 15, 2009 bacterium smegmatis bacterium was subjected to an in-depth Published, MCP Papers in Press, October 29, 2009, DOI 10.1074/ MS/MS analysis over the course of 25 different growth con- mcp.M900359-MCP200 ditions (16). A set of 901 proteins was identified with most of © 2010 by The American Society for Biochemistry and Molecular Biology, Inc. Molecular & Cellular Proteomics 9.2 415 This paper is available on line at http://www.mcponline.org Reannotation of Deinococcus Genomes them annotated as hypothetical. This study was further com- of 15 genes that were not predicted by the two annotation pleted with mass spectrometry analysis and a resequencing softwares that were used. Surprisingly, we had to propose procedure that showed that a significant proportion of inter- reversal of incorrectly predicted orientations of 11 genes. In rupted coding sequences in this genome were sequencing this previous study, we checked the whole MS/MS data set errors (17). More recently, a reannotation of the genome was for N-terminal peptides and found 212 distinct peptide signa- proposed after a novel mass spectrometry analysis in which tures corresponding to N termini of 145 proteins. These data 946 distinct proteins were uncovered (18). confirmed the starts of 112 proteins but also corrected the Although proteogenomics has greatly improved over the starts of 33 polypeptides that were incorrectly predicted even last years, annotation of a significant number of N-terminal after manual inspection (25). Although several proteomics starts is still a challenging task whatever the genome under analyses have been carried out on Deinococcus radiodurans, consideration. Large scale sequence determination of N-ter- the radioresistant model among bacteria (26), none has been minal peptides has been reported for the Aeropyrum pernix specifically focused on genome reannotation (27–32). The K1 crenarchaeon (19), the Halobacterium salinarum and Na- genome annotation of D. radiodurans was among the first tronomonas pharaonis halophilic euryarchaeota (20), and the complete bacterial genome annotations ever reported (33). Shewanella oneidensis (21) and M. smegmatis bacteria (18). Those of Deinococcus geothermalis (34) and Thermus ther- Among these reports, the most unexpected result consisted mophilus HB27 (35) have been reported more recently, allow- in the discovery that TTG was the most used translational ing a better genome coverage of the Deinococcus-Thermus initiation codon, far more common than ATG and GTG in A. phylum. pernix (19). In high throughput nano-LC-MS/MS studies, low That one-fifth of the detected N termini were not correctly sequence coverage is observed for most proteins. It results in predicted in our previous D. deserti proteogenomics study a low coverage of N-terminal-most peptides. Over the last 7 (25) led us to develop a specific strategy for identifying N- years, a series of specific strategies have been devised to terminal most-peptides on a very large scale for D. deserti systematically catalogue N-terminal-most peptides (for a re- VCD115 proteome. We labeled the proteome of cells har- view, see Ref. 4). The most comprehensive studies reported vested in exponential and prestationary growth phases with up