Mipepid: Micropeptide Identification Tool Using Machine Learning Mengmeng Zhu1,2 and Michael Gribskov2*
Total Page:16
File Type:pdf, Size:1020Kb

Load more
Recommended publications
-
Requirement of the Fusogenic Micropeptide Myomixer for Muscle Formation in Zebrafish
Requirement of the fusogenic micropeptide myomixer for muscle formation in zebrafish Jun Shia,b,1, Pengpeng Bia,b,c,1, Jimin Peid, Hui Lia,b,c, Nick V. Grishind,e, Rhonda Bassel-Dubya,b,c, Elizabeth H. Chena,b,2, and Eric N. Olsona,b,c,2 aDepartment of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, TX 75390; bHamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390; cSenator Paul D. Wellstone Muscular Dystrophy Cooperative Research Center, University of Texas Southwestern Medical Center, Dallas, TX 75390; dHoward Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX 75390; and eDepartment of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX 75390 Contributed by Eric N. Olson, September 28, 2017 (sent for review August 29, 2017; reviewed by Chen-Ming Fan and Thomas A. Rando) Skeletal muscle formation requires fusion of mononucleated myo- CRISPR/Cas9, we show that myomixer is required for zebrafish blasts to form multinucleated myofibers. The muscle-specific mem- myoblast fusion. We also identify additional distantly related brane proteins myomaker and myomixer cooperate to drive myomixer orthologs from genomes of turtle and elephant shark, mammalian myoblast fusion. Whereas myomaker is highly conserved which like zebrafish myomixer can substitute for mouse myomixer across diverse vertebrate species, myomixer is a micropeptide that to promote cell fusion together with myomaker in cultured cells. shows relatively weak cross-species conservation. To explore the By comparison of amino acid sequences across these diverse functional conservation of myomixer, we investigated the expres- species, we identify a unique protein motif required for the sion and function of the zebrafish myomixer ortholog. -
Computational Analysis Predicts Hundreds of Coding Lncrnas in Zebrafish
biology Communication Computational Analysis Predicts Hundreds of Coding lncRNAs in Zebrafish Shital Kumar Mishra 1,2 and Han Wang 1,2,* 1 Center for Circadian Clocks, Soochow University, Suzhou 215123, China; [email protected] 2 School of Biology & Basic Medical Sciences, Medical College, Soochow University, Suzhou 215123, China * Correspondence: [email protected] or [email protected]; Tel.: +86-512-6588-2115 Simple Summary: Noncoding RNAs (ncRNAs) regulate a variety of fundamental life processes such as development, physiology, metabolism and circadian rhythmicity. RNA-sequencing (RNA- seq) technology has facilitated the sequencing of the whole transcriptome, thereby capturing and quantifying the dynamism of transcriptome-wide RNA expression profiles. However, much remains unrevealed in the huge noncoding RNA datasets that require further bioinformatic analysis. In this study, we applied six bioinformatic tools to investigate coding potentials of approximately 21,000 lncRNAs. A total of 313 lncRNAs are predicted to be coded by all the six tools. Our findings provide insights into the regulatory roles of lncRNAs and set the stage for the functional investigation of these lncRNAs and their encoded micropeptides. Abstract: Recent studies have demonstrated that numerous long noncoding RNAs (ncRNAs having more than 200 nucleotide base pairs (lncRNAs)) actually encode functional micropeptides, which likely represents the next regulatory biology frontier. Thus, identification of coding lncRNAs from ever-increasing lncRNA databases would be a bioinformatic challenge. Here we employed the Coding Potential Alignment Tool (CPAT), Coding Potential Calculator 2 (CPC2), LGC web server, Citation: Mishra, S.K.; Wang, H. Coding-Non-Coding Identifying Tool (CNIT), RNAsamba, and MicroPeptide identification tool Computational Analysis Predicts (MiPepid) to analyze approximately 21,000 zebrafish lncRNAs and computationally to identify Hundreds of Coding lncRNAs in 2730–6676 zebrafish lncRNAs with high coding potentials, including 313 coding lncRNAs predicted Zebrafish. -
Stem Cell Metabolic and Epigenetic Reprogramming
RESEARCH HIGHLIGHTS Cell 16, 39–50, 2015). They discovered that inactivating Rb facilitates Enhancer evolution in mammals reprogramming of fibroblasts to a pluripotent state. Surprisingly, their Widespread changes in regulatory genomic regions have data indicate that this does not involve interference with the cell cycle but underpinned mammalian evolution, but our knowledge about instead that Rb directly binds to and represses pluripotency-associated these regions is still incomplete. Paul Flicek, Duncan Odom and loci such as Oct4 (Pou5f1) and Sox2. Loss of Rb seems to compensate colleagues have now contributed to a better understanding of for the omission of Sox2 from the cocktail of reprogramming factors. how the noncoding portions of mammalian genomes have been Furthermore, genetic disruption of Sox2 precludes tumor formation in reshaped over the last 180 million years (Cell 160, 554–566, mice lacking functional Rb protein. This study positions Rb as a repres- 2015). They characterized active promoters and enhancers in sor of the pluripotency gene regulatory network and suggests that loss of liver samples from 20 mammalian species—from Tasmanian Rb might clear the path for Sox2, or other master regulators of stem cell devil to human—by examining the genome-wide enrichment identity, to induce cancer. It will be interesting to analyze the potential profiles of H3K4me3 and H3K27ac, two histone modifications role of Rb in other types of in vitro reprogramming. TF associated with transcriptional activity. Their analyses suggest that rapid enhancer evolution and high promoter conservation are fundamental traits of mammalian genomes. Intriguingly, they Micropeptide regulates muscle performance find that the majority of newly evolved enhancers originated via It is becoming increasingly recognized that some transcripts anno- functional exaptation of ancestral DNA and not through clade- tated as long noncoding RNAs (lncRNAs) contain short ORFs that specific expansions of repeat elements. -
Polysome-Profiling in Small Tissue Samples
bioRxiv preprint doi: https://doi.org/10.1101/104596; this version posted February 1, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. Polysome-profiling in small tissue samples Shuo Liang1,#, Hermano Bellato2,#, Julie Lorent1,#, Fernanda Lupinacci2, Vincent Van Hoef1, Laia Masvidal1,*, Glaucia Hajj2,* and Ola Larsson1,* 1. Department of Oncology-Pathology, Karolinska Institutet, Stockholm, 171 77, Sweden 2. International Research Center, A.C. Camargo Cancer Center, São Paulo, Brazil # Equally contributing authors * Correspondence: Laia Masvidal ([email protected]), Glaucia Hajj ([email protected]) and Ola Larsson ([email protected]) ABSTRACT Polysome-profiling is commonly used to study genome wide patterns of translational efficiency, i.e. the translatome. The standard approach for collecting efficiently translated polysome-associated RNA results in laborious extraction of RNA from a large volume spread across multiple fractions. This property makes polysome-profiling inconvenient for larger experimental designs or samples with low RNA amounts such as primary cells or frozen tissues. To address this we optimized a non-linear sucrose gradient which reproducibly enriches for mRNAs associated with >3 ribosomes in only one or two fractions, thereby reducing sample handling 5-10 fold. The technique can be applied to cells and frozen tissue samples from biobanks, and generates RNA with a quality reflecting the starting material. When coupled with smart-seq2, a single-cell RNA sequencing technique, translatomes from small tissue samples can be obtained. Translatomes acquired using optimized non-linear gradients are very similar to those obtained when applying linear gradients. -
Using Ribosome Profiling to Quantify Differences in Protein Expression
bioRxiv preprint doi: https://doi.org/10.1101/501478; this version posted December 19, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. 1 Using ribosome profiling to quantify differences in protein expression: 2 a case study in Saccharomyces cerevisiae oxidative stress conditions 3 4 5 William R. Blevins1,#, Teresa Tavella1,#, Simone G. Moro1, Bernat Blasco-Moreno2, 6 Adrià Closa-Mosquera2, Juana Díez2, Lucas B. Carey2, M. Mar Albà1,2,3,* 7 8 1Evolutionary Genomics Groups, Research Programme on Biomedical Informatics (GRIB), Hospital del 9 Mar Research Institute (IMIM), Barcelona, Spain 10 2Health and Experimental Sciences Department, Universitat Pompeu Fabra(UPF), Barcelona, Spain 11 3Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain. 12 #Shared first co-authorship 13 *To whom correspondence should be addressed. 14 Running title: differential gene translation 15 Keywords: differential gene expression, ribosome profiling, RNA-Seq, translation, oxidative stress 1 bioRxiv preprint doi: https://doi.org/10.1101/501478; this version posted December 19, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. 16 Abstract 17 18 Cells respond to changes in the environment by modifying the concentration of specific 19 proteins. Paradoxically, the cellular response is usually examined by measuring variations 20 in transcript abundance by high throughput RNA sequencing (RNA-Seq), instead of 21 directly measuring protein concentrations. -
The Lncrna Toolkit: Databases and in Silico Tools for Lncrna Analysis
non-coding RNA Review The lncRNA Toolkit: Databases and In Silico Tools for lncRNA Analysis Holly R. Pinkney † , Brandon M. Wright † and Sarah D. Diermeier * Department of Biochemistry, University of Otago, Dunedin 9016, New Zealand; [email protected] (H.R.P.); [email protected] (B.M.W.) * Correspondence: [email protected] † These authors contributed equally to this work. Received: 29 November 2020; Accepted: 15 December 2020; Published: 16 December 2020 Abstract: Long non-coding RNAs (lncRNAs) are a rapidly expanding field of research, with many new transcripts identified each year. However, only a small subset of lncRNAs has been characterized functionally thus far. To aid investigating the mechanisms of action by which new lncRNAs act, bioinformatic tools and databases are invaluable. Here, we review a selection of computational tools and databases for the in silico analysis of lncRNAs, including tissue-specific expression, protein coding potential, subcellular localization, structural conformation, and interaction partners. The assembled lncRNA toolkit is aimed primarily at experimental researchers as a useful starting point to guide wet-lab experiments, mainly containing multi-functional, user-friendly interfaces. With more and more new lncRNA analysis tools available, it will be essential to provide continuous updates and maintain the availability of key software in the future. Keywords: non-coding RNAs; long non-coding RNAs; databases; computational analysis; bioinformatic prediction software; RNA interactions; coding potential; RNA structure; RNA function 1. Introduction For decades, the human genome was thought to be a desert of ‘junk DNA’ with sporadic oases of transcriptionally active genes, most of them coding for proteins. -
Super-Resolution Ribosome Profiling Reveals Unannotated Translation Events in Arabidopsis
Super-resolution ribosome profiling reveals unannotated translation events in Arabidopsis Polly Yingshan Hsua, Lorenzo Calviellob,c, Hsin-Yen Larry Wud,1, Fay-Wei Lia,e,f,1, Carl J. Rothfelse,f, Uwe Ohlerb,c, and Philip N. Benfeya,g,2 aDepartment of Biology, Duke University, Durham, NC 27708; bBerlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, 13125 Berlin, Germany; cDepartment of Biology, Humboldt Universität zu Berlin, 10099 Berlin, Germany; dBioinformatics Research Center and Department of Statistics, North Carolina State University, Raleigh, NC 27695; eUniversity Herbarium, University of California, Berkeley, CA 94720; fDepartment of Integrative Biology, University of California, Berkeley, CA 94720; and gHoward Hughes Medical Institute, Duke University, Durham, NC 27708 Contributed by Philip N. Benfey, September 13, 2016 (sent for review June 30, 2016; reviewed by Pam J. Green and Albrecht G. von Arnim) Deep sequencing of ribosome footprints (ribosome profiling) maps and contaminants. Several metrics associated with translation have and quantifies mRNA translation. Because ribosomes decode mRNA been exploited (11), for example, the following: (i)ribosomesre- every 3 nt, the periodic property of ribosome footprints could be lease after encountering a stop codon (9), (ii) local enrichment of used to identify novel translated ORFs. However, due to the limited footprints within the predicted ORF (4, 13), (iii) ribosome footprint resolution of existing methods, the 3-nt periodicity is observed length distribution (7), and (iv) 3-nt periodicity displayed by trans- mostly in a global analysis, but not in individual transcripts. Here, we lating ribosomes (2, 6, 10, 14, 15). Among these features, some work report a protocol applied to Arabidopsis that maps over 90% of the well in distinguishing groups of coding vs. -
Integrated Analyses of Translatome and Proteome Identify the Rules of Translation Selectivity in RPS14-Deficient Cells
ARTICLE Red Cell Biology and its Disorders Integrated analyses of translatome and Ferrata Storti Foundation proteome identify the rules of translation selectivity in RPS14-deficient cells Ismael Boussaid,1 Salomé Le Goff,1,2,* Célia Floquet,1,* Emilie-Fleur Gautier,1,3 Anna Raimbault,1 Pierre-Julien Viailly,4 Dina Al Dulaimi,1 Barbara Burroni,5 Isabelle Dusanter-Fourt,1 Isabelle Hatin,6 Patrick Mayeux,1,2,3,# Bertrand Cosson7,# and Michaela Fontenay1,2,3,4,8 Haematologica 2021 1 Université de Paris, Institut Cochin, CNRS UMR 8104, INSERM U1016, Paris; Volume 106(3):746-758 2Laboratoire d’Excellence du Globule Rouge GR-Ex, Université de Paris, Paris; 3Proteomic Platform 3P5, Université de Paris, Paris; 4Centre Henri-Becquerel, Institut de Recherche et d’Innovation Biomedicale de Haute Normandie, INSERM U1245, Rouen; 5Assistance Publique-Hôpitaux de Paris, Centre-Université de Paris - Cochin, Service de Pathologie, Paris; 6Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université de Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette Cedex; 7Université de Paris, Epigenetics and Cell Fate, CNRS UMR 7216, Paris and 8Assistance Publique- Hôpitaux de Paris, Centre-Université de Paris - Hôpital Cochin, Service d’Hématologie Biologique, Paris, France *SLG and CF contributed equally to this work. #PM and BC contributed equally as co-senior authors. ABSTRACT n ribosomopathies, the Diamond-Blackfan anemia (DBA) or 5q- syn- drome, ribosomal protein (RP) genes are affected by mutation or deletion, Iresulting in bone marrow erythroid hypoplasia. Unbalanced production of ribosomal subunits leading to a limited ribosome cellular content regu- lates translation at the expense of the master erythroid transcription factor GATA1. -
Minireview: Novel Micropeptide Discovery by Proteomics and Deep Sequencing Methods
fgene-12-651485 May 6, 2021 Time: 11:28 # 1 MINI REVIEW published: 06 May 2021 doi: 10.3389/fgene.2021.651485 Minireview: Novel Micropeptide Discovery by Proteomics and Deep Sequencing Methods Ravi Tharakan1* and Akira Sawa2,3 1 National Institute on Aging, National Institutes of Health, Baltimore, MD, United States, 2 Departments of Psychiatry, Neuroscience, Biomedical Engineering, and Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, United States, 3 Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, United States A novel class of small proteins, called micropeptides, has recently been discovered in the genome. These proteins, which have been found to play important roles in many physiological and cellular systems, are shorter than 100 amino acids and were overlooked during previous genome annotations. Discovery and characterization of more micropeptides has been ongoing, often using -omics methods such as proteomics, RNA sequencing, and ribosome profiling. In this review, we survey the recent advances in the micropeptides field and describe the methodological and Edited by: conceptual challenges facing future micropeptide endeavors. Liangliang Sun, Keywords: micropeptides, miniproteins, proteogenomics, sORF, ribosome profiling, proteomics, genomics, RNA Michigan State University, sequencing United States Reviewed by: Yanbao Yu, INTRODUCTION J. Craig Venter Institute (Rockville), United States The sequencing and publication of complete genomic sequences of many organisms have aided Hongqiang Qin, the medical sciences greatly, allowing advances in both human genetics and the biology of human Dalian Institute of Chemical Physics, Chinese Academy of Sciences, China disease, as well as a greater understanding of the biology of human pathogens (Firth and Lipkin, 2013). -
Efficient Analysis of Mammalian Polysomes in Cells and Tissues
TOOLS AND RESOURCES Efficient analysis of mammalian polysomes in cells and tissues using Ribo Mega-SEC Harunori Yoshikawa1†, Mark Larance1,2†, Dylan J Harney2, Ramasubramanian Sundaramoorthy1, Tony Ly1,3, Tom Owen-Hughes1, Angus I Lamond1* 1Centre for Gene Regulation and Expression, School of Life Sciences, University of Dundee, Dundee, United Kingdom; 2Charles Perkins Centre, School of Life and Environmental Sciences, University of Sydney, Sydney, Australia; 3Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, United Kingdom Abstract We describe Ribo Mega-SEC, a powerful approach for the separation and biochemical analysis of mammalian polysomes and ribosomal subunits using Size Exclusion Chromatography and uHPLC. Using extracts from either cells, or tissues, polysomes can be separated within 15 min from sample injection to fraction collection. Ribo Mega-SEC shows translating ribosomes exist predominantly in polysome complexes in human cell lines and mouse liver tissue. Changes in polysomes are easily quantified between treatments, such as the cellular response to amino acid starvation. Ribo Mega-SEC is shown to provide an efficient, convenient and highly reproducible method for studying functional translation complexes. We show that Ribo Mega-SEC is readily combined with high-throughput MS-based proteomics to characterize proteins associated with polysomes and ribosomal subunits. It also facilitates isolation of complexes for electron microscopy and structural studies. *For correspondence: DOI: https://doi.org/10.7554/eLife.36530.001 [email protected] †These authors contributed equally to this work Introduction Competing interests: The The ribosome is a large RNA-protein complex, comprising four ribosomal RNAs (rRNAs) and >80 authors declare that no ribosomal proteins (RPs), which coordinates mRNA-templated protein synthesis. -
Comparative Analysis Reveals Genomic Features of Stress
Comparative analysis reveals genomic features of PNAS PLUS stress-induced transcriptional readthrough Anna Vilborga,b,1, Niv Sabathc, Yuval Wieselc, Jenny Nathansa,b, Flonia Levy-Adamc, Therese A. Yarioa,b, Joan A. Steitza,b, and Reut Shalgic,1 aDepartment of Molecular Biophysics and Biochemistry, Boyer Center for Molecular Medicine, Yale University School of Medicine, New Haven, CT 06536; bHoward Hughes Medical Institute, Yale University School of Medicine, New Haven, CT 06536; and cDepartment of Biochemistry, Rappaport Faculty of Medicine, Technion–Israel Institute of Technology, Haifa 31096, Israel Edited by Jasper Rine, University of California, Berkeley, CA, and approved July 14, 2017 (received for review July 10, 2017) Transcription is a highly regulated process, and stress-induced degraded by exonucleases that access the unprotected 5′ end changes in gene transcription have been shown to play a major role generated by cleavage at the polyA site (12, 13). However, recent in stress responses and adaptation. Genome-wide studies reveal studies show that various stress and disease states, including prevalent transcription beyond known protein-coding gene loci, osmotic stress (10), HSV-1 infection (9), and renal carcinoma generating a variety of RNA classes, most of unknown function. (8), increase both the levels and length of transcripts mapping to One such class, termed downstream of gene-containing transcripts regions downstream of the cleavage and polyadenylation sites. (DoGs), was reported to result from transcriptional readthrough Ourpreviousstudy(10)showedthat these transcripts are contin- upon osmotic stress in human cells. However, how widespread the uous with the RNAs generated from the upstream protein-coding readthrough phenomenon is, and what its causes and consequences gene, suggesting that they result from alterations in cleavage and are, remain elusive. -
Micropeptide Regulates Muscle Performance
RESEARCH HIGHLIGHTS Cell 16, 39–50, 2015). They discovered that inactivating Rb facilitates Enhancer evolution in mammals reprogramming of fibroblasts to a pluripotent state. Surprisingly, their Widespread changes in regulatory genomic regions have data indicate that this does not involve interference with the cell cycle but underpinned mammalian evolution, but our knowledge about instead that Rb directly binds to and represses pluripotency-associated these regions is still incomplete. Paul Flicek, Duncan Odom and loci such as Oct4 (Pou5f1) and Sox2. Loss of Rb seems to compensate colleagues have now contributed to a better understanding of for the omission of Sox2 from the cocktail of reprogramming factors. how the noncoding portions of mammalian genomes have been Furthermore, genetic disruption of Sox2 precludes tumor formation in reshaped over the last 180 million years (Cell 160, 554–566, mice lacking functional Rb protein. This study positions Rb as a repres- 2015). They characterized active promoters and enhancers in sor of the pluripotency gene regulatory network and suggests that loss of liver samples from 20 mammalian species—from Tasmanian Rb might clear the path for Sox2, or other master regulators of stem cell devil to human—by examining the genome-wide enrichment identity, to induce cancer. It will be interesting to analyze the potential profiles of H3K4me3 and H3K27ac, two histone modifications role of Rb in other types of in vitro reprogramming. TF associated with transcriptional activity. Their analyses suggest that rapid enhancer evolution and high promoter conservation are fundamental traits of mammalian genomes. Intriguingly, they Micropeptide regulates muscle performance find that the majority of newly evolved enhancers originated via It is becoming increasingly recognized that some transcripts anno- functional exaptation of ancestral DNA and not through clade- tated as long noncoding RNAs (lncRNAs) contain short ORFs that specific expansions of repeat elements.