Genome-Scale Models of Metabolism and Gene Expression: Construction and Use for Growth Phenotype Prediction


UC San Diego
UC San Diego Electronic Theses and Dissertations

Title: Genome-scale Models of Metabolism and Gene Expression: Construction and Use for Growth Phenotype Prediction
Permalink: https://escholarship.org/uc/item/8zq4p227
Author: Lerman, Joshua Adam
Publication Date: 2014
Peer reviewed | Thesis/dissertation

eScholarship.org, powered by the California Digital Library, University of California

UNIVERSITY OF CALIFORNIA, SAN DIEGO

Genome-scale Models of Metabolism and Gene Expression: Construction and Use for Growth Phenotype Prediction

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Bioinformatics & Systems Biology

by

Joshua Adam Lerman

Committee in charge:
  Professor Bernhard Ø. Palsson, Chair
  Professor Milton H. Saier, Jr., Co-Chair
  Professor Philip E. Bourne
  Professor Terence Hwa
  Professor Victor Nizet

2014

Copyright Joshua Adam Lerman, 2014. All rights reserved.

The dissertation of Joshua Adam Lerman is approved, and it is acceptable in quality and form for publication on microfilm and electronically:

  Co-Chair
  Chair

University of California, San Diego
2014

DEDICATION

To my mother and father, for your love, guidance, and all the sacrifices you made for Justin, Rachel, and I.
To Lauren, for your love and all those times I told you, "One sec."
To the loving memory of Bubby.

EPIGRAPH

Tony Stark was able to build this in a cave! WITH A BOX OF SCRAPS!!
- Obadiah Stane, Iron Man

TABLE OF CONTENTS

Signature Page
Dedication
Epigraph
Table of Contents
List of Figures
List of Tables
Acknowledgements
Vita
Abstract of the Dissertation

Chapter 1  Introduction
  1.1 Life, constraints, and being good enough
  1.2 Systems microbiology and its promise
  1.3 Systems microbiology as a four-step procedure: Then and Now
    1.3.1 Then: M-Models (c. 2008)
    1.3.2 Now: ME-Models (c. 2013)
  1.4 There's more than one way to model a cell, so where do ME-Models fit in?

Chapter 2  The genome organization of Thermotoga maritima reflects its lifestyle
  2.1 Abstract
  2.2 Author Summary
  2.3 Introduction
  2.4 Results
    2.4.1 An integrative, multi-omic approach for the annotation of the genome organization
    2.4.2 Identification of promoters and RBSs followed by quantitative intra- and interspecies analysis of binding free energies
    2.4.3 T. maritima promoter-containing intergenic regions reveal a unique distribution of 5′ UTRs and spatial limitations on regulation
    2.4.4 T. maritima has an actively transcribed genome that is tightly correlated to protein abundances
  2.5 Discussion
  2.6 Materials and Methods
    2.6.1 Culture conditions and physiology
    2.6.2 Genome resequencing and annotation updates
    2.6.3 Transcription start site determination
    2.6.4 Transcriptome characterization and gene expression
    2.6.5 Proteomics, peptide mapping, and protein abundance quantitation
    2.6.6 Promoter element motif analysis and position weight matrix (PWM) generation
    2.6.7 Information content calculations
    2.6.8 Ribosome binding site energy calculations
    2.6.9 Rho-independent terminator site determination
    2.6.10 Prediction of small RNAs
    2.6.11 Transcription unit assembly
    2.6.12 Transcription factor binding site mapping
    2.6.13 Data deposition
  2.7 Acknowledgments

Chapter 3  In silico method for modelling metabolism and gene product expression at genome scale
  3.1 Abstract
  3.2 Introduction
  3.3 Results
    3.3.1 Genome-scale modelling of metabolism and expression
    3.3.2 Molecularly efficient simulation of cellular physiology
    3.3.3 Gene product production and turnover alters pathway activity
    3.3.4 Simulation of systems-level molecular phenotypes
    3.3.5 In silico gene expression profiling drives discovery
  3.4 Discussion
  3.5 Methods
    3.5.1 Network reconstruction procedure
    3.5.2 Protein complexes
    3.5.3 Genetic code determination
    3.5.4 TU architecture determination
    3.5.5 In silico molecular biology
    3.5.6 In vivo methods
    3.5.7 RNA modifications
    3.5.8 Sensitivity analysis
    3.5.9 File formats
    3.5.10 Accession codes
  3.6 Acknowledgements

Chapter 4  Genome-scale models of metabolism and gene expression extend and refine growth phenotype prediction
  4.1 Abstract
  4.2 Introduction
  4.3 Results
    4.3.1 Integration of genome-scale reaction networks of protein synthesis and metabolism
    4.3.2 Growth demands and general constraints on molecular catalysis
    4.3.3 Derivation of constraints on molecular catalytic rates
    4.3.4 Growth regions under varying nutrient availability
    4.3.5 Effect of proteome limitation on secretion phenotypes
    4.3.6 Central carbon fluxes reflect growth optimization subject to catalytic constraints
    4.3.7 In silico gene expression profiling from nutrient-limited to batch growth conditions
  4.4 Discussion
  4.5 Materials and methods
    4.5.1 Network reconstruction
    4.5.2 Coupling constraint formulation and imposition
    4.5.3 Optimization procedure
    4.5.4 Hierarchical clustering
    4.5.5 File formats and accessibility
  4.6 Acknowledgements

Chapter 5  Reconciling a Salmonella enterica metabolic model with experimental data confirms that overexpression of the glyoxylate shunt can rescue a lethal ppc deletion mutant
  5.1 Abstract
  5.2 Introduction
  5.3 Materials and methods
    5.3.1 Bacterial strains
    5.3.2 Growth media
    5.3.3 Construction of the ∆ppc mutant
    5.3.4 Growth rate and glucose uptake rate measurements
    5.3.5 In silico modeling
    5.3.6 Construction of pASK1988
    5.3.7 Construction of pS7, pS8, pS10
    5.3.8 Induction and protein overexpression
  5.4 Results
    5.4.1 In contrast to model simulations, a Salmonella Typhimurium ∆ppc mutant is nonviable in glucose M9 medium
    5.4.2 Comparing efficient flux states enables a hypothesis-driven approach to reconcile metabolic models with experimental data
    5.4.3 Deleting iclR from the ∆ppc mutant restores viability
    5.4.4 Simultaneous expression of aceBA and aceK from two separate plasmids can rescue growth in the ∆ppc mutant, but overexpression of aceBA, aceK, or aceBAK individually from a single plasmid cannot
  5.5 Discussion
  5.6 Acknowledgements

Chapter 6  ME-Models as a conduit for integration of systems and synthetic biology
  6.1 Introduction
  6.2 pUC19 cloning vector
  6.3 Production of spider silk proteins
  6.4 Introduction of a 2-step heterologous pathway to produce indole-3-acetaldehyde

Chapter 7  Conclusions and Outlook
  7.1 Conclusions
  7.2 Outlook
    7.2.1 The most promising basic uses of the E. coli ME-Model
    7.2.2 The most promising applied uses of the E. coli ME-Model
    7.2.3 Automating the construction of a ME-Model for a bacteria of your choice
    7.2.4 ME-Models for Yeast and Humans
    7.2.5 Roadmap to a steady-state whole-cell E. coli model

Bibliography

LIST OF FIGURES

Figure 1.1: Systems microbiology in a nutshell
Figure 1.2: Types of omics data and their uses for constructing and applying ME-Models
Figure 1.3: The microbial genotype-phenotype relationship
Figure 2.1: Generation of multiple genome-scale datasets integrated with bioinformatics predictions reveals the genome organization
Figure 2.2: Identification and quantitative comparison of genetic elements for transcription and translation initiation
Figure 2.3: Arrangement of genomic