Automated Annotation of Protein Sequences, This Is Vital and Similar Approaches Will Be Implemented for Other Domains in the Future

Total Page:16

File Type:pdf, Size:1020Kb

Automated Annotation of Protein Sequences, This Is Vital and Similar Approaches Will Be Implemented for Other Domains in the Future An Environment for Consistent Sequence Annotation and its Application to Transmembrane Proteins Steffen Möller Selwyn College Cambridge, United Kingdom June, 2001 The dissertation is submitted for the degree of Doctor of Philosophy. An Environment for Consistent Sequence Annotation and its Application to Transmembrane Proteins Steffen Möller Summary This thesis describes my research leading to the development of a novel software environment to combine multiple tools for an automated prediction of the properties of protein sequences. The test case of the implementation of the tool lies on transmembrane proteins. Part of the environment is a conflict-resolution mechanism and respective rules for its application. This contributes to an improvement of the automated sequence annotation of transmembrane proteins. The integrated tools include several membrane prediction methods. These are combined to provide an integrated method for both signal peptide prediction and membrane spanning region prediction. A database was created to describe the correlation between individual InterPro entries and transmembrane annotation. This led to the development of specialised predictors, constrained to individual protein families, and set the basis for an automated discovery of constraints for transmembrane topologies. To facilitate an evaluation of membrane protein prediction, a collection of biochemically well-characterised transmembrane protein sequences was created. As a novelty, raw data from those experiments where added from which a protein’s topology was elucidated. This was applied for analysing aberrations of predictions from the experimental results. The thesis closes with a novel application of the developed techniques to find determinants for the coupling of G protein-coupled receptors to G proteins and thereby facilitates a functional characterisation of these transmembrane receptors. 2 Publications The following publications resulted from the research described in this thesis. S. Möller, U. Leser, W. Fleischmann, R. Apweiler; "EDITtoTrEMBL: A distributed approach to high-quality automated protein sequence annotation." In: Proceedings of the German Conference on Bioinformatics (GCB'98), Zimmermann O., Schomburg D. (eds.); Köln, Germany (1998). W. Fleischmann, S. Möller, A. Gateau, R. Apweiler; "A novel method for automatic and reliable functional annotation." In: Proceedings of the German Conference on Bioinformatics (GCB'98), Zimmermann O., Schomburg D. (eds.); Köln, Germany (1998). R. Apweiler, C. O'Donovan, M.J. Martin, W. Fleischmann, H. Hermjakob, S. Möller, S. Contrino; "SWISS-PROT and its computer-annotated supplement TrEMBL: How to produce high quality automatic annotation." In: Proceedings of the World Multiconference on Systemics, Cybernetics and Informatics (SCI '98 / ISAS '98), Callaos N., Holmes L., Osers R. (eds.), Orlando, FL, USA; 4:184-191 (1998). S. Möller, U. Leser, W. Fleischmann, R. Apweiler "EDITtoTrEMBL: a distributed approach to high-quality automated protein sequence annotation." Bioinformatics 15 (3):219-27 (1999). W. Fleischmann, S. Möller, A. Gateau, R. Apweiler "A novel method for automatic functional annotation of proteins." Bioinformatics 15 (3):228-33 (1999). S. Möller, M. Schroeder "Conflict-resolution for the automated annotation of transmembrane proteins." In: Proceedings of the AISB 2000 in Birmingham, UK S. Möller, M. Schroeder, R. Apweiler ”Conflict-resolution for the automated annotation of transmembrane proteins.” Computers and Chemistry 26 (1):41-49 (2001). S. Möller, E. V. Kriventseva, R. Apweiler "A collection of well characterised integral membrane proteins." Bioinformatics 16 (12):1159-1160 (2000). S. Möller, M. D. R. Croning, R. Apweiler "Evaluation of methods for the prediction of membrane spanning regions in proteins." Bioinformatics, 17 (7) 646-653 (2001) S. Möller, J. Vilo, M. D. R. Croning ” Prediction of the coupling specificity of GPCRs to their G proteins.” In: Int. Sys. Mol. Biol., Copenhagen, 2001, appeared as Bioinformatics 17 (Supplement 1) S174-S181. Özgün Babur, Steffen Möller and Rolf Apweiler "TransMotif - Linking sequence motifs with transmembrane annotation." in preparation 3 Michael D. R. Croning, Steffen Möller, James Stalker, Arne Stubenau, Alex Kasprzyk, Ewan Birney, Teresa K. Attwood “Sequence mining reveals the full extent of the human 7tm receptor families.” submitted Posters that were presented on international conferences are shown in the appendix. For references to the location of source code to the programs, the collection of membrane proteins or the predictor or the GPCR coupling please contact Steffen Möller per email: [email protected]. On the Internet the page http://www.ebi.ac.uk/~moeller also provides respective links. 4 Contents PUBLICATIONS...............................................................................................................................................3 CONTENTS.......................................................................................................................................................5 LIST OF FIGURES...........................................................................................................................................9 LIST OF TABLES...........................................................................................................................................12 ABBREVIATIONS..........................................................................................................................................14 PREFACE........................................................................................................................................................15 AUDIENCE........................................................................................................................................................16 ACKNOWLEDGEMENTS........................................................................................................................................16 II.INTRODUCTION.......................................................................................................................................19 A.GENOMIC SEQUENCING...................................................................................................................................19 1.The Quest..............................................................................................................................................19 2.Biological background.........................................................................................................................19 3.Genomic Sequencing............................................................................................................................22 B.BIOINFORMATICS AND MOLECULAR BIOLOGY DATABASES...................................................................................22 1.Motivation.............................................................................................................................................22 2.Current Applications of Protein Sequence Databases.........................................................................25 C.NEED FOR AUTOMATED ANNOTATION.................................................................................................................26 1.Background...........................................................................................................................................26 III.AUTOMATED ANNOTATION OF PEPTIDE SEQUENCES.............................................................31 A.DATA FLOW OF PROTEIN ANNOTATION...............................................................................................................31 1.Submission of nucleotide sequence to EMBL.......................................................................................31 2.Manual protein annotation...................................................................................................................33 B.CONCEPT OF AUTOMATED PROTEIN ANNOTATION.................................................................................................33 1.Direct transfer by sequence similarity..................................................................................................34 2.Indirect annotation by protein domains (InterPro)..............................................................................35 3.Abstract sequence annotation (Algorithms).........................................................................................35 4.Future sources......................................................................................................................................36 5.Performance of automated annotation.................................................................................................36 C.EVIDENCES FOR INFORMATION..........................................................................................................................38 5 D.HOW A BIOLOGIST WOULD USE COMPUTER ALGORITHMS TO ANNOTATE A PROTEIN SEQUENCE.....................................38 1.Selection of programs...........................................................................................................................39 2.Integration of results............................................................................................................................40 E.POTENTIAL ALTERNATIVES TO A NEW DEVELOPMENT............................................................................................41
Recommended publications
  • Applied Category Theory for Genomics – an Initiative
    Applied Category Theory for Genomics { An Initiative Yanying Wu1,2 1Centre for Neural Circuits and Behaviour, University of Oxford, UK 2Department of Physiology, Anatomy and Genetics, University of Oxford, UK 06 Sept, 2020 Abstract The ultimate secret of all lives on earth is hidden in their genomes { a totality of DNA sequences. We currently know the whole genome sequence of many organisms, while our understanding of the genome architecture on a systematic level remains rudimentary. Applied category theory opens a promising way to integrate the humongous amount of heterogeneous informations in genomics, to advance our knowledge regarding genome organization, and to provide us with a deep and holistic view of our own genomes. In this work we explain why applied category theory carries such a hope, and we move on to show how it could actually do so, albeit in baby steps. The manuscript intends to be readable to both mathematicians and biologists, therefore no prior knowledge is required from either side. arXiv:2009.02822v1 [q-bio.GN] 6 Sep 2020 1 Introduction DNA, the genetic material of all living beings on this planet, holds the secret of life. The complete set of DNA sequences in an organism constitutes its genome { the blueprint and instruction manual of that organism, be it a human or fly [1]. Therefore, genomics, which studies the contents and meaning of genomes, has been standing in the central stage of scientific research since its birth. The twentieth century witnessed three milestones of genomics research [1]. It began with the discovery of Mendel's laws of inheritance [2], sparked a climax in the middle with the reveal of DNA double helix structure [3], and ended with the accomplishment of a first draft of complete human genome sequences [4].
    [Show full text]
  • Gene Prediction: the End of the Beginning Comment Colin Semple
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by PubMed Central http://genomebiology.com/2000/1/2/reports/4012.1 Meeting report Gene prediction: the end of the beginning comment Colin Semple Address: Department of Medical Sciences, Molecular Medicine Centre, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, UK. E-mail: [email protected] Published: 28 July 2000 reviews Genome Biology 2000, 1(2):reports4012.1–4012.3 The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2000/1/2/reports/4012 © GenomeBiology.com (Print ISSN 1465-6906; Online ISSN 1465-6914) Reducing genomes to genes reports A report from the conference entitled Genome Based Gene All ab initio gene prediction programs have to balance sensi- Structure Determination, Hinxton, UK, 1-2 June, 2000, tivity against accuracy. It is often only possible to detect all organised by the European Bioinformatics Institute (EBI). the real exons present in a sequence at the expense of detect- ing many false ones. Alternatively, one may accept only pre- dictions scoring above a more stringent threshold but lose The draft sequence of the human genome will become avail- those real exons that have lower scores. The trick is to try and able later this year. For some time now it has been accepted increase accuracy without any large loss of sensitivity; this deposited research that this will mark a beginning rather than an end. A vast can be done by comparing the prediction with additional, amount of work will remain to be done, from detailing independent evidence.
    [Show full text]
  • Meeting Review: Bioinformatics and Medicine – from Molecules To
    Comparative and Functional Genomics Comp Funct Genom 2002; 3: 270–276. Published online 9 May 2002 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/cfg.178 Feature Meeting Review: Bioinformatics And Medicine – From molecules to humans, virtual and real Hinxton Hall Conference Centre, Genome Campus, Hinxton, Cambridge, UK – April 5th–7th Roslin Russell* MRC UK HGMP Resource Centre, Genome Campus, Hinxton, Cambridge CB10 1SB, UK *Correspondence to: Abstract MRC UK HGMP Resource Centre, Genome Campus, The Industrialization Workshop Series aims to promote and discuss integration, automa- Hinxton, Cambridge CB10 1SB, tion, simulation, quality, availability and standards in the high-throughput life sciences. UK. The main issues addressed being the transformation of bioinformatics and bioinformatics- based drug design into a robust discipline in industry, the government, research institutes and academia. The latest workshop emphasized the influence of the post-genomic era on medicine and healthcare with reference to advanced biological systems modeling and simulation, protein structure research, protein-protein interactions, metabolism and physiology. Speakers included Michael Ashburner, Kenneth Buetow, Francois Cambien, Cyrus Chothia, Jean Garnier, Francois Iris, Matthias Mann, Maya Natarajan, Peter Murray-Rust, Richard Mushlin, Barry Robson, David Rubin, Kosta Steliou, John Todd, Janet Thornton, Pim van der Eijk, Michael Vieth and Richard Ward. Copyright # 2002 John Wiley & Sons, Ltd. Received: 22 April 2002 Keywords: bioinformatics;
    [Show full text]
  • Chuanxiong Rhizoma Compound on HIF-VEGF Pathway and Cerebral Ischemia-Reperfusion Injury’S Biological Network Based on Systematic Pharmacology
    ORIGINAL RESEARCH published: 25 June 2021 doi: 10.3389/fphar.2021.601846 Exploring the Regulatory Mechanism of Hedysarum Multijugum Maxim.-Chuanxiong Rhizoma Compound on HIF-VEGF Pathway and Cerebral Ischemia-Reperfusion Injury’s Biological Network Based on Systematic Pharmacology Kailin Yang 1†, Liuting Zeng 1†, Anqi Ge 2†, Yi Chen 1†, Shanshan Wang 1†, Xiaofei Zhu 1,3† and Jinwen Ge 1,4* Edited by: 1 Takashi Sato, Key Laboratory of Hunan Province for Integrated Traditional Chinese and Western Medicine on Prevention and Treatment of 2 Tokyo University of Pharmacy and Life Cardio-Cerebral Diseases, Hunan University of Chinese Medicine, Changsha, China, Galactophore Department, The First 3 Sciences, Japan Hospital of Hunan University of Chinese Medicine, Changsha, China, School of Graduate, Central South University, Changsha, China, 4Shaoyang University, Shaoyang, China Reviewed by: Hui Zhao, Capital Medical University, China Background: Clinical research found that Hedysarum Multijugum Maxim.-Chuanxiong Maria Luisa Del Moral, fi University of Jaén, Spain Rhizoma Compound (HCC) has de nite curative effect on cerebral ischemic diseases, *Correspondence: such as ischemic stroke and cerebral ischemia-reperfusion injury (CIR). However, its Jinwen Ge mechanism for treating cerebral ischemia is still not fully explained. [email protected] †These authors share first authorship Methods: The traditional Chinese medicine related database were utilized to obtain the components of HCC. The Pharmmapper were used to predict HCC’s potential targets. Specialty section: The CIR genes were obtained from Genecards and OMIM and the protein-protein This article was submitted to interaction (PPI) data of HCC’s targets and IS genes were obtained from String Ethnopharmacology, a section of the journal database.
    [Show full text]
  • The Ethos and Effects of Data-Sharing Rules: Examining The
    Informed consent for: "The ethos and effects of data-sharing rules: Examining the history of the 'Bermuda principles' and their effects on 21 st century science" University of Adelaide Duke University Researchers at the University of Adelaide, Australia, and the IGSP Center for Genome Ethics, Law & Policy, Duke University, are engaged in research on the Bermuda Principles for sharing DNA sequence data from high-volume sequencing centers. You have been selected for an interview because we believe that the recollections you may have of your experiences with the International Strategy Meetings for Human Genome Sequencing (1996-1998) will be interesting and helpful for our project. We expect that interviews will last from 30 minutes to much longer, but you may stop your interview at any time. Your participation is strictly voluntary, and you do not have to answer every question asked. Your interview is being recorded and we may take written notes during the interview. After your interview, we may prepare a typed transcript of the interview. If we prepare a transcript, you will have an opportunity to review it and to make deletions and corrections. Unless you indicate otherwise, the information that you provide in this interview will be "on the record"-that is, it can be attributed to you in the various articles and chapters that we plan to write, and thus could become public through these channels. Jf, however, at some point in the interview you want to provide us with information that might be useful for us to know, but which you do not want to have attributed to you, you should tell us that you wish to go "off the record" and we will stop the recording.
    [Show full text]
  • Computational Exploration of the Medium-Chain Dehydrogenase / Reductase Superfamily
    From the Department of Medical Biochemistry and Biophysics Karolinska Institutet, Stockholm, Sweden COMPUTATIONAL EXPLORATION OF THE MEDIUM-CHAIN DEHYDROGENASE / REDUCTASE SUPERFAMILY Linus J. Östberg Stockholm 2017 All previously published papers were reproduced with permission from the publisher. Published by Karolinska Institutet. Printed by E-Print AB 2017 ©Linus J. Östberg, 2017 ISBN 978-91-7676-650-7 COMPUTATIONAL EXPLORATION OF THE MEDIUM-CHAIN DEHYDROGENASE / REDUCTASE SUPERFAMILY THESIS FOR DOCTORAL DEGREE (Ph.D.) By Linus J. Östberg Principal Supervisor: Opponent: Prof. Jan-Olov Höög Prof. Jaume Farrés Karolinska Institutet Autonomous University of Barcelona Department of Medical Biochemistry and Department of Biochemistry and Biophysics Molecular Biology Co-supervisor: Examination Board: Prof. Bengt Persson Prof. Erik Lindahl Uppsala University Stockholm University Department of Cell and Molecular Biology Department of Biochemistry and Biophysics Assoc. Prof. Tanja Slotte Stockholm University Department of Ecology, Environment and Plant Sciences Prof. Elias Arnér Karolinska Institutet Department of Medical Biochemistry and Biophysics ABSTRACT The medium-chain dehydrogenase/reductase (MDR) superfamily is a protein family with more than 170,000 members across all phylogenetic branches. In humans there are 18 repre- sentatives. The entire MDR superfamily contains many protein families such as alcohol dehy- drogenase, which in mammals is in turn divided into six classes, class I–VI (ADH1–6). Most MDRs have enzymatical functions, catalysing the conversion of alcohols to aldehyde/ketones and vice versa, but the function of some members is still unknown. In the first project, a methodology for identifying and automating the classification of mammalian ADHs was developed using BLAST for identification and class-specific hidden Markov models were generated for identification.
    [Show full text]
  • Contcenter for Genomic Regul
    CONTCENTER FOR GENOMIC REGUL CRG SCIENTIFIC STRUCTURE . 4 CRG MANAGEMENT STRUCTURE . 6 CRG SCIENTIFIC ADVISORY BOARD (SAB) . 8 CRG BUSINESS BOARD . 9 YEAR RETROSPECT BY THE DIRECTOR OF THE CRG: MIGUEL BEATO . 10 GENE REGULATION. 14 p Chromatin and gene expression .....................16 p Transcriptional regulation and chromatin remodelling .....19 p Regulation of alternative pre-mRNA splicing during cell . 22 differentiation, development and disease p RNA interference and chromatin regulation . 26 p RNA-protein interactions and regulation . 30 p Regulation of protein synthesis in eukaryotes . 33 p Translational control of gene expression . 36 DIFFERENTIATION AND CANCER ...........................40 p Hematopoietic differentiation and stem cell biology..........42 p Myogenesis.....................................46 p Epigenetics events in cancer.......................49 p Epithelial homeostasis and cancer ...................52 ENTSATION ANNUAL REPORT 2006 GENES AND DISEASE .................................56 p Genetic causes of disease .............................58 p Gene therapy ......................................63 p Murine models of disease .............................66 p Neurobehavioral phenotyping of mouse models of disease .....68 p Gene function ......................................73 p Associated Core Facility: Genotyping Unit..................76 BIOINFORMATICS AND GENOMICS ..........................80 p Bioinformatics and genomics ...........................82 p Genomic analysis of development and disease ..............86
    [Show full text]
  • Concepts, Historical Milestones and the Central Place of Bioinformatics in Modern Biology: a European Perspective
    1 Concepts, Historical Milestones and the Central Place of Bioinformatics in Modern Biology: A European Perspective Attwood, T.K.1, Gisel, A.2, Eriksson, N-E.3 and Bongcam-Rudloff, E.4 1Faculty of Life Sciences & School of Computer Science, University of Manchester 2Institute for Biomedical Technologies, CNR 3Uppsala Biomedical Centre (BMC), University of Uppsala 4Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences 1UK 2Italy 3,4Sweden 1. Introduction The origins of bioinformatics, both as a term and as a discipline, are difficult to pinpoint. The expression was used as early as 1977 by Dutch theoretical biologist Paulien Hogeweg when she described her main field of research as bioinformatics, and established a bioinformatics group at the University of Utrecht (Hogeweg, 1978; Hogeweg & Hesper, 1978). Nevertheless, the term had little traction in the community for at least another decade. In Europe, the turning point seems to have been circa 1990, with the planning of the “Bioinformatics in the 90s” conference, which was held in Maastricht in 1991. At this time, the National Center for Biotechnology Information (NCBI) had been newly established in the United States of America (USA) (Benson et al., 1990). Despite this, there was still a sense that the nation lacked a “long-term biology ‘informatics’ strategy”, particularly regarding postdoctoral interdisciplinary training in computer science and molecular biology (Smith, 1990). Interestingly, Smith spoke here of ‘biology informatics’, not bioinformatics; and the NCBI was a ‘center for biotechnology information’, not a bioinformatics centre. The discipline itself ultimately grew organically from the needs of researchers to access and analyse (primarily biomedical) data, which appeared to be accumulating at alarming rates simultaneously in different parts of the world.
    [Show full text]
  • Biolayout Paper Fig3 V3
    Edinburgh Research Explorer Construction, visualisation, and clustering of transcription networks from Microarray expression data Citation for published version: Freeman, TC, Goldovsky, L, Brosch, M, Van Dongen, S, Maziere, P, Grocock, RJ, Freilich, S, Thornton, J & Enright, AJ 2007, 'Construction, visualisation, and clustering of transcription networks from Microarray expression data', PLoS Computational Biology, vol. 3, no. 10, e206, pp. 2032-2042. https://doi.org/10.1371/journal.pcbi.0030206 Digital Object Identifier (DOI): 10.1371/journal.pcbi.0030206 Link: Link to publication record in Edinburgh Research Explorer Document Version: Publisher's PDF, also known as Version of record Published In: PLoS Computational Biology Publisher Rights Statement: This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. General rights Copyright for the publications made accessible via the Edinburgh Research Explorer is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights. Take down policy The University of Edinburgh has made every reasonable effort to ensure that Edinburgh Research Explorer content complies with UK legislation. If you believe that the public display of this file breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim. Download date: 04. Oct. 2021 Construction, Visualisation, and Clustering of Transcription Networks from Microarray Expression Data Tom C. Freeman1,2*, Leon Goldovsky3¤, Markus Brosch2, Stijn van Dongen2, Pierre Mazie`re2, Russell J.
    [Show full text]
  • Computational Biology: Plus C'est La Même Chose, Plus Ça Change
    Computational biology: plus c'est la même chose, plus ça change The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation Huttenhower, Curtis. 2011. Computational biology: plus c'est la même chose, plus ça change. Genome Biology 12(8): 307. Published Version doi:10.1186/gb-2011-12-8-307 Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:10576037 Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http:// nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of- use#LAA Huttenhower Genome Biology 2011, 12:307 http://genomebiology.com/2011/12/8/307 MEETING REPORT Computational biology: plus c’est la même chose, plus ça change Curtis Huttenhower* The data deluge: still keeping our heads above water Abstract Bioinformatics has been dealing with an exponential A report on the joint 19th Annual International growth in data since its coalescence as a field in the Conference on Intelligent Systems for Molecular 1980s, making the Senior Scientist Award keynote with Biology (ISMB)/10th Annual European Conference which Michael Ashburner closed the conference on Computational Biology (ECCB) meetings and particularly appropriate. This retrospective by the ‘father the 7th International Society for Computational of ontologies in biology’, to quote the introduction by Biology Student Council Symposium, Vienna, Austria, ISCB president Burkhard Rost, detailed the remarkable 15‑19 July 2011. expansion of computational biology since Ashburner’s start as a Cambridge undergraduate 50 years ago.
    [Show full text]
  • A Detailed Genome-Wide Reconstruction of Mouse Metabolism Based on Human Recon 1
    UC San Diego UC San Diego Previously Published Works Title A detailed genome-wide reconstruction of mouse metabolism based on human Recon 1 Permalink https://escholarship.org/uc/item/0ck1p05f Journal BMC Systems Biology, 4(1) ISSN 1752-0509 Authors Sigurdsson, Martin I Jamshidi, Neema Steingrimsson, Eirikur et al. Publication Date 2010-10-19 DOI http://dx.doi.org/10.1186/1752-0509-4-140 Supplemental Material https://escholarship.org/uc/item/0ck1p05f#supplemental Peer reviewed eScholarship.org Powered by the California Digital Library University of California Sigurdsson et al. BMC Systems Biology 2010, 4:140 http://www.biomedcentral.com/1752-0509/4/140 RESEARCH ARTICLE Open Access A detailed genome-wide reconstruction of mouse metabolism based on human Recon 1 Martin I Sigurdsson1,2,3, Neema Jamshidi4, Eirikur Steingrimsson1,3, Ines Thiele3,5*, Bernhard Ø Palsson3,4* Abstract Background: Well-curated and validated network reconstructions are extremely valuable tools in systems biology. Detailed metabolic reconstructions of mammals have recently emerged, including human reconstructions. They raise the question if the various successful applications of microbial reconstructions can be replicated in complex organisms. Results: We mapped the published, detailed reconstruction of human metabolism (Recon 1) to other mammals. By searching for genes homologous to Recon 1 genes within mammalian genomes, we were able to create draft metabolic reconstructions of five mammals, including the mouse. Each draft reconstruction was created in compartmentalized and non-compartmentalized version via two different approaches. Using gap-filling algorithms, we were able to produce all cellular components with three out of four versions of the mouse metabolic reconstruction.
    [Show full text]
  • Integration of Text Mining with Systems Biology Provides New Insight Into the Pathogenesis of Diabetic Neuropathy
    INTEGRATION OF TEXT MINING WITH SYSTEMS BIOLOGY PROVIDES NEW INSIGHT INTO THE PATHOGENESIS OF DIABETIC NEUROPATHY by Junguk Hur A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Bioinformatics) in The University of Michigan 2010 Doctoral Committee: Professor Eva L. Feldman, Co-Chair Professor Hosagrahar V. Jagadish, Co-Chair Professor Matthias Kretzler Research Assistant Professor Maureen Sartor Professor David J. States, University of Texas Junguk Hur © 2010 All Rights Reserved DEDICATION To my family ii ACKNOWLEDGMENTS Over the past few years, I have been tremendously fortunate to have the company and mentorship of the most wonderful and smartest scientists I know. My advisor, Prof. Eva Feldman, guided me through my graduate studies with constant support, encouragement, enthusiasm and infinite patience. I would like to thank her for being a mom in my academic life and raising me to become a better scientist. I would also like to thank my co-advisors Prof. H. V. Jagadish and Prof. David States. They provided sound advice and inspiration that have been instrumental in my Ph.D. study. I am also very grateful to Prof. Matthias Kretzler and Prof. Maureen Sartor for being an active part of my committee as well as for their continuous encouragement and guidance of my work. I would also like to thank Prof. Daniel Burns, Prof. Margit Burmeister, Prof. Gil Omenn and Prof. Brain Athey from the Bioinformatics Program, who have been very generous with their support and advice about academic life. I am also very grateful to Sherry, Julia and Judy, who helped me through the various administrative processes with cheerful and encouraging dispositions.
    [Show full text]