Computational Prediction and Character Ization of Genomic Islands: Insights Into Bacterial Pathogenicity
Total Page:16
File Type:pdf, Size:1020Kb
COMPUTATIONAL PREDICTION AND CHARACTER IZATION OF GENOMIC ISLANDS: INSIGHTS INTO BACTERIAL PATHOGENICITY by Morgan Gavel Ira Langille B.Sc. University of New Brunswick, 2004 B.CS University of New Brunswick, 2004 THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY In the Department of Molecular Biology and Biochemistry © Morgan Gavel Ira Langille 2009 SIMON FRASER UNIVERSITY Summer 2009 All rights reserved. This work may not be reproduced in whole or in part, by photocopy or other means, without permission of the author. APPROVAL Name: Morgan Gavel Ira Langille Degree: Doctor of Philosophy Title of Thesis: Computational prediction and characterization of genomic islands: insights into bacterial pathogenicity Examining Committee: Chair: Dr. Paul C.H. Li Associate Professor, Department of Chemistry ______________________________________ Dr. Fiona S.L. Brinkman Senior Supervisor Associate Professor of Molecular Biology and Biochemistry ______________________________________ Dr. David L. Baillie Supervisor Professor of Molecular Biology and Biochemistry ______________________________________ Dr. Frederic F. Pio Supervisor Assistant Professor of Molecular Biology and Biochemistry ______________________________________ Dr. Jack Chen Internal Examiner Associate Professor of Molecular Biology and Biochemistry ______________________________________ Dr. Steven Hallam External Examiner Assistant Professor of Microbiology and Immunology, University of British Columbia Date Defended/Approved: Thursday April 16, 2009 ii Declaration of Partial Copyright Licence The author, whose copyright is declared on the title page of this work, has granted to Simon Fraser University the right to lend this thesis, project or extended essay to users of the Simon Fraser University Library, and to make partial or single copies only for such users or in response to a request from the library of any other university, or other educational institution, on its own behalf or for one of its users. The author has further granted permission to Simon Fraser University to keep or make a digital copy for use in its circulating collection (currently available to the public at the “Institutional Repository” link of the SFU Library website <www.lib.sfu.ca> at: <http://ir.lib.sfu.ca/handle/1892/112>) and, without changing the content, to translate the thesis/project or extended essays, if technically possible, to any medium or format for the purpose of preservation of the digital work. The author has further agreed that permission for multiple copying of this work for scholarly purposes may be granted by either the author or the Dean of Graduate Studies. It is understood that copying or publication of this work for financial gain shall not be allowed without the author’s written permission. Permission for public performance, or limited permission for private scholarly use, of any multimedia materials forming part of this work, may have been granted by the author. This information may be found on the separately catalogued multimedia material and in the signed Partial Copyright Licence. While licensing SFU to permit the above uses, the author retains copyright in the thesis, project or extended essays, including the right to change the work for subsequent purposes, including editing and publishing the work in whole or in part, and licensing other parties, as the author may desire. The original Partial Copyright Licence attesting to these terms, and signed by this author, may be found in the original bound copy of this work, retained in the Simon Fraser University Archive. Simon Fraser University Library Burnaby, BC, Canada Last revision: Spring 09 ABSTRACT Genomic islands (GIs), including pathogenicity islands, are commonly defined as clusters of genes in prokaryotic genomes that have probable horizontal origins. These genetic elements have been associated with rapid adaptations in prokaryotes that are of medical, economical or environmental importance, such as pathogen virulence, antibiotic resistance, symbiotic interactions, and notable secondary metabolic capabilities. As the number of genomic sequences increases, the impact of GIs in prokaryotic evolution has become more apparent and detecting these regions using bioinformatics approaches has become an integral part of studying microbial evolution and function. In this dissertation, I describe a novel comparative genomics approach for identifying GIs, called IslandPick, and the application of this method to construct robust datasets that were used to test the accuracy of several previously published GI prediction programs. In addition, I will discuss the features of a new GI web resource, called IslandViewer, which integrates the most accurate GI predictors currently available. Further, the role of several GI and prophage regions and their involvement in virulence in an epidemic Pseudomonas aeruginosa strain that infects cystic fibrosis patients will be described; as well as an observation that recently discovered phage defence elements, CRISPRs, are over-represented within GIs. Keywords: bioinformatics; genomic islands; horizontal gene transfer; phylogenomics; comparative genomics; evolution; bacteria; archaea; pathogenesis; phage iii DEDICATION In memory of my uncle, Dr. Stephen Kerr. I wish you were here to read this. iv ACKNOWLEDGEMENTS I would like to express my sincerest thanks to my supervisor, Dr. Fiona Brinkman, for her support, guidance, and great insight. In addition, I would like to thank my committee members Dr. David Baillie and Dr. Frederic Pio for their positive suggestions and guidance. I would like to acknowledge all of the collaborators from the LES project, including, Drs. Craig Winstanley, Roger Levesque, Bob Hancock, and Nicholas Thomson. Furthermore, I would like to thank all of the members of the Brinkman Lab especially Dr. William Hsiao for sharing his knowledge on genomic islands. In addition, I would like to thank both the supervisors and students of the Bioinformatics Training Program with a special acknowledgment to Benjamin Good for the many discussions and honest opinions on academic life and science. Lastly, I would like to thank my parents and brothers for always supporting me, and my wonderful wife Sylvia and son Gavin, who made this journey a truly enjoyable adventure. v TABLE OF CONTENTS Approval .............................................................................................................. ii Abstract .............................................................................................................. iii Dedication .......................................................................................................... iv Acknowledgements ............................................................................................ v Table of Contents .............................................................................................. vi List of Figures .................................................................................................. viii List of Tables ..................................................................................................... ix Glossary .............................................................................................................. x Chapter 1 Introduction ................................................................................... 1 1.1 Horizontal gene transfer ..................................................................... 1 1.2 Mobile genetic elements ..................................................................... 3 1.2.1 Prophage ........................................................................................ 4 1.2.2 Integrons ......................................................................................... 4 1.2.3 Transposons and IS elements ........................................................ 8 1.2.4 Genomic islands ........................................................................... 13 1.3 Detection of genomic islands............................................................ 22 1.3.1 Sequenced composition based methods ...................................... 22 1.3.2 Comparative genomics methods................................................... 25 1.3.3 Databases and other computational resources ............................. 28 1.4 Goal of present research .................................................................. 30 Chapter 2 IslandPick: A comparative genomics approach for genomic island identification .......................................................................... 32 2.1 Introduction ...................................................................................... 32 2.2 MicrobeDB ....................................................................................... 33 2.3 Identifying genomic islands using a comparative genomics approach .......................................................................................... 34 2.4 Automated selection of comparison genomes .................................. 35 2.4.1 Calculating genome distances ...................................................... 37 2.4.2 Genome selection parameters ...................................................... 40 2.5 Genomic island predictions using IslandPick ................................... 42 2.6 Developing a negative dataset of GIs ............................................... 43 2.7 Discussion .......................................................................................