The Development and Application of a Transposon Insertion Sequencing Methodology in Escherichia Coli BW25113
Total Page:16
File Type:pdf, Size:1020Kb
The Development and Application of a Transposon Insertion Sequencing Methodology in Escherichia coli BW25113 by Ashley Robinson A thesis submitted to The University of Birmingham For the degree of Doctor of Philosophy College of Medical and Dental Sciences September 2016 [1] University of Birmingham Research Archive e-theses repository This unpublished thesis/dissertation is copyright of the author and/or third parties. The intellectual property rights of the author or third parties in respect of this work are as defined by The Copyright Designs and Patents Act 1988 or as modified by any successor legislation. Any use made of information contained in this thesis/dissertation must be in accordance with that legislation and must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the permission of the copyright holder. ABSTRACT Escherichia coli is one of the most studied model organisms in biology. Even with decades of research, there are a substantial number of genes with an as yet unknown function. Previously, to determine the link between gene function and phenotype took significant experimental effort. However, newer methods are capable of providing large amounts of biological data in short timeframes. One such method, transposon insertion sequencing, is a powerful research tool, which couples transposon mutagenesis and next generation sequencing to identify genes that have important or essential functions. Here, three transposon insertion sequencing methods were compared. The techniques were adapted from previously published literature. Based on a number of metrics one technique was shown to be superior for data generation. This method was chosen for application in further transposon-insertion sequencing experiments. Subsequently, the optimised method was used to assess which genes were essential for the viability of the model organism E. coli K12. The results of this work were compared with the literature and other databases of gene essentiality. A high degree of concordance was observed between our datasets and those generated previously through other methods. Indeed, the method described here was shown to have several benefits over previously used approaches. Finally, genes involved with maintenance of the outer membrane were identified by using markers for membrane permeability in tandem with the chosen method. In keeping with previous literature multiple genes involved with many aspects of the cell envelope were reported. Many of the reported genes were shown to be involved with metabolic processes related to the biogenesis and maintenance of the cell envelope. [2] ACKNOWLEDGEMENTS This collection of ordered letters, by one monkey with one (admittedly modern) typewriter, would not have been completed without the friendship, guidance and support of so many people. I fear this will not do you justice, but here goes! Magic Rainbow Five ® - you have made my time in the lab incredible! I may never be a part of such a friendly, hilarious and supportive microcosm again, and for your friendship I am truly grateful. O Hai Chris – the laughs we have had! I am indebted to you for so much - your graffiti skills, taste in cinema and your incisive wit amongst other things. Thank you, and keep on searching for those triangles bro. Jacky B, Caddaby and Dairylea – your astounding knowledge of science, music and arabica alike always made for an entertaining discussion. Thank you for always being downstairs and available for caffeination, contemplation and celebration. James – my musical brother from an unrelated matriarch! Whenever I hear anything even remotely Latin in origin, I will fondly remember our improvs. Thank you so much for the music and your support. Pete – you have helped me for quite a few years now, and I am indebted – thank you so much for your ever helpful knowledge and insight, as well as Christmas festive cheer. Amanda, Faye, Josh, Sara, Rachael, Tamar, and likely many others that have fallen through the sieve between my ears – I have you all to thank for so much! Ian and Cathy – you have been wonderful to me, and really made me feel like part of the lab family. It has been a privilege –I am eternally grateful for the opportunity you provided, and I hope that what I have done will always be of use. If you ever need computer help… Finally, my beautiful family – Kayleigh, Erin, Niamh and Isla. You enrich my life with an unspeakable quality, and make my days joyous beyond description. Thank you for making life so worthwhile. [3] TABLE OF CONTENTS Chapter 1. General Introduction 10 The model organism Escherichia coli 11 The cell envelope 11 Inner membrane 14 Protein translocation 14 Energy generation and proton motive force 16 Environmental sensing and response 17 Periplasm 19 Peptidoglycan 20 Chaperones 23 Outer membrane 25 Lipopolysaccharide 25 Lipoproteins 25 Outer membrane proteins (OMPs) 28 Envelope integrity and permeability 31 Vancomycin 31 SDS 32 Industrial application 34 Gene essentiality 35 Research methodologies 37 Advances in DNA sequencing and applications 40 Next generation sequencing 41 Transposon insertion sequencing 44 Aims 47 Chapter 2. Materials and methods 49 Bacterial strains and primers 50 Transposon library creation 50 Two-PCR library preparation method 50 [4] Two-PCR method adaptation 54 Two-PCR method protocol 56 Shearing-based library preparation method 57 Adaptation of the shearing based method 58 Shearing method protocol 61 Hybrid shearing based library preparation method 62 Adaptation of the hybrid method 64 Hybrid method protocol 64 Sequence read analysis 65 Preliminary read processing 65 Primary read processing 66 Essential gene prediction 67 Differential representation calculation 67 Chapter 3. A comparison of transposon sequencing library preparation methods 69 Introduction 70 Results 70 A two-PCR based library preparation method 70 A shearing based library preparation method 77 A hybrid two-PCR/shearing based library preparation method 82 Discussion 88 Chapter 4. Escherichia coli BW25113 essential gene analysis 93 [5] Introduction 94 Results 94 Datasets and essential gene prediction 95 Manual inspection of essential genes 98 Supporting evidence for manually inspected genes 112 Cluster of orthologous groups (COG) analysis 115 Discussion 142 Chapter 5. Escherichia coli BW25113 conditional gene analysis in response to markers for outer membrane permeability 146 Introduction 147 Results 148 Sample datasets 148 Differential representation analysis 148 Genes differentially represented after growth in the presence of SDS 154 Genes differentially represented after growth in the presence of vancomycin 168 Comparison with other studies 173 Discussion 176 Chapter 6 Final discussion 178 Chapter 7 Bibliography 184 [6] List of Figures Figure 1.1. The E. coli cell envelope. 13 Figure 1.2. ATP Synthase (Berman et al ., 2000). 18 Figure 1.3. Two component systems. 21 Figure 1.4. Peptidoglycan biosynthesis (Typas et al., 2011). 22 Figure 1.5. Lipoprotein anchoring. 27 Figure 1.6. Schematic of the AcrAB/TolC efflux pump. 30 Figure 1.7. Vancomycin and its mechanism of action. 33 Figure 1.8. The Datsenko and Wanner (2000) gene deletion method. 39 Figure 1.9. A simplified depiction of Illumina sequencing by synthesis. 43 Figure 1.10. A simplified depiction of transposon insertion sequencing. 46 Figure 2.1. Transposome mediated mutagenesis. 53 Figure 2.2. The adapted 2 PCR method used in this work. 55 Figure 2.3. The adapted shearing based method used in this work. 59 Figure 2.4. The hybrid based method used in this work. 63 Figure 3.1. Histograms of insertion indexes calculated from datasets generated using the two-PCR methodology. 75 Figure 3.2. Insertion index correlation scatterplots for the datasets generated using the two-PCR method. 76 Figure 3.3. Histograms of insertion indexes calculated from datasets generated using the shearing methodology. 80 Figure. 3.4. Correlation of LB datasets derived from the shearing method. 83 [7] Figure 3.5. Histograms of insertion indexes calculated from datasets generated using the hybrid methodology. 87 Figure 3.6 Insertion index correlation scatterplots for the hybrid datasets. 89 Figure 3.7. Differences in insertion representation between the two-PCR and hybrid methods. 91 Figure 4.1 Insertion index histograms for the combined NTL and LB datasets. 96 Figure 4.2 Comparison of the KEIO, LB and NTL essential gene lists. 97 Figure 4.3 Insertions throughout genes previously predicted to be essential. 99 Figure 4.4 Essential gene regions of grpE and ftsK . 106 Figure 4.5 Insertions into the 5′ and 3′ regions of ftsH and rnpA . 108 Figure 4.6 Insertions into cydC and secD . 109 Figure 4.7 Lack of insertions in genes not reported to be essential from the KEIO library. 111 Figure 4.8 Distribution of COG categories in the summarised gene lists. 117 Figure 4.9 The overlap of an essential gene region with the transmembrane domain of yejM . 144 Figure 5.1. Insertion index correlation scatterplots for the biological replicates of the S4.8 and V100 samples. 150 Figure 5.2. Insertion index histograms for the combined S4.8 and V100 datasets. 151 Figure 5.3. Insertion profiles for genes known to be involved with response to vancomycin and SDS 153 Figure 5.4. Schematic showing the RssB regulation of RpoS. 160 Figure 5.5. Negatively represented genes in the LPS biosynthetic pathway after growth in SDS. 166 [8] Abbreviations ATP – adenosine triphosphate BAM complex – beta barrel assembly machinery COG – cluster of orthologous groups DNA