High-throughput Experimental and Computational Studies of Bacterial Evolution

Lars Barquist
Queens' College
University of Cambridge

A thesis submitted for the degree of Doctor of Philosophy
23 August 2013

Arrakis teaches the attitude of the knife - chopping off what's incomplete and saying: "Now it's complete because it's ended here."
Collected Sayings of Muad'dib

Declaration

High-throughput Experimental and Computational Studies of Bacterial Evolution

The work presented in this dissertation was carried out at the Wellcome Trust Sanger Institute between October 2009 and August 2013. This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration except where specifically indicated in the text. This dissertation does not exceed the limit of 60,000 words as specified by the Faculty of Biology Degree Committee. This dissertation has been typeset in 12pt Computer Modern font using LaTeX according to the specifications set by the Board of Graduate Studies and the Faculty of Biology Degree Committee. No part of this dissertation or anything substantially similar has been or is being submitted for any other qualification at any other university.

Acknowledgements

I have been tremendously fortunate to spend the past four years on the Wellcome Trust Genome Campus at the Sanger Institute and the European Bioinformatics Institute. I would like to thank foremost my main collaborators on the studies described in this thesis: Paul Gardner and Gemma Langridge. Their contributions and support have been invaluable. I would also like to thank my supervisor, Alex Bateman, for giving me the freedom to pursue a wide range of projects during my time in his group and for advice.
Many others have influenced my thinking through collaborations and discussions; in no particular order: Amy Cain, Christine Boinett, Oscar Westesson (UC Berkeley), Ian Holmes (UC Berkeley), Leo Parts, Zasha Weinberg (Yale University/HHMI), Nick Thomson, Julian Parkhill, Chinyere Okoro, Sandra Reuter, Nick Croucher, Thomas Dan Otto, Simon Harris, Rob Kingsley, Melissa Martin (London School of Hygiene & Tropical Medicine), John Wain (University of East Anglia), Theresa Feltwell, Helena Seth-Smith, Eric Nawrocki (Janelia Farm Research Campus), Sean Eddy (Janelia Farm Research Campus), Anton Enright, Marija Buljan, Derek Pickard, Marco Punta, Fabian Schreiber, Sarah Burge, John Marioni, Keith Turner, and Nick Feasey. I am sure I have forgotten still more who deserve my thanks.

Finally, I would like to especially thank Joanne Chung and Gomi Jung. It's been a gas.

21 August 2013
Cambridge, UK

Abstract

The work in this thesis is concerned with the study of bacterial adaptation on short and long timescales. In the first section, consisting of three chapters, I describe a recently developed high-throughput technology for probing gene function, transposon-insertion sequencing, and its application to the study of functional differences between two important human pathogens, Salmonella enterica subspecies enterica serovars Typhi and Typhimurium. In a first study, I use transposon-insertion sequencing to probe differences in gene requirements during growth on rich laboratory media, revealing differences in serovar requirements for genes involved in iron-utilization and cell-surface structure biogenesis, as well as in requirements for non-coding RNA.
In a second study, I more directly probe the genomic features responsible for differences in serovar pathogenicity by analyzing transposon-insertion sequencing data produced following a two-hour infection of human macrophages, revealing large differences in the selective pressures felt by these two closely related serovars in the same environment.

The second section, consisting of two chapters, uses statistical models of sequence variation, i.e. covariance models, to examine the evolution of intrinsic termination across the bacterial kingdom. A first collaborative study provides background and motivation in the form of a method for identifying Rho-independent terminators using covariance models built from deep alignments of experimentally-verified terminators from Escherichia coli and Bacillus subtilis. In the course of the development of this method I discovered a novel putative intrinsic terminator in Mycobacterium tuberculosis. In the final chapter, I extend this approach to de novo discovery of intrinsic termination motifs across the bacterial phylogeny. I present evidence for lineage-specific variations in canonical Rho-independent terminator composition, and discover seven non-canonical putative termination motifs. Using a collection of publicly available RNA-seq datasets, I provide evidence for the function of some of these elements as bona fide transcriptional attenuators.

Contents

Declaration
Contents
List of Figures
List of Tables
List of Symbols
Introduction

1 Querying bacterial genomes with transposon-insertion sequencing
  1.1 Introduction
  1.2 Protocols
    1.2.1 Transposon mutagenesis
    1.2.2 Pool construction
    1.2.3 Enrichment of transposon-insertion junctions
    1.2.4 Sequencing
  1.3 Reproducibility, accuracy, and concordance with previous methods
  1.4 Identifying gene requirements
  1.5 Determining conditional gene requirements
  1.6 Monitoring ncRNA contributions to fitness
  1.7 Limitations
  1.8 The future of transposon-insertion sequencing

2 A comparison of dense transposon insertion libraries in the Salmonella serovars Typhi and Typhimurium
  2.1 Introduction
    2.1.1 The genus Salmonella
    2.1.2 Host adaptation and restriction
    2.1.3 Serovars Typhi and Typhimurium
  2.2 Materials and Methods
    2.2.1 Strains
    2.2.2 Annotation
    2.2.3 Creation of S. Typhimurium transposon mutant library
    2.2.4 DNA manipulations and sequencing
    2.2.5 Sequence analysis
    2.2.6 Statistical analysis of required genes
  2.3 Results and Discussion
    2.3.1 TraDIS assay of every Salmonella Typhimurium protein-coding gene
    2.3.2 Cross-species comparison of genes required for growth
    2.3.3 Serovar-specific genes required for growth
    2.3.4 TraDIS provides resolution sufficient to evaluate ncRNA contributions to fitness
    2.3.5 sRNAs required for competitive growth
  2.4 Conclusions

3 Methods for the analysis of TraDIS experiments, with an application to Salmonella macrophage invasion
  3.1 Introduction
    3.1.1 Salmonella interactions with macrophage
    3.1.2 Conditional gene fitness
  3.2 Experimental methods
    3.2.1 Strains and cell lines
    3.2.2 Preparation of THP-1 cells
    3.2.3 Preparation of transposon libraries
    3.2.4 Infection assay
  3.3 Analysis of conditional gene fitness using TraDIS
    3.3.1 Experimental design
    3.3.2 Mapping insertion sites
    3.3.3 Quality control
    3.3.4 Inter-library normalization
    3.3.5 Identifying fitness effects
      3.3.5.1 Theory
      3.3.5.2 Application to macrophage infection data
    3.3.6 Functional analysis of gene sets that affect fitness
  3.4 Results and Discussion

4 Detecting Rho-independent terminators in genomic sequence with covariance models
  4.1 Introduction
    4.1.1 Rho-independent termination
    4.1.2 Previous approaches to identifying intrinsic terminators
    4.1.3 Covariance models
  4.2 Methods
    4.2.1 Construction of a covariance model for Rho-independent terminators
    4.2.2 RNIE run modes
    4.2.3 Definitions
  4.3 Results
    4.3.1 Alpha benchmark
    4.3.2 Beta benchmark
    4.3.3 A novel termination motif in Mycobacterium tuberculosis

5 Kingdom-wide discovery of bacterial intrinsic termination motifs
  5.1 Introduction
  5.2 Methods
    5.2.1 Genome-wise motif discovery
    5.2.2 Clustering covariance models
    5.2.3 Building consensus covariance models
    5.2.4 Genome annotation
    5.2.5 Analysis of expression data
  5.3 Results
    5.3.1 Kingdom-wide motif discovery
    5.3.2 Canonical RIT diversity
      5.3.2.1 Validating RIT activity with RNA-seq
      5.3.2.2 Lineage-specific enrichment of canonical RIT clusters
    5.3.3 Non-canonical putative attenuation motifs
      5.3.3.1 The Neisserial DNA uptake sequence TAM
      5.3.3.2 The Actinobacterial TAM
      5.3.3.3 Type 1 integron attC sites
      5.3.3.4 Other non-canonical TAMs
  5.4 Discussion

Publications
Appendix A: Supplementary data for chapters 2 and 3
Appendix B: Genomic sequences analyzed for termination motifs
References

List of Figures

1.1 Transposon-insertion sequencing protocols
1.2 Applications of transposon-insertion sequencing to non-coding RNAs
2.1 Genomic acquisitions in the evolution of the salmonellae
2.2 The distribution of gene-wise insertion indexes in S. Typhi
2.3 Genome-wide transposon mutagenesis of S. Typhimurium
2.4 Comparison of required genes
2.5 Comparison of cell surface operon structure and requirements
2.6 H-NS enrichment across the SPI-2 locus
2.7 Proposed differences in sRNA utilization
3.1 Biogenesis of the Salmonella containing vacuole
3.2 Principal component analysis of TraDIS macrophage infection assays
3.3 Smear plot of differences in logFC over macrophage infection between S. Typhimurium and S. Typhi
3.4 Smear plot of logFC in mutant prevalences over macrophage infection in S. Typhimurium
3.5 Smear plot of logFC in mutant prevalences over macrophage infection in S. Typhi
3.6 Walking hypergeometric test for depletion of insertions in the S. Typhimurium flagellar subsystem
3.7 Mutant depletion in the S. Typhimurium flagellar subsystem
4.1 Rho-independent termination
4.2 TransTermHP motif