Reflecting on a Decade of Transposon-Insertion Sequencing

Amy K. Cain1*, Lars Barquist2,3, Andrew L. Goodman4, Ian T. Paulsen1, Julian Parkhill5 and Tim van Opijnen6*

Author Affiliations:
1 ARC Centre of Excellence in Synthetic Biology, Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia
2 Helmholtz Institute for RNA-based Infection Research, Helmholtz Centre for Infection Research, Würzburg, BY, Germany
3 Faculty of Medicine, University of Würzburg, Würzburg, BY, Germany
4 Department of Microbial Pathogenesis and Microbial Sciences Institute, Yale University School of Medicine, New Haven, CT, USA
5 Department of Veterinary Medicine, University of Cambridge, Cambridge, Cambs, UK
6 Department of Biology, Boston College, Boston, MA, USA

*Corresponding authors: [email protected] or [email protected]

Abstract
It has been 10 years since the birth of modern transposon-insertion sequencing (TIS) methods, which combine genome-wide transposon mutagenesis with high-throughput sequencing to estimate the fitness contribution or essentiality of each genetic component simultaneously in a bacterial genome. Four TIS variations were published in 2009: transposon sequencing (Tn-Seq), transposon-directed insertion site sequencing (TraDIS), insertion sequencing (INSeq) and high-throughput insertion tracking by deep sequencing (HITS). TIS has since become an important tool for molecular microbiologists, being one of the few genome-wide techniques that directly links phenotype to genotype and ultimately can assign gene function. In this review, we discuss the recent applications of TIS to answer overarching biological questions. We explore emerging and multi-disciplinary methods that build on TIS, with an eye towards future applications.

Introduction
Transposon-insertion sequencing (TIS) methods combine large-scale transposon mutagenesis with next-generation sequencing to estimate the essentiality and/or fitness contribution of each genetic feature in a bacterial genome simultaneously. A strength of TIS is that experiments are performed with pooled transposon libraries, which allows direct linkage of phenotype to genotype in a high-throughput manner. Ultimately, TIS aims to elucidate the function of each genomic feature and is therefore a critical tool for interpreting the mounting volume of genome sequencing data being generated. TIS methods can be sensitive enough to detect even minor changes in mutant fitness and, with sufficient insertion density, precise enough to assay not only genes but also intergenic regions, promoter regions, and essential protein domains within coding regions. Four variations on the TIS method were published simultaneously in 2009: transposon sequencing (Tn-Seq) [1], transposon-directed insertion site sequencing (TraDIS) [2], insertion sequencing (INSeq) [3] and high-throughput insertion tracking by deep sequencing (HITS) [4]. Since then, TIS has become a valuable tool in our molecular biology toolkit, whose full utility is still being explored.

The basic TIS workflow is summarised in Figure 1. Briefly, it begins with construction of a saturated mutant library (Fig 1A) by introducing a randomly inserting transposon (Tn), commonly a Tn5 or mariner Tn, into a strain of interest, often by transformation or conjugation. The goal is to create a population of bacteria in which each cell carries a single Tn insertion in the genome and each genetic component is disrupted multiple times at different sites. By directly sequencing the transposon-flanking regions of the initial library, potentially essential features can be identified as those that do not tolerate insertions.
Alternatively, the library can be subjected to a selective condition, for instance antibiotic stress (Fig 1B), in order to query features involved in survival and growth within that environment. Such conditionally important components are defined by insertions whose frequency in the population changes significantly during the selection, as determined by sequencing before and after selection. Genomic features whose Tn insertions decrease in frequency over experimental selection are assumed to be important for fitness in the test conditions; such features could include antibiotic resistance genes during antibiotic selection, or virulence factors in an infection model. Features whose insertions increase in frequency are assumed to be ones whose disruption is advantageous in the test conditions; these include negative regulators or metabolically costly systems not necessary in those conditions.

There are four major TIS versions that differ in various steps of their sequencing procedures (see REF [5] for more detail on these variations). For example, the way DNA is fragmented for library preparation varies: two methods (Tn-Seq and INSeq) use the type IIS restriction enzyme MmeI to yield shorter reads of uniform length, which can remove PCR amplification bias, while the other two (TraDIS and HITS) use random-sized shearing via sonication, which can have the advantage of improved Tn mapping owing to longer reads. Similarly, Tn-Seq and INSeq exclusively use the mariner Tn, which inserts at TA (thymine-adenine) dinucleotides but otherwise has no sequence preference, whereas the others have the flexibility to employ any Tn but commonly use Tn5, as it is commercially available and does not have an insertion-site bias. After fragmentation, various adaptors are added, and transposon–genome junctions are amplified and sequenced with a sequencing primer facing out of either the Tn or the adapter.
Finally, mapping of the adjacent genomic DNA allows the exact position of each transposon in the bacterial genome to be determined using appropriate bioinformatic tools (see the “Developments in TIS data analysis” section below).

Since the last comprehensive reviews on TIS [5,6] in 2013, a range of exciting and multi-disciplinary methods that build on TIS have emerged to answer increasingly complex biological questions. These include scaling TIS analysis to hundreds of different conditions using high-throughput phenotyping, the use of machine learning to predict bacterial survival outcomes, and combining transposon-insertion sequencing with cutting-edge techniques from single-cell analysis (droplet Tn-Seq) and fluorescence sorting (TraDISort). Analysis tools have also evolved to cope with this increase in complexity of TIS studies. Lastly, a broad range of in vitro and in vivo applications of TIS have been implemented in pathogenic, commensal and environmental bacteria in the last decade. In this review, we discuss these exciting developments and applications of TIS and present our vision for TIS into the future. Detailed information on the experimental design of TIS experiments, including choice of Tn and statistical impacts of experimental parameters, comparisons of TIS method variations, limitations of standard TIS and details on applications prior to 2013, can be found in previous reviews [7–9].

Advances and extensions of TIS methods
Over the last decade, TIS methods have been developed to incorporate other technologies and techniques to answer complex biological questions in creative ways. These include physical separation and sorting of individual mutant cells, using inducible promoters to study essential genes, and scaling of current techniques to simultaneously screen multiple environments and different species, enabling pan-organism analysis (see Fig 2).
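
The comparison of insertion frequencies between two populations described in the Introduction (before versus after selection, or equally the two fractions produced by the separation-based methods discussed next) can be illustrated with a minimal Python sketch. This is not the method of any published TIS analysis tool: the function name, gene names, read counts and pseudocount are all hypothetical, and real analyses use replicates and formal statistical testing rather than a bare fold change.

```python
import math
from typing import Dict


def log2_fold_changes(
    before: Dict[str, int],
    after: Dict[str, int],
    pseudocount: float = 1.0,  # assumed value; avoids division by zero
) -> Dict[str, float]:
    """Per-gene log2 ratio of insertion read frequency (after / before).

    Counts are converted to frequencies within each sample so that
    differences in sequencing depth do not drive the result.
    """
    genes = sorted(set(before) | set(after))
    total_before = sum(before.values()) + pseudocount * len(genes)
    total_after = sum(after.values()) + pseudocount * len(genes)

    result = {}
    for gene in genes:
        freq_before = (before.get(gene, 0) + pseudocount) / total_before
        freq_after = (after.get(gene, 0) + pseudocount) / total_after
        result[gene] = math.log2(freq_after / freq_before)
    return result


# Hypothetical read counts summed over all insertions in each gene.
input_library = {"geneA": 100, "geneB": 100, "geneC": 100}
after_selection = {"geneA": 10, "geneB": 100, "geneC": 190}

for gene, lfc in log2_fold_changes(input_library, after_selection).items():
    print(f"{gene}: log2 fold change = {lfc:+.2f}")
```

In this toy example, insertions in geneA are depleted after selection (negative value, suggesting the gene is important for fitness in the test condition), geneB is unchanged, and insertions in geneC are enriched (positive value, suggesting that disrupting the gene is advantageous).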

Beyond growth-based selection approaches. A major recent advance in TIS is the ability to separate mutants by their physical characteristics, rather than selecting solely on growth. The simplest forms of this have adapted classical microbiological assays to the massive multiplexing enabled by TIS (Fig 2A). For example, motility genes can be assayed by “racing” mutant libraries across agar plates and comparing mutants in the inner (less motile) population with those in the outer (more motile) population. This approach has been applied to Escherichia coli ST131 [10] and Pseudomonas aeruginosa PA14 [11], leading to the identification of known motility genes, such as those encoding common bacterial motors (flagella, fimbriae and pili), in addition to new candidates. Similarly, Density-TraDISort [12] combines TraDIS and density gradient centrifugation to separate mutants based on their density (Fig 2B) and identify genes involved in production of the bacterial capsule, a major virulence factor for many pathogens. In this study, 78 genes underlying capsule production were identified across two clinically relevant Klebsiella pneumoniae strains [12].

The application of cell sorting to TIS has led to the development of techniques that progress from bulk separation to separation at the level of single cells. One such application is TraDISort, which combines fluorescence-activated cell sorting (FACS) and TraDIS [13] and sorts single cells based on fluorescence. TraDISort has used the cytosolic concentration of ethidium bromide (EthBr), a fluorescent DNA intercalating agent, as a marker for altered efflux activity (Fig 2C). For instance, mutants with insertions in efflux pump genes, such as amvA, had a reduced ability to remove EthBr from the cell, resulting in an overall