Mobile Genetic Element Insertions Drive Antibiotic Resistance Across
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/527788; this version posted January 23, 2019. The copyright holder for this preprint (which was1 not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 1 Mobile genetic element insertions drive antibiotic 2 resistance across pathogens 3 Authors: Matthew G. Durrant1, Michelle M. Li1, Ben Siranosian1, Ami S. Bhatt1 4 Affiliations: 1) Department of Medicine (Hematology, Blood and Marrow Transplantation) and 5 Department of Genetics, Stanford University, Stanford, California, USA. 6 Abstract 7 Mobile genetic elements contribute to bacterial adaptation and evolution; however, detecting 8 these elements in a high-throughput and unbiased manner remains challenging. Here, we demonstrate a 9 de novo approach to identify mobile elements from short-read sequencing data. The method identifies 10 the precise site of mobile element insertion and infers the identity of the inserted sequence. This is an 11 improvement over previous methods that either rely on curated databases of known mobile elements or 12 rely on ‘split-read’ alignments that assume the inserted element exists within the reference genome. We 13 apply our approach to 12,419 sequenced isolates of nine prevalent bacterial pathogens, and we identify 14 hundreds of known and novel mobile genetic elements, including many candidate insertion sequences. 15 We find that the mobile element repertoire and insertion rate vary considerably across species, and that 16 many of the identified mobile elements are biased toward certain target seQuences, several of them being 17 highly specific. Mobile element insertion hotspots often cluster near genes involved in mechanisms of 18 antibiotic resistance, and such insertions are associated with antibiotic resistance in laboratory 19 experiments and clinical isolates. Finally, we demonstrate that mutagenesis caused by these mobile 20 elements contributes to antibiotic resistance in a genome-wide association study of mobile element 21 insertions in pathogenic Escherichia coli. In summary, by applying a de novo approach to precisely identify bioRxiv preprint doi: https://doi.org/10.1101/527788; this version posted January 23, 2019. The copyright holder for this preprint (which was2 not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 22 mobile genetic elements and their insertion sites, we thoroughly characterize the mobile element 23 repertoire and insertion spectrum of nine pathogenic bacterial species and find that mobile element 24 insertions play a significant role in the evolution of clinically relevant phenotypes, such as antibiotic 25 resistance. 26 Main 27 Genomic variation is critical for the adaptation of pathogenic bacteria. Successful human 28 pathogens, such as Escherichia coli and Staphylococcus aureus, are able to acquire adaptive phenotypes, 29 such as antibiotic resistance, through the acquisition of single nucleotide polymorphisms, small insertions 30 and deletions, inversions, duplications, and also through the movement of mobile genetic elements 31 (MGEs). Prokaryotic MGEs come in many forms such as insertion seQuence (IS) elements, transposons, 32 integrons, plasmids, and bacteriophages (Stokes and Gillings 2011; Rankin, Rocha, and Brown 2011). Some 33 of these elements can mobilize and insert themselves in a site-specific or random manner throughout the 34 host genome. They are of great interest to the scientific and medical communities, in large part because 35 they can transfer between microbes via horizontal gene transfer, spreading their genetic potential 36 between strains of the same species, and even across species (Juhas 2015; Bloemendaal, Brouwer, and 37 Fluit 2010). 38 IS elements are among the simplest MGEs, and typically code for only the gene (transposase) 39 necessary for their transposition (Mahillon and Chandler 1998). IS elements can transpose into genes, 40 resulting in insertional mutagenesis and loss of function of the gene (Figure 1f) (Emmanuelle Lerat and 41 Ochman 2004); alternatively, they can transpose into gene regulatory elements and influence expression 42 of neighboring genes (Figure 1f,g). IS elements play a role in antibiotic resistance (Linkevicius, Sandegren, 43 and Andersson 2013), horizontal gene transfer (Ochman, Lawrence, and Groisman 2000), and genome 44 evolution (Plague et al. 2017; Biémont and Vieira 2006; Barrick et al. 2009). In addition to minimal bioRxiv preprint doi: https://doi.org/10.1101/527788; this version posted January 23, 2019. The copyright holder for this preprint (which was3 not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 45 elements, such as insertion sequences, MGEs can contain additional “passenger” genes (Figure 1e). These 46 passenger genes can code for a variety of proteins, including virulence factors, antibiotic resistance genes, 47 detoxifying agents, and enzymes for secondary metabolism (Rankin, Rocha, and Brown 2011). 48 Historical estimates suggest that IS movements make up approximately 3% of acQuired mutations 49 in adaptive laboratory evolution (ALE) experiments (Conrad, Lewis, and Palsson 2011), but more recent 50 studies have found that the rate of IS mutagenesis can vary depending on growth conditions, such as 51 oxygen exposure (Finn et al. 2017). Under neutral conditions, the rate of IS element insertion in E. coli is 52 estimated as 3.5 x 10-4 per genome per generation under neutral laboratory conditions (Lee et al. 2016); 53 this IS insertion rate is about 1/3rd the rate of base substitution (Lee et al. 2012). Whereas the majority of 54 studies investigating IS insertion rates have been carried out in E. coli, the rate of IS insertions does vary 55 between strains and species (Knöppel et al. 2018; Lee et al. 2016). Functionally, IS elements have been 56 shown to play an important role in antibiotic resistance in studies of small collections of clinical isolates 57 (Olliver et al. 2005; Sun et al. 2016), but a larger survey of IS insertions in clinical isolates has not yet been 58 carried out. Our somewhat narrow understanding of the location and role of IS elements in bacterial 59 adaptation is likely the result of limited awareness of the role of these elements in adaptation, an 60 assumption that short reads cannot accurately identify such insertions, and the lack of flexible and 61 sensitive tools to identify such mutations. 62 Several approaches exist for identifying IS elements and cargo-carrying MGEs for both prokaryotic 63 and eukaryotic genomes (Barrick et al. 2014; E. Lerat 2010; Xie and Tang 2017; Treepong et al. 2018; Jiang 64 et al. 2015; Adams, Bishop, and Wright 2016; Biswas et al. 2015; Hawkey et al. 2015). For example, a 65 database-dependent approach has been used to identify novel IS elements through performing a BLAST 66 search on completed genome or draft genome assemblies (Siguier et al. 2006), and other approaches 67 relying on split-read alignments can identify MGE insertions with respect to a well-annotated reference 68 genome (Barrick et al. 2014). While these approaches are useful, most are limited due to their dependence bioRxiv preprint doi: https://doi.org/10.1101/527788; this version posted January 23, 2019. The copyright holder for this preprint (which was4 not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 69 on shared homology with known MGEs or genes, or by requiring well-annotated reference genomes using 70 sequenced isolates that closely resemble the reference. 71 Here, we sought to comprehensively identify complete MGEs, the genes they contained, and their 72 site of insertion with respect to a reference genome from short-read sequencing data. Our approach is 73 flexible, as it can use a database of MGEs when available, but it does not depend entirely on a database 74 of known MGEs and mobile genes. It is sensitive and precise enough to be used on both laboratory 75 samples and environmental isolates. We focus on MGE insertion sites, generate consensus sequences for 76 inserted elements from the clipped ends of locally-aligned reads, infer complete MGEs from sequence 77 assemblies, and build a de novo database of elements across all analyzed samples. We combine several 78 sequence inference approaches to identify large insertions, resulting in a highly sensitive and precise 79 overall approach. 80 By focusing on MGE insertion sites, we answer several Questions about these elements that have 81 not been thoroughly addressed in the past. For example, we determine the target-site specificity of each 82 of these MGEs, their level of activity relative to other MGEs within the same species, and the overall MGE 83 insertional capacity of the species of interest. Understanding the overall MGE potential helps us to 84 understand to what extent a bacterial species can adapt by means of MGE activity, either within a single 85 bacterial genome or by way of horizontal gene transfer. Additionally, we can identify genes that are 86 freQuently disrupted by MGE insertions within the population, hinting at selective pressures that may be 87 driving these adaptive insertion events 88 We use our de novo approach to analyze 12,419 sequenced isolates of nine pathogenic bacterial 89 species and demonstrate large differences in the overall MGE repertoire and the rate of MGE insertion 90 between species. We characterize MGE insertion hotspots across species and demonstrate that these 91 insertions alter the activity of genes in clinically relevant biological pathways, such as those related to 92 antibiotic resistance. We analyze MGE insertions in the context of adaptive laboratory evolution (ALE) bioRxiv preprint doi: https://doi.org/10.1101/527788; this version posted January 23, 2019.