CRISPR System Acquisition and Evolution of an Obligate Intracellular
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/028977; this version posted October 13, 2015. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 1 CRISPR system acquisition and evolution of an obligate intracellular 2 Chlamydia-related bacterium 3 4 Claire Bertelli1,2, Ousmane Cissé1, Brigida Rusconi1, Carole Kebbi-Beghdadi1, Antony Croxatto1, 5 Alexander Goesmann3, François Collyn1, Gilbert Greub°1 6 1 Center for Research on Intracellular Bacteria, Institute of Microbiology, University Hospital Center 7 and University of Lausanne, Lausanne, Switzerland, 2 SIB Swiss Institute of Bioinformatics, Lausanne, 8 Switzerland 3 Bioinformatics and Systems Biology, Justus-Liebig-University Giessen, Gießen, Germany 9 10 °Corresponding author: 11 Gilbert Greub, MD PhD 12 Institute of Microbiology 13 University of Lausanne 14 1011 Lausanne 15 SWITZERLAND 16 Phone : +41-21-314 49 79 17 Fax : +41-21-341 40 60 18 e-mail : [email protected] 19 20 Running Title: The Protochlamydia CRISPR system 21 22 bioRxiv preprint doi: https://doi.org/10.1101/028977; this version posted October 13, 2015. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 23 ABSTRACT 24 Recently, a new Chlamydia-related organism, Protochlamydia naegleriophila KNic, was discovered 25 within a Naegleria amoeba. To decipher the mechanisms at play in the modeling of genomes from 26 the Protochlamydia genus, we sequenced de novo the full genome of Pr. naegleriophila combining 27 the advantages of two second-generation sequencing technologies. The assembled complete 28 genome comprises a 2,885,111 bp chromosome and a 145,285 bp megaplasmid. For the first time 29 within the Chlamydiales order, a CRISPR system, the immune system of bacteria, was discovered on 30 the chromosome. It is composed of a small CRISPR locus comprising eight repeats and the associated 31 cas and cse genes of the subtype I-E. A CRISPR locus was also found within Chlamydia sp. Diamant, 32 another Pr. naegleriophila strain whose genome was recently released, suggesting that the CRISPR 33 system was acquired by a common ancestor of these two members of Pr. naegleriophila, after the 34 divergence from Pr. amoebophila. The plasmid encodes an F-type conjugative system similar to that 35 found in the Pam100G genomic island of Pr. amoebophila suggesting an acquisition of this 36 conjugative system before the divergence of both Protochlamydia species and the integration of a 37 putative Pr. amoebophila plasmid into its main chromosome giving rise to the Pam100G genomic 38 island. Overall, this new Pr. naegleriophila genome sequence enables to investigate further the 39 dynamic processes shaping the genomes of Chlamydia-related bacteria. 40 bioRxiv preprint doi: https://doi.org/10.1101/028977; this version posted October 13, 2015. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 41 INTRODUCTION 42 A large diversity prevails in the order Chlamydiales, as suggested by the discovery of a large number 43 of Chlamydia and Chlamydia-related bacteria belonging to nine different families (Greub 2010; 44 Everett et al. 1999; Horn 2011) and the cross-examination of metagenomics data (Lagkouvardos et al. 45 2014). The family Parachlamydiaceae comprises five genera that are each represented by a small 46 number of isolated strains. The genus Protochlamydia was lately enriched by the isolation of a 47 Naegleria endosymbiont that presented 97.6% identity in the 16S rRNA with Pr. amoebophila UWE25 48 and was thus named Pr. naegleriophila strain KNic (Casson et al. 2008). Since other members of the 49 Parachlamydiaceae family were suspected to be associated with lung infections (Greub 2009), a 50 diagnostic PCR specific for Pr. naegleriophila was developed and applied to bronchoalveolar lavages. 51 Pr. naegleriophila DNA was detected in the bronchoalveolar lavage of an immunocompromised 52 patient with pneumonia by two PCRs targeting different genomic regions and the presence of the 53 bacterium in the sample was confirmed by direct immunofluorescence (Casson et al. 2008). These 54 results indicated a potential role of Pr. naegleriophila in lower respiratory tract infections. 55 A recent study including Chlamydia genomes and other members of the Planctomycetes- 56 Verrucomicrobia-Chlamydia superphylum suggested that the branch leading to Chlamydia was 57 shaped mainly by genome reduction and evidenced limited occurrence of gene birth, duplication and 58 transfer (Kamneva et al. 2012), as it is the case in other strict intracellular pathogens (Darby et al. 59 2007). On the contrary, the occurrence of large families of paralogs in the genome of Chlamydia- 60 related bacteria suggested an evolution by extensive gene duplication (Domman et al. 2014; Eugster 61 et al. 2007). The chromosome sequence of Pr. amoebophila UWE25 exhibited little evidence for the 62 occurrence of lateral gene transfer (Horn et al. 2004). However, a number of probable lateral gene 63 transfers were identified between Parachlamydia and other amoeba-infecting bacteria such as 64 Legionella (Gimenez et al. 2011), a process that may take place within the amoeba itself (Bertelli & 65 Greub 2012). The Pr. amoebophila genome presents a genomic island (Pam100G) that encodes a 66 type IV secretion system of the F-type that might be involved in conjugative DNA transfer (Greub et 67 al. 2004). A similar system was found on the plasmid of Simkania negevensis (Collingro et al. 2011) 68 and a partial operon was described in Parachlamydia acanthamoebae (Greub et al. 2009), suggesting 69 active DNA transfer capabilities in the ancestor of the Chlamydiales and some of its descendants. 70 Small interspaced repetitions were initially observed in E. coli (Ishino et al. 1987) and they were then 71 named CRISPR, an acronym for Clustered Regularly Interspaced Short Palindromic Repeats (Jansen et 72 al. 2002). Although found in 50% of bacteria and in 90% of archaea (Weinberger et al. 2012), a 73 CRISPR system had never been previously reported in a member of the order Chlamydiales bioRxiv preprint doi: https://doi.org/10.1101/028977; this version posted October 13, 2015. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 74 (Makarova et al. 2011). The CRISPR locus usually consists of a variable number (up to 587) of 23-47bp 75 repeats with some dyad symmetry, but not truly palindromic, interspaced by 21-72 bp spacers 76 (Horvath & Barrangou 2010). Associated with these repeats are 2 core cas genes and additional 77 subtype-specific genes putatively providing mechanistic specificity (Koonin & Makarova 2013). 78 Similarity between spacers and extrachromosomal elements first suggested a role in immunity 79 against phage infection and more generally against conjugation or transformation by acquisition of 80 external DNA (Bolotin et al. 2005). The CRISPR-Cas system was shown to mediate an antiviral 81 response thus inducing resistance to phage infection (Deveau et al. 2010), notably in E. coli (Brouns 82 et al. 2008). More recently, CRISPR-Cas systems were shown to regulate stress-related response, 83 changing gene expression and virulence traits in several pathogens among which is the intracellular 84 bacteria Francisella novicida (Sampson & Weiss 2014; Louwen et al. 2014). 85 In this contribution, we sequenced and analyzed the complete genome of Pr. naegleriophila strain 86 KNic and discovered two potentially antagonistic systems, a type IV secretion system likely implicated 87 in conjugative DNA transfer and a CRISPR system that generally controls foreign DNA acquisition. 88 Furthermore, the complete genome sequence of a new species within the genus Protochlamydia 89 offered the possibility to look into the genome dynamics throughout evolution by comparing Pr. 90 naegleriophila KNic gene content and genome architecture to its closest relatives of the family 91 Parachlamydiaceae. 92 93 RESULTS 94 Chromosome features and evolution 95 Pr. naegleriophila KNic possesses a 2’885’090 bp circular chromosome with a mean GC content of 96 42.7%. The genome size and the GC content are surprisingly high compared to the most closely- 97 related species, Pr. amoebophila (Table 1), but it is consistent with its closest relative Chlamydia sp. 98 Diamant, another Pr. naegleriophila strain (hereafter referred to as Pr. naegleriophila Diamant). The 99 chromosome of Pr. naegleriophila strain KNic is predicted to encode 2,415 proteins and exhibits four 100 ribosomal operons and 43 tRNAs, more than any other Chlamydiales (Table 1). Two types of spacers 101 are found between the 16S and the 23S rRNA: either a simple intergenic spacer or a spacer 102 containing two tRNAs for Ala and Ile. 103 The cumulative G versus C nucleotide bias (GC skew) presents a typical pyramidal shape (Figure S1) 104 that is expected in the absence of particular large genomic islands and confirms the assembly 105 accuracy. The GC skew of Pr. naegleriophila is smoother than that of Pr. amoebophila, and does not bioRxiv preprint doi: https://doi.org/10.1101/028977; this version posted October 13, 2015. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license.