Recent Mobility of Casposons, Self-Synthesizing Transposons at the Origin of the CRISPR-Cas Immunity
To cite this version: Mart Krupovic, Sergey Shmakov, Kira S Makarova, Patrick Forterre, Eugene V Koonin. Recent Mobility of Casposons, Self-Synthesizing Transposons at the Origin of the CRISPR-Cas Immunity. Genome Biology and Evolution, Society for Molecular Biology and Evolution, 2016, 8 (2), pp.375-86. 10.1093/gbe/evw006. hal-01443928

HAL Id: hal-01443928
https://hal.archives-ouvertes.fr/hal-01443928
Submitted on 13 Feb 2017

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Distributed under a Creative Commons Attribution - NonCommercial 4.0 International License

Mart Krupovic1,*, Sergey Shmakov2,3, Kira S. Makarova2, Patrick Forterre1, and Eugene V. Koonin2
1 Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Department of Microbiology, Institut Pasteur, Paris, France
2 National Library of Medicine, National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland
3 Skolkovo Institute of Science and Technology, Skolkovo, Russia
*Corresponding author: E-mail: [email protected].
Accepted: January 8, 2016

Abstract

Casposons are a superfamily of putative self-synthesizing transposable elements that are predicted to employ a homolog of Cas1 protein as a recombinase and could have contributed to the origin of the CRISPR-Cas adaptive immunity systems in archaea and bacteria. Casposons remain uncharacterized experimentally, except for the recent demonstration of the integrase activity of the Cas1 homolog, and given their relative rarity in archaea and bacteria, original comparative genomic analysis has not provided direct indications of their mobility. Here, we report evidence of casposon mobility obtained by comparison of the genomes of 62 strains of the archaeon Methanosarcina mazei. In these genomes, casposons are variably inserted in three distinct sites, indicative of multiple recent gains and losses. Some casposons are inserted into other mobile genetic elements that might provide vehicles for horizontal transfer of the casposons. Additionally, many M. mazei genomes contain previously undetected solo terminal inverted repeats that apparently are derived from casposons and could resemble intermediates in CRISPR evolution. We further demonstrate the sequence specificity of casposon insertion and note clear parallels with the adaptation mechanism of CRISPR-Cas. Finally, besides identifying additional representatives in each of the three originally defined families, we describe a new, fourth, family of casposons.

Key words: casposons, self-synthesizing transposons, CRISPR-Cas, mobile genetic elements, transposition.

© The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]
Genome Biol. Evol. 8(2):375–386. doi:10.1093/gbe/evw006. Advance Access publication January 13, 2016.

Introduction

The genomes of most bacteria, archaea, and eukaryotes contain multiple, integrated mobile genetic elements (MGEs), such as transposons, proviruses, and integrative plasmids. In many eukaryotes, in particular plants, the MGE-derived sequences comprise the majority of the genomic DNA. Most of these elements are “dead,” that is, inactive and disrupted to various extents (Kazazian 2004; Venner et al. 2009; Tollis and Boissinot 2012). The MGEs are less abundant in archaea and bacteria, conceivably due to the intense purifying selection that constrains the spread of selfish elements, but nevertheless constitute up to 30% of some bacterial genomes (Casjens 2003; Carle et al. 2010). The MGE insertion can be deleterious when an element disrupts an essential host gene, nearly neutral when it inserts into an intergenic region or a nonessential gene, or beneficial to the host, through the gain of new phenotypes, such as antibiotic resistance or toxin production (Roberts and Mullany 2009; Cambray et al. 2010; Carle et al. 2010; Hua-Van et al. 2011).

Transposons are a major type of MGE that move from one location in the host genome to another. Transposons are naturally divided into two classes (Wicker et al. 2007; Kapitonov and Jurka 2008; Hua-Van et al. 2011; Piégu et al. 2015). Class I includes retrotransposons, which transpose via an RNA intermediate that prior to integration is converted into the DNA form by the transposon-encoded reverse transcriptase. Class II consists of DNA transposons that are mobilized via the cut-and-paste mechanism, that is, excision of the transposon from its initial location and insertion into a new genomic locus. The excision and insertion are catalyzed by an element-encoded transposase (or by a transposase supplied by another element, in the case of nonautonomous transposons) and typically require terminal inverted repeats (TIR) that flank the transposon. Class II transposons show remarkable diversity with respect to the specific mechanisms of transposition, the identity of the transposase, the element size, and gene content (Jurka et al. 2007; Wicker et al. 2007; Hua-Van et al. 2011; Piégu et al. 2015). Most transposases belong to the DDE superfamily, named after the amino acid residues that form the catalytic triad (Rice and Baker 2001; Wicker et al. 2007; Hua-Van et al. 2011), but some transposons possess transposases homologous to the rolling-circle replication initiation endonucleases (Ilyina and Koonin 1992; Chandler et al. 2013; Krupovic 2013), phage integrase-like tyrosine recombinases (Goodwin et al. 2003; Goodwin and Poulter 2004), or serine integrases/invertases (Boocock and Rice 2013).

A distinct group of Class II elements that is widespread in diverse unicellular and multicellular eukaryotes includes large (15–20 kb), self-synthesizing DNA transposons, known as Mavericks or Polintons (Kapitonov and Jurka 2006; Feschotte and Pritham 2007; Pritham et al. 2007). These transposons encode their own protein-primed family B DNA polymerase that is implicated in the transposon replication (Kapitonov and Jurka 2006). In addition, Polintons encode several homologs of viral proteins that perform key functions in virion morphogenesis, namely the genome packaging ATPase and capsid maturation protease, and as recently shown, also major and minor capsid proteins (Krupovic et al. 2014a). Thus, these elements (that perhaps more appropriately could be denoted polintoviruses) combine features of viruses and transposons (Krupovic and Koonin 2015), although their virions and the conditions that trigger the predicted switch to the virus life style remain to be identified. Polintons have apparently played a key role in the emergence of several groups of eukaryotic DNA viruses and plasmids (Koonin et al. 2015; Krupovic and Koonin 2015).

Until recently, Polintons remained the only known (super)family of self-synthesizing transposons. However, in the course of an in-depth investigation of bacterial and archaeal genomic

closely parallel to the proposed scheme for the origin of the vertebrate adaptive immunity systems that involves a distinct family of Transib transposons (Kapitonov and Jurka 2005; Kapitonov and Koonin 2015), suggesting that in the evolutionary arms race between pathogens and hosts, transposons are regularly recruited as assault weapons for cellular defense (Koonin and Krupovic 2015b).

The casposons have been identified in a relatively small number of archaeal and only a handful of bacterial genomes. In the absence of direct experimental evidence, the original comparative genomic analysis provided no specific evidence that any of the casposons were active MGEs. Here, we update the genomic census of the casposons and take advantage of the expanding collection of microbial genomes to analyze the distribution of these elements in 62 strains of the archaeon Methanosarcina mazei. The results of this analysis reveal recent mobility of the casposons and yield additional clues for the casposon involvement in the evolution of CRISPR-Cas.

Materials and Methods

All genome sequences were downloaded from the NCBI sequence database. For comprehensive identification of cas1 genes, the TBLASTN program with the E-value cutoff of 0.01 and low complexity filtering turned off (Altschul et al. 1997) was used to search the NCBI WGS (whole-genome shotgun contigs) database using the Cas1 profile (Makarova and Koonin 2015) as the query. Protein sequences were searched against the nonredundant sequence database at the NCBI using PSI-BLAST (Altschul et al. 1997; Marchler-