CRISPR-Cas Systems and RNA-Guided Interference
Rodolphe Barrangou∗
Advanced Review

Clustered regularly interspaced short palindromic repeats (CRISPR) together with associated sequences (cas) form the CRISPR-Cas system, which provides adaptive immunity against viruses and plasmids in bacteria and archaea. Immunity is built through acquisition of short stretches of invasive nucleic acids into CRISPR loci as ‘spacers’. These immune markers are transcribed and processed into small noncoding interfering CRISPR RNAs (crRNAs) that guide Cas proteins toward target nucleic acids for specific cleavage of homologous sequences. Mechanistically, CRISPR-Cas systems function in three distinct stages, namely: (1) adaptation, where new spacers are acquired from invasive elements for immunization; (2) crRNA biogenesis, where CRISPR loci are transcribed and processed into small interfering crRNAs; and (3) interference, where crRNAs guide the Cas machinery to specifically cleave homologous invasive nucleic acids. A number of studies have shown that CRISPR-mediated immunity can readily increase the breadth and depth of virus resistance in bacteria and archaea. CRISPR interference can also target plasmid sequences and provide a barrier against the uptake of undesirable mobile genetic elements. These inheritable hypervariable loci provide phylogenetic information that can be insightful for typing purposes, epidemiological studies, and ecological surveys of natural habitats and environmental samples. More recently, the ability to reprogram CRISPR-directed endonuclease activity using customizable small noncoding interfering RNAs has set the stage for novel genome editing and engineering avenues. This review highlights recent studies that revealed the molecular basis of CRISPR-mediated immunity, and discusses applications of crRNA-guided interference.

© 2013 John Wiley & Sons, Ltd.

How to cite this article: WIREs RNA 2013, 4:267–278.
doi: 10.1002/wrna.1159

∗Correspondence to: [email protected]
Department of Food, Bioprocessing and Nutrition Sciences, North Carolina State University, Raleigh, NC, USA
Conflict of interest: RB is a co-inventor on several patents related to CRISPR use and applications.
WIREs RNA, Volume 4, May/June 2013 (wires.wiley.com/rna)

INTRODUCTION

Clustered regularly interspaced short palindromic repeats (CRISPR) are a relatively novel family of DNA repeats that was first discovered in 1987, in an intergenic region adjacent to the iap gene in the Escherichia coli K12 genome.1 This locus included an array of short identical 29-bp direct repeats that were interspaced at regular intervals by stretches of 32 nucleotides. Such loci seemed of little interest for approximately 15 years, until the CRISPR acronym was coined in 2002, following the observation of similar structures in a number of bacterial and archaeal genomes.2 These loci were subsequently identified in a variety of bacterial and archaeal organisms, as genome drafts were generated and became publicly available. In addition to coining the acronym, Jansen et al. established that these peculiar DNA repeat arrays were often associated with CRISPR-associated (cas) sequences.2 Over time, studies have shown that CRISPR-Cas systems generally occur in approximately 45% of bacterial and approximately 90% of archaeal genomes, as tracked by the CRISPRdb.3

In 2005, three studies nearly simultaneously established that seemingly random spacer sequences actually showed homology to foreign genetic elements such as viruses and plasmids.4–6 This subsequently led to the hypothesis that CRISPR-Cas systems may be an RNA-mediated immune system, based on the presence of several key elements that would carry out the necessary functionalities.7 In 2007, the first experimental evidence of CRISPR-Cas adaptive immunity was provided in Streptococcus thermophilus, with evidence that novel spacers were acquired in a polarized manner following phage exposure, with the ability to provide phage resistance in a Cas-dependent manner.8 Shortly thereafter, it was shown in E. coli that CRISPR immunity is mediated by small noncoding interfering CRISPR RNAs (crRNAs) that guide the CRISPR-associated complex for antiviral defense (Cascade).9 It was also shown in Staphylococcus epidermidis that CRISPR can provide immunity against plasmid DNA.10 These milestone studies established that CRISPR-Cas systems provide adaptive immunity against phages and plasmids via crRNA-guided interference, and set the stage for the characterization of the CRISPR molecular mechanism of action in functional CRISPR-Cas model systems. In the recent past, several extensive or focused reviews have summarized the rapidly expanding literature on CRISPR-Cas systems.11–21 Here, the core elements of CRISPR-Cas systems are presented, the main system types are compared and contrasted, and applications of these RNA-driven molecular machines are discussed.

CRISPR-Cas SYSTEMS

While CRISPR-Cas systems occur in nearly half of bacteria and most archaea, there are vast differences in their distribution, number, size, and components across systems and organisms.11–21 Several studies have investigated the many elements that constitute CRISPR-Cas systems, leading to a new classification which established three main types and 10 subtypes, based on genetic content, and differences in their crRNA processing and interference molecular underpinnings.22 The three main CRISPR-Cas types reflect the origin and evolutionary paths that cas genes and the proteins they encode have experienced, and illustrate their structural and functional differences.

Core Elements of CRISPR-Cas Systems

The most peculiar feature of CRISPR-Cas systems is the array of identical DNA repeats that are regularly separated by spacers (Figure 1). The CRISPR repeat sequence is typically short (27–50 nt), with most systems carrying repeats around 30 bp.11–22 The repeat sequence is often partially palindromic, with the ability to form hairpin structures. Repeats are separated by stretches of unique sequences of comparable length, called ‘spacers’.2 These sequences are derived from foreign genetic elements such as plasmids and viruses, and these proto-spacers23 are typically adjacent to a short (2–5 nt) conserved proto-spacer adjacent motif (PAM).24–25 While most loci are relatively modest in size (∼30 repeats, representing ∼1.6 kb), they can be large in some systems (up to nearly 500).3 Although most chromosomes carry 1–2 loci, there are extremophiles that carry 20 CRISPR loci belonging to different CRISPR-Cas systems.3,22

Cas proteins constitute a highly genetically polymorphic and functionally diverse family which is involved in the various steps of CRISPR-mediated immunity.11–22 There are several distinct Cas protein families that have been established, and their number, occurrence, distribution, and organization are highly variable. Recently, three CRISPR-Cas types have been established,22 based on cas phylogeny and differences in the molecular mechanism of action. Although the most widely distributed functional domain characteristic of Cas proteins is the RNA recognition motif (RRM), several Cas families include DNA binding, RNA binding, helicase and nuclease motifs, reflecting their involvement in nucleic acid transactions.

Cas1 and Cas2, which are universally distributed across the three types, have been implicated in the spacer acquisition process.26–29 Cas1 specifically exhibits nuclease activity against single-stranded and branched DNA such as Holliday junctions and replication forks, and may be implicated in the addition of novel repeats and/or spacers.26 It is generally believed that both Cas1 and Cas2 are involved in novel spacer acquisition (through PAM-associated foreign DNA sampling), novel repeat synthesis and repeat-spacer insertion at the leader end, as recently reviewed.29 In addition to these two universal cas genes, signature genes define each of the three main CRISPR-Cas types, namely cas3, cas9 and cas10 for types I, II, and III, respectively.22

Overall, CRISPR-Cas systems function in three distinct stages: (1) adaptation, where new spacers are acquired from invasive elements and integrated into the CRISPR locus for immunization; (2) crRNA biogenesis, where CRISPR loci are transcribed and processed into small interfering crRNAs; and (3) interference, where crRNAs guide the Cas machinery to specifically cleave homologous invasive nucleic acids.

FIGURE 1 | CRISPR-Cas elements and mechanism of action. The various elements that constitute CRISPR-Cas systems are graphically depicted, including cas genes (with the universal cas1 and cas2), the leader (L), spacers (boxes), and repeats (diamonds). Adaptive immunity is built through acquisition of new spacers at the leader end, which are derived from invasive DNA (here, phage DNA). In the expression stage, the repeat-spacer array is then transcribed as a full-length pre-crRNA, which is processed by cleavage into mature crRNAs. The mature RNAs then form a ribonucleoprotein complex with Cas proteins to mediate interference by guiding endonucleases toward homologous nucleic acid sequences.

The leader sequence, generally defined as an A/T-rich, noncoding stretch of nucleotides located upstream of the first CRISPR repeat, and often downstream of the last cas gene in the CRISPR operon, acts as a promoter for the transcription of the pre- of target DNA.9 Another key protein in