
FEBS Letters 586 (2012) 939–945

journal homepage: www.FEBSLetters.org

Crystal structure of Cmr2 suggests a nucleotide cyclase-related enzyme in type III CRISPR-Cas systems

Xing Zhu a,b, Keqiong Ye b,⇑

a College of Life Sciences, Beijing Normal University, Beijing 100875, China
b National Institute of Biological Sciences, 7 Science Park Road, Beijing 102206, China

Article history: Received 6 January 2012; Revised 15 February 2012; Accepted 20 February 2012; Available online 28 February 2012

Edited by Christian Griesinger

Keywords: CRISPR; Nucleotide cyclase; X-ray crystallography

Abstract

CRISPR RNAs (crRNAs) mediate sequence-specific silencing of invading viruses and plasmids in prokaryotes. The crRNA–Cmr complex cleaves complementary RNA. We report the crystal structure of Pyrococcus furiosus Cmr2 (Cas10), a component of this Cmr complex and the signature protein in type III CRISPR systems. The structure reveals a nucleotide cyclase domain with a set of conserved catalytic residues that associates with an unexpected deviant cyclase domain like dimeric cyclases. Additionally, two helical domains resemble the thumb domain of A-family DNA polymerase and Cmr5, respectively. Our results suggest that Cmr2 possesses novel enzymatic activity that remains to be elucidated.

© 2012 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

1. Introduction

Clustered regularly interspaced short palindromic repeat (CRISPR) loci and CRISPR-associated (Cas) proteins constitute an adaptive and heritable immune system against the invasion of viruses and conjugative plasmids in prokaryotes (see reviews [1–6]). This defense system is widespread and present in approximately 40% of bacteria and 81% of archaea. CRISPR loci are composed of arrays of direct repeats (23–47 bp) separated by variable DNA sequences called spacers (30–40 bp) that are derived from invader genetic elements [7–9]. Approximately 50 cas gene families located near CRISPR loci in various organisms have been documented and are proposed to be involved in various steps in CRISPR-based defense pathways [5,10–12]. According to an updated classification, CRISPR-Cas systems are grouped into three distinct types (I, II and III) and ten subtypes [5]. Each type and subtype is characterized by the presence of a signature Cas protein; for example, Cas3, Cas9 and Cas10 are the signature proteins for types I, II and III, respectively.

The CRISPR-based defense system generally works in three phases. In the first, adaptation phase, a short sequence (proto-spacer) from invading DNA is acquired and integrated into the 5′ leader end of CRISPR arrays as a new repeat-spacer unit [13]. Cas1 and Cas2, universal to all CRISPR-Cas systems, are likely involved in spacer acquisition [14,15]. In the second, expression phase, CRISPR arrays are transcribed into precursor CRISPR RNAs (pre-crRNAs), which are then cleaved at repeat regions by site-specific endonucleases (Cas6 and its variants) into small crRNAs [16–23]. In the third, interference phase, the guide crRNAs associate with Cas proteins to form effector complexes that neutralize matching DNA or RNA. Genetic and biochemical evidence has indicated that DNA is targeted in several organisms [22,24,25]. The Escherichia coli Cascade (CRISPR-associated complex for antiviral defense) complex (subtype I-E) is the best-studied DNA-targeting effector complex; it binds complementary DNA and induces its degradation by Cas3 [22,26–28].

The only known RNA-targeting effector complex has been purified from Pyrococcus furiosus and consists of a crRNA and proteins Cmr1, Cmr2 (Cas10), Cmr3, Cmr4, Cmr5 and Cmr6 [29]. These component proteins belong to the RAMP (repeat-associated mysterious protein) module [11], which was renamed the type III-B CRISPR-Cas system [5]. P. furiosus crRNAs are either 39 or 45 nucleotides (nt) in length, and the two variants share an 8-nt 5′-handle derived from the repeat sequence but differ in their 3′-ends [20,29]. The crRNA–Cmr complex cleaves complementary RNA at a fixed 14-nt distance from the 3′-end of crRNA [29]. The structural organization of the Cmr complex and the function of its individual components are unknown.

Abbreviations: CRISPR, clustered regularly interspaced short palindromic repeat; Cas, CRISPR-associated; crRNA, CRISPR RNA; NC, nucleotide cyclase; AC, adenylate cyclase; DGC, diguanylate cyclase; c-di-GMP, bis-(3′–5′)-cyclic di-guanosine monophosphate

⇑ Corresponding author. Fax: +86 10 80728592.
E-mail address: [email protected] (K. Ye).

0014-5793/$36.00 © 2012 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.febslet.2012.02.036

Cmr2 is the largest protein in the Cmr complex and is essential for the RNA-guided RNA cleavage activity [29]. Cmr2 is homologous to Csm1 in subtype III-A and to unclassified Csx11 [30]; these related proteins were recently renamed Cas10, which is the signature protein for type III CRISPR-Cas systems [5]. The signature protein of subtype I-D, Cas10d, is a degenerate homolog of Cas10. Moreover, Cas10 has been proposed to play an important role in the evolution of most CRISPR-Cas systems [30]. Cas10 has been denoted CRISPR polymerase because of its homology to the palm domain of nucleotide cyclases (NCs) and DNA polymerases [31]. Despite the important role of Cmr2 in the functioning of the Cmr complex and the evolution of the CRISPR-Cas systems, its structure and function remain unknown.

In this study, we determined the crystal structure of Cmr2, the first structure of a member of the Cas10 superfamily. The major part of the Cmr2 structure resembles an adenylate cyclase dimer and includes one cyclase domain with a set of putative catalytic residues and another highly deviant cyclase domain. Additionally, Cmr2 has two helical bundle domains that are similar to the thumb domain of A-family DNA polymerase and Cmr5, respectively. The crystal structure strongly suggests that Cmr2 is an enzyme that acts on nucleotides or nucleic acids.

2. Materials and methods

2.1. Gene cloning, protein expression and purification

The DNA sequence of Cmr2 (residues 195–871) was PCR-amplified from genomic DNA of P. furiosus DSM 3638 and cloned into a pET28a vector with a C-terminal His6-tag. The recombinant protein was expressed in E. coli BL21 (DE3) strains in LB broth containing 50 µg/ml of kanamycin. The cells were grown to an OD600 of 0.7, and protein expression was induced with 0.3 mM isopropyl-β-D-thiogalactopyranoside for 16 h at 18 °C. The cells were pelleted, resuspended in buffer A (20 mM sodium phosphate pH 7.0, 500 mM sodium chloride) supplemented with 0.1 mM phenylmethylsulfonyl fluoride, and then disrupted in a high-pressure JN-3000 PLUS cell disruptor (JNBIO). After clarification by centrifugation, the supernatant was heated at 75 °C for 15 min. The clarified supernatant was supplemented with 20 mM imidazole and applied to a 5-ml HisTrap column (GE Healthcare). The column was washed with 50 ml of 50 mM imidazole in buffer A and the protein was eluted with 500 mM imidazole in buffer A. Fractions containing the target protein were pooled, concentrated to 1.5 ml and further purified by gel filtration with a Superdex S200 16/60 column (GE Healthcare) equilibrated in buffer B (20 mM HEPES-Na pH 7.0 and 150 mM NaCl). The protein was concentrated to 30 mg/ml by an ultrafiltration device and stored at −80 °C.

The protein was labeled with selenomethionine in M9 medium by blocking methionine biosynthesis [32]. The Se-labeled protein was purified in the same way as the unlabeled protein except that the protein solution was supplemented with 2 mM dithiothreitol (DTT) after the HisTrap elution step.

2.2. Crystallization and data collection

Native and Se-labeled Cmr2 (195–871) were crystallized at 20 °C by the hanging drop vapor diffusion method with a mixture of 1 µl of protein solution (30 mg/ml, 20 mM HEPES-Na pH 7.0, 150 mM NaCl) and 1 µl of reservoir solution (0.1 M Tris–Cl pH 8.5, 33% PEG 400, 0.2 M sodium citrate). The crystals were directly frozen in liquid nitrogen without additional cryoprotection. The native and selenium derivative datasets were collected at 100 K at the Shanghai Synchrotron Radiation Facility beamline BL17U and processed with the HKL package [33]. The crystals belong to space group P2₁2₁2₁ with unit cell dimensions of a = 73.1 Å, b = 87.0 Å and c = 137.7 Å. The asymmetric unit contains one molecule of Cmr2.

2.3. Structural determination

The crystal structure was determined by single-wavelength anomalous diffraction using a Se derivative dataset collected at the Se-peak wavelength to 3.6 Å resolution. Eight heavy atoms, including seven of the ten Se atoms and one zinc atom, were identified with SHELXD [34]. Phases were calculated and solvent-modified using SHARP [35]. The initial experimental electron density map was used to build approximately 50% of the total residues, including a complete D3 domain. The sequence register was determined based on the positions of seven visible Se-Met residues and bulky side chains. The phases from partial structural models were combined with the experimental phases during density modification to calculate new electron density maps, which showed improved quality and allowed for building 83% of the residues in the final model. The model was built in COOT [36] and refined in Refmac [37] and Phenix [38]. The current model comprises Cmr2 residues 205–340, 349–375, 417–435, 446–448, 451–537, 545–613, 638–698, 707–780, 786–818 and 824–871 and 1 zinc ion. RAMPAGE analysis showed that 90.5% of the residues are in most favored regions, 8.9% in allowed regions and 0.6% in outlier regions. Structural figures were generated with PyMOL [39]. The coordinates and structure factors have been deposited in the Protein Data Bank with accession number 4DOZ.

3. Results and discussion

3.1. Structure determination and overall structure

We constructed, purified and crystallized a large fragment of P. furiosus Cmr2 (residues 195–871, hereafter referred to as Cmr2) that lacks the N-terminal putative HD nuclease domain (Fig. 1A). The HD domain is absent in some Cmr2 and Csm1 homologs. The crystal belonged to space group P2₁2₁2₁ with one Cmr2 molecule in the asymmetric unit. The structure was solved by Se-phasing and refined to an Rwork/Rfree of 0.245/0.313 at 3.1 Å resolution (Table 1, Fig. 2A). Many peripheral structural elements displayed weak electron density due to crystallographic disorder or intrinsic flexibility. In the current model, 17% of the residues were not modeled due to poor or missing electron density.

The Cmr2 structure displays a triangular shape with overall dimensions of 35 Å × 68 Å × 87 Å (Fig. 2B). The structure can be divided into four domains: D1 (residues 205–502), D2 (residues 503–592), D3 (residues 593–764) and D4 (residues 765–871) (Fig. 2C–G). The D3 domain is located in the center of the structure and contacts the other three domains in three directions. We searched the Protein Data Bank for structural homologs of the four individual domains of Cmr2 using the DALI server [40].

3.2. Cmr2 D1 and D3 are structurally similar to class III nucleotide cyclases

The central D3 domain consists of a mixed 6-stranded β-sheet flanked by one or two α-helices on each side. D3 shares significant structural similarity with the catalytic domain of class III NCs (Z score 10.9–8.6) and the palm domain of many nucleic acid polymerases, such as Y- and A-family DNA polymerases, that belong to the RRM (RNA-recognition motif)-like fold (Z score 8.2–4) [41]. This is consistent with the previous sequence-based prediction [31].
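As a rough check on the model completeness quoted in Section 3.1, the residue ranges listed in Section 2.3 can be tallied against the crystallized construct (residues 195–871). The sketch below is illustrative only: the names `MODELED_RANGES`, `CONSTRUCT`, `span_length` and `modeled_fraction` are hypothetical, and it assumes that only residues of the Cmr2 construct itself (no tag residues) enter the denominator, which the text does not state.

```python
# Residue ranges of the current model, as listed in Section 2.3.
MODELED_RANGES = [
    (205, 340), (349, 375), (417, 435), (446, 448), (451, 537),
    (545, 613), (638, 698), (707, 780), (786, 818), (824, 871),
]

# Crystallized construct (residues 195-871); tag residues are ignored (assumption).
CONSTRUCT = (195, 871)


def span_length(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


def modeled_fraction(ranges, construct):
    """Return (modeled residues, construct residues, modeled fraction)."""
    modeled = sum(span_length(first, last) for first, last in ranges)
    total = span_length(*construct)
    return modeled, total, modeled / total


modeled, total, fraction = modeled_fraction(MODELED_RANGES, CONSTRUCT)
print(f"{modeled} of {total} residues modeled ({fraction:.1%})")
```

Under these assumptions the count is 557 of 677 residues, or about 82%, which is close to the 83% built and 17% unmodeled figures given in the text; the exact denominator used for those figures is not specified.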


Fig. 1. Domain organization and sequence conservation of Cmr2. (A) Domain diagram of Cmr2. The N-terminal HD nuclease domain was not included in the construct for crystallization. (B) Sequence alignment of the Cas10 family. Sequences of 238 Cmr2 and Csm1 homologs were aligned using ClustalW. The sequences of Cmr2 from Pyrococcus furiosus DSM 3638 (gi 18977501, Pf) and Csm1 from Mycobacterium tuberculosis CPHL_A (gi 289448484, Mt) are displayed. Residues conserved in 98%, 80% and 50% of all aligned sequences are shaded in black, gray and light gray, respectively. The secondary structures observed in the Pf Cmr2 crystal structure are indicated above the alignment. Dashed lines indicate disordered regions.
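The shading scheme described in the Fig. 1B legend can be illustrated with a small sketch. It is not the procedure used to prepare the figure: the legend does not say whether conservation was scored by identity or by similarity, so the code below assumes simple identity to the most common residue in each column and treats gaps as non-conserved. The function names and the four-sequence toy alignment are hypothetical.

```python
from collections import Counter

# Thresholds and shades from the Fig. 1B legend, highest first.
SHADES = [(0.98, "black"), (0.80, "gray"), (0.50, "light gray")]


def column_conservation(column):
    """Fraction of all sequences that carry the most common residue.

    Gap characters ('-') are never counted as conserved (assumption).
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    top_count = Counter(residues).most_common(1)[0][1]
    return top_count / len(column)


def shade(fraction):
    """Map a conservation fraction to the shade named in the legend."""
    for threshold, name in SHADES:
        if fraction >= threshold:
            return name
    return "none"


def shade_alignment(sequences):
    """Return one shade per alignment column for equal-length sequences."""
    return [shade(column_conservation(col)) for col in zip(*sequences)]


# Toy alignment for illustration only; not taken from the 238-sequence set.
toy_alignment = ["GGDD", "GGDD", "GGDE", "AGD-"]
print(shade_alignment(toy_alignment))
```

With a real alignment, the same per-column score would identify invariant positions such as those highlighted in the text, subject to the identity-based scoring assumed here.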

Table 1
Data collection and refinement statistics.

Crystal form                Se-labeled            Native
Data collection
  Space group               P2₁2₁2₁               P2₁2₁2₁
  Cell dimensions
    a, b, c (Å)             73.1, 87.0, 137.7     73.4, 87.5, 137.1
    α, β, γ (°)             90, 90, 90            90, 90, 90
  Wavelength (Å)            0.9794                0.9792
  X-ray source              SSRF BL17U            SSRF BL17U
  Resolution range (Å)      30–3.6 (3.66–3.6)     30–3.1 (3.15–3.1)
  Unique reflections        10577                 16512
  Redundancy                13.3 (13.5)           6.7 (6.9)
  I/σ                       16.5 (11.0)           13.6 (4.7)
  Completeness (%)          99.8 (100)            99.2 (100)
  Rmerge                    0.147 (0.538)         0.103 (0.482)
Structure refinement
  Resolution range (Å)                            12–3.1 (3.175–3.1)
  No. reflections                                 15332
  No. atoms                                       4522
  Mean B factor (Å²)                              110.7
  Rwork                                           0.244 (0.365)
  Rfree                                           0.313 (0.400)
  Rmsd bond lengths (Å)                           0.009
  Rmsd bond angles (°)                            1.283

Values for the data in the highest resolution bin are shown in parentheses.

Class III NCs constitute a large family of proteins, including adenylate cyclases (ACs), guanylate cyclases (GCs) and diguanylate cyclases (DGCs) [42]. ACs and GCs are closely related in structure and synthesize the second messengers adenosine 3′–5′ cyclic monophosphate (cAMP) and guanosine 3′–5′ cyclic monophosphate (cGMP) from ATP and GTP, respectively. The cyclization reaction involves an in-line nucleophilic attack of the 3′-OH onto the α-phosphate of the same nucleotide. DGCs condense two GTP molecules into bis-(3′–5′)-cyclic di-guanosine monophosphate (c-di-GMP), an important signaling molecule in prokaryotes. ACs fold into a 7-stranded β-sheet packed by four α-helices (Fig. 2E) and further dimerize into a wreath-like structure with one or two active sites located in the central groove at the dimer interface [43,44]. The palm domain of NCs, corresponding to the β1-α1-α2-β2-β3-α3-β4 part of the AC structure, is conserved in many DNA and RNA polymerases.

Structure alignment of D3 with CyaC from the cyanobacterium Spirulina platensis [45], a typical AC, yielded a root mean square deviation (rmsd) of 1.225 Å (24 Cα pairs). However, strands β4′ and β6 of CyaC are missing in Cmr2 D3.

Surprisingly, the N-terminal D1 domain also exhibits weak resemblance to AC (Z score 3.4) (Fig. 2C). Nevertheless, D1 is highly divergent from AC, contains many deletions and insertions and is degenerate at the active site, which explains why the structural homology is not recognizable from the sequence. Specifically, D1 retains all four helices of AC but has a much-abbreviated β-sheet that is composed of three strands corresponding to strands β3, β1 and β4 in the AC fold. In Cmr2 D1, strand β2 of AC is replaced by helix α4 and several structured loops, and strands β5, β6 and β7 of AC are missing. Compared to AC, Cmr2 D1 contains several extra structural elements, including an N-terminal helix α1, two helices α6 and α7 inserted between α5 and β3 and a Zn-ribbon motif that leads to the D2 domain (Fig. 2C).

In the Zn-ribbon, a metal ion, most likely a zinc ion, is sandwiched between the α8–α9 loop and the C-terminal part of helix α10. The ion is coordinated by four thiol groups of C448, C451, C478 and C481 with a tetrahedral geometry. The presence of the metal ion was confirmed using an anomalous difference map calculated with data acquired for Se-labeled crystals at the peak wavelength of Se. Notably, these four Zn-coordinating cysteines are conserved in only approximately half of Cmr2 and Csm1 proteins (Fig. 1B), indicating that the Zn-ribbon motif is a variable structural element.

Fig. 2. Crystal structure of Cmr2. (A) The 2Fo−Fc electron density map contoured at the 1.5σ level. (B) Ribbon representation of the Cmr2 structure, shown in two opposite views. The D1, D2, D3 and D4 domains are colored lime, cyan, wheat and orange, respectively. The zinc ion is shown as a yellow sphere. Dots denote disordered regions. (C) Structure of D1. Structural elements that have counterparts in AC structures are green, the Zn-ribbon is yellow and others are white. Secondary structures and the N- and C-termini are labeled. (D) Structure of D3. (E) Structure of one subunit of CyaC AC (PDB ID 1WC6) [45]. The secondary structure elements are labeled. The D1, D3 and CyaC structures are aligned to a similar orientation. (G) Structure of D2. (H) Structure alignment of Cmr2 D2 and the thumb domain of T7 DNA polymerase (PDB ID 1T7P) [51]. (I) Structure of D4. (G) Structure alignment of Cmr2 D4 and Thermus thermophilus (Tt) Cmr5 (PDB ID 2ZOP) [46].

3.3. Cmr2 D2 and D4 domains

The D2 domain forms a helical bundle that remotely resembles the thumb domain of A-family DNA polymerases (Z score 3–4). Both contain two antiparallel helices (α13 and α15 in Cmr2 D2) and an inter-helix region that folds back to one side of the helices (Fig. 2G–H). The thumb domain interacts with the primer–template duplex in polymerases. However, the functional implication of the structural resemblance between Cmr2 D2 and thumb domains is unclear.

The D4 domain adopts a six-helix bundle (Fig. 2I), which is structurally homologous to proteins of diverse functions, including the anticodon-binding domain of glutamyl-tRNA synthetase (Z score 5.5). The most relevant hits are two structures of Cmr5, another component in the Cmr complex, from Thermus thermophilus (PDB ID 2ZOP, Z score 4.4) [46] and Archaeoglobus fulgidus (PDB ID 2OEB, Z score 4.1) (Fig. 2G). This structural homology suggests that Cmr2 D4 and Cmr5 may perform a similar function or interact with each other in the Cmr complex.
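The rmsd reported in Section 3.2 for the D3 and CyaC alignment is a root mean square distance over matched Cα atoms. A minimal sketch of that final step is given below, assuming the two structures have already been superposed and the Cα atoms paired; the superposition and pairing performed by the alignment software are not reproduced here. The function name `rmsd` and the toy coordinates are hypothetical and are not taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation over paired (x, y, z) coordinates in Å.

    Assumes the two coordinate sets are already superposed and listed
    in matching order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy example: two Cα pairs, each displaced by exactly 1 Å.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0)]
print(f"rmsd = {rmsd(set_a, set_b):.3f} Å")
```

Because the value depends on which atom pairs are included, an rmsd is best read together with the number of pairs, as in the 24 Cα pairs quoted for the D3 and CyaC comparison.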


Fig. 3. The putative active site of Cmr2 compared with that of nucleotide cyclases and DNA polymerases. (A) Conserved surface of Cmr2. The orientation is the same as the right view in Fig. 2B. The D1, D2, D3 and D4 domains are colored as in Fig. 2B. Invariant residues are red and residues with at least 80% conservation are yellow. Two ATP analogs are modeled based on the CyaC dimer structure (PDB ID 1WC6). The analogous active site in D1 is occluded for ligand binding. (B) The putative active site of Cmr2. Conserved residues are shown as sticks and labeled. (C) Comparison of Cmr2 with the activated CyaC AC structure in complex with the analog ATPαS and Mg²⁺ (PDB ID 1WC6) [45]. Cmr2 D3 is aligned with one subunit of the CyaC dimer, and equivalent helices between Cmr2 D1 and the other subunit of the CyaC dimer are connected by lines. (D) A close-up view of the aligned active sites in (C). CyaC residues are labeled and Mg²⁺ ions are shown as green spheres. (E) Comparison of Cmr2 with the DGC domain of PleD in complex with the substrate analog GTPαS and Mg²⁺ (PDB ID 2V0N) [50]. Cmr2 D3 is aligned with the DGC domain. (F) A close-up view of the aligned active sites in (E). PleD residues are labeled. (G) Comparison of Cmr2 with A-family T7 DNA polymerase (T7pol) in complex with a primer–template and a nucleoside triphosphate (PDB ID 1T7P) [51]. The N-terminal 3′–5′ exonuclease domain, thioredoxin, and the insertion of the thumb to which thioredoxin binds are not displayed for T7pol. The primer (yellow) and template (magenta) are shown as surfaces. Cmr2 D3 is aligned with the palm domain. (H) A close-up view of the aligned active sites in (G). T7pol residues are labeled.

3.4. Putative active site of Cmr2

Overall, Cmr2 and Csm1 proteins are fairly divergent in sequence. The highly (>80%) conserved residues are mainly distributed in the central D3 domain (Fig. 1B). The conserved buried residues constitute the hydrophobic core, whereas the conserved surface residues are primarily clustered around the nucleotide-binding pocket analogous to that in NCs and polymerases (Fig. 3A). This strongly suggests that Cmr2 is an active enzyme that employs this pocket for substrate binding and catalysis. Based on sequence conservation, the candidate catalytic residues of Cmr2 include the invariant residues D600, D673, D674 and S711 and the highly conserved residues Q230, D602, Y669 and K744 (Figs. 1B and 3B).

Residues D600 and D673 correspond to two invariant acidic residues at the active site of NCs and polymerases (Fig. 3C–H). These enzymes all employ a two-metal-ion mechanism to catalyze phosphoryl transfer reactions [47,48]. The two acidic residues coordinate two Mg²⁺ ions, generally referred to as ions A and B, which are involved in substrate binding and activation. The conserved configuration of the two acidic residues suggests that the reaction catalyzed by Cmr2 is likely dependent on a two-metal-ion mechanism.

In addition to similar domain structures, Cmr2 and ACs are related at the quaternary structure level. ACs form obligatory dimers because their active sites are located at the dimer interface and formed by both subunits. The two cyclase domains of Cmr2 associate with each other through an extensive interface, which buries 1471 Å² of solvent-accessible area per subunit, comparable to 1770 Å² in the CyaC AC dimer [45], and in an orientation roughly similar to that in AC dimers (Fig. 3C). However, the cyclase dimer interface is significantly different between Cmr2 and AC. In AC dimers, the N-terminal part of β1, the C-terminal part of β4 and β4′ form a long arm wrapping around the other subunit, contributing critically to the dimer interface; however, a similar arm is absent in both cyclase domains of Cmr2. In terms of domain architecture, Cmr2 is similar to mammalian membrane-bound ACs, which contain an active and an inactive cyclase domain in a single polypeptide that form an intramolecular dimer.

In each active site of AC dimers, one subunit coordinates two Mg²⁺ ions and the other subunit (denoted by prime) contributes to substrate binding and transition state stabilization [45,48]. For example, in CyaC, the conserved N1146 and R1150 residues within helix α4′ contact the ribose and phosphate moiety of the substrate (Fig. 3D). By contrast, helix α8 in Cmr2 D1, the equivalent of CyaC helix α4′, is distant from the nucleotide-binding site and is unlikely to bind the nucleotide. Moreover, Cmr2 D1 lacks the equivalent of the β6′–β7′ loop of AC, which covers the nucleotide-binding site (Fig. 3D). The substrate-binding pocket of Cmr2 is more open compared to AC and may be able to accept large substrates.

The other predicted catalytic residues of Cmr2 are not conserved in ACs and polymerases, but interestingly are conserved in DGCs at three positions. First, Cmr2 is characterized by an invariant tetrapeptide GGDD that constitutes a turn connecting strands β5 and β6 in D3 (Figs. 1B and 3B), whereas DGCs contain a signature GG(D/E)EF motif at the equivalent position (DGC is also known as the GGDEF domain) [49]. Second, Cmr2 residues S711 and D602 interact with each other by forming side chain hydrogen bonds. A similar interacting Ser-Asp pair is present in DGCs. Third, the equivalent residue of Cmr2 K744 is conserved as a basic residue in DGCs and contacts the γ-phosphate of GTP [50]. The Cmr2-catalyzed reaction may be more related to c-di-GMP formation than to single nucleotide cyclization.

However, Cmr2 is unlikely to synthesize c-di-GMP. The condensation of two GTP molecules into c-di-GMP requires the formation of two intermolecular 3′–5′ phosphodiester bonds. The catalytic domain of DGC is monomeric in the resting state. During the condensation reaction, two GTP-loaded DGC domains are thought to be aligned such that the 3′-OH of each GTP can make a nucleophilic attack on the α-phosphate of the other GTP [50]. However, such an aligned state of two DGC domains before the reaction has not yet been captured in a crystal structure. In the Cmr2 structure, the active cyclase domain D3 already associates with the degenerate D1 domain and hence is unlikely to perform a reaction that requires two active sites and yields a product with dyad symmetry.

Cmr2 proteins have been termed CRISPR polymerases due to homology to the palm domain of polymerases [31]. However, comparison with DNA polymerase structures suggests that the Cmr2 structure is incompatible with a template-dependent polymerase activity (Fig. 3G–H). All polymerases contain thumb, palm and fingers domains that are arranged like a right hand [47]. First, Cmr2 apparently lacks an equivalent of the fingers domain, which makes important contacts with the incoming nucleotide and the base to which it pairs. The D1 domain that constitutes one face of the active site pocket might fulfill the role of the fingers domain in binding the nucleotide. Alternatively, other components in the Cmr complex might provide a fingers domain in trans. Second, Cmr2 D2 shows an interesting resemblance to the thumb domain of A-family DNA polymerases, but it is not in a competent position to contact the primer–template. Last, Cmr2 D1 stands on the path of the template strand and would occlude the template. These features argue against Cmr2 being a canonical template-dependent polymerase, but suggest that Cmr2 may catalyze a phosphoryl transfer reaction in a template-independent manner.

4. Conclusion

The structure of Cmr2 bears significant similarity to dimeric cyclase structures, with a likely enzymatically active cyclase domain and a second, highly degenerate cyclase domain that associate with each other. The features of the active site suggest that Cmr2 catalyzes a phosphoryl transfer reaction that requires a two-metal-ion mechanism, but that seems to be distinct from the reactions that ACs, DGCs and DNA polymerases catalyze. The exact substrate and catalyzed reaction of Cmr2 remain to be elucidated.

Acknowledgements

We thank the staff at the Shanghai Synchrotron Radiation Facility beamline BL17U for data collection support. This work was supported by the Ministry of Science and Technology of China through National Basic Research Programs (973 Programs) 2010CB835402 and 2012CB910900 and the Beijing Municipal Government.

References

[1] Terns, M.P. and Terns, R.M. (2011) CRISPR-based adaptive immune systems. Curr. Opin. Microbiol. 14, 321–327.
[2] Marraffini, L.A. and Sontheimer, E.J. (2010) CRISPR interference. RNA-directed adaptive immunity in bacteria and archaea. Nat. Rev. Genet. 11, 181–190.
[3] Deveau, H., Garneau, J.E. and Moineau, S. (2010) CRISPR/Cas system and its role in phage-bacteria interactions. Annu. Rev. Microbiol. 64, 475–493.
[4] van der Oost, J., Jore, M.M., Westra, E.R., Lundgren, M. and Brouns, S.J. (2009) CRISPR-based adaptive and heritable immunity in prokaryotes. Trends Biochem. Sci. 34, 401–407.
[5] Makarova, K.S. et al. (2011) Evolution and classification of the CRISPR-Cas systems. Nat. Rev. Microbiol. 9, 467–477.
[6] Bhaya, D., Davison, M. and Barrangou, R. (2011) CRISPR-Cas systems in bacteria and archaea: versatile small RNAs for adaptive defense and regulation. Annu. Rev. Genet. 45, 273–297.
[7] Pourcel, C., Salvignol, G. and Vergnaud, G. (2005) CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies. Microbiology 151, 653–663.
[8] Mojica, F.J., Diez-Villasenor, C., Garcia-Martinez, J. and Soria, E. (2005) Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J. Mol. Evol. 60, 174–182.

[9] Bolotin, A., Quinquis, B., Sorokin, A. and Ehrlich, S.D. (2005) Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin. Microbiology 151, 2551–2561.
[10] Makarova, K.S., Grishin, N.V., Shabalina, S.A., Wolf, Y.I. and Koonin, E.V. (2006) A putative RNA-interference-based immune system in prokaryotes: computational analysis of the predicted enzymatic machinery, functional analogies with eukaryotic RNAi, and hypothetical mechanisms of action. Biol. Direct. 1, 7.
[11] Haft, D.H., Selengut, J., Mongodin, E.F. and Nelson, K.E. (2005) A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes. PLoS Comput. Biol. 1, e60.
[12] Jansen, R., Embden, J.D., Gaastra, W. and Schouls, L.M. (2002) Identification of genes that are associated with DNA repeats in prokaryotes. Mol. Microbiol. 43, 1565–1575.
[13] Barrangou, R., Fremaux, C., Deveau, H., Richards, M., Boyaval, P., Moineau, S., Romero, D.A. and Horvath, P. (2007) CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712.
[14] Wiedenheft, B., Zhou, K., Jinek, M., Coyle, S.M., Ma, W. and Doudna, J.A. (2009) Structural basis for DNase activity of a conserved protein implicated in CRISPR-mediated genome defense. Structure 17, 904–912.
[15] Beloglazova, N. et al. (2008) A novel family of sequence-specific endoribonucleases associated with the clustered regularly interspaced short palindromic repeats. J. Biol. Chem. 283, 20361–20371.
[16] Wang, R., Preamplume, G., Terns, M.P., Terns, R.M. and Li, H. (2011) Interaction of the Cas6 Riboendonuclease with CRISPR RNAs: recognition and cleavage. Structure 19, 257–264.
[17] Sashital, D.G., Jinek, M. and Doudna, J.A. (2011) An RNA-induced conformational change required for CRISPR RNA cleavage by the endoribonuclease Cse3. Nat. Struct. Mol. Biol. 18, 680–687.
[18] Gesner, E.M., Schellenberg, M.J., Garside, E.L., George, M.M. and Macmillan, A.M. (2011) Recognition and maturation of effector RNAs in a CRISPR interference pathway. Nat. Struct. Mol. Biol. 18, 688–692.
[19] Haurwitz, R.E., Jinek, M., Wiedenheft, B., Zhou, K. and Doudna, J.A. (2010) Sequence- and structure-specific RNA processing by a CRISPR endonuclease. Science 329, 1355–1358.
[20] Hale, C., Kleppe, K., Terns, R.M. and Terns, M.P. (2008) Prokaryotic silencing (psi)RNAs in Pyrococcus furiosus. RNA 14, 2572–2579.
[21] Carte, J., Wang, R., Li, H., Terns, R.M. and Terns, M.P. (2008) Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes. Genes Dev. 22, 3489–3496.
[22] Brouns, S.J., Jore, M.M., Lundgren, M., Westra, E.R., Slijkhuis, R.J., Snijders, A.P., Dickman, M.J., Makarova, K.S., Koonin, E.V. and van der Oost, J. (2008) Small CRISPR RNAs guide antiviral defense in prokaryotes. Science 321, 960–964.
[23] Tang, T.H., Bachellerie, J.P., Rozhdestvensky, T., Bortolin, M.L., Huber, H., Drungowski, M., Elge, T., Brosius, J. and Huttenhofer, A. (2002) Identification of 86 candidates for small non-messenger RNAs from the archaeon Archaeoglobus fulgidus. Proc. Natl. Acad. Sci. USA 99, 7536–7541.
[24] Garneau, J.E., Dupuis, M.E., Villion, M., Romero, D.A., Barrangou, R., Boyaval, P., Fremaux, C., Horvath, P., Magadan, A.H. and Moineau, S. (2010) The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature 468, 67–71.
[25] Marraffini, L.A. and Sontheimer, E.J. (2008) CRISPR interference limits horizontal gene transfer in staphylococci by targeting DNA. Science 322, 1843–1845.
[26] Wiedenheft, B., Lander, G.C., Zhou, K., Jore, M.M., Brouns, S.J., van der Oost, J., Doudna, J.A. and Nogales, E. (2011) Structures of the RNA-guided surveillance complex from a bacterial immune system. Nature 477, 486–489.
[27] Jore, M.M. et al. (2011) Structural basis for CRISPR RNA-guided DNA recognition by Cascade. Nat. Struct. Mol. Biol. 18, 529–536.
[28] Sinkunas, T., Gasiunas, G., Fremaux, C., Barrangou, R., Horvath, P.
[29] Hale, C.R., Zhao, P., Olson, S., Duff, M.O., Graveley, B.R., Wells, L., Terns, R.M. and Terns, M.P. (2009) RNA-guided RNA cleavage by a CRISPR RNA-Cas protein complex. Cell 139, 945–956.
[30] Makarova, K.S., Aravind, L., Wolf, Y.I. and Koonin, E.V. (2011) Unification of Cas protein families and a simple scenario for the origin and evolution of CRISPR-Cas systems. Biol. Direct. 6, 38.
[31] Makarova, K.S., Aravind, L., Grishin, N.V., Rogozin, I.B. and Koonin, E.V. (2002) A DNA repair system specific for thermophilic Archaea and bacteria predicted by genomic context analysis. Nucleic Acids Res. 30, 482–496.
[32] Van Duyne, G.D., Standaert, R.F., Karplus, P.A., Schreiber, S.L. and Clardy, J. (1993) Atomic structures of the human immunophilin FKBP-12 complexes with FK506 and rapamycin. J. Mol. Biol. 229, 105–124.
[33] Otwinowski, Z. and Minor, W. (1997) Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326.
[34] Sheldrick, G.M. (2008) A short history of SHELX. Acta Crystallogr. A 64, 112–122.
[35] Vonrhein, C., Blanc, E., Roversi, P. and Bricogne, G. (2007) Automated structure solution with autoSHARP. Methods Mol. Biol. 364, 215–230.
[36] Emsley, P. and Cowtan, K. (2004) Coot: model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126–2132.
[37] Murshudov, G.N., Vagin, A.A., Lebedev, A., Wilson, K.S. and Dodson, E.J. (1999) Efficient anisotropic refinement of macromolecular structures using FFT. Acta Crystallogr. D Biol. Crystallogr. 55, 247–255.
[38] Adams, P.D. et al. (2010) PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D Biol. Crystallogr. 66, 213–221.
[39] DeLano, W.L. (2002) The PyMOL user's manual, Delano Scientific, San Carlos, CA, USA.
[40] Holm, L. and Sander, C. (1993) Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233, 123–138.
[41] Aravind, L., Mazumder, R., Vasudevan, S. and Koonin, E.V. (2002) Trends in protein evolution inferred from sequence and structure analysis. Curr. Opin. Struct. Biol. 12, 392–399.
[42] Sinha, S.C. and Sprang, S.R. (2006) Structures, mechanism, regulation and evolution of class III nucleotidyl cyclases. Rev. Physiol. Biochem. Pharmacol. 157, 105–140.
[43] Zhang, G., Liu, Y., Ruoho, A.E. and Hurley, J.H. (1997) Structure of the adenylyl cyclase catalytic core. Nature 386, 247–253.
[44] Tesmer, J.J., Sunahara, R.K., Gilman, A.G. and Sprang, S.R. (1997) Crystal structure of the catalytic domains of adenylyl cyclase in a complex with Gsalpha.GTPgammaS. Science 278, 1907–1916.
[45] Steegborn, C., Litvin, T.N., Levin, L.R., Buck, J. and Wu, H. (2005) Bicarbonate activation of adenylyl cyclase via promotion of catalytic active site closure and metal recruitment. Nat. Struct. Mol. Biol. 12, 32–37.
[46] Sakamoto, K., Agari, Y., Agari, K., Yokoyama, S., Kuramitsu, S. and Shinkai, A. (2009) X-ray crystal structure of a CRISPR-associated RAMP superfamily protein, Cmr5, from Thermus thermophilus HB8. Proteins 75, 528–532.
[47] Steitz, T.A. (1999) DNA polymerases: structural diversity and common mechanisms. J. Biol. Chem. 274, 17395–17398.
[48] Tesmer, J.J., Sunahara, R.K., Johnson, R.A., Gosselin, G., Gilman, A.G. and Sprang, S.R. (1999) Two-metal-ion catalysis in adenylyl cyclase. Science 285, 756–760.
[49] Chan, C., Paul, R., Samoray, D., Amiot, N.C., Giese, B., Jenal, U. and Schirmer, T. (2004) Structural basis of activity and allosteric control of diguanylate cyclase. Proc. Natl. Acad. Sci. USA 101, 17084–17089.
[50] Wassmann, P., Chan, C., Paul, R., Beck, A., Heerklotz, H., Jenal, U. and Schirmer, T. (2007) Structure of BeF3−-modified response regulator PleD: implications for diguanylate cyclase activation, catalysis, and feedback inhibition. Structure 15, 915–927.
[51] Doublie, S., Tabor, S., Long, A.M., Richardson, C.C. and Ellenberger, T. (1998)
and Siksnys, Crystal structure of a bacteriophage T7 DNA replication complex at 2.2 A V. (2011) Cas3 is a single-stranded DNA nuclease and ATP-dependent helicase resolution. Nature 391, 251–258. in the CRISPR/Cas immune system. EMBO J. 30, 1335–1342.