DNA binding induces a cis-to-trans switch in Cre to enable intasome assembly

Aparna Unnikrishnana, Carlos Amerob, Deepak Kumar Yadava, Kye Stachowskia, Devante Pottera, and Mark P. Fostera,1

aDepartment of Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210; and bLaboratorio de Bioquímica y Resonancia Magnética Nuclear, Centro de Investigaciones Químicas, Instituto de Investigación en Ciencias Básicas y Aplicadas, Universidad Autónoma del Estado de Morelos, 62209 Cuernavaca, Morelos, Mexico

Edited by G. Marius Clore, National Institute of Diabetes and Digestive and Kidney Diseases, NIH, Bethesda, MD, and approved August 14, 2020 (receivedfor review June 4, 2020) Mechanistic understanding of DNA recombination in the Cre-loxP assembly, and allosteric control of DNA cleavage. The overall system has largely been guided by crystallographic structures of mechanism of Cre-mediated DNA recombination has been mapped tetrameric synaptic complexes. Those studies have suggested a role out via many biochemical and crystallographic studies (5, 15–19). for protein conformational dynamics that has not been well charac- The catalyzes recombination between a pair of homologous terized at the atomic level. We used solution nuclear magnetic res- 34 bp loxP sites (Fig. 1C) via a highly orchestrated series of events onance (NMR) spectroscopy to discover the link between intrinsic involving recognition and binding of a pair of Cre molecules to its flexibility and function in . Transverse relaxation- semipalindromic recognition site containing two RBEs, “synapsis” optimized spectroscopy (TROSY) NMR spectra show the N-terminal of an antiparallel pair of Cre dimers bound to loxP to form a stable and C-terminal catalytic domains (CreNTD and CreCat)tobestructur- tetrameric structure, and coordinated cleavage of two opposing ally independent. Amide 15N relaxation measurements of the CreCat DNA strands to produce an intermediate with two of the Cre domain reveal fast-timescale dynamics in most regions that exhibit protomers covalently attached to a 3′-phosphate via a conformational differences in active and inactive Cre protomers in linkage, followed by strand transfer and religation to form a four- crystallographic tetramers. However, the C-terminal helix αN, impli- way DNA intermediate (5). Twofold asymmetry cated in assembly of synaptic complexes and regulation of DNA in the synaptic structures (Fig. 1A) is implicated in regulation of trans – cleavage activity via protein protein interactions, is unexpect- DNA strand cleavage activity and thereby influences the order of BIOPHYSICS AND COMPUTATIONAL BIOLOGY edly rigid in free Cre. Chemical shift perturbations and intra- and strand cleavage and direction by which the Holliday junction is re- intermolecular paramagnetic relaxation enhancement (PRE) NMR solved (17, 20, 21). data reveal an alternative autoinhibitory conformation for the αN Although much is known about the overall mechanism of the region of free Cre, wherein it packs in cis over the protein DNA reactions catalyzed by Cre and related site-specific , binding surface and . Moreover, binding to loxP DNA in- the nature of the conformational changes in the protein and duces a conformational change that dislodges the C terminus, result- protein–DNA complexes that facilitate the various steps in the ing in a cis-to-trans switch that is likely to enable protein–protein pathway remain poorly understood (16, 17, 19, 22–25). Cre binds interactions required for assembly of recombinogenic Cre intasomes. its target sequences by forming a C-shaped clamp with a C-terminal These findings necessitate a reexamination of the mechanisms by catalytic domain (CreCat) possessing the namesake tyrosine residue which this widely utilized gene-editing tool selects target sites, avoids on one side of the DNA and on the other an N-terminal domain spurious DNA cleavage activity, and controls DNA recombination efficiency. Significance

Cre recombinase | enzyme dynamics | autoinhibition | paramagnetic The Cre-loxP system is a gene-editing tool that has enabled relaxation enhancement (PRE) NMR transformative advances in immunology, neuroscience, and cardiovascular research. Still, off-target activities confound re- ite-specific DNA recombinases represent an attractive option search results and present obstacles to biomedical applications. Sfor engineering—the insertion or exchange of genes Overcoming those limitations requires understanding the steps into precise locations in chromosomes (1–4). The tyrosine recom- leading to assembly of recombination complexes, intasomes. binase family of phage-derived (e.g., λ-, Cre, and We measured the magnetic properties of nuclei in the Flp recombinases), which evolved to facilitate viral infection, gene backbone of the enzyme to correlate its intrinsic dynamics with transposition, and bacterial pathogenesis, have proven useful for its function in DNA recognition and cleavage. We found that in applications that include DNA subcloning without restriction en- the absence of DNA the C terminus of Cre appears to block the zymes and conditional expression of target genes (5–7). These DNA binding surface and active site of the enzyme. Binding to proteins bind specifically to pairs of inverted short palindromic loxP DNA induces a conformational switch that would enable DNA sequences (recombinase binding elements; RBEs) and me- the intermolecular protein–protein interactions required for diate recombination by assembly of tetrameric intasomes (Fig. 1A assembly of recombinogenic Cre intasomes. and B) which perform concerted DNA strand cleavage, exchange, Author contributions: A.U., C.A., K.S., D.K.Y., and M.P.F. designed research; A.U., C.A., and ligation reactions (5, 8, 9). Compared to other genome engi- D.K.Y., K.S., and D.P. performed research; A.U., C.A., and D.P. contributed new reagents/ neering approaches (10, 11), site-specific DNA recombinases have analytic tools; A.U., C.A., D.K.Y., K.S., and M.P.F. analyzed data; and A.U., K.S., and M.P.F. the advantage that they allow a high degree of specificity, do not wrote the paper. require involvement of additional host-encoded factors, and are The authors declare no competing interest. capable of generating cleanly integrated double-stranded DNA This article is a PNAS Direct Submission. products (3, 12–14). These advantages motivate efforts to under- Published under the PNAS license. stand and manipulate the mechanisms of this family of enzymes. 1To whom correspondence may be addressed. Email: [email protected]. Cre (Causes Recombination) is the best-studied member of the This article contains supporting information online at https://www.pnas.org/lookup/suppl/ tyrosine recombinase family, yet fundamental gaps exist in our doi:10.1073/pnas.2011448117/-/DCSupplemental. understanding of its mechanism of DNA site selection, intasome

www.pnas.org/cgi/doi/10.1073/pnas.2011448117 PNAS Latest Articles | 1of10 Downloaded by guest on September 24, 2021 ABimportant in controlling its function, both in mediating tetramer assembly and for regulating protomer activity (26–32). Despite the importance of understanding the basis for conformational differ- ences in Cre, structural insights have been largely limited to crystal structures of isosteric tetrameric synaptic complexes (5, 16, 17, 19). We have used solution NMR spectroscopy to explore the link between the intrinsic dynamic behavior of Cre and its function. First, NMR spectra of full-length Cre and of CreCat support the premise that the N- and C-terminal domains of Cre are struc- turally uncoupled. Nuclear spin relaxation experiments reveal flexibility in regions of CreCat that are associated with both protein–protein and protein–DNA interactions. Unexpectedly, we found that the region of the protein comprising C-terminal helix αN is not highly dynamic, contrary to the expectation from C crystal structures that it would be extended in solution (5, 16, 17, 19). Instead, NMR chemical shift perturbations (CSPs) and paramagnetic relaxation enhancement (PRE) experiments show the C terminus of unbound Cre to be located in the DNA- binding active site, in an apparent autoinhibitory conformation. D Upon binding to loxP DNA, the data show that the C terminus is displaced, after which it is able to participate in fledgling inter- molecular protein–protein interactions with other Cre mole- cules. These findings shed light on the previous paradoxical reports of trans interactions regulating Cre activity (33) and call for a revised view of the reaction mechanism, highlighting the role of protein dynamics in regulating conservative site-specific DNA recombination. Results NMR Spin Relaxation Shows Unexpected Rigidity in the Cre C-Terminal Fig. 1. Conformational changes underlie coordinated DNA recombination Helix αN. To explore the role of protein dynamics in enabling the by Cre recombinase. (A) Tetrameric synaptic complex of four Cre protomers interconversion of Cre reaction intermediates in solution, we bound to two loxP DNA duplexes (PDB ID code 2HOI), viewed from the expressed and purified full-length Cre (38.5 kDa, residues 1 to catalytic domain CreCat. C-terminal helix αN of each protomer packs into a 343) and a C-terminal fragment encoding the interdomain linker trans surface cavity (labeled in blue) in the adjacent protomer. (B) Super- along with the C-terminal domain (23.8 kDa, residues 127 to 343). Cat “ ” “ ” position of Cre domains from inactive (red) and active (gray) Cre This domain, herein termed CreCat (Fig. 1 B and D), includes all protomers (PDB ID code 2HOI, chains A and B, respectively) highlight dif- ferences in the β2–3 loop, helix αM, αM–N loop, and C-terminal helix αN. (C) seven active-site residues (R173, E176, K201, H289, R292, W315, loxP DNA and the loxP half-site hairpin DNA sequence used in these studies; and Y324) and the Cre-DNA binding specificity determinants arrows indicate cleavage sites in full-length loxP DNA; bases in bold repre- (R259 and E262) as well as the C-terminal region implicated in sent the recombinase binding elements. (D) Domain map of Cre recombi- loxP DNA binding cooperativity and synapsis (Fig. 1A)(17, nase. Residues 1 to 126 (blue) comprise the N-terminal domain (CreNTD). The 33–37). Both full-length Cre and CreCat are predominantly mo- interdomain linker and C-terminal domain comprise the CreCat construct Cat nomeric in the absence of DNA, as indicated by their elution times (residues 127 to 343) used in these studies (pink). Sequence of Cre is shown in size-exclusion chromatography (SI Appendix,Fig.S1A). The with corresponding secondary structure elements from synaptic complex Cat crystal structure (PDB ID code 2HOI, chain B); residues G342 and D343 are not oligomeric state of free Cre was further verified using the pulsed- modeled in electron density maps. Active-site and DNA-binding specificity field gradient stimulated echo NMR experiment (SI Appendix,Fig. determinant residues are red and cyan, respectively. Backbone amide NMR S1B) (38). The translational diffusion coefficient value was −6 2 chemical shift resonance assignments were made for all underlined residues. measured to be 0.84 × 10 cm /s, while the predicted value was − 0.8301 × 10 6 cm2/s, based on hydrodynamic calculations using the coordinates from Protein Data Bank (PDB) ID code 1CRX, chain Cat NTD A (39). This indicated that free Cre was indeed monomeric in (Cre ), connected by an extended eight-amino-acid linker 1 15 (Fig. 1). A series of sequential interprotomer protein–protein in- solution. Comparison of two-dimensional (2D) H- N transverse relaxation-optimized spectroscopy–heteronuclear single quantum teractions across the tetrameric synapse distinguish the structures of coherence spectroscopy (TROSY-HSQC) correlation spectra of theactiveandinactiveprotomers.WithinCreCat, the tyrosine res- CreCat and full-length Cre showed that the well-dispersed signals idue that serves as the nucleophile to catalyze phosphoryl transfer Cat overlay well (SI Appendix,Fig.S1C). This indicates that Cre and formation of the covalent intermediate is located on the pen- NTD α α – α folds independently in solution and that Cre does not interact ultimate helix M, while the M N loop and C-terminal helix N with CreCat in solution, thereby justifying structural and dynamics “ ” from each protomer makes a trans contact (in a clockwise manner studies of the isolated domain. as viewed in Fig. 1A) to the neighboring protomer in the tetrameric For backbone resonance assignments of CreCat we expressed β – α 15 15 13 complex. The 2 3 loop of each protomer abuts its own helix M, and purified the [U- N]– and [U- N, C]–labeled protein, and and in the tetramer packs proximally to helix αM of the neigh- recorded HSQC- and TROSY-based double- and triple-resonance boring protomer, in the opposite direction compared to helix αN. NMR spectra. Homogeneity and integrity of protein constructs Structural asymmetry in tetrameric structures is localized to these were verified by denaturing gel electrophoresis and mass spec- two regions of the protein (Fig. 1 A and B). The asymmetry in the trometry. We obtained backbone resonance assignments for 195 of C-terminal and interprotomer interfaces observed in the synaptic 214 nonproline amides (>91%) (Fig. 1D and SI Appendix,Figs.S2 complexes of Cre and other tyrosine recombinase family members and S3). Unassigned amide resonances include the two N-terminal suggests that the intrinsic dynamic behavior of the enzymes is residues (A127 and G128) and a few residues in the loop regions of

2of10 | www.pnas.org/cgi/doi/10.1073/pnas.2011448117 Unnikrishnan et al. Downloaded by guest on September 24, 2021 CreCat, due to line broadening from intermediate conformational examined the effect of deleting the C-terminal residues on the exchange. NMR spectra. We constructed a C-terminal deletion lacking To probe the fast-timescale backbone dynamics in CreCat,we 13 C-terminal residues including helix αN: CreCatΔC (residues 127 15 1 15 measured N NMR (R1,R2) and { H}- N heteronuclear NOE to 330), obtained backbone resonance assignments from triple- (HetNOE) relaxation data (Fig. 2). The data show that CreCat resonance NMR data, and compared the 1H-15N TROSY-HSQC backbone amides largely exhibit uniform relaxation rates, with an spectra of wild-type (WT) CreCat and CreCatΔC(Fig.3A and SI overall rotational correlation time (τc) of 13.2 ± 0.9 ns, com- Appendix,Fig.S4). Large CSPs were observed for residues in the puted from the trimmed-mean R2/R1 ratios (40). Distinctly lower interdomain linker region (Q133 to A136), helices αH, αK, αM, and 1 15 R2/R1 ratios and decreased { H}- N HetNOE values indicating the αJ–K loop (Fig. 3A). Each of the regions in the protein core that fast (picosecond to nanosecond) internal motions were observed showlargeCSPsisinexcessof20ÅfromthecenterofhelixαNin for the linker residues through A134 preceding αF, the β2–3 the synaptic structures. These findings are consistent with a cis loop, αJ–K loop, αM–N loop, and a few residues after the docking arrangement of the C terminus, and with the restricted C-terminal helix αN. The β2–3 loop forms interprotomer con- internal motions inferred from 15N relaxation studies. tacts in Cre-DNA synaptic tetramers and adopts distinct con- Remarkably, when the CSPs for CreCatΔC are mapped to the formations between the catalytically active and inactive forms of crystal structure of the protein (Fig. 3B), the CSPs do not map to Cre (Fig. 1 A and B) (5). The data show the αJ–K loop to be the synaptic trans docking surface of Cre. The region of the flexible in free CreCat although it adopts similar conformations in interdomain linker that shows large CSPs (Q133 to A136) is the "active" and "inactive" states in crystal structures, consistent adjacent to the DNA-binding surface and was found to be largely with a role in DNA binding, not in regulating recombination (5). rigid on the picosecond to nanosecond timescale (Fig. 2). Helix Surprisingly, although residues bracketing the C-terminal helix αH contains active site residues R173 and E176, αJ–K loop and αN exhibit evidence of fast internal motions, the relaxation rates helix αK contain the active site residues H289 and R292, and for the helix αN (A334 to L339) are close to the average values, helix αM contains the active site residue Y324. indicating that it tumbles with the catalytic core of the protein. These CSPs show that deletion of the C-terminal region alters This suggests that instead of being extended in solution, as might the environment of residues surrounding the active site of Cre be expected from the trans conformations observed in tetrameric and part of its DNA-binding surface. To ensure that the CSPs crystal structures, αN might adopt a distinct conformation in the observed in these experiments are not an artifact of working with free protein. CreCat, we compared spectra between full-length Cre and the same C-terminal deletion: CreΔC (residues 1 to 330). An overlay 1 15 Truncation of C-Terminal Helix αΝ Results in Widespread CSPs in the of H- N TROSY-HSQC spectrum of full-length Cre on that of BIOPHYSICS AND COMPUTATIONAL BIOLOGY Core of CreCat. In tetrameric DNA-bound crystal structures of CreΔC showed the same CSPs for residues from CreCat,and Cre, the C-terminal helix αN of each protomer packs in trans in a minimal perturbation of signals attributed to CreNTD (SI Appen- cyclic (nonreciprocal) manner into a surface cavity on a neigh- dix,Fig.S5). Although CSPs can be due to either direct or indirect boring protomer composed of residues from αF, αG, αH, β2–3 interactions (42, 43), the CreCatΔC CSP data suggest that the loop, αI, αK, and αL regions (Fig. 1A). Since Cre is monomeric C-terminal residues in CreCat pack in cis over the active site and in the absence of DNA, the Van Duyne group had previously DNA-binding surface of CreCat, not over the trans docking surface. used cross-linking experiments to test the hypothesis that in free Cre the C-terminal region folds back to dock over its own trans PRE-NMR Data Support Cis Docking of C-Terminal Helix αΝ of Cre. To cavity in cis; those experiments failed to detect intramolecular clarify whether the CSPs in the core of CreCat domain were in- cross-links (41). Nevertheless, the 15N relaxation data (Fig. 2) duced by direct interactions or via indirect allosteric effects, we would be consistent with a cis docking model in which rotational attached a nitroxide spin probe at the extreme C terminus of diffusion of αN is coincident with overall tumbling of the protein. CreCat and performed PRE-NMR experiments (44–46). We con- To test whether the C-terminal region of CreCat might be structed a variant of CreCat (C155A/C240A/D343C) that enabled docked in cis, instead of being extended for trans docking, we attachment of the spin probe to the C-terminal cysteine residue

A B

Cat 1 15 15 Fig. 2. Cre loops β2–3, αJ–K, and αM–N, but not helix αN, exhibit fast-timescale dynamics. (A){ H}- N HetNOE, N longitudinal R1 relaxation rate constant −1 15 −1 Cat (seconds ), N transverse R2 relaxation rate constant (seconds ), and corresponding R2/R1 ratios for Cre ; 800 MHz; 298 K; trimmed mean correlation time -9 τc in ns (10 s). Dashed lines in panels represent average values calculated excluding regions of fast-timescale dynamics. β2–3 loop (I197 to V209), αJ–K loop (Y273 to A285), and αM–N loop (N327 to G333) residues showing deviations from average measured values are shown in blue. Linker residues (A134 to E138) and C-terminal helix αN residues (A334 to L339) are highlighted in pink. The secondary structure elements of CreCat synaptic complex crystal structure (PDB ID code 2HOI, chain B) are shown above the plot for comparison. ns, nanoseconds. (B){1H}-15N HetNOE values for CreCat mapped onto the crystal structure (PDB ID code 2HOI, chain B) as a linear gradient from red (=0.3) to white (≥0.7). Residues with no data are shown in gray.

Unnikrishnan et al. PNAS Latest Articles | 3of10 Downloaded by guest on September 24, 2021 AB

Cat 2 2 1/2 Cat Fig. 3. C-terminal deletion results in CSPs in the core of Cre .(A) Per-residue amide CSPs [ΔδNH = (ΔδH + ΔδN /25) ]ofCre upon truncation of last 13 amino acids from the C terminus Δ(E331 to D343) (red bars). The secondary structure elements of Cre synaptic complex crystal structure (PDB ID code 2HOI, chain B) are shown above the plot. Asterisks indicate regions of high CSP containing active site residues R173, E176, H289, R292, W315, and Y324. Residues that showed largest CSPs or peak broadening beyond detection (assigned in CreCat but unassigned in CreCatΔC spectrum) are shown in maroon. Unassigned residues are indicated by small negative black bars. (B) Amide CSPs mapped onto the crystal structure (PDB ID code 2HOI, chain B) as a linear gradient from white (=0.0 ppm) to red (≥0.15 ppm). Side chains of active site residues R173, E176, and K201 (modeled using PyMOL) and H289, R292, W315, and Y324 are shown in blue. Truncated C-terminal residues Δ(E331 to D343) and unassigned residues are indicated in gray.

and used 1H-15N TROSY-HSQC spectra to verify that the mu- PRE Restraints Yield a Cis-Docked Model of the C Terminus of Cre, tations did not significantly perturb the structure of the protein. Consistent with CSPs. To determine whether the covalent struc- The single cysteine mutant was then conjugated with an S-(1-oxyl- ture of CreCat would allow cis docking of helix αN region without 2,2,5,5-tetramethyl- 2,5-dihydro-1H-pyrrol-3-yl) methyl meth- distorting other structural elements, we performed structure re- anesulfonothioate (MTSL) paramagnetic spin probe. Nearly 100% finement using PRE-derived distances as restraints. Analysis of Cat 15 tagging efficiency was verified using high-resolution matrix-assisted the free Cre PRE Iox/Ired values and NR2 relaxation rates laser desorption ionization (MALDI) mass spectrometry. 1H-15N yielded 124 restraints (SI Appendix, Table S1). To account for a TROSY-HSQC spectra of the MTSL-tagged protein under oxi- distribution of spin probe positions, lower bounds were set to 5 Å dized (paramagnetic) and reduced (diamagnetic) conditions were below calculated distances while the upper bounds were set to obtained and a comparison of the peak intensities was used to 10 Å higher than the calculated distances. ROSETTA energy determine per-residue PRE effects in free CreCat.InsuchaPRE- minimization was performed using upper and lower distance β – α – NMR experiment, the lone-pair electron on a paramagnetic spin bounds, assuming a flexible C terminus, 2 3 loop and J K loop regions; crystallographically observed secondary structure ele- probe is expected to induce distance-dependent line broadening for Cat protons up to 25 Å away (47). ments were enforced. Members of the PRE-derived Cre en- A strong correspondence between the C terminus-linked PRE semble (Fig. 5A) adopt stereochemically robust cis-docked effects and previously observed CSPs provide evidence for cis C-terminal conformations that differ significantly from the con- packing of the C terminus over the active site and DNA-binding formation observed in synaptic complexes (Fig. 5B). These struc- surface of Cre. In addition to residues sequentially neighboring tures are consistent with CSP data, even though those were not C343, we observed strong PRE effects from the C-terminal C343- used during refinement (Fig. 5C). A contact map showing differ- α– α MTSL spin probe to a part of the interdomain linker (E129 to ence in C C distances, between the top 10 conformers and the A136), helix αH (R173 to A178), helix αJ (S257 to I264), αJ–K synaptic crystal structure (PDB ID code 2HOI, chain A) shows loop (L284 to R292), and helix αM (G314 to Y324) (Fig. 4A and that the average changes in C-terminal interresidue distances SI Appendix, Fig. S6). To interpret the PRE effects in a structural when cis-docked ranged in values as large as 25 Å (Fig. 5D). context, we built a model of the MTSL tagged-protein using DNA Binding Perturbs the C-Terminal Region in CreCat. Coincidence of PyMOL (48) with PDB ID code 2HOI, chain B as the template, the DNA binding site and cis docking surface of the CreCat manually added the two missing C-terminal residues (G342 and C-terminal region led us to examine the effects of binding to a loxP D343C), D343 substituted with cysteine, and attaching an MTSL DNA half-site substrate (Fig. 1C) on the protein C terminus. The moiety (coordinates from PDB ID code 2XIU) (49) to the C343 1 15 Cat Cat H- N TROSY-HSQC spectrum of the Cre -DNA half-site side chain, and mapped the PRE data onto this Cre structural complex showed CSPs in residues corresponding to the Cre– model (Fig. 4C, Left). Each of the regions showing strong PRE Cat DNA interaction surface, comprising structural elements involved effects in free Cre are at distances in excess of 25 Å as mea- in DNA binding and the active site regions. Moreover, large CSPs sured from the C343-MTSL-oxygen to backbone amide nitrogen ∼ ∼ ∼ ∼ ∼ (and broadening of resonances) were also observed in the atoms: further than 42, 47, 51, 42, and 37 Å, respectively C-terminal residues E331 to D343, including helix αN(Fig.6A and (SI Appendix,Fig.S6A). B). The CSPs mapped on the crystal structure (Fig. 6C) show that To rule out the possibility that the observed PRE effects were the loxP DNA half-site binds to CreCat in the manner predicted due to intermolecular interactions, we performed an intermolec- from available tetrameric structure models, but DNA binding also ular PRE experiment by recording 1H-15N TROSY-HSQC spectra 15 Cat induces a change in the environment of the C-terminal residues, of a 1:1 mixture of [U- N]-Cre (non-tagged) with MTSL-tagged resulting in the observed CSPs. unlabeled CreCat C155A/C240A/D343C, under paramagnetic and 15 diamagnetic conditions. Because the N labels and MTSL tags are PRE-NMR Data Reveal a DNA Binding-Induced Conformational Change on different molecules any PRE effects must arise from intermo- in the Cre C Terminus. Like for the free protein, we used PRE- lecular interactions (SI Appendix,Fig.S7A) (50). Only a few sparse NMR to determine the conformational space populated by the C residues located in the trans docking surface of Cre showed in- terminus of Cre upon binding to the loxP DNA half-site. PRE termolecular PRE effects, suggestive of very weak transient inter- effects were measured using MTSL-tagged CreCat C155A/C240A/ molecular interaction (SI Appendix,Fig.S7B and C). Thus, the D343C bound to the loxP half-site hairpin DNA. PRE effects were strong intramolecular PRE effects in the 1H-15N TROSY-HSQC remarkably different in the DNA complexes in comparison to free Cat Cat amide spectrum of free Cre strongly support a model in which Cre (Fig. 4B and SI Appendix, Fig. S8). Low Iox/Ired ratios were the C-terminal region makes a cis interaction with its own DNA- observed for helix αF (E138 to M149), helix αG (G165 to E176), binding surface (Fig. 4C, Left and Right). β2andβ2–3 loop (G191 to I195, G198, and V204), β3–αIregion

4of10 | www.pnas.org/cgi/doi/10.1073/pnas.2011448117 Unnikrishnan et al. Downloaded by guest on September 24, 2021 A

B

CD BIOPHYSICS AND COMPUTATIONAL BIOLOGY

Fig. 4. PRE-NMR data reveal a cis to trans docking conformational switch in the CreCat C terminus upon DNA binding. (A) Per-residue amide PRE-NMR Cat normalized intensity ratios (Iox/Ired) of free Cre C155A/C240A/D343C generated using an MTSL paramagnetic probe at C-terminal residue C343 (green arrow); strong PRE effects (Iox/Ired < 0.65) are shown in red. Secondary structure elements of Cre synaptic complex crystal structure (PDB ID code 2HOI, chain B) are shown. Uncertainties are propagated from the signal-to-noise ratio of individual resonances. (B) Per-residue amide PRE-NMR normalized intensity ratios Cat (Iox/Ired) for Cre C155A/C240A/D343C bound to loxP half-site DNA generated using MTSL paramagnetic probe at C-terminal residue C343 (green arrow); strong PRE effects (Iox/Ired < 0.65) are shown in blue. Uncertainties are propagated from the signal-to-noise ratio of individual resonances. (C) Iox/Ired PRE values from free CreCat mapped to the structure of a protomer in the crystal structure (PDB ID code 2HOI, chain B; G342, C343-MTSL modeled) as a gradient

from red (Iox/Ired= 0) to white (Iox/Ired= 0.65). Residues with no PRE data are shown in gray. Juxtaposed equivalent surface representation (same orientation) along with a portion of the loxP DNA, as seen in the crystal (PDB ID code 2HOI, chains C and D) is shown in orange to illustrate the PRE effects mapping to the Cat DNA binding surface. (D) Iox/Ired PRE values from the Cre –DNA complex mapped to the structure of a protomer in the crystal structure (PDB ID code 2HOI, chain B; G342, C343 modeled) as a gradient from blue (Iox/Ired= 0) to white (Iox/Ired= 0.65). Bound DNA (PDB ID code 2HOI, chains C and D) is shown in pale orange. Residues with no PRE data are shown in gray. Juxtaposed equivalent surface representation (same orientation) along with C-terminal helix αN region from an adjacent MTSL-tagged protomer (PDB ID code 2HOI, chain G: E331 to D341 and G342, C343-MTSL modeled) is shown in yellow to illustrate the PRE effects mapping to protein–protein trans docking surface cavity. Orientation of CreCat is rotated ∼180° between C and D.

(A212 to R223), and helices αK–αL region (A291 to G314). These be predicted to yield weaker PRE effects in comparison to the regions together form the trans docking cavity that accommodates experiments with uniformly labeled and MTSL-tagged CreCat– the C terminus of neighboring Cre protomers in synaptic com- DNA complex (Fig. 4B), although observation of such effects in plexes (Figs. 1A and 4D, Left and Right). This change in PRE the same regions of the protein would support trans interactions. pattern (Fig. 4A versus Fig. 4B) clearly identifies a DNA-induced Despite the resonances being broader and data quality being conformational change that involves displacing the C terminus from generally lower, we indeed observed strong PRE effects consistent the DNA binding surface, to enable new protein–protein interac- with docking of DNA-bound Cre C-terminal region over the trans tions; however, they do not establish whether those new interactions docking surface (SI Appendix, Fig. S9 B and C); these effects must occur in cis (intramolecular) or in trans (intermolecular). arise from intermolecular interactions because of the labeling Intermolecular PRE-NMR studies with mixed isotopic label- pattern used. Yet, a comparison to the similarly strong PRE ef- ing were then used to clarify whether the PRE effects observed in fects in uniformly tagged and labeled CreCat–DNA complex the presence of DNA arise from intra- or intermolecular inter- studies (SI Appendix,Fig.S9C versus Fig. 4B) illustrates the pos- actions. We mixed DNA-bound non-tagged [U-15N]-CreCat and sibility of a population of DNA-bound Cre C terminus docking in MTSL-tagged unlabeled CreCat C155A/C240A/D343C at a 1:1 cis over its own trans docking cavity, while another population molar ratio, to study possible intermolecular interactions in CreCat extends out to dock in trans into the same cavity on another DNA- bound to DNA. Because these mixed samples have the potential bound Cre molecule. to interact in four different relative orientations (SI Appendix,Fig. S9A), only one of which would generate measurable PRE effects Discussion (i.e., MTSL-tagged unlabeled CreCat C155A/C240A/D343C with Cre recombinase has emerged as an important tool in molecular and C terminus docking onto a [U-15N] CreCat), the experiment would cellular biology and has several features that make it an attractive

Unnikrishnan et al. PNAS Latest Articles | 5of10 Downloaded by guest on September 24, 2021 A B C

D

Fig. 5. PRE-derived ensemble models of free CreCat shows cis docking of C terminus over the CreCat active site. (A) Top 50 PRE-derived ROSETTA models of free CreCat showing cis docking of C-terminal region (E331 to D343, in magenta). (B) Overlay of lowest ROSETTA energy member of the PRE-derived ensemble model (magenta) and the crystal structure protomer (PDB ID code 2HOI, chain A) (gray). (C) CSPs in free CreCat upon C-terminal truncation Δ(E331 to D343) (magenta) mapped onto the lowest-energy member of the PRE-derived ensemble model as a linear gradient from white (0.0 ppm) to red (≥0.15 ppm). Unassigned residues are shown in gray (as in Fig. 3B). (D) Difference in the interresidue Cα–Cα contact map between the top 10 PRE-derived models and the

DNA bound tetrameric crystal structure (PDB ID code 2HOI, chain A) (absolute values of changes in interresidue distances |ΔdistanceCα-Cα| (Å) are shown by a three-color heat map from blue to yellow). The secondary structure elements of Cre synaptic complex crystal structure (PDB ID code 2HOI, chain B) are shown along the axes. Diagonal is shown as dashed line (orange).

reagent for gene editing (7, 51, 52). Reaching that potential requires the loop in tetrameric synaptic complexes indicates that this thorough characterization of its mechanism for DNA site selection flexibility will be still retained, thereby enabling its regulatory role and control of its DNA cleavage and recombination activity. High- over adjacent active site structures. The protein dynamics ob- resolution structural studies of Cre, largely limited to crystal served in the αJ-K loop in the free protein may serve to facilitate structures of DNA-bound tetrameric synaptic complexes, have DNA binding and could thereafter be predicted to undergo suggested a role for conformational plasticity in facilitating the quenching since this region does not show structural plasticity steps within the reaction mechanism. We set out to characterize between the various Cre reaction intermediates. The surprise the solution behavior of Cre in its free and DNA-bound states in within the protein dynamics analysis lies in the observation that order to understand the link between protein dynamics and reg- while residues flanking C-terminal helix αN exhibit flexibility on the ulation of Cre assembly and activity. While solution NMR studies picosecond to nanosecond timescale αN itself does not, and instead yielded some results consistent with previous understanding of this exhibits relaxation rates consistent with overall tumbling of the prototypical member of the tyrosine recombinase family, other protein. This is counter to the expectation that the Cre C terminus surprising results call for a revised view of the reaction mechanism. would flexibly adopt a range of extended conformations to facili- Solution NMR data showed CreCat to fold independently of tate capturing an adjacent protomer via trans docking contacts. CreNTD. The well-dispersed signals in the 1H-15N TROSY-HSQC An unexpected C-terminal cis docking interaction in the prox- NMR spectrum of CreCat are nearly superimposable on that of imity of the free Cre active site was revealed from highly corre- full-length Cre, except for the signals attributable to CreNTD (SI lated CSP and PRE-NMR experiments. Given the known trans Appendix,Fig.S1C). The structural independence of the domains docking site, and prior incongruities regarding cis or trans cleavage implies structural variability that might be expected to prevent by Cre and other recombinases (53), a logical presumption was formation of suitable crystal lattices for X-ray diffraction. It also that in free Cre helix αN might dock in cis inthesamesiteit facilitates detailed NMR studies with the CreCat domain, wherein occupies when assembled in trans in synaptic complexes; indeed, reside the active site residues and sequence specificity determi- such a premise was previously tested experimentally using disul- nants, to test the role of protein dynamics in regulation of Cre fide cross-linking strategies (41). Were that the case, one would function (33–36). predict that deletion of helix αN would result in CSPs for residues Our findings of the protein dynamics in specific regions within flanking that trans docking surface cavity. Instead, we observed the CreCat revealed unique spectral signatures linked to their func- largest CSPs for residues on the other side of the protein, namely, tional roles. Underlying flexibility in CreCat studied by nuclear spin the DNA-binding and active site surfaces (Fig. 3). Since CSPs can 15 1 15 relaxation measurements ( N R1, R2 and { H}- NHetNOE) arise from both proximity and induced conformational changes, were consistent with the monomeric 217-residue protein construct we employed distance-dependent effects in PRE-NMR experi- (τc = 13.2 ns). A portion of the linker, the β2–3loop,theαJ–K ments to conclusively demonstrate the proximity of the protein C loop, and the αM–N loop each exhibit higher R1,lowerR2,and terminus to the active site and DNA-binding regions (Fig. 4A). reduced HetNOE values, indicative of fast internal motions. Consistent with size-exclusion chromatography (SI Appendix, Fig. Within the Cre4–loxP2 tetrameric synaptic complexes, conforma- S1A), NMR relaxation (Fig. 2), and published sedimentation ve- tional differences in the β2–3 loop and the C-terminal regions locity experiments (17) that show unbound Cre to be monomeric in (including helix αN) that form structural bridges that link adjacent solution, the absence of strong intermolecular PREs (SI Appendix, protomers, and in helix αM bearing the catalytic tyrosine, distin- Fig. S7) indicates that docking of the C terminus of free Cre onto the guish the catalytically “active” and “inactive” Cre protomers (5, DNA binding surface indeed occurs intramolecularly, in cis. Cis- 16, 22). Flexibility of the β2–3 loop observed in free CreCat could docked models generated using distance bounds derived from the be expected to prevent stabilization of the active site geometry in PRE data satisfy the restraints, covalent structure, and are in ex- uncoordinated Cre protomers. The alternating conformations of cellent agreement with CSPs generated by deletion of the C-terminal

6of10 | www.pnas.org/cgi/doi/10.1073/pnas.2011448117 Unnikrishnan et al. Downloaded by guest on September 24, 2021 A

B C

Cat 2 2 1/2 Cat Fig. 6. C-terminal helix αN region of Cre is perturbed upon binding to loxP DNA. (A) Per-residue amide CSPs [ΔδNH = (ΔδH + ΔδN /25) ]ofCre upon binding to loxP half-site hairpin DNA with a linear color ramp from gray (=0 ppm) to red (≥twice the Δδ SD [0.047 ppm]). Largest CSPs, or those that resulted Cat Cat–

in peak-broadening beyond detectability (assigned in free Cre but unassigned in the Cre DNA complex) are shown in maroon. Unassigned residues are BIOPHYSICS AND COMPUTATIONAL BIOLOGY indicated by small negative black bars. Secondary structure elements of Cre synaptic complex crystal structure (PDB ID code 2HOI, chain B) are shown above the plot. (B) Overlay of 1H-15N TROSY-HSQC spectra of free CreCat (black) and bound to a DNA half-site (red) with select residue assignments indicated (>3SD ppm). (C) CSPs mapped onto the crystal structure (PDB ID code 2HOI, chain B) as a linear gradient from white (≤1 SD ppm) to red (≥2 SD ppm). Unassigned residues are shown in gray.

region (Fig. 5), although these were not used in generating the structures with the C-terminal helix αN region packing over its models. A comparison between the free CreCat PRE-based ensemble own trans docking cavity without requiring significant confor- and its crystal structure in DNA-bound synaptic complexes shows mational rearrangements that would be inconsistent with the differences in interresidue contacts as large as 20 to 25 Å (Fig. 5D). NMR CSP data (Fig. 5). We conclude that in the absence of Thus, these data provide strong evidence that in the absence of DNA, the C-terminal residues spanning helix αN occupy the DNA the C-terminal residues dock in cis over the DNA-binding and DNA binding surface of Cre, and that binding to DNA produces active-site surfaces, in an apparent autoinhibitory state. a conformational change that enables productive trans protein– A Cre structure in which the C terminus is docked in cis over protein interactions with adjacent Cre molecules. The apparent the active site would seem incompatible with a DNA-bound autoinhibitory conformation observed in C-terminal residues in state—so, how is this conundrum resolved? PRE-NMR data free Cre could be expected to disfavor oligomerization of the on the CreCat–DNA complex show that the C terminus of CreCat protein prior to loxP DNA binding and to play a role in inhibiting is displaced from the cis docking surface when DNA binds there, spurious DNA binding, cleavage, and recombination activities. since PRE effects are instead observed at the trans docking cavity Although it is unclear whether the cis docking conformation of (Fig. 4B). This loxP binding-induced pattern of PREs at the trans C-terminal region in free Cre serves additional roles, our data docking site clearly indicates relocation of C-terminal helix αN suggest that the conformational rearrangements following the but does not clarify whether docking occurs in cis or in trans to a displacement of cis docking C-terminal upon loxP DNA half-site nearby Cre molecule. The use of a DNA loxP half-site substrate binding comprise an important mechanistic step that triggers Cre would be expected to prevent assembly of presynaptic complexes protomer stepwise assembly on loxP through protein–protein with two Cre molecules bound to DNA; however, PRE effects interactions via the now-exposed C-terminal residues. can be observed even when interactions are transient (54). Al- The autoinhibition of Cre inferred from these studies is emerging though partially attenuated, intermolecular PRE effects were as an important mechanism for regulating activities of nucleic acid- indeed observed at the same trans docking cavity in experiments binding proteins, ranging from transcriptional control (55) to epi- where the NMR signals from DNA-bound CreCat arise from genetic modification (56) and DNA repair (57). For instance, in the protein molecules not tagged with MTSL (SI Appendix, Fig. S9). ETS family of transcriptional repressors, a C-terminal inhibitory This attenuation could result from the fact that only one-fourth domain blocks the DNA-binding site, thereby weakening its affinity of protein–protein contacts within DNA-bound complexes can for DNA; dimerization of these proteins is thought to help over- produce intermolecular PRE effects (i.e., when C terminus from come autoinhibition and serve as a means of ensuring increased MTSL-tagged unlabeled CreCat docks in trans over non-tagged specificity (55). The DNA methyltransferase DNMT3A appears to [U-15N] CreCat when bound to DNA; SI Appendix, Fig. S9A), or exist in an autoinhibited form in which its ADD domain interacts may indicate that in the presence of DNA intramolecular cis with and inhibits enzymatic activity of the catalytic domain; binding docking over the protomer’s own trans docking cavity is also to the histone H3 tail (but not H3K4me3) disrupts ADD–CD in- possible. However, molecular modeling calculations starting teraction and thus releases the autoinhibition of DNMT3A, allowing from the DNA-bound crystal structure failed to adopt stable DNA methylation where H3K4 is not methylated (56). Nonspecific

Unnikrishnan et al. PNAS Latest Articles | 7of10 Downloaded by guest on September 24, 2021 endonucleolytic activity of XPF-ERCC1, which participates in mul- The lyophilized DNA oligo was resuspended in H2O and heated at 95 °C for tiple DNA damage-repair pathways, is prevented by an autoinhibited 15 min and immediately cooled on ice. To avoid precipitation, CreCat and loxP conformation in which the catalytic XPF subunit associates with the DNA hairpin at low concentrations (< 50 μM) in high-salt buffer (10 mM > ERCC1 (HhH)2 domain; specific engagement with DNA junctions Tris, 500 mM NaCl, pH 7.0) were mixed by stepwise titration of the protein into DNA. The solution was then dialyzed into low-salt NMR buffer (10 mM induces a conformational rearrangement that releases autoinhibition Tris, 15 mM NaCl, and 0.02% NaN3, pH 7.0) at 4 °C. The protein–DNA complex while also enabling coupling with the XPF-ERCC1 nuclease/nucle- sample was then concentrated using 500 μL, 1 kDa centrifugal filters (VWR). ase-like domains (57). Although “active” and “inactive” confor- mations of tyrosine recombinase family members are observed NMR Data Collection. Purified [U-15N] or [U-15N,13C] CreCat samples were con- within the context of tetramers in the crystal structures, the auto- centrated to 0.5 to 1.0 mM after dialysis into buffer containing 10 mM Tris, Cat inhibition implied by the cis-docked conformation of Cre ,and 100 mM NaCl, and 0.02% NaN3,pH7.0.Whenrequired,thesampleswere dependence on an extended C terminus for synapsis (5), suggest an exchanged into 10 mM d11-Tris (Sigma), 100 mM NaCl, and 0.02% NaN3,pH analogous means of ensuring specificity in assembly and engage- 7.0, buffer using Sephadex PD10 columns (GE Healthcare); 5 to 10% (vol/vol) ment of recombinogenic intasomes. Moreover, the conditional D2O (99%) and 0.66 mM DSS (2,2-dimethyl-2-silapentane-5-sulfonate) were release of the autoinhibitory conformation of Cre suggests a pos- added to the NMR samples. NMR data were recorded on Bruker Avance III HD sible new means of controlling the activity in Cre in the context of spectrometers operating at 850, 800, or 600 MHz equipped with a 5-mm triple- resonance cryoprobes and z axis gradients, at 25 °C. Data were processed with mammalian gene-editing applications. NMRPipe (59) or NMRFx (60) and visualized using NMRView (61). Typical These studies expand our understanding of the regulatory role 1H-15N TROSY-HSQC 2D correlation spectra were recorded using 16 to 24 played by C-terminal residues in Cre (5, 16, 17). DNA-binding- scans and 2,048 × 128 data points. One-dimensional pulsed-field gradient associated conformational changes mediated by C-terminal re- stimulated echo NMR experiments for measuring translational diffusion were gions of Cre and related recombinases have been implicated in recorded with the z axis gradient strength varied in linear steps from 5 to 95%. catalytic activation and initiation of oligomerization (2, 5, 28). Tetramethylsilane was used as an internal reference. However, the structural basis for Cre to undergo stepwise assembly on the DNA recognition sites has remained poorly understood due NMR Chemical Shift Assignment and Perturbation Mapping. For backbone chemical to lack of structural information on presynaptic reaction interme- shift assignments, TROSY triple-resonance spectra HNCO, HNCA, HN(CO)CA, – HN(CA)CO, and non-TROSY CBCA(CO)NH and HA(CO)NH were recorded on a diates. The features of Cre and Cre loxP complexes presented here 13 15 Cat provide insight into both the unliganded state of Cre and the purified [U- C, N] Cre sample. Backbone assignment was achieved using – NMRView aided by PINE (62) and CARA (63). The assignments have been de- mechanism of initiation of protein protein interactions that lead to posited in the Biological Magnetic Resonance Data Bank, http://www.bmrb.wisc. stepwise assembly of recombinogenic intasomes. Knowledge gaps edu/ under accession no. 50270. remain, including understanding of the precise structural rear- CSPs were calculated from differences in peak positions using rangements that accompany the cis to trans switch in presynaptic √̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅ Cre recombinase. Nevertheless, these investigations provide a dis- (Δδ )2 Δδ = Δδ 2 + N , tinct perspective on Cre recombinase and represent a transformative NH H 25 step toward understanding the structure, function, and mechanism of Δδ 1 Δδ this important and widely used gene-editing tool. where H is the chemical shift difference in the H dimension and N is the chemical shift change in the 15N dimension. Corrected SDs were determined by Materials and Methods calculating the SD of CSPs of all assigned residues, removing CSP data with >3 Protein Expression, Purification, and Site-Directed Mutagenesis. The deletion SD, and recalculating the SD. construct containing the catalytic domain and the interdomain linker (CreCat, 15 residues 127 to 343) was prepared using QuikChange Site-Directed Muta- N Relaxation Measurements. Backbone amide relaxation measurements were 15 Cat genesis Kit (Agilent) from a pET21A vector (Novagen) encoding WT Cre performed at 800 MHz proton Larmor frequency using [U- N] Cre at a con- μ recombinase (provided by Gregory Van Duyne, University of Pennsylvania, centration of 800 M in 10 mM Tris, 100 mM NaCl, and 0.02% NaN3,pH7.0, Philadelphia, PA). Further truncation and point mutants of CreCat construct buffer (5% D2O vol/vol). TROSY versions of R1 and R2 experiment data were were prepared using the Q5 Site-Directed Mutagenesis Kit (NEB). Protein ex- acquired using recycle delay of 2 s between experiments, and the following pression from Escherichia coli BL21 (DE3) cells was carried out by growing relaxation delays for R1:0,560,1,120and1,680mswithrepeatsandR2:2,24.2, freshly transformed cells (using ampicillin antibiotic, 100 mg/L) in Luria-Bertani 48.2, and 72.2 ms. The R1 and R2 values were obtained by fitting the intensity of media (natural abundance Cre) or M9 minimal medium (isotopically labeled peaks to exponential decay function and errors were determined by relaxation [U-15N] and [U-13C,15N]-Cre) supplemented with 1 g/L [15N]-ammonium chlo- curve fitting; 80 to 90% of the amides were analyzed, after excluding those with 1 15 ride (Cambridge Isotopes) as the sole nitrogen source or 2 g/L [13C]-glucose significant resonance overlap. { H}- N HetNOE values were obtained by re- 1 1 (Cambridge Isotopes) as the sole carbon source in addition to [15N]-ammonium cording spectra with and without a H presaturation period (8 s), in which H chloride. The cells were grown (shaking at 220 rpm, 37 °C) to an optical density signals were saturated using a train of 90° pulses, applied before the start of at 600 nm of 0.6 to 0.8 and induced with 0.75 mM isopropyl β-D-1-thio- experiment. HetNOE values for nonoverlapping assigned resonances were de- galactopyranoside (Gold Biotechnology) for 12 h. After harvesting the cells termined from intensity ratios in the two spectra; uncertainties were obtained (4,000 × g, 15 min, 4 °C) and sonication, the Cre constructs in 40 mM Tris, from SD of noise in the spectra. 100 mM NaCl and pH 7.0 buffer containing one protease inhibitor mixture tablet (Roche) were purified using a 5 mL SPFF cation exchange column (GE Paramagnetic Relaxation Enhancement (PRE)-NMR. We constructed a single- Healthcare)witha100mLgradientof0.1to1MNaCl.Theproteinfractions cysteine CreCat variant C155A/C240A/D343C to avoid unintended paramag- (eluted at ∼0.4 M NaCl) were combined and diluted four times with low-salt netic tagging of the native cysteines and thereby enable probing just the C 40 mM Tris, 100 mM NaCl and pH 7.0 buffer and further purified on a 5 mL terminus. MTSL tagging of CreCat C155A/C240A/D343C was achieved by fol- heparin affinity column (GE Healthcare) with a 100 mL gradient of 0.1 to 1.5 M lowing a published protocol (64). Briefly, a 200 mM MTSL (Toronto Research NaCl. Protein fractions (eluted at ∼1 M NaCl) were then purified using a Chemicals) stock was made by adding 189 μL of acetonitrile to 10 mg of MTSL HiLoad 16/60 Superdex 75 (GE Healthcare) column in 40 mM Tris, 500 mM NaCl (stored at −20 °C, protected from light). Dithiothreitol (DTT) was added to and pH 7.0 buffer. Purified protein concentrations were determined using purified protein in high-salt conditions (10 mM Tris, 500 mM NaCl, and 5 mM − − absorbance spectroscopy (e for CreCat = 23,950 M 1 cm 1 at 280 nm). Typical DTT, pH 7.0) to reduce possible disulfide bonds. The reducing agent was then yields obtained were ∼20 to 25 mg protein per L of culture. A260/A280 ratio removed by rapid buffer exchange into 10 mM Tris, 500 mM NaCl, pH 7.0, was measured to be ∼0.55. buffer using a PD10 desalting column (GE Healthcare). The sample was col- lected into exchange buffer containing 10-fold molar excess of MTSL and CreCat–DNA Complex NMR Sample Preparation. The loxP half-site hairpin was stirred at room temperature overnight. Excess MTSL was removed after reac- formed from single-stranded DNA oligos (IDT) containing RBEs along with tion by buffer exchange with a second PD10 column, followed by dialysis additional GC base pairs at ends, symmetrized spacer base pairs, and a GAA (molecular weight cutoff 3 kDa) into NMR buffer. Nearly 100% MTSL tagging hairpin (58): 5′-GCATAACTTCGTATAGCATATGCGAAGCATATGCTATACGAA- of the protein was verified using high-resolution MALDI mass spectrometry as GTTATGC-3′. indicated by an increase in mass by ∼186 Da (weight of attached probe). Final

8of10 | www.pnas.org/cgi/doi/10.1073/pnas.2011448117 Unnikrishnan et al. Downloaded by guest on September 24, 2021 − − sample concentrations ranged between 100 and 200 μMatavolumeof each amide proton, and K is a constant (1.23 × 10 32 cm6 s 2) that encap- ∼500 μL. sulates the spin properties of the MTSL tag (44). For PRE-NMR measurements, 1H-15N TROSY-HSQC spectra were recorded on Distance restraints were used in a bounded form (typical for NOE con- the same MTSL-tagged CreCat C155A/C240A/D343C sample before (paramag- straints). Due to the relative imprecision of this method of constraint distance netic) and after (diamagnetic) reduction of the spin probe by treatment with calculation (46, 65), residues with an Iox/Ired value less than 0.8 were assigned − fivefold molar excess of sodium ascorbate (250 mM stock used for minimal a lower bound of calculated r 5 Å and an upper bound was set to r +10Å. sample dilution) for 3 h at room temperature. Peak intensities were measured Residues with an Iox/Ired value greater than 0.8 were assigned a lower bound Cat and normalized using intensities of residues (D153, E222, and V230) >45 Å of 20 Å and no upper bound. Constraints for 124 residues in Cre were obtained where data (Iox, Ired,R2, ωh, and τC) for each amide proton were away from C terminus in PDB ID code 2HOI, chain B. For MTSL probes, Iox/Ired ratios between 0 and 1 indicate that the distance of probe to proton is within available. ROSETTA energy minimization was performed using the relax application 13 to 25 Å. Uncertainties in Iox/Ired values were propagated from signal- (66–70). Chain A of PDB ID code 2HOI (residues 127 to 341, and G342 and to-noise ratio of each resonance in the two spectra using D343C modeled) was used as the input structure with the D343C mutation. A √̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅̅ move map file retained secondary structure of CreCat along with the fol- σ 2 σ 2 Iox Ired β – α – σI =I = I =I ( ) + ( ) , lowing impositions: the 2 3 and J K loops and interdomain linker residues ox red ox red I I ox red (A127 to A134) were allowed to have backbone φ, ψ, and side chain χ1 15 σ σ torsional freedom based on flexibility observed via NR2/R1 data. C- where Iox/Ired is the calculated error in Iox/Ired ratio, Iox is the SD of the noise χ σ terminal E331 to C343 were also allowed to sample backbone and 1 an- in the MTSL oxidized spectrum, and Ired is the SD in the reduced spectrum. α 15 Cat gles allowing movement of helix N through space. Other residues were For the intermolecular PRE studies, [U- N] Cre (non-MTSL-tagged) was only allowed to undergo side-chain minimization. Constraints were used to Cat mixed with MTSL-tagged natural abundance Cre C155A/C240A/D343C, at generate 100 initial structures subsequently scored using the ref2015_cst a 1:1 molar ratio in NMR buffer (10 mM Tris, 100 mM NaCl, pH 7.0) at 25 °C. weights function (71). The 10 structures with the lowest atom_pair_ – For similar studies with the protein DNA complex, the complex samples constraint energy were selected as the input structures for a second round of with/without MTSL tags were prepared separately and mixed at a molar minimization wherein 10 structures were generated for each input struc- ratio of 1:1. ture. The resulting 100 structures constitute the PRE-NMR constrained en- semble model of free CreCat. PRE-Restrained Modeling Using ROSETTA. PRE-derived distance constraints for Coarse-grained contact maps were generated using MATLAB by calcu- use in structural modeling were generated using the approach of Battiste lating backbone interresidue Cα–Cα distances from PDB files (2HOI, chain A sp and Wagner (44). Briefly, per-residue R2 values (paramagnetic relaxation residues 127 to 341 and G342 and D343C modeled, and top 10 lowest-energy rate due to the spin tag) were obtained from members of free CreCat PRE-derived ensemble). Absolute values of change in interresidue distances between the contact maps were determined by sub- −Rsp t R e 2 tracting one distance matrix from the other. I =I = 2 , BIOPHYSICS AND COMPUTATIONAL BIOLOGY ox red + sp R2 R2 Data Availability. NMR chemical shift data have been deposited in the Bio- where R2 is the transverse relaxation rate of each amide and t is the ac- logical Magnetic Resonance Data Bank (accession no. 50270). sp quisition time in the proton dimension. R2 values were then used to obtain the approximate distance, r (angstroms) between the spin tag to the amide ACKNOWLEDGMENTS. We thank Dr. Chunhua Yuan, Dr. Alex Hansen, and proton of each residue: Dr. Arpad Somogyi of The Ohio State University Campus Chemical Instrumen- tation Center for assistance with NMR experimental setup and mass spectrom- 1 6 etry analysis and Dr. Eric Danhart for guidance with NMR data processing. We K 3τ r = [ (4τ + C )] , thank Dr. Gregory Van Duyne (University of Pennsylvania) for guidance and sp C + ω2 τ2 R2 1 h c for providing the plasmids encoding Cre. This study was supported by NIH grant R01 GM122432 (to M.P.F.), NSF Molecular and Cellular Biosciences grant where τC (seconds) is the per-residue correlation time of amide protons 0092962 (to M.P.F.), and NIH/National Institute of General Medical Sciences determined from R2/R1 relaxation ratios, ωh (Hz) is the Larmor frequency of grant 1P41GM111135 (to NMRBox).

1. F. Bucholtz, Principles of site-specific recombinase (SSR) technology. J. Vis. Exp., 718 17. K. Ghosh, F. Guo, G. D. Van Duyne, Synapsis of loxP sites by Cre recombinase. J. Biol. (2008). Chem. 282, 24004–24016 (2007). 2. S. E. Nunes-Düby, H. J. Kwon, R. S. Tirumalai, T. Ellenberger, A. Landy, Similarities and 18. K. Ghosh, G. D. Van Duyne, Cre-loxP biochemistry. Methods 28, 374–383 (2002). differences among 105 members of the Int family of site-specific recombinases. Nu- 19. D. N. Gopaul, F. Guo, G. D. Van Duyne, Structure of the Holliday junction intermediate cleic Acids Res. 26, 391–406 (1998). in Cre-loxP site-specific recombination. EMBO J. 17, 4175–4187 (1998). 3. D. Wirth et al., Road to precision: Recombinase-based targeting technologies for 20. S. S. Martin, E. Pulido, V. C. Chu, T. S. Lechner, E. P. Baldwin, The order of strand genome engineering. Curr. Opin. Biotechnol. 18, 411–419 (2007). exchanges in Cre-LoxP recombination and its basis suggested by the crystal structure 4. P. C. M. Fogg, S. Colloms, S. Rosser, M. Stark, M. C. M. Smith, New applications for of a Cre-LoxP Holliday junction complex. J. Mol. Biol. 319, 107–127 (2002). – phage . J. Mol. Biol. 426, 2703 2716 (2014). 21. K. Ghosh, C. K. Lau, K. Gupta, G. D. Van Duyne, Preferential synapsis of loxP sites 5. G. D. Van Duyne, Cre recombinase. Microbiol. Spectr. 3, MDNA3-A0014-2014 (2015). drives ordered strand exchange in Cre-loxP site-specific recombination. Nat. Chem. 6. G. Meinke, A. Bohm, J. Hauber, M. T. Pisabarro, F. Buchholz, Cre recombinase and Biol. 1, 275–282 (2005). – other tyrosine recombinases. Chem. Rev. 116, 12785 12820 (2016). 22. G. D. Van Duyne, A structural view of cre-loxp site-specific recombination. Annu. Rev. 7. A. M. Lanza, T. J. Dyess, H. S. Alper, Using the cre/lox system for targeted integration Biophys. Biomol. Struct. 30,87–104 (2001). into the human genome: loxFAS-loxP pairing and delayed introduction of cre DNA 23. L. Rajeev, K. Malanowska, J. F. Gardner, Challenging a paradigm: The role of DNA improve gene swapping efficiency. Biotechnol. J. 7, 898–908 (2012). homology in tyrosine recombinase reactions. Microbiol. Mol. Biol. Rev. 73, 300–309 8. N. D. F. Grindley, K. L. Whiteson, P. A. Rice, Mechanisms of site-specific recombination. (2009). Annu. Rev. Biochem. 75, 567–605 (2006). 24. N. E. Seah et al., architectures regulating the directionality of viral 9. A. Landy, The λ integrase site-specific recombination pathway. Microbiol. Spectr. 3, integration and excision. Proc. Natl. Acad. Sci. U.S.A. 111, 12372–12377 (2014). MDNA3-0051-2014 (2015). 25. E. P. Baldwin et al., A specificity switch in selected cre recombinase variants is medi- 10. T. Gaj, C. A. Gersbach, C. F. Barbas 3rd, ZFN, TALEN, and CRISPR/Cas-based methods ated by macromolecular plasticity and water. Chem. Biol. 10, 1085–1094 (2003). for genome engineering. Trends Biotechnol. 31, 397–405 (2013). 26. G. D. Van Duyne, A structural view of cre-loxp site-specific recombination. Annu. Rev. 11. D. B. T. Cox, R. J. Platt, F. Zhang, Therapeutic : Prospects and chal- – lenges. Nat. Med. 21, 121–131 (2015). Biophys. Biomol. Struct. 30,87 104 (2001). 12. A. C. Groth, M. P. Calos, Phage integrases: Biology and applications. J. Mol. Biol. 335, 27. C. Zhang et al., Redesign of the monomer-monomer interface of Cre recombinase – 667–678 (2004). yields an obligate heterotetrameric complex. Nucleic Acids Res. 43, 9076 9085 (2015). 13. N. L. Craig, Conservative site-specific recombination. Annu. Rev. Genet. 22,77–105 28. H. Aihara, H. J. Kwon, S. E. Nunes-Düby, A. Landy, T. Ellenberger, A conformational λ – (1988). switch controls the DNA cleavage activity of integrase. Mol. Cell 12, 187 198 (2003). 14. M. C. Birling, F. Gofflot, X. Warot, Site-specific recombinases for manipulation of the 29. S. Subramaniam, A. K. Tewari, S. E. Nunes-Duby, M. P. Foster, Dynamics and DNA mouse genome. Methods Mol. Biol. 561, 245–263 (2009). substrate recognition by the catalytic domain of lambda integrase. J. Mol. Biol. 329, 15. N. Sternberg, P1 site-specific recombination. III. Strand exchange 423–439 (2003). during recombination at lox sites. J. Mol. Biol. 150, 603–608 (1981). 30. M. C. Serre et al., The carboxy-terminal αN helix of the archaeal XerA tyrosine re- 16. F. Guo, D. N. Gopaul, G. D. van Duyne, Structure of Cre recombinase complexed with combinase is a molecular switch to control site-specific recombination. PLoS One 8, DNA in a site-specific recombination synapse. Nature 389,40–46 (1997). e63010 (2013).

Unnikrishnan et al. PNAS Latest Articles | 9of10 Downloaded by guest on September 24, 2021 31. A. B. Hickman, S. Waninger, J. J. Scocca, F. Dyda, Molecular organization in site- 51. V. Brault, V. Besson, L. Magnol, A. Duchon, Y. Hérault, Cre/loxP-mediated chromo- specific recombination: The catalytic domain of bacteriophage HP1 integrase at 2.7 some engineering of the mouse genome. Handb. Exp. Pharmacol.,29–48 (2007). A resolution. Cell 89, 227–237 (1997). 52. R. Sengupta et al., Viral Cre-LoxP tools aid genome engineering in mammalian cells. 32. R. A. Kazmierczak, B. M. Swalla, A. B. Burgin, R. I. Gumport, J. F. Gardner, Regulation J. Biol. Eng. 11, 45 (2017). of site-specific recombination by the C-terminus of λ integrase. Nucleic Acids Res. 30, 53. D. Esposito, J. J. Scocca, The integrase family of tyrosine recombinases: Evolution of a 5193–5204 (2002). conserved active site domain. Nucleic Acids Res. 25, 3605–3614 (1997). 33. B. Gibb et al., Requirements for catalysis in the Cre recombinase active site. Nucleic 54. G. M. Clore, Seeing the invisible by paramagnetic and diamagnetic NMR. Biochem. – Acids Res. 38, 5817 5832 (2010). Soc. Trans. 41, 1343–1354 (2013). 34. R. Hoess, K. Abremski, S. Irwin, M. Kendall, A. Mack, DNA specificity of the Cre re- 55. S. M. Green, H. J. Coyne 3rd, L. P. McIntosh, B. J. Graves, DNA binding by the ETS combinase resides in the 25 kDa carboxyl domain of the protein. J. Mol. Biol. 216, protein TEL (ETV6) is regulated by autoinhibition and self-association. J. Biol. Chem. 873–882 (1990). 285, 18496–18504 (2010). 35. A. W. Rüfer, B. Sauer, Non-contact positions impose site selectivity on Cre re- 56. X. Guo et al., Structural insight into autoinhibition and histone H3-induced activation combinase. Nucleic Acids Res. 30, 2764–2771 (2002). of DNMT3A. Nature 517, 640–644 (2015). 36. S. T. Kim, G. W. Kim, Y. S. Lee, J. S. Park, Characterization of cre-loxP interaction in the 57. M. Jones et al., Cryo-EM structures of the XPF-ERCC1 endonuclease reveal how DNA- major groove: Hint for structural distortion of mutant cre and possible strategy for junction engagement disrupts an auto-inhibited conformation. Nat. Commun. 11, HIV-1 therapy. J. Cell. Biochem. 80, 321–327 (2001). 1120 (2020). 37. L. Lee, P. D. Sadowski, Identification of Cre residues involved in synapsis, isomeriza- 58. S. Yoshizawa, G. Kawai, K. Watanabe, K. Miura, I. Hirao, GNA trinucleotide loop se- tion, and catalysis. J. Biol. Chem. 278, 36905–36915 (2003). 38. W. S. Price, Pulsed-field gradient nuclear magnetic resonance as a tool for studying quences producing extraordinarily stable DNA minihairpins. Biochemistry 36, – translational diffusion: Part 1. Basic theory. Concepts Magn. Reson., 10.1002/(sici) 4761 4767 (1997). 1099-0534(1997)9:5<299:aid-cmr2>3.0.co;2-u (1997). 59. F. Delaglio et al., NMRPipe: A multidimensional spectral processing system based on 39. J. García de la Torre, M. L. Huertas, B. Carrasco, HYDRONMR: Prediction of NMR re- UNIX pipes. J. Biomol. NMR 6, 277–293 (1995). laxation of globular proteins from atomic-level structures and hydrodynamic calcu- 60. M. Norris, B. Fetler, J. Marchant, B. A. Johnson, NMRFx processor: A cross-platform lations. J. Magn. Reson. 147, 138–146 (2000). NMR data processing program. J. Biomol. NMR 65, 205–216 (2016). 40. L. E. Kay, D. A. Torchia, A. Bax, Backbone dynamics of proteins as studied by 15N 61. B. A. Johnson, Using NMRView to visualize and analyze the NMR spectra of macro- inverse detected heteronuclear NMR spectroscopy: Application to staphylococcal molecules. Methods Mol. Biol. 278, 313–352 (2004). nuclease. Biochemistry 28, 8972–8979 (1989). 62. A. Bahrami, A. H. Assadi, J. L. Markley, H. R. Eghbalnia, Probabilistic interaction 41. P. Yuan, “Structural and biochemical studies of site-specific recombinases,” PhD dis- network of evidence algorithm and its application to complete labeling of peak lists sertation, University of Pennsylvania, Philadelphia, PA (2008). from protein NMR spectroscopy. PLOS Comput. Biol. 5, e1000307 (2009). 42. M. P. Foster et al., Chemical shift as a probe of molecular interfaces: NMR studies of 63. R. Keller, The Computer Aided Resonance Assignment Tutorial, (Cantina Verlag, DNA binding by the three amino-terminal zinc finger domains from transcription Goldau, Switzerland, 2004). factor IIIA. J. Biomol. NMR 12,51–71 (1998). 64. M. Sjodt, R. T. Clubb, Nitroxide labeling of proteins and the determination of para- 43. M. P. Williamson, Using chemical shift perturbation to characterise ligand binding. magnetic relaxation derived distance restraints for NMR studies. Bio Protoc. 7, e2207 – Prog. Nucl. Magn. Reson. Spectrosc. 73,1 16 (2013). (2017). 44. J. L. Battiste, G. Wagner, Utilization of site-directed spin labeling and high-resolution 65. J. Iwahara, C. D. Schwieters, G. M. Clore, Ensemble approach for NMR structure re- heteronuclear nuclear magnetic resonance for global fold determination of large finement against (1)H paramagnetic relaxation enhancement data arising from a proteins with limited nuclear overhauser effect data. Biochemistry 39, 5355–5365 flexible paramagnetic group attached to a macromolecule. J. Am. Chem. Soc. 126, (2000). 5879–5896 (2004). 45. G. M. Clore, J. Iwahara, Theory, practice, and applications of paramagnetic relaxation 66. P. Conway, M. D. Tyka, F. DiMaio, D. E. Konerding, D. Baker, Relaxation of backbone enhancement for the characterization of transient low-population states of biological bond geometry improves protein energy landscape modeling. Protein Sci. 23,47–55 macromolecules and their complexes. Chem. Rev. 109, 4108–4139 (2009). (2014). 46. G. M. Clore, Practical aspects of paramagnetic relaxation enhancement in biological 67. F. Khatib et al., Algorithm discovery by protein folding game players. Proc. Natl. Acad. macromolecules. Methods Enzymol. 564, 485–497 (2015). – 47. C. Altenbach, K. J. Oh, R. J. Trabanino, K. Hideg, W. L. Hubbell, Estimation of inter- Sci. U.S.A. 108, 18949 18953 (2011). residue distances in spin labeled proteins at physiological temperatures: Experimental 68. L. G. Nivón, R. Moretti, D. Baker, A Pareto-optimal refinement method for protein strategies and practical limitations. Biochemistry 40, 15471–15482 (2001). design scaffolds. PLoS One 8, e59004 (2013). 48. W. L. DeLano, “Pymol: An open-source molecular graphics tool” in CCP4 Newsletter 69. M. D. Tyka et al., Alternate states of proteins revealed by detailed energy landscape on Protein Crystallography, (2002), Vol. 40, pp. 82–92. mapping. J. Mol. Biol. 405, 607–618 (2011). 49. T. Gruene et al., Integrated analysis of the conformation of a protein-linked spin label 70. A. Leaver-Fay et al., Rosetta3: An object-oriented software suite for the simulation by crystallography, EPR and NMR spectroscopy. J. Biomol. NMR 49, 111–119 (2011). and design of macromolecules. Methods Enzymol. 487, 545– 574 (2011). 50. M. K. Janowska, J. Baum, Intermolecular paramagnetic relaxation enhancement (PRE) 71. H. Park et al., Simultaneous optimization of biomolecular energy functions on fea- studies of transient complexes in intrinsically disordered proteins. Methods Mol. Biol. tures from small molecules and macromolecules. J. Chem. Theory Comput. 12, 1345,45–53 (2016). 6201–6212 (2016).

10 of 10 | www.pnas.org/cgi/doi/10.1073/pnas.2011448117 Unnikrishnan et al. Downloaded by guest on September 24, 2021