<<

FtsK-dependent XerCD-dif recombination unlinks replication catenanes in a stepwise manner

Koya Shimokawa (a), Kai Ishihara (b), Ian Grainge (c), David J. Sherratt (d), and Mariel Vazquez (e,1)

(a) Department of Mathematics, Saitama University, Saitama 380-8570, Japan; (b) Faculty of Education, Yamaguchi University, Yamaguchi 753-8512, Japan; (c) School of Environmental and Life Sciences, University of Newcastle, Callaghan, NSW 2308, Australia; (d) Department of Biochemistry, University of Oxford, Oxford OX1 3QU, United Kingdom; and (e) Department of Mathematics, San Francisco State University, San Francisco, CA 94132

Edited by De Witt Sumners, Florida State University, Tallahassee, FL, and accepted by the Editorial Board October 17, 2013 (received for review May 10, 2013)

In Escherichia coli, complete unlinking of newly replicated sister chromosomes is required to ensure their proper segregation at cell division. Whereas replication links are removed primarily by topoisomerase IV, XerC/XerD-dif site-specific recombination can mediate sister chromosome unlinking in Topoisomerase IV-deficient cells. This reaction is activated at the division septum by the DNA translocase FtsK, which coordinates the last stages of chromosome segregation with cell division. It has been proposed that, after being activated by FtsK, XerC/XerD-dif recombination removes DNA links in a stepwise manner. Here, we provide a mathematically rigorous characterization of this topological mechanism of DNA unlinking. We show that stepwise unlinking is the only possible pathway that strictly reduces the complexity of the substrates at each step. Finally, we propose a topological mechanism for this unlinking reaction.

DNA | tangle method | Xer recombination | band surgery | topology simplification

Significance

Newly replicated circular chromosomes are topologically linked. XerC/XerD-dif (XerCD-dif)–FtsK recombination acts in the replication termination region of the Escherichia coli chromosome to remove links introduced during homologous recombination and replication, whereas Topoisomerase IV removes replication links only. Based on gel mobility patterns of the products of recombination, a stepwise unlinking pathway has been proposed. Here, we present a rigorous mathematical validation of this model, a significant advance over prior biological approaches. We show definitively that there is a unique shortest pathway of unlinking by XerCD-dif–FtsK that strictly reduces the complexity of the links at every step. We delineate the mechanism of action of the enzymes at each step along this pathway and provide a 3D interpretation of the results.

Author contributions: K.S., D.J.S., and M.V. designed research; K.S., K.I., and M.V. performed research; K.S., K.I., I.G., D.J.S., and M.V. contributed new reagents/analytic tools; K.S., K.I., I.G., D.J.S., and M.V. analyzed data; and K.S., I.G., D.J.S., and M.V. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. D.W.S. is a guest editor invited by the Editorial Board.
Freely available online through the PNAS open access option.
See Commentary on page 20854.
(1) To whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1308450110/-/DCSupplemental.

The Escherichia coli chromosome is a 4.6-Mbp circular double-stranded (ds) DNA duplex, in which the two DNA strands are wrapped around each other ∼420,000 times. During replication, DNA gyrase acts to remove the majority of these strand crossings, but those that remain result in two circular sister molecules that are nontrivially linked. This creates the topological problem of separating the two linked sister chromosomes to ensure proper segregation at the time of cell division. Unlinking of replication links in E. coli is largely achieved by Topoisomerase IV (TopoIV), a type II topoisomerase (1, 2). However, Ip et al. demonstrated that XerC/XerD-dif (XerCD-dif) site-specific recombination, coupled with action of the translocase FtsK, could resolve linked plasmid substrates in vitro and hypothesized that this system could work alongside, yet independently of, TopoIV during in vivo unlinking of replicative catenanes in the bacterial chromosome (3). Grainge et al. then demonstrated that increased site-specific recombination could indeed compensate for a loss of TopoIV activity in unlinking chromosomes in vivo (4). When the activity of TopoIV is blocked, the result is cell lethality. We here propose a mathematically rigorous analysis to describe the pathway and mechanisms of unlinking of replication links by XerCD–FtsK. This work places a fundamental biological process within a mathematical context.

Site-specific recombination is a process of breakage and reunion at two specific dsDNA duplexes (the recombination sites). When the DNA substrate consists of circular DNA molecules, the recombination sites may occur in a single DNA circle or in separate circles. Two sites are in direct repeat if they are in the same orientation on one DNA circle (Fig. 1). The relative orientation of the sites is harder to characterize when the two sites are on separate DNA circles. In the case of simple torus links with 2m crossings (also called 2m-catenanes, or 2m-cats) for an integer m > 1, the sites are said to be in parallel or antiparallel orientation with respect to each other (Fig. 1). Site-specific recombination occurs in two steps (5, 6): first, the recombination sites are brought together (synapsis); second, each site is cleaved and the DNA ends are exchanged, then rejoined.

In E. coli, XerCD-dif recombination plays an essential role in chromosome dimer resolution (reviewed in ref. 7). Furthermore, when coupled with FtsK, XerCD recombination at dif sites can unlink 2m-cats produced in vitro by λ-Integrase (3). These results suggested a potential in vivo role for XerCD–FtsK recombination, which was then hypothesized to work with TopoIV to unlink DNA links produced by DNA replication. To test this hypothesis, a pair of supercoiled linked plasmids, each with one dif site, was produced in vivo by replication in TopoIV-deficient cells, and these were then incubated in vitro with XerCD–FtsK50C (4). The ATP-dependent reaction efficiently produced unlinked circles. The ATP dependence of the reaction is likely twofold: firstly, the DNA translocase activity of FtsK relies upon ATP hydrolysis for movement, and in the absence of translocation there is no stimulation of recombination. Secondly, the energy from ATP hydrolysis is also used to align the two recombining dif sites so that subsequent recombination produces the observed stepwise reduction in complexity. In addition to right-handed (RH) torus links with parallel sites and with 2–14 crossings, unknotted dimers and a few dimeric knots were also observed. The experimental data suggested a stepwise reaction where crossings are removed one at a time, iteratively converting links into knots, knots into links, until two free circles are obtained (Fig. 1). A control experiment demonstrated that XerCD–FtsK50C recombination could convert knotted dimers (RH torus knots with two directly repeated dif sites) to free circles. Separate experiments showed that chromosome unlinking in E. coli can be accomplished in
vivo by multiple rounds of XerCD-dif or Cre-loxP site-specific recombination. The reactions required DNA translocation by FtsK. Overexpression of FtsK50C in TopoIV-deficient cells was sufficient to drive the topology simplification. Furthermore, in vivo XerCD activation by actively translocating FtsK is essential to effectively unlink replication links (4). In the absence of FtsK, an active XerCD complex may produce complicated DNA knots and links, with a small yield of unlinks (8). Whereas Xer site-specific recombination on DNA plasmids in vitro has been well-characterized at a local biochemical level, the mechanism of Xer-mediated DNA unlinking in vivo remains unclear, and is extremely technically difficult to address experimentally. Mathematical analysis using tangle calculus provided evidence that the stepwise unlinking mechanism (Fig. 1) was most consistent with the experimental data (4). However, this work relied on a number of assumptions. Here, we present an expanded mathematical analysis with a minimum number of assumptions. We characterize the topology of the recombination products and show that there is a unique shortest unlinking pathway that strictly reduces the complexity of the substrates at each step. This unique mathematical analysis significantly advances previous biological approaches.

Fig. 1. Proposed stepwise unlinking by XerCD-dif–FtsK recombination: Parallel RH 2m-cats [e.g., T(2,6)p] are converted to RH torus knots [e.g., T(2,5)] with directly repeated sites; such knots are converted to RH cats, and so on, iteratively, until the reaction stops at two open circles. (In ref. 4, RH torus links with parallel sites and up to 14 crossings, also called parallel 2m-cats and denoted by T(2,2m)p, were used as substrates of Xer recombination.)

20906–20911 | PNAS | December 24, 2013 | vol. 110 | no. 52 | www.pnas.org/cgi/doi/10.1073/pnas.1308450110

Tangle Method and Xer Recombination

The tangle method uses tangles to model changes in topology of the synaptic complex before and after site-specific recombination (9–12). For the purposes of this paper, a tangle is a ball with two strings inside. In a DNA–protein complex, the strings of a tangle can represent two dsDNA duplexes bound to the enzyme(s) (the ball) thus forming a local synapse (Fig. 2A). The tangle method relies on a few justified assumptions and on knowledge of the topology of substrate and products. One round of recombination is translated into a system of two tangle equations, each of which corresponds to the enzyme-bound DNA substrate and product. When substrate and product are in a specific family of knots and links, and when the tangles involved are assumed to be rational, possible topological mechanisms for the enzymatic action can be computed using tangle calculus (9, 10, 13). A rational tangle is one wherein a sequence of rotations of one pair of string ends relative to the other will lead to a trivial tangle, i.e., a tangle which admits a planar projection with no crossings (Fig. 2B). The strings of a rational tangle can always be depicted as winding around the ball's surface without crossing themselves, much like DNA winding around the protein surface during synapsis. The tangle method has been used extensively to model the topological mechanism of site-specific recombination enzymes, and specifically Xer recombination (4, 12, 14–16). Tangle theory and tangle calculus are introduced in Mathematical Methods.

Fig. 2. (A) XerCD–FtsK-dif complex before recombination is modeled by the sum of tangles Ob + P. There, P encloses the core regions of the recombination sites and therefore can be seen as a ball with two very short DNA strings. Site-specific recombination occurs inside P. The action of cleavage and strand exchange is modeled by converting P into a tangle R. In the case of unknotted substrates, the two DNA strings outside the enzymatic complex have been observed to be supercoiled but not intertwined. However, in the case of knotted or linked substrates one would expect the DNA outside XerCD–FtsK not to be topologically trivial. For simplicity, we push any such topological complexity into the shaded area of the tangle in the figure, which we then refer to as the outside tangle O. We define O = Of + Ob, where Of includes the DNA outside XerCD-dif-FtsK. The recombination event is represented as a system of equations N(O + P) = substrate and N(O + R) = product, and the tangle O is not changed during the recombination. (B) Orientations of the recombination sites induce orientations to the strings inside P and R. When the two sites are inside P = (0) in parallel alignment, then R = (−1) or (1). When the sites are in P = (0) in antiparallel alignment, then R = (0,0). Tangles (0), (0,0), (−1), and (1) are the trivial tangles.

In Grainge et al. (4), several systems of tangle equations were proposed for the pathway taking replication links to two open circles (the unlink). Tangle calculus was used to solve each system. For example, all possible systems of two equations converting a RH 6-cat with parallel sites into a knotted product with five or fewer crossings were considered. Using tangle calculus, only three biologically meaningful solutions were found, all of which produced the RH 5-crossing torus knot with directly repeated sites. The authors proposed that the three solutions are equivalent by 3D rigid motion (i.e., the three solutions reflect different views of the same 3D shape). This study concluded that the stepwise unlinking pathway of Fig. 1 is the most likely pathway of XerCD–FtsK recombination when acting on 2m-cats, and posited a stepwise mechanism of action.

The mathematical study in ref. 4 assumed that solutions to the tangle equations were rational, sums of rational tangles, or closely related to these, because tangle calculus is not helpful in finding solutions outside these families (9, 13). Here, we extend the methods from refs. 4 and 15 with the following two objectives: (1) completely characterize the shortest unlinking pathways by XerCD–FtsK; (2) determine rationality of the tangles involved and compute the exact mechanism of action at each step of the process. 

Colloms et al. (17) showed that, when acting on unknotted DNA circles with two psi sites in direct repeat, XerCD yielded products of unique topology: a RH 4-cat with antiparallel sites. Experimental data support a view where the sites wrap around accessory proteins ∼3 times before recombination (18–20). This reaction can be written as a system of tangle equations. Under biologically reasonable assumptions and using results from low-dimensional topology, one can show that the tangles involved are all rational (14, 15, 21). Therefore, all solutions to the XerCD–psi system of equations can be computed using tangle calculus. There are only three solutions consistent with the experimental data (15). It was further shown that these solutions can be seen as different projections of the same three-dimensional (3D) object, and a unique topological mechanism for XerCD at psi was proposed that incorporated all three solutions (15).

We first determine that any shortest pathway to unlink a 2m-cat has at least 2m steps (Theorem 1, SI Text S1 and S2, and Figs. S1 and S2). If we further assume that topological complexity
of the substrate, as measured by its crossing number, declines after each step, then the enzymes unlink in exactly 2m steps and all of the intermediates are torus knots/links (Theorem 2; SI Text S3 and Figs. S3 and S4). We then proceed to fully characterize the mechanisms of unlinking from trefoil to 2-cat, to unknot, to unlink (Propositions 1–3; SI Text S4). For other transitions we assume that the solutions are rational or sums of rational tangles and revert to using tangle calculus (Propositions 4 and 5; SI Text S4). In Discussion, we propose a mechanism that unifies all solutions (SI Text S5; Fig. S5) and consider the case where the assumption on the stepwise decline in topological complexity is removed. Support for this assumption is shown in SI Text S6 and Figs. S6 and S7.

Results

The Shortest Pathway of Unlinking of 2m-Cats by XerCD–FtsK Has Exactly 2m Steps. Here, dsDNA is modeled as the curve drawn by the axis of the double helix. XerCD–FtsK-dif is represented by a tangle: a ball (the enzymes) with two strings (the dsDNA) inside that intersect the boundary of the ball at four points. These strings may be intertwined in nontrivial ways. We partition this tangle into the sum of two tangles Ob + P. The tangle P (parental) encloses only the core regions of the recombination sites, whereas the tangle Ob encloses any geometrical information captured within the enzymatic complex before recombination (i.e., all other DNA crossings held in place by the protein complex outside the core sites). Let Of be the tangle outside XerCD–FtsK-dif. Of includes all DNA outside the complex. For simplicity, let O = Of + Ob. Cleavage and strand exchange occur inside the tangle P, converting it into a tangle R while keeping O fixed. The recombination event is then represented as a system of two tangle equations: N(O + P) = substrate and N(O + R) = product (Fig. 2A).

In this model, P and R enclose only the core regions of the dif recombination sites, each of which is 28 base pairs long. The relatively short length of the core sites, combined with the fact that XerCD are tyrosine recombinases (i.e., go through a Holliday Junction Intermediate), implies that P and R are trivial tangles. Without loss of generality we assume that the sites come together as P = (0) with parallel or antiparallel orientation, and R = (0,0) or (±1) (Fig. 2B).

Theorem 1 below considers the minimum number of recombination events needed to convert the RH 2m-cat to the unlink. Note that the proof of this theorem does not preclude the R tangle from changing from one event to the next.

Theorem 1. Assume each action of XerCD–FtsK is modeled as a system of equations N(O + P) = substrate and N(O + R) = product where P = (0) and R = (0,0), (−1), or (1). Then at least 2m recombination events are needed to convert the RH 2m-cat with parallel dif sites to the unlink.

Proof: Detailed background needed for this proof is provided in SI Text S1 and S2 and Figs. S1 and S2. The proof uses the signature of a link, a link invariant denoted by σ. If L is a RH 2m-cat with parallel dif sites, then σ(L) = −2m + 1. A result of Murasugi (22) implies that the difference of the signatures of the substrate and the product of recombination by XerCD–FtsK is at most 1 (SI Text, Lemma S1). Because the signature of the unlink is 0, we need at least (2m − 1) recombination events to go from the 2m-cat to the unlink. As each recombination changes a link into a knot and vice versa, the total number of recombination events must be even. Hence at least 2m recombination events are needed to convert the 2m-cat to the unlink.

Figs. 1 and 3 illustrate shortest pathways of unlinking starting with the 6-cat and with the 2m-cat, respectively.

Fig. 3. Unlinking pathway of RH 2m-cat: RH 2m-cat, RH (2m − 1)-torus knot, RH (2m − 2)-cat, ..., RH trefoil, 2-cat, the unknot, trivial link.

The Shortest DNA Unlinking Pathway by XerCD–FtsK Is Unique. Theorem 1 provides formal proof that to unlink a replication catenane with 2m crossings, the XerCD–FtsK complex must perform at least 2m single recombination events. The pathway proposed in ref. 4 is one such shortest pathway (Figs. 1 and 3). In Theorem 2 we ask whether there are other possible pathways, assuming a stepwise decrease in topological complexity.

We assume that each product of XerCD–FtsK recombination has a smaller number of crossings than its substrate, except perhaps for the last product of recombination (i.e., the unlink). The assumption that each recombination step reduces the topological complexity of its substrates is supported by the experimental data described below and by its quantification (SI Text S6; Figs. S6 and S7). In ref. 4, supercoiled replication catenanes produced in vivo in TopoIV-deficient cells were incubated in vitro with XerCD–FtsK50C. The reaction yielded a spectrum of knotted and linked DNA plasmids. A time-course analysis showed that, over time, the substrate catenanes were efficiently converted to unlinked circles. It was proposed that knotted dimers appeared as recombination intermediates. In a separate reaction the authors confirmed that the XerCD–FtsK complex could also efficiently unknot knotted DNA with two directly repeated dif sites. Decatenation data suggest that unlinking and unknotting occur gradually, with a steady decrease in topological complexity (4).

Theorem 2. Consider unlinking pathways for the 2m-cat. Assume that each recombination event, other than the final one, reduces the number of crossings. Then every intermediate is a RH 2k-cat, a RH (2k + 1)-torus knot, or the unknot. Furthermore, the pathway is unique:

    RH 2m-cat, RH (2m − 1)-torus knot, RH (2m − 2)-cat, ..., RH 4-cat, RH trefoil, 2-cat, unknot, unlink.

The proof of Theorem 2 is based on Theorem 3 below, which relies on properties of the Euler characteristic of a Seifert surface for a knot/link, and the signature.

Theorem 3. Let O, P, and R be tangles such that:

    N(O + P) = K1,
    N(O + R) = K2.

If P is assumed to be (0) and R is assumed to be (w,0) for some integer w, and the orientations of K1 and K2 agree outside P and R, then the following is true:
a. If K1 = RH 2k-cat T(2,2k) with parallel sites (k ≥ 1), and K2 is a knot with at most 2k − 1 crossings, then K2 is the RH torus knot T(2,2k − 1) with directly repeated sites;
b. If K1 = RH torus knot T(2,2k + 1) with directly repeated sites (k ≥ 0), and K2 is a link with at most 2k crossings, then K2 is the RH 2k-cat T(2,2k) with parallel sites.

Proof of Theorem 3 is given in SI Text S3 and Fig. S3.

Proof of Theorem 2: Theorem 3 implies that the product of XerCD–FtsK recombination on a RH 6-cat with parallel sites is the RH 5-crossing torus knot with sites in direct repeat, and that the product of recombination on the 5-noded knot is the RH 4-cat with parallel sites. Repeated application of Theorem 3 confirms that the pathway in Fig. 1 is the only shortest pathway of unlinking for the parallel RH 6-cat if we assume that recombination reduces
the topological complexity at every step but the last. This same argument can be applied to XerCD–FtsK unlinking of any parallel RH 2m-cat, thus completing the proof (Fig. 3).

The condition of Theorem 2 can be relaxed by allowing the crossing number to remain constant or to be reduced. This assumption gives rise to other more complicated pathways.

Full Characterization of the Last Three Steps of DNA Unlinking by XerCD–FtsK. By Theorem 2, each shortest unlinking pathway starting with a RH 2m-cat ends with a sequence of RH trefoil, Hopf link, unknot, and unlink (Fig. 1). Here we show that, under biologically reasonable assumptions, each of the last three recombination steps admits a very specific topological mechanism. As in ref. 15, assumptions for P and R are based on the length of the core region of the recombination sites, and restrict the number of meaningful solutions to a small finite number.

Recombination converting a RH trefoil with directly repeated sites into a 2-cat was characterized in ref. 16. Based on this, Proposition 1 shows that if a trefoil substrate is acted upon by XerCD–FtsK, then there are exactly three possible topological mechanisms of action.

Proposition 1 (adapted from ref. 16). Suppose N(O + P) = RH trefoil, N(O + R) = 2-cat. Then we show the following:
    if P = (0) parallel and R = (−1), then O = (3);
    if P = (0) antiparallel and R = (0,0), then O = (3,−1,0);
    if P = (0) parallel and R = (1), then O = (3,−2,0).

By an argument similar to that in ref. 11, Proposition 2 shows that if the substrate of recombination is the 2-cat with parallel sites and the product is the unknot, and if R is rational, then the geometry in the synapse is uniquely determined for each choice of R.

Proposition 2. Suppose N(O + P) = 2-cat, N(O + R) = unknot. Then we show the following:
    if P = (0) parallel and R = (−1), then O = (2);
    if P = (0) antiparallel and R = (0,0), then O = (2,−1,0);
    if P = (0) parallel and R = (1), then O = (2,−2,0).

Finally, Proposition 3 states that, if the substrate of recombination is the unknot with two sites in direct repeat and the product is the unlink, then the geometry in the synapse is uniquely determined for each choice of R. The proof is based on a result included in SI Text S4 and Fig. S4.

Proposition 3. Suppose N(O + P) = unknot, and N(O + R) = unlink. Then we show the following:
    if P = (0) parallel and R = (−1), then O = (1);
    if P = (0) antiparallel and R = (0,0), then O = (0,0);
    if P = (0) parallel and R = (1), then O = (−1).

In summary, the shortest unlinking pathway takes the trefoil to the 2-cat, the 2-cat to the unknot, and the unknot to the unlink (Theorems 1 and 2). There are exactly three topological mechanisms of action to account for each of these unlinking steps by XerCD–FtsK (Propositions 1–3).

Fig. 4. Last three steps of DNA unlinking by XerCD–FtsK. At the top are the last three steps in the DNA unlinking pathway. Each of the three rows below shows one of the three possible topological mechanisms by which a trefoil is converted to a 2-cat, a 2-cat to an unknot, and an unknot to an unlink. Locally, the first mechanism, where P = (0) parallel and R = (−1), is consistent with the mechanism proposed for XerCD at psi (17). The other two pathways (rows 2 and 3) can be obtained by rigid rotation of the pathway in row 1, and correspond to P = (0) antiparallel and R = (0,0), and to P = (0) parallel and R = (1), respectively. The rotation axes are m1 and m2, as in Fig. 5. These mechanisms are consistent with the three proposed for Xer at psi (15).

Unlinking of 2m-Cats and (2m−1) Torus Knots. We have proposed a shortest pathway to account for XerCD–FtsK unlinking of replication catenanes and have proved that this pathway is unique. We have fully characterized the unlinking mechanisms taking trefoil to 2-cat, 2-cat to unknot, and unknot to unlink. With the current mathematical machinery we can provide full characterization only for the last three recombination events along the pathway. In the general case where a 2k-cat is converted into a (2k − 1)-torus knot (Proposition 4) and a (2k − 1)-torus knot is converted into a (2k − 2)-cat (Proposition 5), we can propose topological mechanisms by assuming that O is equivalent to a rational tangle or sum of rational tangles. In this way, we use tangle calculus to propose the most likely mechanisms; however, there is no guarantee that these are the only ones. The proofs for the following two propositions are analogous, and are given in SI Text S4.

Proposition 4. Suppose N(O + P) = RH 2k-cat, N(O + R) = RH (2k − 1)-torus knot. Suppose that O is equivalent to rational or the sum of rational tangles. Then we show the following:
    if P = (0) and R = (−1), then O = (2k);
    if P = (0) and R = (0,0), then O = (2k,−1,0);
    if P = (0) and R = (1), then O = (2k,−2,0).

Proposition 5. Suppose N(O + P) = RH (2k − 1)-torus knot, N(O + R) = RH (2k − 2)-cat. Suppose that O is equivalent to rational or sum of rational tangles. Then we show the following:
    if P = (0) and R = (−1), then O = (2k − 1);
    if P = (0) and R = (0,0), then O = (2k − 1,−1,0);
    if P = (0) and R = (1), then O = (2k − 1,−2,0).

The results of Propositions 4 and 5 are consistent with those of Propositions 1–3. In the general case, the shortest unlinking pathway takes a RH 2k-cat with parallel sites into a RH (2k − 1)-torus knot with sites in direct repeats, and then takes that knot into a RH (2k − 2)-cat with parallel sites (Theorems 1 and 2). Using Propositions 4 and 5, each of the recombination events above admits three possible topological mechanisms of action.

Discussion

Unification of the Individual Stepwise Unlinking Pathways. We have shown that for each recombination step in the unlinking pathway there are three corresponding topological mechanisms of action for the enzyme. These mechanisms can be interpreted as three views of the same 3D object (Figs. 4 and 5). The mechanisms computed for each recombination step can be clustered together to propose three stepwise unlinking mechanisms (Fig. 4). Notice that each row in Fig. 4 illustrates the topological mechanism for the last three steps of XerCD–FtsK DNA unlinking. In refs. 15 and 17 it was proposed that the topological mechanism with P = (0) with sites in parallel alignment and R = (−1) was the most
Shimokawa et al. PNAS | December 24, 2013 | vol. 110 | no. 52 | 20909

consistent with the experimental data. The mechanism proposed in ref. 17 corresponds to the pathway in Fig. 4, row 1 (and is revisited in SI Text S5 and Fig. S5). The other two pathways (rows 2 and 3) are obtained by rigid rotation of the pathway in row 1, and correspond to P = (0) antiparallel and R = (0,0), and to P = (0) parallel and R = (1), respectively.

Tangle Analysis of Iterative Recombination. Because the recombination event involves two separate pairs of strand exchanges and proceeds through a Holliday Junction (HJ) intermediate, there must be a step resetting the synapse between recombination events to maintain the same recombination mechanism at each event (i.e., if XerD is always to exchange the first pair of strands to form the HJ, which is resolved by XerC, as shown in ref. 8). There is little evidence of the ability of XerCD and other tyrosine recombinases to act processively (i.e., bind and recombine a single pair of recombination sites multiple times before dissociating). However, in refs. 3, 4, and 24 the authors report an action of XerCD recombination consistent with iterative cleavage and strand exchange. If the recombination reaction is assumed to be iterative, and is modeled as processive (9, 25), then it can be proven that the tangle O is rational, and therefore all solutions are computable.

3D Interpretation of the Tangle Solutions. In the tangle method, P is defined as a very small ball containing the cleavage core regions of the recombination sites. In this case, the orientation of the recombination sites is inherited into the tangle P. Two sites in P = (0) are in parallel alignment if both arrows point in the same direction in the tangle diagram; otherwise they are in antiparallel alignment. However, the concept of parallel and antiparallel alignment represents a local geometric property of the sites and is well-defined only in the tangle diagram (a planar projection of the 3D tangle). For the same 3D tangle, one can always obtain a planar projection with parallel sites and another with antiparallel sites. The only exception to this statement is when two sites are coplanar, in a strict mathematical sense. In ref. 15 the authors proposed a unique 3D topological mechanism for XerCD at psi that incorporated all three solutions to the tangle equations. Given a solution S where O is rational, P = (0), and R = (1) or R = (−1), one can obtain a solution S′ equivalent to S by rotating the synapse P = (0) to P = (0,0). In the case of XerCD at psi, performing this simple transformation on the two solutions for P parallel yields the unique solution for P antiparallel. Similar arguments have been proposed for the recombinase TnpI-IRS (11) and for XerCD–FtsK-dif (4). Here, we extend these arguments and apply them to each recombination step in the XerCD–FtsK-dif unlinking pathway (Figs. 4 and 5). The observed geometrical equivalence, obtained by rigid motion, between the three solutions for one of the recombination steps suggests that a unique 3D representation of the synaptic complex and topological mechanism of the enzymes can be interpreted as three different tangle solutions when viewed from different spatial directions. Computer visualization methods, combined with molecular modeling software such as PyMol and with X-ray crystallographic data, can be applied to realize these spatial equivalences and to propose a 3D molecular model for the enzymatic action, as was done in ref. 15. An animation generated with Knotplot (www.knotplot.com) is presented in Movie S1 to illustrate the transition from the 6-cat to the 5-torus knot. These studies indicate a limitation of the tangle model and suggest the need to consider equivalence classes of planar tangle diagrams related by 3D rigid motion (rotations and translations).

Fig. 5. Local site alignment. Due to a high level of sequence homology, we here assume that the XerCD–DNA synapse can be modeled using the data for Cre–DNA. This diagram is based on the crystal structure of Cre–DNA from ref. 23, in which the recombinase core complex is slightly off-planar, so that the cores appear either parallel or antiparallel in different projections. To improve clarity, the angles are doubled in this diagram. Let m2 be the horizontal axis (dashed line), and let m1 be the vertical axis (solid line). When looking down m2, the two sites appear in antiparallel alignment, whereas when seen from the reader's perspective, the sites are in parallel. A rigid left-hand rotation of the DNA conformation around the axis m1 will result in string IV crossing over string I (15). Then the sites appear in antiparallel alignment and one crossing is added to the domain. Further left-hand rotation around the m2 axis results in III crossing over II and the sites returning to a parallel alignment.

Conclusions

We have presented a thorough mathematical analysis of the process of DNA unlinking by XerCD–FtsK. Based on the experimental data of ref. 4, we first prove that the shortest pathway of unlinking a substrate consisting of 2m-cats with sites in parallel orientation has exactly 2m steps. We then show that if we assume that recombination reduces the complexity of the substrate at each step, as suggested by the experimental data, then the shortest pathway is unique and every recombination intermediate is a torus knot or link. We analyze each recombination step and find three topological mechanisms at each step. The mechanisms can be unified in two different ways: (1) At each recombination step, the obtained mechanisms can be interpreted as different projections of the same 3D object, as illustrated in Figs. 4 and 5 and in the animation in Movie S1; or (2) The obtained mechanisms can be sorted by the local action of the enzyme (specified by P, R, and the relative orientation of the sites). This unification produces three separate mechanisms of stepwise unlinking consistent with the unique, shortest pathway previously reported.

Mathematical Methods

Knot theory is the study of non–self-intersecting curves in 3D space, and of the spaces defined by these curves. A knot is defined as a non–self-intersecting circular path in space. A link (or catenane) is composed of two or more such curves, which can be intertwined. If the path can be laid flat without any over- or undercrossings, then K is the unknot or the trivial knot.

Fig. 6. (A) Two rational tangles and their Conway vectors. (B) Tangle addition of rational tangles A = (−5,0) and B = (−1). A + B is equivalent to the tangles (−5,−1) and (6,−1,0). (C) N(A + B) is a 6-crossing RH torus link (RH 6-cat).
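The equivalence stated in the caption of Fig. 6B can be checked numerically through the correspondence between rational tangles and extended rational numbers. The following is a minimal sketch, not the authors' software. It assumes the continued-fraction convention (a1, ..., an) ↦ an + 1/(a(n−1) + ... + 1/a1), chosen here because it reproduces the equivalence given in Fig. 6, and it uses the standard fact that adding an integral tangle (k) adds k to the fraction. The function names conway_fraction and add_integral_tangle are hypothetical.

```python
def conway_fraction(vector):
    """Return (p, q), the extended rational p/q of a Conway vector.

    Assumed convention (an assumption of this sketch):
        (a1, ..., an) -> an + 1/(a(n-1) + ... + 1/a1)
    Infinity is represented as (1, 0).
    """
    if len(vector) == 0:
        raise ValueError("a Conway vector needs at least one entry")
    p, q = vector[0], 1
    for a in vector[1:]:
        # a + 1/(p/q) = (a*p + q)/p; p and q stay coprime because
        # gcd(a*p + q, p) = gcd(q, p).
        p, q = a * p + q, p
    if q < 0:          # keep the denominator non-negative
        p, q = -p, -q
    if q == 0:         # normalize infinity to 1/0
        p = 1
    return p, q


def add_integral_tangle(fraction, k):
    """Fraction of A + (k), where (k) is an integral tangle with k twists."""
    p, q = fraction
    return p + k * q, q


# Fig. 6B: A = (-5, 0), B = (-1); A + B should match (-5, -1) and (6, -1, 0).
a_plus_b = add_integral_tangle(conway_fraction((-5, 0)), -1)
print(a_plus_b)
print(a_plus_b == conway_fraction((-5, -1)) == conway_fraction((6, -1, 0)))
```

Two Conway vectors that yield the same pair (p, q) represent the same rational tangle under this convention, which is why a single tangle can admit several vector representations.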

20910 | www.pnas.org/cgi/doi/10.1073/pnas.1308450110 | Shimokawa et al.

Likewise, the unlink, or trivial link, is the disjoint union of two unknotted circles. Knots and links can be studied through their diagrams, i.e., planar projections that distinguish between under- and overcrossings. Two knots or links K1 and K2 are equivalent if K1 can be smoothly deformed into K2 without breaking the chain. We indistinguishably denote the knot/link, or its equivalence class, by K. Knots and links are classified according to their crossing numbers. The crossing number of K is the minimum number of crossings taken over all projections of all elements in the equivalence class K. Although finer classifications are attempted, it is generally difficult to determine when two knots/links are topologically identical (26).

A tangle is a pair consisting of a ball and two strings in it. Fig. 2B illustrates the four simplest tangles, called trivial tangles. Rational tangles are obtained by intertwining the two strings in a trivial tangle. All tangles in Fig. 6 are rational. Rational tangles most often appear in the mathematical analysis of site-specific recombination. There is a one-to-one correspondence between the set of rational tangles and the set of extended rational numbers (i.e., the union of the set of rational numbers and the infinity 1/0) (27). A rational tangle can be expressed using a vector representation, called the Conway vector. Conway vectors are illustrated in Fig. 6. Notice that the same rational tangle can admit different vector representations. The tangle method uses two operations: tangle addition A + B; and numerator N(A) (Fig. 6C). A + B is a tangle, whereas N(A) is a knot or a link. If A is rational then N(A) belongs to the well-studied family of 4-plat knots or links (28).

The tangle method uses biologically reasonable assumptions to model a site-specific recombination event as a system of two or more tangle equations on three unknowns O, P, and R. If P = (0) and R = (w) or (w,0) for some integer w, then the system can be solved for O rational or a sum of rational tangles (9, 13). The solution set can be very large. To obtain the most biologically relevant solutions we often resort to assumptions where P = (0) and R = (k) or (k,0). These assumptions are backed by biological data as detailed in ref. 15. Computing the solutions is not mathematically challenging but can be very tedious. Mathematical software is available to make the method more accessible to the scientific community (29, 30). TangleSolve (http://ewok.sfsu.edu/TangleSolve) is a stand-alone Java program and web-based applet with a user-friendly interface for analyzing and visualizing site-specific recombination mechanisms using the tangle method (29).

Using arguments from low-dimensional topology, O, P, and R can sometimes be shown to be equivalent to rational or sums of two rational tangles (9, 11, 13–16, 25). In these cases one concludes that, under the assumptions of the tangle method, the solutions computed are the only possible solutions to a given system of equations. Biologically this implies that the tangle method computes all possible topological mechanisms of action for the enzyme. In some cases reasonable assumptions can be made about P and R to limit the number of solutions to a small finite number (14, 15). Mathematical analyses are often useful to characterize topological mechanisms of recombination (4, 10, 11, 13–16, 25, 31–34). A preliminary tangle analysis of Xer unlinking is given in ref. 4.

ACKNOWLEDGMENTS. The authors thank Rob Scharein for help with some of the figures and Movie S1, which were generated using Knotplot (www.knotplot.com), and Barbara Ustanko for editorial assistance with this manuscript. The authors also thank the referees for their careful review of the manuscript. M.V., K.S., and K.I. thank the Institute of Mathematics and Its Applications for its hospitality and support. This research was supported by the following: Japan Society for the Promotion of Science KAKENHI 22540066 and 25400080 (to K.S.); National Science Foundation DMS0920887 and CAREER Grant DMS1057284 (to M.V.); UK Engineering and Physical Sciences Research Council EP/H031367 (to K.I.); Australian Research Council FT120100153 and NHMRC APP1005697 (to I.G.); and Wellcome Trust WT099204AIA (to D.J.S.).

1. Espeli O, Levine C, Hassing H, Marians KJ (2003) Temporal regulation of topoisomerase IV activity in E. coli. Mol Cell 11(1):189–201.
2. Schvartzman JB, Stasiak A (2004) A topological view of the replicon. EMBO Rep 5(3):256–261.
3. Ip SC, Bregu M, Barre FX, Sherratt DJ (2003) Decatenation of DNA circles by FtsK-dependent Xer site-specific recombination. EMBO J 22(23):6399–6407.
4. Grainge I, et al. (2007) Unlinking chromosome catenanes in vivo by site-specific recombination. EMBO J 26(19):4228–4238.
5. Sadowski PD (1993) Site-specific genetic recombination: Hops, flips, and flops. FASEB J 7(9):760–767.
6. Hallet B, Sherratt DJ (1997) Transposition and site-specific recombination: Adapting DNA cut-and-paste mechanisms to a variety of genetic rearrangements. FEMS Microbiol Rev 21(2):157–178.
7. Barre FX, Sherratt DJ (2005) Chromosome dimer resolution. The Bacterial Chromosome, ed Higgins PN (ASM, Washington), pp 513–524.
8. Grainge I, Lesterlin C, Sherratt DJ (2011) Activation of XerCD-dif recombination by the FtsK DNA translocase. Nucleic Acids Res 39(12):5140–5148.
9. Ernst C, Sumners DW (1990) A calculus for rational tangles: Applications to DNA recombination. Math Proc Camb Philos Soc 108(3):489–515.
10. Sumners DW, Ernst C, Spengler SJ, Cozzarelli NR (1995) Analysis of the mechanism of DNA recombination using tangles. Q Rev Biophys 28(3):253–313.
11. Zheng W, Galloy C, Hallet B, Vazquez M (2007) The tangle model for site-specific recombination: A computer interface and the TnpI-IRS recombination system. Knot Theory for Specific Objects, OCAMI Studies, ed Kawauchi A (Osaka Municipal Universities Press, Sakai, Japan), Vol 1, pp 251–271.
12. Arsuaga J, Diao Y, Vazquez M (2007) DNA topology in recombination and chromosome organization. Mathematics of DNA Structure, Function and Interactions, The IMA Volumes in Mathematics and its Applications 150, eds CJ Benham et al. (Springer Science + Business Media, New York), pp 7–36.
13. Ernst C, Sumners DW (1999) Solving tangle equations arising in a DNA recombination model. Math Proc Camb Philos Soc 126(1):23–36.
14. Darcy I (2001) Biological distances on DNA knots and links: applications to Xer recombination. J. Knot Theory Ramifications 10(2):269–294.
15. Vazquez M, Colloms SD, Sumners DW (2005) Tangle analysis of Xer recombination reveals only three solutions, all consistent with a single three-dimensional topological pathway. J Mol Biol 346(2):493–504.
16. Darcy IK, Ishihara K, Medikonduri RK, Shimokawa K (2012) Rational tangle surgery and Xer recombination on catenanes. Algebr. Geom. Topol. 12:1183–1210.
17. Colloms SD, Bath J, Sherratt DJ (1997) Topological selectivity in Xer site-specific recombination. Cell 88(6):855–864.
18. Alén C, Sherratt DJ, Colloms SD (1997) Direct interaction of aminopeptidase A with recombination site DNA in Xer site-specific recombination. EMBO J 16(17):5188–5197.
19. Colloms SD, Alén C, Sherratt DJ (1998) The ArcA/ArcB two-component regulatory system of Escherichia coli is essential for Xer site-specific recombination at psi. Mol Microbiol 28(3):521–530.
20. Colloms SD (2013) The topology of plasmid-monomerizing Xer site-specific recombination. Biochem Soc Trans 41(2):589–594.
21. Hirasawa M, Shimokawa K (2000) Dehn surgeries on strongly invertible knots which yield lens spaces. Proc Am Math Soc 128(11):3445–3451.
22. Murasugi K (1965) On a certain numerical invariant of link types. Trans Am Math Soc 117:387–422.
23. Van Duyne GD (2001) A structural view of Cre-loxP site-specific recombination. Annu Rev Biophys Biomol Struct 30:87–104.
24. Gourlay SC, Colloms SD (2004) Control of Cre recombination by regulatory elements from Xer recombination systems. Mol Microbiol 52(1):53–65.
25. Vazquez M, Sumners DW (2004) Tangle analysis of Gin recombination. Math Proc Camb Philos Soc 136(3):565–582.
26. Murasugi K (2008) Knot Theory and Its Applications (Birkhäuser, Boston).
27. Conway JH (1970) An enumeration of knots and links, and some of their algebraic properties. Computational Problems in Abstract Algebra, ed Leech J (Pergamon, Oxford), pp 329–358.
28. Burde G, Zieschang H (2003) Knots (de Gruyter, Berlin).
29. Saka Y, Vázquez M (2002) TangleSolve: Topological analysis of site-specific recombination. Bioinformatics 18(7):1011–1012.
30. Darcy IK, Scharein RG (2006) TopoICE-R: 3D visualization modeling the topology of DNA recombination. Bioinformatics 22(14):1790–1791.
31. Buck D, Flapan E (2007) Predicting knot or catenane type of site-specific recombination products. J Mol Biol 374(5):1186–1199.
32. Darcy I, Luecke J, Vazquez M (2009) Tangle analysis of difference topology experiments: Applications to a Mu protein-DNA complex. Algebr. Geom. Topol. 9:2247–2309.
33. Valencia K, Buck D (2011) Predicting knot and catenane type of products of site-specific recombination on substrates. J Mol Biol 411(2):350–367.
34. Darcy IK, Vazquez M (2013) Determining the topology of stable protein-DNA complexes. Biochem Soc Trans 41(2):601–605.
