
FtsK-dependent XerCD-dif recombination unlinks replication catenanes in a stepwise manner

Koya Shimokawa^a, Kai Ishihara^b, Ian Grainge^c, David J. Sherratt^d, and Mariel Vazquez^e,1

^aDepartment of Mathematics, Saitama University, Saitama 380-8570, Japan; ^bFaculty of Education, Yamaguchi University, Yamaguchi 753-8512, Japan; ^cSchool of Environmental and Life Sciences, University of Newcastle, Callaghan, NSW 2308, Australia; ^dDepartment of Biochemistry, University of Oxford, Oxford OX1 3QU, United Kingdom; and ^eDepartment of Mathematics, San Francisco State University, San Francisco, CA 94132

Edited by De Witt Sumners, Florida State University, Tallahassee, FL, and accepted by the Editorial Board October 17, 2013 (received for review May 10, 2013) In Escherichia coli, complete unlinking of newly replicated sister In E. coli, XerCD-dif recombination plays an essential role in chromosomes is required to ensure their proper segregation cell chromosome dimer resolution (reviewed in ref. 7). Furthermore, division. Whereas replication links are removed primarily by top- when coupled with FtsK, XerCD recombination at dif sites can oisomerase IV, XerC/XerD-dif site-specific recombination can me- 2m-cats produced in vitro by λ-Integrase (3). These results diate sister chromosome unlinking in Topoisomerase IV-deficient suggested a potential in vivo role for XerCD–FtsK recombination, cells. This reaction is activated at the division septum by the DNA which was then hypothesized to work with TopoIV to unlink translocase FtsK, which coordinates the last stages of chromosome DNA links produced by DNA replication. To this hypothe- segregation with cell division. It has been proposed that, after being sis, a pair of supercoiled linked plasmids, each with one dif site, activated by FtsK, XerC/XerD-dif recombination removes DNA links was produced in vivo by replication in TopoIV-deficient cells, in a stepwise manner. Here, we provide a mathematically rigorous and these were then incubated in vitro with XerCD–FtsK50C (4). characterization of this topological mechanism of DNA unlinking. The ATP-dependent reaction efficiently produced unlinked cir- We show that stepwise unlinking is the only possible pathway that cles. The ATP dependence of the reaction is likely twofold: firstly strictly reduces the complexity of the substrates at each step. Finally, the DNA translocase activity of FtsK relies upon ATP hydrolysis we propose a topological mechanism for this unlinking reaction. for movement, and in the absence of translocation there is no

stimulation of recombination. Secondly, the energy from ATP APPLIED

DNA topology | tangle method | Xer recombination | band surgery | hydrolysis is also used to align the two recombining dif sites so MATHEMATICS topology simplification that subsequent recombination produces the observed stepwise reduction in complexity. In addition to right-handed (RH) he Escherichia coli chromosome is a 4.6-Mbp circular double- links with parallel sites and with 2–14 crossings, unknotted Tstranded (ds) DNA duplex, in which the two DNA strands dimers and a few dimeric were also observed. The exper- are wrapped around each other ∼420,000 times. During repli- imental data suggested a stepwise reaction where crossings are cation, DNA gyrase acts to remove the majority of these strand removed one at a , iteratively converting links into knots, crossings, but those that remain result in two circular sister into links, until two free circles are obtained (Fig. 1). A control – BIOCHEMISTRY molecules that are nontrivially linked. This creates the topolog- experiment demonstrated that XerCD FtsK50C recombination ical problem of separating the two linked sister chromosomes to could convert knotted dimers (RH torus knots with two directly ensure proper segregation at the time of cell division. Unlinking repeated dif sites) to free circles. Separate experiments showed of replication links in E. coli is largely achieved by Topoisomerase that chromosome unlinking in E. coli can be accomplished in fi IV (TopoIV), a II topoisomerase (1, 2). However, Ip et al. vivo by multiple rounds of XerCD-dif or Cre-loxP site-speci c demonstrated that XerC/XerD-dif (XerCD-dif) site-specific re- combination, coupled with action of the translocase FtsK, could Significance resolve linked plasmid substrates in vitro and hypothesized that this system could work alongside, yet independently of, TopoIV Newly replicated circular chromosomes are topologically linked. 
during in vivo unlinking of replicative catenanes in the bacterial XerC/XerD-dif (XerCD-dif)–FtsK recombination acts in the repli- chromosome (3). Grainge et al. then demonstrated that in- cation termination region of the Escherichia coli chromosome to creased site-specific recombination could indeed compensate for remove links introduced during homologous recombination and a loss of TopoIV activity in unlinking chromosomes in vivo (4). replication, whereas Topoisomerase IV removes replication links When the activity of TopoIV is blocked, the result is cell le- only. Based on gel mobility patterns of the products of recombi- thality. We here propose a mathematically rigorous analysis to nation, a stepwise unlinking pathway has been proposed. Here, describe the pathway and mechanisms of unlinking of replication we present a rigorous mathematical validation of this model, a links by XerCD–FtsK. This work places a fundamental biological significant advance over prior biological approaches. We show process within a mathematical context. definitively that there is a unique shortest pathway of unlinking Site-specific recombination is a process of breakage and re- by XerCD-dif–FtsK that strictly reduces the complexity of the union at two specific dsDNA duplexes (the recombination sites). links at every step. We delineate the mechanism of action of When the DNA substrate consists of circular DNA molecules, the enzymes at each step along this pathway and provide a 3D the recombination sites may occur in a single DNA circle or in interpretation of the results. separate circles. Two sites are in direct repeat if they are in the Author contributions: K.S., D.J.S., and M.V. designed research; K.S., K.I., and M.V. per- same orientation on one DNA circle (Fig. 1). The relative ori- formed research; K.S., K.I., I.G., D.J.S., and M.V. contributed new reagents/analytic tools; entation of the sites is harder to characterize when the two sites K.S., K.I., I.G., D.J.S., and M.V. 
analyzed data; and K.S., I.G., D.J.S., and M.V. wrote the are on separate DNA circles. In the case of simple torus links paper. with 2m crossings (also called 2m-catenanes, or 2m-cats) for an The authors declare no conflict of interest. integer m > 1, the sites are said to be in parallel or antiparallel This article is a PNAS Direct Submission. D.W.S. is a guest editor invited by the Editorial Board. orientation with respect to each other (Fig. 1). Site-specific re- Freely available online through the PNAS open access option. combination occurs in two steps (5, 6): first, the recombination 1To whom correspondence should be addressed. E-mail: [email protected]. sites are brought together (synapsis); second, each site is cleaved This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. and the DNA ends are exchanged, then rejoined. 1073/pnas.1308450110/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1308450110 PNAS Early Edition | 1of6 Downloaded by guest on September 28, 2021 dimensional topology, one can show that the tangles involved are all rational (14, 15, 21). Therefore, all solutions to the XerCD– psi system of equations can be computed using tangle calculus. There are only three solutions consistent with the experimental data (15). It was further shown that these solutions can be seen as different projections of the same three-dimensional (3D) ob- ject, and a unique topological mechanism for XerCD at psi was Fig. 1. Proposed stepwise unlinking by XerCD-dif–FtsK recombination: Parallel proposed that incorporated all three solutions (15). RH 2m-cats [e.g., T(2,6)p] are converted to RH torus knots [e.g., T(2,5)] with In Grainge et al. (4), several systems of tangle equations were directly repeated sites; such knots are converted to RH cats, and so on, proposed for the pathway taking replication links to two open iteratively, until the reaction stops at two open circles. (In ref. 4, RH torus circles (the unlink). Tangle calculus was used to solve each sys- links with parallel sites and up to 14 crossings, also called parallel 2m-cats tem. For example, all possible systems of two equations con- and denoted by T(2,2m)p, were used as substrates of Xer recombination.) verting a RH 6- with parallel sites into a knotted product with five or fewer crossings were considered. Using tangle calculus, recombination. The reactions required DNA translocation by only three biologically meaningful solutions were found, all of which produced the RH 5-crossing torus with directly re- FtsK. Overexpression of FtsK50C in TopoIV-deficient cells was sufficient to drive the topology simplification. Furthermore, in peated sites. 
The authors proposed that the three solutions are fl vivo XerCD activation by actively translocating FtsK is essential equivalent by 3D rigid motion (i.e., the three solutions re ect to effectively unlink replication links (4). In the absence of FtsK, different views of the same 3D shape). This study concluded that the stepwise unlinking pathway of Fig. 1 is the most likely pathway an active XerCD complex may produce complicated DNA knots – and links, with a small yield of unlinks (8). Whereas Xer site-specific of XerCD FtsK recombination when acting on 2m-cats, and recombination on DNA plasmids in vitro has been well-characterized posited a stepwise mechanism of action. at a local biochemical level, the mechanism of Xer-mediated DNA The mathematical study in ref. 4 assumed that solutions to the tangle equations were rational, sums of rational tangles, or closely unlinking in vivo remains unclear, and is extremely technically fi difficult to address experimentally. Mathematical analysis using related to these, because tangle calculus is not helpful in nding tangle calculus provided evidence that the stepwise unlinking solutions outside these families (9, 13). Here, we extend the mechanism (Fig. 1) was most consistent with the experimental methods from refs. 4 and 15 with the following two objectives: 1. Completely characterize the shortest unlinking pathways by data (4). However, this work relied on a number of assumptions. XerCD–FtsK; 2. Determine rationality of the tangles involved Here, we present an expanded mathematical analysis with a min- and compute the exact mechanism of action at each step of the imum number of assumptions. We characterize the topology of the process. We first determine that any shortest pathway to unlink a recombination products and show that there is a unique shortest 2m-cat has at least 2m steps (Theorem 1, SI Text S1 and S2, and unlinking pathway that strictly reduces the complexity of the sub- Figs. S1 and S2). 
If we further assume that topological complexity strates at each step. This unique mathematical analysis significantly of the substrate, as measured by its crossing number, declines after advances previous biological approaches. Tangle Method and Xer Recombination The tangle method uses tangles to model changes in topology of A the synaptic complex before and after site-specific recombination (9–12). For the purposes of this paper, a tangle is a ball with two inside. In a DNA–protein complex, the strings of a tangle can represent two dsDNA duplexes bound to the enzyme(s) (the ball) thus forming a local synapse (Fig. 2A). The tangle method relies on a few justified assumptions and on knowledge of the topology of substrate and products. One round of recombination is translated into a system of two tangle equations, each of which corresponds to the enzyme-bound DNA substrate and product. B When substrate and product are in a specific family of knots and links, and when the tangles involved are assumed to be rational, possible topological mechanisms for the enzymatic action can be computed using tangle calculus (9, 10, 13). A rational tangle is one wherein a sequence of rotations of one pair of string ends relative to the other will lead to a trivial tangle, i.e., a tangle Fig. 2. (A)XerCD–FtsK-dif complex before recombination is modeled by + which admits a planar projection with no crossings (Fig. 2B). The the of tangles Ob P. There, P encloses the core regions of the re- strings of a rational tangle can always be depicted as winding combination sites and therefore can be seen as a ball with two very short fi around the ball’s surface without crossing themselves, much like DNA strings. Site-speci c recombination occurs inside P. The action of cleavage and strand exchange is modeled by converting P into a tangle . In the case DNA winding around the protein surface during synapsis. 
The of unknotted substrates, the two DNA strings outside the enzymatic com- tangle method has been used extensively to model the topolog- plex have been observed to be supercoiled but not intertwined. However, in ical mechanism of site-specific recombination enzymes, and the case of knotted or linked substrates one would expect the DNA outside specifically Xer recombination (4, 12, 14–16). Tangle theory and XerCD–FtsK not to be topologically trivial. For simplicity, we push any such tangle calculus are introduced in Mathematical Methods. topological complexity into the shaded area of the tangle in the figure, fi = + Colloms et al. (17) showed that, when acting on unknotted which we then refer to as the outside tangle O.Wede ne O Of Ob,where DNA circles with two psi sites in direct repeat, XerCD yielded Of includes the DNA outside XerCD-dif-FtsK. The recombination event is rep- + = + = products of unique topology: a RH 4-cat with antiparallel sites. resented as a system of equations N(O P) substrate and N(O R) product, and the tangle O is not changed during the recombination. (B) Experimental data support a view where the sites wrap around ∼ – Orientations of the recombination sites induce orientations to the strings accessory proteins 3 times before recombination (18 20). This inside P and R. When the two sites are inside P = (0) in parallel alignment, reaction can be written as a system of tangle equations. Under then R = (−1)or(1). When the sites are in P = (0) in antiparallel alignment, biologically reasonable assumptions and using results from low- then R = (0,0). Tangles (0), (0,0), (−1), and (1) are the trivial tangles.

2of6 | www.pnas.org/cgi/doi/10.1073/pnas.1308450110 Shimokawa et al. Downloaded by guest on September 28, 2021 each step, then the enzymes unlink in exactly 2m steps and all of the intermediates are torus knots/links (Theorem 2; SI Text S3 . . . . and Figs. S3 and S4). We then proceed to fully characterize the ...... mechanisms of unlinking from trefoil to 2-cat, to , to unlink RH 2m-cat RH (2m-1)- RH 2-cat trivial knot trivial (Propositions 1–3; SI Text S4). For other transitions we assume that the solutions are rational or sums of rational tangles and revert Fig. 3. Unlinking pathway of RH 2m-cat: RH 2m-cat, RH (2m − 1)-torus knot, to using tangle calculus (Propositions 4 and 5; SI Text S4). In RH (2m − 2)-cat,..., RH trefoil, 2-cat, the unknot, trivial link. Discussion, we propose a mechanism that unifies all solutions (SI Text S5; Fig. S5) and consider the case where the assumption on the stepwise decline in topological complexity is removed. Support for in ref. 4 is one such shortest pathway (Figs. 1 and 3). In Theorem this assumption is shown in SI Text S6 and Figs. S6 and S7. 2 we ask whether there are other possible pathways, assuming a stepwise decrease in topological complexity. Results We assume that each product of XerCD–FtsK recombination The Shortest Pathway of Unlinking of 2m-Cats by XerCD–FtsK Has has a smaller number of crossings than its substrate, except per- Exactly 2m Steps. Here, dsDNA is modeled as the curve drawn haps for the last product of recombination (i.e., the unlink). The by the axis of the double helix. XerCD–FtsK-dif is represented by assumption that each recombination step reduces the topological complexity of its substrates is supported by the experimental data a tangle: a ball (the enzymes) with two strings (the dsDNA) in- fi side that intersect the boundary of the ball at four points. These described below and by its quanti cation (SI Text S6; Figs. S6 strings may be intertwined in nontrivial ways. We partition this and S7). 
In ref. 4, supercoiled replication catenanes produced + in vivo in TopoIV-deficient cells were incubated in vitro with tangle into the sum of two tangles Ob P. The tangle P (pa- – rental) encloses only the core regions of the recombination sites, XerCD FtsK50C. The reaction yielded a spectrum of knotted and linked DNA plasmids. A time-course analysis showed that, over whereas the tangle Ob encloses any geometrical information fi captured within the enzymatic complex before recombination time, the substrate catenanes were ef ciently converted to un- (i.e., all other DNA crossings held in place by the protein com- linked circles. It was proposed that knotted dimers appeared as plex outside the core sites). Let O be the tangle outside XerCD– recombination intermediates. In a separate reaction the authors f confirmed that the XerCD–FtsK complex could also efficiently FtsK-dif. Of includes all DNA outside the complex. For sim-

unknot knotted DNA with two directly repeated dif sites. Dec- APPLIED plicity, let O = Of + Ob. Cleavage and strand exchange occur inside the tangle P, converting it into a tangle R while keeping O atenation data suggest that unlinking and unknotting occur grad- MATHEMATICS fixed. The recombination event is then represented as a system of ually, with a steady decrease in topological complexity (4). two tangle equations: N(O + P) = substrate and N(O + R) = Theorem 2. Consider unlinking pathways for the 2m-cat. Assume product (Fig. 2A). that each recombination event, other than the final one, reduces In this model, P and R enclose only the core regions of the dif the number of crossings. Then every intermediate is a RH 2k-cat, recombination sites, each of which is 28 base pairs long. The aRH(2k + 1)-torus knot, or the unknot. Furthermore, the pathway relatively short length of the core sites, combined with the fact is unique: that XerCD are tyrosine recombinases (i.e., go through a Holli- BIOCHEMISTRY day Junction Intermediate), implies that P and R are trivial RH 2m-cat; RHð2m − 1Þ-torus knot; RHð2m − 2Þ-cat; ...; tangles. Without loss of generality we assume that the sites come ; ; ; ; : together as P = (0) with parallel or antiparallel orientation, and RH 4-cat RH trefoil 2-cat unknot unlink R = (0,0)or(±1) (Fig. 2B). Theorem 1 below considers the minimum number of re- The proof of Theorem 2 is based on Theorem 3 below, which combination events needed to convert the RH 2m-cat to the relies on properties of the Euler characteristic of a Seifert sur- unlink. Note that the proof of this theorem does not preclude the face for a knot/link, and the signature. R tangle from changing from one event to the next. Theorem 3. Let O, P, and R be tangles such that: Theorem 1. Assume each action of XerCD–FtsK is modeled as a ð + Þ = ; system of equations N(O + P) = substrate and N(O + R) = product N O P K1 where P = (0) and R =(0,0), (−1),or(1). 
Then at least 2m recom- bination events are needed to convert the RH 2m-cat with parallel dif NðO + RÞ = K2: sites to the unlink. Proof: Detailed background needed for this proof is provided in If P is assumed to be (0) and R is assumed to be (w,0) for some SI Text S1 and S2 and Figs. S1 and S2. The proof uses the sig- integer w, and the orientations of K1 and K2 agree outside P and R, nature of a link, a link invariant denoted by σ. If L is a RH 2m- then the following is true: cat with parallel dif sites, then σðLÞ = − 2m + 1. A result of a. If K1 = RH 2k-cat T(2,2k) with parallel sites ðk ≥ 1Þ, and K2 is Murasugi (22) implies that the difference of the signatures of the − – a knot with at most 2k 1 crossings, then K2 is the RH torus substrate and the product of recombination by XerCD FtsK is at knot T(2,2k − 1) with directly repeated sites; most 1 (SI Text, Lemma S1). Because the signature of the unlink b. If K = RH torus knot T(2,2k + 1) with directly repeated sites is 0, we need at least (2m−1) recombination events to go from 1 ðk ≥ 0Þ, and K2 is a link with at most 2k crossings then K2 is the the 2m-cat to the unlink. As each recombination changes a link RH 2k-cat T(2,2k) with parallel sites. into a knot and vice versa, the total number of recombination events must be even. Hence at least 2m recombination events are needed to convert the 2m-cat to the unlink. Proof of Theorem 3 is given in SI Text S3 and Fig. S3. Figs. 1 and 3 illustrate shortest pathways of unlinking starting Proof of Theorem 2: Theorem 3 implies that the product of XerCD– with the 6-cat and with the 2m-cat, respectively. FtsK recombination on a RH 6-cat with parallel sites is the RH 5-crossing torus knot with sites in direct repeat, and that the The Shortest DNA Unlinking Pathway by XerCD–FtsK Is Unique. The- product of recombination on the 5-noded knot is the RH 4-cat orem 1 provides formal proof that to unlink a replication cate- with parallel sites. 
Repeated application of Theorem 3 confirms nane with 2m crossings, the XerCD–FtsK complex must perform that the pathway in Fig. 1 is the only shortest pathway of unlinking at least 2m single recombination events. The pathway proposed for the parallel RH 6-cat if we assume that recombination reduces

Shimokawa et al. PNAS Early Edition | 3of6 Downloaded by guest on September 28, 2021 the topological complexity at every step but the last. This same propose topological mechanisms by assuming that O is equiva- argument can be applied to XerCD–FtsK unlinking of any parallel lent to a rational tangle or sum of rational tangles. In this way, RH 2m-cat, thus completing the proof (Fig. 3). we use tangle calculus to propose the most likely mechanisms; The condition of Theorem 2 can be relaxed by allowing the however, there is no guarantee that these are the only ones. The crossing number to remain constant or to be reduced. This as- proofs for the following two propositions are analogous, and are sumption gives rise to other complicated pathways. given in SI Text S4.

Full Characterization of the Last Three Steps of DNA Unlinking by Proposition 4. Suppose N(O + P) = RH 2k-cat, N(O + R) = RH XerCD–FtsK. By Theorem 2, each shortest unlinking pathway start- (2k − 1)-torus knot. Suppose that O is equivalent to rational or the ing with a RH 2m-cat ends with a sequence of RH trefoil, Hopf sum of rational tangles. Then we show the following: link, unknot, and unlink (Fig. 1). Here we show that, under bio- if P = (0) and R = (−1), then O = (2k); logically reasonable assumptions, each of the last three recombi- = = = − fi if P (0) and R (0,0), then O (2k, 1,0); nation steps admits a very speci c topological mechanism. As in if P = (0) and R = (1), then O = (2k,−2,0). ref. 15, assumptions for P and R are based on the length of the core region of the recombination sites, and restrict the number of meaningful solutions to a small finite number. Proposition 5. Suppose N(O + P) = RH (2k − 1)-torus knot, N(O + Recombination converting a RH trefoil with directly repeated R) = RH (2k − 2)-cat. Suppose that O is equivalent to rational or sites into a 2-cat was characterized in ref. 16. Based on this, sum of rational tangles. Then we show the following: Proposition 1 shows that if a trefoil substrate is acted upon by = = − = − – if P (0) and R ( 1), then O (2k 1); XerCD FtsK, then there are exactly three possible topological if P = (0) and R = (0,0), then O = (2k − 1,−1,0); mechanisms of action. if P = (0) and R = (1), then O = (2k − 1,−2,0). Proposition 1 (adapted from ref. 16). Suppose N(O + P) = RH trefoil, The results of Propositions 4 and 5 are consistent with those of N(O + R) = 2-cat. Then we show the following: Propositions 1–3. 
In the general case, the shortest unlinking − if P = (0) parallel and R = (−1), then O = (3); pathway takes a RH 2k-cat with parallel sites into a RH (2k 1)- if P = (0) antiparallel and R = (0,0), then O = (3,−1,0); knot with sites in direct repeats, and then takes that knot into − if P = (0) parallel and R = (1), then O = (3,−2,0). a RH (2k 2)-cat with parallel sites (Theorems 1 and 2). Using Propositions 4 and 5, each of the recombination events above By an argument similar to that in ref. 11, Proposition 2 shows admits three possible topological mechanisms of action. that if the substrate of recombination is the 2-cat with parallel sites and the product is the unknot, and if R is rational, then the geom- Discussion etry in the synapse is uniquely determined for each choice of R. Unification of the Individual Stepwise Unlinking Pathways. We have shown that for each recombination step in the unlinking pathway Proposition 2. Suppose N(O + P) = 2-cat, N(O + R) = unknot. Then there are three corresponding topological mechanisms of action we show the following: for the enzyme. These mechanisms can be interpreted as three if P = (0) parallel and R = (−1), then O = (2); views of the same 3D object (Figs. 4 and 5). The mechanisms if P = (0) antiparallel and R = (0,0), then O = (2,−1,0); computed for each recombination step can be clustered together if P = (0) parallel and R = (1), then O = (2,−2,0). to propose three stepwise unlinking mechanisms (Fig. 4). Notice that each row in Fig. 4 illustrates the topological mechanism for Finally Proposition 3 states that, if the substrate of re- the last three steps of XerCD–FtsK DNA unlinking. In refs. 15 combination is the unknot with two sites in direct repeat and the and 17 it was proposed that the topological mechanism with P = product is the unlink, then the geometry in the synapse is (0) with sites in parallel alignment and R = (−1) was the most uniquely determined for each choice of R. 
The proof is based on a result included in SI Text S4 and Fig. S4.

Proposition 3. Suppose N(O + P) = unknot, and N(O + R) = unlink. Then we show the following: if P = (0) parallel and R = (−1), then O = (1); if P = (0) antiparallel and R = (0,0), then O = (0,0); if P = (0) parallel and R = (1), then O = (−1). In summary, the shortest unlinking pathway takes the trefoil to the 2-cat, the 2-cat to the unknot, and the unknot to the unlink (Theorems 1 and 2). There are exactly three topological mech- anisms of action to account for each of these unlinking steps by XerCD–FtsK (Propositions 1–3).

Unlinking of 2m-Cats and (2m−1) Torus Knots. We have proposed a shortest pathway to account for XerCD–FtsK unlinking of Fig. 4. Last three steps of DNA unlinking by XerCD–FtsK. At the top are the replication catenanes and have proved that this pathway is last three steps in the DNA unlinking pathway. Each of the three rows below unique. We have fully characterized the unlinking mechanisms shows one of the three possible topological mechanisms by which a trefoil is taking trefoil to 2-cat, 2-cat to unknot, and unknot to unlink. converted to a 2-cat, a 2-cat to an unknot, and an unknot to an unlink. fi = = − With the current mathematical machinery we can provide full Locally, the rst mechanism, where P (0) parallel and R ( 1), is consistent with the mechanism proposed for XerCD at psi (17). The other two pathways characterization only for the last three recombination events (rows 2 and 3) can be obtained by rigid rotation of the pathway in row 1, along the pathway. In the general case where a 2k-cat is con- and correspond to P = (0) antiparallel and R = (0,0), and to P = (0) parallel − − verted into a (2k 1)-torus knot (Proposition 4) and a (2k 1)- and R = (1), respectively. The rotation axes are m1 and m2, as in Fig. 5. These torus knot is converted into a (2k−2)-cat (Proposition 5), we can mechanisms are consistent with the three proposed for Xer at psi (15).

4of6 | www.pnas.org/cgi/doi/10.1073/pnas.1308450110 Shimokawa et al. Downloaded by guest on September 28, 2021 ′ = m can obtain a solution S equivalent to S by rotating the synapse P 1 = 13° 8° (0)toP (0,0). In the case of XerCD at psi, performing this IV simple transformation on the two solutions for P parallel yields the unique solution for P antiparallel. Similar arguments have been proposed for the recombinase TnP1-IRS (11) and for XerCD– I FtsK-dif (4). Here, we extend these arguments and apply them to each recombination step in the XerCD–FtsK-dif unlinking path- m2 III way (Figs. 4 and 5). The observed geometrical equivalence, ob- tained by rigid motion, between the three solutions for one of the II recombination steps suggests that a unique 3D representation of 14° 8° the synaptic complex and topological mechanism of the enzymes can be interpreted as three different tangle solutions when viewed from different spatial directions. Computer visualization methods, Fig. 5. Local site alignment. Due to a high level of sequence homology, we combined with molecular modeling such as PyMol and here assume that the XerCD–DNA synapse can be modeled using the data with X-ray crystallographic data, can be applied to realize these – – for Cre DNA. This diagram is based on the crystal structure of Cre DNA from spatial equivalences and to propose a 3D molecular model for the ref. 23, in which the recombinase core complex is slightly off-planar, so that the cores appear either parallel or antiparallel in different projections. To enzymatic action, as was done in ref. 15. An animation generated with Knotplot (www.knotplot.com) is presented in Movie S1 to il- improve clarity, the angles are doubled in this diagram. Let m2 be the hor- izontal axis (dashed line), and let m1 be the vertical axis (solid line). When lustrate the transition from the 6-cat to the 5-torus knot. 
These looking down m2, the two sites appear in antiparallel alignment, whereas studies indicate a limitation of the tangle model and suggest the when seen from the reader’s perspective, the sites are in parallel. A rigid left- need to consider equivalence classes of planar tangle diagrams hand rotation of the DNA conformation around the axis m1 will result in related by 3D rigid motion (rotations and translations). string IV crossing over string I (15). Then the sites appear in antiparallel alignment and one crossing is added to the domain. Further left-hand ro- Conclusions tation around the m2 axis results in III crossing over II and the sites returning We have presented a thorough mathematical analysis of the to a parallel alignment. process of DNA unlinking by XerCD–FtsK. Based on the - perimental data of ref. 4, we first prove that the shortest pathway APPLIED consistent with the experimental data. The mechanism proposed of unlinking a substrate consisting of 2m-cats with sites in par- MATHEMATICS in ref. 17 corresponds to the pathway in Fig. 4, row 1 (and is allel orientation has exactly 2m steps. We then show that if we revisited in SI Text S5 and Fig. S5). The other two pathways assume that recombination reduces the complexity of the sub- (rows 2 and 3) are obtained by rigid rotation of the pathway in strate at each step, as suggested by the experimental data, then row 1, and correspond to P = (0) antiparallel and R = (0,0), and the shortest pathway is unique and every recombination inter- to P = (0) parallel and R = (1), respectively. mediate is a torus knot or link. We analyze each recombination step and find three topological mechanisms at each step. The mechanisms can be unified in two different ways: (1) At each re-

Tangle Analysis of Iterative Recombination. Because the recom- BIOCHEMISTRY bination event involves two separate pairs of strand exchanges combination step, the obtained mechanisms can be interpreted as and proceeds through a Holliday Junction (HJ) Intermediate, different projections of the same 3D object, as illustrated in Figs. 4 there must be a step resetting the synapse between recombi- and 5 and in the animation in Movie S1;or(2)Theobtained nation events to maintain the same recombination mechanism at mechanisms can be sorted by the local action of the enzyme each event (i.e., if XerD is always to exchange the first pair of (specified by P, R, and the relative orientation of the sites). This strands to form the HJ which is resolved by XerC, as shown in unification produces three separate mechanisms of stepwise ref. 8). There is little evidence of the ability of XerCD and other unlinking consistent with the unique, shortest pathway previously tyrosine recombinases to act processively (i.e., bind and recom- reported. bine a single pair of recombination sites multiple times before dissociating). However, in refs. 3, 4, and 24 the authors report an Mathematical Methods action of XerCD recombination consistent with iterative cleav- is the study of non–self-intersecting curves in 3D space, and of fi fi – age and strand exchange. If the recombination reaction is as- the spaces de ned by these curves. A knot is de ned as a non self-inter- sumed to be iterative, and is modeled as processive (9, 25), secting circular path in space. A link (or catenane) is composed of two or more such curves, which can be intertwined. If the path can be laid flat then it can be proven that the tangle O is rational, and therefore without any over- or undercrossings, then K is the unknot or the trivial knot. all solutions are computable.

3D Interpretation of the Tangle Solutions. In the tangle method, P is defined as a very small ball containing the cleavage core regions of the recombination sites. In this case, the orientation of the recombination sites is inherited into the tangle P. Two sites in P = (0) are in parallel alignment if both arrows point in the same direction in the tangle diagram; otherwise they are in antiparallel alignment. However, the concept of parallel and antiparallel alignment represents a local geometric property of the sites and is well-defined only in the tangle diagram (a planar projection of the 3D tangle). For the same 3D tangle, one can always obtain a planar projection with parallel sites and another with antiparallel sites. The only exception to this statement is when two sites are coplanar, in a strict mathematical sense. In ref. 15 the authors proposed a unique 3D topological mechanism for XerCD at psi that incorporated all three solutions to the tangle equations. Given a solution where O is rational, P = (0) and R = (1) or R = (−1), one …

Fig. 6. (A) Two rational tangles and their Conway vectors, (−3,0) and (6). (B) Tangle addition of rational tangles A = (−5,0) and B = (−1). A + B is equivalent to the tangles (−5,−1) and (6,−1,0). (C) N(A + B) is a 6-crossing RH torus link (RH 6-cat).
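The equivalence stated in the legend of Fig. 6B can be checked numerically. The Python sketch below assumes the continued-fraction convention in which a Conway vector (a_1, ..., a_n) corresponds to the extended rational number a_n + 1/(a_{n-1} + ... + 1/a_1), and assumes that adding an integral tangle (n) adds n to that number. Both conventions are assumptions of this illustration, chosen because they reproduce the values in Fig. 6. The function names are hypothetical.

```python
from math import gcd


def conway_to_fraction(vector):
    """Return (p, q) with p/q = a_n + 1/(a_{n-1} + ... + 1/a_1).

    The pair is reduced and q >= 0; q == 0 encodes the extended
    rational "infinity" (1/0). The convention is an assumption of
    this sketch, checked only against the tangles of Fig. 6.
    """
    if not vector:
        raise ValueError("Conway vector must be non-empty")
    p, q = vector[0], 1
    for a in vector[1:]:
        # a + 1/(p/q) = (a*p + q) / p
        p, q = a * p + q, p
    g = gcd(p, q)
    p, q = p // g, q // g
    if q < 0 or (q == 0 and p < 0):
        p, q = -p, -q
    return p, q


def add_integral(fraction, n):
    """Add the integral tangle (n) to a rational tangle p/q, giving p/q + n."""
    p, q = fraction
    return (p + n * q, q)


if __name__ == "__main__":
    a_plus_b = add_integral(conway_to_fraction((-5, 0)), -1)
    print(a_plus_b)                          # A + B with A = (-5,0), B = (-1)
    print(conway_to_fraction((-5, -1)))      # first equivalent vector
    print(conway_to_fraction((6, -1, 0)))    # second equivalent vector
```

Under these conventions all three printed pairs are (-6, 5), that is, the fraction −6/5, which illustrates that one rational tangle can admit several Conway vectors. The numerator 6 matches the 6-crossing torus link N(A + B) of Fig. 6C.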

Likewise, the unlink, or trivial link, is the disjoint union of two unknotted circles. Knots and links can be studied through their diagrams, i.e., planar projections that distinguish between under- and overcrossings. Two knots or links K1 and K2 are equivalent if K1 can be smoothly deformed into K2 without breaking the chain. We indistinguishably denote the knot/link, or its equivalence class, by K. Knots and links are classified according to their crossing numbers. The crossing number of K is the minimum number of crossings taken over all projections of all elements in the equivalence class K. Although finer classifications are attempted, it is generally difficult to determine when two knots/links are topologically identical (26).

A tangle is a pair consisting of a ball and two strings in it. Fig. 2B illustrates the four simplest tangles, called trivial tangles. Rational tangles are obtained by intertwining the two strings in a trivial tangle. All tangles in Fig. 6 are rational. Rational tangles most often appear in the mathematical analysis of site-specific recombination. There is a one-to-one correspondence between the set of rational tangles and the set of extended rational numbers (i.e., the union of the set of rational numbers and infinity, 1/0) (27). A rational tangle can be expressed using a vector representation, called the Conway vector. Conway vectors are illustrated in Fig. 6. Notice that the same rational tangle can admit different vector representations. The tangle method uses two operations: tangle addition A + B; and numerator N(A) (Fig. 6C). A + B is a tangle, whereas N(A) is a knot or a link. If A is rational then N(A) belongs to the well-studied family of 4-plat knots or links (28).

The tangle method uses biologically reasonable assumptions to model a site-specific recombination event as a system of two or more tangle equations on three unknowns O, P, and R. If P = (0) and R = (w) or (w,0) for some integer w, then the system can be solved for O rational or a sum of rational tangles (9, 13). The solution set can be very large. To obtain the most biologically relevant solutions we often resort to assumptions where P = (0) and R = (k) or (k,0). These assumptions are backed by biological data as detailed in ref. 15. Computing the solutions is not mathematically challenging but can be very tedious. Mathematical software is available to make the method more accessible to the scientific community (29, 30). TangleSolve (http://ewok.sfsu.edu/TangleSolve) is a stand-alone Java program and web-based applet with a user-friendly interface for analyzing and visualizing site-specific recombination mechanisms using the tangle method (29).

Using arguments from low-dimensional topology, O, P, and R can sometimes be shown to be equivalent to rational or sums of two rational tangles (9, 11, 13–16, 25). In these cases one concludes that, under the assumptions of the tangle method, the solutions computed are the only possible solutions to a given system of equations. Biologically this implies that the tangle method computes all possible topological mechanisms of action for the enzyme. In some cases reasonable assumptions can be made about P and R to limit the number of solutions to a small finite number (14, 15). Mathematical analyses are often useful to characterize topological mechanisms of recombination (4, 10, 11, 13–16, 25, 31–34). A preliminary tangle analysis of Xer unlinking is given in ref. 4.

ACKNOWLEDGMENTS. The authors thank Rob Scharein for help with some of the figures and Movie S1, which were generated using Knotplot (www.knotplot.com), and Barbara Ustanko for editorial assistance with this manuscript. The authors also thank the referees for their careful review of the manuscript. M.V., K.S., and K.I. thank the Institute of Mathematics and Its Applications for its hospitality and support. This research was supported by the following: Japan Society for the Promotion of Science KAKENHI 22540066 and 25400080 (to K.S.); National Science Foundation DMS0920887 and CAREER Grant DMS1057284 (to M.V.); UK Engineering and Physical Sciences Research Council EP/H031367 (to K.I.); Australian Research Council FT120100153 and NHMRC APP1005697 (to I.G.); and Wellcome Trust WT099204AIA (to D.J.S.).

1. Espeli O, Levine C, Hassing H, Marians KJ (2003) Temporal regulation of topoisomerase IV activity in E. coli. Mol Cell 11(1):189–201.
2. Schvartzman JB, Stasiak A (2004) A topological view of the replicon. EMBO Rep 5(3):256–261.
3. Ip SC, Bregu M, Barre FX, Sherratt DJ (2003) Decatenation of DNA circles by FtsK-dependent Xer site-specific recombination. EMBO J 22(23):6399–6407.
4. Grainge I, et al. (2007) Unlinking chromosome catenanes in vivo by site-specific recombination. EMBO J 26(19):4228–4238.
5. Sadowski PD (1993) Site-specific genetic recombination: Hops, flips, and flops. FASEB J 7(9):760–767.
6. Hallet B, Sherratt DJ (1997) Transposition and site-specific recombination: Adapting DNA cut-and-paste mechanisms to a variety of genetic rearrangements. FEMS Microbiol Rev 21(2):157–178.
7. Barre FX, Sherratt DJ (2005) Chromosome dimer resolution. The Bacterial Chromosome, ed Higgins PN (ASM, Washington), pp 513–524.
8. Grainge I, Lesterlin C, Sherratt DJ (2011) Activation of XerCD-dif recombination by the FtsK DNA translocase. Nucleic Acids Res 39(12):5140–5148.
9. Ernst C, Sumners DW (1990) A calculus for rational tangles: Applications to DNA recombination. Math Proc Camb Philos Soc 108(3):489–515.
10. Sumners DW, Ernst C, Spengler SJ, Cozzarelli NR (1995) Analysis of the mechanism of DNA recombination using tangles. Q Rev Biophys 28(3):253–313.
11. Zheng W, Galloy C, Hallet B, Vazquez M (2007) The tangle model for site-specific recombination: A computer interface and the TnpI-IRS recombination system. Knot Theory for Specific Objects, OCAMI Studies, ed Kawauchi A (Osaka Municipal Universities Press, Sakai, Japan), Vol 1, pp 251–271.
12. Arsuaga J, Diao Y, Vazquez M (2007) DNA topology in recombination and chromosome organization. Mathematics of DNA Structure, Function and Interactions, The IMA Volumes in Mathematics and its Applications 150, eds CJ Benham et al. (Springer Science + Business Media, New York), pp 7–36.
13. Ernst C, Sumners DW (1999) Solving tangle equations arising in a DNA recombination model. Math Proc Camb Philos Soc 126(1):23–36.
14. Darcy I (2001) Biological distances on DNA knots and links: applications to Xer recombination. J. Knot Theory Ramifications 10(2):269–294.
15. Vazquez M, Colloms SD, Sumners DW (2005) Tangle analysis of Xer recombination reveals only three solutions, all consistent with a single three-dimensional topological pathway. J Mol Biol 346(2):493–504.
16. Darcy IK, Ishihara K, Medikonduri RK, Shimokawa K (2012) Rational tangle surgery and Xer recombination on catenanes. Algebr. Geom. Topol. 12:1183–1210.
17. Colloms SD, Bath J, Sherratt DJ (1997) Topological selectivity in Xer site-specific recombination. Cell 88(6):855–864.
18. Alén C, Sherratt DJ, Colloms SD (1997) Direct interaction of aminopeptidase A with recombination site DNA in Xer site-specific recombination. EMBO J 16(17):5188–5197.
19. Colloms SD, Alén C, Sherratt DJ (1998) The ArcA/ArcB two-component regulatory system of Escherichia coli is essential for Xer site-specific recombination at psi. Mol Microbiol 28(3):521–530.
20. Colloms SD (2013) The topology of plasmid-monomerizing Xer site-specific recombination. Biochem Soc Trans 41(2):589–594.
21. Hirasawa M, Shimokawa K (2000) Dehn surgeries on strongly invertible knots which yield lens spaces. Proc Am Math Soc 128(11):3445–3451.
22. Murasugi K (1965) On a certain numerical invariant of link types. Trans Am Math Soc 117:387–422.
23. Van Duyne GD (2001) A structural view of Cre-loxP site-specific recombination. Annu Rev Biophys Biomol Struct 30:87–104.
24. Gourlay SC, Colloms SD (2004) Control of Cre recombination by regulatory elements from Xer recombination systems. Mol Microbiol 52(1):53–65.
25. Vazquez M, Sumners DW (2004) Tangle analysis of Gin recombination. Math Proc Camb Philos Soc 136(3):565–582.
26. Murasugi K (2008) Knot Theory and Its Applications (Birkhäuser, Boston).
27. Conway JH (1970) An enumeration of knots and links, and some of their algebraic properties. Computational Problems in Abstract Algebra, ed Leech J (Pergamon, Oxford), pp 329–358.
28. Burde G, Zieschang H (2003) Knots (de Gruyter, Berlin).
29. Saka Y, Vázquez M (2002) TangleSolve: Topological analysis of site-specific recombination. Bioinformatics 18(7):1011–1012.
30. Darcy IK, Scharein RG (2006) TopoICE-R: 3D visualization modeling the topology of DNA recombination. Bioinformatics 22(14):1790–1791.
31. Buck D, Flapan E (2007) Predicting knot or catenane type of site-specific recombination products. J Mol Biol 374(5):1186–1199.
32. Darcy I, Luecke J, Vazquez M (2009) Tangle analysis of difference topology experiments: Applications to a Mu protein-DNA complex. Algebr. Geom. Topol. 9:2247–2309.
33. Valencia K, Buck D (2011) Predicting knot and catenane type of products of site-specific recombination on substrates. J Mol Biol 411(2):350–367.
34. Darcy IK, Vazquez M (2013) Determining the topology of stable protein-DNA complexes. Biochem Soc Trans 41(2):601–605.

Shimokawa et al. | www.pnas.org/cgi/doi/10.1073/pnas.1308450110