Chromosomal directionality of DNA mismatch repair in

A. M. Mahedi Hasan and David R. F. Leach1

Institute of Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF United Kingdom

Edited by Peggy Hsieh, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, and accepted bythe Editorial Board June 16, 2015 (received for review March 17, 2015) Defects in DNA mismatch repair (MMR) result in elevated muta- single-strand DNA-binding protein (SSB) (17). The DNA poly- genesis and in predisposition. This disease burden arises merase III holoenzyme then correctly resynthesizes the nascent because MMR is required to correct errors made in the copying of strand and DNA ligase seals the remaining (16, 18). DNA. MMR is bidirectional at the level of DNA strand polarity as it Most of the previous studies of MMR have been based on operates equally well in the 5′ to 3′ and the 3′ to 5′ directions. in vitro experiments using linear or closed-circular heteroduplex However, the directionality of MMR with respect to the chromo- DNA substrates in defined systems (19–21). These in vitro studies some, which comprises parental DNA strands of opposite polarity, have shown that an incision of the nascent strand can occur either has been unknown. Here, we show that MMR in Escherichia coli is on the 3′ or the 5′ side of a mismatch, depending on the position unidirectional with respect to the chromosome. Our data demon- of the hemimethylated GATC motif recognized by MutH (22, 23). strate that, following the recognition of a 3-bp insertion-deletion Using electron microscopy and end labeling, Grilley et al. have loop mismatch, the MMR machinery searches for the first hemime- shown that the single-stranded DNA region created after an ex- thylated GATC site located on its origin-distal side, toward the rep- cision reaction lies between the mismatch and the closest GATC lication fork, and that resection then proceeds back toward the motif (15). A hemimethylated GATC motif recognized on the 3′ mismatch and away from the replication fork. This study provides side of the mismatch requires a 3′ to 5′ (e.g., ExoI or support for a tight coupling between MMR and DNA replication. ExoX), and ExoVII or RecJ cleaves the single-stranded DNA when a hemimethylated GATC motif is recognized on the 5′ side mismatch repair | recombination | E. coli of the mismatch (5′ to 3′ cleavage).Therefore, in vitro, the cleav- age reaction of the MMR system is bidirectional (15, 16). How- NA can be mutated following damage caused by exposure ever, in vivo, an MMR system that is bidirectional with respect to Dto chemical or physical or by errors in DNA DNA polarities could nevertheless be unidirectional with respect metabolism, during replication, recombination, or repair (1). to the chromosome. In fact, a unique directionality of MMR with Although replicative DNA polymerases have proofreading respect to DNA replication would explain the evolution of bidir- activities, they cannot avoid a low frequency of incorporation of ectionality at the level of strand polarities as mismatches on both noncomplementary deoxynucleoside triphosphates (dNTPs). It the leading and lagging strands need to be repaired. Fig. 1A dis- − − has been estimated that there are 10 5–10 6 misincorporation tinguishes directionality at the level of strand polarity from di- events per replicated base pair (2). Uncorrected errors of this rectionality at the level of the chromosome and illustrates how kind (known as mismatches) will be converted into , directionality relative to the replication fork can lead to bidi- which have the potential to perturb biological processes in the rectionality of resection polarities. Blackwood and collaborators next round of DNA replication. The DNA mismatch repair found that MMR at the site of an unstable trinucleotide repeat (MMR) system is a key DNA guardian that ensures the removal (TNR) array caused a stimulation of recombination at a nearby of the misincorporated and thereby maintains ge- 275-bp tandem repeat. This stimulation occurred only when the nomic integrity. The MMR system has a defined substrate range, tandem repeat was placed on the origin-proximal side the TNR with different efficiencies of correction of single base mismatches that had generated a high frequency substrate for MMR (24). This and a 3–4 base size limit for the correction of insertion/deletion – loops (IDLs) (3 6). This system is conserved in almost all or- Significance ganisms, with the exception of most Actinobacteria and Molli- cutes, and parts of the archaea (7). DNA mismatch repair (MMR) is critical to avoid mutations that When the replication machinery inserts a noncomplementary can lead to genetic disease, cancer, and death. The MMR sys- dNTP in the nascent strand of the E. coli chromosome, the MutS tem is evolutionarily conserved from bacteria to humans and is protein binds to the mismatch and recruits the MutL protein (8). an example of a remarkable molecular machine. Here we show These two proteins activate the endonuclease MutH, which nicks that this machine operates unidirectionally with respect to the the unmethylated strand of a hemimethylated GATC site to ini- chromosome of Escherichia coli. The most likely explanation for tiate removal of the nascent strand containing the mismatch (9, this directionality is that the MMR machinery is associated with 10). Following the passage of the replisome, GATC motifs re- the complex responsible for DNA replication. We suggest that main transiently hemimethylated before of the na- this association facilitates the mechanism of strand discrimi- scent strand by the Dam enzyme and can nation in MMR. therefore be used to distinguish between parental and nascent strands (11, 12). In vitro, MutH can distinguish between these Author contributions: A.M.M.H. and D.R.F.L. designed research; A.M.M.H. performed re- strands by using hemimethylated GATC motifs located within a search; A.M.M.H. and D.R.F.L. analyzed data; and A.M.M.H. and D.R.F.L. wrote the paper. 2-kb distance of the mismatch, either on the 3′ or the 5′ side (13, The authors declare no conflict of interest. 14). UvrD uses the incision made by MutH at the This article is a PNAS Direct Submission. P.H. is a guest editor invited by the Editorial hemimethylated GATC motif as an entry point to unwind the Board. nascent strand. One or more of the four (ExoI, Freely available online through the PNAS open access option. ExoVII, RecJ, and ExoX), depending on the required polarity of 1To whom correspondence should be addressed. Email: [email protected]. degradation, resects the unwound nascent strand (15, 16). The This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. single-stranded parental DNA is immediately bound by the 1073/pnas.1505370112/-/DCSupplemental.

9388–9393 | PNAS | July 28, 2015 | vol. 112 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1505370112 Downloaded by guest on October 1, 2021 Results MMR Is Unidirectional in the E. Coli Chromosome. Length instability at the level of a single trinucleotide unit in a CTG·CAG repeat array that is inserted in the chromosome of E. coli is elevated in MMR deficient cells (24). This elevation of occurs to an equal extent in mutS, mutL, and mutH mutants indicating that correction of the 3-base IDL, precursor to a single trinucleotide repeat unit change, occurs by a normal MMR reaction involving recognition by MutS and strand cleavage by MutH (24). This is as expected for a 3-base IDL, which has been shown to be corrected with an efficiency close to that of a G·Tmismatch(thebestrec- ognized single base mismatch) (6) and contrasts with changes of two or more CTG·CAG trinucleotide units, which are not ele- vated in MMR deficient cells (24), as predicted by the inability of the MMR system to recognize an IDL of 5 bases (6). Accordingly, aCTG·CAG repeat array can be used as a source of frequent 3-base IDL mismatches at a defined chromosomal locus for the study of MMR in vivo. In this work, we have used a repeated array of 98 copies of CTG·CAG (294 bp) where the CTG repeats are located on the leading-strand template in the lacZ of the E. coli chromosome (Fig. 1B). The variation of length of the repeat array at the level of a single trinucleotide unit has been defined as “single-unit instability,” which is used as a quantitative phenotype reflecting the efficiency of the MMR system. We have focused on the directionality of MMR as defined by the use of GATC sites on both sides of the TNR. This was carried out by generating site- specific mutations that modified the GATC sites but retained their GENETICS coding sequences via synonymous substitutions. No significant change of single-unit instability was observed following the modification of all of the GATC motifs within a 2.3-kb region on the origin-proximal side of the TNR. In this situation, the next available GATC site was located at a distance of 2,339 bp from the TNR (P2339) and the frequency of single- unit instability remained equivalent to that observed for wild type cells [D472(wt)] (Fig. 2). We conclude that GATC motifs within the first 2.3 kb on the origin-proximal side of the TNR are not required for the excision reaction during DNA mismatch repair while origin-distal GATC motifs are present. On the other hand, the frequency of single-unit instability of the TNR started to increase when GATC motifs beyond 1 kb Fig. 1. Directionality of MMR. (A) Visualization of MMR in the context of DNA were modified on the origin-distal side of the TNR (D1356 on- replication. The two possible directionalities of MMR with respect to the chro- ward in Fig. 2), while keeping the origin-proximal GATC motifs mosome are depicted in the context of the replication fork where mismatches arise. The MMR complex has the potential to scan the chromosome for a hem- intact. The level of single-unit instability increased 2.9-fold over imethylated GATC site in the direction of movement of the replication fork (i)or that observed for [D472(wt)] when all of the available GATC in the opposite direction, away from the replication fork (ii). Mismatches are motifs within a 2-kb region on the original-distal side of the TNR shown on both the leading and lagging strands. However, in reality they are were modified (D2369). However, the level of single-unit in- most likely to be present in either one or other of these two locations at a given stability observed for D2369 did not reach the level of that in a time. It can be seen that if the scanning for a hemimethylated GATC is unidi- mutS background. The difference of instability frequencies be- rectional with respect to the chromosome because of the direction of movement tween the mutS background and D2369 is statistically significant ofthereplicationfork,thiscanleadtobidirectional resection to repair mis- at a 95% confidence level (p value = 0.0317), indicating that an matches on both the leading and the lagging strands. (B) Schematic represen- tation of the locus of interest and the experimental design to investigate the in vivo separation of 2 kb between the mismatch and the closest influence of GATC sites on MMR and tandem repeat recombination associated remaining GATC site did not totally abolish MMR. This con- with MMR. The native GATC motifs within the 5-kb region surrounding the trasts with the situation observed in vitro where a separation of CTG·CAG TNR array are shown. There are thirteen GATC motifs within the 2.5-kb 2 kb abrogates MMR (15, 25, 26). Nevertheless, our data clearly region on the origin-proximal side of the TNR, of which, two (P128 and P224) are show that the DNA mismatch repair system has a preference for situated between the TNR and the site of insertion of the zeocin cassette. On the recognition of GATC sites on the origin-distal side of the TNR. origin-distal side of the TNR, there are nine GATC motifs in the first 2.5-kb re- Our next step was to extend the distance between the nearest gion. The E. coli chromosome is shown as a horizontal blue cylinder, the TNR is available GATC motif and a mismatch further beyond the 2 kb shown as a green band and the zeocin cassette is shown as two tandem red cylinders. GATC motifs are shown by amber bands with their respective names in vitro limit. As there were many closely spaced GATC motifs based on their positions. “P” and “D” correspond to the origin-proximal side and just beyond the 2-kb boundary on the origin-distal side of the the origin-distal side of the TNR, respectively, whereas the number indicates the TNR in the E. coli genome, we extended the GATC-free region distance between the GATC motif and the TNR. by inserting a non E. coli DNA sequence in the strain D2369. A 2.3-kb-long GATC motif-free DNA sequence was isolated from the fission yeast Schizosaccharomyces pombe and inserted at a result suggested that MMR might be directional in a chromosomal site 200 bp from the TNR on the origin-distal side. In that strain context. We have now tested this hypothesis and shown that the [D4600(Sp)], the total length of GATC motif-free sequence on MMR system of E. coli has a unique chromosomal directionality. the origin-distal side of the TNR reached 4.6 kb. The frequency

Hasan and Leach PNAS | July 28, 2015 | vol. 112 | no. 30 | 9389 Downloaded by guest on October 1, 2021 of the GATC site implicated in the reaction or the length of the excision tract, we investigated a second reaction implicating MMR. It has previously been shown that recombination at a split zeocin resistance recombination reporter cassette (zeocin cassette) on the origin-proximal side of a TNR (Fig. S3) is stimulated by MMR (24). The rate of recombination was found to be a function of the distance between the TNR and the zeocin cassette and could be detected as far as 6.3 kb away on the origin-proximal side of the TNR. Here, we have studied the recombination level at the zeocin cassette as a function of the location of the first available GATC site to initiate the MMR reaction. In this experiment, we placed the zeocin cassette at 500 bp on the origin-proximal side of the TNR, as indicated in Fig. 1B. In the absence of the TNR, there – was a basal rate of recombination [D472(wt) TNR ; Fig. 3]. This level increased sevenfold in the presence of the TNR [D472(wt) + TNR ]. This increase was dependent on MMR as only the basal level of recombination was detected in a mutS mutant containing + the TNR (D472 mutS TNR ). This reaction has previously been shown to be dependent on MutS, MutL, and MutH, with the ar- Fig. 2. MMR as a function of GATC site availability. Frequencies of single-unit gument that (like MMR) it is dependent on an excision tract instability of the TNR in strains with different distances to the nearest origin- proximal or origin-distal GATC sites. The distance to the first available GATC site initiated by cleavage of a GATC site by MutH endonuclease (24). from the TNR is indicated on the x axis and the height of the corresponding In MMR proficient cells, mutation of both GATC motifs, P128 and P224, located between the TNR and the zeocin cassette did columns shows the frequency of single-unit instability (y axis). Error bars rep- + resent SEs of means. Strains are numbered according to the distance to the first not affect the level of recombination (P590 TNR ). Therefore, GATC motif present, with “P” and “D” representing the origin-proximal side this MMR excision reaction is not influenced by the presence of and the origin-distal side of the TNR respectively. The strain containing the wild- the GATC motifs between the TNR and the zeocin cassette on the type GATC motif is named D472(wt) and D4600(Sp) represents the strain in- origin-proximal side of the mismatch. A gradual decrease in the cluding a 2.3-kb GATC-free section of S. pombe DNA with a 4.6-kb GATC-free rate of recombination at the zeocin cassette was observed when region on the origin-distal side of the TNR. The mutS strain refers to the strain + sequential modifications of GATC motifs extended beyond a 1-kb D472 mutS TNR in the Table S1, which contains all of the native GATC sites + from the chromosome. All of the strains used in this assay contain the TNR. distance on the origin-distal side of the TNR (D1356 TNR and beyond). A 15% and a 19% decrease of the rate of recombination were observed, compared with the strain with intact GATC motifs + [D472(wt) TNR ], in the strains with the first available GATC of single-unit instability in this strain rose close to that observed + + for a DNA mismatch repair deficient background (mutS; p value = motif at 1,356 bp (D1356 TNR )and1,825bp(D1825TNR) 0.9451 at 95% confidence level), indicating little or no MMR away on the origin-distal side of the TNR. However, the rate of recombination did not decrease to the level observed in cells de- (Fig. 2). This implies that the GATC sites on the origin proximal – void of the TNR (TNR ) or mutated in the MMR system (D472 side of the TNR (which remained present in this situation) were + mutS TNR ), even after modification of all GATC motifs within notusedtodirectrepair. + + Our data have allowed us to calculate the relative efficiency of 2 kb (D2369 TNR ) or 4.6 kb [D4600(Sp) TNR ] on the origin- MMR as a function of the distance between a mismatch and its distal side of the TNR (Fig. 3). Nevertheless, these experiments nearest GATC motif on the origin-distal side of the TNR (Fig. S1). The efficiency of the MMR system in a wild type cell has been defined as 1 and that in a mutS mutant has been set to 0. Sequential modifications of GATC motifs on the origin-distal side result in the loss of efficiency of MMR as depicted in Fig. S1. Fitting an exponential trend line to the data reveals that the MMR system is 50% less efficient when the distance between the TNR and the next available GATC motif is 2.8 kb. To determine whether a single origin-distal GATC site located close to the TNR was able to restore MMR, we inserted an ec- topic GATC site (via synonymous substitution) at 567 bp on the origin-distal side of the TNR in the strain D2369 (D567-1G; Fig. S2). In this strain, the frequency of single-unit instability of the TNR returned to a similar level to that observed in wild type cells [D472(wt)]. This experiment suggested that a single origin-distal GATC site at 567 bp from the TNR was efficiently recognized. To confirm that this site was indeed recognized at close to 100% efficiency, we modified the same region (567 bp origin distal to the TNR) to include three tandemly repeated GATC sites (D567- 3G). Because that did not further increase the efficiency of MMR we concluded that a single origin-distal GATC site at this distance Fig. 3. Tandem repeat recombination as a function of GATC site availabil- from the TNR was already recognized with maximal efficiency. ity. Rates of recombination at a zeocin cassette in strains with different distances to the nearest origin-proximal or origin-distal GATC sites. The rates of recombination at the zeocin cassette are shown along the y axis for strains The Efficiency of MMR Depends on the Length of the Excision Tract. with different distances to the first available GATC site from the TNR (in- To investigate whether the observed decrease in efficiency of dicated along the x axis by the number following “P” for origin-proximal MMR with respect to the separation between the mismatch and side and “D” for origin-distal side of the TNR). Error bars represent upper the closest origin-distal GATC site was determined by recognition and lower limits of 95% confidence intervals.

9390 | www.pnas.org/cgi/doi/10.1073/pnas.1505370112 Hasan and Leach Downloaded by guest on October 1, 2021 revealed that, as for MMR, the rate of recombination was sensi- fork that is normally recognized during MMR. Unidirectionality tive to the separation of the TNR and the zeocin cassette from the of the MMR system with respect to chromosomal DNA replica- first available origin-distal GATC site. tion explains bidirectionality at the level of the strand excision, as We have quantified the effect of the distance between the MMR is required to occur on both the leading and lagging strand mismatch and the first available GATC site on the origin-distal of the replication fork (15). side of the TNR by plotting the relative recombination efficiency Our assay relies on the observation that the frequency of single- as a function of this distance (see Supporting Information for the unit TNR changes (+1and−1TNR)inaCTG·CAG repeat array is calculation of relative recombination efficiencies). Here, we have dependent on MMR catalyzed by MutS, MutL, and MutH (24). + defined the strain with wild type GATC sites [D472(wt) TNR ]to This is consistent with a conventional MMR reaction to correct a have a relative recombination efficiency of 1 and the mutS mutant 3- IDL, generated in the TNR during DNA replication + (D472 mutS TNR ) to have a relative recombination efficiency of and known to be a good substrate for MMR (6). For this reason, we 0. As can be seen in Fig. S4, the recombination efficiencies de- have made the simplifying assumption that the TNR is providing an creased as the distance between the first available GATC site and elevated frequency of 3-nucleotide IDLs for conventional MMR the TNR (or the zeocin cassette) increased. However, the slope of detected in our single-unit instability assay. However, we cannot the exponential fitted curve is approximately half that for MMR formally exclude the possibility that some other features of the re- (compare Figs. S4 and S1). The data predict that a 50% re- peat array are contributing to the frequency of single-unit changes. combination efficiency will be reached when 6 kb separates the It is well known that trinucleotide repeats can form pseudohairpins first available GATC site and the TNR. in vitro (27–32) that have been implicated in large-scale TNR The observations that the efficiencies of MMR and of zeocin changes in many experimental systems including the E. coli chro- cassette recombination are affected by the separation between the mosome (33, 34). We suspect that these large pseudohairpin TNR and first available origin-distal GATC site argue that origin- structures are not implicated in our single-unit TNR changes, but if distal GATC sites are implicated in both reactions. However, the they are, the repair reaction must involve the three components of difference in the efficiencies of the two reactions was surprising conventional MMR (MutS, MutL, and MutH) and must demon- given that both excision to mediate MMR and excision to stimu- strate the chromosomal directionality that we have detected. late zeocin cassette recombination were expected to pass through Further experiments are required to determine the molecular the TNR and to enable repair. We considered that sensitivity to basis of this directionality. However, codirectionality of hemi-

the separation between the TNR (or zeocin cassette) and the first methylated GATC recognition and DNA replication provides a GENETICS available origin-distal GATC site might either reflect a diminished mechanism for origin-distal GATC motif scanning via a known probability of recognizing a GATC site with distance from the interaction between the MutS protein and the β-clamp (35, 36). mismatch or a diminished probability of an excision tract covering Following recognition of the mismatch by an ADP-bound form the distance between the GATC site recognized and the TNR (or of MutS, this protein is converted into an ATP-bound sliding zeocin cassette). Given that there is no reason to expect two dif- clamp (2). The interaction between MutS-ATP and the β-clamp ferent modes of recognition of the same origin-distal GATC site, of the replisome would provide the possibility of a rapid and we argue that two lengths of excision tracts must exist. We propose directed scanning for a hemimethylated GATC site on the ori- that MMR is primarily mediated via shorter tracts that do not gin-distal side of the mismatch, as shown in Fig. 4. Upon iden- need to extend far beyond the mismatch, whereas longer tracts are tification of a hemimethylated GATC site, the incision of the responsible for zeocin cassette recombination at a greater distance nascent strand would be carried out by MutH in conjunction with beyond the initiating mismatch. Because both reactions are the matchmaker protein MutL. expected to involve excision tracts that remove the mismatch in the TNR, the shorter tracts must be more frequent than the longer The Maximal Distance, Measured Between a GATC Site and a Mismatch, tracts to explain the observation of two different efficiencies as a for Productive MMR Is Longer in vivo than in vitro. Our data suggest function of separation of the mismatch and the first available that the efficiency of MMR is reduced to 50% in the presence of origin-distal GATC site. The fact that two reactions, which are 2.8 kb of DNA between the mismatch and the GATC site used to initiated at the same GATC sites, are differentially sensitive to the initiate resection. This distance is greater than the limit of 2 kb distance of the first origin-distal GATC site from the TNR pro- observed in vitro (15, 25, 26). This difference could be explained if vides strong evidence that the lengths of the two different classes scanning for a hemimethylated GATC site is unidirectional in vivo of excision tracts are the primary in vivo determinants of the because of the interaction between MutS and the β-clamp (as ability of a distant GATC to stimulate MMR or recombination. described in Fig. 4). If so, scanning for a hemimethylated GATC site might in fact not be the factor determining the maximal in vivo RecQ Helicase Is Not Implicated in the Longer Excision Tracts. We have distance separating a mismatch and a GATC site for productive tested whether the RecQ helicase might be implicated in the longer MMR, while it might be critical in vitro. We have determined that excision tracts, by extending excision tracts initiated by UvrD. This only seven regions of the E. coli chromosome have GATC motifs was an attractive hypothesis as RecQ is a non-MMR helicase with separated by distances greater than 2.8 kb. These regions are ei- a3′ to 5′ directionality that can bind ssDNA—dsDNA junctions ther derived from prophages or present in rhs loci, which have an and stimulate recombination. However, the rates of recombination unusual base composition and are subject to complex inheritance at the zeocin cassette were similar independently of the presence of patterns (37–40). RecQ in strains where 4.6 kb of DNA separated the TNR and the + first origin-distal GATC site [D4600(Sp) TNR ](Fig. S5). Different Lengths of MMR Excision Tracts Are Formed. MMR me- diates the repair of mismatches and stimulates recombination at Discussion a zeocin cassette located up to 6.3 kb away from a TNR (24). MMR Is Unidirectional with Respect to the E. coli Chromosome. We Both reactions involve the recognition of a GATC site on the have shown that to correct IDL mismatches the MMR system of origin-distal side of the TNR. However, the zeocin cassette re- E. coli uses the GATC sequences present on the origin-distal side combination reaction implies the existence of longer resection of a TNR located in the chromosomal lacZ gene. Given that IDLs tracts extending beyond that observed in vitro (15). Here, we are created during DNA replication and that hemimethylated show that MMR and zeocin cassette recombination are differ- GATC sites, which direct the strand specificity of MMR, are entially affected by the distance between the first origin-distal transiently generated during DNA replication, we propose that it GATC site and the TNR. This differential effect argues that two is the first GATC site between the mismatch and the replication different lengths of excision tracts are generated during MMR.

Hasan and Leach PNAS | July 28, 2015 | vol. 112 | no. 30 | 9391 Downloaded by guest on October 1, 2021 The Success of MMR as a Function of the Distance between the Mismatch and the First Origin-Distal GATC Site Is Determined by the Length of the Excision Tract. We have considered two possibilities to explain the effect of the separation of the mismatch from the first origin-distal GATC site on the efficiency of MMR and of zeocin cassette recombination. First, the recognition of a hemi- methylated GATC site is sensitive to the distance from the mis- match. Second, the processivity of resection determines the length of the excision tract. Given that there is no reason to propose two different modes of GATC recognition for excision tracts of dif- ferent lengths, the fact that these events are differentially affected by the distance between the first origin-distal GATC and the TNR argues that it is the length of the excision tract that determines the success of MMR. This fits with the model presented in Fig. 4, where MutS is carried forward with the replisome facilitating the unidirectional scanning of DNA for a hemimethylated GATC site. We can estimate the length of the excision tracts that mediate MMR assuming that GATC sites are recognized equally well, are recognized independently of their distance from the TNR in the range of distances that we have investigated and that resection from the first origin-distal site (at 472 bp from the TNR) always reaches the mismatch. Given these assumptions, the observation that the efficiency of MMR is reduced to 50% with a separation of 2.8 kb between the mismatch and the first available GATC site argues that 50% of these excision tracts measure this distance. By a similar argument, our data argue that 50% of the excision tracts that mediate zeocin cassette recombination measure 6 kb plus the distance required to uncover at least one copy of the repeated section of the zeocin cassette. Because the zeocin cassette is lo- cated 500 bp on the origin-proximal side of the TNR, the zeocin cassette includes a repeated sequence of 275 bp and the TNR measures 294 bp a successful excision tract must reach ∼1.1 kb farther than the distance between the recognized GATC site and the TNR. The 50% length of these longer excision tracts is there- fore estimated to be ∼7kb.

Implications for Eukaryotic MMR Systems. In eukaryotic cells, MMR is primarily initiated by the MutSα (MSH2/MSH6)orMutSβ (MSH2/ MSH3) complex, which is then followed by the MutLα (MLH1/ Fig. 4. Model for the chromosomal directionality of MMR. (A) Recognition of PMS) protein. There is no separate MutH endonuclease homolog. a mismatch by MutS. MutS is closely associated to the replisome through its α interaction with the β-clamp. Upon recognition of a mismatch the MutS dimer However, MutL has been shown to have endonuclease activity adopts its ATP-bound form and becomes a sliding clamp. (B) Recognition of an (42). Strand discrimination is carried out via the recognition of nicks origin-distal hemimethylated GATC site by the MutSLH complex. Through in the nascent strand and it has recently been shown that these nicks continued association with the β-clamp, MutS slides with the replisome and can be generated via the removal of ribonucleotides from this DNA becomes associated with MutL and MutH. This triple complex recognizes the strand (43, 44). The roles of and exonucleases in eukary- first hemimethylated GATC site encountered in the direction of replisome otic MMR are only partially understood. No helicase has been movement. In these figures, mismatches are shown on both the leading and shown to be required for the resection reaction but this does not lagging strands. However, in reality replisome complexes are likely to encounter exclude the possible involvement of helicases (45). The exonuclease a mismatch on either the leading or the lagging strand at a particular time. activity of Exo1 can provide 5′ to 3′ resection in the absence of a helicase (46). Furthermore the endonuclease activity of MutLα can The shorter tracts must be more frequent as they are primarily nick a strand 5′ of a mismatch that was originally nicked on the 3′ responsible for MMR and only need to reach the mismatch for side, allowing access to a 5′ to 3′ such as Exo1 (42, 47). repair to occur. We have no evidence for where they terminate. However, the 3′ to 5′ exonuclease activities of polymerases δ and However, they may indeed terminate within 100 nucleotides e have also been implicated in MMR (48). As in E. coli, the beyond the mismatch, as demonstrated in vitro (15). The longer eukaryotic MutS complexes interact with the sliding clamp (49, 50). excision tracts that lead to zeocin cassette recombination must be Therefore, whatever the transduction pathway to resection and the less frequent as they do not determine the efficiency of MMR final polarity of that resection, we expect that MMR in eukaryotic but extend significantly beyond the mismatch. They are less cells will also show chromosomal directionality. sensitive to the distance between the mismatch and the first Materials and Methods GATC site, suggesting that they are intrinsically longer (more processive resection). We have eliminated the possibility that the Bacterial Strains. The construction and genotypes of the strains used are provided in the Supporting Information. longer excision tracts are caused by the action of the RecQ helicase. Further work is required to determine the molecular MMR Analysis. Fragment length analysis of the TNR is described in the basis of the differences between these lengths of excision tracts. Supporting Information. For example, it is known that several alternative can substitute for each other in excision tract processing (16, 41) and Recombination Analysis. Fluctuation analysis of recombination frequencies this nuclease choice might determine excision tract length. at the zeocin cassette is described in the Supporting Information.

9392 | www.pnas.org/cgi/doi/10.1073/pnas.1505370112 Hasan and Leach Downloaded by guest on October 1, 2021 ACKNOWLEDGMENTS. We thank Elise Darmon and Benura Azeroglu for A.M.M.H. was funded by a studentship from the Darwin Trust of Edinburgh. critical reading of the manuscript and Ewa Okely for bacterial strains. D.R.F.L. was funded by the Medical Research Council (UK).

1. Li G-M (2008) Mechanisms and functions of DNA mismatch repair. Cell Res 18(1): 28. Mitas M, Yu A, Dill J, Haworth IS (1995) The trinucleotide repeat sequence d(CGG)15 85–98. forms a heat-stable hairpin containing Gsyn. Ganti base pairs. Biochemistry 34(39): 2. Jiricny J (2013) Postreplicative mismatch repair. Cold Spring Harb Perspect Biol 5(4): 12803–12811. a012633. 29. Smith GK, Jie J, Fox GE, Gao X (1995) DNA CTG triplet repeats involved in dynamic 3. Gradia S, Acharya S, Fishel R (2000) The role of mismatched nucleotides in activating mutations of neurologically related gene sequences form stable duplexes. Nucleic the hMSH2-hMSH6 molecular switch. J Biol Chem 275(6):3922–3930. Acids Res 23(21):4303–4311. 4. Schofield MJ, et al. (2001) The Phe-X-Glu DNA binding motif of MutS. The role of 30. Yu A, Dill J, Mitas M (1995) The purine-rich trinucleotide repeat sequences d(CAG)15 hydrogen bonding in mismatch recognition. J Biol Chem 276(49):45505–45508. and d(GAC)15 form hairpins. Nucleic Acids Res 23(20):4055–4057. 5. Jiricny J, Su SS, Wood SG, Modrich P (1988) Mismatch-containing oligonucleotide 31. Yu A, et al. (1995) The trinucleotide repeat sequence d(GTC)15 adopts a hairpin duplexes bound by the E. coli mutS-encoded protein. Nucleic Acids Res 16(16): conformation. Nucleic Acids Res 23(14):2706–2714. 7843–7853. 32. Petruska J, Hartenstine MJ, Goodman MF (1998) Analysis of strand slippage in DNA 6. Parker BO, Marinus MG (1992) Repair of DNA heteroduplexes containing small het- polymerase expansions of CAG/CTG triplet repeats associated with neurodegenera- – erologous sequences in Escherichia coli. Proc Natl Acad Sci USA 89(5):1730 1734. tive disease. J Biol Chem 273(9):5204–5210. 7. Sachadyn P (2010) Conservation and diversity of MutS proteins. Mutat Res 694(1-2): 33. Zahra R, Blackwood JK, Sales J, Leach DRF (2007) Proofreading and secondary struc- – 20 30. ture processing determine the orientation dependence of CAG x CTG trinucleotide 8. Su SS, Modrich P (1986) Escherichia coli mutS-encoded protein binds to mismatched repeat instability in Escherichia coli. Genetics 176(1):27–41. – DNA base pairs. Proc Natl Acad Sci USA 83(14):5057 5061. 34. Andreoni F, Darmon E, Poon WCK, Leach DRF (2010) Overexpression of the single- 9. Wang H, Hays JB (2003) Mismatch repair in human nuclear extracts: Effects of internal stranded DNA-binding protein (SSB) stabilises CAG*CTG triplet repeats in an orien- DNA-hairpin structures between mismatches and excision-initiation nicks on mis- tation dependent manner. FEBS Lett 584(1):153–158. – match correction and mismatch-provoked excision. J Biol Chem 278(31):28686 28693. 35. López de Saro FJ, Marinus MG, Modrich P, O’Donnell M (2006) The beta sliding clamp 10. Wang H, Hays JB (2004) Signaling from DNA mispairs to mismatch-repair excision sites binds to multiple sites within MutL and MutS. J Biol Chem 281(20):14340–14349. – despite intervening blockades. EMBO J 23(10):2126 2133. 36. Simmons LA, Davies BW, Grossman AD, Walker GC (2008) Beta clamp directs locali- 11. Urig S, et al. (2002) The Escherichia coli dam DNA methyltransferase modifies DNA in zation of mismatch repair in Bacillus subtilis. Mol Cell 29(3):291–301. a highly processive reaction. J Mol Biol 319(5):1085–1096. 37. Feulner G, et al. (1990) Structure of the rhsA locus from Escherichia coli K-12 and 12. Barras F, Marinus MG (1989) The great GATC: DNA methylation in E. coli. Trends comparison of rhsA with other members of the rhs multigene family. J Bacteriol Genet 5(5):139–143. 172(1):446–456. 13. Lahue RS, Su SS, Modrich P (1987) Requirement for d(GATC) sequences in Escherichia 38. Edwards RA, Keller LH, Schifferli DM (1998) Improved allelic exchange vectors and coli mutHLS mismatch correction. Proc Natl Acad Sci USA 84(6):1482–1486. their use to analyze 987P fimbria gene expression. Gene 207(2):149–157. 14. Bruni R, Martin D, Jiricny J (1988) d(GATC) sequences influence Escherichia coli mis- 39. Minet AD, Rubin BP, Tucker RP, Baumgartner S, Chiquet-Ehrismann R (1999) Ten- match repair in a distance-dependent manner from positions both upstream and GENETICS eurin-1, a vertebrate homologue of the Drosophila pair-rule gene ten-m, is a neu- downstream of the mismatch. Nucleic Acids Res 16(11):4875–4890. ronal protein with a novel type of heparin-binding domain. J Cell Sci 112 (Pt 1): 15. Grilley M, Griffith J, Modrich P (1993) Bidirectional excision in methyl-directed mis- 2019–2032. match repair. J Biol Chem 268(16):11830–11837. 40. Aggarwal K, Lee KH (2011) Overexpression of cloned RhsA sequences perturbs the 16. Burdett V, Baitinger C, Viswanathan M, Lovett ST, Modrich P (2001) In vivo re- cellular translational machinery in Escherichia coli. J Bacteriol 193(18):4869–4880. quirement for RecJ, ExoVII, ExoI, and ExoX in methyl-directed mismatch repair. Proc 41. Kunkel TA, Erie DA (2005) DNA mismatch repair. Annu Rev Biochem 74:681–710. Natl Acad Sci USA 98(12):6765–6770. 42. Kadyrov FA, Dzantiev L, Constantin N, Modrich P (2006) Endonucleolytic function of 17. Ramilo C, et al. (2002) Partial reconstitution of human DNA mismatch repair in vitro: MutLalpha in human mismatch repair. Cell 126(2):297–308. Characterization of the role of human . Mol Cell Biol 22(7): 43. Ghodgaonkar MM, et al. (2013) Ribonucleotides misincorporated into DNA act as 2037–2046. – 18. Lahue RS, Au KG, Modrich P (1989) DNA mismatch correction in a defined system. strand-discrimination signals in eukaryotic mismatch repair. Mol Cell 50(3):323 332. Science 245(4914):160–164. 44. Lujan SA, Williams JS, Clausen AR, Clark AB, Kunkel TA (2013) Ribonucleotides are 19. Thomas DC, Roberts JD, Kunkel TA (1991) Heteroduplex repair in extracts of human signals for mismatch repair of leading-strand replication errors. Mol Cell 50(3): – HeLa cells. J Biol Chem 266(6):3744–3751. 437 443. 20. Blackwell LJ, Bjornson KP, Modrich P (1998) DNA-dependent activation of the 45. Song L, Yuan F, Zhang Y (2010) Does a helicase activity help mismatch repair in eu- – hMutSalpha ATPase. J Biol Chem 273(48):32049–32054. karyotes? IUBMB Life 62(7):548 553. ′ 21. Dzantiev L, et al. (2004) A defined human system that supports bidirectional mis- 46. Genschel J, Bazemore LR, Modrich P (2002) Human exonuclease I is required for 5 and ′ – match-provoked excision. Mol Cell 15(1):31–41. 3 mismatch repair. J Biol Chem 277(15):13302 13311. 22. Modrich P (1989) Methyl-directed DNA mismatch correction. J Biol Chem 264(12): 47. Kadyrov FA, et al. (2007) Saccharomyces cerevisiae MutLalpha is a mismatch repair – 6597–6600. endonuclease. J Biol Chem 282(51):37181 37190. ′–> ′ 23. Au KG, Welsh K, Modrich P (1992) Initiation of methyl-directed mismatch repair. J Biol 48. Tran HT, Gordenin DA, Resnick MA (1999) The 3 5 exonucleases of DNA poly- Chem 267(17):12142–12148. merases delta and epsilon and the 5′–>3′ exonuclease Exo1 have major roles in 24. Blackwood JK, Okely EA, Zahra R, Eykelenboom JK, Leach DRF (2010) DNA tandem postreplication mutation avoidance in Saccharomyces cerevisiae. Mol Cell Biol 19(3): repeat instability in the Escherichia coli chromosome is stimulated by mismatch repair 2000–2007. at an adjacent CAG·CTG trinucleotide repeat. Proc Natl Acad Sci USA 107(52): 49. Flores-Rozas H, Clark D, Kolodner RD (2000) Proliferating cell nuclear antigen and 22582–22586. Msh2p-Msh6p interact to form an active mispair recognition complex. Nat Genet 25. Pukkila PJ, Peterson J, Herman G, Modrich P, Meselson M (1983) Effects of high levels 26(3):375–378. of DNA adenine methylation on methyl-directed mismatch repair in Escherichia coli. 50. Kleczkowska HE, Marra G, Lettieri T, Jiricny J (2001) hMSH3 and hMSH6 interact with Genetics 104(4):571–582. PCNA and colocalize with it to replication foci. Dev 15(6):724–736. 26. Modrich P (1991) Mechanisms and biological effects of mismatch repair. Annu Rev 51. Merlin C, McAteer S, Masters M (2002) Tools for characterization of Escherichia coli Genet 25:229–253. genes of unknown function. J Bacteriol 184(16):4573–4581. 27. Gacy AM, Goellner G, Juranic N, Macura S, McMurray CT (1995) Trinucleotide repeats 52. Spell RM, Jinks-Robertson S (2004) Determination of mitotic recombination rates by that expand in human disease form hairpin structures in vitro. Cell 81(4):533–540. fluctuation analysis in Saccharomyces cerevisiae. Methods Mol Biol 262:3–12.

Hasan and Leach PNAS | July 28, 2015 | vol. 112 | no. 30 | 9393 Downloaded by guest on October 1, 2021