Topological impact of noncanonical DNA structures on Klenow fragment of DNA

Shuntaro Takahashia, John A. Brazierb, and Naoki Sugimotoa,c,1

aFrontier Institute for Biomolecular Engineering Research (FIBER), Konan University, 7-1-20 Minatojima-minamimachi, Kobe 650-0047, Japan; bDepartment of Pharmacy, University of Reading, Whiteknights, Reading, Berkshire RG6 6AD, United Kingdom; and cGraduate School of Frontiers of Innovative Research in Science and Technology (FIRST), Konan University, 7-1-20 Minatojima-minamimachi, Kobe 650-0047, Japan

Edited by Sarah A. Woodson, Johns Hopkins University, Baltimore, MD, and accepted by Editorial Board Member Richard Eisenberg July 31, 2017 (receivedfor review March 13, 2017) Noncanonical DNA structures that stall DNA replication can cause which is highly expressed in cancer cells, forms a very stable errors in genomic DNA. Here, we investigated how the noncanonical i-motif structure in vitro (12, 17). The thermodynamic stability structures formed by sequences in genes associated with a number of the i-motif structure with the Hif1a sequence under slightly of diseases impacted DNA polymerization by the Klenow fragment acidic conditions is comparable to that of G4 formed by the of DNA polymerase. Replication of a DNA sequence forming an opposite strand (18). Because G4 formation in the template i-motif from a , hypoxia-induced transcription factor, and DNA causes breakage of the genomic DNA due to stalling of DNA an insulin-linked polymorphic region was effectively inhibited. On polymerase during replication (19, 20), the presence of an i-motif the other hand, replication of a mixed-type G-quadruplex (G4) from a telomere was less inhibited than that of the antiparallel type or on the template DNA may be responsible for the mechanism of parallel type. Interestingly, the i-motif was a better inhibitor of intractable diseases, including cancer. However, the effects of replication than were mixed-type G4s or hairpin structures, even these topological differences, particularly the i-motif, remain though all had similar thermodynamic stabilities. These results unknown. indicate that both the stability and topology of structures formed In this study, we investigated the effect of structures formed in in DNA templates impact the of a DNA polymerase. This the template strand on the replication reaction. Under slightly suggests that i-motif formation may trigger genomic instability by acidic conditions, i-motif formation decreased the rate of repli- stalling the replication of DNA, causing intractable diseases. cation by Klenow fragment DNA polymerase (KF). Gel elec- trophoretic analyses revealed that the DNA polymerase stalled replication | i-motif | G-quadruplex | thermodynamics | molecular immediately before the i-motif–forming region, indicating that crowding the i-motif is an obstruction to the polymerase. We compared the inhibition resulting from i-motif formation with that resulting oncanonical intramolecular structures of nucleic acids, such from G4 and hairpin formation, and found that the level of in- Nas a triplex and a quadruplex, are stabilized under condi- hibition was determined not only by the stability but also by the tions that mimic the crowded cellular conditions (1), and have topology of noncanonical DNA structures. The i-motif, anti- been detected in cells (2, 3). In vitro and in vivo, guanine- parallel G4s, and parallel G4s repressed replication more ef- quadruplex (G-quadruplex or G4) formation inhibits transcrip- fectively than mixed G4 or hairpins with similar thermodynamic tion and translation of template nucleic acids (4–8). It is possible stabilities. These findings suggest that not only G4 but also the that the noncanonical structures act as “functional codes” trig- gered by different molecular environments, which regulate gene i-motif can induce genomic instability more than other non- expression epigenetically (6, 7). As sequences capable of forming canonical structures, which causes intractable diseases. the noncanonical structures are found in and promoter regions of known oncogenes, alterations of the noncanonical Significance structures could play important roles in the progression of cancer and in other diseases (9). Alterations in nonduplex structures could play roles during One of the remarkable features of noncanonical structures of DNA replication in the progression of cancer and other in- nucleic acids is the diversity of topologies. In the case of G4s, tractable diseases. These noncanonical structures differ topo- antiparallel, mixed, and parallel type structures have been char- logically from each other. However, the role of these differences acterized (Fig. 1 A–C). The sequences complementary to regions in diseases remains unknown. In this study, we found that the capable of G4 formation are composed of tandem repeats of cy- presence of i-motif structures in the template caused the DNA tosine, and these C-rich regions can form a different type of tet- polymerase to stall immediately before these structures. The raplex topology, which is the i-motif (10, 11). An intramolecular i-motif structures are more efficient than other structures, i-motif is formed upon the interaction of four C-rich regions. The such as G-quadruplexes and hairpins, although their thermo- structure has three loops and two parallel-stranded hairpin-like dynamic stabilities are similar. This inhibition effect of the + units stabilized by cytosine and protonated cytosine (C:C )base DNA polymerase was regulated by molecular crowding, which pairs that are vertically intercalated antiparallel to one another mimics conditions in the cell. Therefore, it is possible that the (Fig. 1 D and E). The i-motif structures are categorized into class I i-motif could impede DNA replication or repair and cause BIOPHYSICS AND and class II based on the lengths of the loops. C-rich sequences genomic instability. are found in telomeres and in the promoter regions of about COMPUTATIONAL BIOLOGY Author contributions: S.T., J.A.B., and N.S. designed research; S.T. performed research; S.T. 40% of human genes (12, 13). As the i-motif is stabilized by and N.S. analyzed data; and S.T., J.A.B., and N.S. wrote the paper. + hydrogen bonding in C:C base pairs (14, 15), acidic conditions The authors declare no conflict of interest. stabilize the structure. As the i-motif can mediate transcrip- This article is a PNAS Direct Submission. S.A.W. is a guest editor invited by the tional regulation of B-cell lymphoma 2 (Bcl2) oncogene in cells Editorial Board. (16), the environment inside these cells might be favorable to 1To whom correspondence should be addressed. Email: [email protected].

i-motif formation. Interestingly, the promoter region of the This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. BIOCHEMISTRY gene that encodes hypoxia-induced transcription factor (Hif1a), 1073/pnas.1704258114/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1704258114 PNAS | September 5, 2017 | vol. 114 | no. 36 | 9605–9610 Downloaded by guest on September 24, 2021 on a template containing the i-motif–forming sequence found in the promoter region of the Hif1a gene, which forms a particularly stable i-motif structure in vitro (12, 17), the results were more striking. At pH 7.0, the replication of the Hif1a template was ef- ficient (Fig. 2D), but virtually no full-length product was detected at pH 6.0 (Fig. 2E). We did observe a small amount of a shorter product, indicating that the replication reaction was inhibited after elongation of less than 10 nucleotides (SI Appendix,Fig.S4). It was Fig. 1. Schematic illustrations of topologies of antiparallel G4 (A), mixed G4 shown previously that KF stalls immediately before a G4 structure (B), parallel G4 (C), class I i-motif (D), and class II i-motif (E) having longer on a template (22). To confirm that the shorter products resulted loops than class I are shown. from stalling of the polymerase during progressive replication, we analyzed the replication from FAM-labeled primer S, which was Results 10 bases shorter than the original primer. Reactions using this primer generated the same length of stalled product as those using DNA Replication Reaction. In the template DNA, the structure- the longer primer, indicating that the noncanonical structure did forming sequence was adjacent to the region complementary to not disrupt template loading (SI Appendix,Fig.S5). Thus, during the primer (SI Appendix, Fig. S1). The 5′ terminus of primer replication of the Hif1a template DNA, we confirmed that KF DNA was labeled with fluorescein (FAM) to enable quantifica- tion of product formation. The gap between the region of a was stalled by the i-motif structure formed within the template. We also confirmed replication stall by T7 DNA polymerase, noncanonical structure and the primer-binding region was four SI Appendix bases, which is a length selected to ensure that the structure which replicates the genome of T7 phage ( ,Fig.S6). adopted does not interfere with the formation of the elongation Therefore, stalling of replication at an i-motif can be considered to complex of KF and initial polymerization (21). This design en- generally occur. abled us to discriminate stalled product from unreacted primer Dependence of Replication Efficiency on i-Motif Stability. To un- and full-length product. The i-motif–forming sequences used are derived from the human telomeric sequence (C TA ) , the derstand the relationship between the replication efficiency and i- 3 2 4 motif stability, a UV melting analysis was performed on strands promoter region of the Hif1a gene, a region complementary to a – portion of the insulin-linked polymorphic region (cILPR), which containing only the i-motif forming region of the template se- quences (Table 1 and SI Appendix, Figs. S7 and S8 and Table S2). is a regulatory sequence upstream of the gene encoding insulin, T and the complementary sequence of an oncogene of Bcl2 At pH 6.0, (C3TA2)4 had a melting temperature ( m) of 31.4 °C −ΔG − · −1 (cBcl2). Also tested were G4-forming sequences (T AG ) , Q6, and a free energy change at 37 °C ( °37)of 0.70 kcal mol , 2 3 4 T −ΔG and ILPR (complementary G-rich sequence of the cILPR) and whereas the Hif1a sequence had a m of 57.3 °C and a °37 of · −1 hairpin-forming sequences (H1, H2, and H3). Q6 possesses four 3.1 kcal mol (Table 1). These data suggest that the stalling effect of KF is related to the magnitude of −ΔG° .Wepreviously repeats of the G4 unit linked with the loop of the thrombin- 37 binding aptamer (6). The control was a linear sequence not showed that transcription by RNA polymerase was halted at a expected to form any stable structures. All of the sequences are G4 structure in the template DNA (6) and that the magnitude of shown in SI Appendix, Table S1. Formation of the i-motif and the inhibition depended on the stability of G4. There has been no G4 structures, which have different topologies, was confirmed by quantitative analysis of the effect of stabilities of noncanonical circular dichroism (CD) analyses at 37 °C in the buffer used for structures present in the template DNA on replication, but a replication assays (SI Appendix, Figs. S2 and S3). previous study showed that replication was inhibited at a G4 structure and that the inhibition was more efficient in the presence Replication of DNA Containing an I-Motif Structure. Although in- of a G4-binding ligand expected to increase the stability of the hibition of DNA replication by a G4 formed within a template DNA strand has been studied before (22), there have been no reports on the effects of i-motif structures on DNA replication. Therefore, we first examined replication of a template that con- tained the human telomeric sequence (C3TA2)4 forming the i-motif structure. The replication reaction was carried out with 1 μMKF and 250 μM dNTPs for 0.5 min at 37 °C, and the progress of the reaction was analyzed by quantification of fluorescently la- beled products on denaturing polyacrylamide gel electrophoresis (PAGE). At pH 7.0, a condition in which (C3TA2)4 does not form the i-motif structure, the fluorescently labeled primer was com- pletely converted to a longer product within 0.5 min (Fig. 2A). By staining of DNA using SYBR Gold dye, we confirmed that the fluorescently labeled replication product had the same mobility as the template DNA (SI Appendix,Fig.S4). Thus, the replication of the (C3TA2)4-containing template yielded a full-length product. Under a slightly acidic condition at pH 6.0, replication of the i-motif–forming template was repressed (Fig. 2B). A band that Fig. 2. Representative PAGE analyses of replication reactions with (C3TA2)4 migrated just above the primer band was observed in the reaction template DNA at pH 7.0 (A), (C3TA2)4 template DNA at pH 6.0 (B), linear tem- plate DNA at pH 6.0 (C), Hif1a template DNA at pH 7.0 (D), and Hif1a template with the (C3TA2)4-containing template at pH 6.0 (Fig. 2B). The presence of this band indicates that the replication was stalled after DNA at pH 6.0 (E) are shown. The images were captured using a fluorescent primer extension of less than 10 nucleotides (SI Appendix,Fig.S4). imager; thus, only DNA containing the fluorescent label is visualized. Images of the same gels stained to visualize all DNA are shown in SI Appendix,Fig.S2. Such a short band was not observed in the case of the replication of μ μ μ C Reactions were carried out with 1 MKF,1 M DNAs, and 250 MdNTPsin the linear template (Fig. 2 ).Thesedatasuggestthatthei-motif 40 mM MES buffer (pH 7.0 or pH 6.0) for 2 min at 37 °C. The asterisk indicates structure formed at low pH transiently stalled replication imme- the position of major stalled product. The values at the bottom of the gel diately before the structure. When the experiment was performed image are the percentages of full-length product.

9606 | www.pnas.org/cgi/doi/10.1073/pnas.1704258114 Takahashi et al. Downloaded by guest on September 24, 2021 Table 1. Thermodynamic and kinetic parameters of the stalled product was observed at 0.16 min (SI Appendix, Fig. structured sequence region on each template DNA at pH 6.0 S10A). The amount of stalled product was decreased at 0.33 min −1 −1 and had almost disappeared by 1 min. On the other hand, full- DNA topology Sequence Tm,°C −ΔG°37, kcal·mol ks,min length product was detected at 0.33 min, and the conversion to ± − ± ± i-motif (C3TA2)4 31.4 0.1 0.70 0.01 4.5 0.1 full-length product was complete at 2 min. In the case of the Hif1a 57.3 ± 0.1 3.1 ± 0.2 0.39 ± 0.11 replication of the Hif1a sequence, stalled product was detected Hairpin H2 75.8 ± 2.2 4.0 ± 0.2 3.7 ± 0.1 at 0.5 min and the amount gradually decreased with time. A ± ± ± Mixed G4* (T2AG3)4 62.8 0.5 3.0 0.2 2.6 0.3 small amount of full-length product was detected at 1 min, and Parallel G4 ILPR 58.1 ± 0.2 2.1 ± 0.3 0.54 ± 0.4 the amount increased at longer time points (SI Appendix, Fig. Antiparallel G4 Q6 71.1 ± 0.3 4.5 ± 0.1 0.08 ± 0.03 S10B). For both the (C3TA2)4 and Hif1a templates, the increase All experiments were performed in 40 mM MES (pH 6.0) and 8 mM MgCl in full-length product corresponded to a decrease in amount of 2 A without both KCl and PEG, except for the case of (T2AG3)4. Oligonucleotides the stalled product (Fig. 3 ). The generation curves of full- corresponding to the structured region of indicated templates were evalu- length product were analyzed by global fitting (SI Appendix, μ −1 ated at 10 M strand concentration. Fig. S11) and rate constants, ks (min ) for the reaction step of *Experiments for (T2AG3)4 were performed in 40 mM MES (pH 6.0), 8 mM the stall of replication, including the prior replication to the − MgCl2, and 30 mM KCl. 1 stalling position and kf (min ) for the reaction step of gener- ating full-length product after resolving the replication stall were k −1 −1 G4 structure (5). The data reported here suggest that the i-motif calculated. The s values were 4.5 min and 0.38 min at 37 °C can also inhibit replication in a stability-dependent manner. and pH 6.0 for the templates containing the (C3TA2)4 sequence and the Hif1a sequence, respectively (Table 1). These results Effect of Molecular Crowding on the Replication of i-Motif–Forming indicate that KF took about a 12-fold longer time to overcome DNA. To further investigate the replication of i-motif–forming the more stable i-motif formed by the Hif1a sequence than to overcome that formed by the (C3TA2)4 sequence. The kf values DNA in cell-mimicking conditions, we examined replication in −1 −1 more physiologically relevant conditions in the presence of salt were 4.5 min and 6.3 min at 37 °C and pH 6.0 for the tem- and crowding reagents using polyethylene glycol (PEG). In the plates containing the (C3TA2)4 sequence and the Hif1a se- presence of 100 mM KCl and 20 wt% PEG200 (average molecular quence, respectively (SI Appendix, Table S2), suggesting that the weight is 200) at pH 6.0, the amount of full-length product rep- replication stall did not occur for the (C3TA2)4 sequence but that licated from the template containing the (C3TA2)4 sequence was the stalling step was the rate-determining step in the replication quite similar to that in the absence of PEG200 (SI Appendix, Fig. of the Hif1a sequence. S9A). Because the higher molecular weight of PEG shows more Fig. 3B shows a plot of the logarithm of replication rate con- k −ΔG stabilization of i-motif due to the excluded volume effect (23, 24), stant (ln s)vs. °37 for the template sequences investigated. the replication might not be effectively stalled by the addition of 20 wt% PEG200. In fact, the addition of 20 wt% PEG200 did not increase thermal stability of the (C3TA2)4 i-motif, whose Tm was −1 28.7 °C and −ΔG°37 was −1.4 kcal·mol (SI Appendix,TableS2). In the presence of 20 wt% PEG1000 (average molecular weight is 1,000) at pH 6.0, a short product was detected on PAGE (SI Appendix,Fig.S9B), indicating that KF was stalled immediately before the i-motif structure. The stabilizing effect of PEG1000 was observed in the melting temperature of (C3TA2)4 (SI Appendix, Fig. S8 and Table S2). The Tm in the presence of 20 wt% −1 PEG1000 was 37.3 °C and the −ΔG°37 was 0.1 kcal·mol .Inthe case of the Hif1a sequence, the thermodynamic stability was also not affected in the presence of PEG200 (Tm = 51.2 °C and −1 −ΔG°37 = 3.1 kcal·mol ) but increased in the presence of −1 PEG1000 (Tm = 59.7 °C and −ΔG°37 = 4.2 kcal·mol )(SI Ap- pendix,Fig.S8andTableS2). The Hif1a sequence stalled KF im- mediately before the i-motif structure in the presence of PEG1000, although PEG200 did not have an effect (SI Appendix,Fig.S9C and D). Interestingly, stalled replication could be observed even at pH 7.0 in the presence of 100 mM KCl and 20 wt% PEG1000 (SI Appendix,Fig.S9E). Although the detailed mechanism of stabili- zation of the i-motif structure by PEG1000 is unknown, the higher molecular-weight PEG shifts the pKa ofN3inthecytosinebase(23, 24). As the i-motif structure is very compact compared with a Fig. 3. Kinetics of replication of DNA templates containing noncanonical random coil, a more pronounced excluded volume effect of higher structures. (A) Ratios of stalled (dotted lines) and full-length (solid lines) to

molecular-weight PEG1000 compared with that of PEG200 may total product of (C3TA2)4 (blue), Hif1a (red), and linear (green) templates as a

−Δ BIOPHYSICS AND contribute to the observed effect of PEG1000 on the replication of function of time. (B) Plot of G°37 values vs. lnks for reactions to dissolve – the stall from reaction start along i-motif–forming templates (blue), G4-

an i-motif forming DNA. In summary, our results suggest that the COMPUTATIONAL BIOLOGY cellular condition crowded with biomacromolecules stabilizes the forming templates (red), and linear template (green). Data on the G4 ILPR i-motif structure, which facilitates the inhibition of replication. template were excluded from fitting of the G4 data. (C) Ratios of stalled product and full-length product to total products of (T2AG3)4 replication in the presence of 1 mM KCl (blue) and 30 mM KCl (red) and of linear (green) Replication Rate Is Inversely Correlated with Stability of i-Motif. To replication in the absence of KCl as a function of time. (D) Ratios of full- further analyze the correlation between the thermodynamic length product to total product of Hif1a in the presence of 100 mM KCl and stability and replication rate, we evaluated the amount of prod- 20 wt% PEG200 (blue) and in the presence of 100 mM KCl and 20 wt%

uct as a function of time (SI Appendix, Fig. S10). In the reaction PEG1000 (red). All of the reactions were carried in 40 mM MES (pH 6.0), BIOCHEMISTRY μ μ μ with the (C3TA2)4 template in the absence of PEG at pH 6.0, 8 mM MgCl2,1 M KF, 1 M DNAs, and 250 M dNTPs at 37 °C.

Takahashi et al. PNAS | September 5, 2017 | vol. 114 | no. 36 | 9607 Downloaded by guest on September 24, 2021 The −ΔG°37 value for the linear control template was assumed to ks values differed, indicating that the replication stall depended on be zero. This plot includes data on an additional i-motif formed by not only the stability but also the topology. the cILPR. The cILPR is a portion of the complementary se- The plots of lnks vs. −ΔG°37 for hairpin DNAs and (T2AG3)4 quence of the insulin-linked polymorphic region, which forms a at 1, 10, 30, and 50 mM KCl were linear, as shown in the case of B k −ΔG previously characterized i-motif (18). The values of lnks decreased i-motif templates (Fig. 3 ). The slope of the ln s vs. °37 − − with increasing −ΔG°37, and there is a good linear correlation. As was 0.048 for hairpin structures and 0.28 for (T2AG3)4, which ΔG°isequalto−RT lnK (R is a gas constant and K is the equi- is different from that of −0.83 for the i-motif, indicating that the ΔG ‡ librium constant), ΔG° also can be described as −RT ln(k1/k−1), activation free energy ( °37 ) required to unwind the i-motif where k1 is the rate constant of structure formation and k−1 is the structures was about 17-fold or threefold higher than that re- deformation rate constant. Because the stabilities of DNA struc- quired to unwind hairpin or (T2AG3)4 G4, which forms a mixed tures depend on hydrogen bonding and stacking interactions be- structure (SI Appendix, Fig. S2). The data for the template tween bases, k1 values for DNAs of similar chain lengths are containing the G4 ILPR, which forms a parallel structure (29), similar; thus, k−1 is a dominant factor in stability (25). Therefore, and Q6 sequences at 1 mM KCl, which form an antiparallel structure, also showed a different stability tendency and k value the linear relationship between lnks and −ΔG°37 indicates that the s replication rate through an i-motif is mainly proportional to the (Table 1). Those plots did not fall on the line with data for (T AG ) -containing templates. This suggests that the topology rate of the unfolding of the i-motif structure (k−1). As the tem- 2 3 4 of G4 affects the efficiency of unwinding. Interestingly, the data perature was constant in our analysis, we infer that lnks is pro- ‡ ‡ portional to the activation free energy, ΔG .TheΔG of the for the ILPR and Q6 template showed a correlation with data on ‡ i-motifs, suggesting that the i-motif is a strong block for repli- unfolding of a DNA duplex (ΔG off) has a linear correlation with the stability of the DNA duplex (−ΔG°) (26), and the activation cation as well as parallel/antiparallel G4s. To investigate the effect of i-motif topology on replication, we free energy of dissociation of a DNA duplex depends on the se- E quence (GC content) and length (27, 28). Therefore, to unwind tested the class II i-motif cBcl2 sequence (Fig. 1 ) that is found in ‡ structured DNAs, the polymerase must decrease the ΔG or rely the promoter region of the Bcl2 oncogene. In this case, we ob- off served two major bands shorter than the full-length product (SI on spontaneous dissociation of the structured DNA. It is possible Appendix A that reduced processivity due to low pH and PEG may be ,Fig.S15). The Bcl2 sequence can adopt i-motif the factor in the replication stall. Low pH and PEG do affect structures of different conformations, and the sequence can also form a hairpin structure, due to the presence of six cytosine tracts. enzyme processivity because the ks value of linear template at −1 If multiple conformers are present, more than one stalled product pH 7.0 in the absence of salt and PEG was 18 min ,whichwasabout might be observed. Moreover, there were unexpected smeared fivefold larger than that at pH 6.0 (SI Appendix, Figs. S10 and bands above the full-length product band. Because a nascent DNA S11 and Table S2). However, the slope of the lnks vs. −ΔG°37 plot ‡ product from the replication of a triplet repeat, such as (CGG) or of i-motifs, which indicates the magnitude of ΔG for the un- n (CAG) , which forms a hairpin-like structure, potentially causes winding process of the i-motif by KF, showed similar values at n − longer product than the template (30), the transient hairpin pH 6.0 in the absence of salt and PEG ( 0.83) and in the presence structure of the nascent DNA product from the cBcl2 template of 100 mM KCl without PEG (−0.73), and at pH 7.0 in the presence − SI Appendix might increase the size of product. Therefore, we calculated the of 100 mM KCl and 20 wt% PEG1000 ( 0.55) ( ,Figs. rate of full-length product formation, including the smeared S12–S14) (PEG200 is discussed in the next section). The similar bands. As a result, the lnks value was −0.18, and −ΔG°37 value was − value obtained for the slope indicated that the unwinding of 3.0 kcal·mol 1 (SI Appendix,Fig.S15B and C). On the lnk vs. i-motif was dominated by the same mechanism, not by enzyme s −ΔG°37 plot, the cBcl2 i-motif conformed to the linear correlation properties, in various conditions. Thus, the increased stalling at of class I i-motifs (Fig. 3B), which indicates that the topology of the i-motif was due to folding of the i-motif and not to reduced the i-motif does not influence the replication stall. Based on these enzyme processivity at low pH or in PEG. experiments, we conclude that the class II i-motifs with longer loops and class I motifs are unwound by a similar mechanism. Effect of Topology of Structured DNAs on the Replication Reaction. The topology and stability of noncanonical structures are also We next analyzed the replication of template DNAs having se- affected by the crowding condition. Experiments were also per- quences that can form different structural topologies. First, we formed in the presence of 20 wt% PEG200 (Fig. 3D and SI Ap- investigated the replication of a template DNA containing a pendix,Fig.S12).Inthecaseofthei-motif,theslopeoflnk vs. − hairpin structure. The sequences contained stems of 4 (H1), 9 s ΔG°37 was −0.12 (SI Appendix,Fig.S16), whose magnitude was (H2), and 12 (H3) base pairs, and all had a four-nucleotide loop about sixfold smaller than that in the absence of PEG200 (−0.74) −ΔG characterized previously (6). The °37 values obtained from (SI Appendix,Fig.S14B). Thus, the stalling effect of the i-motif UV melting experiments in 40 mM MES (pH 6.0) and 8 mM decreased in 20 wt% PEG200 compared with conditions without · −1 MgCl2 were 2.2, 4.0, and 8.3 kcal mol for H1, H2, and H3, re- PEG200. Our recent study suggested that ethylene glycol binds to SI Appendix spectively (Table 1 and , Fig. S7 and Table S2). Thus, the duplex and disrupts the hydrogen network around Watson– each of these hairpins is more stable than the i-motif structure Crick base pairing (31). All stabilities of the i-motif in the presence −ΔG − · −1 adopted by (C3TA2)4, which has a °37 of 0.70 kcal mol . of 20 wt% PEG200 decreased compared with stabilities in corre- Interestingly, KF did not stall on any of the hairpin-containing sponding conditions without PEG200. Therefore, PEG200 might templates (SI Appendix,Fig.S10). uniquely interact with the base pairs of the i-motif and decrease We also analyzed replication of templates that contained regions stability, thereby increasing replication efficiency. On the other able to form G4 in different KCl concentrations. For the template hand, (T2AG3)4 was transformed from a mixed to a parallel to- containing (T2AG3)4 in the presence of 30 mM KCl at 37 °C and pology in the presence of 20 wt% PEG200 and 30 mM KCl (32) pH 6.0, stalled product was observed at 0.16 min (Fig. 3C and SI (SI Appendix, Fig. S3). Replication was effectively repressed and Appendix ,Fig.S10) and the full-length product increased corre- showed a similar slope of the lnks vs. −ΔG°37 plot (−0.87) to that spondingly. UV melting showed that the −ΔG°37 value of of the i-motif observed in the absence of PEG200 (−0.83) (SI −1 (T2AG3)4 was 3.0 kcal·mol in the presence of 30 mM KCl (Table Appendix, Figs. S14A and S16). Thus, this transformation may 1 and SI Appendix, Fig. S7). We also confirmed that the stalled inhibit replication more efficiently, as observed in the case of product was due to a replication block at the G4 structure (SI ILPR. In the presence of PEG1000, the replication of i-motifs Appendix D ,Fig.S5). Interestingly, although Hif1a, H2, and was effectively repressed (Fig. 3D). The slope of lnks vs. −ΔG°37 (T2AG3)4 in 30 mM KCl showed similar stabilities (Table 1), the was −0.56, which was close to that observed in the absence of

9608 | www.pnas.org/cgi/doi/10.1073/pnas.1704258114 Takahashi et al. Downloaded by guest on September 24, 2021 PEG200 (−0.83) (SI Appendix,Figs.S14C and S16). Therefore, G4 structures, and the loop structure of the i-motif likely presents + a specific molecular environment with various conditions of a steric obstruction to unwinding of C:C base pairs (Fig. 4B). + molecular crowding regulates the processivity of KF along a Furthermore, in the i-motif structure, two parallel-strand C:C template DNA based on the activation free energy for un- base pairs mutually intercalate with each other and form the tet- winding by changing the stability and the topology of the DNA raplex. In this conformation, the base pair next to the terminal structure formed. base pair belongs to the other parallel strand. Therefore, the consecutive unzipping from the terminal base pairs by breathing Discussion should be repressed due to the stacking by the intercalated base In this study, we found a linear correlation between the lnks and the pairs from the other parallel strand. Replication through an stability (−ΔG°37) of particular structural topologies of non- i-motif–forming region may require the complete unfolding of the canonical structures in the template DNA, indicating that the rate- structure, leading to our observations. This might be a reason why limiting step of structure unwinding by KF is determined by both G4 showed different enzyme processivities. In the case of mixed the stability and topology of the structure formed within the DNA G4, although the stability of a terminal quartet is presumably template. We reported previously that the processivity of T7 RNA higher than that of a terminal base pair of a duplex, once the polymerase in the transcription reaction was influenced by both terminal quartet is unzipped by breathing, the stability of the hairpin and G4 structures formed within the template DNAs (6). overall structure is decreased and KF can proceed (Fig. 4C). As The rates of transcription reaction were correlated with the stabil- for other G4 topologies, the breathing of the terminal quartet ities of the structures, and the lack of relationship with topology may might be repressed, because metastable intermediates formed due indicate that the unwinding mechanisms of RNA polymerase and to strand slippage via triplex formation occur during unfolding KF are different (33). Some G4-specific show topological (39) (Fig. 4D). Data on the ILPR and Q6 G4s fall on the line of specificity. For example, the telomere TPP1 more efficiently lnks vs. −ΔG°37 data on i-motif–forming sequences. This suggests unwinds antiparallel G4 structures like Q6 than it does parallel G4 that it is also possible that metastable intermediates in the (34). In contrast, RNA RHAU preferentially binds to unfolding reaction might prolong dissociation of the i-motif. parallel G4 structures like that of ILPR (35). These results imply In cells, helicases assist DNA polymerase in replication of that the processivity of DNA polymerase requires the assistance of structured DNAs. A variety of helicases that unwind G4 have been different types of helicases able to unwind various topologies. identified, and defects in helicases that unwind these structures are KF has an ability to unwind hairpin structures, presumably associated with genetic diseases (9). This phenomenon implies that through an unzipping mechanism, which is essentially the reverse the topological properties of structures formed in DNA templates of the folding process. Based on the nearest neighbor model, are important. A recent study showed that the antiparallel G4 on stability of a duplex region of a hairpin is determined by the se- the leading strand with Tm values even less than physiological quence and number of stacked base pairs (36, 37). Terminal base temperature could cause genomic instability (40). This phenome- pairs “breathe” and are easily dissociated (38). Therefore, KF may non implies that the replication stall due to the topology of non- induce unzipping of a hairpin in a mechanism that does not de- canonical structures is also influenced by helicase polarity. Helicases pend on the overall free energy of the hairpin structure (Fig. 4A). that unwind i-motif structures have not yet been identified. How- In the case of the i-motif, the structural topology encountered by ever, i-motif–binding , such as hnRNPA1 and hnRNPLL, the polymerase is quite different from those in hairpin and mixed induce unwinding of i-motif structures (41–43). It has also been BIOPHYSICS AND COMPUTATIONAL BIOLOGY

Fig. 4. Proposed mechanism of unwinding of structured DNAs in templates containing various structures. (A) Terminal base pair of a hairpin is relatively unstable. Thus, polymerase can induce unzipping one base pair at a time to proceed through a hairpin. (B) In the i-motif structure, the loop region can be a block for polymerase, and the base pair has a different topology from the first, which likely stalls the polymerase. (C) Replication of mixed G4 structure presents an obstacle as the terminal quartet is more stable than the terminal base pair of a hairpin. Once the terminal G-quartet is unzipped, the stability of

G4 is significantly reduced and replication proceeds rapidly. (D) Parallel/antiparallel G4s have a different unfolding pathway, which repressed the terminal BIOCHEMISTRY breathing and enzyme processivity as well as the i-motif.

Takahashi et al. PNAS | September 5, 2017 | vol. 114 | no. 36 | 9609 Downloaded by guest on September 24, 2021 suggested that negative supercoiling induced by RNA inducing formation of specific topologies that stall replication. promote i-motif formation in cells (44). Our data suggest that i-motif Development of topology-specific binders may enable site-specific formation can block replication, and this, in turn, may result in ge- control of genomic instability and expression of genes triggered by nomic instability. Genomic instability is associated with cancer de- the noncanonical structures targeted. velopment and also with chronic diseases, such as diabetes (45, 46). It is possible that mutation in or dysregulation of expression of Materials and Methods i-motif–binding proteins or the acidic pH of the cancer cell The characterization of structure and melting analysis of DNAs were in- environment may promote or lead to stabilization of i-motif vestigated by CD and UV spectrometry. In the replication assay, FAM-labeled formation. Moreover, there exists a lower number of the free DNA was used as a primer strand for the quantification of replicated product water in tumor cells than normal cells (47), which might also by PAGE analyses. Detailed information about the materials and methods used stabilize i-motif formation. in this study is provided in SI Appendix, Materials and Methods. In conclusion, template DNA that forms an i-motif especially decreased the processivity of DNA polymerase through a mech- ACKNOWLEDGMENTS. We thank M. Izumi, J. Inoue, and A. Matsuyama for their help. This work was supported by grants-in-aid for scientific research anism likely due to the unique topology of base pairs that stabilize from the Ministry of Education, Culture, Sports, Science and Technology the i-motif. Unwinding of i-motif structure by processive activity of ‡ (MEXT) and Japan Society for the Promotion of Science (JSPS), especially a KF had an activation free energy barrier (ΔG°37 ) about threefold Grant-in-Aid for Scientific Research on Innovative Areas “Chemistry for Mul- higher than that for unwinding of mixed G4s. The stalling was timolecular Crowding Biosystems” (JSPS KAKENHI Grant JP17H06351); MEXT- correlated with stability of the i-motif structure and was regulated Supported Program for the Strategic Research Foundation at Private Universities (2014–2019), Japan; The Hirao Taro Foundation of Konan Gakuen for Aca- by crowding molecule size. Therefore, it is possible that the i-motif demic Research; The Okazaki Kazuo Foundation of Konan Gakuen for Ad- may cause genomic instability in cancer cells. Our physicochemical vanced Scientific Research; The Chubei Itoh Foundation; and the Hyogo approach can be applied to the screening of the ligands capable of Science and Technology Association.

1. Nakano S, Miyoshi D, Sugimoto N (2014) Effects of molecular crowding on the 25. Howorka S, Movileanu L, Braha O, Bayley H (2001) Kinetics of duplex formation for structures, interactions, and functions of nucleic acids. Chem Rev 114:2733–2758. individual DNA strands within a single protein nanopore. Proc Natl Acad Sci USA 98: 2. Biffi G, Tannahill D, McCafferty J, Balasubramanian S (2013) Quantitative visualization 12996–13001. of DNA G-quadruplex structures in human cells. Nat Chem 5:182–186. 26. Rauzan B, et al. (2013) Kinetics and thermodynamics of DNA, RNA, and hybrid duplex 3. Fadloun A, et al. (2013) Chromatin signatures and retrotransposon profiling in mouse formation. Biochemistry 52:765–772. embryos reveal regulation of LINE-1 by RNA. Nat Struct Mol Biol 20:332–338. 27. Crothers DM, Bloomfield VA, Tinoco I (2000) Nucleic Acids: Structures, Properties, and 4. Bochman ML, Paeschke K, Zakian VA (2012) DNA secondary structures: Stability and Functions (University Science Books, Sausalito, CA). function of G-quadruplex structures. Nat Rev Genet 13:770–780. 28. Gu XB, Nakano S, Sugimoto N (2007) Consecutive GC base pairs determine the energy 5. Siddiqui-Jain A, Grand CL, Bearss DJ, Hurley LH (2002) Direct evidence for a barrier of DNA duplex formation under molecularly crowded conditions. Chem G-quadruplex in a promoter region and its targeting with a small molecule to repress Commun (Camb) (26):2750–2752. c-MYC transcription. Proc Natl Acad Sci USA 99:11593–11598. 29. Yu Z, et al. (2009) ILPR G-quadruplexes formed in seconds demonstrate high me- 6. Tateishi-Karimata H, Isono N, Sugimoto N (2014) New insights into transcription fi- chanical stabilities. J Am Chem Soc 131:1876–1882. delity: Thermal stability of non-canonical structures in template DNA regulates 30. Mirkin EV, Mirkin SM (2014) To switch or not to switch: At the origin of repeat ex- – transcriptional arrest, pause, and slippage. PLoS One 9:e90580. pansion disease. Mol Cell 53:1 3. 7. Endoh T, Kawasaki Y, Sugimoto N (2013) Suppression of gene expression by 31. Nakano M, et al. (2015) Thermodynamic properties of water molecules in the pres- G-quadruplexes in open reading frames depends on G-quadruplex stability. Angew ence of cosolute depend on DNA structure: A study using grid inhomogeneous sol- – Chem Int Ed Engl 52:5522–5526. vation theory. Nucleic Acids Res 43:10114 10125. 8. Bugaut A, Balasubramanian S (2012) 5′-UTR RNA G-quadruplexes: Translation regu- 32. Yu H, Gu X, Nakano S, Miyoshi D, Sugimoto N (2012) Beads-on-a-string structure of long – lation and targeting. Nucleic Acids Res 40:4727–4741. telomeric DNAs under molecular crowding conditions. JAmChemSoc134:20060 20069. 9. Wu Y, Brosh RM, Jr (2010) G-quadruplex nucleic acids and human disease. FEBS J 277: 33. Yin YW, Steitz TA (2002) Structural basis for the transition from initiation to elon- gation transcription in T7 RNA polymerase. Science 298:1387–1395. 3470–3488. 34. Ray S, Bandaria JN, Qureshi MH, Yildiz A, Balci H (2014) G-quadruplex formation in 10. Benabou S, Aviñó A, Eritja R, González C, Gargallo R (2014) Fundamental aspects of telomeres enhances POT1/TPP1 protection against RPA binding. Proc Natl Acad Sci the nucleic acid i-motif structures. RSC Advances 4:26956–26980. USA 111:2990–2995. 11. Phan AT, Mergny JL (2002) Human telomeric DNA: G-quadruplex, i-motif and Watson- 35. Heddi B, Cheong VV, Martadinata H, Phan AT (2015) Insights into G-quadruplex Crick double helix. Nucleic Acids Res 30:4618–4625. specific recognition by the DEAH-box helicase RHAU: Solution structure of a 12. Brazier JA, Shah A, Brown GD (2012) I-motif formation in gene promoters: Unusually peptide-quadruplex complex. Proc Natl Acad Sci USA 112:9608–9613. stable formation in sequences complementary to known G-quadruplexes. Chem 36. Turner DH, Sugimoto N, Kierzek R, Dreiker SD (1987) Free energy increments for Commun (Camb) 48:10739–10741. hydrogen bonds in nucleic acid base pairs. J Am Chem Soc 109:3783–3785. 13. Kendrick S, Hurley LH (2010) The role of G-quadruplex/i-motif secondary structures as 37. Turner DH, Sugimoto N, Freier SM (1988) RNA structure prediction. Annu Rev Biophys cis-acting regulatory elements. Pure Appl Chem 82:1609–1621. Biophys Chem 17:167–192. 14. Gehring K, Leroy J-L, Guéron M (1993) A tetrameric DNA structure with protonated 38. Phelps C, Lee W, Jose D, von Hippel PH, Marcus AH (2013) Single-molecule FRET and cytosine.cytosine base pairs. Nature 363:561–565. linear dichroism studies of DNA breathing and helicase binding at replication fork 15. Phan AT, Guéron M, Leroy J-L (2000) The solution structure and internal motions of a junctions. Proc Natl Acad Sci USA 110:17320–17325. fragment of the cytidine-rich strand of the human telomere. J Mol Biol 299:123–144. 39. Stadlbauer P, Krepl M, Cheatham TE, 3rd, Koca J, Sponer J (2013) Structural dynamics 16. Kendrick S, et al. (2014) The dynamic character of the BCL2 promoter i-motif provides of possible late-stage intermediates in folding of quadruplex DNA studied by mo- a mechanism for modulation of gene expression by compounds that bind selectively lecular simulations. Nucleic Acids Res 41:7128–7143. – to the alternative DNA hairpin structure. J Am Chem Soc 136:4161 4171. 40. Schiavone D, et al. (2014) Determinants of G quadruplex-induced epigenetic in- 17. Gurung SP, Schwarz C, Hall JP, Cardin CJ, Brazier JA (2015) The importance of loop stability in REV1-deficient cells. EMBO J 33:2507–2520. – length on the stability of i-motif structures. Chem Commun (Camb) 51:5630 5632. 41. Kang HJ, Kendrick S, Hecht SM, Hurley LH (2014) The transcriptional complex be- 18. Dhakal S, et al. (2012) G-quadruplex and i-motif are mutually exclusive in ILPR double- tween the BCL2 i-motif and hnRNP LL is a molecular switch for control of gene ex- – stranded DNA. Biophys J 102:2575 2584. pression that can be modulated by small molecules. J Am Chem Soc 136:4172–4185. 19. Brosh RM, Jr (2013) DNA helicases involved in DNA repair and their roles in cancer. 42. Roy B, et al. (2016) Interaction of individual structural domains of hnRNP LL with the – Nat Rev Cancer 13:542 558. BCL2 promoter i-motif DNA. J Am Chem Soc 138:10950–10962. 20. Paeschke K, et al. (2013) Pif1 family helicases suppress genome instability at G-quadruplex 43. Miglietta G, Cogoi S, Pedersen EB, Xodo LE (2015) GC-elements controlling HRAS motifs. Nature 497:458–462. transcription form i-motif structures unfolded by heterogeneous ribonucleoprotein 21. Beese LS, Derbyshire V, Steitz TA (1993) Structure of DNA polymerase I Klenow particle A1. Sci Rep 5:18097. fragment bound to duplex DNA. Science 260:352–355. 44. Brooks TA, Hurley LH (2009) The role of supercoiling in transcriptional control of MYC 22. Chen H, et al. (2014) Exploring the formation and recognition of an important and its importance in molecular therapeutics. Nat Rev Cancer 9:849–861. G-quadruplex in a HIF1α promoter and its transcriptional inhibition by a benzo[c]phe- 45. Palazzo RP, Bagatini PB, Schefer PB, de Andrade FM, Maluf SW (2012) Genomic in- nanthridine derivative. JAmChemSoc136:2583–2591. stability in patients with type 2 diabetes mellitus on hemodialysis. Rev Bras Hematol 23. Rajendran A, Nakano S, Sugimoto N (2010) Molecular crowding of the cosolutes in- Hemoter 34:31–35. duces an intramolecular i-motif structure of triplet repeat DNA oligomers at neutral 46. Gaillard H, García-Muse T, Aguilera A (2015) Replication stress and cancer. Nat Rev pH. Chem Commun (Camb) 46:1299–1301. Cancer 15:276–289. 24. Cui J, Waltman P, Le VH, Lewis EA (2013) The effect of molecular crowding on the stability 47. Damadian R, Zaner K, Hor D, DiMaio T (1974) Human tumors detected by nuclear of human c-MYC promoter sequence I-motif at neutral pH. Molecules 18:12751–12767. magnetic resonance. Proc Natl Acad Sci USA 71:1471–1473.

9610 | www.pnas.org/cgi/doi/10.1073/pnas.1704258114 Takahashi et al. Downloaded by guest on September 24, 2021