A short adaptive path from DNA to RNA polymerases

Christopher Cozens(a), Vitor B. Pinheiro(a), Alexandra Vaisman(b), Roger Woodgate(b), and Philipp Holliger(a,1)

(a) Medical Research Council Laboratory of Molecular Biology, Cambridge CB2 0QH, United Kingdom; and (b) Section on DNA Replication, Repair, and Mutagenesis, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD 20892

Edited by John Kuriyan, University of California, Berkeley, CA, and approved April 10, 2012 (received for review December 21, 2011)

BIOCHEMISTRY

DNA polymerase substrate specificity is fundamental to genome integrity and to polymerase applications in biotechnology. In the current paradigm, active site geometry is the main site of specificity control. Here, we describe the discovery of a distinct specificity checkpoint located over 25 Å from the active site in the polymerase thumb subdomain. In Tgo, the replicative DNA polymerase from Thermococcus gorgonarius, we identify a single mutation (E664K) within this region that enables translesion synthesis across a template abasic site or a cyclobutane thymidine dimer. In conjunction with a classic “steric-gate” mutation (Y409G) in the active site, E664K transforms Tgo DNA polymerase into an RNA polymerase capable of synthesizing RNAs up to 1.7 kb long as well as fully pseudouridine-, 5-methyl-C-, 2′-fluoro-, or 2′-azido-modified RNAs primed from a wide range of primer chemistries comprising DNA, RNA, locked nucleic acid (LNA), or 2′O-methyl-DNA. We find that E664K enables RNA synthesis by selectively increasing polymerase affinity for the noncognate RNA/DNA duplex as well as lowering the Km for ribonucleotide triphosphate incorporation. This gatekeeper mutation therefore identifies a key missing step in the adaptive path from DNA to RNA polymerases and defines a previously unknown postsynthetic determinant of polymerase substrate specificity with implications for the synthesis and replication of noncognate nucleic acid polymers.

processivity | protein engineering | second gate

Author contributions: P.H. designed research; C.C., V.B.P., and A.V. performed research; C.C., V.B.P., A.V., R.W., and P.H. analyzed data; and C.C. and P.H. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Freely available online through the PNAS open access option.
(1) To whom correspondence should be addressed. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1120964109/-/DCSupplemental.

Replicative polymerases require extraordinary specificity in substrate selection, incorporation, and replication both to ensure fidelity and to exclude noncognate and/or damaged nucleotides from the genome. A particular threat to DNA genome integrity are ribonucleotide triphosphates (NTPs), which are present in the cell at concentrations up to 100-fold in excess of the cognate deoxyribonucleotide triphosphates (dNTPs) (1–3) yet differ from them only by the presence of a 2′-hydroxyl (-OH) group. Indeed, although DNA polymerases have evolved to exclude NTPs from their active sites, incorporation does occur to a detectable degree, with significant implications for genome stability and repair (2, 4). This issue may be even more acute for thermophilic organisms, because high temperatures further increase genome instability by accelerating the spontaneous degradation of RNA (5). Control of NTP incorporation by DNA polymerases is therefore a paradigmatic case of the link between polymerase substrate specificity and genome stability.

DNA polymerases from all three domains of life are known to use a common strategy to prevent NTP incorporation into the nascent strand, by exerting stringent geometric control of the chemical nature of the 2′ position of the incoming nucleotide through a single active site residue, the “steric gate” (6). This strategy is so efficient that mutation of the steric gate alone (e.g., to an amino acid with a smaller side chain) can reduce discrimination against NTP incorporation by several orders of magnitude (6–13). However, although steric-gate mutations generally render DNA polymerases permissive for NTP incorporation, they do not by themselves enable synthesis of RNAs beyond short termination products. Indeed, engineering efforts using rational design (9), in vitro and in vivo screening (14, 15), or directed evolution by phage display (16) and compartmentalized self-replication (17) have so far only yielded polymerases that can synthesize RNAs up to 58 nucleotide incorporations (nt) long (14) and more commonly stall at +6–7 nt (14–16, 18) even after prolonged incubation. At the same time, there is compelling structural and phylogenetic evidence for an adaptive path linking DNA to RNA polymerase activity in the evolution of the single-subunit RNA polymerases (ssRNAPs) of mitochondria and T-odd bacteriophages (e.g., T7 RNA polymerase), which are thought to derive from an ancestral polA-family DNA polymerase (19–22). Thus, we (17) and others (6, 10, 16) have argued that there must be a determinant of polymerase substrate specificity that has remained unidentified and that precludes synthesis of longer RNAs in the steric-gate mutants.

Here, we describe the discovery and characterization of a plausible candidate for such a specificity checkpoint. Using Tgo DNA polymerase, the replicative DNA polymerase from Thermococcus gorgonarius, as our model system, we identify a region in the thumb subdomain, and a single key residue (E664) within it. Mutation of E664 to lysine (K) relieves the synthetic block for RNA polymerization and, in the context of a steric-gate mutation (Y409G) and four previously described auxiliary mutations, enables the primer-dependent synthesis of long RNAs. Characterization of the phenotype suggests a molecular mechanism based on an enhanced primer/template duplex interaction interface.

Results

Polymerase Region Enabling RNA Synthesis. Recent work in our group has focused on the engineering of polymerases for the synthesis and replication of unnatural nucleic acid polymers (23). This line of investigation led to the serendipitous identification of a polymerase (D4) with enhanced RNA polymerase activity, which is the starting point of the work described herein (Fig. S1A). D4 derives from a variant of the replicative DNA polymerase of the hyperthermophilic archaeon T. gorgonarius (Tgo) bearing additional mutations to disable uracil-stalling [V93Q (24)] and 3′-5′ exonuclease (D141A and E143A) functions as well as the “Therminator” (25) mutation (A485L), known to enhance incorporation of unnatural substrates. This mutant polymerase (henceforth termed TgoT) does not display RNA polymerase activity above background levels (Fig. 1 and Fig. S1B): RNA synthesis by TgoT stalls after six to seven incorporations from a DNA primer. TgoT is also unable to extend an RNA primer using NTPs. In contrast, D4 extends both DNA and RNA primers, synthesizing RNAs of 20 nt under identical conditions. This gain of function in D4 is attributable to nine additional mutations, comprising a cluster of eight mutations (P657T, E658Q, K659H, Y663H, E664Q, D669A, K671N, and T676I) in the thumb subdomain and a single mutation (L403P) in the A-motif (Fig. S1D).

Having identified mutations that enhance RNA polymerase activity, we sought to dissect their contribution to the phenotype in the context of a more permissive active site for RNA synthesis.

www.pnas.org/cgi/doi/10.1073/pnas.1120964109 PNAS | May 22, 2012 | vol. 109 | no. 21 | 8067–8072

[Fig. 1, panel A: sequence alignment; residue ranges are given above each block]
        91–95     402–410       483–487   655–678
TgoT    QDQPA ... YLDFRSLYP ... QRLIK ... VPPEKLVIYEQITRDLKDYKATGP
D4      QDQPA ... YPDFRSLYP ... QRLIK ... VPTQHLVIHQQITRALNDYKAIGP
TGE     QDQPA ... YLDFRSLGP ... QRLIK ... VPPEKLVIYEQITRDLKDYKATGP
TYK     QDQPA ... YLDFRSLYP ... QRLIK ... VPPEKLVIYKQITRDLKDYKATGP
TGK     QDQPA ... YLDFRSLGP ... QRLIK ... VPPEKLVIYKQITRDLKDYKATGP

[Fig. 1, panels B–D: images not reproduced. Panel B labels: steric-gate mutation (Y409G), second-gate mutation (E664K), A485L, V93Q, exonuclease mutations (D141A, E143A), primer, template. Panel C lanes: primer only, TgoT, TGE, TYK, TGK (RNA primer; +74 nt marker). Panel D: TGK time course at 0.5, 1, 5, and 10 min (RNA primer; +74 nt marker).]

Fig. 1. Mutations enabling RNA synthesis. (A) Sequence locations of mutations present in D4, the engineered variant TGK (TgoT: Y409G E664K), and intermediate TGE and TYK polymerases are shown together with the parent TgoT. (B) TGK mutations are mapped onto the structure of a secondary complex of the closely related Pfu polymerase [Protein Data Bank (PDB) ID code 4AIL]. The steric-gate mutation and second-gate mutation are shown in red, secondary mutations present in the starting polymerase TgoT (uracil stalling function and Therminator mutation) are shown in orange, and exonuclease mutations are shown in yellow. (C) RNA polymerase activity of polymerases TgoT, TGE, TYK, and TGK (from an RNA primer). (D) Time course of synthesis of E. coli tRNATyr from RNA primers, showing the appearance of the full-length product (+74 nt) within 30 s for TGK.

Previous work on the polB-family polymerases had identified a conserved tyrosine (Tgo: Y409) as the steric-gate residue. Mutation of Y409 to a smaller side-chain amino acid was found to reduce NTP/dNTP discrimination by more than 10³-fold, yet it is not sufficient to enable RNA synthesis beyond short termination products (6–7 nt) (7–9). We initially mutated Y409 to medium-sized side chains (Y409S, L, and N), of which the Y409N mutation (D4: Y409N, henceforth called D4N) yielded an improved RNA polymerase (Fig. S1B) capable of synthesizing an Escherichia coli supF tRNATyr gene of 74 nt in under 20 min (Fig. S2), superior to the previous best example of engineered RNA polymerase activity (14). The same mutation introduced into the “WT” polymerase TgoT: Y409N (TNE) only marginally improved RNA polymerase activity (Fig. S1B).

Single Mutation Is Critical for RNA Synthesis. Nine mutations in D4, together with the Y409N steric-gate mutation, enabled efficient RNA synthesis (Figs. S1 and S2). To understand the contributions of individual mutations in D4N better, we reverted each mutation to WT and analyzed its effect on RNA synthesis, revealing a striking pattern: although reversion of seven of the eight thumb mutations (D4N: T657P, Q658E, H659K, H663Y, A669D, N671K, and I676T) did not affect RNA polymerase activity, reversion of one specific residue (D4N: Q664E) all but abolished RNA synthesis (Fig. S1C). Indeed, the reversion mutant D4N: Q664E displayed essentially the same level of RNA polymerase activity as the parent polymerase TgoT. This also indicated that the single mutation in D4 in the A-motif, L403P, did not contribute to RNA polymerase activity (Fig. S1D).

To confirm that RNA polymerase activity in D4N was mainly conferred by E664Q, we introduced the E664Q mutation de novo into the TgoT framework, together with the steric-gate mutation Y409N. The resulting double-mutant TNQ (TgoT: Y409N and E664Q) displayed superior RNA polymerase activity, enabling the synthesis of the 74-nt supF tRNA gene in under 10 min, twice as fast as D4N (Fig. S2C). We concluded that mutation of E664 is both necessary and sufficient for efficient RNA synthesis in conjunction with a steric-gate mutation.

Having established a key role for E664 in enabling RNA synthesis, we sought to identify optimal mutations of this residue for RNA synthesis. We randomized position 664 in TNE and screened for enhanced RNA polymerase activity using our high-throughput polymerase activity assay [PAA (23)]. The PAA is based on capture of primer extension products and their quantification via hybridization to a specific antisense probe (Fig. S3). The PAA screen identified E664K as the most effective mutation.

Following the success of this strategy, we randomized the steric gate (position 409) in the polymerase TNK (TgoT: Y409N and E664K) and performed an analogous PAA screen, identifying Y409G as the most effective steric-gate mutation. Finally, we combined mutations from both screens to yield the TgoT double-mutant TGK (TgoT: Y409G E664K), which could synthesize our benchmark 74-nt supF tRNA in under 30 s (Fig. 1), nearly 100 times faster than the parent polymerase D4N (Fig. S2C).

Synthesis of Protein Coding and Functionalized mRNAs. Encouraged by this gain in activity, we challenged TGK to generate much longer RNAs. We first tested synthesis of the 748-nt mRNA encoding GFP. Indeed, TGK was able to synthesize a full-length GFP RNA (primed from an RNA primer) in under 10 min as determined by denaturing agarose gel electrophoresis, RT-PCR, and sequencing (Fig. 2A) as well as the synthesis of a correctly sized 26.8-kDa protein product in an in vitro translation extract (Fig. 2B). We also examined synthesis of a much longer 1,691-nt mRNA encoding firefly (Photinus pyralis) luciferase. Again, TGK generated a full-length luciferase RNA (in 1–2 h) (Fig. 2C). We determined the fidelity of RNA synthesis by TGK through sequencing the 1.7-kb luciferase RNA, yielding a fidelity of 1.03 × 10⁻³, which is approximately fivefold lower than the fidelity of T7 RNA polymerase on the same RNA (2.1 × 10⁻⁴) reported previously (26) but superior to the fidelity of DNA synthesis by both the parent polymerase TgoT (8.3 × 10⁻³) and TGK itself (3.3 × 10⁻³).
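The fidelity comparison in the preceding paragraph comes down to ratios of the quoted values. The following Python sketch recomputes those ratios. It assumes the quoted figures are error rates per incorporated nucleotide, which the text does not state explicitly, and the dictionary labels and function name are illustrative only.

```python
# Values quoted in the text, treated here as error rates per incorporated
# nucleotide (an assumption; the text calls them "fidelity").
error_rates = {
    "TGK, RNA synthesis": 1.03e-3,
    "T7 RNA polymerase, RNA synthesis": 2.1e-4,
    "TgoT, DNA synthesis": 8.3e-3,
    "TGK, DNA synthesis": 3.3e-3,
}

reference = error_rates["TGK, RNA synthesis"]


def fold_vs_tgk_rna(rate):
    """Ratio of an error rate to the TGK RNA-synthesis error rate."""
    return rate / reference


for label, rate in error_rates.items():
    print(f"{label}: {rate:.2e} ({fold_vs_tgk_rna(rate):.2f}x TGK RNA synthesis)")

# How many times more error-prone TGK is than T7 RNA polymerase on this RNA.
tgk_vs_t7 = reference / error_rates["T7 RNA polymerase, RNA synthesis"]
print(f"TGK vs T7 RNA polymerase: {tgk_vs_t7:.1f}-fold")
```

Under this reading, the TGK-to-T7 ratio is close to five, in line with the "approximately fivefold" stated in the text, and both DNA-synthesis values are higher than the TGK RNA-synthesis value.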

[Fig. 2: images not reproduced. Panel A: GFP mRNA synthesis with and without NTP at 10, 20, 30, and 60 min, and RT-PCR at 10 min (markers: 795 nt, 0.8 kb). Panel B: in vitro translation with and without GFP mRNA (molecular weight markers: 30 kD, 25 kD). Panel C: luciferase RNA synthesis with and without NTP at 1 h, and RT-PCR (markers: 2.0 kb, 1.6 kb, 1.5 kb).]

Fig. 2. Long-range RNA synthesis by TGK. (A) Synthesis of mRNA encoding GFP (0.8 kb) by TGK as resolved by denaturing agarose electrophoresis (Left) and RT-PCR (Right) after 10 min (M, 100-bp ladder). (B) In vitro translation of the RNA synthesized in A yielding GFP protein (26 kDa; arrow). (C) Synthesis of RNA encoding firefly luciferase (1.7 kb) by TGK as resolved by native agarose electrophoresis (Left) and RT-PCR (Right) after 1 h.

Unlike T7 RNA polymerase or other RNA polymerases, TGK efficiently initiates primer-dependent RNA synthesis from a range of chemistries, including DNA, RNA, 2′-O,4′-C-methylene-β-D-ribonucleic acids [locked nucleic acids (LNA)], and 2′O-methyl (2′OMe)-DNA, thus allowing a free choice of 5′ modification (e.g., fluorophores, biotin) or 5′-end chemistries (DNA, RNA, LNA, and 2′OMe) and RNA 5′ base (G/A/U/C). From these primers, TGK is also able to synthesize modified nucleic acid polymers, for example, fully 2′-azido (2′-N3) or 2′-fluoro (2′-F) substituted 57-nt RNAs in under 10 min or fully 5-methyl-C (5meC) and pseudouridine (Ψ) substituted 1.7-kb RNAs (Fig. S4). Furthermore, we found that the E664K mutation enhances translesion synthesis (TLS) across abasic sites or cyclobutane pyrimidine dimers (CPDs) (Fig. S5).

Auxiliary Mutations. Our starting polymerase (TgoT) contained four mutations from the Tgo WT sequence (V93Q, D141A, E143A, and A485L). These mutations by themselves do not enable extended RNA synthesis, because both TgoT and TGE (TgoT: Y409G), the variant with the most effective steric-gate mutation, are unable to synthesize RNAs longer than a few nucleotides. Nevertheless, we sought to disentangle the contributions (if any) of the different mutations to RNA synthesis. Because an active exonuclease was unlikely to contribute to RNA synthesis (and is indeed lacking in extant RNA polymerases), we did not consider reversions of the D141A and E143A mutations. In contrast, both the V93Q and A485L mutations have been described to influence polymerase substrate specificity either through enabling template uracil bypass (24) or by means of a generic increase in substrate promiscuity (25). We therefore reverted both of these mutations in the parent polymerase TgoT, in TGK (TgoT: Y409G and E664K), and in the intermediate variants TGE (TgoT: Y409G) and TYK (TgoT: E664K), yielding reversion mutants Tgo exo-, T GK, T GE, and T YK, respectively. We found that although reversion reduced overall RNA polymerase or TLS activity, it increased the differential between polymerases with and without the E664K mutation. Where we had previously observed weak RNA synthesis for TGE and TYK (Fig. 1C), the reversion mutants showed none (Fig. S6).

We conclude that although V93Q and A485L contribute to RNA synthesis and TLS in a generic way (presumably by generally increasing polymerase substrate promiscuity), they do not enable it. The key positions controlling RNA polymerization are Y409 and E664 (Fig. 1). Mutation of both is necessary (and sufficient) for synthesis of extended RNA molecules, whereas bypass of DNA damage (TLS) is enhanced by the E664K mutation alone (Fig. S6), despite its being located over 25 Å away from the template lesion. E664K therefore defines a second key checkpoint for RNA synthesis and TLS distinct from the steric gate, and we will refer to it henceforth as the “second gate”.

Polymerase Functional Parameters. To gain a better understanding of the mechanistic aspects of the second-gate mutation E664K on RNA synthesis by TGK, we measured steady-state NTP incorporation kinetics, single-hit extension, and termination probabilities as a measure of polymerase processivity (27) and primer/template affinities for the parent polymerase TgoT, for TGK, and for the intermediate variants TGE and TYK.

The catalytic efficiency [Vmax/Km (f)] of dATP incorporation (fdATP) is essentially identical for all polymerases, but there are differences of four orders of magnitude in the catalytic efficiency of ATP incorporation (fATP). Indeed, the fATP of polymerase variants with the Y409G steric-gate mutation (TGE and TGK) is only slightly worse than their fdATP, whereas in polymerases with a WT Y409 steric gate (TgoT and TYK), fATP is 2 × 10⁴-fold and 2 × 10²-fold, respectively, lower than their fdATP (Table 1 and Table S1), mostly attributable to a higher Km. Surprisingly, polymerases with the E664K mutation display a significantly lower Km(ATP), suggesting that the second-gate mutation exerts an effect on substrate utilization. E664K modulates the apparent Km for NTPs despite being located more than 25 Å from the active site, presumably by promoting tertiary complex stability (below). Indeed, fATP of TGK polymerase from a DNA–RNA6 chimeric primer was reduced more than 20-fold and over 200-fold for an all-RNA primer (Table S1), indicating a striking interdependence of the nature of the primer/template duplex and both apparent affinity for substrate (Km) and catalytic efficiency (f) in RNA synthesis.

Although TGK efficiently extends both DNA and RNA primers with NTPs, TgoT cannot extend RNA primers (Fig. 3A). When using DNA–RNA chimeric primers comprising one (or more) 3′-terminal ribonucleotides, we again observed a strong termination once a stretch of 6 nt is reached [extension of a primer with 1 terminal ribonucleotide (DNA–RNA1) stalls after incorporation of 5–6 nt, whereas a primer with 6 terminal ribonucleotides (DNA–RNA6) is not extended at all by TgoT]

Table 1. Steady-state kinetic parameters of engineered polymerases

Polymerase | dATP Vmax, %·min⁻¹ | dATP Km × 10⁻², μM | dATP f, %·mM⁻¹·min⁻¹ | ATP Vmax, %·min⁻¹ | ATP Km, μM | ATP f, %·mM⁻¹·min⁻¹ | Vmax dATP/Vmax ATP | Km ATP/Km dATP | fdATP/fATP
TgoT | 1.04 ± 0.07 | 3.1 ± 0.03 | 33 ± 2 | 0.12 ± 0.03 | 99 ± 9 | 1.2 × 10⁻³ ± 0.2 × 10⁻³ | 15 | 3 × 10³ | 4.7 × 10⁴
TGE | 0.82 ± 0.02 | 2.4 ± 0.4 | 35 ± 4 | 0.61 ± 0.06 | 0.10 ± 0.01 | 6.4 ± 0.2 | 1.4 | 3.9 | 5.5
TYK | 0.83 ± 0.03 | 1.9 ± 0.1 | 44 ± 6 | 0.07 ± 0.007 | 0.15 ± 0.01 | 0.43 ± 0.06 | 13 | 8.1 | 102
TGK | 0.56 ± 0.01 | 1.4 ± 0.2 | 39 ± 4 | 0.46 ± 0.09 | 0.035 ± 0.004 | 13 ± 1.8 | 1.2 | 2.5 | 3

Kinetic parameters were measured as described in Materials and Methods. Data are averages (±SD) from n = 3–6 experiments.
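To make the relationship between the columns of Table 1 explicit, the Python sketch below recomputes the catalytic efficiency f = Vmax/Km and the discrimination ratio fdATP/fATP from the tabulated means. Two readings are assumptions rather than statements from the paper: the dATP Km column is taken in units of 10⁻² μM, and f is computed per μM, because that is the reading under which Vmax/Km reproduces the tabulated f values. All function and variable names are illustrative.

```python
# Mean values transcribed from Table 1 (standard deviations omitted).
# Assumption: Vmax is in % per min, the dATP Km column is in units of 1e-2 uM,
# and the ATP Km column is in uM.
TABLE_1 = {
    # name: (Vmax_dATP, Km_dATP_uM, Vmax_ATP, Km_ATP_uM)
    "TgoT": (1.04, 3.1e-2, 0.12, 99.0),
    "TGE": (0.82, 2.4e-2, 0.61, 0.10),
    "TYK": (0.83, 1.9e-2, 0.07, 0.15),
    "TGK": (0.56, 1.4e-2, 0.46, 0.035),
}


def catalytic_efficiency(vmax, km):
    """Catalytic efficiency f = Vmax / Km."""
    return vmax / km


def summarize(vmax_d, km_d, vmax_r, km_r):
    """Return f for dATP and ATP plus the discrimination ratio f_dATP / f_ATP."""
    f_datp = catalytic_efficiency(vmax_d, km_d)
    f_atp = catalytic_efficiency(vmax_r, km_r)
    return {"f_dATP": f_datp, "f_ATP": f_atp, "discrimination": f_datp / f_atp}


results = {name: summarize(*values) for name, values in TABLE_1.items()}

for name, r in results.items():
    print(f"{name}: f_dATP={r['f_dATP']:.3g}  f_ATP={r['f_ATP']:.3g}  "
          f"f_dATP/f_ATP={r['discrimination']:.3g}")
```

Because the inputs are rounded means, the recomputed ratios only approximate the last three columns of the table. The agreement is close for the variants carrying Y409G and looser for TgoT, where the recomputed discrimination is still of order 10⁴.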

(Fig. 3A). Even with the steric-gate mutation (Y409G), RNA synthesis by TGE terminates at n + 6/7 (100% termination probability) (Fig. 3A and Fig. S7), suggesting a fundamental block to extension beyond 6–7 ribonucleotides. RNA synthesis by TYK, which lacks the steric-gate mutation but harbors the second-gate mutation (E664K), is weaker but does not display the n + 6 termination. In contrast, TGK efficiently extends both DNA and RNA primers beyond n + 6 and to the end of the template with termination probabilities less than 50%, indicating an increase in processivity of RNA synthesis.

[Fig. 3: images not reproduced. Panel A: extension by TgoT, TGE, TYK, and TGK from DNA, DNA–RNA1, DNA–RNA6, and RNA primers. Panels B and C: single-hit extension by TgoT, TGE, TYK, and TGK at varying polymerase concentration ([pol]) from a DNA primer (B) and an RNA primer (C), with +7 and +32 markers.]

Fig. 3. Second-gate impact on processivity of RNA synthesis. (A) RNA synthesis from DNA (black), RNA (blue), and DNA–RNAx chimeric primers by polymerases TgoT, TGE (TYE: Y409G), TYK (TYE: E664K), and TGK (TYE: Y409G E664K). Red boxes highlight the unextended primers. Processivity of RNA synthesis as determined by single hit extension from a DNA primer (B) and an RNA primer (C).

We also performed RNA synthesis under single-hit conditions, which provides another measure of polymerase processivity, because each extension product is the result of a single polymerase binding event (28). Under conditions in which DNA synthesis is comparable between all four variants (TgoT, TGE, TYK, and TGK), RNA synthesis terminates by the seventh incorporation for both TgoT and TGE (Fig. 3B and Fig. S7), further confirming standard extension experiments. In contrast, TYK and TGK (both with the E664K mutation) can extend beyond that synthetic block to over 30 incorporations from a DNA primer. TGK is the only enzyme that can synthesize RNA from an RNA primer with similar processivity to extension from a DNA primer (Fig. 3C). Examination of termination probabilities in RNA synthesis as a measure of polymerase processivity (27) revealed a clear decrease (Fig. S7) in processivity by polymerases without the E664K mutation, particularly with respect to the strong termination at n + 6/7.

Having established that the E664K mutation imparts a substantial increase in the processivity of RNA synthesis as judged from product length and termination probabilities, we sought to identify the mechanistic basis of this effect. We determined equilibrium binding affinities (Kd) of DNA/DNA and RNA/DNA primer/template duplexes to polymerase variants by fluorescence polarization (FP). These measurements revealed more than a 25-fold affinity increase in polymerase variants comprising the E664K mutation (TYK and TGK) for a noncognate RNA/DNA primer template duplex compared with TgoT (Table 2). Although E664K also increased affinity for a canonical DNA/DNA duplex (ca. 5-fold), its effect on binding the noncanonical RNA/DNA heteroduplex is much greater. Indeed, TGK binds an RNA/DNA duplex with a slightly higher affinity than the parent polymerase TgoT binds a DNA/DNA duplex.

Together, the kinetic, processivity and affinity data reveal the molecular determinants for the strikingly increased efficiency and processivity of RNA synthesis by TGK. The steric-gate mutation allows efficient incorporation of NTPs through lowering Km (substrate utilization), and the second-gate mutation enables tight binding of the nascent RNA/DNA heteroduplex through lowering Kd, and promotes a further decrease in Km.

Discussion

The existence of a second checkpoint for RNA synthesis located in the thumb subdomain had been predicted by us and others (6, 10, 16, 17) on the basis of the phenotype of steric-gate mutants and their premature termination of RNA synthesis, typically at n + 6/7. Indeed, eight of the nine mutations in the starting D4 polymerase (TgoT: L403P, P657T, E658Q, K659H, Y663H, E664Q, D669A, K671N, and T676I) with weak RNA polymerase activity are located in a sequence segment (P657–T676) in the center of the thumb subdomain and in close contact with the nascent strand around n + 6 (Fig. 4 A and B). By dissecting the contributions of different residues to the phenotype, we identified the critical functional determinant to be a single residue (E664), mutation of which proved both necessary and sufficient for efficient RNA

Table 2. Kd for primer/template duplexes of engineered polymerases

Kd, nM (primer/template) | TgoT | TGE | TYK | TGK
DNA/DNA | 179.0 ± 8.7 | 280.8 ± 12.6 | 30.5 ± 4.9 | 36.7 ± 4.7
RNA/DNA | 2,504.0 ± 379.3 | 1,764.0 ± 195.1 | 65.32 ± 17.9 | 104.3 ± 10.8

Affinities were measured by FP, as described in Materials and Methods. Data are averages (±SD) of n = 3 experiments for DNA/DNA and n = 6 experiments for RNA/DNA primer/template duplexes.
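The affinity changes discussed in the text can be read off Table 2 as ratios of Kd values. The Python sketch below computes the fold increase in affinity of each variant relative to TgoT from the tabulated means. The dictionary layout and the function name `affinity_gain` are illustrative choices, not part of the original analysis.

```python
# Mean Kd values (nM) transcribed from Table 2; standard deviations omitted.
KD_NM = {
    "DNA/DNA": {"TgoT": 179.0, "TGE": 280.8, "TYK": 30.5, "TGK": 36.7},
    "RNA/DNA": {"TgoT": 2504.0, "TGE": 1764.0, "TYK": 65.32, "TGK": 104.3},
}


def affinity_gain(duplex, variant, parent="TgoT"):
    """Fold increase in affinity, Kd(parent) / Kd(variant); > 1 means tighter binding."""
    return KD_NM[duplex][parent] / KD_NM[duplex][variant]


for duplex in KD_NM:
    for variant in ("TGE", "TYK", "TGK"):
        print(f"{duplex} {variant}: {affinity_gain(duplex, variant):.1f}-fold vs TgoT")

# TGK on the noncognate RNA/DNA duplex binds more tightly (lower Kd) than
# TgoT does on the cognate DNA/DNA duplex.
tgk_beats_parent_on_cognate = KD_NM["RNA/DNA"]["TGK"] < KD_NM["DNA/DNA"]["TgoT"]
print(tgk_beats_parent_on_cognate)
```

On these means, the two E664K variants gain far more affinity for the RNA/DNA duplex than for the DNA/DNA duplex, whereas TGE, which lacks E664K, shows little change on either.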

synthesis in the context of a steric-gate mutation. Hence, we called this residue the second gate. Although auxiliary mutations (V93Q, D141A, E143A, and A485L) were found to contribute to the enhanced processivity of RNA synthesis, they did not by themselves enable RNA synthesis (Fig. 1). Indeed, the phenotype of Q93V and L485A reversion variants highlighted the gain of function imparted by the steric-gate and second-gate mutations Y409G and E664K (Fig. S6).

The second-gate residue, E664, is most effectively mutated to lysine (E664K), but mutation to the uncharged glutamine (E664Q) already provided significant RNA polymerase activity (Fig. S1), suggesting that the positioning of a repulsive negative charge in proximity to the nascent strand phosphate backbone may be an important mechanistic aspect of second-gate function. Indeed, mapping of the polymerase surface electrostatic potential of the primer/template duplex binding interface reveals a significant change on E664K mutation, providing a continuous highly positively charged surface to interact with the nascent duplex (Fig. 4 C and D). This change in electrostatic potential is reflected by the increase in affinity for the noncognate RNA/DNA heteroduplex (Table 2) and the increased processivity of RNA synthesis, allowing bypass of the synthetic block at n + 6/7 (Fig. 3 and Fig. S7).

[Fig. 4: images not reproduced. Panel A: nascent DNA and template with E664. Panel B: nascent RNA and template with E664. Panel C: nascent DNA and template with E664. Panel D: nascent DNA and template with E664K.]

Fig. 4. Mechanistic model of second-gate function. (A) DNA/DNA duplex (nascent DNA strand shown in orange, DNA template shown in purple) as observed. (B) RNA/DNA hybrid duplex [PDB ID code 1EFS (36)] (nascent RNA strand green, DNA template purple) modeled into the secondary complex of Pfu (PDB ID code 4AIL). Modeling illustrates the likely steric clash with the thumb subdomain (Pfu: Y654–I678; pink) in proximity to the second-gate residue (E664; red). Electrostatic potential of the primer/template binding surface of Pfu (C) and its change on the E664K equivalent mutation (D). The second-gate mutation creates a continuous positively charged binding surface.

There are natural precedents for enhanced processivity mediated by either increased affinity for the primer/template duplex or, indeed, the presence of a continuous positively charged surface in the polymerase thumb subdomain (29, 30). For example, extensive interactions between the continuous highly positively charged DNA binding surface and primer/template duplex have been observed in the structure of human Pol η, where such an interface is thought to act as a “molecular splint” reshaping the distorted CPD DNA structure into a canonical B-form to allow processive TLS (29). By analogy, we propose that the continuous positively charged surface generated by the E664K mutation may not only increase the affinity (Table 2) but may reshape the nascent RNA/DNA duplex from its preferred A-form (31) into a cognate B-form conformation to allow efficient RNA synthesis and TLS (Fig. 3 and Fig. S5). By implication, the function of the negatively charged second-gate residue (E664) in the WT Tgo polymerase may be to decrease affinity for primer/template, thus enhancing discrimination against noncognate duplexes.

Although highly conserved among members of the Thermococcales, the identity of the second-gate residue 664 (or equivalent) varies among more distantly related members of the polB family (23) (Table S2). The second gate may therefore be a specific adaptation to prevent the deleterious consequences of NTP misincorporation or replication of genomic lesions at high temperature (32). Conversely, despite significant structural variability in the elaboration of the thumb subdomain, extension checkpoints have been observed in different polymerase families, including polA, where similar n + 6 termination of RNA synthesis is observed in steric-gate mutants (16, 17). In analogy to the diverse nature of steric-gate residues, the second gate may therefore be elaborated in different ways in the context of different thumb domain structures.

The unique TGK polymerase described herein has clear utility in biotechnology because it differs in useful ways from the standard ssRNAPs [e.g., T7 RNA polymerase (T7RP)], including thermostability and, most importantly, a capacity for primer-dependent RNA synthesis, which is not commonly observed in nature. Primer-dependent RNA synthesis obviates the need to initiate transcription with 5′-(pp)pG (as with T7RP) and allows free choice of the 5′-end or 5′-UTR chemistry: TGK is capable of extending a wide variety of primers, including those bearing 5′ groups, such as fluorescent dyes and biotin, facilitating synthesis and characterization of RNAs with unusual 5′ groups (33) as well as all DNA, RNA, LNA, and 2′OMe-DNA primers or chimeras thereof (Fig. S4). Furthermore, TGK enables the synthesis of fully 2′-F- or 2′-N3-substituted RNAs with applications in the aptamer field as well as RNAs in which C and U are completely replaced by 5meC and Ψ (Fig. S4), with utility in […] and stem cell reprogramming (34).

In conclusion, we show that the evolutionary path from DNA to RNA polymerases is surprisingly short, comprising just two critical mutations: the classic steric-gate mutation in the active site (Y409G) (6, 10) and the herein described second-gate mutation (E664K) in the thumb subdomain. Together, these enable efficient synthesis of both long RNAs and highly modified nucleic acid polymers through modification of key aspects of the molecular interaction of the polymerase with nascent duplex and substrate NTP. The second gate thus pinpoints a previously undescribed postsynthetic determinant of polymerase substrate specificity with implications for the development of strategies for the enzymatic synthesis and evolution of unnatural nucleic acid polymers (23).

Materials and Methods

Primer Extension. Primer extension was typically performed in 3- to 5-μL reactions containing 1–10 pmol of primer with twofold template excess. Usually, primer FD (either DNA or RNA) was used with DNA template TempN in 1× Thermopol buffer (New England Biolabs) with 0.25–0.75 mM each NTP and supplemented with MgSO4. A typical extension protocol was two cycles of 10 s at 94 °C, 1 min at 50 °C, and 5 min at 65 °C. Details of the tRNA time courses, GFP and luciferase RNA syntheses, and in vitro translation (IVT) are described in SI Materials and Methods 1.2 and 1.3.

Kinetics, Lesion Bypass, Single-Hit Extension, and Affinity Assays. Steady-state kinetic parameters Km and Vmax for dATP and ATP incorporation were measured in standing start reactions as described previously (35) and in SI Materials and Methods 1.5. Lesion bypass and single-hit extension assays are

described in SI Materials and Methods 1.6. Termination probabilities were determined as previously described (27). Polymerase primer/template affinities were assayed using FP as described in SI Materials and Methods 1.7. FP measurements were analyzed using GraphPad Prism v5.0d for Mac OS X. Raw data (Y) were fit to the equation:

$$Y = \mathrm{Min} + (\mathrm{Max} - \mathrm{Min}) \times \frac{b - \sqrt{b \times b - (4 \times c)}}{2 \times a}$$

where Max is the FP saturation, Min is the free ligand polarization, a is the fixed ligand concentration, b is ligand + X + Kd, c is ligand · X, and X is the polymerase concentration.

ACKNOWLEDGMENTS. This work was supported by Medical Research Council Grant U105178804 (to C.C., V.B.P., and P.H.), European Union Grant FP6-STREP029092 NEST (to V.B.P. and P.H.), and the National Institute of Child Health and Human Development/National Institutes of Health Intramural Research Program (A.V. and R.W.).
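The equation used in Materials and Methods to fit the FP data can be written as a small function, which makes its limiting behavior easy to check. The Python sketch below is a direct transcription of that equation with a = ligand, b = ligand + X + Kd, and c = ligand · X. It is not the authors' GraphPad Prism model file; the function name, argument names, and the numbers in the usage lines are hypothetical.

```python
import math


def fp_signal(x, kd, ligand, fp_min, fp_max):
    """Predicted FP signal Y at polymerase concentration x.

    x, kd, and ligand must share one concentration unit (e.g., nM).
    fp_min is the polarization of free ligand; fp_max is the saturation value.
    """
    a = ligand              # fixed ligand (primer/template) concentration
    b = ligand + x + kd
    c = ligand * x
    fraction_bound = (b - math.sqrt(b * b - 4.0 * c)) / (2.0 * a)
    return fp_min + (fp_max - fp_min) * fraction_bound


# Hypothetical illustration values, not taken from the paper.
KD, LIGAND, FP_MIN, FP_MAX = 100.0, 5.0, 50.0, 250.0
for conc in (0.0, 10.0, 100.0, 1000.0, 1e6):
    print(f"X = {conc:g}: Y = {fp_signal(conc, KD, LIGAND, FP_MIN, FP_MAX):.1f}")
```

With no polymerase the function returns Min, and at a large excess of polymerase it approaches Max, as expected for a fraction-bound term that accounts for ligand depletion. Estimating Kd from measured data would then amount to least-squares fitting of this function, a step the sketch does not include.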

1. Traut TW (1994) Physiological concentrations of purines and pyrimidines. Mol Cell Biochem 140(1):1–22.
2. Nick McElhinny SA, et al. (2010) Abundant ribonucleotide incorporation into DNA by yeast replicative polymerases. Proc Natl Acad Sci USA 107:4949–4954.
3. Shewach DS, Reynolds KK, Hertel L (1992) Nucleotide specificity of human deoxycytidine kinase. Mol Pharmacol 42:518–524.
4. Nick McElhinny SA, et al. (2010) Genome instability due to ribonucleotide incorporation into DNA. Nat Chem Biol 6:774–781.
5. Li Y, Breaker RR (1999) Kinetics of RNA degradation by specific base catalysis of transesterification involving the 2′-hydroxyl group. J Am Chem Soc 121:5364–5372.
6. Brown JA, Suo Z (2011) Unlocking the sugar “steric gate” of DNA polymerases. Biochemistry 50:1135–1142.
7. Bonnin A, Lázaro JM, Blanco L, Salas M (1999) A single tyrosine prevents insertion of ribonucleotides in the eukaryotic-type phi29 DNA polymerase. J Mol Biol 290:241–251.
8. Gardner AF, Jack WE (1999) Determinants of nucleotide sugar recognition in an archaeon DNA polymerase. Nucleic Acids Res 27:2545–2553.
9. Yang G, Franklin M, Li J, Lin TC, Konigsberg W (2002) A conserved Tyr residue is required for sugar selectivity in a Pol alpha DNA polymerase. Biochemistry 41:10256–10261.
10. Astatke M, Ng K, Grindley ND, Joyce CM (1998) A single side chain prevents Escherichia coli DNA polymerase I (Klenow fragment) from incorporating ribonucleotides. Proc Natl Acad Sci USA 95:3402–3407.
11. Brown JA, et al. (2010) A novel mechanism of sugar selection utilized by a human X-family DNA polymerase. J Mol Biol 395:282–290.
12. DeLucia AM, Grindley ND, Joyce CM (2003) An error-prone family Y DNA polymerase (DinB homolog from Sulfolobus solfataricus) uses a ‘steric gate’ residue for discrimination against ribonucleotides. Nucleic Acids Res 31:4129–4137.
13. Gao G, Orlova M, Georgiadis MM, Hendrickson WA, Goff SP (1997) Conferring RNA polymerase activity to a DNA polymerase: A single residue in reverse transcriptase controls substrate selection. Proc Natl Acad Sci USA 94:407–411.
14. Staiger N, Marx A (2010) A DNA polymerase with increased reactivity for ribonucleotides and C5-modified deoxyribonucleotides. ChemBioChem 11:1963–1966.
15. Patel PH, Loeb LA (2000) Multiple amino acid substitutions allow DNA polymerases to synthesize RNA. J Biol Chem 275:40266–40272.
16. Xia G, et al. (2002) Directed evolution of novel polymerase activities: Mutation of a DNA polymerase into an efficient RNA polymerase. Proc Natl Acad Sci USA 99:6597–6602.
17. Ong JL, Loakes D, Jaroslawski S, Too K, Holliger P (2006) Directed evolution of DNA polymerase, RNA polymerase and reverse transcriptase activity in a single polypeptide. J Mol Biol 361:537–550.
18. McCullum EO, Chaput JC (2009) Transcription of an RNA aptamer by a DNA polymerase. Chem Commun (Camb) 45(20):2938–2940.
19. Delarue M, Poch O, Tordo N, Moras D, Argos P (1990) An attempt to unify the structure of polymerases. Protein Eng 3:461–467.
20. Moras D (1993) Polymerases. Two sisters and their cousin. Nature 364:572–573.
21. Sousa R (1996) Structural and mechanistic relationships between nucleic acid polymerases. Trends Biochem Sci 21(5):186–190.
22. Cermakian N, et al. (1997) On the evolution of the single-subunit RNA polymerases. J Mol Evol 45:671–681.
23. Pinheiro VB, et al. (2012) Synthetic genetic polymers capable of heredity and evolution. Science 336:341–344.
24. Fogg MJ, Pearl LH, Connolly BA (2002) Structural basis for uracil recognition by archaeal family B DNA polymerases. Nat Struct Biol 9:922–927.
25. Gardner AF, Jack WE (2002) Acyclic and dideoxy terminator preferences denote divergent sugar recognition by archaeon and Taq DNA polymerases. Nucleic Acids Res 30:605–613.
26. Huang J, Brieba LG, Sousa R (2000) Misincorporation by wild-type and mutant T7 RNA polymerases: Identification of interactions that reduce misincorporation rates by stabilizing the catalytically incompetent open conformation. Biochemistry 39:11571–11580.
27. Kokoska RJ, McCulloch SD, Kunkel TA (2003) The efficiency and specificity of apurinic/apyrimidinic site bypass by human DNA polymerase eta and Sulfolobus solfataricus Dpo4. J Biol Chem 278:50537–50545.
28. Bambara RA, Fay PJ, Mallaber LM (1995) Methods of analyzing processivity. Methods Enzymol 262:270–280.
29. Biertümpfel C, et al. (2010) Structure and mechanism of human DNA polymerase eta. Nature 465:1044–1048.
30. Huber HE, Tabor S, Richardson CC (1987) Escherichia coli thioredoxin stabilizes complexes of bacteriophage T7 DNA polymerase and primed templates. J Biol Chem 262:16224–16232.
31. Noy A, Pérez A, Márquez M, Luque FJ, Orozco M (2005) Structure, recognition properties, and flexibility of the DNA.RNA hybrid. J Am Chem Soc 127:4910–4920.
32. Drake JW (2009) Avoiding dangerous missense: Thermophiles display especially low mutation rates. PLoS Genet 5:e1000520.
33. Kowtoniuk WE, Shen Y, Heemstra JM, Agarwal I, Liu DR (2009) A chemical screen for biological small molecule-RNA conjugates reveals CoA-linked RNA. Proc Natl Acad Sci USA 106:7768–7773.
34. Warren L, et al. (2010) Highly efficient reprogramming to pluripotency and directed differentiation of human cells with synthetic modified mRNA. Cell Stem Cell 7:618–630.
35. Creighton S, Bloom LB, Goodman MF (1995) Gel fidelity assay measuring nucleotide misinsertion, exonucleolytic proofreading, and lesion bypass efficiencies. Methods Enzymol 262:232–256.
36. Hantz E, et al. (2001) Solution conformation of an RNA–DNA hybrid duplex containing a pyrimidine RNA strand and a purine DNA strand. Int J Biol Macromol 28:273–284.

8072 | www.pnas.org/cgi/doi/10.1073/pnas.1120964109 Cozens et al.