
How a B Family DNA Polymerase Has Been Evolved to Copy RNA


Woo Suk Choi^a, Peng He^a,1, Arti Pothukuchy^b, Jimmy Gollihar^b, Andrew D. Ellington^b, and Wei Yang^a,2

^aLaboratory of , National Institute of Diabetes and Digestive and Kidney Diseases, NIH, Bethesda, MD 20892; and ^bCenter for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712

Contributed by Wei Yang, June 18, 2020 (sent for review May 12, 2020; reviewed by Hong Ling and Tahir H. Tahirov)

We report here crystal structures of a reverse transcriptase, RTX, which was evolved in vitro from the B family polymerase KOD, in complex with either a DNA duplex or an RNA–DNA hybrid. Compared with the apo, binary, and ternary complex structures of the original KOD polymerase, the 16 substitutions that result in the function of copying RNA to DNA do not change the overall structure. Only six substitutions occur at the substrate-binding surface, and the others change domain–domain interfaces in the polymerase to enable RNA–DNA hybrid binding and reverse transcription. Most notably, F587L at the Palm and Thumb interface stabilizes the open and apo conformation of the Thumb. The intrinsically flexible Thumb domain seems to play a major role in accommodating the RNA–DNA hybrid product distal to the active site. This is reminiscent of naturally occurring RNA-dependent DNA polymerases, including telomerase, which have a dramatically augmented Thumb domain, and of reverse transcriptase, which extends its Thumb with the RNase H domain.

BIOCHEMISTRY

reverse transcription | proofreading | 3′ to 5′ exonuclease | Thumb domain

Significance

RTX is a reverse transcriptase evolved in vitro from the B family DNA polymerase KOD. Structural analyses of RTX in complex with either a DNA duplex or an RNA–DNA hybrid and comparison with the apo, binary, and ternary complex structures of the original KOD polymerase shed light on how to engineer and alter substrate specificity of polymerases. Among the 16 substitutions that result in the reverse transcriptase activity, only six occur at the substrate-binding surface, and the others change domain–domain interfaces in the polymerase to enable RNA–DNA hybrid binding and reverse transcription. The intrinsically flexible Thumb domain seems to play a major role in accommodating the RNA–DNA hybrid product distal to the active site.

Author contributions: W.S.C., P.H., A.D.E., and W.Y. designed research; W.S.C., P.H., A.P., and J.G. performed research; A.P. and J.G. contributed new reagents/analytic tools; W.S.C., A.D.E., and W.Y. analyzed data; and W.S.C. and W.Y. wrote the paper.
Reviewers: H.L., University of Western Ontario; and T.H.T., University of Nebraska Medical Center.
The authors declare no competing interest.
Published under the PNAS license.
^1Present address: Department of Biochemistry and Molecular Biophysics, Washington University in St. Louis, St. Louis, MO 63110.
^2To whom correspondence may be addressed. Email: [email protected].
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2009415117/-/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.2009415117 PNAS Latest Articles

Genetic information flows from DNA to RNA to a protein, but reverse transcription, which copies RNA to DNA, occurs in , , and maintenance (1, 2). Naturally occurring reverse transcriptase (RT) is heat labile and lacks the proofreading 3′ to 5′ exonuclease activity that removes misincorporated dNMP and improves the replicative polymerase fidelity by two orders of magnitude (3, 4). In the absence of proofreading, the error rate of DNA synthesis by RT is 10⁻⁴ (1 misincorporation in 10,000 nucleotides incorporated) (5). Reverse transcription is widely used to recover cDNA from RNA in the laboratory and in diagnostic tests, including COVID-19 detection (6–8). Therefore, improved efficiency and fidelity of RT are highly desirable.

Thirty years ago, A family DNA polymerases were found to have weak intrinsic reverse transcriptase activity (9). Improvements in the RT activity of A family polymerases have been made by isolating new polymerases (10) and introducing random as well as targeted mutations to existing thermostable enzymes (11–13). Thermophilic polymerases are desirable for melting RNA secondary structures and also essential for RT-PCR reactions carried out by a single polymerase (11).

The most accurate and thermostable DNA polymerases for PCR, however, belong to the B family (14, 15). RTX was directly evolved in vitro from a thermostable B family DNA polymerase isolated from Thermococcus kodakaraensis, which is known as KOD and widely used for high-efficiency and high-accuracy PCR (16–19). Because commercially available RT-PCR kits use a thermostable DNA polymerase and heat-labile RT without the proofreading function, the goal was to create a comparably thermostable RNA-dependent DNA polymerase (or RT). With a total of 16 substitutions in the 774-residue KOD polymerase (17) (SI Appendix, Table S1), RTX has both RNA- and DNA-dependent DNA polymerase activities like a native RT (16). RTX binds an RNA–DNA hybrid less well than a DNA duplex substrate, and its RNA-templated synthesis is less efficient than that templated by DNA (16). The catalytic rate (kcat) of RTX on DNA (49.1 s⁻¹) is lower than that of KOD (160.4 s⁻¹) (16); the RNA-dependent DNA synthesis rate of RTX (47.8 s⁻¹) is also slightly lower than that of HIV-1 RT (61 s⁻¹) (20). Unlike natural RT, however, RTX is heat resistant, retains the proofreading 3′-5′ exonuclease activity of KOD, and is able to self-correct and remove wrongly incorporated nucleotides copied from an RNA template (16).

The B family DNA polymerases carry an intrinsic proofreading exonuclease (Exo) domain N terminal to the polymerase module in the same polypeptide chain (5). All DNA polymerase modules adopt a right-hand-like structure, composed of the Finger, Thumb, and Palm domains. As with all high-fidelity DNA polymerases, KOD undergoes an open to closed transition of the Finger domain upon binding the correct incoming nucleotide prior to the chemical reaction (5). In the tertiary structure of KOD, Exo forms a large extension opposite to the Palm domain, which contains the catalytic residues of the polymerase (21). The Exo and Palm domains flank and support the flexible Finger domain, which in the B family is composed of a pair of long antiparallel α-helices (Fig. 1) (22–27). At the very N terminus of the B family polymerases is the N domain, which contacts the Exo, Finger, and Palm domains and appears to coordinate the catalytic activities of Pol and Exo (22, 24). With the C-terminal Thumb domain, B family DNA polymerases often resemble a closed ring that encircles the DNA and dNTP substrate (Fig. 1A). Within this ring, DNA substrate can toggle between the polymerase and exonuclease active sites (28).

Recently, the crystal structure of a KOD ternary complex with DNA and an incoming dATP analog (2′,3′-dideoxy ATP) was reported, making KOD one of the few B family DNA polymerases whose apo, binary (DNA pol with DNA), and ternary complex structures are known (29–31). Comparison of the KOD binary and ternary complex structures reveals that upon binding of the correct incoming dNTP, conformational changes occur in the N, Exo, and Thumb domains in addition to the well-known closing of the Finger domain. In the ternary complex (polymerization mode) the ring-shaped protein becomes slightly open between the Exo and Thumb domains (31) (Movie S1) (Fig. 1A). Unique to the B family DNA polymerases, the most flexible part of the KOD polymerase is the Thumb domain, which is composed of two disjointed parts, Thumb-1 and Thumb-2 (29). Between the apo and substrate-bound KOD structures, Thumb-1 is reconfigured and rotated by 32°, while the partially disordered Thumb-2 becomes ordered and moves 17 Å to contact the DNA product (SI Appendix, Fig. S1). The Thumb-2 movement is significantly larger than that of the Finger (6 Å) or Thumb-1 regions (5 Å) (30).

To understand how 16 amino acid substitutions in KOD result in the function of RNA-dependent DNA synthesis by RTX and eventually to make RTX a better RT with proofreading activity, we have determined crystal structures of RTX complexed with either a DNA-duplex or RNA–DNA-hybrid substrate and compared these structures with the KOD apo and binary- and ternary-complex structures (29–31). In addition to a few mutations lining the substrate-binding surface and directly altering the preference for DNA versus RNA, the majority of mutations appear to weaken interdomain interactions (N, Exo, Finger, Palm, and Thumb) or stabilize individual domain structures and thus indirectly affect substrate preference. The analysis reported here deepens our understanding of how B family DNA polymerases and RT function and provides a platform for future improvement of engineered RTs.

Fig. 1. Structures of RTX complexed with DNA and an RNA–DNA hybrid. (A) Structural changes in KOD from binding only DNA (PDB ID code 4K8Z) to binding both DNA and an incoming dNTP (PDB ID code 5OMF). The binary complex is shown in gray, and the ternary complex is in pink. Domain movements are indicated by red arrowheads. (B) Diagram of the DNA and RNA–DNA hybrid cocrystallized with RTX. (C) Cartoon diagram of the RTX–DNA–dNTP structure (RTX-D). Each domain is color-coded, and the template and primer strands are shown in orange and yellow. The dNTP is shown as sticks. (D) Cartoon diagram of the RTX–RNA–DNA structure (RTX-R). Domains are colored as in C. The RNA template is shown in dark brown. The 15 substitutions in KOD that convert it to RTX are shown as sticks and labeled.

Results

Structure Determination. RTX was crystallized with a 23/20 nt (nucleotide) DNA duplex or a 16/13 nt RNA–DNA hybrid, each with a 3 nt 5′ overhang on the template strand (Fig. 1B), and diffracted X-rays to 2.4 and 2.5 Å resolution, respectively (SI Appendix, Table S2). These structures were solved by molecular replacement using the KOD–DNA binary complex (30) as a search model and refined (Fig. 1 C and D and SI Appendix, Table S2 and Methods). As in all KOD structures, the last 18 residues at the C terminus (757 to 774) are disordered and were not modeled. In the 2.4 Å crystal structure of the RTX–DNA complex, although an incoming nucleotide (dAMPNPP, a nonhydrolyzable analog of dATP) and two Mg²⁺ ions were bound in the active site, the Finger domain was not fully closed, which would occur for a reaction-competent ternary complex (31). This RTX–DNA complex is superimposable with the KOD–DNA binary complex structure with an rmsd of 0.74 Å over 725 pairs of Cα atoms. The largest difference is in the Finger domain of RTX, with its tip shifting 4.6 Å toward the closed conformation (Fig. 2A). Additional structural changes are also observed in the N, Exo, and Thumb domains, which may be correlated with the mutations of K118I, M137L, and F587L among the 16 changes that result in the reverse transcription function of RTX (Fig. 2 B–D and SI Appendix, Table S1). The implications of these three substitutions will be discussed later.

In the 2.5 Å RTX–RNA–DNA complex structure, the 3′ end of the primer strand and the templating RNA are shifted away from the reactive position, and the incoming nucleotide site is empty (Fig. 1D). Not surprisingly, the Finger is in the open conformation. Distal from the catalytic center the Thumb domain, which normally wraps around the upstream product duplex (Fig. 3A), is significantly different from KOD and RTX in complex with DNA (Fig. 3 A–C). Thumb-1 (aa 615 to 660 and 728 to 756, interfacing with the Palm domain), however, is surprisingly more similar to apo than to DNA-bound KOD (Fig. 3D). Thumb-2 (aa 662 to 720), adjacent to the Exo domain, is only half closed compared to the DNA-bound structure (Fig. 3 A and B) and has weak electron density, indicating that it is relatively mobile. Concurrently, the region of the Exo domain that normally contacts the Thumb is partially disordered (Fig. 3E). When the Exo and Palm domains of the two RTX structures are superimposed, every domain appears to undergo a rigid-body movement (Fig. 3B), and the rmsd over 623 pairs of Cα atoms (excluding Thumb-2) is 1.34 Å. The ring-shaped RTX is more open when complexed with the RNA–DNA hybrid than with the DNA duplex (Movie S2).

The structural differences between RTX in the DNA and RNA–DNA complexes could be due to different crystal lattices (SI Appendix, Table S2). However, the largest difference occurs in the Thumb domain, correlating with binding of an RNA–DNA hybrid versus a DNA duplex (Fig. 3C). Because the RNA–DNA hybrid adopts a mixed A and B hybrid form instead of a B form, the minor groove becomes wider and shallower in the hybrid form. The mobility of the Thumb and part of the Exo domain appears to play a role in accommodating the RNA template. One of the 16 substitutions, F587L, occurs at the Palm and Thumb interface, and the branched side chain of Leu appears to stabilize the Thumb-1 in the apo-like conformation (Figs. 2D and 3D). The flexible Thumb-2 and reduced interactions with the DNA duplex and RNA–DNA hybrid may underlie the reduced efficiency of DNA synthesis by RTX.
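The rmsd values quoted above (0.74 Å over 725 Cα pairs, 1.34 Å over 623 Cα pairs) summarize how far paired Cα atoms lie from each other once two structures have been superimposed. The following is a minimal sketch of that final step only; it assumes the coordinates are already superimposed and paired residue by residue. The function name `ca_rmsd` and the toy coordinates are hypothetical and are not taken from the deposited structures or from the authors' software.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in Å) between two equal-length lists of
    (x, y, z) Cα positions that are ALREADY superimposed and paired.

    This does not perform the superposition itself; it only evaluates
    sqrt(mean(squared distance)) over the pairs.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example (hypothetical coordinates): three Cα atoms, each displaced by
# exactly 1 Å along y, so the rmsd is 1 Å.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x, y + 1.0, z) for (x, y, z) in reference]
print(ca_rmsd(reference, shifted))  # 1.0
```

Because the value is an average over all pairs, a small overall rmsd can coexist with a large local shift, which is why the text reports the 4.6 Å Finger-tip movement separately from the 0.74 Å global figure.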

Fig. 2. Structural details of the RTX–DNA complex. (A) Superposition of RTX–DNA–dNTP with the KOD–DNA binary complex (PDB ID code 4K8Z). RTX is colored as in Fig. 1C; KOD is shown in light gray. The finger movement is circled in blue. A zoom-in view of the active site is shown. Mg²⁺ ions are shown as purple spheres. (B–D) Zoom-in views of K118I (N domain, B), M137L (Exo domain, C), and F587L (Palm domain, D), showing their effects on the structural change in RTX compared with KOD.

In the following sections, 15 substitutions in RTX relative to KOD (exclusive of W768R due to its disorder in all KOD and RTX structures) are analyzed in four groups according to their locations relative to the substrate (SI Appendix, Table S1) and potential influence on RNA-templated DNA synthesis.

Substrate-Binding Interface. R381H, Y384H, and V389I, which are located along a loop extended from the Exo to the Palm domain, interact with the template backbone at the −1 to −4 positions (upstream of the replicating base pair) in the RTX–DNA and KOD ternary complex structures (Fig. 4A). Unfortunately, in the RTX–RNA–DNA complex structure, the RNA template is disordered at the −1 position and is shifted away from the polymerase at the −2 to −4 positions (SI Appendix, Fig. S2). R381H and Y384H are also poorly defined in the electron density map. The RNA strand has an additional hydroxyl group on C2′ and adopts A form and C3′-endo conformations. Despite the absence of detailed protein–RNA interactions, these three substitutions likely facilitate accommodation of the RNA template. R381H and Y384H substitutions reduce the protein volume at the interface with RNA, while the V389I substitution is correlated with an increased distance to the RNA template compared to the DNA duplex (Fig. 4B). We suspect that these three substitutions have an overall effect of reducing binding to either RNA or DNA templates.

Residues E664K, G711V, and N735K occur in the Thumb domain, near the primer strand at the −5 position (E664K) and in contact with the template strand at the −8 and −9 positions across the minor groove (Fig. 4A). The increased positive charge and size of the side chains likely enhance Thumb domain interactions with the substrate distal to the active site. In the DNA binary complex structures of KOD and RTX, E664 and E664K are within van der Waals contact with the DNA primer, but in the KOD ternary complex E664 is more distant from the DNA (Fig. 4A). Substitution of E664 with Lys may aid RTX attachment to the primer strand and thus in compensation for reducing the interactions with the template proximal to the active site. Similarly, the Ne of N735K forms a salt bridge with the template phosphate group at the −8 position in the RTX–DNA complex (Fig. 4A).

Changes in the RTX substrate interface relative to KOD can be summarized as decreasing interactions with the template strand near the replicating base pair in the active site by introducing smaller and less positively charged side chains but enhancing binding to both primer and template strands farther upstream by increasing protein size and positive charge in the Thumb domain.

The Active Site. Two substitutions in the Palm domain, T514I and I521L, occur in the vicinity of the active site. T514 is immediately adjacent to the steric gate (Y409) that prevents incorporation of ribonucleotides (rNTPs) (22, 32). Interestingly, the Ile substitution maintains the interaction with Y409 seen for T514, as Cδ1 of Ile and Cγ2 of Thr are both within van der Waals contact distance (3.3 and 3.6 Å, respectively) of the hydroxyl group of Y409 in the DNA complex structures (Fig. 5A). The increased bulk of the Ile substitution at 514 instead induces a side chain rotamer change in the nearby R518. In RTX, the orientation change in the R518 side chain shifts it toward the template backbone in both DNA and RNA–DNA complexes at the −3 position. One α-helical turn away, with the Leu of I521L in RTX, the side chain branch point is altered to avoid the close contacts with the carbonyl oxygen of R518 (Fig. 5B). The I521L substitution retains its support for the β-turn structure of the catalytic motif DxD but reduces the number of clashes with T541, which is sandwiched by the two most conserved Asp residues (33) (Fig. 5B). As a result, even without occupancy of a templating base or incoming dNTP, the active site of the RTX–RNA–DNA complex is indistinguishable from the fully occupied active site observed in the KOD–DNA–dNTP ternary complex. In contrast, the active site in the apo KOD structure adopts a different conformation (Protein Data Bank [PDB] ID code 1WN7; SI Appendix, Fig. S3). It will be interesting to check whether the T514I and I521L substitutions stabilize the catalytic center and make KOD a more efficient DNA polymerase.

Fig. 3. Structural details of the RTX–RNA–DNA complex. (A) Thumb movement of RTX when complexed with the RNA–DNA hybrid relative to the KOD–DNA binary complex (PDB ID code 4K8Z). The RTX-R structure is shown in light olive except for the red active site residues, and the KOD–DNA complex is shown in light green-gray. The Palm domain and active site of the two are superimposed, but Thumb-1 and Thumb-2 are dramatically different. (B) Superposition of RTX-D and RTX-R confirms the large movement of Thumb-1 and Thumb-2. RTX-D is shown in multiple colors as in Fig. 1C. (C) A zoom-in view of B focusing on restructured Thumb-1 in RTX. (D) Comparison of Thumb-1 between RTX-R and apo KOD (PDB ID code 1WN7) after superimposing the Exo and Palm domains of RTX and KOD. (E) A zoom-in view of B focusing on the Exo and Thumb-2 interface, where it is disordered in the RTX-R structure.

The Finger Stability. K466R and Y493L are in the Finger domain and appear to stabilize the two-helix structure and avoid the close contacts observed in the KOD structure (Fig. 5C). K466R is located on the first α-helix near the tip of the Finger and entirely exposed to solvent. In the apo and binary-complex structures of KOD, K466 is in the vicinity of E475 on the second α-helix of the Finger domain (Fig. 5C); in the ternary-complex structure, K466 moves toward E475 to form a charge–charge interaction. In RTX, the Arg substitution of K466 strengthens the interaction with E475 by forming double salt bridges in the RTX–RNA–DNA complex and thus stabilizes the pair of helices even when the Finger domain is open. The second substitution, Y493L, is in the hydrophobic core formed by the Finger, Exo, and Palm domains when packed against the replicating base pair (Fig. 5D). In all KOD structures Y493 is extremely close to F356, with its hydroxyl group within 3.3 Å of the edge of the benzene ring of F356. Substitution by Leu relieves the close contact with F356 and increases flexibility in the hydrophobic core shared by the three domains (Fig. 5D). These two substitutions in the Finger domain may stabilize RTX while making it more flexible.

Global Flexibility and Domain Interfaces. The remaining five substitutions, F38L, R97M, and K118I in the N domain, M137L in the Exo domain, and F587L in the Palm domain, are distal from the active sites of Pol and Exo (Figs. 2 and 6). F38L and R97M are adjacent to one another in the tertiary structure. The two smaller side chains at these positions result in an enlarged internal cavity in the RTX N domain, which likely increases its flexibility (Fig. 6A). Moreover, R97 is on a positively charged surface that mediates template strand binding, which previously was identified as recognizing (deaminated ) in single-stranded templates before they enter the active site (34, 35). Although 16 Å away from the +1 template position (1 nt downstream from the templating base) in the crystal structures (Fig. 6B), R97 in KOD would contact the template strand, which with natural substrates is much longer than the 14 or 23 nt templates in the crystal structures. F38L and R97M together may thus increase the flexibility of the N domain and reduce template interactions and, consequently, result in accommodation of an RNA template.

As previously mentioned when describing the overall structure of the RTX–DNA complex (Fig. 2 B–D), the K118I, M137L, and F587L substitutions cause subtle but discernible changes between otherwise superimposable KOD and RTX binary complexes with DNA. Because these three substitutions consist of branched hydrophobic side chains with increased stiffness (K118I and M137L), hydrophobic core packing and domain–domain interactions surrounding the substrate-binding interface (Fig. 2 B–D) are affected. K118I belongs to the N domain, and the substitution increases the hydrophobicity and flexibility at the convergence of the N, Exo, and Finger domains that contacts the template strand proximal to the active site (Fig. 2A). The K118I substitution leads to a rotamer change in W355 (Fig. 2B) and is possibly responsible for the 1 Å shift of the N domain (Fig. 2A). In addition, the terminal amine of K118 forms a salt bridge with the Exo domain (D343), which likely rigidifies the domain–domain interaction in KOD. K118I compensates for the loss of this salt bridge by filling the hydrophobic core. On the other hand, M137L is correlated with the opening of an adjacent loop, aa 129 to 135, which may ease the sliding of the Finger domain relative to Exo during the transition between binary and ternary complexes. These substitutions in RTX appear to “lubricate” the N–Exo and Exo–Finger interfaces.

The most dramatic structural change caused by a single amino acid substitution in RTX is that of F587L. F587L occurs in the Palm domain at its interface with Thumb-1. When both are complexed with DNA, the structural differences between RTX and KOD are local. F587L disrupts van der Waals contact between F587 (Palm) and F748 (Thumb) and causes repacking of the Palm–Thumb interface (Fig. 2D). The rest of the Thumb domain, however, is superimposable between KOD and RTX, and RTX maintains a mode of DNA binding similar to KOD (Fig. 2A). In the RTX–RNA–DNA complex, by contrast, Thumb-1 rotates 24° and is completely repacked (Fig. 3 A–C). Interestingly, the Palm–Thumb interface in the RTX–RNA–DNA complex is similar to that in the KOD apo structure (Fig. 3D), suggesting that the F587L substitution stabilizes the “relaxed” Thumb conformation (Movie S3). Concurrently, Thumb-2 (the second half of Thumb and distal to the Palm domain) moves significantly to accommodate the RNA–DNA hybrid in the RTX–RNA–DNA complex (Figs. 1D and 3A). The large changes observed in the Thumb domain are probably the primary means for accommodating an RNA–DNA hybrid at the cost of reduced contact and stability of polymerase–substrate interactions.

Fig. 4. Substitutions at the substrate-binding interface. (A) Stereo diagram of six substitutions in the RTX–DNA–dNTP complex in the linker between Exo and Palm (R381H, Y384H, and V389I) and in the Thumb domain (E664K, G711V, and N735K). The KOD binary complex (PDB ID code 4K8Z) is superimposed in light gray for comparison. RTX is color-coded as in Fig. 1C. (B) A stereoview from the back (relative to A) of the six substitutions in the RTX–RNA–DNA complex. The KOD ternary complex (PDB ID code 5OMF) is superimposed in light pink for comparison. The substitutions are shown as balls and sticks.

Fig. 5. Substitutions near the active site or in the Finger domain. (A) Substitution of T514I in RTX is correlated with the rotamer change in R518 nearby. The active site is marked by the incoming dNTP (dAMPNPP) and two Mg²⁺ ions (shown as purple spheres). (B) I521L in RTX eliminates the clash with R518 and reduces the number of close contacts with T541 in the DxD catalytic motif (D540 and D542) in KOD structures (marked by red double arrowheads and distance). The substitution also causes a main chain shift of the DxD motif in RTX (marked by the curved red arrow). (C) K466R in Finger forms a salt bridge with E475 and may stabilize the solvent-exposed Finger in RTX (colored blue) compared with the KOD binary (light gray) and ternary (light pink) complexes with K466. (D) Y493L in RTX relieves the clashing contact of Y493 with F356 in KOD. Side chains of interest are shown as sticks and balls and colored according to the domain colors of Figs. 1 C and D and 2A.

Discussion

RTX Acquires Changes Both Proximal and Distal to Substrate Binding. Amino acid substitutions designed to engineer reverse transcription activity in a DNA polymerase often focus on residues in the vicinity of the replicating base pair and DNA-binding surface. Interestingly, many of the 16 substitutions in RTX, which were selected by directed evolution of compartmentalized self-replication to copy increasing lengths of RNA (16, 36), are far from substrate and dNTP binding. It is well known that the active sites of RNA and DNA polymerases can be highly similar to the point of being nearly interchangeable (37, 38), and the nascent base pair bound to DNA polymerase is often in A form (3′-endo) and predisposed to be RNA-like (22, 39, 40). DNA polymerases in the A family often possess reverse transcriptase activities (9, 10, 41), and the RT activity can be enhanced by increasing pH and Mn²⁺ concentrations in the reaction buffer (42), which reduces the polymerase fidelity. RTX, however, demonstrates that conversion from a stringent DNA-dependent to a permissive RNA-dependent polymerase in standard Mg²⁺ buffer is aided by nonintuitive changes well outside the polymerase active site.

The distal substitutions, for example, F38L, R97M, K118I, M137L, and F587L, alter domain–domain interactions and result in the opening of the substrate-binding surface. We tend to think the major differences between RNA and DNA are the extra hydroxyl (2′-OH) groups in RNA and the sugar pucker change from 2′-endo to 3′-endo. However, for a DNA polymerase, which usually binds 10 bp (base pair) of a double helix, to become a reverse transcriptase, it must accommodate a wider RNA–DNA hybrid (22 Å in diameter vs. 20 Å for a DNA duplex) before sensing any details of RNA. When we initially compared RTX with the DNA-bound KOD structures (Figs. 2 and 3), the F587L substitution appeared to deform the Thumb domain. However, we find that when compared to the KOD apo structure, F587L does not create a new conformational state but merely changes the equilibrium of the existing states (Movie S3). By stabilizing the apo state of KOD, the F587L substitution allows RTX to keep the Thumb open to accommodate the wider RNA–DNA product during reverse transcription.

Fig. 6. Domain interface and overall flexibility of RTX. (A) A zoom-in view of F38L, R97M, and K118I (N domain) in the RTX-D structure. The KOD–DNA binary complex structure (PDB ID code 4K8Z) is superimposed and shown in semitransparent light gray. (B) The molecular surface of KOD is shown with its charge potential, where blue represents positive and red represents negative charge potential. R97 is part of the positively charged surface that may bind the template strand in KOD (see the zoom-in view with a dark background). R97M and K118I reduce the positive charge of the surface in RTX.

Creating space for RNA binding is also a feature of substitutions along the substrate binding interface. Two substitutions proximal to the active site, R381H and Y384H, result in reduced protein size while maintaining hydrogen bond potential to accommodate the RNA–DNA hybrid (Fig. 4). These substitutions would reduce polymerase–DNA interactions and may thus reduce the DNA polymerase efficiency. Perhaps to compensate for this reduction, three substitutions in the Thumb domain distal to the active site, E664K, G711V, and N735K, gain positive charges and size and appear to stabilize enzyme–substrate interactions. To maintain the stability of the overall structure while the domain interactions become more flexible in RTX, certain substitutions appear to form an additional salt bridge (K466R), increase hydrophobicity and size (K118I and T514I), or alleviate close contacts (I521L and Y493L).

Features Shared by RTX, Reverse Transcriptase, and Telomerase. Compared with the five established families of DNA polymerases (A, B, C, X, and Y), naturally occurring reverse transcriptase (including the catalytic subunit of telomerase) is most closely related to the B family polymerases. In both RT and B family polymerases the Thumb domain occurs at the very C terminus and is directly linked to the Palm domain via a 3-stranded β-sheet (5, 43). In contrast, in A family polymerases, such as Taq Pol, the Thumb domain precedes both the Finger and Palm domains. As a result, the Thumb domain in the A family is less flexible than in the B family (SI Appendix, Fig. S1) (44, 45) but interacts with the upstream template more fluidly with a single α-helical dipole (SI Appendix, Fig. S4A) (46). In contrast, Thumb-2 of the B family interacts with the template strand more substantially, with two α-helical dipoles and positively charged side chains affixed by a β-sheet (SI Appendix, Fig. S4B). As a result, the A family polymerases can naturally accommodate an RNA–DNA duplex and exhibit detectable reverse transcriptase activity (9, 11). Because of the specific structure of Thumb-1 and Thumb-2, B family DNA polymerases are more specific for DNA recognition than the A family enzymes.

Among B family polymerases, eukaryotic Pol α extends an RNA primer opposite a DNA template, while DNA polymerase δ is a strict DNA polymerase (25, 47, 48). We find that both yeast and human Pol α contain a three-residue insertion relative to DNA polymerase δ or KOD in the Thumb domain at the interface with the Palm domain (SI Appendix, Fig. S4C). The insertion is adjacent to the F587L mutation in RTX, and the Thumb domain of Pol α contacts only the primer strand (SI Appendix, Fig. S4D), unlike KOD and DNA polymerase δ, which bind both template and primer.

Being at the C terminus, the Thumb domains in reverse transcriptase and telomerase exhibit significant flexibility and are able to accommodate insertions. For example, the Thumb domain in telomerase, known as C-terminal extension, is much larger than any Thumb found in DNA polymerases and forms a large interface with the RNA–DNA hybrid (49, 50) (SI Appendix, Fig. S5A). Although the Thumb domain in reverse transcriptase is not very large, the interface with the RNA–DNA hybrid is extended by the addition of an RNase H domain, which specifically recognizes RNA–DNA hybrids (1, 51) (SI Appendix, Fig. S5B). Telomerase and naturally occurring reverse transcriptase demonstrate the same trend as we observe with RTX, that contacts of the upstream RNA–DNA hybrid could greatly enhance the RT activity. As the active site does not strongly discriminate against an RNA template, the polymerase activity and processivity largely depend on how well the enzyme binds the upstream RNA–DNA hybrid. If an RNA–DNA hybrid binding domain (52) were appended to the C terminus of RTX, we suspect that the RT activity would be further enhanced.

Materials and Methods

The RTX protein was prepared as described previously (16). Crystallographic and structural analyses were carried out according to established protocols. Details are given in SI Appendix.

Data Availability. The structure coordinates and structure factors (SI Appendix, Table S2) have been deposited in the Protein Data Bank (PDB ID codes 6WYA and 6WYB).

ACKNOWLEDGMENTS. We thank Drs. R. Craigie, M. Gellert, and D. J. Leahy for critical reading of the manuscript. This research was supported by National Institute of Diabetes and Digestive and Kidney Disease Intramural Grants DK036144 and DK036146 (to W.Y.), the Welch Foundation (Grant F-1654), and NIH Grant 1R01EB027202-01A0 (to A.D.E.).

1. A. Telesnitsky, S. P. Goff, "Reverse transcriptase and the generation of retroviral DNA" in Retroviruses, J. M. Coffin, S. H. Hughes, H. E. Varmus, Eds. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1997).
2. J. Nandakumar, T. R. Cech, Finding the end: Recruitment of telomerase to telomeres. Nat. Rev. Mol. Cell Biol. 14, 69–82 (2013).
3. L. Menéndez-Arias, A. Sebastián-Martín, M. Álvarez, Viral reverse transcriptases. Virus Res. 234, 153–176 (2017).
4. L. J. Reha-Krantz, DNA polymerase proofreading: Multiple roles maintain genome stability. Biochim. Biophys. Acta 1804, 1049–1063 (2010).
5. W. Yang, Y. Gao, Translesion and repair DNA polymerases: Diverse structure and mechanism. Annu. Rev. Biochem. 87, 239–261 (2018).
6. C. Peano, M. Severgnini, I. Cifola, G. De Bellis, C. Battaglia, Transcriptome amplification methods in gene expression profiling. Expert Rev. Mol. Diagn. 6, 465–480 (2006).
7. Y. Sasagawa, T. Hayashi, I. Nikaido, Strategies for converting RNA to amplifiable cDNA for single-cell RNA sequencing methods. Adv. Exp. Med. Biol. 1129, 1–17 (2019).
8. H. Zhu, Z. Fohlerová, J. Pekárek, E. Basova, P. Neuzil, Recent advances in lab-on-a-chip technologies for viral diagnosis. Biosens. Bioelectron. 153, 112041 (2020).
9. M. D. Jones, N. S. Foulkes, Reverse transcription of mRNA by Thermus aquaticus DNA polymerase. Nucleic Acids Res. 17, 8387–8388 (1989).
10. M. J. Moser et al., Thermostable DNA polymerase from a viral metagenome is a potent RT-PCR enzyme. PLoS One 7, e38371 (2012).
11. K. B. Sauter, A. Marx, Evolving thermostable reverse transcriptase activity in a DNA polymerase scaffold. Angew. Chem. Int. Ed. Engl. 45, 7633–7635 (2006).
12. R. Kranaster et al., One-step RNA pathogen detection with reverse transcriptase activity of a mutated thermostable Thermus aquaticus DNA polymerase. Biotechnol. J. 5, 224–231 (2010).
13. G. Raghunathan, A. Marx, Identification of Thermus aquaticus DNA polymerase variants with increased mismatch discrimination and reverse transcriptase activity from a smart enzyme mutant library. Sci. Rep. 9, 590 (2019).
14. S. Ishino, Y. Ishino, DNA polymerases as useful reagents for biotechnology–The history of developmental research in the field. Front. Microbiol. 5, 465 (2014).
15. J. Aschenbrenner, A. Marx, DNA polymerases and biotechnological applications. Curr. Opin. Biotechnol. 48, 187–195 (2017).
16. J. W. Ellefson et al., Synthetic evolutionary origin of a proofreading reverse transcriptase. Science 352, 1590–1593 (2016).
17. M. Takagi et al., Characterization of DNA polymerase from Pyrococcus sp. strain KOD1 and its application to PCR. Appl. Environ. Microbiol. 63, 4504–4510 (1997).
18. H. Atomi, T. Fukui, T. Kanai, M. Morikawa, T. Imanaka, Description of Thermococcus kodakaraensis sp. nov., a well studied hyperthermophilic archaeon previously reported as Pyrococcus sp. KOD1. Archaea 1, 263–267 (2004).
19. K. Terpe, Overview of thermostable DNA polymerases for classical PCR applications: From molecular and biochemical fundamentals to commercial systems. Appl. Microbiol. Biotechnol. 97, 10243–10254 (2013).
20. S. G. Kerr, K. S. Anderson, Pre-steady-state kinetic characterization of wild type and 3′-azido-3′-deoxythymidine (AZT) resistant human immunodeficiency virus type 1 reverse transcriptase: Implication of RNA directed DNA polymerization in the mechanism of AZT resistance. Biochemistry 36, 14064–14070 (1997).
21. J. Wang et al., Crystal structure of a pol alpha family replication DNA polymerase from bacteriophage RB69. Cell 89, 1087–1099 (1997).
22. M. C. Franklin, J. Wang, T. A. Steitz, Structure of the replicating complex of a pol alpha family DNA polymerase. Cell 105, 657–667 (2001).
23. A. J. Berman et al., Structures of phi29 DNA polymerase complexed with substrate: The mechanism of translocation in B-family polymerases. EMBO J. 26, 3494–3505 (2007).
24. F. Wang, W. Yang, Structural insight into translesion synthesis by DNA Pol II. Cell 139, 1279–1289 (2009).
25. R. L. Perera et al., Mechanism for priming DNA synthesis by yeast DNA polymerase α. eLife 2, e00482 (2013).
26. M. Hogg et al., Structural basis for processive DNA synthesis by yeast DNA polymerase ε. Nat. Struct. Mol. Biol. 21, 49–55 (2014).
27. R. Jain et al., Cryo-EM structure and dynamics of eukaryotic DNA polymerase δ holoenzyme. Nat. Struct. Mol. Biol. 26, 955–962 (2019).
28. M. Hogg, P. Aller, W. Konigsberg, S. S. Wallace, S. Doublié, Structural and biochemical investigation of the role in proofreading of a beta hairpin loop found in the exonuclease domain of a replicative DNA polymerase of the B family. J. Biol. Chem. 282, 1432–1444 (2007).
29. H. Hashimoto et al., Crystal structure of DNA polymerase from hyperthermophilic archaeon Pyrococcus kodakaraensis KOD1. J. Mol. Biol. 306, 469–477 (2001).
30. K. Bergen, K. Betz, W. Welte, K. Diederichs, A. Marx, Structures of KOD and 9°N DNA polymerases complexed with primer template duplex. ChemBioChem 14, 1058–1062 (2013).
31. H. M. Kropp, K. Betz, J. Wirth, K. Diederichs, A. Marx, Crystal structures of ternary complexes of archaeal B-family DNA polymerases. PLoS One 12, e0188005 (2017).
32. M. Astatke, K. Ng, N. D. Grindley, C. M. Joyce, A single side chain prevents Escherichia coli DNA polymerase I (Klenow fragment) from incorporating ribonucleotides. Proc. Natl. Acad. Sci. U.S.A. 95, 3402–3407 (1998).
33. W. C. Copeland, T. S. Wang, Mutational analysis of the human DNA polymerase alpha. The most conserved region in alpha-like DNA polymerases is involved in metal-specific catalysis. J. Biol. Chem. 268, 11028–11040 (1993).
34. M. A. Greagg et al., A read-ahead function in archaeal DNA polymerases detects promutagenic template-strand uracil. Proc. Natl. Acad. Sci. U.S.A. 96, 9045–9050 (1999).
35. M. J. Fogg, L. H. Pearl, B. A. Connolly, Structural basis for uracil recognition by archaeal family B DNA polymerases. Nat. Struct. Biol. 9, 922–927 (2002).
36. F. J. Ghadessy, J. L. Ong, P. Holliger, Directed evolution of polymerase function by compartmentalized self-replication. Proc. Natl. Acad. Sci. U.S.A. 98, 4552–4557 (2001).
37. S. Doublié, S. Tabor, A. M. Long, C. C. Richardson, T. Ellenberger, Crystal structure of a bacteriophage T7 DNA replication complex at 2.2 Å resolution. Nature 391, 251–258 (1998).
38. D. Jeruzalmi, T. A. Steitz, Structure of T7 RNA polymerase complexed to the transcriptional inhibitor T7 lysozyme. EMBO J. 17, 4101–4113 (1998).
39. S. Doublié, M. R. Sawaya, T. Ellenberger, An open and closed case for all polymerases. Structure 7, R31–R35 (1999).
40. H. Ling, F. Boudsocq, R. Woodgate, W. Yang, Crystal structure of a Y-family DNA polymerase in action: A mechanism for error-prone and lesion-bypass replication. Cell 107, 91–102 (2001).
41. T. W. Myers, D. H. Gelfand, Reverse transcription and DNA amplification by a Thermus thermophilus DNA polymerase. Biochemistry 30, 7661–7666 (1991).
42. V. I. Grabko, L. G. Chistyakova, V. N. Lyapustin, V. G. Korobko, A. I. Miroshnikov, Reverse transcription, amplification and sequencing of poliovirus RNA by Taq DNA polymerase. FEBS Lett. 387, 189–192 (1996).
43. W. Yang, Y. S. Lee, A DNA-hairpin model for repeat-addition processivity in telomere synthesis. Nat. Struct. Mol. Biol. 22, 844–847 (2015).
44. Y. Kim et al., Crystal structure of Thermus aquaticus DNA polymerase. Nature 376, 612–616 (1995).
45. Y. Li, V. Mitaxov, G. Waksman, Structure-based design of Taq DNA polymerases with improved properties of dideoxynucleotide incorporation. Proc. Natl. Acad. Sci. U.S.A. 96, 9491–9496 (1999).
46. W. Wang, H. W. Hellinga, L. S. Beese, Structural evidence for the rare tautomer hypothesis of spontaneous mutagenesis. Proc. Natl. Acad. Sci. U.S.A. 108, 17644–17648 (2011).
47. M. K. Swan, R. E. Johnson, L. Prakash, S. Prakash, A. K. Aggarwal, Structural basis of high-fidelity DNA synthesis by yeast DNA polymerase delta. Nat. Struct. Mol. Biol. 16, 979–986 (2009).
48. A. G. Baranovskiy et al., Mechanism of concerted RNA-DNA primer synthesis by the human primosome. J. Biol. Chem. 291, 10006–10020 (2016).
49. J. Lingner et al., Reverse transcriptase motifs in the catalytic subunit of telomerase. Science 276, 561–567 (1997).
50. M. Mitchell, A. Gillis, M. Futahashi, H. Fujiwara, E. Skordalakes, Structural basis for telomerase catalytic subunit TERT binding to RNA template and telomeric DNA. Nat. Struct. Mol. Biol. 17, 513–518 (2010).
51. L. Tian, M. S. Kim, H. Li, J. Wang, W. Yang, Structure of HIV-1 reverse transcriptase cleaving RNA in an RNA/DNA hybrid. Proc. Natl. Acad. Sci. U.S.A. 115, 507–512 (2018).
52. M. Nowotny et al., Specific recognition of RNA/DNA hybrid and enhancement of human RNase H1 activity by HBD. EMBO J. 27, 1172–1181 (2008).

Choi et al., www.pnas.org/cgi/doi/10.1073/pnas.2009415117
