X-ray crystal structures elucidate the nucleotidyl transfer reaction of transcript initiation using two

Michael L. Gleghorn (a,1), Elena K. Davydova (b,2), Ritwika Basu (a), Lucia B. Rothman-Denes (b,3), and Katsuhiko S. Murakami (a,3)

(a) Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA 16802; and (b) Department of Molecular Genetics and Cell Biology, University of Chicago, Chicago, IL 60637

Edited* by E. Peter Geiduschek, University of California at San Diego, La Jolla, CA, and approved December 30, 2010 (received for review November 6, 2010)

We have determined the X-ray crystal structures of the pre- and postcatalytic forms of the initiation complex of bacteriophage N4 RNA polymerase that provide the complete set of atomic images depicting the process of transcript initiation by a single-subunit RNA polymerase. As observed during T7 RNA polymerase transcript elongation, substrate loading for the initiation process also drives a conformational change of the O helix, but only the correct base pairing between the +2 substrate and DNA base is able to complete the O-helix conformational transition. Substrate binding also facilitates catalytic metal binding that leads to alignment of the reactive groups of substrates for the nucleotidyl transfer reaction. Although all polymerases use two divalent metals for catalysis, they differ in the requirements and the timing of binding of each metal. In the case of bacteriophage RNA polymerase, we propose that catalytic metal binding is the last step before the nucleotidyl transfer reaction.

DNA-dependent RNA polymerases (RNAPs) transcribe DNA genetic information into RNA and play a central role in gene expression. RNAP catalyzes a nucleotidyl transfer reaction, which is initiated by the nucleophilic attack of an O3′ oxyanion at the RNA 3′ terminus to the α-phosphate (αP) of the incoming nucleotide, resulting in phosphodiester bond formation and release of pyrophosphate (PPi). Both single-subunit T7 phage-like RNAPs and the multisubunit cellular RNAPs possess two nucleotide-binding sites for loading the RNA 3′ end (P site) and the incoming NTP (N site) (1, 2). A two metal-ion catalytic mechanism has been proposed, as the enzyme possesses two divalent catalytic and nucleotide-binding metal cations chelated by two or three conserved Asp residues (3). The catalytic metal is a Lewis acid, coordinating the RNA 3′-OH lowering its pKa and facilitating the formation of the attacking oxyanion. The nucleotide-binding metal is coordinated by the triphosphate of the incoming nucleotide and stabilizes a pentacovalent phosphate intermediate during the reaction. Both metal ions are proposed to have octahedral coordination at physiological Mg2+ concentrations (4).

During transcript elongation, RNAP carries out the loading of a single nucleotide substrate at the N site followed by a nucleotidyl transfer reaction with the RNA 3′ end at the P site; this cycle is repeated as elongation proceeds. X-ray crystal structures of the single-subunit T7 phage RNAP (2, 5) have depicted the process of transcript elongation in detail and reveal a conformational change of the Fingers subdomain during substrate loading to the active site as also observed in the A family of DNA polymerases (DNAPs) (6, 7).

Initiation is the only step in the entire transcription process where two nucleotide substrates are loaded at the active site followed by a nucleotidyl transfer reaction. Compared with elongation, the process of initiation has not been well characterized by X-ray crystallography. An X-ray crystal structure of T7 RNAP initiation complex was reported (8), but it was captured by using a substrate analog 3′-deoxyGTP (Fig. S1B). This analog lacks the essential O3′ required for nucleotidyl transfer and catalytic metal coordination resulting in the absence of the catalytic metal ion in the structure and misalignment of the reactive groups of substrate. In the present study, we have used X-ray crystallography and a natural substrate plus a proper substrate analog to capture a set of atomic resolution snapshots, from binding to nucleotidyl transfer reaction, (Fig. 1A and Table S1) to elucidate a complete picture of the process of transcript initiation by the central domain of N4 phage virion-encapsulated RNAP (mini-vRNAP).

Results

Design of the X-Ray Crystallographic Experiment to Monitor the Formations of Transcript Initiation Complexes. Previously, we reported the X-ray crystal structure of the binary complex (BC) of promoter DNA and N4 mini-vRNAP (9), which is a member of the T7-like single-subunit RNAP family (10) that recognizes a specific DNA hairpin sequence with a 5-bp stem, 3-nt loop as its promoter (Fig. 1B) (11–13). In the BC structure, from −1 to +2 template DNA bases point toward the nucleotide entry pore, whereas the +3 template DNA base is flipped in the opposite direction providing an opportunity to analyze the structural transitions of DNA template bases at the +1 and +2 positions and of the enzyme upon nucleotide loading.

The structures reported in this study represent the precatalytic [substrate complex I (SCI); substrate complex II (SCII); mismatch complex (MC)] and postcatalytic [product complex (PC)] stages of transcript initiation (Fig. 1A). Each complex comprises the 120 kDa N4 mini-vRNAP and a 36-nt DNA, which includes the P2 promoter 7 bp stem, stable and well-ordered 3-nt loop hairpin followed by five bases of single-stranded DNA including the start site (+1) (Fig. 1B). Promoter and template DNA regions to +3 ∼ 4 were well resolved in the crystal structures, but were completely disordered downstream. The P2_7a DNA sequence of the transcription start site is CC at positions +1 and +2, to form Watson–Crick base pairs with two molecules of GTP upon nucleotide loading, followed by a nucleotidyl transfer reaction to produce a 2-mer RNA (5′-pppGpG-3′) and a leaving PPi. There are two molecules in the asymmetric unit and, in the cases of the SCI, SCII, and MC, both molecules are quasi-identical. In

Author contributions: L.B.R.-D. and K.S.M. designed research; M.L.G., E.K.D., R.B., and K.S.M. performed research; M.L.G., E.K.D., R.B., L.B.R.-D., and K.S.M. analyzed data; and M.L.G., L.B.R.-D., and K.S.M. wrote the paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB ID codes 3Q22 for substrate complex I, 3Q23 for substrate complex II, 3Q0A for the mismatch complex, and 3Q24 for the product complex).

1 Present address: Department of Biochemistry and Biophysics, School of Medicine and Dentistry, University of Rochester, Rochester, NY 14642.

2 Present address: Department of Chemistry, University of Chicago, Chicago, IL 60637.

3 To whom correspondence may be addressed. E-mail: [email protected] or lbrd@uchicago.edu.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1016691108/-/DCSupplemental.
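Because the atomic coordinates are deposited, geometric quantities around the active site, such as the separation between the attacking O3′ and the αP of the incoming nucleotide or an angle centered on a metal ion, can be re-measured from atom positions. The following is a minimal Python sketch of that arithmetic only. The function names and the three coordinate triples are illustrative assumptions; the coordinates are made up and are not taken from any deposited entry.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points (same unit as the input, e.g. angstroms)."""
    return math.dist(a, b)


def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees, e.g. an O3'-metal-alphaP angle with the metal as vertex."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(b, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


# Hypothetical coordinates in angstroms (assumed for illustration, NOT from a PDB entry).
o3_prime = (0.0, 0.0, 0.0)
metal_a = (2.0, 0.0, 0.0)
alpha_p = (2.0, 3.0, 0.0)

print(f"O3'-alphaP distance: {distance(o3_prime, alpha_p):.2f} A")
print(f"O3'-metal-alphaP angle: {angle_deg(o3_prime, metal_a, alpha_p):.1f} deg")
```

In practice the three points would be read from the ATOM/HETATM records of a coordinate file; that parsing step is left out here on purpose so the sketch stays independent of any particular library.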

www.pnas.org/cgi/doi/10.1073/pnas.1016691108 PNAS Early Edition ∣ 1of6 Downloaded by guest on September 29, 2021

the case of PC, we observed a clear electron density map corresponding to the 2-mer RNA from only one complex (molecule A); the other complex (molecule B) contained weak and discontinued map for the 2-mer RNA, suggesting that the 2-mer RNA was partially dissociated from molecule B. We used molecule A for the representative structure of transcript initiation.

All complexes were prepared by soaking GTP or its nonhydrolysable analog, guanosine-5′-[(α,β)-methyleno] triphosphate (GMPCPP) (Fig. S1 A and C), and divalent cations, Mg2+ or Mn2+, to the preformed BC crystals. The overall structure of the BC (9) and the initiation complexes determined in this study resemble a canonical “cupped right hand”; the enzyme active site is located at the bottom of this cup (Fig. 1C) and does not interact with any neighboring molecules in the crystals. Indeed, we observed N4 mini-vRNAP-catalyzed RNA synthesis in crystallo that produced the 2-mer RNA product (described below), indicating that the enzyme was active and able to perform any required conformational changes in crystallo.

Fig. 1. The structure of the initiation complex. (A) A schematic representation of the sequential processes during initiation of transcription. The BC comprising RNAP and promoter DNA is depicted as “E,” and catalytic and nucleotide-binding metal ions are shown as MeA and MeB, respectively. (B) Sequences and secondary structures of the two DNA constructs used for crystallization. Regions highlighted by the gray boxes were disordered in the crystal structures. Nucleotide-binding sites (+1 and +2) for transcript initiation are colored in red. (C) Overall structure of the SCII. N4 mini-vRNAP is depicted as a molecular surface model. The N-terminal domain, subdomains, and motifs are labeled. The β-intercalating hairpin, Plug, Thumb, and N-terminal two-thirds of Fingers have been removed from this view for clarity, and only their outlines are shown. The promoter DNA and O helix of the Fingers are depicted by a pink tube and blue ribbon, respectively. (D–F) Electron density maps showing nucleotides, 2-mer RNA, pyrophosphate, and metal ions found in the three initiation complexes. Fo − Fc electron density maps (black net) superimposed on the final models (sticks and spheres) of the SCI (D), SCII (E), and PC (F). These maps were calculated using the native amplitudes and the phase derived from the BC. Template DNA is depicted and labeled. The metal-chelating D559 and D951 are shown as stick models. Divalent metals and waters are depicted by yellow and cyan spheres, respectively.

Each initiation complex structure was determined by rigid body and restrained refinements by using the N4 mini-vRNAP BC (9) as an initial model. After refinements with the BC models against the structure factors from the initiation complex crystals, we observed clear unbiased Fo − Fc electron densities around the active site, which corresponded to nucleotides and metals in the precatalytic complexes and a product 2-mer RNA plus PPi in the postcatalytic complex (Fig. 1 D–F). Compared to the binary complex, the backbone structures of the initiation complexes are almost identical (0.40 ∼ 0.65 Å rmsd) except for distinct deviations in the part of Fingers (residues 657–770, 1.5 ∼ 3.5 Å rmsd) including the O helix (residues 666–678) and DNA bases from −1 to +2 (Figs. 2 and 3, and Movie S1).

Fig. 2. Structures of active site, DNA, and nucleotides during transcript initiation. The main chains (ribbon models) of motifs A and C (red) and of the O helix (blue), and the main and side chains (stick models) involved in nucleotide and metal binding in the BC (A), SCI (B), SCII (C), and PC (D). NTP binding P and N sites are indicated as green and magenta circles in A. DNA template (from −1 to +2, pink) and nucleotides at +1 (green) and +2 (magenta) positions are shown as stick models. Divalent metals (Mg2+ or Mn2+) are depicted by yellow spheres. Hydrogen bonds and salt bridges are depicted by black dashed lines. Amino acid residues discussed in the text are labeled.

Structure of Substrate Complex I: Presence of Two Metals at the Active Site Is Essential for Catalysis. Precatalytic SCI (Figs. 1 A and D and 2B) was prepared by soaking 5 mM GTP and 10 mM MgCl2 into the BC crystals. SCI contained two molecules of GTP that base pair with DNA bases +1 and +2, and one Mg2+ ion as the nucleotide-binding metal. Mg2+ octahedrally coordinated with ligands that include three atoms of the nonbridging triphosphate oxygens of GTP(+2), two carboxylates of the conserved Asp residues (D559 and D951), and the main-chain carboxyl group of G560 in the metal-binding motifs A and C that are common to the T7-like single-subunit RNAP family. The binding of the two GTP molecules and the Mg2+ to the BC triggers several conformational changes of DNA, the O helix of the Fingers and side-chain residues of motifs A and C in the Palm (Fig. 3A and Movie S1). Y678 at the O helix C terminus moves 4.3 Å to open the GTP(+2) binding pocket and hydrogen bonds with the 2′-OH of GTP(+2). This movement is linked to a conformational change of the O helix, which swings approximately

8° away from the active site. DNA template bases from −1 to +2 change their positions (−1: 1.4 Å, +1: 3.0 Å, +2: 2.1 Å) to bind GTPs at the P and N sites. The metal-coordinating carboxylates D559 (motif A) and D951 (motif C) rotate their side chains to chelate the nucleotide-binding metal. The triphosphate of GTP(+2) forms extensive interactions with Y612, R666, and K670 in the Fingers and their side chains move their positions upon GTP(+2) binding. In our structural analysis, we identified a conformational change of RNAP that is important for eliciting a proper environment for phosphodiester bond formation (Figs. 2 and 3A, and Movie S1). Although the conformational change of the O helix has been well characterized during substrate loading and catalysis of transcript elongation, our observation clearly proves that the single-subunit RNAP indeed changes its conformation during the initiation process.

To investigate whether correct vs. incorrect base pairing between nucleotide and DNA template is critical for the O-helix conformational change, we determined the structure of an MC prepared by using BC crystals with P2_7c DNA (Fig. 1B) which were soaked with GTP plus MgCl2. In this setup, only a Watson–Crick base pair forms at the +1 position but the +2 position has a G (substrate)-T (DNA template) mismatch. In the MC structure, there is a clear density map corresponding to +1 GTP that forms a Watson–Crick base pair with the +1C DNA base; however, the +2 substrate binding site shows only a subtle density (Fig. S2) likely reflecting the formation of an unstable mismatch between GTP and +2 base of DNA template. The mismatch at +2 position is still able to trigger the conformational change of the O helix and Y678, but their positions deviate from the ones observed in the SCI structure (Fig. 3F), indicating that only the correct base pairing between +2 substrate and DNA base is able to complete the O-helix conformational transition. A partially changed O-helix conformation may function as an intermediate kinetic checkpoint for substrate discrimination and also relate to a unique species, between the open and closed O-helix conformations, found in the mismatch complex of DNAP I by single-molecule FRET analysis (14).

In the SCI, there was no electron density corresponding to the catalytic metal (Fig. 1D), possibly due to the presence of citric acid (0.11 M) in the crystallization solution, which forms a stable metal–ligand complex resulting in a decreased concentration of free Mg2+. The stabilization constant (log10 K, K = [ML]/([M][L]); metal, M; ligand, L) between Mg2+ and citric acid is 2.8, which is larger than that between Mg2+ and aspartic acid (2.43) (15). The significantly larger stability constant of Mg2+ and (4.0) allowed the coordinated nucleotide-binding metal in the SCI to be retained. Other examples of Mg2+ chelation by citric acid in crystal structures, which prevented Mg2+ binding to the catalytic metal site, have been reported [e.g., DNAP λ (16) and CCA adding polymerase (17)]. SCI possesses all of the components required for catalysis except for the catalytic metal; its absence prevents the nucleotidyl transfer reaction even in the presence of reactive GTPs at the active site. The result indicates that substrate loading drives the conformational change of the O helix, although it is not sufficient for nucleotidyl transfer, and that the presence of both metal ions at the active site is essential for catalysis (3).

Fig. 3. Structural transitions of the active site, DNA, and nucleotides associated with transcript initiation. Superposition of the BC and SCI structures (A), SCI and SCII structures (B), and SCII and PC structures (C) showing the conformational changes induced by nucleotide and metal binding and the nucleotidyl transfer reaction. BC, SCI, SCII, and PC are colored in black, yellow, green, and orange, respectively. The O helix is depicted as a ribbon model. DNA template (from −1 to +2, pink), nucleotides, and amino acid side chains involved in nucleotide and metal bindings are shown as stick models and labeled. Divalent metals (Mg2+ or Mn2+) are depicted by spheres, and catalytic and nucleotide metals are indicated as “A” and “B,” respectively. Hydrogen bonds and salt bridges are depicted by yellow (in SCI), green (in SCII), and orange (in PC) dashed lines. Close-up views of reactive groups, O3′(+1) and αP(+2), of SCI and SCII structures (D) and SCII and PC structures (E). In D, the distance between O3′(+1) and αP(+2) is reduced upon the catalytic metal binding. In E, the [O3′(+1)-catalytic metal-αP(+2)] angle is changed from 84° to 49° by phosphodiester bond formation. (F) Superposition of the BC, SCI, and MC structures showing the partial conformational change of O helix found in the MC. BC, SCI, and MC are colored in black, yellow, and pink, respectively. This view is the same as in A.

The GTP(+1) and GTP(+2) Binding Sites. In the case of transcript initiation, a single nucleotide has to be positioned at the P site prior to the first nucleotidyl transfer reaction, and a single base pair with the template DNA is most likely not sufficient for GTP(+1) binding. The SCI structure revealed extensive interactions between the GTP(+1) triphosphate and two basic residues, K437 and R440, in the Palm core (Fig. 2B). These interactions, which are unique to initiation, because only at this stage is a nucleoside triphosphate loaded at this position, may compensate for the weaker binding of GTP(+1). Accordingly, K437A- and R440A-substituted enzymes had lower affinities for the initiating nucleotide (Km = 200 μM and 100 μM for K437A and R440A, respectively, vs. 50 μM for the wild-type enzyme) and significantly reduced in vitro transcription activities compared with the wild-type enzyme in the presence of NTP at low concentration (4 μM); higher NTP concentration (500 μM) partially restored the activity of the R440A enzyme but not of the K437A enzyme (Fig. 4A). These results suggest that K437 and R440 play a role in nucleotide binding for transcript initiation and K437 plays a more important role than R440. To ascertain the site of transcript initiation by the mutant enzymes, we cross-linked the hydroxybenzaldehyde ester of GTP to the enzyme; addition of [α-32P]ATP led to phosphodiester bond formation in a template-directed manner and enzyme autolabeling (10). Catalytic autolabeling of the mutant enzymes at high-NTP concentration confirmed that initiation occurred at position +1 (Fig. 4B).

The BC structure revealed that residue R318 in the N-terminal domain forms a cation-π interaction with DNA base −2 and salt bridges with the phosphate backbone that induce a DNA kink between bases −2 and −1. During substrate loading, the −1 DNA base changes its position to partially stack with GTP(+1) in the initiation complexes (Fig. 2 A and B). The stacking of purine bases between −1 DNA base and GTP(+1) may facilitate GTP(+1) loading at the active site. This combination, a purine at position −1 on the template strand and at position +1 on the nontemplate strand, is also found in the majority of Escherichia coli σ70-dependent promoters (18), which is consistent with the hypothesis that the −1 template base plays a similar role in initial NTP binding by the bacterial RNAPs.

GTP(+2) is located at the N site (Fig. 2B) and has a unique base-specific hydrogen bond between the keto group of guanosine and the N671 side chain, which is positioned in the middle

of O helix. This interaction stabilizes the binding of GTP(+2) because the N671A enzyme had decreased affinity for the second nucleotide (Km = 120 μM for the N671A enzyme vs. 50 μM for the wild-type enzyme) and reduced activity in the presence of low NTP concentration (4 μM), which was restored to approximately 80% at high-NTP concentration (500 μM) with initiation at +1 (Fig. 4A).

The interaction between an amino acid residue at the middle of the O helix and nucleotide +2 might be universal for transcript initiation in the T7-like single-subunit RNAP family. These enzymes contain amino acids with longer side chain with a hydrophilic moiety (Arg, Lys, Asn, or Gln) at this position, which are capable of making base-specific interactions (Fig. S3, Upper). T7 RNAP has a strong sequence preference for GTP at positions +1 and +2 and has Arg at this position, which is capable of making base-specific contact with the 6-keto and/or the 7-imino groups of GTP at position +2. Mitochondrial RNAP, which has a strong sequence preference for ATP or GTP at +2, possesses Gln at this position, which is able to be a hydrogen donor and acceptor at this position. The bifunctional character of Gln may allow this side chain to establish a hydrogen bond with the 7-amine group of ATP and 6-keto group of GTP. This extra interaction between the +2 NTP base and amino acid side chain of the O helix enhances formation of the first phosphodiester bond. However, it may decrease the fidelity of nucleotide selection during transcript elongation by increasing the affinity of the incorrect NTP at active site. Accordingly, the A-family DNAPs, which require a preexisting primer for catalysis, contain a relatively short side-chain residue at this position (Fig. S3, Lower). Furthermore, a substitution of T664 with Arg in Thermus aquaticus DNAP I reduced its specific activity about threefold and increased the mutation frequency about 25-fold (19), suggesting that DNAPs have most likely eliminated the interaction between the amino acid residue at the middle of the O helix and dNTP at the N site in order to enhance DNA replication fidelity.

The ring of GTP(+1) is in the C3′-endo conformation and its O3′(+1) is in line with αP and the leaving bridging oxygen between αP and βP of GTP(+2) (Fig. 3D). However, the distance between O3′(+1) and αP(+2) is 4.1 Å, which is distinctly longer than distances (3.3 ∼ 3.7 Å) reported from other precatalytic forms of polymerase structures, including T7 RNAP in the elongation complex (2) and X-family DNAPs (16, 20). This configuration distance indicates that the geometry of the reactive groups, O3′(+1) and αP(+2), in the SCI may not be competent for catalysis and suggests that catalytic metal binding at the site will realign these groups for phosphodiester bond formation (4, 21).

Fig. 4. Role of K437, R440, E557, and N671 residues in initiation of transcription by the mini-vRNAP. (A) Effect of K437, R440, and N671 substitutions on mini-vRNAP runoff transcription at increasing NTP concentrations. (B) Effect of Alanine substitutions at K437, R440, and N671 on selection of the site of transcript initiation. Catalytic autolabeling was performed on templates with increasing numbers (n) of As between the promoter hairpin and CTA with increasing concentrations of the hydroxybenzaldehyde derivative of GTP (bGTP). A wild-type vRNAP promoter contains 4As and initiates transcription 11 nt from the center of the hairpin at C. (C) Effect of Mg2+ concentration on runoff transcription by E557A-mutant mini-vRNAP. (D) Effect of E557A substitution on selection of the transcript initiation site. Catalytic autolabeling was performed as described in B at 1-mM bGTP.

Structure of Substrate Complex II: Loading the Catalytic Metal to the Active Site Induces Conformation Changes of the Enzyme Active Site and Nucleotide +1. To load the catalytic metal but prevent phosphodiester bond formation, we soaked 20 mM MnCl2 and 5 mM of GMPCPP into the preformed BC crystals. The stability constant of the Mn2+-aspartate complex (log10 K = 3.74) is higher than its citric acid counterpart (log10 K = 2.8) (15) allowing Mn2+ binding at both sites. Mn2+ has octahedral coordination with almost identical metal-donor distances as those observed with Mg2+ (22). In addition, both Mg2+ and Mn2+ can activate catalysis in vitro by N4 vRNAP (Fig. S4) and other members of this type of polymerase including T7 RNAP (23) and E. coli DNAP I (24). The structure was determined at 1.8-Å resolution with clear unbiased Fo − Fc electron densities around the active site, corresponding to two molecules of GMPCPP and two Mn2+ ions (Fig. 1E). We termed this precatalytic complex SCII (Figs. 1A and 2C).

Loading the catalytic metal into the active site aligned the reactive groups of substrates and the catalytically essential carboxylates for the nucleotidyl transfer reaction: (i) The O3′(+1) moved in the direction of αP(+2) and the distance between the two groups decreased from 4.1 to 3.1 Å (Fig. 3D); and (ii) the D559 side chain also moved 1.8 Å to chelate both catalytic and nucleotide-binding metals (Fig. 3B), with the metals separated by 3.6 Å. The catalytic metal binding induced an unexpected conformational transition of the triphosphate moiety of nucleotide +1. Compared to the SCI structure, the γ-phosphate (γP) group of GMPCPP(+1) in SCII moved 5.9 Å toward the catalytic metal, thus becoming one of six ligands that coordinate the catalytic metal (Fig. S5A). This drastic motion disrupted the interaction between R440 and the triphosphate, and established a new interaction between the γP group (+1) and E557. To allow this interaction, a nonbridging oxygen associated with γP(+1) is most likely protonated (pKa value for secondary phosphate ionization in unbound nucleotide triphosphate is approximately 7.6) (25). The relevance of this interaction was supported by the behavior of the E557A-mutant enzyme, with decreased runoff transcription activity at 2 mM Mg2+ and some recovery at 10 mM Mg2+ concentration, without a change in the site of initiation (Fig. 4 C and D).

To assess the role of the γP group (+1) in transcript initiation, we determined the kinetic parameters for GTP and GDP incorporation at the RNA 5′ end (Table 1). N4 mini-vRNAP had a fourfold higher affinity (50 μM) and a threefold higher kcat (300 min−1) for GTP than for GDP (200 μM and 100 min−1) at physiological (1 mM) Mg2+ concentration. As a control, we used T7 RNAP, which does not interact with the 5′ phosphate of the initiating nucleotide (8, 26). Accordingly, the kinetic parameters were identical when T7 RNAP initiated with GTP or GDP (200 μM and 20 min−1). Two charged residues, K437 and E557, are involved in positioning of the triphosphate group (+1) in contact with the catalytic metal. The functional roles of K437 and E557 are supported by an analysis of the kinetic parameters of GTP and GDP incorporation by the K437A and E557A enzymes. Both mutant enzymes show similar affinities (200 μM) for GTP and GDP as the initiating nucleotide. Notably, the affinity of the K437A and E557A enzymes for the initiating nucleotide is similar to that of T7 RNAP (Table 1), whose active site is superimposable with that of N4 vRNAP (27); however, T7

RNAP lacks residues equivalent to N4 vRNAP K437 and E557. The 15-fold decrease in kcat upon replacement of E557 by Ala highlights the relevance of E557 in catalysis of the N4 vRNAP.

Table 1. Summary of N4 mini-vRNAP and T7 RNAP kinetic parameters for GTP and GDP

              Initiation nucleotide: GTP                     Initiation nucleotide: GDP                     Catalytic efficiency
Enzyme        Km, μM   kcat, min−1   kcat/Km, min−1 μM−1     Km, μM   kcat, min−1   kcat/Km, min−1 μM−1     GTP/GDP
Wild-type*    50       300           6.00                    200      100           0.50                    12.00
K437A*        200      70            0.35                    300      70            0.23                    1.52
E557A*        200      20            0.10                    200      20            0.10                    1.00
T7 RNAP       200      20            0.10                    200      20            0.10                    1.00

Conditions as described in SI Experimental Procedures.
*N4 mini-vRNAP.

Structure of the Product Complex: Phosphodiester Bond Formation Releases Both Metals from Their Binding Sites. The final step of transcript initiation is the nucleotidyl transfer reaction yielding a 2-mer RNA and PPi (Fig. 1A). To understand the structural basis of this chemical reaction, we attempted to carry out the nucleotidyl transfer reaction in crystallo and determine its structure. We found a unique electron density map at the active site in BC crystals with P2_7a DNA soaked with 0.5 mM GTP and 10 mM MgCl2 (Fig. 1F). There was a GTP-like structure at position +1 and a nucleotide at +2 with a single phosphate group. The distance between O3′(+1) and αP(+2) was 1.6 Å, indicating that this density map corresponds to the 2-mer RNA (5′-pppGpG-3′) product. The PPi product, coordinated by residues K670, R666, and Y612, was also observed.

The 5′ and 3′ ends of RNA in the PC were found at the P and N sites, respectively (Fig. 2D), indicating that the PC was in a pretranslocation state (2). The template DNA did not change its position and the enzyme maintained the O-helix closed conformation with the Y678 side chain in the same position as observed in the SCI and SCII (Fig. 3C and Movie S1). In the PC, weak electron densities were present at its metal-binding sites (Fig. 1F); however, the coordination distances of these densities were longer (density at catalytic metal site, 3.1 Å; density at nucleotide-binding metal site, 3.5 Å) than the expected distance for Mg2+ (2.1 ∼ 2.3 Å), and their coordination spheres lacked the octahedral geometry (22). We therefore assigned these densities as isoelectronic water, indicating that both catalytic and nucleotide-binding Mg2+ ions had dissociated after the nucleotidyl transfer reaction. The release of metal ions did not shift the 2-mer RNA to the posttranslocated position or release PPi, but triggered conformational changes of the catalytic carboxylates (D559 and D951); their positions were approximately the same as found in the BC, indicating that these residues form the catalytically relevant conformation only in the presence of metals (Fig. 3C). In addition, metal release moved the triphosphate of nucleotide +1 to a position nearly identical to that observed in SCI. The product PPi remained associated with the O helix through interactions with Y612, R666, and K670, but these residues changed their positions to those found in the BC.

Discussion

Transcript Initiation by Single-Subunit T7 Phage-Like RNAPs. We have determined the high-resolution X-ray crystal structures of three distinct forms of transcript initiation complexes during the formation of 2-mer RNA, which revealed the formation of two intermediates, SCI and SCII, prior to the nucleotidyl transfer reaction. In SCI, we observed the conformational change of the O helix upon binding of nucleotides +1 and +2 and nucleotide-binding metal (compare BC and SCI, Fig. 3A); nonetheless, the reactive groups, O3′(+1) and αP(+2), do not possess the catalytically competent configuration. Binding of the catalytic metal results in alignment of the substrates’ reactive groups to allow the reaction to proceed (compare SCI and SCII, Fig. 3 B and D). In the PC, the catalytic metal is released after phosphodiester bond formation, indicating that the catalytic metal coordinating O3′(+1) and nonbridging oxygen of αP(+2) in the 2-mer RNA cannot maintain octahedral coordination geometry (compare SCII and PC, Fig. 3 C and E). In other words, binding of the catalytic metal is sensitive to positions of these ligands that can be easily influenced by correct vs. incorrect base pairing between the nucleotide and DNA template base. Therefore, a small difference in binding energy from correct vs. incorrect Watson–Crick base pairing is able to be converted into a large difference in catalytic efficiency; reactive groups are in an inactive configuration and the 3′ oxyanion cannot be produced in the absence of the catalytic metal, whereas they are properly aligned to generate the 3′ oxyanion for catalysis in the presence of the catalytic metal. Based on these facts, we propose that binding of the catalytic metal at the active site is the last step in the formation of the catalytically competent transcription complex and that catalytic-metal-dependent substrate alignment is the most critical checkpoint for fidelity of nucleotide incorporation by single-subunit RNAPs, and possibly by the A-family of DNAPs.

The published structure of the T7 RNAP transcript initiation complex (8) poses several problems: (i) Distances between the nucleotide-binding metal and its ligands are significantly greater in this structure (average 4.3 Å) than in the elongation complex (average 2.7 Å) (2); (ii) although Y639 has been shown to discriminate NTP against dNTP at the N site for both transcript initiation and elongation (28), Y639 does not contact the 2′-OH of GTP(+2) in the initiation complex structure; (iii) although the interaction between H784 and the 2-amino group of GTP(+1) was shown to play a role in transcription start site selection (29), the H784 side chain contacts GTP(+2) in the structure; and (iv) no motion of RNAP or of the template DNA strand was observed during substrate loading at the active site (8). Therefore, we suspect that the proposed mechanism of transcript initiation based on this T7 RNAP structure requires reevaluation. Furthermore, the X-ray crystal structure of the T7 RNAP initiation complex (8) identified a unique nucleotide-binding site that the authors termed the D site (de novo site), which is distinct from the P site used for transcript elongation. In order to determine whether the N4 vRNAP SCI possesses a D site for GTP(+1) binding, we superposed the N4 SCI with the T7 RNAP initiation (8) and elongation complexes (2) by overlaying their Palm cores including the T/DxxGR motif and motifs A and C (Fig. S6 A and B). Both GTPs at positions +1 and +2 in the N4 SCI overlaid well with the P and N sites of the T7 elongation complex, but not with the GTP binding sites found in the T7 initiation complex, indicating that N4 vRNAP does not use a D site for GTP(+1) binding.

Transcript Initiation by Cellular RNAPs. All organisms have multisubunit RNAPs that carry out primer-independent transcript initiation. Crystallographic studies of cellular RNAPs have revealed insights into the mechanism of transcript elongation (1, 30). However, due to their larger size and complexity of preparation, X-ray crystal structures capturing transcript initiation with cellular RNAP have been elusive. In order to obtain structural insights

into the process of transcript initiation of cellular RNAPs, we compared the coordination geometry of the catalytic metal ion in the N4 mini-vRNAP SCII structure with the γP group (+1) involved in catalytic-metal coordination (Fig. S5A) and the Thermus thermophilus RNAP elongation complex structure (1), as a representative cellular RNAP because it is the highest resolution structure determined to date (Fig. S5B). The cellular RNAP uses three carboxylates in the absolutely conserved 739-DFDGD-743 motif in the largest subunit (region D) (31). The first Asp residue in the DFDGD motif is involved in both catalytic- and nucleotide-metal coordination, whereas the second and third Asp residues coordinate only the catalytic metal. The second Asp residue in cellular RNAP localizes to the same position as the one occupied by the γP group (+1) of the N4 RNAP SCII; however, a direct comparison of the two structures does not take into account the absence of the triphosphate moiety at the P site nucleotide in the T. thermophilus RNAP elongation complex structure. The sixfold decrease in Vmax observed for 2-mer RNA synthesis when ATP is substituted by ADP as the initiating nucleotide in E. coli RNAP transcription from the λ Pr promoter (32) might reflect the role of the γP group (+1) in transcript initiation by cellular RNAPs. Whether the second Asp residue plays a role in interacting with the γP group (+1) in bacterial RNAPs awaits the determination of the structure of their initiation complexes.

In the case of single-subunit RNAP, binding of the catalytic metal requires the presence of template DNA and substrates at the active site, which is in contrast to the multisubunit RNAPs from Bacteria (31), Archaea (33), and Eukaryote (34), which coordinate the catalytic metal at the active site even in the absence of DNA template or substrates. The difference between single-subunit and cellular RNAPs may reflect the fact that two and three carboxylates are involved in coordinating the catalytic metal at the active site in single- and multisubunit enzymes, respectively (Fig. S5). This structural difference may explain the fact that the single-subunit enzyme carries out only RNA synthesis, whereas the multisubunit enzyme is capable of both RNA synthesis and RNA cleavage reactions, which play an important role in transcriptional proofreading and releasing arrested enzyme (35).

Experimental Procedures

Detailed protocols of (i) N4 mini-vRNAP and DNA purifications, (ii) crystallization of binary complexes, (iii) preparing transcript initiation complexes, (iv) X-ray data collections and structure determinations, (v) site-directed mutagenesis of N4 mini-vRNAP, (vi) runoff transcription and catalytic autolabeling, and (vii) transcript initiation assay and kinetics of first phosphodiester bond formation are described in SI Experimental Procedures.

Note. Recently, using fluorescence-based assays and stopped flow kinetics, Bermek et al. proposed a reaction pathway for E. coli DNAP I where the O-helix conformational change and the catalytic metal binding occur at early and late stages of the reaction, respectively (36). These stages coincide with those we have defined based on the in crystallo reaction.

ACKNOWLEDGMENTS. We thank the staff at X25 of the National Synchrotron Light Source, F1 of the Macromolecular Diffraction Facility at Cornell High Energy Synchrotron Source (MacCHESS), and H. Yennawar for supporting crystallographic data collection. We thank P.C. Bevilacqua, C.E. Cameron, P.R. Carey, Y. Chen, and R. Yajima for discussion, and S.J. Benkovic and T. Ellenberger for comments. We thank W. Ross and R.L. Gourse for critical reading of the manuscript. Figures were prepared using PyMOL (http://pymol.sourceforge.net/). This work was supported by National Institutes of Health (NIH) Grants AI12575 and GM071897. The Cornell High Energy Synchrotron Source is supported by the National Science Foundation (NSF) and NIH/National Institute of General Medical Sciences via NSF award DMR-0225180, and the MacCHESS resource is supported by NIH/National Center for Research Resources award RR-01646.
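
Illustrative note. The distance criterion used in Results to assign the weak densities at the metal-binding sites of the PC can be sketched in a few lines of Python. This is a minimal sketch and not part of the original analysis: the function name, the 0.2 Å tolerance, and the dictionary layout are assumptions made for illustration. Only the expected Mg2+ coordination range (2.1–2.3 Å) and the two observed distances (3.1 and 3.5 Å) are taken from the text. The check covers distance only and ignores coordination geometry, which the assignment also relied on.

```python
# Expected Mg2+ coordination distance range quoted in the text (in angstroms).
MG_EXPECTED_RANGE = (2.1, 2.3)


def consistent_with_mg(distance, tolerance=0.2):
    """Return True if a coordination distance (in angstroms) falls within the
    expected Mg2+ range, widened by a tolerance.

    The tolerance is an assumption of this sketch, not a value from the paper.
    """
    low, high = MG_EXPECTED_RANGE
    return (low - tolerance) <= distance <= (high + tolerance)


# Coordination distances reported for the densities in the product complex (PC).
observed = {
    "catalytic metal site": 3.1,
    "nucleotide-binding metal site": 3.5,
}

# Label each site by whether its distance is compatible with bound Mg2+.
assignments = {
    site: "compatible with Mg2+" if consistent_with_mg(d) else "too long for Mg2+"
    for site, d in observed.items()
}

for site, label in assignments.items():
    print(f"{site}: {observed[site]} Å -> {label}")
```

Under these assumptions, both PC densities fall outside the Mg2+ window, in line with their assignment as water in the text.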

1. Vassylyev DG, et al. (2007) Structural basis for substrate loading in bacterial RNA polymerase. Nature 448:163–168.
2. Yin YW, Steitz TA (2004) The structural mechanism of translocation and helicase activity in T7 RNA polymerase. Cell 116:393–404.
3. Steitz TA, Steitz JA (1993) A general two-metal-ion mechanism for catalytic RNA. Proc Natl Acad Sci USA 90:6498–6502.
4. Yang W, Lee JY, Nowotny M (2006) Making and breaking nucleic acids: Two-Mg2+-ion catalysis and substrate specificity. Mol Cell 22:5–13.
5. Temiakov D, et al. (2004) Structural basis for substrate selection by T7 RNA polymerase. Cell 116:381–391.
6. Doublie S, Tabor S, Long AM, Richardson CC, Ellenberger T (1998) Crystal structure of a bacteriophage T7 DNA replication complex at 2.2 Å resolution. Nature 391:251–258.
7. Johnson SJ, Taylor JS, Beese LS (2003) Processive DNA synthesis observed in a polymerase crystal suggests a mechanism for the prevention of frameshift mutations. Proc Natl Acad Sci USA 100:3895–3900.
8. Kennedy WP, Momand JR, Yin YW (2007) Mechanism for de novo RNA synthesis and initiating nucleotide specificity by T7 RNA polymerase. J Mol Biol 370:256–268.
9. Gleghorn ML, Davydova EK, Rothman-Denes LB, Murakami KS (2008) Structural basis for DNA-hairpin promoter recognition by the bacteriophage N4 virion RNA polymerase. Mol Cell 32:707–717.
10. Kazmierczak KM, Davydova EK, Mustaev AA, Rothman-Denes LB (2002) The phage N4 virion RNA polymerase catalytic domain is related to single-subunit RNA polymerases. EMBO J 21:5815–5823.
11. Davydova EK, Santangelo TJ, Rothman-Denes LB (2007) Bacteriophage N4 virion RNA polymerase interaction with its promoter DNA hairpin. Proc Natl Acad Sci USA 104:7033–7038.
12. Glucksmann MA, Markiewicz P, Malone C, Rothman-Denes LB (1992) Specific sequences and a hairpin structure in the template strand are required for N4 virion RNA polymerase promoter recognition. Cell 70:491–500.
13. Haynes LL, Rothman-Denes LB (1985) N4 virion RNA polymerase sites of transcription initiation. Cell 41:597–605.
14. Santoso Y, et al. (2009) Conformational transitions in DNA polymerase I revealed by single-molecule FRET. Proc Natl Acad Sci USA 107:715–720.
15. Furia TE (1972) Sequestrants in foods. CRC Handbook of Food Additives, ed TE Furia (CRC, Boca Raton, FL), 2nd Ed, Vol 1, pp 271–294.
16. Garcia-Diaz M, Bebenek K, Krahn JM, Pedersen LC, Kunkel TA (2007) Role of the catalytic metal during polymerization by DNA polymerase lambda. DNA Repair 6:1333–1340.
17. Tomita K, Ishitani R, Fukai S, Nureki O (2006) Complete crystallographic analysis of the dynamics of CCA sequence addition. Nature 443:956–960.
18. Shultzaberger RK, Chen Z, Lewis KA, Schneider TD (2007) Anatomy of Escherichia coli sigma70 promoters. Nucleic Acids Res 35:771–788.
19. Suzuki M, Avicola AK, Hood L, Loeb LA (1997) Low fidelity mutants in the O-helix of Thermus aquaticus DNA polymerase I. J Biol Chem 272:11228–11235.
20. Batra VK, et al. (2006) Magnesium-induced assembly of a complete DNA polymerase catalytic complex. Structure 14:757–766.
21. Yang W, Woodgate R (2007) What a difference a decade makes: Insights into translesion DNA synthesis. Proc Natl Acad Sci USA 104:15591–15598.
22. Harding M (2001) Geometry of metal-ligand interactions in proteins. Acta Crystallogr, Sect D: Biol Crystallogr 57:401–411.
23. Woody AY, Eaton SS, Osumi-Davis PA, Woody RW (1996) Asp537 and Asp812 in bacteriophage T7 RNA polymerase as metal ion-binding sites studied by EPR, flow-dialysis, and transcription. Biochemistry 35:144–152.
24. Burgers PM, Eckstein F (1979) A study of the mechanism of DNA polymerase I from Escherichia coli with diastereomeric phosphorothioate analogs of deoxyadenosine triphosphate. J Biol Chem 254:6889–6893.
25. Saenger W (1983) Principles of Nucleic Acid Structure (Springer, New York), pp 108–110.
26. Martin CT, Coleman JE (1989) T7 RNA polymerase does not interact with the 5′-phosphate of the initiating nucleotide. Biochemistry 28:2760–2762.
27. Murakami KS, Davydova EK, Rothman-Denes LB (2008) X-ray crystal structure of the polymerase domain of the bacteriophage N4 virion RNA polymerase. Proc Natl Acad Sci USA 105:5046–5051.
28. Huang Y, Beaudry A, McSwiggen J, Sousa R (1997) Determinants of ribose specificity in RNA polymerization: Effects of Mn2+ and deoxynucleoside monophosphate incorporation into transcripts. Biochemistry 36:13718–13728.
29. Brieba LG, Padilla R, Sousa R (2002) Role of T7 RNA polymerase His784 in start site selection and initial transcription. Biochemistry 41:5144–5149.
30. Wang D, Bushnell DA, Westover KD, Kaplan CD, Kornberg RD (2006) Structural basis of transcription: Role of the trigger loop in substrate specificity and catalysis. Cell 127:941–954.
31. Zhang G, et al. (1999) Crystal structure of Thermus aquaticus core RNA polymerase at 3.3 Å resolution. Cell 98:811–824.
32. McClure WR, Cech CL, Johnston DE (1978) A steady state assay for the RNA polymerase initiation reaction. J Biol Chem 253:8941–8948.
33. Hirata A, Klein BJ, Murakami KS (2008) The X-ray crystal structure of RNA polymerase from Archaea. Nature 451:851–854.
34. Cramer P, et al. (2000) Architecture of RNA polymerase II and implications for the transcription mechanism. Science 288:640–649.
35. Conaway RC, Kong SE, Conaway JW (2003) TFIIS and GreB: Two like-minded transcription elongation factors with sticky fingers. Cell 114:272–274.
36. Bermek O, Grindley ND, Joyce CM (2011) Distinct roles of the active-site Mg2+ ligands, Asp882 and Asp705, of DNA polymerase I (Klenow fragment) during the prechemistry conformational transitions. J Biol Chem 286:3755–3766.
