Structure and function of the polymerase core of TRAMP, a RNA surveillance complex

Stephanie Hamill^a, Sandra L. Wolin^a,b,1, and Karin M. Reinisch^a,1

^aDepartments of Cell Biology and ^bMolecular Biophysics and Biochemistry, Yale University School of Medicine, 333 Cedar Street, New Haven, CT 06520

Edited* by Stephen C. Harrison, Harvard Medical School, Boston, MA, and approved July 13, 2010 (received for review March 16, 2010)

The Trf4p/Air2p/Mtr4p (TRAMP) complex recognizes aberrant RNAs in Saccharomyces cerevisiae and targets them for degradation. A TRAMP subcomplex consisting of a noncanonical poly(A) RNA polymerase in the Pol β superfamily of nucleotidyl transferases, Trf4p, and a zinc knuckle protein, Air2p, mediates initial substrate recognition. Trf4p and related eukaryotic poly(A) and poly(U) polymerases differ from other characterized enzymes in the Pol β superfamily both in sequence and in the lack of recognizable nucleic acid binding motifs. Here we report, at 2.7-Å resolution, the structure of Trf4p in complex with a fragment of Air2p comprising two zinc knuckle motifs. Trf4p consists of a catalytic and central domain similar in fold to those of other noncanonical Pol β RNA polymerases, and the two zinc knuckle motifs of Air2p interact with the Trf4p central domain. The interaction surface on Trf4p is highly conserved across eukaryotes, providing evidence that the Trf4p/Air2p complex is conserved in higher eukaryotes as well as in yeast and that the TRAMP complex may also function in RNA surveillance in higher eukaryotes. We show that Air2p, and in particular sequences encompassing a zinc knuckle motif near its N terminus, modulate Trf4p activity, and we present data supporting a role for this zinc knuckle in RNA binding. Finally, we show that the RNA 3′ end plays a role in substrate recognition.

protein-RNA interactions | RNA quality control

Author contributions: S.H., S.L.W., and K.M.R. designed research; S.H. performed research; S.H., S.L.W., and K.M.R. analyzed data; and S.H., S.L.W., and K.M.R. wrote the paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB ID code 3NYB).

^1To whom correspondence may be addressed. E-mail: [email protected] or Karin.[email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1003505107/-/DCSupplemental.

The vast majority of transcripts from eukaryotic genomes do not encode proteins. These transcripts include precursors to noncoding RNAs, such as transfer and ribosomal RNAs, small nuclear and nucleolar RNAs, microRNAs, siRNAs, and piRNAs. These noncoding RNAs often fold into intricate three-dimensional structures that are critical for subsequent processing and for assembling with proteins to form ribonucleoprotein complexes. In addition, genomic sequencing experiments in both yeast and mammals have revealed the existence of a large number of short-lived noncoding transcripts that initiate near RNA polymerase II promoters. Finally, cells contain a variety of long noncoding RNAs, some of which function to regulate gene expression and chromatin structure (1, 2).

Because defective RNAs can arise by synthesis from mutant genes, transcriptional errors, or aberrant processing events, and because some products of pervasive genome transcription may be harmful, cells have evolved surveillance mechanisms to recognize aberrant and unneeded noncoding RNAs and target them for degradation by exoribonucleases. Because all exoribonucleases require a single-stranded end to initiate decay, the initial recognition of target RNAs is often carried out by polymerases that, by adding extra nucleotides to the 3′ ends, recruit the decay machinery. Because many exonucleases are unable to degrade structured RNAs, this machinery frequently includes RNA helicases (3).

One of the best established noncoding RNA degradation pathways in Saccharomyces cerevisiae involves the Trf4p/Air2p/Mtr4p polyadenylation (TRAMP) complex and the nuclear exosome, a major 3′ → 5′ exonuclease (3, 4). In vivo, TRAMP and the exosome are involved in the degradation of a large variety of RNA substrates, including hypomodified and unspliced pre-tRNAs, truncated 5S rRNA and signal recognition particle RNAs, aberrant rRNA processing intermediates, and a large class of bidirectional transcripts that initiate near RNA polymerase II promoters. The exosome is conserved in higher eukaryotes, and higher eukaryotes have homologs for each component of TRAMP (3, 4), suggesting that the TRAMP-exosome pathway may be widely conserved. TRAMP consists of the poly(A) polymerase Trf4p or its close homolog Trf5p (65% sequence identity), the zinc knuckle protein Air2p or its close homolog Air1p (45% sequence identity), and Mtr4p, a member of the DExH/D box RNA helicase superfamily 2. The Trf4p/Air2p subcomplex polyadenylates the 3′ ends of aberrant noncoding RNAs, providing a single-stranded “landing pad” that allows the exosome to initiate RNA decay (3, 4).

A key question in noncoding RNA quality control is how RNAs are recognized as aberrant. In the TRAMP-exosome pathway, substrate recognition is mediated by the Trf4p/Air2p subcomplex (5–8), and in vitro, this subcomplex preferentially polyadenylates an unmodified form of tRNAiMet over the fully modified version (6, 9). An understanding of TRAMP substrate recognition will derive from a better understanding of Trf4p/Air2p and its interactions with RNAs.

Trf4p/5p belongs to a family of ribonucleotide transferases in the Pol β superfamily of nucleotidyl transferases (10, 11). It consists of N- and C-terminal sequences that have little predicted secondary structure, a catalytic domain similar in sequence to several Pol β members with known structures, and an adjacent “central domain,” which shares a short nucleotide recognition motif hx(I/L/V)(E/Q)(E/D/N)PhxxxxNxx (h, hydrophobic; x, any residue) with so-called noncanonical Pol β RNA polymerases. Trf4p/5p differs from many structurally characterized ribonucleotidyl transferases in lacking a recognizable RNA binding domain, and it has been proposed that the Air proteins function in RNA binding (3, 4). In both Air1p and Air2p, five adjacent CCHC zinc knuckles are inserted between N- and C-terminal sequences predicted to lack secondary structure. Cx1-2Cx3-6Hx7-10C-type zinc knuckles (C, cysteine; H, histidine; x, any amino acid) and their interactions with RNAs have been studied in the context of the retroviral nucleocapsid proteins (12). In these proteins, residues x1-2 and x8-10 are typically involved in interactions with single-stranded, looped regions of RNA, and linker regions between knuckles may bind RNA duplex regions. It has not been established, however, whether the Air proteins or their zinc knuckles are involved in RNA binding, or whether these interactions resemble those of the nucleocapsid zinc knuckles.
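The two sequence patterns quoted in the Introduction, the nucleotide recognition motif hx(I/L/V)(E/Q)(E/D/N)PhxxxxNxx and the Cx1-2Cx3-6Hx7-10C zinc knuckle, can be written as regular expressions. The following is a minimal sketch rather than the procedure used in this work. The set of residues treated as hydrophobic, the helper `find_motifs`, and the toy sequence are all assumptions made for illustration, and the toy sequence is not taken from Trf4p or Air2p.

```python
import re

# Assumption: the text does not say which residues count as
# "hydrophobic" (h); this set is one common choice.
HYDROPHOBIC = "AVILMFWY"

# hx(I/L/V)(E/Q)(E/D/N)PhxxxxNxx, with x = any residue
NUCLEOTIDE_RECOGNITION_MOTIF = re.compile(
    rf"[{HYDROPHOBIC}].[ILV][EQ][EDN]P[{HYDROPHOBIC}].{{4}}N.{{2}}"
)

# Cx1-2Cx3-6Hx7-10C: C, two residues, C, four residues, H, four residues, C
ZINC_KNUCKLE = re.compile(r"C.{2}C.{4}H.{4}C")


def find_motifs(pattern, sequence):
    """Return (1-based start position, matched text) for each match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


# Hypothetical toy sequence built to contain one instance of each pattern.
toy_sequence = "MSK" + "CFNCGKEGHIARDC" + "GGS" + "LGIEDPFKSGRNVA" + "KK"

knuckles = find_motifs(ZINC_KNUCKLE, toy_sequence)
motifs = find_motifs(NUCLEOTIDE_RECOGNITION_MOTIF, toy_sequence)
print(knuckles)
print(motifs)
```

A pattern match of this kind only flags candidate sites. Whether a matched segment actually binds zinc or nucleotide has to be established experimentally or from a structure.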

www.pnas.org/cgi/doi/10.1073/pnas.1003505107 | PNAS Early Edition

Here we investigate the architecture and function of the Trf4p/Air2p heterodimer. We show that Air2p and its most N-terminal zinc knuckle are important for the polyadenylation activity of Trf4p, and show data suggesting that this zinc knuckle interacts with tRNA substrates. We have also determined, at a resolution of 2.7 Å, the structure of a Trf4p/Air2p subcomplex consisting of the Trf4p catalytic and central domains and a segment of Air2p that includes the fourth and fifth zinc knuckles. The fold of Trf4p is similar to that of other noncanonical Pol β family enzymes despite insignificant sequence similarity in the central domain. The structure shows that the fourth and fifth zinc knuckles of Air2p interact with the central domain of Trf4p. The Trf4p surface that they bind is highly conserved in a wide range of eukaryotes, which provides important experimental support for conservation of the Trf4p/Air2p complex in higher eukaryotes.

Results and Discussion

Characterization of Trf4p/Air2p Core Complexes. Due to significant sample degradation during purification (Fig. S1), we were not able to isolate a complex consisting of full-length forms of Trf4p and Air2p. We therefore worked with Trf4p/Air2p subcomplexes, where the proteolytically sensitive N and C termini of both proteins were removed. One subcomplex, Trf4p/Air2pZK1-5, consisted of the catalytic and central domains of Trf4p (residues 161–481) and the five zinc knuckles of Air2p (residues 58–198); a second version, Trf4p/Air2pZK4-5, included only the fourth and fifth zinc knuckles of Air2p (residues 119–198). We coexpressed Trf4p and Air2p constructs in Escherichia coli, and purified subcomplexes by affinity chromatography and gel filtration.

The truncated complexes were tested for their polyadenylation activity and also for the ability to distinguish between aberrant and correct forms of tRNAs. In these experiments, we used wild-type tRNAAla and a mutant that was identified as a preferred substrate for the intact TRAMP complex purified from yeast (6). Both truncated forms of Trf4p/Air2p are active in polyadenylation assays (Fig. 1), though less active than the full-length complexes used in other studies (6, 7, 9). The nature of the enhancement by the N and C termini is not known and will be the subject of future experiments.

In examining substrate preference, we found that both truncated complexes polyadenylate the aberrant form of tRNAAla more extensively than the wild-type tRNA, as judged by both an increased poly(A) tail length and the fraction of the input substrate that underwent polyadenylation (Fig. 1 A and B). The differences are smaller for Trf4p/Air2pZK4-5, but they are reproducible and statistically significant. The Trf4p/Air2pZK1-5 complex was additionally tested with wild-type and aberrant versions of tRNAPhe, where the structure was severely disrupted by multiple mutations in the D-, T-, or anticodon stems of the tRNA (Fig. S2). Structure prediction with MFOLD (13) suggests that, except in the acceptor stem, even the secondary structures of these mutant RNAs are different from the native. We found that again the mutants were more extensively polyadenylated than the wild-type RNA (Fig. S2). The finding that Trf4p/Air2pZK1-5 and Trf4p/Air2pZK4-5 differentiate between wild-type and mutant tRNAs suggests that the truncated complexes retain at least some of the sequences necessary for RNA recognition.

Fig. 1. The Air2p zinc knuckles modulate the activity of Trf4p. (A) Representative gels show a time course for polyadenylation, where 2 pmol Trf4p/Air2p complex is reacted with 40 fmol of wild-type (lanes 1–6, 13–18) or mutant (lanes 7–12, 19–24) tRNAAla. Time courses are for Trf4p/Air2pZK1-5 (ZK1-5, lanes 1–12) and Trf4p/Air2pZK4-5 (ZK4-5, lanes 13–24). (B) Plot showing the fraction of tRNAAla adenylated at given time points. The data are from three separate experiments. SEM is indicated. (C) Mutant tRNAAla was reacted with Trf4p/Air2pZK1-5, Trf4p/Air2pZK4-5, or Trf4p/Air2pZK1-5 but with the three most N-terminal knuckles individually replaced by hexaserine linkers (ZK1mut, ZK2mut, ZK3mut). (D) Quantitation of three experiments as in C. SEM indicated. (E) A5 oligonucleotide is polyadenylated comparably by all Trf4p/Air2p complexes used in C and D.

N-Terminal Zinc Knuckle of Air2p Modulates Trf4p Activity on Some Substrates. Notably, Trf4p/Air2pZK1-5 polyadenylates both aberrant and wild-type tRNAAla more extensively than Trf4p/Air2pZK4-5 (Fig. 1 A–D). Trf4p/Air2pZK1-5 not only polyadenylates a larger fraction of the input tRNAs than Trf4p/Air2pZK4-5, but also adds longer poly(A) tails (Fig. 1 A and C). These results demonstrate that Air2p, and specifically sequences that include its three most N-terminal zinc knuckles, enhances Trf4p activity on larger RNA substrates. They are consistent with the notion that these zinc knuckles are involved in RNA binding but do not exclude an alternative or additional role in catalysis. To further assess the possibility that the zinc knuckles are involved in RNA binding, we used identical assay conditions but with a short RNA oligonucleotide (A5) as the substrate. The finding that the oligonucleotide is polyadenylated comparably by both complexes (Fig. 1E), and thus that the N-terminal zinc knuckle sequences in Air2p affect activity only for some substrates, argues that the N-terminal zinc knuckles of Air2p play a role in RNA binding.

To delineate the relative roles of the three N-terminal-most zinc knuckles in enhancing Trf4p activity, we deleted each of the first three zinc knuckles in Trf4p/Air2pZK1-5 individually and replaced each with a hexaserine linker. The linker acts as a spacer, allowing us to delete individual zinc knuckles while not shortening the distance between the remaining ones. The mutated complexes behaved similarly to Trf4p/Air2pZK1-5 during purification. All three mutant complexes polyadenylated the short A5 oligonucleotide to similar extents as Trf4p/Air2pZK1-5 (Fig. 1E), and the complexes in which the second or third zinc knuckle was altered also polyadenylated mutant tRNAAla comparably (Fig. 1 C and D). The second and third zinc knuckles thus appear to be dispensable for polyadenylation of aberrant tRNAAla. In contrast, tRNAAla polyadenylation was reduced when the first zinc knuckle of Air2p was mutated (Fig. 1 C and D). That the first zinc knuckle enhances the polyadenylation of aberrant tRNAAla but not the A5 oligonucleotide supports a role for this zinc knuckle in binding large RNA substrates. The second and third zinc knuckles could be involved in binding other RNA substrates, or they might participate in interactions with Trf4p or the Mtr4p helicase.

Similar mutagenesis experiments to assess whether the fourth and fifth zinc knuckles of Air2p also modulate Trf4p activity were not feasible because we were unable to isolate soluble Trf4p/Air2pZK1-5 when these zinc knuckles were mutated. As shown below, the fourth and fifth zinc knuckles of Air2p are involved in interactions with Trf4p, and the mutations likely disrupt complex formation between Trf4p and Air2p.
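The polyadenylated fractions plotted in Fig. 1 B and D are described as coming from three experiments with SEM indicated, and Methods states that each fraction compares the counts in the polyadenylated portion of a lane with the total counts in the lane. That calculation can be sketched as follows. The counts are hypothetical, the function names are invented for illustration, and the use of the sample standard deviation divided by the square root of n for the SEM is an assumption, because the text does not state the formula.

```python
from math import sqrt
from statistics import mean, stdev


def polyadenylated_fraction(polyadenylated_counts, total_counts):
    """Fraction of a lane's signal found in the polyadenylated portion."""
    if total_counts <= 0:
        raise ValueError("total counts must be positive")
    return polyadenylated_counts / total_counts


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    if len(values) < 2:
        raise ValueError("need at least two replicates for an SEM")
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical (polyadenylated, total) counts for one substrate and time
# point in three independent experiments; not data from this study.
replicates = [(300.0, 1000.0), (350.0, 1000.0), (400.0, 1000.0)]

fractions = [polyadenylated_fraction(p, t) for p, t in replicates]
average, sem = mean_and_sem(fractions)
print(f"fraction polyadenylated = {average:.3f} +/- {sem:.3f} (SEM, n=3)")
```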

Structure of the Trf4p/Air2p Core. Constructs of Trf4p and Air2p were coexpressed in E. coli and purified as described above. For Trf4p, we used an active-site mutant, where the third aspartate in the polymerase catalytic triad was altered to alanine (D293A), as this improved protein yields. [Trf4p activity may be harmful for E. coli, where polyadenylation also targets RNAs for degradation (14), resulting in larger yields for the inactive enzyme.] We were unable to obtain crystals with Trf4p/Air2pZK1-5, possibly because the region containing the first three zinc knuckles is conformationally heterogeneous and interferes with crystallization. Crystals of the Trf4p/Air2pZK4-5 complex belong to space group P321 and diffract to 2.7-Å resolution. The structure was solved by single-wavelength anomalous dispersion phasing using the anomalous signal from zinc atoms bound by Air2p. The final model includes residues 161–481 of Trf4p and residues 122–146 and 160–198 of Air2p, two zinc atoms, and 58 water molecules. Residues 147–159 of Air2p, a portion of the peptide linker between the fourth and fifth zinc knuckles, are disordered and were not modeled. Data collection and refinement statistics are in Table S1.

Trf4p adopts folds similar to structurally characterized Pol β superfamily polymerases in both the catalytic and central domains, despite undetectable sequence similarity in the latter. The catalytic domain fold, common to all known Pol β enzymes, is a five-stranded antiparallel beta sheet (strands S1–S5) flanked by two long alpha helices (H2, H3; Fig. 2) and comprises residues 190–315 in Trf4p. The central domain of Trf4p includes residues 161–189 and 316–481. It has an alpha helical core (helices H1, H4–H6, H8–H9; Fig. 2) with a three-stranded antiparallel beta sheet (S6–S8) inserted between helices H8 and H9. The nucleotide recognition motif corresponds to residues 421–433 (Fig. 2 and Fig. S3), including residues in strand S8 and C-terminal to it. The fold of the central domain is shared by other noncanonical Pol β RNA polymerases such as CCA-adding enzyme from Archaeoglobus fulgidus (15) (Fig. 2C) and the terminal uridyltransferases RET2 and TUT4 from Trypanosoma brucei (16, 17). Trf4p differs from these RNA polymerases most notably near the Air2p binding surface, in a longer helix H9 and in the length and conformation of the connector between helices H6 and H8 (residues 357–372 in Trf4p). Whereas the connector is only two residues in the archaeal CCA-adding enzyme, it consists of a short helix H7 and an adjacent loop in Trf4p. The connector in the terminal polyuridine transferases is long but differs significantly in both conformation and orientation from the Trf4p connector.

Fig. 2. Structure of a Trf4p/Air2p subcomplex at 2.7-Å resolution. (A) Schematic of Trf4p and Air2p, indicating regions included in the crystallization construct. Air2p zinc knuckles are boxed. (B) Ribbons diagram with the catalytic and central domains of Trf4p in cyan and blue, respectively. Orange dots indicate positions of the aspartate residues in the catalytic triad. The fourth and fifth zinc knuckles of Air2p and adjacent linker regions (ZK4, ZK5) are yellow and green, respectively, and zincs are red. (C) Superposition of the Trf4p/Air2p subcomplex (blue/yellow and green) with another noncanonical RNA polymerase, CCA-adding enzyme from A. fulgidus (PDB ID 2DRA, cyan). The catalytic and central domains were superimposed separately because their relative orientation differs in the two structures.

Air2p is bound to a surface on the central domain of Trf4p formed by the connector between helices H6 and H8, the entire length of helix H8, and residues in S7 of the β-hairpin. In Air2p, residues in the fifth zinc knuckle (residues 164–177) and in the linker sequences C-terminal to the two zinc knuckles form the interface with Trf4p (Fig. 3 A and B and Fig. S4). The fourth zinc knuckle (residues 123–136) is tethered to Trf4p via the adjacent linker region but has little direct contact with the surface of Trf4p. The fourth zinc knuckle may function in RNA binding, and residues critical for RNA binding in the nucleocapsid zinc knuckles (corresponding to residues 124–125 and 133–135 in Air2p) (12) are accessible to RNA (Fig. S5). In the fifth zinc knuckle, these same residues (165–166 and 174–176) are buried against the surface of Trf4p (Fig. S5). Thus, the fifth zinc knuckle serves as a protein–protein interaction module, and if it also interacts with RNA, its interactions are different from those of the nucleocapsid knuckles.

To identify functionally important surfaces, we mapped residues that are highly conserved in Trf4p and Trf5p, their Schizosaccharomyces pombe homolog Cid14, and in sequences from higher eukaryotes onto the Trf4p/Air2p structure (Fig. 3 C–E and Figs. S3 and S4). There are two highly conserved surfaces. One of these is on the central domain and corresponds to the binding site for the Air2p fragment (Fig. 3 C–E and Figs. S3 and S4). This finding is particularly significant in that it provides the only experimental evidence to date that the Trf4p/Air2p interaction is widely conserved in eukaryotes, a prerequisite for the conservation of the TRAMP-exosome surveillance pathway. A second conserved surface, as expected, is in the active-site cleft between the catalytic and central domain. The conservation also extends to surfaces adjacent to the active-site cleft, which may play roles in binding RNA or protein partners of Trf4p. These could include Mtr4p or portions of Air2p that are not in the structure.

Fig. 3. The Air2p-interaction surface of Trf4p is conserved in eukaryotes. (A) Space-filling model of Trf4p with surface residues within 4.5 Å of Air2p zinc knuckles colored lime or green. Air2p zinc knuckles are shown as worms (fourth, yellow; fifth, green), and Air2p side chains that interact with Trf4p are labeled. (B) As in A, but residues in Trf4p that are within 4.5 Å of Air2p are labeled and Air2p is not illustrated. (C) Residues that are identical or similar in Trf4p and six of seven other sequence-related polymerases are labeled and colored blue and cyan, respectively. We compared the sequence of Trf4p with sequences from S. cerevisiae, S. pombe, humans, Gallus gallus, Xenopus laevis, Drosophila melanogaster, and Caenorhabditis elegans. An alignment is in Fig. S3. Trf4p is oriented as in A and B. (D and E) Surface conservation as in C, but Trf4p is differently oriented. The conserved aspartate residues in the polymerase catalytic triad are red. Panels are enlarged in Fig. S4.

RNA 3′ End Recognition. Because Trf4p must interact with the RNA 3′ end in order to polyadenylate it, we investigated whether this interaction plays a role in substrate specificity. In these experiments, we used the truncated complexes, Trf4p/Air2pZK1-5 and Trf4p/Air2pZK4-5, as well as preparations of full-length complex. (As shown in Fig. S1, the full-length complex is degraded during purification, but we would nevertheless expect similar trends as for intact sample.) To determine whether certain sequences are preferred, we performed polyadenylation assays with 10-mers consisting of A, C, G, or Us (Fig. 4 A and B and Fig. S6). We found that, for all three Trf4p/Air2p complexes, RNAs ending in cytosines are modified least. This tendency may prevent polyadenylation of deacylated nuclear tRNAs that undergo trimming of the CCA end to CC-, favoring repair by the CCA-adding enzyme (18). That all three complexes behaved similarly suggests that the Trf4p/Air2p core complex is sufficient for mediating end recognition.

Using RNA duplexes with 3′-poly(A) overhangs of different lengths (1, 3, 5, 7, 10 nt) as substrates, we found that an overhang is required for polyadenylation by full-length Trf4p/Air2p, Trf4p/Air2pZK1-5, and Trf4p/Air2pZK4-5, but that the overhang can be as short as three nucleotides (Fig. 4C and Fig. S6). An overhang is probably required because the polymerase active-site cleft can accommodate single-stranded RNA but not a bulkier RNA duplex.

These experiments suggest that 3′ end recognition contributes to substrate specificity in Trf4p/Air2p. RNA ends have also been found to be recognition elements for other proteins involved in noncoding RNA processing and/or surveillance, including the eukaryotic proteins La and Ro (19, 20), which both bind nascent transcripts, as well as the bacterial RNase T and RNase R (21, 22). Sequences at the 3′ end may be one mechanism for recognizing such transcripts.

Fig. 4. RNA substrates of Trf4p/Air2p. (A) Polyadenylation of 5′ end-labeled oligomers A10, C10, G10, or U10 (30 fmol) by ∼2 pmol of Trf4p/Air2pZK4-5 (lanes 2, 5, 8, and 11), Trf4p/Air2pZK1-5 (lanes 3, 6, 9, and 12), or no protein (lanes 1, 4, 7, and 10). (B) Quantitation from three experiments in A, showing fraction of adenylated RNA. SEM is indicated. Trf4p/Air2p polyadenylates C10 less extensively than A10, G10, or U10. (C) Polyadenylation assay, where Trf4p/Air2p (2 pmol) was mixed with RNA duplexes (30 fmol) containing no overhang (lanes 1–3) or 3′-A-overhangs of length indicated (lanes 4–18). Longer extensions were added only when the 3′ overhang was three nucleotides or longer. Reactions included Trf4p/Air2pZK4-5 (lanes 2, 5, 8, 11, 14, and 17), Trf4p/Air2pZK1-5 (lanes 3, 6, 9, 12, 15, and 18), or no protein (lanes 1, 4, 7, 10, 13, and 16).

Summary and Model for Interactions with RNAs. We have found that the surface on the central domain of Trf4p that interacts with Air2p is conserved in a number of other noncanonical RNA polymerases, including Trf5p in yeast as well as related polymerases in higher eukaryotes. This conservation argues that the Trf4p/Air2p subcomplex is conserved structurally and, most likely, functionally. Although our experiments have focused on the Trf4p/Air2p heterodimer, it is likely that complexes containing Trf5p and/or Air1p as well as homologous complexes in higher eukaryotes function in a similar way.

The conserved surface on Trf4p extends beyond the binding site for the fourth and the fifth zinc knuckle modules, and it is possible that additional conserved surface regions could interact with portions of Air2p not in the structure. These include the three N-terminal-most zinc knuckle modules as well as sequences at the N and C termini of Air2p. The additional conserved surface regions in Trf4p may also be important for interactions with the Mtr4p helicase or RNA substrates.

How might the Trf4p/Air2p complex interact with RNAs? In all cases, the polymerase interacts with the 3′ end of its RNA substrates, polyadenylating them while they are inserted into the polymerase active site between the central and catalytic domains, and it is likely that “misfolded” segments of RNA bound are within a certain distance of this 3′ end. We have noted that the core complexes, Trf4p/Air2pZK1-5 and Trf4p/Air2pZK4-5, are less active than the full-length complex studied by others (6), suggesting that the N and C termini are important for activity. Although the truncated ends could be important for catalysis, we favor a role in substrate binding because the crystal structure shows that the catalytic regions of Trf4p are structurally intact relative to other noncanonical RNA polymerases in the Pol β family. Our biochemical data show that both core complexes differentiate between correctly folded and aberrant tRNA substrates, indicating that sequences in the Trf4p/Air2p core are involved in RNA recognition. Possibly, then, the fourth and fifth zinc knuckles, included in both core complexes, could be involved in RNA binding. Additionally, we have shown that N-terminal sequences in Air2p, and in particular the first zinc knuckle, enhance the activity of Trf4p/Air2p on a longer tRNA substrate but not a short A5 oligonucleotide. That this zinc knuckle affects polyadenylation activity only on selected RNAs supports a role in RNA binding. We therefore propose a model where portions of the Trf4p/Air2p N and C termini and one (the N-terminal-most) or more of the Air2p zinc knuckles are involved in substrate binding. Different combinations of these elements could mediate interactions with a large group of structurally different RNA substrates.

Methods

Preparation, purification, and quantitation of the proteins and RNAs used in this study are described in SI Methods.

Polyadenylation Assays. Polyadenylation assays were carried out in 10–15 μL volumes containing 50, 100, or 200 nM Trf4p/Air2p, 2.5–3.0 nM 5′ end-labeled RNA, 0.5 mM ATP, 5 mM MgCl2, 20 mM Tris·HCl, pH 7.6, and 50 mM NaCl. Reactions were incubated for 30 min at 22 °C. For the tRNAAla time courses (Fig. 1A), a 15 μL reaction was set up, and 1.5 μL aliquots were taken at the times indicated. Reactions were stopped by addition of 50 mM EDTA, and for reactions involving tRNAs, the RNA was isolated by phenol-chloroform extraction. Samples were fractionated in 10% (tRNA), 12% (duplexes), or 15% (single-stranded oligomers) (wt/vol) polyacrylamide/8 M urea gels. To quantitate the fraction of polyadenylated RNA, we used a PhosphorImager (Molecular Dynamics) to compare the counts in the polyadenylated portion of each lane with the total counts in the lane.

Crystallization and Structure Determination. The protein used in crystallization was concentrated to ∼8 mg/mL. Crystals were grown at 4 °C by the hanging drop vapor diffusion method. Drops consisted of 1.5 and 1.0 μL protein and reservoir solution (100 mM sodium citrate, pH 5.8, 200 mM sodium acetate, and 11% PEG 4000), respectively. Crystals appeared in 5 d and grew to a size of 200 × 150 × 50 μm over the course of 2 weeks. Crystals were transferred briefly to a solution containing the mother liquor supplemented with 25% glycerol (vol/vol) and flash-frozen in liquid nitrogen.

Data were collected at beamline 24-ID-C at the Advanced Photon Source in Chicago, Illinois, at a wavelength of 1.28215 Å to make use of the anomalous signal from zinc in phasing. The data were scaled and integrated using HKL2000 (23), and SHELXE (24) was used to identify zinc positions. There is one Trf4p/Air2pZK4-5 complex per asymmetric unit, and two zinc positions were located. Phases were calculated and refined with SHARP (25), and electron density maps were obtained after solvent flipping as implemented therein. The initial maps had well-defined density for the catalytic domain of Trf4p as well as for helices in the central domain, and these portions of Trf4p were modeled first. Model building was performed using the program O (26). Phases calculated from this partial model were combined with experimental phases, resulting in maps with improved density for Air2p and still unmodeled regions of Trf4p (Fig. S7). Although data from three crystals were used in calculating the initial experimental maps, regions in Air2p and the central domain of Trf4p were clearer in the phase-combined map when only data from a single crystal were used, perhaps because of slight nonisomorphism between crystals in these regions. Once model building was completed, the model was refined using the CNS software suite (27), alternating torsion angle dynamics, least-squares minimization, and individual B-factor refinement with manual rebuilding. In a final round of refinement with PHENIX (28), we also refined translation/libration/screw parameters for four domains (catalytic and central domains of Trf4p and two zinc knuckle domains) and added 58 water molecules.

ACKNOWLEDGMENTS. Data were collected at beamline 24-ID-C at the Advanced Photon Source, and we thank the Northeastern Collaborative Access Team staff for their support. K.M.R. thanks D. W. Rodgers for comments regarding this manuscript. Work presented here was funded by grants from the National Institute of General Medical Sciences (GM070521 and GM048410 to K.M.R. and S.L.W., respectively). S.H. is a Genentech Fellow of The Jane Coffin Childs Memorial Fund for Medical Research.
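Methods gives the assay components as concentrations and volumes, whereas the figure legends quote amounts in pmol and fmol. The two are related by a simple unit identity (1 nM equals 1 fmol/μL), sketched below. The function name is invented for illustration, and the pairing of a particular concentration with a particular volume is an assumption: the text does not say which combination was used for each figure.

```python
def amount_fmol(concentration_nM, volume_uL):
    """Amount of solute in fmol, using 1 nM = 1 fmol per microliter."""
    if concentration_nM < 0 or volume_uL < 0:
        raise ValueError("concentration and volume must be non-negative")
    return concentration_nM * volume_uL


# Illustrative pairing (an assumption): the highest listed enzyme
# concentration and an RNA concentration of 3.0 nM, both in 10 uL.
enzyme_fmol = amount_fmol(200, 10)   # 2000 fmol, i.e. 2 pmol
rna_fmol = amount_fmol(3.0, 10)      # 30 fmol
molar_excess = enzyme_fmol / rna_fmol

print(f"enzyme: {enzyme_fmol / 1000:.1f} pmol, RNA: {rna_fmol:.0f} fmol, "
      f"enzyme:RNA = {molar_excess:.0f}:1")
```

Under this pairing the complex is in large molar excess over the labeled RNA, in line with the 2 pmol and 30 fmol amounts quoted in the Fig. 4 legend.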

1. Amaral PP, Dinger ME, Mercer TR, Mattick JS (2008) The eukaryotic genome as an RNA machine. Science 319:1787–1789.
2. Brosnan CA, Voinnet O (2009) The long and short of noncoding RNAs. Curr Opin Cell Biol 21:416–425.
3. Houseley J, Tollervey D (2009) The many pathways of RNA degradation. Cell 136:763–776.
4. Anderson JT, Wang X (2009) Nuclear RNA surveillance: No sign of substrates tailing off. Crit Rev Biochem Mol Biol 44:16–24.
5. Kadaba S, et al. (2004) Nuclear surveillance and degradation of hypomodified initiator tRNAMet in S. cerevisiae. Genes Dev 18:1227–1240.
6. Vanacova S, et al. (2005) A new yeast poly(A) polymerase complex involved in RNA quality control. PLoS Biol 3:e189.
7. LaCava J, et al. (2005) RNA degradation by the exosome is promoted by a nuclear polyadenylation complex. Cell 121:713–724.
8. Wyers F, et al. (2005) Cryptic pol II transcripts are degraded by a nuclear quality control pathway involving a new poly(A) polymerase. Cell 121:725–737.
9. Schneider C, Anderson JT, Tollervey D (2007) The exosome subunit Rrp44 plays a direct role in RNA substrate recognition. Mol Cell 27:324–331.
10. Martin G, Keller W (2007) RNA-specific ribonucleotidyl transferases. RNA 13:1834–1849.
11. Martin G, Doublie S, Keller W (2008) Determinants of substrate specificity in RNA-dependent nucleotidyl transferases. Biochim Biophys Acta 1779:206–216.
12. D’Souza V, Summers MF (2005) How retroviruses select their genomes. Nat Rev Microbiol 3:643–655.
13. Zuker M (2003) MFOLD web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31:3406–3415.
14. Deutscher MP (2006) Degradation of RNA in bacteria: Comparison of mRNA and stable RNA. Nucleic Acids Res 34:659–666.
15. Tomita K, Ishitani R, Fukai S, Nureki O (2006) Complete crystallographic analysis of CCA sequence addition. Nature 443:956–960.
16. Deng J, Ernst NL, Turley S, Stuart KD, Hol WG (2005) Structural basis for UTP specificity of RNA editing TUTases from Trypanosoma brucei. EMBO J 24:4007–4017.
17. Stagno J, Aphasizheva I, Rosengarth A, Luecke H, Aphasizhev R (2007) UTP-bound and Apo bound structures of a minimal RNA uridyltransferase. J Mol Biol 366:882–899.
18. Wolfe CL, Hopper AK, Martin NC (1996) Mechanism leading to and the consequences of altering the normal distribution of ATP(CTP):tRNA nucleotidyltransferase in yeast. J Biol Chem 271:4679–4686.
19. Wolin SL, Cedervall T (2002) The La protein. Annu Rev Biochem 71:375–403.
20. Fuchs G, Stein AJ, Fu C, Reinisch KM, Wolin SL (2006) Structural and biochemical basis for misfolded RNA recognition by the Ro autoantigen. Nat Struct Mol Biol 13:1002–1009.
21. Zuo Y, Deutscher MP (2002) The physiological role of RNase T can be explained by its unusual substrate specificity. J Biol Chem 277:29654–29661.
22. Vincent HA, Deutscher MP (2006) Substrate recognition and catalysis by the exoribonuclease RNase R. J Biol Chem 281:29769–29775.
23. Otwinowski Z, Minor W (1997) Methods in Enzymology, eds CW Carter Jr and RM Sweet Jr (Academic, New York), 276, pp 307–326.
24. Sheldrick G, Schneider T (1997) Methods in Enzymology, eds CW Carter Jr and RM Sweet Jr (Academic, New York), 277, pp 319–343.
25. de la Fortelle E, Bricogne G (1997) Methods in Enzymology, eds CW Carter Jr and RM Sweet Jr (Academic, New York), 276, pp 472–494.
26. Kleywegt GJ, Jones TA (1997) Methods in Enzymology, eds CW Carter Jr and RM Sweet Jr (Academic, New York), 277, pp 208–230.
27. Brunger AT (2007) Version 1.2 of the crystallography and NMR system. Nat Protoc 2:2728–2733.
28. Adams PD, et al. (2010) PHENIX: A comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D 66:213–221.
