Downloaded from rnajournal.cshlp.org on September 24, 2021 - Published by Cold Spring Harbor Laboratory Press

HYPOTHESIS

Rationalization and prediction of selective decoding of pseudouridine-modified nonsense and sense codons

MARC PARISIEN,1,3 CHENGQI YI,1,3 and TAO PAN1,2,4
1Department of Biochemistry and Molecular Biology, 2Institute for Biophysical Dynamics, University of Chicago, Chicago, Illinois 60637, USA

ABSTRACT

A stop or nonsense codon is an in-frame triplet within a messenger RNA that signals the termination of translation. One common feature shared among all three nonsense codons (UAA, UAG, and UGA) is a uridine present at the first codon position. It has been recently shown that the conversion of this uridine into pseudouridine (Ψ) suppresses translation termination, both in vitro and in vivo. Furthermore, decoding of the pseudouridylated nonsense codons is accompanied by the incorporation of two specific amino acids in a nonsense codon-dependent fashion. Ψ differs from uridine by a single N1H group at the C5 position; how Ψ suppresses termination and, more importantly, enables selective decoding is poorly understood. Here, we provide molecular rationales for how pseudouridylated stop codons are selectively decoded. Our analysis applies crystal structures of ribosomes in varying states of translation to consider weakened interaction of Ψ with release factor; thermodynamic and geometric considerations of the codon-anticodon base pairs to rank and to eliminate mRNA-tRNA pairs; the mechanism of fidelity check of the codon-anticodon pairing by the ribosome to evaluate noncanonical codon-anticodon base pairs and the role of water. We also consider certain tRNA modifications that interfere with the Ψ-coordinated water in the major groove of the codon-anticodon mini-helix. Our analysis of nonsense codons enables prediction of potential decoding properties for Ψ-modified sense codons, such as decoding ΨUU potentially as Cys and Tyr. Our results provide molecular rationale for the remarkable dynamics of ribosome decoding and insights on possible reprogramming of the genetic code using mRNA modifications.

Keywords: pseudouridine; nonsense codon; molecular modeling; ribosome; tRNA

INTRODUCTION

Pseudouridine (Ψ) is a C5-glycoside rotation isomer of uridine (Fig. 1A). It is found in transfer and ribosomal RNA (tRNA, rRNA) throughout the three kingdoms of life and in spliceosomal small nuclear RNA (snRNA) in eukaryotes. Although Ψ is the most abundant RNA modification (Hamma and Ferre-D’Amare 2006), it remains to be determined whether Ψ is also present in messenger RNA (mRNA). The primary chemical change made by the U-to-Ψ conversion is the addition of a hydrogen bond donor through the N1H group. In an RNA helix, this hydrogen bond donor is in the major groove and can anchor a water molecule to bridge the interactions of this N1H group to its own and the preceding phosphate groups (Arnez and Steitz 1994).

In a recent study, Yu and colleagues demonstrated that the substitution of uridine in all three nonsense codons with Ψ suppresses translational termination, both in vitro and in vivo (Karijolich and Yu 2011). Furthermore, they showed that Ψ-modified nonsense codons are selectively decoded as specific amino acids (Table 1). In particular, ΨAA and ΨAG are read as serine and threonine, whereas ΨGA is read as phenylalanine and tyrosine.

Decoding for unmodified mRNA codons is accomplished in multiple steps and is under strict surveillance by the ribosome. mRNA codons are usually recognized by tRNAs featuring complementary Watson-Crick sequences and sometimes wobble base pairs at the third codon position. Watson-Crick and certain wobble base pairs enable cognate tRNAs to thermodynamically outcompete other tRNA species whose sequences would introduce mismatched base pairs with the mRNA codon. Further, the mRNA-tRNA base pairs in the A-site of a ribosome are probed in a process termed ‘‘the A-site test’’ to ensure the fidelity of the decoded codon (Ogle et al. 2001; Schuwirth et al. 2005; Selmer et al. 2006). Specifically, the A-site test is performed by a small ribosomal subunit via A1493 and A1492 of helix 44 in the 16S rRNA (Thermus thermophilus numbering is used hereafter), which form an extensive network of hydrogen bonds in the minor groove with the mRNA-tRNA base pairs at the first and second codon position, and by G530 of loop 530 at the third codon position (Ogle et al. 2001). Such minor-groove interactions are universally conserved and serve as a key step in the fidelity control of the decoding process (Lescoute and Westhof 2006). Many tRNAs are extensively modified at nucleotide 37, the immediate 3′ nucleotide to the third anticodon nucleotide which reads the first position of the mRNA codon. Modification at nucleotide 37 could also influence the accuracy to decode the first codon nucleotide.

When the first position of the nonsense codons is modified from U to Ψ, the modified codons are not only efficiently suppressed as translation stops, they are also selectively decoded as just two amino acids (Karijolich and Yu 2011). Although the decoding pattern of the modified nonsense codons includes Watson-Crick type Ψ/A base pairs, e.g., those in tRNASer, the presence of Ψ in the first codon position also permits decoding by tRNAs with U36 at their third anticodon position, e.g., those in tRNAThr. This opens up the possibility for a mRNA-tRNA mismatched Ψ/U base pair. Furthermore, the presence of Ψ leads to additional changes in base-pairing rules in the second or even in the third codon-anticodon pairs, e.g., A/G mismatches in the case of ΨAA/ΨAG decoding at the second position and in the case of ΨGA decoding at the third position (Table 1). No rational explanation was provided for these experimental observations.

Here, we provide a molecular explanation of how the three Ψ-modified nonsense codons are suppressed as termination codons and, more importantly, are selectively decoded as specific amino acids. Taking advantage of the large number of available crystal structures of ribosomes in varying states of translation, we provide molecular models on selective decoding, A-site probing, thermodynamic and geometric considerations, and influence of certain tRNA modifications on selective decoding of Ψ-modified nonsense codons. Because no structures of eukaryotic ribosomes containing both mRNA and tRNA are available, our analysis had to rely on the structures of bacterial ribosomes. Due to the involvement of many interaction partners, it was not possible to predict the precise fraction of the two amino acids that selectively decode Ψ-containing nonsense codons. Despite these caveats, our results provide insights for the remarkable dynamics of ribosome decoding and lead to predictions of potential decoding rules for Ψ-modified sense codons which should be useful in rational reprogramming of the genetic codes using RNA modifications.

3These authors contributed equally to this work.
4Corresponding author. E-mail [email protected].
Article published online ahead of print. Article and publication date are at http://www.rnajournal.org/cgi/doi/10.1261/rna.031351.111.
RNA (2012), 18:00–00. Published by Cold Spring Harbor Laboratory Press. Copyright © 2012 RNA Society.

FIGURE 1. Structural comparison of U and Ψ and proposed role of decreased recognition by release factors. (A) Chemical structure of U and Ψ. Partial charge is labeled for each atom and calculated dipole moments of the two bases (debye, or D) are shown with arrows. (B) Recognition of U in a nonsense codon by RF1 (PDB code: 3D5A) and RF2 (2WH3). (C) Comparison of attractive dipole moment between the release factor protein and the nucleobase, when either a U or Ψ (assuming Ψ occupies the same position as U) is present in the active site of RF1. The extra water molecule present only with Ψ-modified nonsense codons is also shown.

TABLE 1. Amino acids incorporated in Ψ-containing nonsense codons

Stop codon   AA    Incorporation (%)   Putative tRNA anticodon
5′-ΨAA       ser   ~50                 5′-IGA, 5′-CGA, 5′-UGA, 5′-GCU
             thr   ~50                 5′-IGU, 5′-CGU, 5′-UGU
5′-ΨAG       ser   ~90                 5′-IGA, 5′-CGA, 5′-UGA, 5′-GCU
             thr   ~10                 5′-IGU, 5′-CGU, 5′-UGU
5′-ΨGA       tyr   ~80                 5′-GUA
             phe   ~20                 5′-GAA

Corresponding tRNA anticodons for all tRNA isoacceptors are also shown. (I) inosine.

RESULTS AND DISCUSSION

Ψ has to escape recognition by release factors

Nonsense codons in the A-site of ribosome are normally recognized by release factors, thereby triggering dissociation

RNA, Vol. 18, No. 3

of ribosomal subunits and releasing the newly synthesized chain. In order to allow base-pairing to anticodons and subsequent incorporation of amino acids, Ψ-modified nonsense codons must first escape recognition by the release factors. In eukaryotes, all three nonsense codons are recognized by eRF1 which forms a functional complex with eRF3, while in prokaryotes, two proteins (RF1 and RF2) recognize the nonsense codons. Due to the lack of structural information of eRF1 in the context of nonsense codon recognition in the ribosome, our analysis was performed with prokaryotic release factors (RF1 and RF2) where rich structural information is available (Korostelev et al. 2008; Laurberg et al. 2008; Weixlbaumer et al. 2008; Korostelev et al. 2010), despite the fact that the nonsense codon suppression experiments were done in yeast (Karijolich and Yu 2011).

Nonsense-codon recognition involves numerous interactions of various types (polar, hydrophobic, ionic, stacking, etc.) for E. coli RF1 and RF2 (Sund et al. 2010). Since interactions to the Watson-Crick face of uridine (or Ψ) seem inadequate to explain the decreased binding affinity of Ψ to release factors, we hypothesize that certain differences between uridine and Ψ must have weakened the recognition of the Ψ-modified nonsense codons by the release factor.

The first difference is the dipole moment: while the magnitude difference of dipole moments of U and Ψ base is small, the angular difference is obvious (≈27.9°) (Fig. 1A). The N-terminus of helix α5 from domain 2 of RF1 is in close proximity to the uridine base (Fig. 1A), and the projection for the dipole of U on the dipole of α5 is ≈30% greater than that for Ψ (Fig. 1C). Since the most favorable interaction is for colinear dipole moments, i.e., the angle θ = 0, a greater angle to helix α5’s dipole moment for Ψ compared to U suggests that its interaction with helix α5 is weakened. Furthermore, the greater angle of the Ψ dipole should experience ≈1.4 times greater torque than U. This torque tends to align Ψ’s dipole with that of the termination factor’s α-helix, therefore coaxing Ψ into a nonideal position for hydrogen bonding to release factors (Fig. 1C). An all-atoms energy computation provides for a −3.2 kcal/mol estimate in favor of U with respect to Ψ at binding RF1; hence, the binding of Ψ-containing stop codons to release factors is expected to be significantly decreased. Although the structures of prokaryotic release factors vary significantly from those of eukaryotes (Song et al. 2000; Cheng et al. 2009), and atomic details of base recognition by eRF1 are still unclear, the rotational isomerization from U to Ψ is expected to always change the dipole-dipole interaction between a bound base and protein residues in eRF1.

Second, a water molecule is present only in the Ψ-modified codons through hydrogen bonds to the N1H and the 5′ phosphate oxygen of Ψ (Fig. 1C; Arnez and Steitz 1994; Yarian et al. 1999). This water molecule could interfere with release factor interaction through steric hindrance or reducing the rotational freedom of the Ψ base.

The weaker affinity derived from altered dipoles and water-mediated base geometry could enable some tRNA species to compete for the binding of the modified nonsense codon.

tRNA abundance cannot explain the specificity of amino acid incorporation

One trivial explanation for the specific incorporation of certain amino acids for the Ψ-modified nonsense codon (Table 1) is that decoding might be governed by an in vivo abundance of tRNA molecules. We have previously developed a tRNA microarray method to measure the relative abundance of tRNA molecules within any biological sample (Dittmar et al. 2006; Zaborske et al. 2009). Our previous results for yeast suggest that the charging level is generally high for all tRNAs under optimal growth conditions (Zaborske et al. 2009). Therefore, only the abundance of cellular tRNA should be taken into account for decoding Ψ-modified nonsense codons. However, a correlation between the tRNA abundance and selective decoding of Ψ-modified nonsense codon is not observed in yeast (Fig. 2; data from Tuller et al. 2010). For example, the abundance of all tRNAThr isoacceptors is below average, but threonine is readily incorporated at ΨAA and ΨAG codons. Further, none of the five most abundant tRNAs matches the actual amino acids incorporated at Ψ-modified nonsense codons. These results indicate that the observed specificity of amino acid incorporation is not determined by the abundance of tRNA species in cells.

Selecting tRNA candidates for decoding Ψ-modified nonsense codons

The competition of various tRNA species for binding to a sense codon is determined in part by the thermodynamic preference of the Watson-Crick and certain wobble base pairs in the codon-anticodon mini-helix at the ribosomal A-site. Unlike sense codons, no complementary Watson-Crick tRNA partners exist for the nonsense codons. Therefore, decoding Ψ-modified nonsense codons has to involve mismatched base pairs. In addition, the mismatched mRNA-tRNA mini-helix has to survive the ribosomal A-site test to enable elongation of the nascent peptide chain.

The closest tRNA candidates involve two Watson-Crick base pairs in the first and second position but a mismatch in the third position. Using such tRNA candidates would predict decoding ΨAA/ΨAG as tyrosine and ΨGA as cysteine or tryptophan, but none of these predictions matches the experimental results (Karijolich and Yu 2011).

Many of the competing tRNAs can be readily ruled out by applying simple rules to the geometry of base pairs formed between the mRNA codon and tRNA anticodon. The first and second codon-anticodon base pairs are proof-


read extensively by the ribosome in the A-site. The base-pair geometry of the first codon-anticodon pair is particularly important because of its most extensive interactions with the A1493 residue of the 16S rRNA (Ogle et al. 2001; Lescoute and Westhof 2006). Since the canonical Watson-Crick base pairs are of the cis W/W type (Leontis-Westhof nomenclature) (Leontis and Westhof 2001), viable mismatches for the first codon-anticodon pairs should also be of the cis W/W geometry. For the first codon position, tRNA molecules having either an A or a U at the corresponding position effectively decode the Ψ-modified nonsense codons (Table 1). Thus, Ψ/A and Ψ/U pairs (codon nucleotide precedes) must be allowed. Indeed, both U/A and U/U pairs can be present in the cis W/W conformation (Leontis and Westhof 2001). This readily rules out tRNAs having a G or C at the third position of the anticodon, leaving only tRNAs with A or U at the third anticodon position for further consideration. For the second and third positions, the experimentally decoded nonsense codons indicate that hetero-purine mismatches (A/G or G/A) are allowed, which can also be found in the cis W/W state. Homo-purine mismatches can be discarded because a G/G base pair cannot occur in the cis W/W geometry, and A/A cis W/W conformation features only one hydrogen bond between the two bases and, hence, is unstable (Leontis et al. 2002).

FIGURE 2. Relative tRNA abundance in yeast measured by microarray. tRNAs that could decode the Ψ-containing nonsense codons are sorted on the right. The sum of the percentage of tRNASer and tRNAThr species is shown. The anticodon sequence for each tRNA is included in the parentheses.

Although a purine-pyrimidine wobble should be allowed for the second and third codon-anticodon pair, A/C is not considered here because the adenosine has to be protonated to promote the formation of the base pair, even though an A/C cis W/W pair is isosteric to a G/U wobble (Stombaugh et al. 2009). The protonation state of an A/C pair is known to depend on the number of its surrounding base pairs (Siegfried et al. 2010), and the short mRNA-tRNA mini-helix in the A-site is unlikely to support the protonation of the adenosine.

Hetero-purine base pairs can exist in many conformations, two of which are most abundant: the imino (cis W/W, Saenger type VIII) and sheared (trans S/H, Saenger type XI) forms. The sequence context around the hetero-purine pair determines which one is favored over the other (Villescas-Diaz and Zacharias 2003; Yildirim and Turner 2005). The presence of either a sheared A/G (i.e., trans H/S) or G/A (i.e., trans S/H) pair would hinder the fidelity check by the A1492 residue of the rRNA which senses the sugar edge of the second codon-anticodon pair. In the sheared base-pair state, the O2′ atom of the second mRNA nucleotide is not positioned properly to make a ribose zipper with A1492. Hence, a hetero-purine base pair at the second position cannot be in the sheared state. With regard to the imino state, the cis W/W conformation offers a larger C1′-C1′ distance (≈12.5 Å compared to ≈10.3 Å for canonical Watson-Crick), but its distortion on the RNA helix is minimal. For example, sugar puckers of tandem G/A mismatches in Protein Data Bank (PDB) file 1MIS (Wu and Turner 1996) are in C3′-endo anti conformation which is the same as in a canonical double-helix. A1492 may coax the imino state of the G/A or A/G base pair via an induced fit mechanism (Williamson 2000), which would then allow for a proper minor-groove base-pair recognition by A1492 of the rRNA. Inosine at position 34 in several tRNAs decodes the third codon position; since it also decodes adenosine, an adenosine-inosine mismatch should also be tolerated at the third codon position (Murphy and Ramakrishnan 2004; Murphy et al. 2004). In these studies, it has been found that the A/I mismatch is in cis W/W with both sugars in the anti conformation.

Taken altogether, these simple rules readily eliminate most of the tRNA species from considerations for decoding Ψ-modified nonsense codons (Table 2). These results, however, do not yet explain how the mRNA-tRNA mini-helix containing various mismatched base pairs is able to satisfy the ribosome A-site test. We will provide detailed explanations of this in the following sections.

RNA molecular modeling using MC-Sym

Although molecular dynamics (MD) simulations are routinely performed to gain insights in chemical and biological processes, in the case of RNA, however, current force fields have been shown to produce unstable trajectories or macroscopic expected values not in line with experimental results (Yildirim et al. 2009; Gong and Xiao 2010). We, therefore, choose to apply the RNA molecular modeling package, MC-Sym (Major et al. 1991; Major 2003) to generate models at


TABLE 2. Detailed explanations of the simple rules used to narrow down tRNAs for decoding Ψ-modified nonsense codons

Rule-out notes           Description

Decoded                  An amino acid of that tRNA species is incorporated into the protein.
Outcompeted              The energy profile for that tRNA species is on the right (worst) of energy profiles of other tRNA species; hence, it is outcompeted by others.
Weak A/C                 The mRNA-tRNA mini-helix features an A/C base pair. An A/C pair can be isosteric to wobble G/U only if A is protonated at N1, which is unlikely, as discussed in the text. Other A/C pairs feature weak hydrogen bond(s) between the bases, including those mediated by water molecules.
Homopurine G/G or I/G    The mRNA-tRNA mini-helix features a G/G or I/G base pair. This base pair cannot occur in the cis W/W configuration.
Homopurine A/A           The mRNA-tRNA mini-helix features an A/A base pair. However, in the cis W/W configuration, that base pair features only one hydrogen bond.
No heteropurine          When a Ψ/U base pair between the mRNA and tRNA occurs at the first codon position, it requires a heteropurine A/G mismatch at the second codon position. This helps widen the minor groove of the Ψ/U base pair for it to survive the ribosomal A-site test.
Pyrimidine U/U or U/C    At the third codon position, a pyrimidine mismatch has a minor groove too narrow for a proper codon-anticodon helix. Alternatively, a minor groove water-mediated pair would introduce a steric clash with G530 in the 18S rRNA.

Suppressor tRNAs with anticodons matching the nonsense codons are not known in yeast.
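
The rule-out notes of Table 2 can be read as a small decision procedure. The following is only a minimal sketch under stated assumptions, not the authors' procedure: the function name `rule_out`, the letter "P" as an ASCII stand-in for Ψ, the order in which rules are checked, and the label for the first-position rule are all hypothetical choices made for illustration. The "Decoded" and "Outcompeted" notes depend on experiment and on energy profiles, so they are not modeled here.

```python
# Minimal sketch of the geometric rule-out logic summarized in Table 2.
# Assumptions (not from the paper): "P" stands in for pseudouridine, rules are
# checked in the order written below, and only the first failing rule is reported.

HETEROPURINE = {("A", "G"), ("G", "A")}
PYRIMIDINES = {"U", "C"}


def rule_out(codon, anticodon):
    """Return the rule-out note that eliminates a tRNA, or None if it survives.

    codon     -- mRNA codon written 5'->3', e.g. "PAA"
    anticodon -- tRNA anticodon written 5'->3' (nucleotides 34, 35, 36), e.g. "UGU"
    """
    # Pairing is antiparallel: codon position 1 pairs with nucleotide 36,
    # position 2 with 35, and position 3 with 34.
    (m1, t1), (m2, t2), (m3, t3) = zip(codon, reversed(anticodon))

    # First position: only P/A and P/U pairs are allowed.
    if m1 == "P" and t1 not in ("A", "U"):
        return "G or C at anticodon position 36"

    # Second and third positions: A/C and homopurine pairs are discarded.
    for m, t in ((m2, t2), (m3, t3)):
        if {m, t} == {"A", "C"}:
            return "Weak A/C"
        if m == "G" and t in ("G", "I"):
            return "Homopurine G/G or I/G"
        if m == "A" and t == "A":
            return "Homopurine A/A"

    # Third position: a pyrimidine-pyrimidine pair is too narrow.
    if m3 in PYRIMIDINES and t3 in PYRIMIDINES:
        return "Pyrimidine U/U or U/C"

    # A P/U pair at the first position needs a heteropurine pair at the second.
    if (m1, t1) == ("P", "U") and (m2, t2) not in HETEROPURINE:
        return "No heteropurine"

    return None


if __name__ == "__main__":
    for codon, anticodon in [("PAA", "UGU"), ("PAA", "GUU"), ("PGA", "GUA"), ("PGA", "UCU")]:
        print(codon, anticodon, rule_out(codon, anticodon))
```

A tRNA that survives these checks is only a candidate; in the text, candidates such as tRNATyr(GUA) for ΨAA still have to be compared by their energy profiles.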

the atomic level, as was done for other systems (Major et al. 1993; Lemieux et al. 1998; Wang et al. 2011). MC-Sym is not based on energetic considerations of base-pairing interactions, is capable of addressing specific base-pair types during modeling, and allows for systematic and exhaustive exploration of hypotheses on base-pair types.

We applied MC-Sym to test and provide various mRNA-tRNA mini-helices in the context of the ribosome’s helix 44 and loop 530 (Fig. 3). In our case, MC-Sym is used to sample the conformational space of the nucleotides comprising the mRNA-tRNA mini-helix and the rRNA residues directly involved in probing this mini-helix, which correspond to A1493 and A1492 from helix 44 and G530 of loop 530. Subsequently, energy minimization is applied to the MC-Sym’s decoy sets, and the energy profile of each decoy set is compared to reveal which tRNA species are most likely to decode a specific mRNA codon. The energy profile derived from MC-Sym is the energy curve of the models within the given decoy set. A more negative energy profile based on our structural hypothesis of the codon-anticodon mini-helix for a given tRNA species increases the probability for using that tRNA for decoding. At this time, our energy calculations can only rank each tRNA species under consideration for their likelihood to decode Ψ-modified codons.

Stability measurements are typically reported in terms of ΔΔG. Although the computation of ΔΔG values in silico can be done via the use of thermodynamics cycles, this is computationally very expensive (Yildirim et al. 2009). Here, we report stabilities of the ribosome decoding center as internal energy or ΔH. ΔH is concerned with estimating the energy release upon hydrogen bond formation or other binding forces as the atomic groups are brought together from infinity, in contrast to ΔΔG which is concerned with the energy difference of hydrogen bonds when molecules associate compared to when these molecules are isolated in water. Although our computation does not take entropy into account, it still gives valuable information on the ability of different tRNAs for decoding particular codons. Entropic factors may be estimated, but force field issues have not yet been adequately resolved for molecular dynamics of nucleic acids, so calculations of ΔS are currently out of the scope of this work.

Because no structures of eukaryotic 80S featuring a tRNA bound to an mRNA codon in an A-site test configuration have been reported, the 70S structure (1IBM) from Thermus thermophilus is used as the template for modeling, although the Ψ-modified codon experiments were performed in yeast. As such, it might be argued that a prokaryotic template is inappropriate to model translational decoding results obtained from yeast. Indeed, the eukaryotic 80S ribosome is more complex and bound by extra proteins compared to the prokaryotic 70S ribosome; however, the A-site decoding center features universally conserved and essential nucleotides: A1492, A1493, and G530 (Ben-Shem et al. 2010; Rabl et al. 2011). There are structural differences between the prokaryotic and eukaryotic ribosome, e.g., in the internal loop that anchors nucleotides A1492 and A1493. Because we model the ‘‘on,’’ or activated state of this internal loop, the structural template of a prokaryotic ribosome should be comparable to a eukaryotic ribosome, as both prokaryotic and eukaryotic activated A-site states should essentially be the same (Kondo and Westhof 2008).

Although ribosome P-site proofreading has also been reported recently (Zaher and Green 2009), we did not construct models for the P-site since there are no known direct readouts of the mRNA-tRNA mini-helix by any rRNA or ribosomal proteins in the P-site. We also did not take the role of A1913 of the 23S rRNA into consideration in the decoding of the third codon position (Ortiz-Meoz and Green 2011). A1913 is located on top of the first codon base and is perpendicular to


the first codon base pair; it does not make a direct contact with the third codon position. A1913 may serve as a ratchet lock to help prevent slippage of the mRNA-tRNA mini-helix toward the P-site. All of our models end up with planar base pairs which should not interfere with A1913.

FIGURE 3. Modeling steps to rebuild the A-site. (A) First, nucleotide 530 (purple) of loop 530 in the rRNA and nucleotides 34 (green), 35 (marine), and 36 (orange) of the anticodon loop are sampled using the MC-Sym computer program. An adjacency relationship (adjacent) is used to position nucleotide 36 with respect to 37 and so on down to nucleotide 34 in the tRNA. A stacking relationship (stack) positions 530 with respect to 518. (B) Second, nucleotides of the messenger RNA (1, red; 2, green; 3, marine) and the two adenosines 1492 (orange) and 1493 (purple) in the rRNA are sampled in the context of a given conformation chosen from step 1. Here, nucleotides in the mRNA are positioned using a pairing relationship (pair) with respect to their paired partner in the tRNA (anticodon 36 positions mRNA 1, 35 positions 2, and 34 positions 3). A1493 of rRNA pairs with 36 of tRNA, while A1492 of rRNA pairs with mRNA residue 2.

Instead of going over every codon-anticodon base-pairing in detail, we describe a few case studies to illustrate the considerations that went into our modeling. Complete rule-out descriptions for ΨAA are presented in Table 3, for ΨGA in Table 4, and for ΨAG in Table 5.

Case study I: Decoding ΨAA with tRNAThr(UGU)

The U/U cis W/W base pair is viable in the mRNA-tRNA mini-helix at the first codon position pairing with the third anticodon nucleotide of tRNAThr(UGU). However, the minor-groove width of such a U/U pair has to be properly maintained so that A1493, which interacts with both nucleotides of this base pair, can correctly recognize it. We reason that in order to widen the minor groove of the U/U base pair, two prerequisites need to be met simultaneously: water molecule(s) at the Watson-Crick edge and a hetero-purine mismatch at the second codon position pairing with the middle nucleotide of the tRNA anticodon. Hence, tRNAs bearing a uridine at position 36 can be ruled out except when there is a hetero-purine mismatch at the second codon position.

We propose that the only tRNA capable of decoding ΨAA for threonine is tRNAThr(UGU). In this case, the codon-anticodon base pairs contain a Ψ/U pair at the first position and an A/G imino pair at the second position. To provide an explanation of the decoding mode at the molecular level, extensive modeling was performed with MC-Sym, and the output models were subjected to energy minimizations (see Materials and Methods for details). One best solution from the ensemble is shown in Figure 4A. At the first position, the cis W/W type Ψ/U base pair consists of one direct hydrogen bond between the O2 atom of Ψ1 from mRNA and the N3 of U36 from tRNA and two water-mediated interactions bridging Ψ1-O2/U36-O4 and Ψ1-N3/U36-O2 (Fig. 4A, top). The physical presence of the water molecule W1 also widens the minor groove, in addition to forming a hydrogen bond to the N3 of A1493 to enable recognition. This type of U/U base pair, bridged by a water molecule, can be found in helix 44 of the ribosome in close proximity to the A1492-A1493 decoding center (Vicens and Westhof 2003). This U/U base pair with the bridging water is isosteric to the canonical Watson-Crick base pairs and preserves the C1′-C1′ distance in a standard A-form helix. At the second position, an A/G imino pair is present, and this hetero purine-purine pair widens the minor groove (Fig. 4A, middle). Although the hydrogen bond to G530 is absent, the A1492 nucleoside which interacts only with the codon nucleotide from mRNA and not with the anticodon nucleotide from the tRNA can still make the ribose zipper. In terms of other base-pairing modes, a cis H/W conformation is not supported, as the H8 atom of A2 would clash with A1492-H2 in the A-minor interaction. MC-Sym is also able to build a trans H/S A/G pair, but the energy of this conformation is unfavorable. At the third position, a canonical Watson-Crick A/U pair is accommodated as anticipated.

The energy profile predicts a ranking of tRNAThr(UGU) > tRNASer(UGA) ≥ tRNATyr(GUA) >> tRNAAsn(GUU) and tRNALys(UUU) (Table 3). The experimental results show that ΨAA is primarily decoded by Thr and Ser, in good agreement with our rankings. tRNAAsn(GUU) and tRNALys(UUU) feature a Ψ/U pair at the first position, yet lack an A/G hetero-purine pair at the second position. The resulting energetic curves are in good agreement with our hypothesis that a Ψ/U pair is viable only when it coexists with an A/G pair. tRNATyr(GUA) is the only tRNA that shows a comparable energy profile to tRNASer(UGA), but tRNATyr(GUA) is not used in decoding. We do not know at this time why tRNATyr(GUA) is outcompeted at the A-site.

Case study II: Decoding ΨGA with tRNATyr(GUA)

According to the considerations in Table 4, tyrosine can be incorporated following the decoding of ΨGA by a GUA anticodon. This means that a consecutive G/U base pair at


the first and second position and an A/G pair at the third position are tolerated. Surveys of the PDB indicate that G/U pairs are mainly found in the wobble form (cis W/W, Saenger type XXVIII), although there exist other hydrogen bonding schemes that occur much less frequently than the cis W/W type. Surveys of the PDB also show that the A/G base pair at the third position can be of any one of four types: cis W/W ("imino," Saenger type VIII), cis W/H (Saenger type IX), trans W/S (Saenger type X), or trans S/H ("sheared," Saenger type XI). In all, a total of eight structural hypotheses were tested with MC-Sym (two G/U types times four A/G types). The wobble G/U pair is found to be favored over all other G/U pair types at the second position (Fig. 4B, middle). Regarding the third base pair, a trans H/S type A/G pair is ruled out since it would prevent G530 from interacting with the ribose of the third codon nucleotide. The cis H/W type A/G pair seems to be energetically preferred (Table 4): The adenosine is in syn conformation with respect to its sugar and forms two direct hydrogen bonds with the opposing guanosine (Fig. 4B, bottom). The minor groove width is ~11.1 Å, close to 10.3 Å for a canonical base pair. The cis W/W conformation cannot be ruled out purely based on modeling, but its C1′-C1′ distance is wider than cis H/W, and it is less energetically favored (Table 4). Lastly, tRNALeu(UAA), tRNAIle(IAU), and tRNAIle(UAU), which cannot be ruled out by simple principles of base-pairing mode, seem

TABLE 3. Rule-out description for tRNAs with an A or U at the third anticodon position in the decoding of the ΨAA nonsense codon

The graph shows the decoy set energy distributions of the decoding center for selected tRNAs. Black curves represent decoding of ΨAA by tRNAThr(UGU) (line 1) and tRNASer(UGA) (line 2). Decoding of ΨAA may also be possible with tRNATyr(GUA): the A/G base pair at the third codon position can be either type IX, a.k.a. cis H/W (line 3a) or type VIII, a.k.a. cis W/W imino (line 3b). However, tRNATyr(GUA) is outcompeted for unknown reasons. tRNAAsn(GUU) (line 4) and tRNALys(UUU) (line 5) are less energetically favored and therefore not used in decoding. All base pairs are of cis W/W type unless annotated otherwise.

TABLE 4. Rule-out description for tRNAs with an A or U at the third anticodon position in the decoding of the ΨGA nonsense codon

The graph shows the decoy set energy distributions for selected tRNAs. Black curves represent decoding of ΨGA by tRNATyr(GUA) (line 1a) and tRNAPhe(GAA) (line 3a), both with an A/G base pair at the third codon position of type IX, a.k.a. cis H/W, which is slightly more favored than their type VIII, a.k.a. cis W/W, imino counterparts (lines 1b and 3b). Decoding of ΨGA is also calculated for tRNACys(GCA) (line 2), tRNAArg(UCU) (line 4), tRNALeu(UAA) (line 5), tRNASer(GCU) (line 6), tRNAIle(IAU) (line 7), tRNAIle(UAU) (line 8), tRNAAsn(GUU) (line 9), and tRNALys(UUU) (line 10). Most of these are less energetically favored than tRNATyr(GUA) and tRNAPhe(GAA), except for tRNACys(GCA) (line 2). Although tRNACys(GCA) is high in energy ranking, its ranking in decoding should be lower because of its i6A37 modification.
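The eight structural hypotheses tested for ΨGA are the Cartesian product of the candidate G/U and A/G pair types. The following is a minimal bookkeeping sketch, not an MC-Sym script. The text names only the wobble G/U type, so the second G/U label is a placeholder, and all variable and function names are illustrative assumptions.

```python
from itertools import product

# G/U pair types at the second codon position. Only the wobble type is
# named in the text; the second label is a placeholder (assumption).
GU_TYPES = [
    "cis W/W wobble (Saenger XXVIII)",
    "alternative G/U type (not named in the text)",
]

# A/G pair types at the third codon position, as listed in the text.
AG_TYPES = [
    "cis W/W imino (Saenger VIII)",
    "cis W/H (Saenger IX)",
    "trans W/S (Saenger X)",
    "trans S/H sheared (Saenger XI)",
]


def structural_hypotheses(gu_types, ag_types):
    """Return every (G/U type, A/G type) combination to be modeled."""
    return list(product(gu_types, ag_types))


hypotheses = structural_hypotheses(GU_TYPES, AG_TYPES)
for number, (gu, ag) in enumerate(hypotheses, start=1):
    print(f"hypothesis {number}: second pair = {gu}; third pair = {ag}")
```

Each combination would correspond to one decoy set whose energy distribution is then compared with the others.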


to be energetically outcompeted. tRNACys(GCA) cannot be ruled out with energetic considerations; yet, a cysteine residue was not observed experimentally in the decoded ΨGA codon (Karijolich and Yu 2011). We will provide an explanation as to why Cys is not decoded in the "Role of Ψ in decoding" section below.

Case study III: Decoding of ΨAG

ΨAG is selectively decoded as serine and threonine, which is similar to ΨAA, yet differs in the observed ratio of amino acid incorporation. Application of the rule-out considerations in Tables 1 and 2 effectively narrows down the tRNA candidates to tRNASer(UGA), tRNASer(CGA), tRNAThr(UGU), and tRNAThr(CGU). To be extra cautious, tRNALys(CUU) and tRNALys(UUU) are also used in the decoy set energy calculation in addition to these four tRNAs (Table 5). tRNASer(CGA) (line 1) and tRNAThr(CGU) (line 2) are energetically preferred, which agrees well with the experimental results. Since the base-pairing modes observed for these two tRNAs are very similar to those shown in Figure 4 (cis W/W type Ψ/U and Ψ/A [Fig. 4A], A/G imino pair [Fig. 4B], and both tRNAs form a regular Watson-Crick G/C pair at the third position of the codon-anticodon pair), they are not illustrated in detail to avoid redundancy.

Role of Ψ in decoding

At the first position of the mini-helix between ΨGA and tRNATyr(GUA), as well as ΨAA/ΨAG with tRNASer, Ψ is in the anti conformation and forms a Watson-Crick base pair with the anticodon A36 in the tRNA. The N1 atom of Ψ is protonated at physiological pH and resides in the major groove; the N1H is frequently found to be involved in hydrogen bonding interactions with phosphate group(s) of the Ψ residue or a nearby nucleotide (Arnez and Steitz 1994; Yarian et al. 1999). For example, a Ψ/A pair is embedded within the anticodon stem of certain tRNA species (Fig. 5A; Delagoutte et al. 2000). In this helical context, previous studies have shown that a Ψ27/A43 pair is of higher stability compared to a U/A pair and should be considered as an alternative base pair for CG (Price and Gray 1998). We used the conformation of the Ψ27/A43 pair from PDB file 1F7U (Delagoutte et al. 2000) and superimposed it on the first U/A pair of the A-site codon-anticodon mini-helix from PDB file 2HGP (Yusupova et al. 2006). Because the 5′ phosphate groups of Ψ27 and U overlay very well with each other (Fig. 5B), during the decoding of ΨGA by tRNATyr(GUA), a similar water molecule is also expected to bridge the N1 atom of Ψ to its 5′ phosphate and to increase the rigidity of the base pair (Fig. 5C). The kink of the mRNA backbone between the A- and P-site is known to contribute to the discrimination of the cognate versus noncognate mRNA-tRNA mini-helix (Sanbonmatsu and Joseph 2003). Therefore, the greater rigidity of the Ψ/A pair, conferred primarily by the water molecule, may help the formation of mismatched base pairs in the second and third position of the mini-helix that eventually help the mini-helix to pass the selectivity test via the mRNA kink. Similarly, the water molecule retained in the major groove by Ψ in a Ψ/U pair (e.g., ΨAA pairing with tRNAThr(UGU)) can make favorable contributions to the formation of the mini-helix. Finally, from a kinetic point of view, increased stiffness of the phosphate backbone connecting the codon in the A-site to that in the P-site may help slow down the rate of mRNA translocation (Rodnina and Wintermeyer 2001), thereby providing extra time for the mismatched base pairs in the A-site to form.

Nucleotide modifications in the tRNA anticodon loop can be another factor that influences decoding of Ψ-modified nonsense codons. For instance, hypermodified A37 and U34 are known to increase not only the stability of the mRNA-tRNA mini-helix but also the decoding capacity of tRNA (Bjork et al. 1987; Agris et al. 1997; Murphy et al. 2004; Weixlbaumer et al. 2007; Jenner et al. 2010). In yeast, both threonylcarbamoyl (t6A) and isopentenyl (i6A) groups are known modifications to the N6 amino group of A37: the former plays a role in tRNAs such as tRNAThr decoding the ANN codons, while i6A37 is often present in tRNAs that read UNN codons such as tRNASer, tRNACys, and tRNATyr (Hall 1970; Nishimura 1972; Persson et al. 1994; Agris et al. 2007). Since ΨRR codons are the subjects of this study, we focused primarily on the isopentenyl modification. The lack of i6A37 modification is known to reduce the activity of the serine-inserting and tyrosine-inserting UGA suppression, and the i6 chemical group is thought to affect codon-anticodon interaction (Bjork et al. 1987). A recent ribosome crystal structure (Jenner et al. 2010) shows that the isopentenyl group of A37 in the A-site tRNA resides in the major groove of the mRNA-tRNA mini-helix right on top of the first codon-anticodon pair and also projects toward the 5′ side of the anticodon (Fig. 5D). When Ψ is present at the first codon position, its N1H group has the enhanced ability (compared to the C5H group of uridine) to retain water molecule(s) in the major groove. As a result, a solvation shell around the i6 modification composed of these low-mobility water molecules can form (Fig. 5E). Because the isopentenyl group is hydrophobic, this solvation shell likely disfavors i6A37-containing tRNAs when decoding Ψ-modified nonsense codons. Two of the top-three-ranked tRNAs under consideration for decoding ΨGA, tRNACys(GCA) and tRNATyr(GUA), contain the i6A37 modification in yeast (Table 5; Sprinzl and Vassilenko 2005). The presence of i6A37 should decrease the ranking of both tRNAs. Unfortunately, we are unable to calculate the quantitative contributions of the i6A37 modification to the energy of the decoding center. For tRNACys(GCA), this modification-dependent decrease may become large enough to drop its ranking to below tRNAPhe(GAA), which would explain why Phe is used for decoding ΨGA instead of Cys.

Reprogramming the genetic code using Ψ in sense codons

To our knowledge, no Ψ has been found in mRNA so far. Given the recent discoveries that certain Ψ modifications in U2 spliceosomal RNA can be induced under stress conditions (Wu et al. 2011), it is possible that Ψ may also be present in mRNAs under certain physiological conditions. If true, mRNA modification with Ψ in sense codons may reprogram the sense codon, allowing it to be read as amino acids distinct from the genetic code. We, therefore, performed modeling to predict how Ψ-modified sense codons might be decoded, starting with ΨUU derived from the UUU codon coding for phenylalanine (Table 6; Fig. 6). Based on the principles described above for selective decoding of Ψ-modified nonsense codons, tRNAs with Leu, Ser, and Trp anticodons can readily be eliminated. The remaining candidates were subjected to decoy set energy distribution analysis. The two highest ranked candidates are tRNACys(GCA) and tRNATyr(GUA) (Fig. 6). Interestingly, in the model for the mRNA-tRNA mini-helix formed between ΨUU and tRNACys(GCA) or tRNATyr(GUA), a cis W/W type U/C or U/U pair is observed for the second codon-anticodon pair, and a water molecule is also found to reside in proximity to the minor groove (Fig. 6A,B, middle), similar to the situation of the Ψ/U pair for decoding ΨAA/ΨAG. It is possible that such water molecules can always help the mini-helix to survive the ribosomal A-site test through widening the minor groove of the pyrimidine-pyrimidine base pair. Our

TABLE 5. Rule-out description for tRNAs with an A or U at the third anticodon position in the decoding of the ΨAG nonsense codon

The graph shows the decoy set energy distributions for selected tRNAs. Black curves represent decoding of ΨAG by tRNASer(CGA) (line 1), tRNAThr(CGU) (line 2), tRNASer(UGA) (line 4), and tRNAThr(UGU) (line 5). Decoding of ΨAG is also calculated for tRNALys(CUU) (line 3) and tRNALys(UUU) (line 6). The overall ratio of amino acid incorporation is labeled next to the corresponding, most energetically favored tRNA decoder.

FIGURE 4. Roles of the A-minor motifs in the decoding recognition. (A) tRNA-mRNA base pairs between tRNAThr(UGU) and ΨAA. (B) tRNA-mRNA base pairs between tRNATyr(GUA) and ΨGA. (C) tRNA-mRNA base pairs between tRNAPhe(GAA) and UUU taken from the crystal structure of 1IBM are also shown for comparison. Distances between C1′ atoms of all base pairs are measured and annotated. Black dotted lines represent hydrogen bonds, and gray lines represent weaker interactions. Carbon atoms are colored in green (for crystal structures) or cyan (modeled structures), nitrogen in blue, oxygen in red, and hydrogen in white.
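The base pairs named for each mini-helix follow from aligning the codon antiparallel to anticodon positions 34 to 36, so that codon position 1 faces nucleotide 36. The sketch below lists the three pairs for the codon-anticodon combinations discussed in the text. The function name and the ASCII label "Psi" for Ψ are illustrative assumptions, and the sketch says nothing about pair geometry or energy.

```python
def codon_anticodon_pairs(codon, anticodon):
    """Pair codon positions 1-3 with anticodon positions 36, 35, 34.

    Both sequences are written 5'->3' (the anticodon from position 34
    to position 36), so the anticodon is reversed before pairing.
    """
    if len(codon) != 3 or len(anticodon) != 3:
        raise ValueError("codon and anticodon must each have three nucleotides")
    return [f"{c}/{a}" for c, a in zip(codon, reversed(anticodon))]


# Codon, anticodon, and the amino acid assigned to that tRNA in the text.
examples = [
    (("Psi", "A", "A"), "UGU", "Thr"),
    (("Psi", "G", "A"), "GUA", "Tyr"),
    (("Psi", "A", "G"), "CGA", "Ser"),
    (("Psi", "U", "U"), "GCA", "Cys"),
    (("Psi", "U", "U"), "GUA", "Tyr"),
]

for codon, anticodon, amino_acid in examples:
    pairs = codon_anticodon_pairs(codon, anticodon)
    print("".join(codon), anticodon, amino_acid, pairs)
```

For the ΨGA codon and the GUA anticodon this gives Psi/A, G/U, and A/G at the first, second, and third positions, matching the pairs described in Case study II.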


prediction of ΨUU decoding is consistent with the experimental results (Y-T Yu, pers. comm.).

CONCLUSIONS

Nonsense codon suppression is a biologically relevant event and provides unique angles to study the mechanism of translational termination (Atkins and Gesteland 2010). Yu et al. demonstrated recently that pseudouridylation of nonsense codons stimulates read-through of the Ψ-modified nonsense codons, and the modified codons are selectively decoded by just two amino acids each (Karijolich and Yu 2011). In this study, we provide a molecular rationale for the Ψ-mediated suppression of nonsense codons and how they are decoded as specific amino acids. The presence of Ψ weakens the recognition of the modified nonsense codons by release factors, allowing for the binding of competing tRNA molecules. Subsequent decoding events are not governed by the cellular abundance of tRNA, but rather by the stability of certain codon-anticodon pairs and the sterics of certain base pairs that pass the ribosome A-site test. In the minor groove, the A1492, A1493, and G530 residues of the small subunit rRNA measure the width and geometry of the A-site mini-helix; we illustrate in detail two cases of how mismatch-containing mini-helices can pass this A-site test. In the major groove, Ψ plays positive roles in the formation of the codon-anticodon mini-helices by stabilizing codon-anticodon base pairs and negative roles for tRNAs with hydrophobic modifications immediately 3′ to the anticodon base that pairs with Ψ. We also extend the decoding principles derived from Ψ-modified nonsense codons to predict the decoding capacity of potential Ψ-modified sense codons, as exemplified by ΨUU. Of course, many more possible scenarios of Ψ-modified sense codons remain since uridine is present in 34 among the 61 sense codons (13 in the first codon position). Ψ in the second or third position of a

FIGURE 5. Proposed roles of Ψ in decoding. (A) In the high resolution structure of 1F7U, a water molecule bridges the N1 atom of Ψ and two backbone phosphate groups. (B) Overlay of Ψ27/A43 (1F7U) with U/A, first pair in the A-site of 2HGP. (C) Modeled Ψ/A pair from the mini-helix formed between ΨGA and tRNATyr(GUA) superimposed on Ψ27/A43 (1F7U). (D) Chemical structure of i6A37 on the left and a molecular view made from 3I8G showing the position of the i6 group on the right. (E) Space-filled illustrations of the A-site tRNA-mRNA mini-helix show that the accommodation of the hydrophobic i6 modification is less favorable for Ψ in the first codon position compared to uridine. Color coding is the same as that used in Figure 1, except that carbon atoms in A37 are colored in magenta.

TABLE 6. Prediction of tRNAs that may decode the ΨUU codon

Only tRNAs with an A at the third anticodon position are considered. The graph shows the decoy set energy distributions for decoding ΨUU by tRNACys(GCA) (line 1), tRNATyr(GUA) (line 2), tRNAPhe(GAA) (line 3), tRNASer(IGA) (line 4), and tRNAIle(IAU) (line 5).
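The codon counts quoted in the Conclusions (uridine in 34 of the 61 sense codons, 13 of them with uridine at the first position) can be checked by direct enumeration. The sketch below assumes the standard genetic code with UAA, UAG, and UGA as the only nonsense codons; the variable names are illustrative.

```python
from itertools import product

BASES = "UCAG"
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard nonsense codons

# All 64 triplets, then the 61 that are not stop codons.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

# Sense codons containing at least one uridine, and those starting with one.
with_uridine = [codon for codon in sense_codons if "U" in codon]
uridine_first = [codon for codon in sense_codons if codon[0] == "U"]

print(len(sense_codons), len(with_uridine), len(uridine_first))  # prints: 61 34 13
```

The 13 codons in `uridine_first` are the candidates to which the first-position rules derived here would apply most directly.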


codon may have different effects on decoding properties compared to Ψ in the first codon position, and different rules are likely needed to understand these decoding events. Experiments designed to determine the specificity of amino acid incorporation for Ψ-modified sense codons will be highly desirable to further derive and improve the rules of decoding. The combination of computational and experimental analysis of Ψ-modified codons should be very useful in elucidating genetic code reprogramming by mRNA modifications.

MATERIALS AND METHODS

Molecular modeling

All molecular modeling attempts start with PDB file 1IBM (Ogle et al. 2001). There are more recent 70S structures, but their A-site superimposes perfectly with that of 1IBM (e.g., Selmer et al. 2006). From this template, we extracted nucleotides 30–40 from chain "Y" for the tRNA anticodon loop, nucleotides 1406–1410 and 1490–1495 of chain "A" for Helix 44, and nucleotides 516–521 and 528–533 of chain "A" for Loop 530. Nucleotides at the edges of these various stems and loops are kept fixed in 3D space, thus providing a structural context to build the decoding event of Ψ-containing nonsense codons in the ribosomal A-site.

We used the computer program MC-Sym to build the atomic-precision, 3D models of the ribosome's decoding center (Major et al. 1991; Major 2003; Parisien and Major 2008). In a typical modeling trial, a partial structure is taken as the input by MC-Sym, and nucleotides are then added to this partial structure. The addition of nucleotides is done based on relations (stacking, pairing, etc.) between nucleotides. For instance, it can load the ribosome's decoding center, and add nucleotide 530 in a stacking fashion on top of nucleotide 518 (Fig. 3A). The stacking conformations are those compiled from existing ones from solved structures in the PDB (Berman et al. 2000). MC-Sym is a constraint satisfaction problem (CSP) solver, which means that any output solution structure satisfies all the structural constraints specified in the input script. Structural constraints are explicit distance constraints or steric clashes between heavy atoms, as well as the relations between nucleotides given in the input script. The use of MC-Sym allows for explicitly controlling the types of a base pair through either the Saenger (Saenger 1984) or the Leontis-Westhof nomenclature (Leontis and Westhof 2001) and variants (Major and Lemieux 2002); hence, various structural hypotheses can be tested. The modeling of the decoding center has been broken down into two steps in order to take advantage of the divide-and-conquer principle of MC-Sym's backtracking algorithm (Major 2007; Ali et al. 2010; Wang et al. 2011), in which smaller fragments are first generated and then used as input to subsequent modeling runs.

In the first step, nucleotides 34–36 of the tRNA's anticodon loop are sampled along with nucleotide 530 of the ribosome's Loop 530 (Fig. 3A). In this step, two structural features are enforced with the use of distance constraints: the ribose zipper between the anticodon loop and Loop 530, and Hodgson and Fuller's rule for the anticodon loop conformation (Fuller and Hodgson 1967). Distance constraints for the ribose zipper are (530:N3)-(35:O2′) and (530:O2′)-(35:O2′), between 2.5 and 3.0 Å. Distance constraints for the Hodgson and Fuller rule are (34:PSY)-(35:PSY)-(36:PSY)-(37:PSY), between 2.5 and 6.0 Å. The stacking of nucleotides in the anticodon loop can be controlled via a distance constraint between consecutive PSY pseudoatoms, which can be found at the center of each six-member ring of each nucleotide.

In the second step, models from the first step are used as input to MC-Sym upon which the mRNA nonsense codon and the two adenosines of Helix 44 are grafted (Fig. 3B). At this point, many structural hypotheses can be tested by explicitly specifying particular base-pair types between the tRNA anticodon loop and the nonsense codon mRNA. The stringent A-site test, performed by adenosines 1492 and 1493 of Helix 44 via A-minor interactions to the tRNA-mRNA mini-helix and by G530 of Loop 530 sensing the third codon position, imposes structural constraints that are enforced using distance constraints. All MC-Sym scripts are available upon request.

While modeling, we did not enforce the G530/A1492 pairing, as MC-Sym would sometimes not yield solutions, especially in the

FIGURE 6. Predicting decoding of the ΨUU codon by tRNACys(GCA) and tRNATyr(GUA). (A) tRNA-mRNA base pairs between tRNACys(GCA) and ΨUU. (B) tRNA-mRNA base pairs between tRNATyr(GUA) and ΨUU. (C) tRNA-mRNA base pairs between tRNAPhe(GAA) and UUU taken from the crystal structure of 1IBM are also shown for comparison. The same color coding as in Figure 4 is used.
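The first-step distance constraints (ribose zipper, 2.5 to 3.0 Å; consecutive PSY pseudoatoms, 2.5 to 6.0 Å) amount to interval checks on atom-atom distances. The sketch below is not MC-Sym input syntax. It reads the (34:PSY)-(35:PSY)-(36:PSY)-(37:PSY) chain as three consecutive pairwise constraints, which is an interpretation of the text, and the coordinates are made up solely to exercise the check.

```python
import math

# (atom A, atom B, lower bound, upper bound) in angstroms, from the text.
RIBOSE_ZIPPER = [
    ("530:N3", "35:O2'", 2.5, 3.0),
    ("530:O2'", "35:O2'", 2.5, 3.0),
]
# Assumed reading: one constraint per pair of consecutive PSY pseudoatoms.
STACKING = [
    ("34:PSY", "35:PSY", 2.5, 6.0),
    ("35:PSY", "36:PSY", 2.5, 6.0),
    ("36:PSY", "37:PSY", 2.5, 6.0),
]


def violated_constraints(coords, constraints):
    """Return the constraints whose atom-atom distance is out of bounds."""
    failures = []
    for atom_a, atom_b, lower, upper in constraints:
        distance = math.dist(coords[atom_a], coords[atom_b])
        if not lower <= distance <= upper:
            failures.append((atom_a, atom_b, round(distance, 2)))
    return failures


# Made-up coordinates (angstroms); not taken from any structure.
toy_coords = {
    "530:N3": (0.0, 0.0, 0.0),
    "530:O2'": (0.0, 2.0, 0.0),
    "35:O2'": (2.8, 0.0, 0.0),
    "34:PSY": (0.0, 0.0, 3.4),
    "35:PSY": (0.0, 0.0, 6.8),
    "36:PSY": (0.0, 0.0, 10.2),
    "37:PSY": (0.0, 0.0, 17.0),
}

failures = violated_constraints(toy_coords, RIBOSE_ZIPPER + STACKING)
print(failures)
```

A CSP solver rejects any candidate model for which such a list is non-empty; here the toy model fails the second ribose-zipper constraint and the 36-37 stacking constraint.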


case of the A/G mismatch at the second codon position. A-minor interactions occur by themselves and do not require A1492 to pair with G530. Furthermore, G530 is locked in the anti position through hydrogen bonds with both the sugar rings of the second anticodon nucleotide of tRNA and the third nucleotide of the codon.

Energy minimization

All models have been energy-minimized using the Tinker molecular package version 5 (Ponder and Richards 1987) and the Amber-99 force-field (Kollman et al. 2000) in the gas-phase. The minimization algorithm is the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) method. Convergence is attained when the root-mean-squared gradient value is <0.1 kcal/mol/Å. A sodium cation is appended at the apex of each phosphate group to neutralize the total charge of the complexes. During minimization, all heavy atoms of a nucleobase are kept to their original position using a spring constant of 1001 kcal/mol/Å, while all ions, protons, and backbone and sugar atoms are free to move.

The results of the modeling and minimization are decoy sets, each featuring a different structural hypothesis, i.e., base-pair types. By plotting the energy distributions of each decoy set, we can compare and rank them by energy preferences. Thus, the decoy set that features 3D models with the best energies is believed to be preferably decoded over a decoy set whose models are less favored energetically. In addition, the whole energy curves, instead of only the best energy models, are compared since the curves may convey more robust energy information than the single best energy model alone (which could happen to fall in a deep energy well but does not represent appropriately the "ensemble"). Because the energy function is not perfect (functional form, parameters, etc.), entropy is discarded, and the energy computations have been performed in the context of an implicit solvation model, we do not claim that the energy curves correlate with the efficiency of decoding, but they should, at least, provide a qualitative ranking of the various competing tRNA species.

The use of an implicit solvation model is motivated by the following observations: first, implicit solvation models use solvent accessible surface area (SASA) to determine the average contribution of water on specific atomic groups. Here, the presence of a few explicit water molecules will decrease the SASA for the atomic groups in direct contact with these explicit water molecules, therefore preventing a double-count of water contribution. Second, soaking the modeled system in explicit water will increase the number of atoms significantly, hence offsetting the energetic differences upon mRNA:tRNA nucleotide mutations. Although the use of an explicit water model could lead to a better modeling of the dynamics of the mRNA:tRNA:rRNA interactions, it is beyond the scope of this paper.

Depicted models are those that boast a specified network of hydrogen bonds, more specifically the ribose zippers for nucleotides 530, 1492, and 1493, and the proper hydrogen bonds between the mRNA and the tRNA. Hydrogen bond strengths are computed as suggested (Boobbyer et al. 1989), which takes into account the directionality of hydrogen bonds.

Molecular graphics figures were prepared with PyMOL.

ACKNOWLEDGMENTS

This work was supported by a grant from NIH (GM088599 to T.P.) and by the Chicago Fellows program (to M.P.). We thank Drs. Yi-tao Yu and Timothy Nilsen for stimulating discussions.

Received November 11, 2011; accepted December 5, 2011.

REFERENCES

Agris PF, Guenther R, Ingram PC, Basti MM, Stuart JW, Sochacka E, Malkiewicz A. 1997. Unconventional structure of tRNALys SUU anticodon explains tRNA's role in bacterial and mammalian ribosomal frameshifting and primer selection by HIV-1. RNA 3: 420–428.
Agris PF, Vendeix FA, Graham WD. 2007. tRNA's wobble decoding of the genome: 40 years of modification. J Mol Biol 366: 1–13.
Ali M, Lipfert J, Seifert S, Herschlag D, Doniach S. 2010. The ligand-free state of the TPP riboswitch: A partially folded RNA structure. J Mol Biol 396: 153–165.
Arnez JG, Steitz TA. 1994. Crystal structure of unmodified tRNAGln complexed with glutaminyl-tRNA synthetase and ATP suggests a possible role for pseudo-uridines in stabilization of RNA structure. Biochemistry 33: 7560–7567.
Atkins JF, Gesteland RF, ed. 2010. Recoding: Expansion of decoding rules enriches gene expression. In Nucleic acids and molecular biology, Vol. 24. Springer Science+Business Media, New York.
Ben-Shem A, Jenner L, Yusupova G, Yusupov M. 2010. Crystal structure of the eukaryotic ribosome. Science 330: 1203–1209.
Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. 2000. The protein data bank. Nucleic Acids Res 28: 235–242.
Bjork GR, Ericson JU, Gustafsson CE, Hagervall TG, Jonsson YH, Wikstrom PM. 1987. Transfer RNA modification. Annu Rev Biochem 56: 263–287.
Boobbyer DNA, Goodford PJ, Mcwhinnie PM, Wade RC. 1989. New hydrogen-bond potentials for use in determining energetically favorable binding-sites on molecules of known structure. J Med Chem 32: 1083–1094.
Cheng Z, Saito K, Pisarev AV, Wada M, Pisareva VP, Pestova TV, Gajda M, Round A, Kong C, Lim M, et al. 2009. Structural insights into eRF3 and stop codon recognition by eRF1. Genes Dev 23: 1106–1118.
Delagoutte B, Moras D, Cavarelli J. 2000. tRNA aminoacylation by arginyl-tRNA synthetase: Induced conformations during substrates binding. EMBO J 19: 5599–5610.
Dittmar KA, Goodenbour JM, Pan T. 2006. Tissue-specific differences in human transfer RNA expression. PLoS Genet 2: e221. doi: 10.1371/journal.pgen.0020221.
Fuller W, Hodgson A. 1967. Conformation of the anticodon loop in tRNA. Nature 215: 817–821.
Gong Z, Xiao Y. 2010. RNA stability under different combinations of amber force fields and solvation models. J Biomol Struct Dyn 28: 431–441.
Hall RH. 1970. N6-(delta 2-isopentenyl)adenosine: Chemical reactions, biosynthesis, metabolism, and significance to the structure and function of tRNA. Prog Nucleic Acid Res Mol Biol 10: 57–86.
Hamma T, Ferre-D'Amare AR. 2006. Pseudouridine synthases. Chem Biol 13: 1125–1135.
Jenner LB, Demeshkina N, Yusupova G, Yusupov M. 2010. Structural aspects of messenger RNA reading frame maintenance by the ribosome. Nat Struct Mol Biol 17: 555–560.
Karijolich J, Yu YT. 2011. Converting nonsense codons into sense codons by targeted pseudouridylation. Nature 474: 395–398.
Kollman PA, Wang JM, Cieplak P. 2000. How well does a restrained electrostatic potential (RESP) model perform in calculating conformational energies of organic and biological molecules? J Comput Chem 21: 1049–1074.
Kondo J, Westhof E. 2008. The bacterial and mitochondrial ribosomal A-site molecular switches possess different conformational substates. Nucleic Acids Res 36: 2654–2666.
Korostelev A, Asahara H, Lancaster L, Laurberg M, Hirschi A, Zhu J, Trakhanov S, Scott WG, Noller HF. 2008. Crystal structure of a translation termination complex formed with release factor RF2. Proc Natl Acad Sci 105: 19684–19689.


Korostelev A, Zhu J, Asahara H, Noller HF. 2010. Recognition of the amber UAG stop codon by release factor RF1. EMBO J 29: 2577–2585.
Laurberg M, Asahara H, Korostelev A, Zhu J, Trakhanov S, Noller HF. 2008. Structural basis for translation termination on the 70S ribosome. Nature 454: 852–857.
Lemieux S, Chartrand P, Cedergren R, Major F. 1998. Modeling active RNA structures using the intersection of conformational space: Application to the lead-activated ribozyme. RNA 4: 739–749.
Leontis NB, Westhof E. 2001. Geometric nomenclature and classification of RNA base pairs. RNA 7: 499–512.
Leontis NB, Stombaugh J, Westhof E. 2002. The non-Watson-Crick base pairs and their associated isostericity matrices. Nucleic Acids Res 30: 3497–3531.
Lescoute A, Westhof E. 2006. The A-minor motifs in the decoding recognition process. Biochimie 88: 993–999.
Major F. 2003. Building three-dimensional ribonucleic acid structures. Comput Sci Eng 5: 44–53.
Major FPT. 2007. RNA tertiary structure prediction. In Bioinformatics: From genomes to therapies (ed. T Lengauer), pp. 491–539. Wiley-VCH, Weinheim, Germany.
Major F, Lemieux S. 2002. RNA canonical and non-canonical base pairing types: A recognition method and complete repertoire. Nucleic Acids Res 30: 4250–4263.
Major F, Turcotte M, Gautheret D, Lapalme G, Fillion E, Cedergren R. 1991. The combination of symbolic and numerical computation for three-dimensional modeling of RNA. Science 253: 1255–1260.
Major F, Gautheret D, Cedergren R. 1993. Reproducing the three-dimensional structure of a tRNA molecule from structural constraints. Proc Natl Acad Sci 90: 9408–9412.
Murphy FV IV, Ramakrishnan V. 2004. Structure of a purine-purine wobble base pair in the decoding center of the ribosome. Nat Struct Mol Biol 11: 1251–1252.
Murphy FV IV, Ramakrishnan V, Malkiewicz A, Agris PF. 2004. The role of modifications in codon discrimination by tRNALys UUU. Nat Struct Mol Biol 11: 1186–1191.
Nishimura S. 1972. Minor components in transfer RNA: Their characterization, location, and function. Prog Nucleic Acid Res Mol Biol 12: 49–85.
Ogle JM, Brodersen DE, Clemons WM Jr, Tarry MJ, Carter AP, Ramakrishnan V. 2001. Recognition of cognate transfer RNA by the 30S ribosomal subunit. Science 292: 897–902.
Ortiz-Meoz RF, Green R. 2011. Helix 69 is key for uniformity during substrate selection on the ribosome. J Biol Chem 286: 25604–25610.
Parisien M, Major F. 2008. The MC-Fold and MC-Sym pipeline infers RNA structure from sequence data. Nature 452: 51–55.
Persson BC, Esberg B, Olafsson O, Bjork GR. 1994. Synthesis and function of isopentenyl adenosine derivatives in tRNA. Biochimie 76: 1152–1160.
Ponder JW, Richards FM. 1987. An efficient Newton-like method for molecular mechanics energy minimization of large molecules. J Comput Chem 8: 1016–1024.
Price D, Gray M. 1998. Editing of tRNA. In Modification and editing of RNA (ed. H Grosjean, R Benne). ASM Press, Washington, DC.
Rabl J, Leibundgut M, Ataide SF, Haag A, Ban N. 2011. Crystal structure of the eukaryotic 40S ribosomal subunit in complex with initiation factor 1. Science 331: 730–736.
Rodnina MV, Wintermeyer W. 2001. Fidelity of aminoacyl-tRNA selection on the ribosome: Kinetic and structural mechanisms. Annu Rev Biochem 70: 415–435.
Saenger W. 1984. Principles of nucleic acid structure. Springer-Verlag, New York.
Sanbonmatsu KY, Joseph S. 2003. Understanding discrimination by the ribosome: Stability testing and groove measurement of codon-anticodon pairs. J Mol Biol 328: 33–47.
Schuwirth BS, Borovinskaya MA, Hau CW, Zhang W, Vila-Sanjurjo A, Holton JM, Cate JH. 2005. Structures of the bacterial ribosome at 3.5 Å resolution. Science 310: 827–834.
Selmer M, Dunham CM, Murphy FV IV, Weixlbaumer A, Petry S, Kelley AC, Weir JR, Ramakrishnan V. 2006. Structure of the 70S ribosome complexed with mRNA and tRNA. Science 313: 1935–1942.
Siegfried NA, O'Hare B, Bevilacqua PC. 2010. Driving forces for nucleic acid pKa shifting in an A+·C wobble: Effects of helix position, temperature, and ionic strength. Biochemistry 49: 3225–3236.
Song H, Mugnier P, Das AK, Webb HM, Evans DR, Tuite MF, Hemmings BA, Barford D. 2000. The crystal structure of human eukaryotic release factor eRF1–mechanism of stop codon recognition and peptidyl-tRNA hydrolysis. Cell 100: 311–321.
Sprinzl M, Vassilenko KS. 2005. Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res 33: D139–D140.
Stombaugh J, Zirbel CL, Westhof E, Leontis NB. 2009. Frequency and isostericity of RNA base pairs. Nucleic Acids Res 37: 2294–2312.
Sund J, Ander M, Aqvist J. 2010. Principles of stop-codon reading on the ribosome. Nature 465: 947–950.
Tuller T, Carmi A, Vestsigian K, Navon S, Dorfan Y, Zaborske J, Pan T, Dahan O, Furman I, Pilpel Y. 2010. An evolutionarily conserved mechanism for controlling the efficiency of protein translation. Cell 141: 344–354.
Vicens Q, Westhof E. 2003. Molecular recognition of aminoglycoside antibiotics by ribosomal RNA and resistance enzymes: An analysis of x-ray crystal structures. Biopolymers 70: 42–57.
Villescas-Diaz G, Zacharias M. 2003. Sequence context dependence of tandem guanine:adenine mismatch conformations in RNA: A continuum solvent analysis. Biophys J 85: 416–425.
Wang Z, Parisien M, Scheets K, Miller WA. 2011. The cap-binding translation initiation factor, eIF4E, binds a pseudoknot in a viral cap-independent translation element. Structure 19: 868–880.
Weixlbaumer A, Murphy FV IV, Dziergowska A, Malkiewicz A, Vendeix FA, Agris PF, Ramakrishnan V. 2007. Mechanism for expanding the decoding capacity of transfer RNAs by modification of uridines. Nat Struct Mol Biol 14: 498–502.
Weixlbaumer A, Jin H, Neubauer C, Voorhees RM, Petry S, Kelley AC, Ramakrishnan V. 2008. Insights into translational termination from the structure of RF2 bound to the ribosome. Science 322: 953–956.
Williamson JR. 2000. Induced fit in RNA-protein recognition. Nat Struct Biol 7: 834–837.
Wu M, Turner DH. 1996. Solution structure of (rGCGGACGC)2 by two-dimensional NMR and the iterative relaxation matrix approach. Biochemistry 35: 9677–9689.
Wu G, Xiao M, Yang C, Yu YT. 2011. U2 snRNA is inducibly pseudouridylated at novel sites by Pus7p and snR81 RNP. EMBO J 30: 79–89.
Yarian CS, Basti MM, Cain RJ, Ansari G, Guenther RH, Sochacka E, Czerwinska G, Malkiewicz A, Agris PF. 1999. Structural and functional roles of the N1- and N3-protons of Ψ at tRNA's position 39. Nucleic Acids Res 27: 3543–3549.
Yildirim I, Turner DH. 2005. RNA challenges for computational chemists. Biochemistry 44: 13225–13234.
Yildirim I, Stern HA, Sponer J, Spackova N, Turner DH. 2009. Effects of restrained sampling space and nonplanar amino groups on free-energy predictions for RNA with imino and sheared tandem GA base pairs flanked by GC, CG, iGiC or iCiG base pairs. J Chem Theory Comput 5: 2088–2100.
Yusupova G, Jenner L, Rees B, Moras D, Yusupov M. 2006. Structural basis for messenger RNA movement on the ribosome. Nature 444: 391–394.
Zaborske JM, Narasimhan J, Jiang L, Wek SA, Dittmar KA, Freimoser F, Pan T, Wek RC. 2009. Genome-wide analysis of tRNA charging and activation of the eIF2 kinase Gcn2p. J Biol Chem 284: 25254–25267.
Zaher HS, Green R. 2009. Quality control by the ribosome following peptide bond formation. Nature 457: 161–166.
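As a supplementary illustration of the decoy-set comparison described under "Energy minimization", the sketch below contrasts ranking by the single best model with ranking by a summary of the whole energy distribution. The energies are synthetic and the hypothesis labels are hypothetical; the median is used here only as one simple stand-in for comparing whole curves and is not claimed to be the measure used in this study.

```python
from statistics import median

# Synthetic energies (kcal/mol) for three hypothetical decoy sets.
# "hypothesis B" contains one unusually deep model and a poor ensemble.
decoy_sets = {
    "hypothesis A": [-120.0, -118.5, -117.0, -116.0, -115.5],
    "hypothesis B": [-131.0, -104.0, -103.5, -102.0, -101.0],
    "hypothesis C": [-110.0, -109.0, -108.5, -108.0, -107.0],
}


def rank_by(summary, sets):
    """Order decoy-set names from most to least favorable (lowest first)."""
    return sorted(sets, key=lambda name: summary(sets[name]))


by_best = rank_by(min, decoy_sets)       # single best model per set
by_median = rank_by(median, decoy_sets)  # summary of the whole distribution

print("ranked by best model:", by_best)
print("ranked by median:    ", by_median)
```

With these made-up numbers the single deep model places hypothesis B first when only the best energy is considered, whereas the median places it last, which is the kind of distortion that comparing whole energy curves is meant to avoid.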


Rationalization and prediction of selective decoding of pseudouridine-modified nonsense and sense codons

Marc Parisien, Chengqi Yi and Tao Pan

RNA published online January 26, 2012


Copyright © 2012 RNA Society