<<

Ribosomal frameshifting into an overlapping gene in PNAS PLUS the 2B-encoding region of the cardiovirus genome

Gary Loughrana,1, Andrew E. Firthb,1,2, and John F. Atkinsa,c

aBioSciences Institute, University College Cork, Cork, Ireland; bDepartment of Pathology, University of Cambridge, Cambridge CB2 1QP, United Kingdom; and cDepartment of Human Genetics, University of Utah, Salt Lake City, UT 84112-5330

Edited by Thomas Shenk, Princeton University, Princeton, NJ, and approved September 20, 2011 (received for review February 21, 2011)

The genus Cardiovirus (family Picornaviridae) currently comprises In the tandem slippage model, the P-site anticodon re-pairs from the species Encephalomyocarditis (EMCV) and Theilovirus. XXY to XXX, whereas the A-site anticodon re-pairs from YYZ Cardioviruses have a positive-sense, single-stranded RNA genome to YYY, thus allowing for perfect re-pairing except at the wobble that encodes a large polyprotein (L-1ABCD-2ABC-3ABCD) that is position (19). Because the codon:anticodon duplex in the P site is cleaved to produce approximately 12 mature proteins. We report not monitored so strictly as that in the A site, certain deviations on a conserved ORF that overlaps the 2B-encoding sequence from the canonical XXX of the slippery site are tolerated, includ- of EMCV in the þ2 reading frame. The ORF is translated as a ing UCC in some members of the Japanese ser- 128–129 amino acid transframe fusion (2B*) with the N-terminal ogroup of flaviviruses, GGU in Bean leafroll luteovirus, GUU in 11–12 amino acids of 2B, via ribosomal frameshifting at a con- Equine arteritis arterivirus, and GGA in Culex flavivirus (20–24). served GGUUUUY motif. Mutations that knock out expression The relative efficiencies of a large selection of slippery heptanu- of 2B* result in a small-plaque phenotype. Curiously, although cleotides have been tested extensively in reticulocyte lysates (18). þ theilovirus sequences lack a long ORF in the 2 frame at this geno- The efficiency of frameshifting depends on the identity of the mic location, they maintain a conserved GGUUUUU motif just slippery site nucleotides but is typically <1% in the absence of downstream of the 2A-2B junction, and a highly localized peak additional stimulatory elements. Thus, most known instances of in conservation at polyprotein-frame synonymous sites suggests eukaryotic −1 frameshifting are stimulated (typically to a level that theiloviruses also utilize frameshifting here, albeit into a of between 1% and 50%, depending on the particular system) BIOCHEMISTRY very short þ2-frame ORF. Unlike previous cases of programmed by the presence of a 3′ stable RNA secondary structure, such as −1 frameshifting, here frameshifting is modulated by virus infec- a pseudoknot or stem-loop, that is separated from the slippery tion, thus suggesting a novel regulatory role for frameshifting in “ ” – these . heptanucleotide by a spacer region of 5 9 nt (16, 17). Struc- tures of this type are thought to be located at the mRNA unwind- ∣ translation ∣ genetic recoding ∣ Theiler’s murine ing site of the mRNA entrance channel when their stimulatory encephalomyelitis virus ∣ Saffold virus effect is exerted, with at least part of the relevant consequence being a ribosomal pause. However, another class of 3′ frameshift he genus Cardiovirus belongs to the Picornaviridae family stimulators appear not to exert their effect via intra-mRNA struc- Tof positive-sense single-stranded RNA viruses. Currently, ture but, instead, are likely to involve direct interaction between two cardiovirus species are recognized—Encephalomyocarditis the mRNA and rRNA or other translation components (25, 26). virus (EMCV) and Theilovirus. The latter species encompasses Until recently, most known cases of phylogenetically conserved a number of divergent clades and isolates, including Theiler’s eukaryotic −1 ribosomal frameshifting involved frameshifting murine encephalomyelitis virus (TMEV), Rat theilovirus (herein between two long ORFs. The vast majority of known cases came referred to as RTV), Vilyuisk virus (VHEV) and Saffold virus from viral genomes, and very often the frameshift ORF encodes (SAFV). The genome comprises a single RNA molecule of ap- the viral polymerase that is required in lower quantities than the proximately 8 kb that contains a single long open reading frame products of the upstream ORF (27). One reason for this bias is (ORF), which is translated as a polyprotein and cleaved by simply that long ORFs are easily detected by sequence analysis. the virus-encoded 3C protease (1–3). Like many , With the recent massive increase in sequence data available for cardioviruses use an unusual proteolysis-independent but ribo- many viruses, powerful comparative computational analyses have some-dependent mechanism, termed “StopGo” or “stop-carry revealed a number of cases where the frameshift ORF is rela- on,” to separate the L-1ABCD-2A segment of the polyprotein tively, and sometimes very, short (28–30). Where the function of from the 2BC-3ABCD segment during polyprotein translation frameshifting is simply to produce a truncated form of a protein, (4–7). StopGo is mediated by the amino acid motif D(V/I) the frameshift ORF may be almost nonexistent (e.g., a frameshift ExNPG|P (where the “|” represents the junction between 2A site followed immediately by a stop codon), as in the dnaX gene and 2B), which, together with upstream amino acids that have of Escherichia coli and related bacteria (reviewed in ref. 31). a strong alpha-helical propensity, prevents formation of a peptide Here, we describe a short ORF accessed via −1 ribosomal frame- bond between Asn-Pro-Gly and Pro but allows the continuation shifting in the cardioviruses. Frameshifting occurs at a conserved – of translation with up to near-100% efficiency (7 13). Recently, G GUU UUY motif just downstream of the StopGo cassette SAFV has been shown to be a common and widespread human virus—causing infection mainly in early childhood—although the pathogenicity and clinical significance of SAFV infection is Author contributions: G.L., A.E.F., and J.F.A. designed research; G.L. and A.E.F. performed currently unclear (14, 15). research; G.L., A.E.F., and J.F.A. analyzed data; and G.L., A.E.F., and J.F.A. wrote the paper. Many viruses harbor sequences that induce a portion of trans- The authors declare no conflict of interest. lating ribosomes to shift −1 nt and continue translating in the This article is a PNAS Direct Submission. new reading frame (16, 17). The eukaryotic −1 frameshift site Freely available online through the PNAS open access option. typically consists of a “slippery” heptanucleotide fitting the con- 1G.L. and A.E.F. contributed equally to this work. sensus motif X XXY YYZ, where XXX represents any three 2To whom correspondence should be addressed. E-mail: [email protected]. identical nucleotides; YYY represents AAA or UUU; Z repre- This article contains supporting information online at www.pnas.org/lookup/suppl/ sents A, C, or U; and spaces separate zero-frame codons (18). doi:10.1073/pnas.1102932108/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1102932108 PNAS Early Edition ∣ 1of9 Downloaded by guest on September 27, 2021 and, in EMCV, results in the translation of a 128–129 amino acid paring the observed number of base substitutions with the num- transframe fusion protein, 2B*. ber expected under a neutral evolution model. The procedure takes into account whether synonymous site codons are 1-, 2-, Results 3-, 4-, or 6-fold degenerate and the differing probabilities of Computational Analysis Reveals a Conserved Overlapping ORF in transitions and transversions. Statistics were then summed over EMCV. The polyprotein coding sequences from the 21 full- or a phylogenetic tree as described in ref. 33, and averaged over near full-length EMCV sequences available in GenBank were a sliding window. extracted, translated, aligned, and back-translated to a nucleotide The analysis revealed a striking and highly statistically sig- sequence alignment. Next, the alignment was analyzed for con- nificant (p ∼ 10−32) increase in synonymous site conservation in a servation at polyprotein-frame synonymous sites, as described in region covering approximately 120 codons beginning shortly ref. 32. Briefly, beginning with pairwise sequence comparisons, downstream of the junction between the 2A and 2B coding re- conservation at synonymous sites (only) was evaluated by com- gions (Fig. 1A). Within this region, the mean synonymous substi-

−1 frameshift site A EMCV L1A 1B 1C 1D 2A 2B 2C 3AB 3C 3D (1)

StopGo 2B* transframe protein

(2) +1

(3) 2B* +2

− 10 8 − 10 6 CRE, Lobert et al 1999 − (4) 10 4 − 10 2 1

(5) 1.0 0.0 500 1000 1500 2000 B TMEV/RTV

(6) L* +1

(7) +2

− 10 8 − 10 6 CRE, Lobert et al 1999 − (8) 10 4 − 10 2 1

(9) 1.0 0.0 500 1000 1500 2000 C Saffold virus

(10) +1

(11) +2

− 10 8 −6 presumed CRE 10 − (12) 10 4 − 10 2 1

(13) 1.0 0.0 500 1000 1500 2000 codon index in polyprotein coding sequence

Fig. 1. Conservation at polyprotein-frame synonymous sites in three cardiovirus clades. (A) Twenty-one EMCV sequences, including . (B) Twelve TMEV and RTV sequences. (C) Twenty-two SAFV sequences. Panel 1 shows a map of the EMCV coding region, along with the overlapping gene 2B*, the site of frameshifting, and the site of StopGo cotranslational separation at the 2A-2B junction. Panels 2–3, 6–7, and 10–11 show the positions of stop codons (blue triangles) in the þ1 and þ2 reading frames relative to the polyprotein reading frame, and alignment gaps (green rectangles), in all the aligned sequences. Note the conserved absence of stop codons in the þ2 frame in EMCV in the 2B* region (panel 3) and in the þ1 frame in TMEV/RTV in the L* region (panel 6). Panels 4– 5, 8–9, and 12–13 depict the conservation at polyprotein-frame synonymous sites in a 15-codon sliding window. Panels 4, 8, and 12 depict the probability that the degree of conservation within a given window could be obtained under a null model of neutral evolution at synonymous sites, whereas panels 5, 9, and 13 depict the absolute amount of conservation as represented by the ratio of the observed number of substitutions within a given window to the number ex- pected under the null model. The range of features detected by this type of analysis depends partly on the sliding window size: 15 codons is optimal for features with a width of 15 codons and was used to illustrate the very localized conservation peak in the TMEV/RTV and SAFV alignments. In EMCV, the conservation in the 2B* region is even more pronounced if one uses a larger window size—indeed, the total conservation summed over the 2B* ORF has a corresponding p-value of approximately 10−32. Conversely, features that cover significantly fewer than 15 codons may not be apparent in this analysis. Note that, due to codon usage, the lengths of “random” (i.e., noncoding) ORFs tend to be longer in the þ1 frame than in the þ2 frame (34).

2of9 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1102932108 Loughran et al. Downloaded by guest on September 27, 2021 tution rate was only approximately 27% of the genomic average and transcription, translation enhancement, and/or packaging PNAS PLUS (Fig. 1A). Such peaks in synonymous site conservation are gen- (42). In contrast to these elements, the conservation peak in erally indicative of functionally important overlapping elements the 2B region of EMCVextends to a greater degree over a greater —either coding or noncoding. In this case, however, an inspec- region. tion of the positions of stop codons in the þ1 and þ2 reading frames relative to the polyprotein reading frame, in all 21 se- Computational Analysis of Theiloviruses. Next, we applied the same quences, revealed a conserved absence of stop codons in the þ2 analyses to other members of the Cardiovirus genus. Phylogeneti- reading frame in a region that corresponds precisely to the region cally, VHEV clusters with TMEV, whereas SAFV forms another of enhanced conservation (Fig. 1A), thus suggesting an over- clade that is more closely related to the TMEV–VHEV clade lapping coding sequence in the þ2 reading frame as a possible than to EMCV (14, 43). The divergent RTV clade clusters am- explanation for the enhanced conservation at polyprotein-frame biguously with either SAFVor TMEV, depending on the genomic synonymous sites. One other explanation for extended conserva- region analyzed, but we chose to analyze RTV alongside TMEV tion could be recombination, a phenomenon that is relatively because of the common presence of the L* ORF (Fig. 1B). When common in the Picornaviridae, though analysis with a variety of we analyzed the conservation at synonymous sites and recombination detection algorithms in the RDP3 package failed the positions of þ2-frame stop codons in alignments of either to reveal any putative recombination events that could explain TMEV/RTV (Fig. 1B; 12 sequences) or SAFV (Fig. 1C;22 the conservation in this region (35, 36). sequences) we still observed a striking peak in synonymous site At the 5′ end of the conserved region there is a conserved conservation just downstream of the 2A-2B junction. However, G GUU UUY motif (where Y represents either a C or a U), the extent of the peak was greatly reduced compared to that seen which we inferred to be the site of frameshifting (Fig. 2A). This in EMCV. Moreover, the extent of the corresponding þ2-frame heptanucleotide has been implicated previously as the slippery ORF was similarly reduced—covering just 8 codons in the major- sequence for −1 ribosomal frameshifting in certain luteoviruses, ity of sequences (Fig. 2A). Nonetheless, an inspection of the such as Bean leafroll and Soybean dwarf (22). Moreover, sequence data revealed a conserved G GUU UUU motif at the although the G at position one is conserved, the corresponding 5′ end of the conserved region and, again, the functionality of polyprotein-frame codon is not conserved, with CCG (Pro), CAG this motif was further supported by conservation of the G at posi- (Gln), and AAG (Lys) all being represented in different EMCV tion one despite both UCG (Ser) and UUG (Leu) codons being sequences, thus providing further support for the functional present at that position (recall that there are six possible codons

importance of the G GUU UUY motif. Using alignment RNA- for each of Ser and Leu, yet only three of the 12 codons end BIOCHEMISTRY folding software such as alidot and pseudoknot-predicting soft- in G; Fig. 2A). Analysis of the 3′ sequence with RNA-folding ware such as pknotsRG, we were unable to predict a convincing algorithms again failed to reveal a 3′ RNA structure at the cano- 3′ RNA secondary structure at a canonical distance (i.e., 5–9 nt) nical (5–9 nt) distance for stimulation of −1 frameshifting. How- for stimulating efficient ribosomal frameshifting at the G GUU ever, enhanced conservation at polyprotein-frame synonymous UUY motif (37, 38). However, we were also unable to definitively sites extended for approximately 5 codons 3′ of the þ2-frame rule out such a structure, because long-range base pairings and/or stop codon, suggesting that the 3′ nucleotides may nonetheless structures involving complex tertiary interactions can be difficult be involved in the stimulation of frameshifting. to detect based on computational analysis alone (39, 40). Besides the conservation peak in the sequence encoding 2B, A number of other peaks in synonymous site conservation were our analysis also recovered the known overlapping gene in detected, including one corresponding to the cis-active RNA ele- the 5′-terminal region of the TMEV/RTV polyprotein coding ment (CRE) (41) that plays a role in replication, as well as several sequence that encodes L*, a protein essential for the progression in the region of the genome that encodes 3D, the viral polymerase of TMEV infection to persistence (Fig. 1B) (44–46) and—as (Fig. 1A). In fact, such conservation peaks are common in the for EMCV—the CRE, and several conservation peaks in the 3′ 5′- and/or 3′-terminal regions of many viral polyprotein coding region of the genome. sequences and often correspond to overlapping noncoding ele- There is currently only one VHEV sequence in GenBank with ments, such as RNA secondary structures involved in replication coverage of the 2B region, and because this sequence covers only A

B

C

Fig. 2. Representative nucleotide and peptide sequences. (A) Nucleotide sequences flanking the proposed site of frameshifting (confirmed in EMCV) in re- presentative cardiovirus sequences (GenBank accession numbers shown at left). The codons that encode NPG|P of the StopGo cassette are highlighted in pale blue. The slippery heptanucleotide where −1 frameshifting occurs is highlighted in pale orange. The stop codon that terminates the very short þ2-frame ORF in TMEV and SAFV is indicated in red (in EMCV, the þ2-frame ORF is much longer; see Fig. 1A). (B) Amino acid sequence of the transframe protein 2B* in men- govirus DQ294633. The two antigens used for Abs anti-N and anti-C are highlighted in yellow. Peptide sequences detected by mass spectrometric analysis of 2B* purified from infected cells are indicated with underscores (anti-N IP) and overscores (anti-C IP). The amino acids, VF, corresponding to the site of frame- shifting are indicated in bold. (C) Amino acid sequences of the much shorter predicted transframe peptide in representative TMEV and SAFV sequences.

Loughran et al. PNAS Early Edition ∣ 3of9 Downloaded by guest on September 27, 2021 part of the polyprotein, it was not used in the synonymous site tension, and a 16-nt loop. The predicted stems are supported by a conservation analysis. However, it maintains the G GUU UUU number of compensatory substitutions, but certain nucleotides in motif and an 8-codon þ2-frame ORF, as observed for TMEV. the loop are completely conserved, most notably the 5′-most 6 nt, A AAC AC. In fact AAAC loop motifs are a widespread feature Conservation of a More Distal 3′ RNA Structure. Although conserved of Picornaviridae CREs, and this motif is also present in the RNA structures were not found at the canonical spacing (5–9 nt) loop of the EMCV shift-site 3′-proximal stem-loop structure (48) for stimulating ribosomal frameshifting at the G GUU UUY (Fig. 3). motif, alignment folding did uncover a potential structure at a larger spacing (13–14 nt) from the slippery site (Fig. 3). Because Immunoblotting Demonstrates Expression of 2B*. We chose to test of very high nucleotide conservation in this region, the structure for expression of the predicted transframe protein 2B* in men- —although conserved—has no support from compensatory govirus (a strain of EMCV) rather than TMEV because the substitutions within the EMCV alignment. However, a similar length of 2B* in mengovirus (129 codons) is more amenable to structure was also predicted for TMEV, RTV, and SAFV, where experimental analysis. Two separate antibodies (Abs) were raised it is supported by an A-U to U-A substitution between TMEV against predicted antigens within mengovirus 2B*: one against and RTV, besides many differences in the primary nucleotide se- the amino-terminal 12 aa (anti-N) and a second against the car- quence between these viruses and EMCV (Fig. 3). Indeed, this boxy-terminal 14 aa (anti-C). The amino-terminal 12 aa of 2B* structure was one of just six EMCV–TMEV phylogenetically con- are encoded by the zero-frame sequence 5′ of, and including, the served potential RNA structures predicted within the cardiovirus G GUU UUU motif so that anti-N, but not anti-C, was expected coding region in a computational survey of the Picornaviridae to react with both 2B and 2B* (Fig. 2B). Additional Abs were also (47). The structure bears a resemblance—possibly coincidental obtained specifically for 2B but proved ineffective. —to the cardiovirus CRE, a structure essential for VPg uridylyla- Western blots with anti-N detected virus-specific proteins mi- tion in all picornaviruses (48) (Fig. 3). Using the EMCV and grating at approximately 13 and 16 kDa in lysates from BHK-21 TMEV CREs defined in ref. 41 as seeds, alignment folding pre- cells infected for 5 h with wild-type (WT) mengovirus (MOI 10; dicts very similar core CRE structures in EMCV, TMEV, RTV, Fig. 4A). To determine whether these products might correspond ′ and SAFV: a 6-bp stem, 1-nt 3 bulge, a potential 2-bp stem-ex- to 2B and/or 2B*, separate in vitro translation reactions were pro- grammed with constructs expressing 2B and 2B*. Although 2B* CA −1 U A has a predicted molecular mass of 14.2 kDa (assuming frame- A−U shifting at the G GUU UUU motif), in vitro expressed 2B* C−G U G migrates at a position corresponding to approximately 16 kDa * C A G−C U A (Fig. 4A). Meanwhile, in vitro expressed 2B, which has a pre- | A EMCV U−A C dicted molecular mass of 16.5 kDa, migrates at a position corre- UC C DQ294633 (mengovirus) G−C sponding to approximately 13 kDa (Fig. 4A). Thus, we identified A−U C−G the more slowly migrating product as 2B*. Interestingly, in 45 nt G−C infected cells, 2B* was much more readily detected with anti-N G−C C−G than 2B (Fig. 4A). Possible reasons for this include the unknown 5’− GGUU UUU CAGACUCAAGGAG AACGACUUGGCC −3’ WT frameshifting efficiency, differences in 2B and 2B* stability, or possible 3C protease cleavage between phylogenetically con- 3914 50 nt 3980 served Gln and Gly residues just 3 aa downstream of the shift site AAAU (conserved also in TMEV, RTV, and SAFV), which could remove U C C C the N-terminal antigen from a portion of 2B. Although the anti-N A C TMEV A−U U−A EU718733 U−A (in RTV) C−G G−C U−A (in RTV) G G−C G−C 5’− GGUU UUU CAGCCACAAGGUGC CCAGACAGGAA −3’

4239 4294 C CAG C U A C C A A A AAAU A C U C A C−G A C C A C A−U A A EMCV CRE | G SAFV U−A M81861 C−G EU681177 C−G U * G G−C G−C U−A G−C G−C C−G G−C A−U G−C 5’− CAAU CUUG −3’ 5’− GGUG AUGU −3’ Fig. 4. Immunodetection of 2B* and 2B. (A) 2B/2B* anti-N Western blot 1304 1344 3978 4011 (WB) of protein lysates from BHK-21 cells either mock-infected (mock) or infected (MOI 10) with WT mengovirus (vMC0) for 5 h. As controls for the Fig. 3. RNA structures predicted from alignment folding. The G GUU UUY apparent molecular masses of 2B and 2B*, in vitro translated (IVT) 2B and frameshift site in EMCV and corresponding predicted frameshift site in TMEV 2B*—as well as a negative control (Co) with no input RNA added—were also are boxed. A potential lower extension to the stem in EMCV is indicated by immunoblotted. The right-hand panel shows radiolabeled in vitro translated underscores, but this extension is not completely conserved throughout the 2B and 2B*. (B) Western blots of anti-N and anti-C immunoprecipitates pre- EMCV alignment. The EMCV CRE is also shown, with the 5′-most loop nucleo- pared from mock- or vMC0-infected (MOI 1.0) BHK-21 cells as indicated by − tides, AAACAC, that are conserved throughout the cardioviruses highlighted or +. Cell lysates were prepared 8 h p.i. and incubated with IgG beads ± either in bold. Similar nucleotides in the shift-site 3′-proximal stem-loops are also anti-N or anti-C as indicated. Control cell lysates removed before the addition highlighted in bold. Numbers represent genomic coordinates of selected nu- of beads or Abs are shown in the far right lanes marked “lysates.” The 2B/2B* cleotides within the sequences indicated by GenBank accession numbers. N-terminal and 2B* C-terminal antigens are indicated in Fig. 2B.

4of9 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1102932108 Loughran et al. Downloaded by guest on September 27, 2021 10 Western blots of the 2B and 2B* in vitro translations suggested mock−infected PNAS PLUS otherwise, it is also possible that 2B suffered from a reduced virus−infected transfer efficiency or reduced Ab affinity (e.g., due to the differ- 8 ing amino acid sequences immediately adjacent to the N-terminal epitope) relative to 2B*. 6 Western blots with anti-C also detected a virus-specific product in lysates from virus-infected cells that migrates at approximately 4 16 kDa but, as expected, did not detect a product migrating at approximately 13 kDa (Fig. 4B). To determine whether the ap- % frameshifting proximately 16-kDa product detected by both anti-N and anti-C 2 is the same product, immunocomplexes were prepared from 0 BHK-21 cells infected with WT mengovirus (MOI 1.0) for 8 h. WT SS WT SS WT SS Infected cells were lysed in RIPA buffer to dissociate proteins 45 nt 3′ 50 nt 3′ 125 nt 3′ that could coimmunoprecipitate with the approximately 16-kDa product. As shown in Fig. 4B, the approximately 16-kDa protein Fig. 5. Extent of the 3′ sequence that contributes to frameshift stimulation. immunoprecipitated by anti-N was detected by anti-C, and simi- Dual luciferase assay showing frameshifting efficiencies following transfec- larly the approximately 16-kDa protein immunoprecipitated by tion of WT and slip-site mutated (SS) pDluc constructs with 45, 50, or 125 nt ′ anti-C was detected by anti-N. This strongly suggests that the anti- 3 of the slip site. Relevant IFCs were also transfected in parallel. Eighteen hours later, transfectants were either mock-infected or infected with vMC0 gens recognized by anti-N and anti-C are on the same protein. (MOI 2.0) for 7 h. Cells were then lysed and luciferase activities measured. Error bars represent the standard deviation of four independent experi- Mass Spectrometric Confirmation of 2B* Expression and the Site of ments, in each of which each construct was transfected in quadruplicate. Frameshifting. To further establish that the approximately 16-kDa protein detected by both anti-N and anti-C is in fact the predicted 48 nt 3′ of the G GUU UUY motif (Fig. 3) and, consistent with 2B*, we resolved by SDS-PAGE anti-N and anti-C immunopre- this structure having a stimulatory effect on frameshifting, con- cipitates prepared from both mengovirus-infected and mock- structs with 50 nt of 3′ sequence stimulated frameshifting in infected BHK-21 cells, followed by total protein staining with virus-infected cells to a level 3-fold higher than constructs with Coomassie blue and excision of gel slices containing a product just 45 nt of 3′ sequence (Fig. 5). Targeted mutations that dis-

migrating at approximately 16 kDa that was present only in those rupted the potential structure also reduced frameshifting to BIOCHEMISTRY immunoprecipitates prepared from virus-infected cells. Gel slices low levels; however, mutations designed to restore the potential were digested with trypsin and the resulting peptides analyzed by stem but with reversed base pairings did not restore frameshift- – liquid chromatography tandem mass spectrometry (LC/MS/MS) ing to WT levels, suggesting that, if the structure does indeed – [with Fourier transform ion cyclotron resonance (FT-ICR)]. Pep- stimulate frameshifting, then both the structure and primary tides corresponding to 75% and 60% of the predicted 2B* se- sequence elements within the structure are important for this ef- quence were identified, respectively, from the anti-N and anti- fect (Fig. S2). Because of the potential for additional stimulatory C immunoprecipitates (Fig. 2B). Three different sized peptides factors, and the effect of virus infection, it is not clear at this from both anti-N and anti-C immunoprecipitates spanned the stage whether these values are closely representative of the actual predicted frameshift site. These data confirm that a portion of level of frameshifting in the virus itself. ribosomes translate 2B* via −1 frameshifting on the G GUU UUU motif during translation of the mengovirus transcript Inactivating 2B* Results in a Small-Plaque Phenotype. To investigate (raw data in Fig. S1). whether 2B* is an essential protein, we introduced mutations into an infectious clone of mengovirus. Three mutants were gen- Frameshifting Is Stimulated by Virus Infection and Requires at Least erated that are synonymous with respect to the polyprotein read- 50 nt 3′ of the Slip Site. Eukaryotic −1 frameshifting is usually ing frame but that are unable to express full-length 2B*: (i)aG stimulated by a stable 3′ RNA secondary structure beginning GUU UUU to A GUG UUU mutation to prevent frameshifting 5–9nt3′ of the shift site. However, as mentioned previously, no at the slip site (SS), (ii) a CCG AAC GAC to CCU AAU GAC such conserved structures were predicted, but we did identify a mutation at nt positions þ48 and þ51 from the shift site to in- potential structure beginning 13–14 nt 3′. The frameshifting troduce two consecutive þ2 frame premature termination codons efficiency and possible role of sequences 3′ of the slip site were just 3′ of the slip site (PTC1), and (iii) a CUA UCG GAU to CUU investigated using the dual luciferase reporter assay (49, 50). WT AGU GAU mutation at nt positions þ153 − 156 from the shift site (G GUU UUU) and mutant (A GUG UUU) slip-site sequences, to introduce an alternative set of þ2 frame stop codons (PTC2). as well as up to 125 nt 3′, were cloned between the two luciferase Although StopGo separation is not known to be affected by genes in vector pDluc so that the downstream firefly luciferase downstream elements, to avoid this as a potential confounding gene was in the −1 frame relative to the upstream renilla lucifer- factor we tested whether any of the mutants might affect StopGo. ase gene. Each frameshift construct had a corresponding in-frame Parts of the mengovirus genome (specifically, from the start of 1D control (IFC) construct in which the firefly gene was in the same to the end of 2B) from WT, SS, PTC1, and PTC2 viruses were frame as the renilla gene. Frameshifting efficiencies were deter- cloned into a plasmid suitable for cell-free transcription and mined by comparing the ratio of firefly to renilla enzymatic translation (Fig. 6A). As a control for impaired StopGo, a con- activities in parallel cell cultures transfected with either the fra- struct in which the critical NPG|P motif was mutated to NPLV meshift or IFC construct. was also prepared. Similar to WT, failure of StopGo separation Constructs were transfected into BHK-21 cells 18 h before occurred at just a very low level in the SS, PTC1, and PTC2 mu- either mock-infection or infection with WT mengovirus (MOI tants (Fig. 6B). 2.0). Frameshifting efficiencies of approximately 3.5% and 6.6% RNA transcribed from WT, SS, PTC1, and PTC2 infectious were observed in virus-infected cells transfected with constructs clones was transfected into BHK-21 cells to generate virus stocks. containing the WT shift site and, respectively, 50 or 125 nt 3′ Although end-point titers were similar between all viruses, plaque (Fig. 5). In contrast, the same constructs promoted only low levels sizes were significantly reduced when either the slip site was (<1%) of frameshifting in the absence of virus infection (Fig. 5). mutated or premature termination codons were introduced into When the shift site was mutated, frameshifting was reduced to the 2B* reading frame (Fig. 6C). We also consistently observed a background levels. The potential RNA structure in EMCV ends delayed cytopathic effect (CPE) when BHK-21 cells were infected

Loughran et al. PNAS Early Edition ∣ 5of9 Downloaded by guest on September 27, 2021 Fig. 6. Analysis of 2B* knockout mutants. (A) Schematic representation of the mengovirus region expressed by in vitro translation to test whether the SS, PTC1 or PTC2 mutations affect StopGo separation. (B) SDS-PAGE of radiolabeled in vitro translation reactions. LV has a mutation in the StopGo region (NPG|P to NPLV) that completely inhibits StopGo separation of 2B from 1D-2A. SS, PTC1, and PTC2 have the same mutations as the mutant mengoviruses of the same names (see main text). IFC is an in-frame control in which the WT slip site sequence has been mutated from G GTT TTT to A GTG TTT T so that, instead of 2B, a protein identical to transframe 2B* is translated. (C) Mean plaque diameter determined by measuring 100 plaques formed by standard plaque assay after infection with each WT or mutated mengovirus as indicated. Error bars indicate standard deviations. Inset: anti-N and anti-C Western blots of anti-N immu- nocomplexes on lysates prepared from BHK-21 cells infected with either WT mengovirus (MOI 1.0) for 8 h or each indicated mutated mengovirus (MOI 1.0) for 22 h. (D) Pulse labeling of vMC0, vMC0-SS, vMC0-PTC1, vMC0-PTC2 (MOI 10), and mock-infected BHK-21 cells at 8 and 22 h p.i.. The positions of the major mengovirus proteins are shown to the left. Arrowheads indicate positions of host proteins that are absent or decreased in cells infected with vMC0 when compared to mock-infected cells. There were no labeled proteins in the WT lane at 22 h p.i. because cells infected with vMC0 for 22 h had already undergone CPE.

with any of the three mutant viruses [up to approximately 24 h (2BC-3ABCD) to those encoded upstream of the shift site (L- postinfection (p.i.) at MOI 1.0] compared to WT virus infections 1ABCD-2A) appears to be greater in SS than in WT (compare, (CPE within approximately 10 h p.i. at MOI 1.0). Host cell shut- for example, the relative intensities of the bands corresponding to off was also delayed in the mutants, with many host cell proteins 1C and 3C in SS and WT in Fig. 6D). Notwithstanding potential expressed at higher levels after 8 h of mutant virus infection than confounding effects (e.g., polyprotein processing and turnover), after 8 h of infection with WT virus (arrowheads in Fig. 6D). This this difference may represent the portion of ribosomes that are may be due to slower virus multiplication within the cell, rather removed from the transcript within the region encoding 2B and than a specific role for 2B* in shutoff. 2B* in WT virus but not in the shift-site mutant. A discrepancy in Immunoprecipitation (IP) experiments using lysates from the molar ratio between the upstream and downstream products BHK-21 cells infected with WT mengovirus for 8 h or the three has been noted previously (51, 52)—though only by comparison mutant viruses for 22 h detected the approximately 16-kDa pro- with the shift-site mutant may one discriminate potential riboso- duct (2B*) for WT virus but failed to detect any specific product mal drop-off during StopGo from ribosomal “removal” via the from the mutant viruses when probed with either anti-N or anti-C frameshifting. (Fig. 6C). Immunoblotting with anti-N detected the approxi- Given that 2B* mutants have an attenuated phenotype in cell mately 16-kDa product only for WT virus, whereas an approxi- culture, one might expect there to be selection for reversion to mately 13-kDa product (2B) was detected for WT virus and all WTsequence. To test this, the SS, PTC1, and PTC2 mutants were three mutants (Fig. 7A). In principle, anti-N should have detected serially passaged in cell culture, and progeny from each passage C-terminally truncated versions of 2B* for the PTC1 and PTC2 were sequenced. The shift-site mutant (SS) partially reverted mutants. These products have expected sizes of approximately (A GUG UUU to A GUU UUU) from passage 3 in one experi- 3.0 kDa and 7.1 kDa, respectively, making detection difficult. It ment and from passage 8 in another, but was not observed to fully is also possible that such truncated products are rapidly degraded. revert to WT G GUU UUU by passage 20. Indeed (see the In- Nonetheless, anti-N Western blots for the PTC2 mutants revealed troduction), the G at position 4 in SS is expected to have a much a faint specific product migrating at a suitable position for the greater inhibitory effect on frameshifting than the A at position 1, approximately 7.1-kDa truncated 2B* (Fig. 7A). and the partial revertant A GUU UUU is likely to allow some Although clearly detected with Abs, we were unable to un- frameshifting (cf. ref. 18). In fact, 2B* expression is detectable ambiguously detect 2B* via pulse labeling (Fig. 6D); 2B* may via Western blotting using lysates prepared from cells infected migrate very close to 2A (predicted molecular mass of approxi- with the partial revertant (Fig. 7A). Furthermore, the small- mately 16.7 kDa) or a host cell protein, and/or it may be present plaque phenotype observed for SS (Fig. 6C) is almost completely at relatively low abundance. Meanwhile, the truncated 2B* pro- reversed in the partial revertant (Fig. 7B). Failure to revert A to G ducts expected to be produced by the PTC1 and PTC2 mutants at position 1, despite all WT isolates containing a G at position 1, lack any methionines and so cannot incorporate the 35S label and, may be a consequence of selection in cell culture as opposed to in in any case, based on the immunoblotting, are not readily detect- vivo. The PTC1 and PTC2 mutants, which involve two and four able. Nonetheless, one effect of frameshifting may be apparent nucleotide changes, respectively, were not observed to revert. in Fig. 6D, in that the ratio of the downstream-encoded products This is not surprising because (unlike the shift-site mutant) single

6of9 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1102932108 Loughran et al. Downloaded by guest on September 27, 2021 (25, 26, 53). The predicted 3′ RNA structure bears some resem- PNAS PLUS blance to the cardiovirus CRE element that binds 3CD (41, 48) (Fig. 3). If the shift-site 3′-proximal stem-loop also binds 3CD, then it is plausible that the RNA-protein complex could act as an unusual stimulator for frameshifting (with implications for the distance between the shift site and the stimulator), and this would also explain why efficient frameshifting was only observed in virus-infected cells. Such a mechanism has a precedent in an artificial system where eukaryotic −1 frameshifting stimulated by a3′-adjacent iron-responsive element (IRE) stem-loop structure was shown to be greatly enhanced by stimulation of IRE-binding proteins (54). However, this is just one of many possible explana- tions: virus infection could also conceivably modulate frameshift- ing efficiency via other virus or host trans-acting protein factors, changes in tRNA populations, ribosome modifications, and other changes in cellular environment and/or architecture (cf. refs. 55– 57). It is also possible that non-Watson–Crick intra-mRNA inter- actions extend the structure into the spacer region in such a way that allows it to affect frameshifting in a manner more analogous to typical frameshift-stimulating structures (albeit still, unusually, apparently dependent on virus infection). Establishing the nature and mode of action of the 3′ stimulator, as well as the influence of virus-infection, is, however, outside the scope of this report. Although we have not tested the functionality of the pro- posed site of frameshifting in SAFV and TMEV, it seems prob- able from the computational analysis that they, likewise, utilize Fig. 7. Partial reversion of the mutated slip site restores 2B* expression and programmed frameshifting at this location. Unlike EMCV, how-

plaque size. (A) Anti-N Western blot of lysates prepared from BHK-21 cells ever, frameshifting in TMEV and SAFV would result in just a BIOCHEMISTRY infected for 8 h (MOI approximately 10) with either WT or the mutant men- 14–15 aa transframe peptide. Although such a small peptide may goviruses (SS, PTC1, and PTC2) as indicated. P1 indicates the first passage from be functional in itself, one could also speculate an alternative a virus stock, whereas P10 and P20, respectively, indicate 10 and 20 serial pas- function for frameshifting in TMEV and SAFV. One possibility sages of a virus stock. A higher exposure is shown for a region below 10 kDa, showing a PTC2-specific product that may represent the approximately 7.1- might be that frameshifting occurs only when StopGo separation kDa truncated 2B*. (B) Mean plaque diameter determined by measuring 100 fails to occur, thus resulting in a larger protein comprising 2A plaques formed by standard plaque assay after infection with WT mengo- fused to a 14–15 aa C-terminal transframe peptide. An alterna- virus, or slip-site mutated mengovirus that had been serially passaged once tive explanation might be that frameshifting acts as a ribosome (P1) or eight times (P8) as indicated. Error bars indicate standard deviations. sink to reduce the number of ribosomes translating the 3′-en- coded replicase proteins, in particular the polymerase 3D, freeing nucleotide changes are not expected to even partially improve up more ribosomes to translate the 5′-encoded structural pro- fitness for these mutants. teins; though, to be effective, such a sink would require significant levels of frameshifting (e.g., 50%). Interestingly, StopGo has also Discussion been proposed as a candidate for a ribosome sink that might be We have demonstrated the existence of programmed ribosomal regulated during the course of virus infection (7), though a pos- frameshifting and a 128–129 aa transframe gene, 2B*, in EMCV. sible advantage of frameshifting over StopGo as a ribosome sink Evidence includes (i) a 117-codon overlapping ORF conserved in is that frameshifting and termination at an out-of-frame stop co- all available EMCV sequences, (ii) a statistically highly significant don cleanly removes ribosomes from the message, whereas dis- enhancement in the conservation at polyprotein-frame synon- continuation of translation at the StopGo cassette may result ymous sites coinciding with the position of the ORF, (iii)a in a stalled ribosome. A third explanation may be that the frame- well-defined and conserved translation mechanism via pro- shifting plays some sort of regulatory role in the viral lifecycle grammed ribosomal frameshifting at the 5′ end of the ORF, (iv) (cf. ref. 39). This is an attractive hypothesis given that the effi- a product of the expected size was specifically detected by two ciency of frameshifting in EMCV appears to be modulated by separate Abs raised against N- and C-terminal peptides, (v) mass viral infection. spectrometry of this product purified from infected cells con- For whatever benefit, however, it seems that programmed firmed expression of 2B* and the predicted site of frameshifting, frameshifting evolved at this location in the ancestral cardiovirus. and (vi) three separate 2B* knockout mutants, each differing Then in EMCV, but not TMEVor SAFV, the ancestral frameshift from WT virus by just 2–4 mutations synonymous in the polypro- site was co-opted (perhaps while also retaining the original func- tein frame, exhibited an attenuated small-plaque phenotype. tion) as a suitable site to begin the evolution of a long overlapping Interestingly, a frameshift stimulatory RNA structure was not gene via incremental elongation by substitutions of intervening apparent at the canonical distance of 5–9nt3′ of the shift site. þ2-frame stop codons with sense codons and selection on the The reporter assays with 45 and with 50 nt 3′ of the shift site show encoded amino acid sequence (cf. ref. 34). This finding adds to the importance of the longer sequence for efficient frameshifting. a small number of known cases of phylogenetically conserved Fifty nucleotides, but not 45 nt, is adequate to allow formation overlapping genes that are internal to longer coding sequences of the potential stem-loop structure identified in EMCV, TMEV, and accessed via programmed ribosomal frameshifting. Other and SAFV (Fig. 3). As the 5′ end of this potential structure is examples include potyvirus PIPO, TF, and flavivirus 13–14 nt 3′ of the shift site, if it forms it would be at the leading NS1′ (21, 28, 29). Such overlapping genes are difficult to identify edge of the mRNA entrance tunnel at the time the shift occurs. and are often overlooked. However, it is important to be aware There are other candidates for a frameshift stimulator at this of these genes as early as possible. Undetected overlapping genes position (albeit for þ1 not −1 frameshifting), as well as evidence are a source of confusion and confound mutagenic analysis of for 3′ stimulators that do not act via intra-mRNA base pairing the genes they overlap, in this case 2B. Furthermore, only once

Loughran et al. PNAS Early Edition ∣ 7of9 Downloaded by guest on September 27, 2021 it has been identified can the functions of an overlapping gene be onitrile (with 0.1% formic acid) to 70% acetonitrile (with 0.1% formic acid). investigated in their own right. Peptides were identified using the MASCOT search engine.

Materials and Methods Recombinant Constructions. Full-length recombinant mengovirus genomes Computational Analysis. Cardiovirus nucleotide sequences in Genbank with containing mutations in the slip site (SS), premature termination codons 1 complete or near-complete coverage of the polyprotein coding sequence (PTC1), and premature termination codons 2 (PTC2) were engineered by were identified by applying National Center for Biotechnology Information standard two-step PCR using primers 1–8inTable S1, along with pMC0 as tblastn (58) to the polyprotein sequences derived from GenBank cardiovirus template. Amplicons were cloned BglII/AflII into pMC0 using standard clon- reference sequences NC_001479, NC_001366, NC_009448, and NC_010810. ing procedures and all clones were confirmed by sequencing. WT, slip-site The following sequences were retrieved. EMCV: DQ464062, HM641897, mutated (SS), and IFC dual luciferase constructs were generated by two-step DQ464063, FJ604852, FJ604853, FJ897755, EU780148, EU780149, DQ517424, PCR with primers 3, 4, and 9–15 in Table S1, again using pMC0 as template. AF356822, M81861, X00463, DQ288856, X74312, AY296731, M22457, PCR products were cloned XhoI/BamHI into XhoI/BglII digested dual lucifer- M22458, M37588, X87335, DQ294633, L22089, and DQ835185. TMEV: ase vector pDluc using standard cloning procedures, and all clones were DQ401688, M16020, EU718732, EU723238, M20562, X56019, M20301, confirmed by sequencing. EU718733, and HQ652539. RTV: EU542581, EU815052, and AB090161. SAFV: AM922293, EU376394, GU595289, EU681177, EU681176, FN999911, Dual Luciferase Assays. BHK-21 cells were transfected in quadruplicate with FR682076, GU943518, EF165067, FJ463615, FJ463616, FJ463617, HM181996, 25 ng of each plasmid as indicated using Lipofectamine 2000 reagent (Invi- 4 EU681178, EU681179, HM181997, HM181998, HM181999, FM207487, trogen), using the 1-d protocol in which suspended cells (2 × 10 ) are added GU943513, GU943514, and JF813004. X00463, which is nearly identical to directly to the DNA complexes in half-area 96-well plates. Transfected cells M81861 but has six single-nucleotide deletions that cause local changes in were incubated at 37 °C for 18 h before virus infection as described above reading frame in the 1A/1B coding region, was omitted from the analysis. for the indicated times and MOI. Luciferase activities were determined using Within each clade, the polyprotein-encoding sequences were extracted, the Dual Luciferase Stop & Glo® Reporter Assay System (Promega). Relative translated, aligned, and back-translated to produce nucleotide sequence light units were measured on a Veritas Microplate Luminometer with two alignments using EMBOSS and Clustal (59, 60). Besides the full- or near full- injectors (Turner Biosystems). Transfected cells were washed once with PBS length sequences above, one additional partial sequence with coverage of and then lysed in 12.6 μLof1× passive lysis buffer, and light emission was the 2B region was retrieved (namely, EU723237, which was the only VHEV measured following injection of 25 μL of either renilla or firefly luciferase sequence in Genbank with coverage of 2B). substrate. Firefly luciferase activity was calculated relative to the activity of renilla luciferase, and frameshifting efficiencies were determined by com- Antibodies. Polyclonal Abs to two predicted antigens within 2B* were pre- paring the ratio of firefly to renilla enzymatic activities in parallel cell cultures pared by GenScript. Anti-N was raised against peptide sequence PFTFKP- transfected with either the test construct or an IFC. RQRPVFC (N-terminal 12 aa shared by 2B and 2B* + “C”). Anti-C was raised against peptide sequence CRDHKPDKPVRRNSS (“C” + C-terminal 14 aa In Vitro Transcription and Transfection. Plasmids were linearized with BamHI of 2B*). Rabbits were injected with one of the two peptides, and Abs were and transcribed with T7 RNA polymerase (Ambion) for 3 h at 37 °C as recom- affinity-purified from immune sera. A third Ab was raised against peptide mended by the manufacturer. RNA integrity was examined by electrophor- sequence CAWENVKGTLNNPEF (“C” +aa66–79 of 2B) but proved to be esis, and RNA was transfected into BHK-21 cells using DMRIE-C (Invitrogen) ineffective. according to the manufacturer’s recommendations.

Viruses and Cell Culture. BHK-21 cells (ATCC) were maintained in DMEM Cell-Free Translation. We chose not to use rabbit reticulocyte lysates for cell- supplemented with 10% FBS, 1 mM L-glutamine, and antibiotics. Virus free translation because of potential distortions in the apparent molecular vMC0 is a recombinant mengovirus that is identical to GenBank accession masses of small proteins caused by large amounts of endogenous β-globin. number DQ294633 except in the 5′UTR where vMC0 has the poly-C tract In addition, Western blotting with rabbit reticulocyte lysates subsequently deleted (61). For virus infection, BHK-21 cells were washed with PBS and probed with Abs raised in rabbit often results in high background. Therefore, then overlaid with vMC0 diluted in PBS/1% FBS at the multiplicities of cell-free translations were performed in human lysates (Thermo Scientific; infection (MOI) indicated throughout the text. After adsorption (60 min at Human In vitro Protein Expression kit; catalog no. 88855). RNA transcribed 37 °C) with gentle agitation, virus was removed and cells overlaid with from constructs expressing 2B or 2B* (see SI Materials and Methods for the DMEM/2% FBS for the times p.i. indicated. construction of these plasmids) were used to program in vitro translation reactions with [35S] methionine according to the manufacturer’s instructions. Immunodetection and IP. Infected cells were lysed in RIPA buffer plus protease Translation reactions were then analyzed by SDS-PAGE, and gels were either inhibitors on ice for 20 min. Cell debris were removed from cell lysates by dried and exposed to a phosphorscreen or else Western blotted with anti-N centrifugation at 15;000 × g at 4 °C for 15 min. Proteins were resolved by as described above. 15% Bis/Tris-PAGE (MES buffer) and transferred to nitrocellulose membranes (Protran) by electrotransfer in a tris/glycine buffer with 20% methanol at Plaque Assays. Virus titers were quantified by plaque assay. Twenty-four 1 mA∕cm2 (constant amps) for 55 min. Membranes were incubated at 4 °C hours after transfection of each infectious RNA, cells were subjected to three overnight in 2% milk with a 1∶1000 dilution of anti-N or a 1∶500 dilution cycles of freezing and thawing followed by centrifugation at 1;800 × g to pel- of anti-C. Immunoreactive bands were detected on membranes after incuba- let cell debris. Supernatants were collected then aliquotted and stored at tion with appropriate fluorescently labeled secondary Abs using a LI-COR −80 °C. BHK-21 cells were infected with 10-fold dilutions of each supernatant Odyssey®Infrared Imaging Scanner (LI-COR Biosciences). For IP, cell lysates for 1 h at 37 °C. The virus inoculums were then removed and cells were over- were incubated with 20 μL of protein G Agarose beads plus anti-N or layed with 1.0% agarose containing MEM. After 48 h of incubation, cells anti-C (3 μg of each Ab) overnight at 4 °C with gentle rocking. The beads were were fixed with 2.5% formalin, and plaques were visualized with crystal washed (3×) with ice-cold RIPA buffer and then removed from the beads by violet. boiling for 5 min in 20 μLof2× SDS-PAGE sample buffer for electrophoresis and either stained with Coomassie blue (for gel excision and subsequent Metabolic Labeling. BHK-21 cells at 90–100% confluency in 35-mm petri dishes mass spectroscopy analysis) or transferred to nitrocellulose membranes for were infected with WT or mutant mengovirus at MOI 10. After 60 min of Western blot analysis. adsorption at 37 °C, the virus inoculums were removed by aspiration. The cells were washed once with methionine-free DMEM and incubated in 0.25 ml Mass Spectrometry. Coomassie blue stained approximately 16-kDa products of methionine-free DMEM. The cells were labeled with [35S] methionine were excised from the gel and subjected to in-gel trypsin digestion. LC/MS/ (50 μCi∕mL) at 37 °C for 30 min. Cells were then washed once in PBS and lysed MS data were acquired using an LTQ-FT hybrid mass spectrometer (Thermo- by being suspended in 0.2 mL of sample buffer and heated at 95 °C for 5 min. Electron Corporation). Peptide molecular masses were measured by FT-ICR. Lysates were analyzed by SDS-PAGE (15%). After electrophoresis, the gels Peptide sequencing was performed by collision-induced dissociation in the were processed for fluorography with Amplify™ (Amersham). linear ion trap of the LTQ-FT instrument. Digest samples were introduced by nanoLC (Eksigent, Inc.) with nanoelectrospray ionization (ThermoElectron ACKNOWLEDGMENTS. We thank Chad Nelson for performing the mass Corporation). NanoLC was performed using a homemade C18 nanobore col- spectrometry and analysis. We are also very grateful to Martina Scallan umn [75-μmi:d: × 10 cm; Atlantis C18 (Waters Corporation); 3-μm particle for facilitating use of the virus laboratory. We thank Mike Howard for size]. Peptides were eluted from a 50-min linear gradient run from 4% acet- providing vector pDluc and Robert Fujinami for providing a copy of the

8of9 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1102932108 Loughran et al. Downloaded by guest on September 27, 2021 mengovirus cDNA (pMC0) developed by Ann Palmenberg. We also thank B1889 (to J.F.A.) and Wellcome Trust Grant 088789 (to A.E.F.). J.F.A. was PNAS PLUS Ann Palmenberg for providing other materials and for stimulating discus- personally supported by US National Institutes of Health Grant R01 sions. This work was supported by Science Foundation Ireland Grant 08/IN.1/ GM079523.

1. Palmenberg AC, et al. (1984) The nucleotide and deduced amino acid sequences 31. Baranov PV, Gesteland RF, Atkins JF (2002) Recoding: Translational bifurcations in gene of the encephalomyocarditis viral polyprotein coding region. Nucleic Acids Res expression. Gene 286:187–201. 12:2969–2985. 32. Firth AE, Wills NM, Gesteland RF, Atkins JF (2011) Stimulation of stop codon read- 2. Jackson RJ (1986) A detailed kinetic analysis of the in vitro synthesis and processing through: Frequent presence of an extended 3′ RNA structural element. Nucleic Acids of encephalomyocarditis virus products. Virology 149:114–127. Res 39:6679–6691. 3. Palmenberg AC (1990) Proteolytic processing of picornaviral polyprotein. Annu Rev 33. Firth AE, Brown CM (2006) Detecting overlapping coding sequences in virus genomes. Microbiol 44:603–623. BMC Bioinformatics 7:75. 4. Batson S, Rundell K (1991) Proteolysis at the 2A/2B junction in Theiler’s murine ence- 34. Belshaw R, Pybus OG, Rambaut A (2007) The evolution of genome compression and phalomyelitis virus. Virology 181:764–767. genomic novelty in RNA viruses. Genome Res 17:1496–1504. 5. Palmenberg AC, et al. (1992) Proteolytic processing of the cardioviral P2 region: 35. Simmonds P (2006) Recombination and selection in the evolution of picornaviruses Primary 2A/2B cleavage in clone-derived precursors. Virology 190:754–762. and other Mammalian positive-stranded RNA viruses. J Virol 80:11124–11140. 6. Atkins JF, et al. (2007) A case for “StopGo”: Reprogramming translation to augment 36. Martin DP, et al. (2010) RDP3: A flexible and fast computer program for analyzing codon meaning of GGN by promoting unconventional termination (Stop) after recombination. Bioinformatics 26:2462–2463. addition of glycine and then allowing continued translation (Go). RNA 13:803–810. 37. Hofacker IL (2003) Vienna RNA secondary structure server. Nucleic Acids Res 7. Brown JD, Ryan MD (2010) Ribosome “skipping”: “Stop-Carry On” or “StopGo” trans- 31:3429–3431. lation. Recoding: Expansion of Decoding Rules Enriches Gene Expression, eds JF Atkins 38. Reeder J, Steffen P, Giegerich R (2007) pknotsRG: RNA pseudoknot folding including – and RF Gesteland (Springer, Heidelberg), pp 101 122. near-optimal structures and sliding windows. Nucleic Acids Res 35:W320–324. 8. Ryan MD, Drew J (1994) Foot-and-mouth disease virus 2A oligopeptide mediated 39. Barry JK, Miller WA (2002) A –1 ribosomal frameshift element that requires base – cleavage of an artificial polyprotein. EMBO J 13:928 933. pairing across four kilobases suggests a mechanism of regulating ribosome and repli- 9. Hahn H, Palmenberg AC (1996) Mutational analysis of the encephalomyocarditis case traffic on a viral RNA. Proc Natl Acad Sci USA 99:11133–11138. – virus primary cleavage. J Virol 70:6870 6875. 40. Su L, Chen L, Egli M, Berger JM, Rich A (1999) Minor groove RNA triplex in the crystal 10. Hahn H, Palmenberg AC (2001) Deletion mapping of the encephalomyocarditis virus structure of a ribosomal frameshifting viral pseudoknot. Nat Struct Biol 6:285–292. – primary cleavage site. J Virol 75:7215 7218. 41. Lobert PE, Escriou N, Ruelle J, Michiels T (1999) A coding RNA sequence acts as a 11. Donnelly ML, Gani D, Flint M, Monaghan S, Ryan MD (1997) The cleavage activities replication signal in cardioviruses. Proc Natl Acad Sci USA 96:11560–11565. – of aphthovirus and cardiovirus 2A proteins. J Gen Virol 78:13 21. 42. Simmonds P, et al. (2008) Bioinformatic and functional analysis of RNA secondary ‘ ’ 12. Donnelly ML, et al. (2001) Analysis of the aphthovirus 2A/2B polyprotein cleavage structure elements among different genera of human and animal caliciviruses. mechanism indicates not a proteolytic reaction, but a novel translational effect: A Nucleic Acids Res 36:2530–2546. putative ribosomal ‘skip’. J Gen Virol 82:1013–1025. 43. Liang Z, Kumar AS, Jones MS, Knowles NJ, Lipton HL (2008) Phylogenetic analysis 13. Doronina VA, et al. (2008) Site-specific release of nascent chains from ribosomes at a of the species Theilovirus: Emerging murine and human pathogens. J Virol sense codon. Mol Cell Biol 28:4227–4239. 82:11545–11554. 14. Blinkova O, et al. (2009) Cardioviruses are genetically diverse and cause common BIOCHEMISTRY 44. Kong WP, Roos RP (1991) Alternative translation initiation site in the DA strain of enteric infections in South Asian children. J Virol 83:4631–4641. Theiler’s murine encephalomyelitis virus. J Virol 65:3395–3399. 15. Zoll J, et al. (2009) Saffold virus, a human Theiler’s-like cardiovirus, is ubiquitous 45. van Eyll O, Michiels T (2002) Non-AUG-initiated internal translation of the L* protein and causes infection early in life. PLoS Pathog 5:e1000416. of Theiler’s virus and importance of this protein for viral persistence. J Virol 16. Brierley I, Gilbert RJC, Pennell S (2010) Pseudoknot-dependent programmed −1 76:10665–10673. ribosomal frameshifting: Structures, mechanisms and models. Recoding: Expansion 46. Stavrou S, et al. (2010) Theiler’s murine encephalomyelitis virus L* amino acid of Decoding Rules Enriches Gene Expression, eds JF Atkins and RF Gesteland (Springer, position 93 is important for virus persistence and virus-induced demyelination. J Virol Heidelberg), pp 149–174. 84:1348–1354. 17. Miller WA, Giedroc DP (2010) Ribosomal frameshifting in decoding plant viral RNAs. 47. Witwer C, Rauscher S, Hofacker IL, Stadler PF (2001) Conserved RNA secondary Recoding: Expansion of Decoding Rules Enriches Gene Expression, eds JF Atkins and structures in Picornaviridae genomes. Nucleic Acids Res 29:5079–5089. RF Gesteland (Springer, Heidelberg), pp 193–220. 48. Steil BP, Barton DJ (2009) Cis-active RNA elements (CREs) and picornavirus RNA repli- 18. Brierley I, Jenner AJ, Inglis SC (1992) Mutational analysis of the “slippery-sequence” cation. Virus Res 139:240–252. component of a coronavirus ribosomal frameshifting signal. J Mol Biol 227:463–479. 49. Grentzmann G, Ingram JA, Kelly PJ, Gesteland RF, Atkins JF (1998) A dual-luciferase 19. Jacks T, Madhani HD, Masiarz FR, Varmus HE (1988) Signals for ribosomal frameshift- reporter system for studying recoding signals. RNA 4:479–486. ing in the Rous sarcoma virus gag-pol region. Cell 55:447–458. 50. Fixsen SM, Howard MT (2010) Processive selenocysteine incorporation during synthesis 20. Ogle JM, Ramakrishnan V (2005) Structural insights into translational fidelity. Annu – Rev Biochem 74:129–177. of eukaryotic selenoproteins. J Mol Biol 399:385 396. 21. Balmori Melian E, et al. (2010) NS1′ of flaviviruses in the Japanese encephalitis 51. Paucha E, Seehafer J, Colter JS (1974) Synthesis of viral-specific polypeptides in serogroup is a product of ribosomal frameshifting and plays a role in viral neuro- Mengo virus-infected L cells: Evidence for asymmetric translation of the viral genome. – invasiveness. J Virol 84:1641–1647. Virology 61:315 326. – 22. Domier LL, McCoppin NK, Larsen RC, D’Arcy CJ (2002) Nucleotide sequence shows 52. Lucas-Lenard J (1974) Cleavage of mengovirus polyproteins in vivo. J Virol 14:261 269. that Bean leafroll virus has a Luteovirus-like genome organization. J Gen Virol 53. Ivanov IP, Atkins JF (2007) Ribosomal frameshifting in decoding antizyme mRNAs from 83:1791–1798. yeast and protists to humans: Close to 300 cases reveal remarkable diversity despite – 23. den Boon JA, et al. (1991) Equine arteritis virus is not a togavirus but belongs to the underlying conservation. Nucleic Acids Res 35:1842 1858. coronavirus-like superfamily. J Virol 65:2910–2920. 54. Kollmus H, Hentze MW, Hauser H (1996) Regulated ribosomal frameshifting by an – 24. Firth AE, Blitvich BJ, Wills NM, Miller CL, Atkins JF (2010) Evidence for ribosomal RNA-protein interaction. RNA 2:316 323. frameshifting and a novel overlapping gene in the genomes of insect-specific flavi- 55. Hatfield D, et al. (1989) Chromatographic analysis of the aminoacyl-tRNAs which viruses. Virology 399:153–166. are required for translation of codons at and around the ribosomal frameshift sites 25. Guarraia C, Norris L, Raman A, Farabaugh PJ (2007) Saturation mutagenesis of of HIV, HTLV-1, and BLV. Virology 173:736–742. a þ1 programmed frameshift-inducing mRNA sequence derived from a yeast 56. Goff SP (2004) Genetic reprogramming by : Enhanced suppression of retrotransposon. RNA 13:1940–1947. translational termination. Cell Cycle 3:123–125. 26. Chung BY, Firth AE, Atkins JF (2010) Frameshifting in : A diversity of 3′ 57. Kobayashi Y, Zhuang J, Peltz S, Dougherty J (2010) Identification of a cellular stimulatory structures. J Mol Biol 397:448–456. factor that modulates HIV-1 programmed ribosomal frameshifting. J Biol Chem 27. Bekaert M, Rousset JP (2005) An extended signal involved in eukaryotic −1 frameshift- 285:19776–19784. ing operates through modification of the E site tRNA. Mol Cell 17:61–68. 58. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search 28. Chung BY, Miller WA, Atkins JF, Firth AE (2008) An overlapping essential gene in the tool. J Mol Biol 215:403–410. . Proc Natl Acad Sci USA 105:5897–5902. 59. Rice P, Longden I, Bleasby A (2000) EMBOSS: The European Molecular Biology Open 29. Firth AE, Chung BY, Fleeton MN, Atkins JF (2008) Discovery of frameshifting in Software Suite. Trends Genet 16:276–277. Alphavirus 6K resolves a 20-year enigma. Virol J 5:108. 60. Larkin MA, et al. (2007) Clustal W and Clustal X version 2.0. Bioinformatics 30. Firth AE, Atkins JF (2009) A conserved predicted pseudoknot in the NS2A-encoding 23:2947–2948. sequence of West Nile and Japanese encephalitis flaviviruses suggests NS1′ may derive 61. Duke GM, Osorio JE, Palmenberg AC (1990) Attenuation of Mengo virus through from ribosomal frameshifting. Virol J 6:14. genetic engineering of the 5′ noncoding poly(C) tract. Nature 343:474–476.

Loughran et al. PNAS Early Edition ∣ 9of9 Downloaded by guest on September 27, 2021