ARTICLE doi:10.1038/nature13429

Ribosomal frameshifting in the CCR5 mRNA is regulated by miRNAs and the NMD pathway

Ashton Trey Belew1*, Arturas Meskauskas1,2*, Sharmishtha Musalgaonkar1, Vivek M. Advani1, Sergey O. Sulima1†, Wojciech K. Kasprzak3, Bruce A. Shapiro4 & Jonathan D. Dinman1

Programmed −1 ribosomal frameshift (−1 PRF) signals redirect translating ribosomes to slip back one base on messenger RNAs. Although well characterized in viruses, how these elements may regulate cellular gene expression is not understood. Here we describe a −1 PRF signal in the human mRNA encoding CCR5, the HIV-1 co-receptor. CCR5 mRNA-mediated −1 PRF is directed by an mRNA pseudoknot, and is stimulated by at least two miRNAs. Mapping the mRNA–miRNA interaction suggests that formation of a triplex RNA structure stimulates −1 PRF. A −1 PRF event on the CCR5 mRNA directs translating ribosomes to a premature termination codon, destabilizing it through the nonsense-mediated mRNA decay pathway. At least one additional mRNA decay pathway is also involved. Functional −1 PRF signals that seem to be regulated by miRNAs are also demonstrated in mRNAs encoding six other cytokine receptors, suggesting a novel mode through which immune responses may be fine-tuned in mammalian cells.

Viral programmed ribosomal frameshift events typically produce carboxy-terminally extended fusion proteins. However, computational analyses predict that >95% of −1 PRF events on cellular mRNAs direct ribosomes to premature termination codons (PTC), suggesting that −1 PRF may be used by cells to regulate expression by destabilizing mRNAs through the nonsense-mediated mRNA decay (NMD) pathway1. Whereas a role for −1 PRF has been shown in yeast2,3, it has not been tested in higher eukaryotes so far. In yeast, mutants and drugs that globally affect −1 PRF generally promote deleterious phenotypes4, and global dysregulation of −1 PRF may contribute to human disease3,5–7. How sequence-specific regulation of −1 PRF might be achieved has been the central unanswered question in the field.

A −1 PRF signal in the CCR5 receptor mRNA

CCR5 is a cytokine receptor which is exploited by HIV-1 as a co-receptor for entry into CD4+ T-cells8. A strong candidate −1 PRF signal beginning at nucleotide 407 in the human CCR5 mRNA was identified computationally9. This sequence is >99% conserved among the great apes and is highly conserved among the higher primates (Extended Data Fig. 1). Using dual luciferase reporters (Extended Data Fig. 2a), the CCR5 sequence promoted 9–11% −1 PRF in HeLa (Fig. 1a) and 4.5–6.3% in Chinese hamster ovary (CHO) or Vero cells (Extended Data Fig. 2b). Mutagenesis of the slippery site from UUUAAAA to GCGCGCG reduced −1 PRF to <1% (Fig. 1a). Introduction of an in-frame termination codon (PTC control) 5′ of the CCR5-derived sequence, or placing the firefly luciferase reporter out of frame with respect to Renilla reduced −1 PRF levels by more than two orders of magnitude (Fig. 1a), ruling out the possible presence of either a splicing donor site or an internal ribosome entry signal (IRES). An in vitro translation assay revealed a peptide consistent with a CCR5 −1 PRF event at levels comparable to that promoted by the HIV-1 −1 PRF signal (Fig. 1b). Liquid chromatography dual mass spectroscopic analysis of an affinity purified CCR5–β-gal fusion protein (Extended Data Fig. 2c) unambiguously identified the predicted −1 frameshift peptide harbouring the junction between the 0 and −1 frame encoded CCR5 sequence (Fig. 1c, Extended Data Fig. 2d). Analysis of published ribosome profiling data from human cells10 revealed a sizable fraction of ribosomes paused at the CCR5 −1 PRF signal, 9/59 (~15%) of which were shifted into the −1 frame (Fig. 1d). These four lines of inquiry demonstrate that this sequence in the human CCR5 mRNA promotes efficient −1 PRF.

Structural analysis

Computational analyses predicted the presence of two nearly equivalent downstream mRNA pseudoknots or a tandem stem-loop structure immediately 3′ of the slippery site (Extended Data Fig. 3a). Analyses of chemical modification assays of a CCR5 runoff transcript (Extended Data Fig. 3b, c) were consistent with the presence of a two-stemmed mRNA pseudoknot (Fig. 2). Whereas the slippery site distal region of Stem 1 is stable, the proximal region is conformationally dynamic, consistent with single-molecule optical trap experiments revealing a complex network of folding pathways for this element11. The weak slippery site proximal half of stem 1 coupled with the internal bulge is reminiscent of the HIV-1 −1 PRF signal solution structure12 and is consistent with the emerging view of conformational complexity as a critical feature of recoding pseudoknots13,14. Stem 2 contains four semi-helical segments (labelled a, b, d, e in Fig. 2a), plus a small segment in the middle (c), all separated by unpaired bases. The unpaired bases may allow the entire structure to bend, enabling U23 to bridge the gap between C22 and U24. The ‘best fit’ conformer diagrammed in Fig. 2a was used as the basis for molecular-dynamics-based simulation of the CCR5 −1 PRF stimulatory mRNA structure (Fig. 2b, Extended Data Fig. 3d, e). The root mean squared deviation (r.m.s.d.) average structure was calculated for the last 12 ns of an 80-ns long molecular dynamics simulation, where

1Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland 20742, USA. 2Department of Biotechnology and Microbiology, Vilnius University, Vilnius, LT 03101, Lithuania. 3Basic Science Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, USA. 4Basic Research Laboratory, National Cancer Institute, Frederick, Maryland 21702, USA. †Present address: VIB Center for the Biology of Disease, KU Leuven, Campus Gasthuisberg, Herestraat 49, bus 602, 3000 Leuven, Belgium. *These authors contributed equally to this work.

21 AUGUST 2014 | VOL 512 | NATURE | 265 ©2014 Macmillan Publishers Limited. All rights reserved RESEARCH ARTICLE

the r.m.s.d.s of the full structure and its two sub-domains SL1 (nucleotides 8–22 and 55–75) and SL2 (nucleotides 24–53 and 76–103) are most stable. The total energy for this structure is −24,296 kcal mol−1.

[Figure 1 panel labels: a, −1 PRF (%) for the HIV, CCR5, SSM, PTC and OOF reporters; c, intensity (counts) versus m/z for the peptide Ac-TSRIPVVHAVFALKSQDGHLWGG; d, number of hits versus nucleotide position, with the −1 PRF signal at 407 and the −1 frame termination codon at 677.]

Figure 1 | The CCR5 sequence promotes efficient frameshifting. a, Measurement of −1 PRF in HeLa cells. −1 PRF efficiency was monitored in HeLa cells using dual luciferase reporters. Error bars denote an approximation of standard errors. ***P < 0.001 compared to the out of frame control (Student’s two-tailed t-test). b, Efficient −1 PRF promoted by the CCR5 sequence in vitro. Autoradiogram of in vitro translation reaction using mRNAs harbouring CCR5- or HIV-1-derived −1 PRF signals. Green arrows denote 0-frame encoded products. Red arrows denote −1 PRF encoded peptides. RTC indicates the readthrough control. Percentage −1 PRF promoted by CCR5 and HIV-1 frameshift signals is indicated below the lanes. c, Liquid chromatography with tandem mass spectrometry (LC-MS/MS) spectrum of a proteolytic fragment containing the CCR5 frameshift peptide. N-terminally acetylated leader peptide sequence is coloured blue, CCR5-derived 0-frame sequence beginning at V94 is red, and CCR5 −1 frame encoded sequence beginning after L101 is coloured green. d, Ribosomes accumulate at the CCR5 −1 PRF signal. Data mined from ref. 10. Top, locations of the −1 PRF signal and first −1 frame termination codon are indicated. Bottom, profiling data at the slippery site (indicated in capital letters) at single nucleotide resolution. Ribosomes arrested in the three different reading frames are colour-coded.

Figure 2 | Models of the CCR5 −1 PRF stimulating mRNA pseudoknot. a, Best-fit two-dimensional model based on chemical modification analyses. b, Three-dimensional model, two views. Slippery site is green, stem 1 is dark blue, unpaired bases and the loop within stem 1 are light blue, paired bases in stem 2 are red, and unpaired bases in stem 2 are yellow.

Stimulation of −1 PRF by miR-1224

As cellular gene expression tends to be regulated, it is reasonable to hypothesize that −1 PRF might be regulated in a sequence-specific manner. This could be achieved through base-pairing interactions between specific small noncoding RNAs (ncRNAs) and −1 PRF signals, a hypothesis supported by the ability of antisense oligonucleotides to stimulate −1 PRF (reviewed in ref. 1). Computational searches revealed miR-1224, miR-711 and miR-141 as potential interacting partners with the CCR5 −1 PRF signal (Extended Data Fig. 4a). Transfection of HeLa cells with a miR-1224 precursor revealed concentration-dependent enhancement of CCR5-mediated −1 PRF (Fig. 3a). miR-1224 did not affect HIV-1-mediated −1 PRF. Addition of a miR-1224 antagomir (anti-miR-1224), or short interfering RNA knockdown of argonaute 1 reversed the effect of miR-1224 on CCR5-mediated −1 PRF (Fig. 3b). Although anti-miR-1224 seemed to stimulate CCR5 −1 PRF, the effect was not significant (P = 0.15). siRNA knockdown of mRNAs encoding proteins involved in miRNA processing inhibited CCR5-mediated −1 PRF, but stimulated HIV-1-driven −1 PRF (Fig. 3c), supporting the model of sequence-specific regulation of −1 PRF through interactions between miRNAs and −1 PRF signals. Neither miR-141 nor miR-711 affected −1 PRF in a HeLa-cells-based assay (Extended Data Fig. 4b), perhaps owing to the presence of endogenous miR-711, and/or miR-141. However, miR-141 specifically stimulated CCR5-mediated −1 PRF in CHO cells (Extended Data Fig. 4c).

Figure 3 | Specific stimulation of CCR5-mediated −1 PRF by miR-1224. a, HeLa cells were transfected with 0–30 nmol of miR-1224 miRNA expressing constructs and with HIV-1 or CCR5 −1 PRF reporters. b, PRF assays of HeLa cells mock-transfected (0), or transfected with scrambled miRNA (Scr), a miR-1224 antagomir (anti-1224), miR-1224, miR-1224 + anti-miR-1224, or miR-1224 plus an siRNA directed against argonaute 1. c, Ablation of the miRNA processing machinery affects −1 PRF promoted by the HIV-1 and CCR5 frameshift signals. −1 PRF assays were performed using cells transfected with siRNAs targeting Argonaute 1 (AGO1), Argonaute 2 (AGO2), DGCR8, exportin 5 (XPO5) or scrambled sequences (Scr). Error bars denote standard error. *P < 0.05, **P < 0.01 (Student’s two-tailed t-test).

miR-1224/CCR5 mRNA interactions

Two different in vitro electrophoretic mobility shift assays (EMSAs) were used to probe the interactions between the CCR5 −1 PRF signal and miR-1224. In one, the RNAs were mixed and incubated at physiological temperature (‘native’), whereas in the second, they were co-denatured at high temperature and slowly annealed (‘refolded’). Both

reactions were resolved through native polyacrylamide gel electrophoresis (PAGE; Extended Data Fig. 5a, b). Although miR-1224 interacted with the CCR5 sequence with sub-nanomolar dissociation constants in both conditions, its affinity was approximately twofold higher in the ‘native’ context (Fig. 4a). miR-1224 enhanced the appearance of multiple pre-existing conformers, particularly in the ‘refolded’ context, consistent with the structurally complex nature of the pseudoknot. miR-1224 did not interact with a transcript containing the HIV-1 −1 PRF signal (Extended Data Fig. 5c, d).

An affinity capture assay to probe CCR5–miR-1224 interactions in HeLa Tzm-BL cells expressing CCR5 (ref. 15) revealed an approximately threefold enrichment for CCR5 mRNA relative to cells transfected with a scrambled control (Fig. 4b). In a parallel experiment in HeLa cells, the CCR5 −1 PRF signal containing dual-luciferase reporter mRNA was enriched more than 2,000-fold compared to no-miRNA controls (Fig. 4c), whereas the HIV-1 −1 PRF reporter was only enriched about tenfold. These findings demonstrate that miR-1224 specifically interacts with the CCR5 −1 PRF signal in live cells. Selective 2′-hydroxyl acylation analysed by primer extension (SHAPE) did not reveal differences in RNA modification patterns in the presence of miR-1224 (Extended Data Fig. 5e), suggesting that miR-1224 does not function to create any new conformation(s) of the CCR5 −1 PRF signal per se. Rather, it may stabilize a pre-existing structure(s) promoting efficient −1 PRF.

[Figure 4 panel annotations: a, fraction miRNA bound versus CCR5 (nM); native KD = 0.36 nM ± 0.05 nM, refolded KD = 0.75 nM ± 0.08 nM. d, KD (nM): M2 = 0.75, M1 = 0.36, M3 = 41.5.]

Mapping the miR-1224 binding site

CCR5-derived transcripts harbouring mutations in the predicted miR-1224 interacting sequences (mutants M1–M3, Extended Data Fig. 6a) were assayed by EMSA. The 5′ proximal mutant (M1) yielded the same KD (0.76 nM) as the wild-type sequence under ‘refolded’ conditions, the central sequence mutant (M2) promoted the same dissociation constant as the wild-type sequence assayed under ‘native’ conditions (0.36 nM), and the 3′ proximal binding site mutant (M3) caused an approximately 100-fold increase in KD (42 nM) (Fig. 4d, Extended Data Fig. 6b, c). These findings suggest that miR-1224 may participate in a triple helical interaction with subdomains a–d of Stem 2 under native conditions (modelled in Fig. 4e and Extended Data Fig. 6c). The predicted triple-base interaction between miR-1224 and Stem 2 is consistent with the stable 3′ end of the pseudoknot identified in the molecular dynamics simulation (Extended Data Fig. 3d, e). The ‘torsional restraint’ model of −1 PRF posits that ribosomes are directed to pause over the slippery site by Stem 2-induced supercoiling of Stem 1 (ref. 16). The miR-1224 mapping data are consistent with this model: increased stability of the Stem 2 by the mRNA–miRNA interaction renders this structure even more difficult to resolve, further increasing the fraction of paused ribosomes. miR-141 is predicted to interact with the same region of the CCR5 −1 PRF signal whereas miR-711 is not, suggesting that miR-141 enhances CCR5-directed −1 PRF in a similar manner to miR-1224. The sequence of the mature miR-1224 is 100% conserved among higher primates (Homo, Pan, Pongo and Macaca) as is its binding site with the 3′ end of their respective CCR5 −1 PRF signals, suggesting that miR-1224-mediated regulation of CCR5 −1 PRF is evolutionarily conserved. It is also notable that the miR-1224/CCR5-interacting sequences do not conform to established seed sequences for miRNAs.
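The dissociation constants quoted for the EMSA titrations are described as coming from single-site binding isotherms. The following is a minimal sketch of that model, not the authors' analysis code. It assumes a 1:1 complex and that the labelled miRNA is present at trace concentration, so that free CCR5 transcript can be approximated by the total; the function and constant names are hypothetical, and only the KD values are taken from the text and Fig. 4.

```python
"""Single-site binding sketch for EMSA titrations (illustrative only)."""


def fraction_bound(ccr5_nm: float, kd_nm: float) -> float:
    """Fraction of labelled miRNA bound at a given transcript concentration.

    Assumes a 1:1 complex and trace labelled miRNA, so the free transcript
    concentration is approximated by the total concentration.
    """
    if ccr5_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return ccr5_nm / (kd_nm + ccr5_nm)


def fold_change(kd_nm: float, reference_kd_nm: float) -> float:
    """Ratio of a dissociation constant to a reference dissociation constant."""
    if kd_nm <= 0 or reference_kd_nm <= 0:
        raise ValueError("dissociation constants must be > 0")
    return kd_nm / reference_kd_nm


# KD values (nM) quoted in the text and in Fig. 4.
KD_NATIVE = 0.36
KD_REFOLDED = 0.75
KD_M3 = 42.0

if __name__ == "__main__":
    # Tabulate predicted occupancy across part of the titration range.
    for conc in (0.1, 0.36, 0.75, 5.0, 15.0):
        print(
            f"{conc:5.2f} nM CCR5: native {fraction_bound(conc, KD_NATIVE):.2f}, "
            f"refolded {fraction_bound(conc, KD_REFOLDED):.2f}"
        )
    print(f"M3 vs native:   {fold_change(KD_M3, KD_NATIVE):.0f}-fold")
    print(f"M3 vs refolded: {fold_change(KD_M3, KD_REFOLDED):.0f}-fold")
```

Under this model half of the labelled miRNA is bound when the transcript concentration equals KD. With the quoted values, the M3 KD is roughly 117 times the native wild-type value and 56 times the refolded value, which brackets the "approximately 100-fold" increase described above.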

Figure 4 | Mapping and modelling the interactions of miR-1224 with the CCR5 −1 PRF signal in vitro and in live cells. a, Dilutions of CCR5 −1 PRF signal (R5) transcript were mixed with [32P]-labelled miR-1224 RNA (miR), and incubated at 30 °C (Native), or denatured at 90 °C and slowly cooled (Refolded). Samples separated through native PAGE were quantified and data plotted onto single site binding isotherms. KD values and standard deviations are indicated. b, In vivo pull-down of native CCR5 mRNA in live cells. Biotinylated miR-1224 precursor or a scrambled biotinylated control (Scr) were transfected into HeLa TZM BL cells expressing CCR5. Fold enrichment of affinity purified mRNAs were analysed by quantitative PCR with reverse transcription (qRT–PCR) using CCR5- or GAPDH-specific primer sets. c, HeLa cells were co-transfected with dual-luciferase plasmids containing either the CCR5 or HIV-1 −1 PRF signal sequences and affinity-purified mRNAs were analysed as in b. d, EMSA assays were performed using miR1224 and M1, M2 and M3 variants of the CCR5 signal using native conditions. Single site binding isotherms generated from these data are plotted. KD values are indicated. For a and d, n = 6 for each sample (three times each of two technical replicates). For b and c, n = 9 for each sample (three times each for three biological replicates). Error bars denote standard deviation. *P < 0.05, ***P < 0.001 (Student’s two-tailed t-test). e, Conceptual model of CCR5 pseudoknot complexed with miR-1224 (purple).

mRNA suicide through −1 PRF

Ribosome profiling data also revealed a cluster of ribosomes paused at the first −1 frame termination codon after the CCR5 slippery site (Fig. 1d). A series of rabbit β-globin-derived reporters (Extended Data Fig. 7a) were used to assess the effects of the CCR5 −1 PRF signal on mRNA steady-state abundance and stability. Steady-state abundance of the CCR5 −1 PRF-containing reporter mRNA was about 38% of the readthrough control and was further decreased upon addition of miR-1224 (~10% of readthrough control), consistent with an inverse correlation between −1 PRF efficiency and mRNA abundance2 (Fig. 5a). An in-frame PTC strongly decreased mRNA abundance (<1% of readthrough control). A reporter with the tumour necrosis factor (TNF)-α-derived AU-rich element (ARE) in its 3′ untranslated region (UTR)17 reduced mRNA abundance to ~22%. In combination with the CCR5 −1 PRF signal, mRNA abundance was reduced to ~6%, consistent with NMD and ARE-mediated decay operating independently of one another. The CCR5 slippery site mutant (SSM) decreased reporter mRNA abundance to ~64%, and addition of miR-1224 decreased this to ~47%. The former finding suggests that the stable mRNA pseudoknot has mRNA destabilizing activity independent of frameshifting, perhaps through the no-go mRNA decay pathway as described in yeast18. Its stabilization by miR-1224 may enhance this process. Alternatively, miR-1224 may promote accelerated mRNA turnover through canonical miRNA-mediated translational repression19. However, if this were true, miR-1224 should have reduced SSM mRNA abundance to the same extent as the native sequence. Abundance of the CCR5 −1 PRF signal containing reporter mRNA was increased by about 4.4-fold by partial siRNA knockdown of NMD (Fig. 5b). Abundance of the SSM construct was not increased by NMD ablation, consistent with its mRNA


destabilizing activity being independent of −1 PRF-directed NMD. A transcriptional arrest time course experiment showed that the CCR5 −1 PRF signal rendered the reporter mRNA a direct substrate for NMD: its half-life was reduced to about 180 min whereas NMD ablation increased the half-life to about 380 min (Extended Data Fig. 7b).

Figure 5 | miRNA mediated control of CCR5 expression through −1 PRF and NMD. a, qRT–PCR analysis of rabbit β-globin reporter (see Extended Data Fig. 7a) steady-state abundances in HeLa cells reported as fold of the native β-globin readthrough control (RTC). b, Rabbit β-globin reporter abundances in cells transfected with siRNAs directed against human UPF1 or UPF2 compared to cells transfected with scrambled siRNA. c, HeLa TZM cells were mock-transfected, transfected with scrambled siRNA, human SMG1 siRNA (5–15 nM), miR-1224 precursor (5 and 20 nM), AGO1 or DGCR8 siRNAs (10 nM each). The effects of transfected RNA species on CCR5 mRNA steady-state abundance were assayed by qRT–PCR. d, Quantitative sandwich enzyme-linked immunosorbent assay (ELISA) of samples from c. Cell lysates (16 μg protein per sample) were assayed and total amounts of CCR5 protein were determined relative to standards. e, Computationally identified putative −1 PRF signals assayed in HeLa cells transfected with an siRNA directed against AGO1 or a scrambled siRNA control. Numbers in human IL8Rα denote the nucleotide positions of the beginning of the slippery sites in the native mRNA. a–c, n = 9 (three times on three independent biological replicates). d, n = 8 (quadruplicate assays of two independent biological replicates). Error bars denote standard error. *P < 0.05, **P < 0.01 (Student’s two-tailed t-test).

NMD and miR-1224 affect CCR5 expression

In HeLa Tzm-BL cells the abundances of both CCR5 mRNA and CCR5 protein increased proportionally to the extent of NMD abrogation (Fig. 5c, d, siRNA_SMG1). Conversely, addition of miR-1224 decreased both CCR5 mRNA and CCR5 protein abundance. Abrogation of miRNA processing by siRNA knockdown of AGO1 or DGCR8 resulted in increased abundance of mRNA and protein, consistent with inhibition of CCR5-mediated −1 PRF under these conditions. Transcriptional arrest time-course experiments demonstrated that the CCR5 mRNA is a direct substrate for NMD (Extended Data Fig. 7c). Whereas miR-1224 decreased the abundance of CCR5 mRNA, this effect was abrogated by addition of an anti-miR-1224 antagomir, but antagomir alone had no effect (Extended Data Fig. 7d). siRNA knockdown of SMG1 was epistatic to miR-1224, consistent with the mRNA destabilization activity of the miRNA being NMD-dependent. This is also consistent with findings that human UPF1 may participate in RNA silencing20, with the caveat that miR-1224 may also promote mRNA degradation by a mechanism that is independent of the NMD machinery, for example, no-go decay. Combinations involving human SMG1 siRNA knockdown plus miR-1224, human SMG1 siRNA knockdown plus the antagomir, or all three together were also supportive of this model.

−1 PRF and interleukin receptor mRNAs

To our knowledge before the current study, only three −1 PRF signals were known in mammalian genomes, all thought to be remnants of ancient retroviral insertional events21–23. Potential −1 PRF signals in seven additional interleukin receptor subunit mRNAs were assayed in the presence of either a scrambled siRNA control or an siRNA targeting argonaute 1. Efficient −1 PRF (>1%) was elicited by six of these (Fig. 5e). siRNA knockdown of argonaute 1 stimulated −1 PRF in some cases and inhibited it in others, consistent with sequence-specific regulation of −1 PRF by miRNAs. Ribosome profiling data10 revealed ribosomes paused and directed to new reading frames at three of these signals (Extended Data Fig. 8). Single nucleotide polymorphisms (SNPs) capable of disrupting frameshifting activity were identified in all these −1 PRF signals (Extended Data Fig. 8). These may account for disease phenotypes associated with SNPs that do not alter the primary sequences of their encoded proteins.

To summarize, precise regulation of −1 PRF is accomplished by sequence-specific interactions between individual −1 PRF signals and naturally occurring miRNAs. That global ablation of miRNA processing differentially affected −1 PRF promoted by many different signals suggests that miRNA-mediated control of −1 PRF is the biologically significant norm. This confers sequence specificity, and is energetically less expensive than producing new, or modifying pre-existing ribosomes. It may also enable rapid regulation of −1 PRF on specific mRNAs within individual cells or intracellular compartments. This solves the central question, unanswered until now, of how −1 PRF may be regulated in a sequence-specific manner, and suggests a novel mode through which −1 PRF signals may be targeted for therapeutic intervention. To our knowledge, this is also one of the few demonstrations of an miRNA affecting the expression of a cellular gene through an interaction with its ORF. The discovery of −1 PRF signals in the mRNAs encoding cytokine receptors has a potentially profound impact on our understanding of immune homeostasis. Although a robust immune response is critical for limiting and preventing infection, left uncontrolled, it can rapidly result in pathology and death. Despite a large body of literature describing how expression of small-peptide mediators of the immune response are regulated at the level of mRNA stability, this only provides a global mechanism of immune regulation by controlling production of effector molecules. In contrast, the ability to control expression of cytokine receptors through −1 PRF induced NMD, and how rates of −1 PRF in turn may be controlled by miRNAs, represents a way for individual recipient cells to modulate their responses to cytokines; this would provide the means to fine-tune immune responses, and suggests a novel molecular mechanism underlying immune desensitization.

The studies described here also have consequences for directing antiviral efforts. RNA viruses, including alphaviruses and totiviruses, require extremely stringent levels of −1 PRF for their propagation1. We suggest that their −1 PRF promoting structural elements may have evolved in two different ways so as to ensure set

rates of −1 PRF. First, either their −1 PRF stimulatory elements should not interact with any ncRNAs present in the cells in which they replicate, or their −1 PRF signals may have evolved in the presence of trans-acting ncRNAs specific to their host cells. If the latter is true, as suggested by stimulation of HIV-1-promoted −1 PRF in response to siRNA knockdown of argonaute, this may define a new parameter governing host cell permissiveness, presenting a novel therapeutic targeting opportunity. Thus, while the discovery of operational −1 PRF signals in cellular mRNAs suggests that global targeting of −1 PRF may not be the wisest approach, discovery and subsequent targeting of specific cellular miRNAs required by viruses to ensure proper rates of −1 PRF may present a more narrowly targeted therapeutic option.

METHODS SUMMARY

HeLa, HeLa Tet-Off, HeLa TZM-BL, CHO and Vero cells were cultured according to suppliers’ instructions. Insertions were amplified using PCR and ligated into appropriate backbone plasmids. −1 PRF was assayed in live cells using dual-luciferase assays. Liquid chromatography with tandem mass spectrometry (LC-MS/MS) analysis of affinity purified CCR5/β-galactosidase fusion protein digested with Asp-N was performed at the University of Maryland Proteomics Core Facility. Data generated by chemical modification assays were used for in silico three-dimensional modelling. Sandwich enzyme-linked immunosorbent assay (ELISA) was used to monitor CCR5 protein expression, and quantitative PCR with reverse transcription (qRT–PCR) analyses were used to monitor mRNA steady-state abundance and half-lives. In vivo affinity capture used a double stranded miR-1224-5p RNA containing a sense strand 5′ biotin modification and mismatch transfected into HeLa and HeLa TZM-Bl cells. Electrophoretic mobility shift assays used a synthetic hsa-miR-1224-5p and transcripts harbouring the CCR5 or HIV-1 −1 PRF signals.

Online Content Methods, along with any additional Extended Data display items and Source Data, are available in the online version of the paper; references unique to these sections appear only in the online paper.

Received 24 February 2012; accepted 29 April 2014.
Published online 9 July 2014.

1. Dinman, J. D. Mechanisms and implications of programmed translational frameshifting. Wiley Interdiscip. Rev. RNA 3, 661–673 (2012).
2. Advani, V. M., Belew, A. T. & Dinman, J. D. Yeast telomere maintenance is globally controlled by programmed ribosomal frameshifting and the nonsense-mediated mRNA decay pathway. Translation 3, e24418 (2013).
3. Sulima, S. O. et al. Bypass of the pre-60S ribosomal quality control as a pathway to oncogenesis. Proc. Natl Acad. Sci. USA 111, 5640–5645 (2014).
4. Dinman, J. D. & O’Connor, M. in Recoding: Expansion of Decoding Rules Enriches Gene Expression (eds Atkins, J. F. & Gesteland, R. F.) 321–344 (Springer, 2010).
5. Hekman, K. E. et al. A conserved eEF2 coding variant in SCA26 leads to loss of translational fidelity and increased susceptibility to proteostatic insult. Hum. Mol. Genet. 21, 5472–5483 (2012).
6. De Keersmaecker, K. et al. Exome sequencing identifies mutation in CNOT3 and ribosomal genes RPL5 and RPL10 in T-cell acute lymphoblastic leukemia. Nature Genet. 45, 186–190 (2013).
7. Jack, K. et al. rRNA pseudouridylation defects affect ribosomal ligand binding and translational fidelity from yeast to human cells. Mol. Cell 44, 660–666 (2011).
8. Dragic, T. et al. HIV-1 entry into CD4+ cells is mediated by the chemokine receptor CC–CKR-5. Nature 381, 667–673 (1996).
9. Belew, A. T., Hepler, N. L., Jacobs, J. L. & Dinman, J. D. PRFdb: a database of computationally predicted eukaryotic programmed −1 ribosomal frameshift signals. BMC Genomics 9, 339 (2008).
10. Guo, H., Ingolia, N. T., Weissman, J. S. & Bartel, D. P. Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature 466, 835–840 (2010).
11. de Messieres, M. et al. Single-molecule measurements of the CCR5 mRNA unfolding pathways. Biophys. J. 106, 244–252 (2014).
12. Staple, D. W. & Butcher, S. E. Solution structure and thermodynamic investigation of the HIV-1 frameshift inducing element. J. Mol. Biol. 349, 1011–1023 (2005).
13. Houck-Loomis, B. et al. An equilibrium-dependent retroviral mRNA switch regulates translational recoding. Nature 480, 561–564 (2011).
14. Ritchie, D. B., Foster, D. A. N. & Woodside, M. T. Programmed −1 frameshifting efficiency correlates with RNA pseudoknot conformational plasticity, not resistance to mechanical unfolding. Proc. Natl Acad. Sci. USA 109, 16167–16172 (2012).
15. Platt, E. J., Wehrly, K., Kuhmann, S. E., Chesebro, B. & Kabat, D. Effects of CCR5 and CD4 cell surface concentrations on infections by macrophagetropic isolates of human immunodeficiency virus type 1. J. Virol. 72, 2855–2864 (1998).
16. Plant, E. P. & Dinman, J. D. Torsional restraint: a new twist on frameshifting pseudoknots. Nucleic Acids Res. 33, 1825–1833 (2005).
17. Guhaniyogi, J. & Brewer, G. Regulation of mRNA stability in mammalian cells. Gene 265, 11–23 (2001).
18. Belew, A. T., Advani, V. M. & Dinman, J. D. Endogenous ribosomal frameshift signals operate as mRNA destabilizing elements through at least two molecular pathways in yeast. Nucleic Acids Res. 39, 2799–2808 (2011).
19. Djuranovic, S., Nahvi, A. & Green, R. A parsimonious model for gene regulation by miRNAs. Science 331, 550–553 (2011).
20. Jin, H. et al. Human UPF1 participates in small RNA-induced mRNA downregulation. Mol. Cell. Biol. 29, 5789–5799 (2009).
21. Manktelow, E., Shigemoto, K. & Brierley, I. Characterization of the frameshift signal of Edr, a mammalian example of programmed −1 ribosomal frameshifting. Nucleic Acids Res. 33, 1553–1563 (2005).
22. Wills, N. M., Moore, B., Hammer, A., Gesteland, R. F. & Atkins, J. F. A functional −1 ribosomal frameshift signal in the human paraneoplastic Ma3 gene. J. Biol. Chem. 281, 7082–7088 (2006).
23. Clark, M. B. et al. Mammalian gene PEG10 expresses two reading frames by high efficiency −1 frameshifting in embryonic-associated tissues. J. Biol. Chem. 282, 37359–37369 (2007).

Supplementary Information is available in the online version of the paper.

Acknowledgements This work was supported by grants to J.D.D. from the National Institutes of Health (5 R01GM058859, R21GM068123) and from the National Science Foundation (MCB-0084559). A.T.B. was supported by NIH/NIAID T32 AI051967, and a University of Maryland College of CMNS Hockmeyer Fellowship. S.O.S. was supported by NIH/NIGMS T32GM080201. This publication has also been funded in part with federal funds from the Frederick National Laboratory for Cancer Research, NIH, under Contract HHSN261200800001E to W.K.K. This research was supported in part by the Intramural Research Program of the National Institutes of Health, Center for Cancer Research to B.A.S. The content of this publication does not necessarily reflect the views or policies of the DHHS, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government.

Author Contributions A.T.B. designed and performed experiments (Figs 1a, 3a, b, 4b, c, 5a, b, e and Extended Data Figs 1a, 2b, 4b, c, 5b–d and 6), analysed data, and helped in editing the manuscript. A.M. designed and performed experiments and analysed data (Fig. 4a, d and Extended Data Figs 2c, d, 3b, c, 5 and 6). S.M. made constructs and performed the experiment shown in Fig. 1b, and helped with experiments. V.M.A. designed, performed and analysed the data collected in the experiments shown in Figs 3c, 5c, d, and assisted with data shown in Figs 3b, 5a, b, and Extended Data Fig. 5d. S.O.S. assisted with sample preparation and data collection. W.K.K. and B.A.S. performed all of the molecular dynamics simulations and molecular modelling. J.D.D. conceived and directed the project, designed experiments and wrote the manuscript.

Author Information The sequences determined in this work are deposited in the NCBI Nucleotide database; a list is available as Supplementary Information. Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests. Readers are welcome to comment on the online version of the paper. Correspondence and requests for materials should be addressed to J.D.D. ([email protected]).

21 AUGUST 2014 | VOL 512 | NATURE | 269 ©2014 Macmillan Publishers Limited. All rights reserved RESEARCH ARTICLE

METHODS

Molecular genetics and cell culture. Escherichia coli strain DH5α was used to amplify plasmid DNA. Transformations of E. coli were performed as described previously using the calcium chloride method. HeLa (ATCC), HeLa Tet-Off (Clontech), HeLa TZM-BL (http://aidsresearch.org/, and provided by J. DeStefano), CHO and Vero cells were cultured at 37 °C in 5% (v/v) CO₂ using DMEM medium supplemented with 10% (v/v) FBS, 1× non-essential amino acids (Atlanta Biologicals), and 4 mM glutamine. 1% penicillin/streptomycin was used when cultivating TZM-BL cells before splitting for RNA interference. HeLa Tet-Off cells were cultured with 200 μg ml⁻¹ G418.

Plasmid construction. Synthetic oligonucleotides and plasmids used in this study are listed in Supplementary Materials. The PRF signal from Homo sapiens CCR5 was amplified from pCMV-XL4 (pJD819) containing the CCR5 open reading frame (Origene) using oligonucleotides with BamHI and SalI restriction sites. PCR products were ligated into p2luci (pJD175e)¹⁰ and clones were confirmed by sequencing (Genewiz). The QuikChange Lightning SDM kit (Agilent) was used to insert a […] immediately after the Renilla open reading frame and to make mutations to the CCR5 −1 PRF signal. pTRE-Rb (pJD976), pTRE-Rb-ARE (pJD975), and pTET-Off (pJD979) were gifts from G. Brewer. Oligonucleotides (IDT) were chosen to include 23 nucleotides of the β-globin exon 1, 36 nucleotides of Renilla luciferase on the 5′ side; and 23 nucleotides of β-globin exon 1, 15 nucleotides of firefly luciferase on the 3′ side. Three hundred and thirty three nucleotides of β-globin containing sequence was PCR-amplified from the CCR5 dual luciferase plasmid, purified, and used as site-directed mutagenesis primer for the QuikChange Lightning SDM kit (Agilent) to generate CCR5 containing β-globin constructs with and without the TNFα ARE. Oligonucleotides encoding the −1 PRF signals from the human IL-2γ, human and mouse IL-7α, and human IL-8α, IL-8β, IL-22α and IL-27α receptor chains were purchased containing 15–18 bases of internal overlap, PCR extended into a single BamHI- and SalI-containing product, and ligated into p2luci. Primers bGal-CCR5-Forward and bGal-CCR5Rev-Mut containing the CCR5 −1 PRF signal were synthesized by IDT. The CCR5 frameshift signal sequence was amplified by PCR from pJD827 using these primers and Pfu Ultra II Hotstart 2X Mastermix (Agilent). The PCR product was digested with BamHI and KpnI and purified with GeneJet PCR purification Kit (ThermoFisher Fermentas). Plasmid was purified on a 0.8% agarose gel and extracted with GeneJet Gel extraction Kit (ThermoFisher Fermentas). The CCR5 −1 PRF signal containing PCR product was ligated into BamHI/KpnI digested pTI2524 and the resulting clone named pJD1930. Thus, the expression of β-galactosidase from the pJD1930 plasmid was dependent on a −1 PRF event.

−1 PRF assays. For dual-luciferase reporter assays, HeLa, CHO or Vero cells were seeded into 24-well plates (3 × 10⁴–5 × 10⁴ cells per well) and cultured for 2–36 h before transfection with dual-luciferase plasmids. Cells were transfected with dual luciferase plasmids using 0.6 μl of the FuGene 6 reagent in 20 μl of DMEM without FBS. 200–400 ng of plasmid was used per well and incubated for 20–60 min at room temperature. At 24–48 h post-transfection, dual luciferase assays were performed using the standard dual luciferase protocol (Promega) with slight modifications using a Turner Biosystems GloMax-Multi Microplate Multimode Reader. Changes to the standard protocol are as follows: lysates were resuspended in 200 μl 1× lysis buffer before reading using 15–35 μl lysate per well, and 50 μl of each reagent per well were used with a 10-s integration and 2-s pause between reads. Sample sizes were determined following the rigorous criteria previously described²⁵. A minimum of 15 (three independent biological replicates assayed in quintuplicate) were performed per sample, or until normal distributions were obtained. Statistical analyses were performed as previously reported²⁵.

To monitor −1 PRF in vitro, linear DNA templates were prepared by PCR reactions using plasmids pJD175f, pJD187, pJD827, pJD1078, and the T7-Kozak-Renilla/PolyA-Stop-firefly primer set. PCR products were purified by agarose gel electrophoresis. Capped mRNAs were synthesized using the mMESSAGE mMACHINE Kit (Ambion). Transcription reactions (40 μl) were assembled containing 2 μg linearized DNA templates, incubated at 37 °C for 4 h, and mRNAs were purified using a MEGAclear mRNA purification kit (Applied Biosystems). mRNA concentrations were calculated from OD260 nm readings. In vitro translation reactions were assembled in a total volume of 25 μl containing 0.2 to 5 μg capped mRNAs using the Retic Lysate IVT Kit (Applied Biosystems). Reactions contained [³⁵S]methionine (1,175 Ci mmol⁻¹, Perkin Elmer) and the 20× –Met Translation mix provided by the kit. Reactions were incubated at 30 °C for 90 min. Translation products were resolved through 12% SDS–PAGE and visualized using a phosphorimager.

Production and purification of CCR5/β-galactosidase −1 PRF fusion protein in yeast. Yeast strain JD1585 was created by transformation of yeast strain JD1370 with pJD1930 and selection on defined medium lacking tryptophan (–Trp). JD1585 was inoculated from an overnight grown –Trp plate into 2 ml of –Trp and incubated at 30 °C in a shaker overnight. This culture (0.5 ml) was used to inoculate 50 ml of –Trp in a 200 ml baffled flask and grown overnight at 30 °C. Five ml of this culture was then used to inoculate 500 ml –Trp in a 2 l volume baffled flask. Five litres of culture (10 flasks of 500 ml each) was incubated overnight at 30 °C with shaking (250 r.p.m.). Cells (35 g wet weight) were collected by centrifugation (10 min at 4,000g at 4 °C). Cells were washed and suspended in lysis buffer (0.5 g ml⁻¹) and disrupted with glass beads (0.5 mm, MiniBeadbeater, Biospec). Fos-choline-12 (Affymetrix) detergent was added to 0.05% and the homogenate was brought to 1.5 M NaCl by adding 5 M stock NaCl. Cells and large debris were removed by centrifugation at 4,000g for 10 min. The supernatant was cleared by centrifugation at 30,000g for 30 min and loaded on a 4-aminophenyl-β-D-thiogalactopyranoside-agarose 4B (Sigma) column (1 ml bed volume). The column was washed with 50 volumes of lysis buffer containing 1.5 M NaCl. Recombinant CCR5-β-galactosidase protein was eluted with 0.1 M sodium borate buffer (pH 10.0) and concentrated on Ultracel-50K filter units (Amicon). β-galactosidase activity was monitored during purification using the β-galactosidase Assay Kit (Pierce). Eluted proteins were fractionated through 8% SDS–PAGE. A protein band of expected size (124.7 kDa) was cut from the gel and analysed by mass spectrometry at the UMD Mass Spectroscopy Core Center.

Mass spectroscopic analysis of CCR5/β-galactosidase −1 PRF fusion protein. Sequencing grade Asp-N was purchased from Promega (Madison, WI). Triethylammonium bicarbonate and iodoacetamide were purchased from Supelco. DTT was from Sigma. Formic acid, optima grade water and acetonitrile were from Fisher Scientific. In-gel digestion was carried out following the manufacturer’s protocol. Briefly, the gel was cut into pieces ~1 mm³, destained with 50% acetonitrile (ACN) in water, dehydrated with ACN, then rehydrated with 25 mM DTT, incubated at 65 °C for 20 min, washed, and dehydrated with ACN. Samples were rehydrated with 50 mM iodoacetamide, incubated at room temperature in the dark for 25 min, washed with water, and dehydrated with ACN. The gel was then rehydrated with 10 ng μl⁻¹ Asp-N solution, and incubated overnight at 37 °C. Peptides were extracted twice by sequential addition of 100 μl 50% ACN, 100 μl 20% formic acid, and 100 μl. Extracts were combined and concentrated with a speedvac to dryness.

Liquid chromatography with tandem mass spectrometry (LC-MS/MS) analysis. Peptides were redissolved in 50 μl solvent A and loaded into a trapping cartridge (0.3 × 5 mm, Agilent, Palo Alto, CA) with an autosampler, and desalted with 100% solvent A at 10 μl min⁻¹ for 10 min. Peptides were separated using a Zorbax 300 B-C18 nano column (3.5 μm, 0.075 × 150 mm, Agilent Technologies, Palo Alto, CA) with a binary gradient consisting of A: 0.1% formic acid with 2.5% acetonitrile and B: 0.1% formic acid and 97.5% acetonitrile at a flow rate of 200 nl min⁻¹. A gradient was run from 5% B to 25% B over 60 min, followed by a 50-min gradient to 50% B, and a 10-min gradient to 80% B. The gradient was held at 80% B for 5 min before returning to 5% B. Positive ion mass spectra of Nano LC eluents were acquired with a Thermo Finnigan LTQ Orbitrap XL mass spectrometer with data dependent analysis in which a full Scan FT analysis of m/z 400–1,800 with resolution at 60,000 (m/z 400) in the Orbitrap is followed by up to 5 MS/MS analyses in the linear ion trap at unit mass resolution. Peptides eluting from the HPLC column that have ions above 10,000 arbitrary intensity units and charges higher than 1 trigger the ion trap to isolate the ion and perform an MS/MS experiment scan after the MS full scan. Dynamic exclusion was set at 1 repeat count and 60 s exclusion time.

Data processing. A common contaminants protein database was downloaded from http://maxquant.org. The sequence of CCR5-LacZ was inserted into the database before the database was imported into the in-house Mascot Server and Proteome Discoverer 1.4 program. Raw data from the LTQ Orbitrap was searched against the modified common contaminant database using Sequest HT and Mascot with AspN as digestion enzyme with up to 1 missed cleavage. Peptide mass tolerance was ± 20 p.p.m., and fragment mass tolerance was ± 0.8 Da. Carbamidomethyl (C), Deamidated (NQ), Oxidation (M) and Acetyl (N-term) were set as variable modifications. A decoy search was performed and a false discovery rate of <1% was required for positive identification. Search results showed 36% sequence coverage of the CCR5-LacZ protein. The N-terminal peptide was unambiguously identified, with loss of the initial methionine and acetylation of the N-terminal threonine.

Quantification of protein levels by ELISA. HeLa TZM-BL cells were transiently transfected with siRNAs against SMG1, AGO1, DGCR8, miRNA precursor of hsa-miR-1224-5p, scrambled sequence or mock transfected using HiPerFect transfection reagent (Qiagen). The cells were lysed using 0.5% Fos-choline-12 (Affymetrix) in 50 mM Tris-HCl, pH 8.0 lysis buffer. Protein concentrations of cell lysates were determined by Bradford assay (Bio-Rad). Concentrations of CCR5 protein were measured in 16 μg total protein of cell lysates using the CCR5 sandwich ELISA kit (US Biological Life Sciences), according to the manufacturer’s protocol. All assays were repeated in triplicate.

Quantitative real-time reverse-transcription PCR. For qPCR analyses of the β-globin based reporters, assays were performed as previously described²⁶ with the following modifications. The dual luciferase readthrough control was used for

co-transfections rather than EGFP. RNA samples for qPCR were isolated using the RNAqueous kit (Ambion), digested with rDNase (Ambion) and analysed using agarose gel electrophoresis and/or OD260/280 measurements. The remaining samples were reverse transcribed using the iScript cDNA kit (Bio-Rad). The resulting complementary DNAs were diluted to 1:50–10,000 depending on mRNA concentration. Reactions were performed using 10 μl of LightCycler 480 SYBR Green I Master mix (Roche), 0.2–0.3 μM of each oligonucleotide, 2 μl of cDNA, and water to 20 μl per well. Samples were assayed for genomic DNA contamination by performing the assay using wells containing 1–2 μl of digested mRNA instead of cDNA. Reactions were amplified using either a Roche 480 LightCycler or a Bio-Rad CFX 96 thermocycler as follows: 25 °C for 10 s, 95 °C for 5 min, followed by 45–60 cycles of 95 °C for 10 s, 52 °C for 15 s, and 72 °C for 15 s. Melting curves were monitored by taking readings every 0.5 °C from 52–95 °C. The time-course qPCR analyses were performed with 53 °C and 54 °C annealing temperatures and 20 s extension time with no significant changes in results. For qPCR analyses of the full length CCR5 mRNA, assays were performed as described for the β-globin assays, but using oligonucleotides specific for β-microglobulin and/or GAPDH and CCR5²⁷. Reactions were amplified using the same conditions as for the β-globin constructs except all reactions used 20 s at 55 °C for extension. All assays were performed at least three times.

Time course assays. mRNA decay time course assays using the tetracycline repressible rabbit β-globin reporter were performed as previously described with minor changes²⁶. To monitor time-dependent decay of the CCR5 mRNA, HeLa Tzm1 cells were first transfected with either scrambled or hSMG1 siRNAs as described above. Forty eight hours after transfection, cells were treated with actinomycin D (10 μM). In all experiments, RNA isolations were performed immediately at each time point after transcriptional arrest using the RNAqueous kit (Ambion) rather than after freezing samples on dry ice.

RNAi assays. Cells were transfected with RNA oligonucleotides specific to UPF1, UPF2, SMG1, argonaute 1, argonaute 2, DGCR8, exportin 5, or random oligonucleotides using the HiPerFect transfection reagent (Qiagen). Initial transfections were performed at 1, 5, 10 nM and 20 nM for optimization. Final transfections were performed at 5 nM. The MAPK cell-death positive control was used for optimization as well as qPCR quantification of the targeted mRNA. Most final transfections were performed at 5 nM. Transfections were performed into 30,000–40,000 cells (via haemocytometer) in 500 μl of DMEM+FBS using 100 μl of DMEM without FBS and 3 μl of HiPerFect reagent after incubating for 15–20 min at room temperature. Media was replaced with fresh DMEM+FBS after 8–12 h. Assays were performed 36–72 h after siRNA transfection. When other plasmids were also transfected, they were performed separately 24–48 h after siRNA transfection using the FuGENE 6 (Roche) reagent.

miRNA transfection. Cells were transfected with the following miRNA precursors: hsa-miR-141, hsa-miR-711, hsa-miR-1224-5p, and hsa-miR-1205 using either siPORT, lipofectamine, or HiPerfect reagent (Applied Biosystems/Ambion). When performing miRNA transfections with all miRNAs, 10–30 nM was used, depending on cell viability. When performing the miRNA titration, four 1:10 dilutions were used starting at 5 nM. Transfections were performed into 20,000–40,000 cells in 500 μl DMEM+FBS using 25 μl of DMEM without FBS and 1 μl of siPORT reagent after incubating for 20 min at room temperature per well. Media was replaced with fresh DMEM+FBS after 8–12 h. Dual luciferase plasmid transfections were either performed at the same time or 24 h later using the FuGENE 6 reagent. When HiPerfect was used, the conditions followed those used for siRNA transfections.

Affinity purification of miRNA targets. Double stranded miR-1224-5p RNA containing a sense strand 5′ biotin modification and mismatch were purchased from IDT. Pull-down experiments were performed as previously described²⁸ with the following modifications: Streptavidin agarose beads were pre-washed as described in 500 μl aliquots and stored for up to a week at −20 °C. Five washes with lysis buffer were performed rather than 3; after the final wash, 450 μl buffer was removed, samples were incubated for 5 min at 80 °C and quenched on ice for 2 min before isolating RNA. RNA isolations were performed using the RNAqueous kit (Ambion). qRT–PCR was used to observe mRNA isolation as previously described, using oligonucleotide primers specific for CCR5, Renilla luciferase, firefly luciferase and GAPDH for normalization.

miR-1224 electrophoretic mobility shift assays. Transcripts harbouring the CCR5 −1 PRF signal (247 nucleotides), mutants thereof, or HIV-1 −1 PRF signal (315 nucleotides) were synthesized from DNA templates using T3 RNA polymerase using MEGAscript, and purified using MEGAClear kits (Ambion). HPLC purified miR-1224-5p (5′-GUGAGGACUCGGGAGGUGG-3′) RNA oligonucleotide was purchased from Integrated DNA Technologies, and was 5′-[³²P]-labelled by using the KinaseMax kit (Ambion). Small amounts of the CCR5 and HIV-1 derived mRNAs were also 5′-[³²P]-labelled and used as markers. CCR5 or HIV RNA dilutions at 2× final concentration were mixed with equal volumes of 1.0 nM 5′-[³²P]-labelled 1224-5p RNA. Samples were incubated at 37 °C for 30 min in HB buffer (50 mM Tris, pH 7.5, 0.1 mM EDTA, 10 mM NaCl, 10 mM MgCl₂, 3% glycerol, 0.05% bromophenol blue) and immediately separated through 10% native polyacrylamide gels. For experiments with RNA refolding the RNA mix was incubated at 90 °C for 5 s, cooled quickly to 60 °C and then slowly to 37 °C (0.02 °C per s). The electrophoresis buffer was 34 mM Tris-66 mM HEPES pH 7.5, 0.1 mM EDTA, 10 mM MgCl₂. To confirm that multiple bands of CCR5 in native gel were RNA conformers, labelled CCR5, miR-1224-5p RNA, and the mix of these RNAs were separated through an 8% denaturing gel and visualized using a phosphorimager. Single site binding isotherms were generated using GraphPad Prism.

Chemical modification assays. Dimethylsulphate, kethoxal and CMCT were used to probe the solvent accessibility of individual bases²⁹,³⁰, while NMIA was used to probe ribose 2′-OH groups³¹ in [³²P]-labelled run-off transcripts. In a separate experiment, synthetic CCR5 (139 nM) and miR-1224 (1.1 μM) RNAs were annealed at 37 °C for 30 min in 33 mM HEPES, pH 8.0, 33 mM NaCl, 10 mM MgCl₂. Structure probing with NMIA and reverse transcription reactions were subsequently performed as described³¹. Products were separated through 8% denaturing polyacrylamide gels, and visualized using a phosphorimager.

Molecular modelling and molecular dynamics simulation protocol. Three dimensional modelling based on the secondary structure shown in Fig. 2 was performed using programs RNAComposer³²,³³ and RNA2D3D³⁴. The preliminary 10 alternative structures were obtained from the RNAComposer web server (http://euterpe.man.poznan.pl/Home). However, most of the models produced were knotted in 3D and the best two alternatives had the backbones of loop L (see Fig. 2) pass through a helix in S1. RNA2D3D was used to separate the tangled structures. The corrected models (a total of six alternatives) were minimized, equilibrated in solvent with some restraints meant to maintain the base pairs strained in the preliminary models, and subjected to short (25 ns) molecular dynamics runs without any restraints, after which the best model (average structure with the lowest minimized energy) was subjected to extended molecular dynamics simulation lasting a total of 80 ns. Given how structurally tight the initial models were and the extent of manual editing required, the extended simulation was necessary to stabilize the model and obtain an accurate average structure.

Molecular dynamics simulations were performed with Amber 12 with the ff10 Cornell force field for RNA. The Particle Mesh Ewald (PME) summation method was used to calculate the electrostatic interactions³⁵⁻³⁷. Following minimization, the RNA models (103 nucleotides long) were solvated in TIP3P waters with 102 neutralizing Na⁺ ions and additional Na⁺/Cl⁻ ion pairs added to the solvent box to achieve a relative salt concentration of 0.1 M (from 51 to 57 depending on the system). The multi-step equilibration protocol started with solvent equilibration (minimization, heating and short dynamics stages) with the RNA being subject to slowly released motion restraints (holding). A periodic boundary condition was used in the simulation. The entire system was equilibrated at 300 K using the Berendsen thermostat³⁸, a cut-off of 9 Å was used with the non-bonded interactions and SHAKE was applied to all hydrogen bonds in the system. Pressure was maintained at 1.0 Pa using the Berendsen algorithm³⁹. The last phase of the equilibration was performed for 2.0 ns with distance restraints placed on the hydrogen bonds of six base-pairs that were affected by the manual editing of the preliminary RNA models (base pairs: G8-U75, G12-U73, G13-C72, A14-U63, G16-U61, G17-C60). Following equilibration, the production simulation was performed with 2 fs time steps to obtain short trajectories of 25 ns. The total sizes of the systems subjected to molecular dynamics ranged from 71,462 to 79,346 atoms, including the 3,290 RNA atoms of the CCR5 model. The solvent boxes had a clearance distance of 10 Å (also named ‘buffer’ in Amber, that is, the minimum distance between the solute and the solvent box wall). Analyses of the molecular dynamics results excluded the equilibrations and were performed using the ptraj module of Amber.

Multiple sequence alignment of CCR5 coding sequences. Sequences for CCR5 coding sequences were chosen for 39 primates and a naive BLAST search extended these candidates to 45 sequences with the CCR5 mRNA from Danio rerio chosen as an outgroup. Sequences were aligned with Clustal W⁴⁰ using default parameters (gap opening penalty: 15, extension penalty: 6, IUB matrix). The resulting alignment was manually edited using seaview⁴¹ to trim especially long 3′ UTR and long 5′ UTR sequences (notably Danio rerio, sheep, goat, rat, and mouse). A guide tree was constructed using the default parameters of BioNJ⁴² which was used as a starting tree for PhyML⁴³. The GTR model was used with a nearest neighbour and SPR search strategy, and 100 bootstrap replicates were performed. Branch lengths and bootstrap support are supplied above branches where appropriate.

Analysis of ribosome profiling data. Sequence reads from Guo et al.¹⁰ and Hsieh et al.⁴⁴ were downloaded from the Gene Expression Omnibus⁴⁵. The ribosomal footprinting data was extracted and aligned against a library of human coding sequences from the Mammalian Gene Collection⁴⁶ using Bowtie2⁴⁷.


The following options were used:

cd ${REFDIR} && ${BINDIR}/bowtie2 -x ${MRNA_LIB} -U ${READS} \
  -S ${DATADIR}/${READS}.sam -D 10 -R 2 -N 0 -L 22 -i S,0,2.50 \
  -3 5 --no-unal --un-gz ${DATADIR}/${READS}-unaligned.txt.gz \
  --al-gz ${DATADIR}/${READS}-aligned.txt.gz --no-head \
  --no-sq -p 20 2>${DATADIR}/${READS}.stderr \
  1>${DATADIR}/${READS}.stdout

The resulting SAM output was parsed with a simple Perl script (available at: https://github.com/abelew/prfdb/tree/master/ingolia) which extracted reads and sorted them by position and apparent reading frame. The resulting data structure was translated into JSON and plotted with the flot library.

Genome ontology analysis. Homo sapiens accessions were collected from the PRFdb which lie more than one standard deviation from mean with respect to predicted MFE value and randomized Z-score. This population of 1,846 accessions was provided to the FuncAssociate⁴⁸ analysis tool.

24. Ysla, R. M., Wilson, G. M., Brewer, G. & Riza, Y. M. Assays of adenylate uridylate-rich element-mediated mRNA decay in cells. Methods Enzymol. 449, 47–71 (2008).
25. Grentzmann, G., Ingram, J. A., Kelly, P. J., Gesteland, R. F. & Atkins, J. F. A dual-luciferase reporter system for studying recoding signals. RNA 4, 479–486 (1998).
26. Dinman, J. D. & Wickner, R. B. Ribosomal frameshifting efficiency and Gag/Gag-pol ratio are critical for yeast M1 double-stranded RNA virus propagation. J. Virol. 66, 3669–3676 (1992).
27. Jacobs, J. L. & Dinman, J. D. Systematic analysis of bicistronic reporter assay data. Nucleic Acids Res. 32, e160–e170 (2004).
28. Mattapallil, J. J. et al. Massive infection and loss of memory CD4⁺ T cells in multiple tissues during acute SIV infection. Nature 434, 1093–1097 (2005).
29. Orom, U. A. & Lund, A. H. Isolation of microRNA targets using biotinylated synthetic microRNAs. Methods 43, 162–165 (2007).
30. Hu, J. C. & Dahlberg, J. E. Structural features required for the binding of tRNATrp to avian myeloblastosis virus reverse transcriptase. Nucleic Acids Res. 11, 4823–4833 (1983).
31. Peattie, D. A. & Gilbert, W. Chemical probes for higher-order structure in RNA. Proc. Natl Acad. Sci. USA 77, 4679–4682 (1980).
32. Wilkinson, K. A., Merino, E. J. & Weeks, K. M. Selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE): quantitative RNA structure analysis at single nucleotide resolution. Nature Protocols 1, 1610–1616 (2006).
33. Popenda, M. et al. Automated 3D structure composition for large RNAs. Nucleic Acids Res. 40, e112 (2012).
34. Popenda, M. et al. RNA FRABASE 2.0: an advanced web-accessible database with the capacity to search the three-dimensional fragments within RNA structures. BMC Bioinformatics 11, 231 (2010).
35. Martinez, H. M., Maizel, J. V. Jr & Shapiro, B. A. RNA2D3D: a program for generating, viewing, and comparing 3-dimensional models of RNA. J. Biomol. Struct. Dyn. 25, 669–683 (2008).
36. Wang, J. M., Cieplak, P. & Kollman, P. A. How well does a restrained electrostatic potential (RESP) model perform in calculating conformational energies of organic and biological molecules? J. Comput. Chem. 21, 1049–1074 (2000).
37. Essmann, U. et al. A smooth particle mesh Ewald method. J. Chem. Phys. 103, 8577–8593 (1995).
38. Case, D. A. et al. AMBER 12 (Univ. San Francisco; 2012).
39. Berendsen, H. J. C., Postma, J. P. M., Vangunsteren, W. F., Dinola, A. & Haak, J. R. Molecular-dynamics with coupling to an external bath. J. Chem. Phys. 81, 3684–3690 (1984).
40. Zhang, Y. W., Ryder, O. A. & Zhang, Y. P. Sequence evolution of the CCR5 chemokine receptor gene in primates. Mol. Biol. Evol. 16, 1145–1154 (1999).
41. Thompson, J. D., Higgins, D. G. & Gibson, T. J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994).
42. Gouy, M., Guindon, S. & Gascuel, O. SeaView version 4: A multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol. Biol. Evol. 27, 221–224 (2010).
43. Gascuel, O. BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol. 14, 685–695 (1997).
44. Guindon, S. et al. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321 (2010).
45. Hsieh, A. C. et al. The translational landscape of mTOR signalling steers cancer initiation and metastasis. Nature 485, 55–61 (2012).
46. Barrett, T. et al. NCBI GEO: archive for high-throughput functional genomic data. Nucleic Acids Res. 37, D885–D890 (2009).
47. Temple, G. et al. The completion of the Mammalian Gene Collection (MGC). Genome Res. 19, 2324–2333 (2009).
48. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nature Methods 9, 357–359 (2012).
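The Methods describe parsing the Bowtie2 SAM output to sort ribosome footprint reads by position and apparent reading frame. The Perl script itself is not reproduced here, so the following Python is only a minimal sketch of that idea under stated assumptions: each reference sequence is taken to be a coding sequence whose first base is the first base of the start codon (the hypothetical `cds_start` parameter allows a different offset), the frame is taken from the leftmost aligned base with no P-site offset applied, and reverse-strand alignments are skipped. The function name `tally_frames` is illustrative and is not taken from the authors' repository.

```python
from collections import Counter, defaultdict


def tally_frames(sam_lines, cds_start=0):
    """Count aligned reads per reference sequence and apparent reading frame.

    sam_lines: iterable of SAM text lines (header lines starting with '@' are ignored).
    cds_start: 0-based position of the first coding base in each reference
               (assumption: the same offset applies to every reference).
    Returns {reference_name: Counter({frame: read_count})} with frame in {0, 1, 2}.
    A footprint displaced one base 5' of the zero frame (a -1 shift) lands in frame 2.
    """
    counts = defaultdict(Counter)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM alignment record
        flag = int(fields[1])
        if flag & 4:
            continue  # read is unmapped
        if flag & 16:
            continue  # reverse-strand hit; only sense reads are counted in this sketch
        reference = fields[2]
        leftmost = int(fields[3]) - 1  # SAM POS is 1-based
        frame = (leftmost - cds_start) % 3
        counts[reference][frame] += 1
    return counts
```

With counts in hand, the fraction of footprints at a given site that fall in frame 2 gives a rough estimate of the proportion of shifted ribosomes, subject to the assumptions listed above.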

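The dual-luciferase assays in the Methods compare a test reporter against a readthrough control, but the arithmetic is not spelled out in this section. As a hedged illustration, the sketch below uses the ratio-of-ratios calculation commonly applied to bicistronic dual-luciferase reporters: the firefly/Renilla ratio of the test construct divided by the same ratio for the readthrough control. This is an assumption about the calculation, not a statement of the authors' exact statistical procedure (which follows refs 25 and 27), and the function names are hypothetical.

```python
from statistics import mean


def luciferase_ratios(firefly, renilla):
    """Per-well firefly/Renilla luminescence ratios."""
    if len(firefly) != len(renilla) or not firefly:
        raise ValueError("firefly and renilla must be non-empty and the same length")
    return [f / r for f, r in zip(firefly, renilla)]


def percent_frameshift(test_firefly, test_renilla, ctrl_firefly, ctrl_renilla):
    """Apparent frameshift efficiency (%) as a ratio of ratios.

    test_*: readings from the reporter carrying the -1 PRF signal.
    ctrl_*: readings from the in-frame readthrough control.
    """
    test_ratio = mean(luciferase_ratios(test_firefly, test_renilla))
    ctrl_ratio = mean(luciferase_ratios(ctrl_firefly, ctrl_renilla))
    return 100.0 * test_ratio / ctrl_ratio
```

Normalizing to the readthrough control is intended to cancel differences in transfection efficiency and in the intrinsic activities of the two luciferases, so that the result reflects only the fraction of ribosomes that shift frame.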

a, Phylogenetic tree of the aligned CCR5 coding sequences, with Danio rerio (XM_691452) as the outgroup; branch lengths and bootstrap support are given on the branches.
b, Sequence alignment; column labels: SS, S1-5′, S2-5′, S1-3′, L, S2-3′.

Papio_hamadryas TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Gorilla_gorilla TTTAAAAGCCAGGACGGTCACGTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCGTCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Macaca_fascicularis TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Pan_troglodytes TTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCATCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Trachypithecus_francoisTTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Trachypithecus_phayrei TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Pygathrix_roxellana TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Pygathrix_nemaeus TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Hylobates_leucogenys TTTAAAAGCCAGGACGGTCACCTTTGGGGTAGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCATCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Cercocebus_torquatus TTTAAAAGCCAGGACAGTCACCTTTGGGTTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Cercopithecus_aterrimusTTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Papio_hamadryas TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Hylobates_concolor TTTAAAAGCCAGGACGGTCACCTTTGGGGTAGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCATCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Pongo_pygmaeus TTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCATCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Saguinus_sp. 
TTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGTT GGTGGCTGTATTTGCCTCTCTCCCAGGAATCATCTTTACCAGGTCTCAAAAAGA Callithrix_jacchus TTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTATTTGCCTCTCTCCCAGGAATCATCTTTACCAGGTCTCAAAAAGA Alouatta_caraya TTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGACCACTTGGGT GGTGGCTGTATTTGCCTCTCTCCCGGGAATCATCTTTACCAGGTCTCAAAAAGA Aotus_trivirgatus TTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTACTTGCCTCTCTCCCAGGAATCATCTTTACCAGGTCTCAAAAAGA Colobus_guereza TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Lemur_catta TTTAAAAGCCAGGACGGTCACCTTTGGGGTAGTGACAAGTGGGGTGACCTGGAT AGTGGCTGTGTTAGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Varecia_variegata TTTAAAAGCCAGGACGGTCACCTTTGGGGTAGTGACAAGTGGGGTGACCTGGAT AGTGGCTGTGTTAGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Mandrillus_sphinx TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Alouatta_seniculus TTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGACCACTTGGGT GGTGGCTGTATTTGCCTCTCTCCCGGGAATCATCTTTACCAGGTCTCAAAAAGA Symphalangus_syndactyluTTTAAAAGCCAGGACGGTCACCTTTGGGGTAGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCATCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Miopithecus_talapoin TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Callicebus_moloch TTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTATTTGCCTCTCTCCCGGGAATCATCTTTACCAGGTCTCAAAAAGA Pan_paniscus TTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCATCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Lophocebus_aterrimus TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Hylobates_moloch TTTAAAAGCCAGGACAGTCACCTTTGGGGTAGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCATCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Theropithecus_gelada TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT 
GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Hylobates_agilis TTTAAAAGCCAGGACAGCCACCTTTGGGGTAGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCATCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Oryctolagus_cuniculus CTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGCGGGGTCACCTGGGT GGCGGCCATCTTGGTGTCTCTCCCAGACATCATCTTTACCAGATCTCAAAAAGA Leontopithecus_chrysomeTTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTATTTGCCTCTCTCCCAGGARTCATCTTTACCAGGTCCCAAAAAGA Sylvilagus_floridanus TTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGCGGGGTCACCTGGGT GGCGGCCATCTTGGTGTCTCTCCCAGACATCATCTTTACCAGATCTCAAAAAGA Oryctolagus_cuniculus CTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGCGGGGTCACCTGGGT GGCGGCCATCTTGGTGTCTCTCCCAGACATCATCTTTACCAGATCTCAAAAAGA Saimiri_sciureus TTTAAAAGCCAGGACGGTCACGTTTGGGCTGCTGACAAGTGTGATCACTTGGGT GGTGGCTGTATTCGCCTCTCTCCCAGGAATCATCTTCACCAGGTCTCAAAAAGA Homo_sapiens TTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCGTCTCTCCCAGGAATCATCTTTACCAGATCTCAAAAAGA Equus_asinus TTTAAAAGCCAGGACGGTCACCTTTGGGCTGATGACAAGTGGGGTCACTTGGGC GGTGGCTGTGTTTGTCTCTCTCCCAGGAATCATCTTTACCAAATCCCAAAAAGA Loxodonta_africana TGTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGGGGTCACCTGGGT GGTGGCTGTGCTTGTCTCTCTCCCAGGGATCATCTTTACCAGATCCCAAAAAGA Ovis_aries TTTAAAAGCCAGGACAGTCACCTTTGGGGCAGTGACAAGTGGGGTCACGTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATTATCTTTACCAAATCCCAAAAAGA Sylvilagus_brasiliensisCTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACAAGCGGGGTCACCTGGGT GGCGGCCATATTTGCCTCGGTCCCGGGAATCATGTTTACCAGGTTCCAAAAAGA Lepus_townsendii CTTAAAAGCCAGGACGGTCACCTTTGGGGTGGTGACGAGTGTGATCACCTGGGT GGCGGCCATCTTGGTGTCTCTCCCGGGCATCATCTTTACCAGATCTCAAAAAGA Capra_hircus TTTAAAAGCCAGAACAGTCACCTTTGGGGCAGTGACAAGTGGGGTCACGTGGGT GGTGGCTATGTTTGCCTCTCTCCCAGGAATTATCTTTACCAAATCCCAAAAGGA Bos_taurus TTTAAAAGCCAGAACAGTCACCTTTGGGGCGGCGACAAGTGTGGTCACCTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATTATCTTTACCAAATCCCAAAAAGA Canis_lupus TTCAAAAGCCCGGACAGTCACCTTTGGGGTGGTGACAAGTGGGATCGCCTGGGT GGTGGCCGTGCTCGCCTCTTTCCCCAGGATCATCTTCACCAGATCCCAAAAAGA Gallus_gallus 
TTTAAAAGCTAGGACAGTTACCTACGGCATCCTCACCAGCATTGTCACGTGGGC TGTTGCTCTTTTTGCCTCTGTTCCTGGGATAGTATTTCACAAAACTCAACAGGA Equus_caballus TTTAAAAGCCAGGACGGTCACCTTTGGGCTGATGACAAGTGGGGTCACTTGGGC GGTGGCTGTGTTTGTCTCTCTCCCAGGAATCATCTTTACCAAATCCCAAAAAGA Felis_catus TTTAAAGGCCAGGACGGTCACCTTTGGGGCGGTGACAAGCGCGGTCACCTGGGC GGCGGCCGTGGTCGCCTCTCTGCCGGGGTGCGTCTTTAACAAGTTGCAGTGGGA Papio_anubis TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Mus_musculus TTTAAAAGTCAGAACGGTCAACTTTGGGGTGATAACAAGTGTAGTCACTTGGGC GGTGGCTGTGTTTGCCTCTCTCCCAGAAATAATCTTTACCAGATCTCAGAAAGA Rattus_norvegicus TATAAAAGCCAGAACAGTCAACTTTGGGGTAATAACAAGTGTAGTCACTTGGGT GGTGGCTGTGTTTGTCTCTCTCCCAGAAATAATCTTTATGAGATCTCAAAAAGA Macaca_mulatta TTTAAAAGCCAGGACAGTCACCTTTGG GGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAGGAATCATCTTTACCAGATCTCAGAGAGA Cercopithecus_aethiops TTTAAAAGCCAGGACAGTCACCTTTGGGGTGGTGACAAGTGTGATCACTTGGGT GGTGGCTGTGTTTGCCTCTCTCCCAAGAATCATCTTTACCAGATCTCAGAGAGA Danio_rerio AAATAAAAACCGCAGAAGCGTCTACGCTGCATCGTTATCTGTGGCCGTCTGGAT TATCAGCCTTCTCGAAAGCCTCGAATATCTCATCCGTTTC------ACAGTGG

Extended Data Figure 1 | Close relationship of CCR5 sequences among the simians. a, Phylogenetic trees. The larger tree uses Danio rerio CCR5 as an outgroup while the smaller inset tree uses two Lemur species. Bootstrap support is included at each node; branch lengths are provided where appropriate. The low bootstrap values between some clades are indicative of the very small number of phylogenetically informative bases and high propensity of invariant bases. b, Sequence alignment of CCR5 coding sequences. The portion of the CCR5 coding sequence alignment containing the −1 PRF signal is shown from the seaview sequence editor. In total, these 45 species were trimmed to 7,000 bases each and comprise 827 complete sites, of which 241 are invariant and 586 were informative for phylogenetic analyses. Danio rerio (at bottom) provides an outgroup; when considering the conservation of the region surrounding the −1 PRF signal at position 408, it was removed. This results in 170 sites, of which 69 are invariant, and 58 informative. With the exception of the Lemur, the −1 PRF region is strikingly conserved among primates.
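The site counts quoted in this legend (complete, invariant, informative) can be illustrated with a short sketch. It assumes the usual parsimony definition of an informative site (at least two character states, each present in at least two sequences) and scores only columns free of gaps or ambiguity codes; the legend does not state which definition or software settings were used. The function name `classify_sites` and the four-sequence toy alignment are hypothetical.

```python
from collections import Counter


def classify_sites(alignment):
    """Return (complete, invariant, informative) column counts.

    Assumption: a column is "complete" when every sequence carries an
    unambiguous base, and "informative" in the parsimony sense when at
    least two states each occur in two or more sequences.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    complete = invariant = informative = 0
    for column in zip(*alignment):
        if any(base not in "ACGTU" for base in column):
            continue  # skip columns containing gaps or ambiguity codes
        complete += 1
        counts = Counter(column)
        if len(counts) == 1:
            invariant += 1
        elif sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return complete, invariant, informative


# Hypothetical toy alignment, loosely modelled on the start of the signal.
toy = [
    "TTTAAAAGCCAGGACA",
    "TTTAAAAGCCAGGACG",
    "TTTAAAAGCCAGGACG",
    "CTTAAAAGCCAG-ACA",
]
print(classify_sites(toy))
```

Columns that vary in only one sequence are counted as complete but neither invariant nor informative, which is why the three numbers in the legend need not sum to the total.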

©2014 Macmillan Publishers Limited. All rights reserved RESEARCH ARTICLE

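[Extended Data Figure 2a, b; graphics not reproduced. a, Reporter schematics, each with RLuc upstream of FLuc: in-frame control (FLuc in frame); out-of-frame control (FLuc in −1 frame); HIV-1 −1 PRF (FLuc in −1 frame); CCR5 −1 PRF (FLuc in −1 frame); CCR5 slip site mutant (U UUA AAA changed to GCGCGCG; FLuc in −1 frame); PTC control (UAA following RLuc; FLuc in −1 frame). b, Bar chart of % −1 PRF for the HIV and CCR5 reporters in CHO and Vero cells.]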

c
   1 MTSRIPVVHA VFALKSQDGH LWGGDKCDHL GGGCVCVSPR NHLYQISKRR
  51 SSLHLQLSFS IQSVSILEEF PDIKDSHLGV GTDPVVLQRR DWENPGVTQL
 101 NRLAAHPPFA SWRNSEEART DRPSQQLRSL NGEWRFAWFP APEAVPESWL
 151 ECDLPEADTV VVPSNWQMHG YDAPIYTNVT YPITVNPPFV PTENPTGCYS
 201 LTFNVDESWL QEGQTRIIFD GVNSAFHLWC NGRWVGYGQD SRLPSEFDLS
 251 AFLRAGENRL AVMVLRWSDG SYLEDQDMWR MSGIFRDVSL LHKPTTQISD
 301 FHVATRFNDD FSRAVLEAEV QMCGELRDYL RVTVSLWQGE TQVASGTAPF
 351 GGEIIDERGG YADRVTLRLN VENPKLWSAE IPNLYRAVVE LHTADGTLIE
 401 AEACDVGFRE VRIENGLLLL NGKPLLIRGV NRHEHHPLHG QVMDEQTMVQ
 451 DILLMKQNNF NAVRCSHYPN HPLWYTLCDR YGLYVVDEAN IETHGMVPMN
 501 RLTDDPRWLP AMSERVTRMV QRDRNHPSVI IWSLGNESGH GANHDALYRW
 551 IKSVDPSRPV QYEGGGADTT ATDIICPMYA RVDEDQPFPA VPKWSIKKWL
 601 SLPGETRPLI LCEYAHAMGN SLGGFAKYWQ AFRQYPRLQG GFVWDWVDQS
 651 LIKYDENGNP WSAYGGDFGD TPNDRQFCMN GLVFADRTPH PALTEAKHQQ
 701 QFFQFRLSGQ TIEVTSEYLF RHSDNELLHW MVALDGKPLA SGEVPLDVAP
 751 QGKQLIELPE LPQPESAGQL WLTVRVVQPN ATAWSEAGHI SAWQQWRLAE
 801 NLSVTLPAAS HAIPHLTTSE MDFCIELGNK RWQFNRQSGF LSQMWIGDKK
 851 QLLTPLRDQF TRAPLDNDIG VSEATRIDPN AWVERWKAAG HYQAEAALLQ
 901 CTADTLADAV LITTAHAWQH QGKTLFISRK TYRIDGSGQM AITVDVEVAS
 951 DTPHPARIGL NCQLAQVAER VNWLGLGPQE NYPDRLTAAC FDRWDLPLSD
1001 MYTPYVFPSE NGLRCGTREL NYGPHQWRGD FQFNISRYSQ QQLMETSHRH
1051 LLHAEEGTWL NIDGFHMGIG GDDSWSPSVS AEFQLSAGRY HYQLVWCQK

d
Start - End  Observed  Mr(expt)   Mr(calc)   ppm  Miss  Sequence
2 - 17       599.0121  1794.0144  1794.0152   -0     0  M.TSRIPVVHAVFALKSQ.D Acetyl (N-term) (Ions score 31)
2 - 17       599.0132  1794.0179  1794.0152    2     0  M.TSRIPVVHAVFALKSQ.D Acetyl (N-term) (Ions score 22)
2 - 17       599.0138  1794.0195  1794.0152    2     0  M.TSRIPVVHAVFALKSQ.D Acetyl (N-term) (Ions score 32)
2 - 17       599.0138  1794.0195  1794.0152    2     0  M.TSRIPVVHAVFALKSQ.D Acetyl (N-term) (Ions score 23)
2 - 17       599.0141  1794.0203  1794.0152    3     0  M.TSRIPVVHAVFALKSQ.D Acetyl (N-term) (Ions score 55)
2 - 17       599.0142  1794.0207  1794.0152    3     0  M.TSRIPVVHAVFALKSQ.D Acetyl (N-term) (Ions score 45)
2 - 17       599.0198  1794.0377  1794.0152   13     0  M.TSRIPVVHAVFALKSQ.D Acetyl (N-term) (Ions score 43)
2 - 24       630.0921  2516.3391  2516.3288    4     1  M.TSRIPVVHAVFALKSQDGHLWGG.D Acetyl (N-term) (Ions score 29)
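The mass columns of the peptide list in panel d are related by simple arithmetic, sketched below. The charge states (3+ for the 2–17 peptide, 4+ for the 2–24 peptide) are an inference from the printed numbers and are not given in the table; the function names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def neutral_mass(mz, charge):
    """Neutral peptide mass (Mr) from an observed m/z and a charge state."""
    return charge * (mz - PROTON_MASS)


def ppm_error(mr_expt, mr_calc):
    """Mass error of the experimental Mr relative to the calculated Mr, in ppm."""
    return (mr_expt - mr_calc) / mr_calc * 1e6


# (observed m/z, assumed charge, Mr(expt), Mr(calc)) copied from panel d.
rows = [
    (599.0121, 3, 1794.0144, 1794.0152),
    (599.0198, 3, 1794.0377, 1794.0152),
    (630.0921, 4, 2516.3391, 2516.3288),
]
for mz, z, expt, calc in rows:
    print(f"m/z {mz} (z={z}): Mr = {neutral_mass(mz, z):.4f}, "
          f"error = {ppm_error(expt, calc):.1f} ppm")
```

Rounded to whole numbers, the errors computed this way agree with the ppm column for these three rows.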

Extended Data Figure 2 | Dual-luciferase reporters, frameshifting data from other cell types, and frameshift peptide LC-MS/MS sequencing data. a, Schematic of dual-luciferase constructs used to test the CCR5 frameshifting signal. Transcription is driven from the SV40 early enhancer/promoter and transcription termination and utilizes the SV40 late poly(A) signal. The in-frame control is p2luci, encoding a firefly/Renilla luciferase fusion protein. In the out of frame reporter (OOF), firefly luciferase lies in the −1 reading frame with respect to the Renilla open reading frame. In the HIV −1 PRF reporter, the −1 PRF signal of HIV-1 was cloned in between the two luciferase reporters, and the firefly ORF is in the −1 frame with respect to Renilla. The CCR5 −1 PRF reporter is the same, except that it contains the CCR5 −1 PRF signal. In the CCR5 slip site mutant (SSM), the UUUAAAA slippery heptamer of the CCR5 −1 PRF signal was mutated to GCGCGCG. The PTC reporter is based on the CCR5 −1 PRF reporter in which a premature termination codon was inserted following the Renilla open reading frame. b, Efficient −1 PRF promoted by the CCR5 sequence in CHO and Vero cells. Error bars denote standard error. c, d, LC-MS/MS analysis of the CCR5/β-Gal −1 PRF fusion protein. c, Primary amino acid sequence of the predicted fusion protein. Matched peptides are shown in bold red. Leader sequence is highlighted in cyan. CCR5 0-frame sequence is yellow, and CCR5 −1 frame peptide is highlighted in green. Non-highlighted sequence is β-galactosidase. d, Partial list of matching peptides identified by MS/MS analysis. Highlighted sequences show MS/MS spectra of TSRIPVVHAVFALKSQ (2–17) and of TSRIPVVHAVFALKSQDGHLWGG (2–24) with N-terminal acetylation.
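To make the reading-frame relationship concrete, the sketch below translates the CCR5 signal in the 0 frame and after a −1 slip at the UUUAAAA heptamer. The RNA sequence is copied from the sequence printed with Extended Data Figs 4 and 6; the 0-frame phasing (U UUA AAA) follows the slip-site schematic. Treating the slip as a one-base backward shift once UUA and AAA have been decoded is the textbook simultaneous-slippage picture and is an assumption here, as are the helper names.

```python
from itertools import product

BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in UCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# CCR5 -1 PRF signal as printed in the figures (5' to 3').
SIGNAL = ("UUUAAAAGCCAGGACGGUCACCUUUGGGGUGGUGACAAGUGUGAUCACUUGGGUGGUGGC"
          "UGUGUUUGCGUCUCUCCCAGGAAUCAUCUUUACCAGAUCUCAAAAAG")


def translate(rna):
    """Translate complete codons from the start of rna; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna) - 2, 3))


# 0 frame: codons phased as U UUA AAA GCC ...
zero_frame = translate(SIGNAL[1:])

# -1 frame: UUA and AAA are decoded, then reading resumes one base
# upstream of the next 0-frame codon (AGC CAG GAC ...).
shifted = translate(SIGNAL[1:7]) + translate(SIGNAL[6:])

print("0 frame :", zero_frame)
print("-1 frame:", shifted)
```

The −1 product begins LKSQDGHLWGG, which is the C-terminal part of the 2–24 peptide TSRIPVVHAVFALKSQDGHLWGG listed in panel d, whereas the 0-frame product continues as LKARTVTFGVV.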


[Extended Data Figure 3a–e; graphics not reproduced. a, Secondary-structure diagrams labelled Best Fit, Pknots, NUPACK and mFold, with stems S1 and S2 (S1-5′, S1-3′, S2-5′, S2-3′), the loop, stem 2 segments a–e and the 5′ UUUAAAA slippery site marked; reactivity key for bases and ribose 2′-OH: highly reactive, mildly reactive, unreactive. b, Chemical protection (bases) autoradiograms; lanes labelled GAUC, un, D, C, K. c, SHAPE (riboses) autoradiograms for Primer 1 and Primer 2; lanes labelled GAUC, un, 1, 2, 3. d, r.m.s.d. (Å) versus time (ns) for the 80 ns MD trajectory; boxed values: FULL (8–103) 8.2 ± 1.5 Å; SL1 (8–22, 55–75) 6.7 ± 0.6 Å; SL2 (24–53, 76–103) 6.1 ± 1.5 Å. e, MD average structure for the last 12 ns (68–80 ns), with SL 1, SL 2 and the 5′ and 3′ ends labelled.]

Extended Data Figure 3 | Structural analysis of the CCR5 −1 PRF signal. a, Computationally predicted and best fit 2-dimensional mRNA structures with chemical modification data. Stems 1, 2, and the loop are indicated as S1, S2 and L. The five different segments of stem 2 are labelled a–e. Nucleotide bases showing low levels of reactivity with DMS, CMCT, and kethoxal, and ribose sugars whose 2′-OH groups with NMIA were similarly non-reactive are noted as open circles. Moderately reactive sugars and bases are denoted by grey filled circles. Strongly modified (strongly reactive) sugars and bases are represented as black filled circles. Circles proximal to the bases denote reactivity of the bases, while circles distal to the bases denote reactivity of the ribose sugars. From left to right: mRNA pseudoknot best fit to data; mRNA pseudoknot predicted by Pknots; mRNA pseudoknot predicted by NUPACK; tandem stem-loops predicted by mFold. Red bases represent chemical modification patterns inconsistent with computational predictions. b, c, Chemical modification experiments. Autoradiograms of reverse transcriptase primer extensions performed on RNA transcribed by T7 RNA Pol from template PCR amplified from CCR5 −1 PRF signal containing plasmid. Bands correspond to strong readthrough control (RT) stops 1 nucleotide 5′ of bases modified by chemical reagents. b, The CCR5 mRNA was either left unmodified (un), or modified with 3 increasing concentrations of dimethyl sulphate (DMS, reacts with A and C), 1-cyclohexyl-(2-morpholinoethyl)carbodiimide metho-p-toluene sulfonate (CMCT, reacts with U), or 1,1-dihydroxy-3-ethoxy-2-butanone (kethoxal, reacts with G) respectively. c, Primer extension reactions were performed on unmodified samples and samples incubated with 30, 65, and 110 nM NMIA (N-methyl isatoic anhydride). These are labelled 1, 2 and 3, respectively, beneath each sample, and ‘un’ denotes untreated RNA. d, The all atom r.m.s.d. plot of the 80 ns-long molecular dynamics simulation states against the reference structure (the starting state of the simulation) at the end of a 2 ns-long equilibration. All measures exclude most mobile, single-stranded, first 7 nucleotides. The full structure r.m.s.d. values are shown in black, SL1 data are plotted in cyan, and the SL2 data are plotted in red. Mean r.m.s.d. values with standard deviations are given in the box. Overall, after approximately 21 ns the structure stabilizes. e, Two views (opposite sides) of an average minimized structure based on the last 12 ns of the simulation (indicated with red arrow in d) where all the r.m.s.d. measures converge and are most stable.
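The r.m.s.d. trace in panel d measures how far each simulation frame sits from a reference structure. A minimal sketch of that quantity follows, assuming the frames have already been superimposed on the reference (MD analysis packages usually perform a least-squares fit first; the legend does not say which procedure was used). Coordinates and names are hypothetical.

```python
import math
from statistics import mean, stdev


def rmsd(frame, reference):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) atoms."""
    if len(frame) != len(reference):
        raise ValueError("frame and reference must contain the same atoms")
    squared = sum((a - b) ** 2
                  for atom, ref_atom in zip(frame, reference)
                  for a, b in zip(atom, ref_atom))
    return math.sqrt(squared / len(frame))


# Hypothetical two-atom "structure" and two trajectory frames (in Å).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # both atoms displaced by 1 Å
    [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)],  # one atom displaced by 2 Å
]

values = [rmsd(frame, reference) for frame in trajectory]
print(values, f"mean {mean(values):.2f} ± {stdev(values):.2f} Å")
```

Restricting the atom lists to a subset of residues gives per-domain traces of the kind plotted for SL1 and SL2.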


[Extended Data Figure 4a–c; graphics not reproduced. a, Predicted hybridization of three miRNAs with the CCR5 −1 PRF signal:
CCR5 signal: 5′ UUUAAAAGCCAGGACGGUCACCUUUGGGGUGGUGACAAGUGUGAUCACUUGGGUGGUGGCUGUGUUUGCGUCUCUCCCAGGAAUCAUCUUUACCAGAUCUCAAAAAG 3′
miR-711: 3′ GAAUGCAGAGAGGGACCCAGG_G 5′ (gap as drawn in the alignment)
miR-141: 3′ GGUAGAAAUGGUCUGUCACAAU 5′
miR-1224: 3′ GGUGGAGGGCUCAGGAGUG 5′ (drawn at three candidate positions)
b, c, Bar charts of % −1 PRF; conditions: none, scr., miR-141, miR-711, miR-1224; legend entries: HIV, CCR5, CCR5 ssM, CCR5 CHO, CCR5 Vero; asterisks mark significant differences.]

Extended Data Figure 4 | Specific stimulation of CCR5-mediated −1 PRF by miR-1224. a, Sequence of the CCR5 −1 PRF signal is shown. The UUUAAAA slippery site is italicized, stems 1 and 2 of the mRNA pseudoknot are coloured blue and red, respectively, and unpaired bases are black. Sequences of miR-1224, miR-711, and miR-141, and their potential hybridization patterns with CCR5 sequence are indicated. b, Stimulation of −1 PRF by miR-1224 in HeLa cells. Cells were transfected with 5 nM of the indicated miRNA expressing constructs, or mock transfected. After 24 h incubation, cells were transfected with the indicated −1 PRF dual-luciferase reporters, and frameshift assays were performed 24–36 h later. c, miR-141 stimulates CCR5 mediated −1 PRF in CHO cells. CHO and Vero cells were transfected with 5 nM of the indicated miRNA expressing constructs, with scrambled miRNA sequences, or mock transfected. After 24 h incubation, cells were transfected with the indicated −1 PRF dual-luciferase reporters, and frameshift assays were performed 24–36 h later. Error bars denote standard error. *P < 0.05.
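The % −1 PRF values plotted for the dual-luciferase reporters are conventionally obtained as a ratio of ratios: the firefly/Renilla ratio of the test reporter divided by the same ratio for the in-frame control. The legend does not spell out the calculation, so the sketch below should be read as that common convention rather than the authors' exact procedure; all readings and names are hypothetical.

```python
from statistics import mean


def percent_prf(test_wells, control_wells):
    """Percent frameshifting from (firefly, renilla) readings.

    test_wells: readings from the -1 PRF reporter.
    control_wells: readings from the in-frame (0-frame) control.
    """
    test_ratio = mean(firefly / renilla for firefly, renilla in test_wells)
    control_ratio = mean(firefly / renilla for firefly, renilla in control_wells)
    return 100.0 * test_ratio / control_ratio


# Hypothetical luminometer readings (arbitrary units).
ccr5_reporter = [(10.0, 100.0), (12.0, 100.0)]
in_frame_control = [(100.0, 100.0), (120.0, 100.0)]

print(f"{percent_prf(ccr5_reporter, in_frame_control):.1f} % -1 PRF")
```

Normalizing firefly to Renilla within each well corrects for differences in transfection efficiency and mRNA abundance before the two reporters are compared.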


[Extended Data Figure 5a–e; graphics not reproduced. a, CCR5, native gel shift (bands R5 and miR). b, CCR5, refolded gel shift (bands R5 and miR). c, HIV-1, native gel shift (bands HIV and miR). d, HIV-1, refolded gel shift (bands HIV and miR). e, SHAPE gels for Primer1 and Primer2; lanes 1–4: GATC; 5: CCR5 Ctrl; 6: CCR5 + miR-1224 Ctrl; 7: CCR5 NMIA; 8: CCR5 + miR-1224 + NMIA.]

Extended Data Figure 5 | Gel shifts and SHAPE analysis of CCR5 + miR-1224. a, b, In vitro binding of miR-1224 with the CCR5 −1 PRF signal. Serial dilutions from 30 nM to 0.25 nM of a CCR5 −1 PRF signal (R5) containing transcript were mixed with equal volumes of 1.0 nM [32P]-labelled synthetic miR-1224 RNA (miR), and incubated at 30 °C for 30 min (a, Native), or at 90 °C for 5 s, cooled quickly to 60 °C and then slowly to 37 °C (b, Refolded). Samples were separated through 10% native PAGE, dried, and intensities of retarded bands were quantified using a phosphorimager and BioRad Quantity One software. c, d, miR-1224 does not bind to the HIV-1 −1 PRF signal in vitro. c, ‘Native’ annealing conditions. d, ‘Refolded’ annealing conditions. For a–d, n = 6 for each sample (three times each of two technical replicates). e, SHAPE analysis of the CCR5 −1 PRF signal in the absence (lane 7) or presence (lane 8) of miR-1224.


[Extended Data Figure 6a–c; graphics not reproduced. a, CCR5 −1 PRF signal (5′ UUUAAAAGCCAGGACGGUCACCUUUGGGGUGGUGACAAGUGUGAUCACUUGGGUGGUGGCUGUGUUUGCGUCUCUCCCAGGAAUCAUCUUUACCAGAUCUCAAAAAG 3′) with miR-1224 (3′ GGUGGAGGGCUCAGGAGUG 5′) drawn at three candidate positions; sequences labelled M1: GGUGGAGG, M2: UGGAGGGCUCA, M3: GCUCAGGAGU. b, Gel shifts for CCR5-M1, CCR5-M2 and CCR5-M3 (bands CCR5 and miR-1224). c, Model cartoons labelled Native, Re-folded, M1, M2, M3; Kd (nM), in the order printed: 0.36, 0.76, 0.76, 0.36, 41.5.]

Extended Data Figure 6 | Mapping the CCR5/miR-1224 interaction: miR-1224 stimulates CCR5 −1 PRF by stabilizing Stem 2 through a proposed triple helical interaction. a, Putative miR-1224 binding sites were changed to their complements (green). b, Gel-shift assays using miR-1224 and M1, M2 and M3 variants of the CCR5 signal (described in Extended Data Fig. 5) using native conditions. n = 6 for each sample (three times each of two technical replicates). c, Models diagramming experimental results. The CCR5 pseudoknot is cartooned with wild-type sequences shown as black lines, mutant sequences shown as green lines, and miR-1224 in red. Proposed triple helical interactions between the CCR5 −1 PRF promoting pseudoknot and miR-1224 are shown in the Native and M2 models.
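Dissociation constants such as those reported with this figure are commonly estimated by fitting the fraction of shifted probe to a single-site binding isotherm. The sketch below shows one simple way to do that with a log-spaced grid search; the fitting method actually used is not described in the legend, the isotherm assumes the labelled probe is far below the Kd (only approximately true at sub-nanomolar Kd), and the concentrations and fractions are synthetic.

```python
def fraction_bound(conc_nm, kd_nm):
    """Single-site binding isotherm: fraction of probe shifted at a given target concentration."""
    return conc_nm / (kd_nm + conc_nm)


def fit_kd(concs_nm, fractions, lo=0.01, hi=100.0, steps=4000):
    """Least-squares Kd estimate from a log-spaced grid between lo and hi (nM)."""
    best_kd, best_sse = lo, float("inf")
    for i in range(steps + 1):
        kd = lo * (hi / lo) ** (i / steps)
        sse = sum((fraction_bound(c, kd) - f) ** 2
                  for c, f in zip(concs_nm, fractions))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic twofold dilution series (nM) with noise-free data generated
# at Kd = 0.36 nM purely to exercise the fit.
concs = [30 / 2 ** i for i in range(8)]
observed = [fraction_bound(c, 0.36) for c in concs]

print(f"fitted Kd = {fit_kd(concs, observed):.2f} nM")
```

With real band intensities, replicate measurements would be fitted together and a tight-binding (quadratic) model may be more appropriate when the probe concentration approaches the Kd.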


[Extended Data Figure 7a–d; graphics not reproduced. a, β-globin reporter schematic: Dox-repressible promoter, 5′ UTR, exons 1–3, 3′ UTR, SV40 polyA; insert sites for CCR5 −1 PRF or PTC, and for the ARE. b, % mRNA remaining versus time (min, 0–400) for the CCR5, PTC and ARE reporters, including + hSmg1 siRNA. c, CCR5 mRNA (fold of 0 min) versus time (min, 0–180) for hSmg1 siRNA and scrambled control. d, CCR5 mRNA abundance (fold of scrambled control) for Scr, Smg1 siRNA, miR-1224, anti-miR-1224 and their combinations; asterisks mark significant differences.]

Extended Data Figure 7 | mRNA stability studies. a, Rabbit β-globin reporter containing a doxycycline repressible promoter and SV40 derived polyA signal is shown. The native CCR5 −1 PRF signal was cloned into exon 1. Controls included insertion of the CCR5 SSM −1 PRF signal, PTCs in all three reading frames at this position, or insertion of the 27 nucleotide TNF-α-derived A-U rich element (ARE) immediately following the rabbit β-globin open reading frame. b, Time course measurements of rabbit β-globin reporter abundances transcriptionally arrested with doxycycline. c, Time course measurements of native CCR5 mRNA reporter abundances from HeLa-TZM BL cells transcriptionally arrested with actinomycin D. Cells were transfected with SMG1 miRNA or scrambled miRNA control. d, Effects of various RNAs on CCR5 mRNA steady-state abundance. HeLa-TZM BL cells expressing CCR5 were transfected as follows: scrambled siRNA control (scr), human SMG1 siRNA (Smg), miR-1224 (1224), a miR-1224 antagomir (anti), and combinations thereof. b–d, n = 9 (three times on three independent biological replicates). Error bars denote standard error. *P < 0.05, **P < 0.01 (Student’s two-tailed t-test).
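Time courses of mRNA abundance after transcriptional arrest, as in panels b and c, are often summarized by a half-life obtained from a first-order decay fit. The sketch below does this by linear regression on log-transformed values; the legend does not report half-lives or a fitting method, so the approach, names and data are illustrative assumptions only.

```python
import math


def half_life(times_min, percent_remaining):
    """Half-life (min) from a least-squares line through ln(% remaining) versus time."""
    logs = [math.log(p) for p in percent_remaining]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
             / sum((t - mean_t) ** 2 for t in times_min))
    return math.log(2) / -slope  # slope is negative for a decaying mRNA


# Hypothetical time course: the signal halves every 60 min.
times = [0, 60, 120, 180]
remaining = [100.0, 50.0, 25.0, 12.5]

print(f"half-life = {half_life(times, remaining):.1f} min")
```

A shorter fitted half-life for one reporter than for another under the same arrest conditions is the kind of difference such decay curves are used to reveal.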


[Extended Data Figure 8; Feynman diagrams and MFE landscapes not reproduced. Panel headings and SNP labels are listed in the order extracted; the assignment of SNPs and variant bases to individual panels is not recoverable from the text.
IL2Rγ
CCR5: 5′ UUUAAAAGCCAGGACGGUCACCUUUGGGGUGGUGACAAGUGUGAUCACUUGGGUGGUGGCUGUGUUUGCGUCUCUCCCAGGAAUCAUCUUUACCAGAUCUCAAAAAG 3′; SNPs rs140216601, rs55639502, rs1800942
IL27Rα: SNPs rs112963788, rs113593439, rs145262345
IL7Rα (CD127): SNPs rs193922647, rs200889239, rs202060571, rs200107390, rs145262345; one marked "Probable pathogenic allele"
IL8Rα_164 (CXCR1): SNPs rs138773569, rs201294973, rs199633255, rs150483744, rs201754155, rs149364972
IL8Rβ (CXCR2) and IL8Rα_572: SNPs rs2012494973, rs199633255, rs188773569, rs150483744, rs188773569, rs201920477, rs146445859, rs139052408, rs142076386, rs144399308, rs199972612, rs149364972, rs199933392]

Extended Data Figure 8 | Mapping of single nucleotide polymorphisms (SNPs) and paused ribosomes within operational −1 PRF signals. Operational −1 PRF signals are shown as Feynman diagrams. SNPs were identified and mapped to −1 PRF signals using the http://www.ncbi.nlm.nih.gov/gene/ utility and are indicated below sequences. For the IL2RG, IL7RA and IL27RA sequences, minimum free energy (MFE) landscapes below the Feynman diagrams depict black vertical lines to delineate the positions of slippery heptamers; the horizontal green line shows the mean Vienna RNAfold value over the ORF; the horizontal red line shows the mean pknots value over the ORF; the horizontal blue line shows the mean nupack value. Green, red and blue lines delineate the MFE over a sliding 105-bp window. Arrows indicate where −1 PRF signals align with paused ribosomes in profiling data.
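The MFE landscapes described in this legend combine two scans along an ORF: locating slippery heptamers and sliding a 105-nt window for folding. The sketch below covers only the scanning steps and leaves the folding to external software. It assumes the commonly cited heptamer consensus X XXY YYZ (XXX any three identical bases, YYY either AAA or UUU, Z not G) phased so that the 0-frame codons are XXY and YYZ; the motif definition, step size and function names are assumptions, not details given in the legend.

```python
import re

# Lookahead so that overlapping candidates are all reported.
SLIPPERY = re.compile(r"(?=([ACGU])\1\1(?:AAA|UUU)[ACU])")


def find_slippery_sites(orf):
    """Start indices of X XXY YYZ heptamers in an ORF whose first codon starts at index 0."""
    # XXY must be a 0-frame codon, so the heptamer starts one base
    # before a codon boundary (index % 3 == 2).
    return [m.start() for m in SLIPPERY.finditer(orf) if m.start() % 3 == 2]


def sliding_windows(orf, size=105, step=1):
    """Yield (start, subsequence) pairs for every full window along the ORF."""
    for start in range(0, len(orf) - size + 1, step):
        yield start, orf[start:start + size]


# Hypothetical ORF: a start codon and two filler bases, then the CCR5 signal.
signal = ("UUUAAAAGCCAGGACGGUCACCUUUGGGGUGGUGACAAGUGUGAUCACUUGGGUGGUGGC"
          "UGUGUUUGCGUCUCUCCCAGGAAUCAUCUUUACCAGAUCUCAAAAAG")
orf = "AUGGC" + signal

print(find_slippery_sites(orf))
print(sum(1 for _ in sliding_windows(orf)))
```

Each window returned by `sliding_windows` would then be passed to a folding program to obtain the per-window MFE values that make up the landscape.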
