Role of Mrna Structure in the Control of Protein Folding Guilhem Faure, Aleksey Y

10898–10911 Nucleic Acids Research, 2016, Vol. 44, No. 22 Published online 27 July 2016 doi: 10.1093/nar/gkw671 Role of mRNA structure in the control of protein folding Guilhem Faure, Aleksey Y. Ogurtsov, Svetlana A. Shabalina and Eugene V. Koonin* National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA Received May 03, 2016; Revised July 12, 2016; Accepted July 14, 2016 ABSTRACT speed and fidelity of translation. Although much attention had focused on the initiation step as a major determinant Speciﬁc structures in mRNA modulate translation of translation rate, multiple studies over nearly two decades rate and thus can affect protein folding. Using the have made it clear that elongation also plays an important protein structures from two eukaryotes and three role in the regulation of translation and co-translational prokaryotes, we explore the connections between protein folding (1–3). In particular, it has been shown that the protein compactness, inferred from solvent ac- in some proteins, ␣-helices and ␤-strands are flanked by cessibility, and mRNA structure, inferred from mRNA strong signals in the mRNA sequence (4,5). Different pro- folding energy (G). In both prokaryotes and eukary- tein structures have been reported to correlate with distinct otes, the G value of the most stable 30 nucleotide patterns of synonymous codon usage in the respective mR- segment of the mRNA (Gmin) strongly, positively NAs (6,7). More generally, it has been proposed that dif- correlates with protein solvent accessibility. Thus, ferent protein secondary structures are encoded by mRNA sequences with distinct properties. For example, ␣-helices in mRNAs containing exceptionally stable secondary Escherichia coli and human proteins appear to be preferen- structure elements typically encode compact pro- tially encoded by ‘fast’ mRNA regions, i.e. those enriched teins. The correlations between G and protein com- in optimal codons, whereas ‘slow’ regions often code for ␤- pactness are much more pronounced in predicted strands and loops (8–10). ordered parts of proteins compared to the predicted Recent work, in particular ribosome profiling experi- disordered parts, indicative of an important role of ments, has clearly demonstrated that translation speed is mRNA secondary structure elements in the control far from being uniform either among different mRNAs of protein folding. Additionally, G correlates with or along a single mRNA molecule (11,12). Structural fea- the mRNA length and the evolutionary rate of syn- tures of an mRNA, in particular segments with a sta- onymous positions. The correlations are partially in- ble secondary structure, as well as specific protein se- dependent and were used to construct multiple re- quences within a nascent polypeptide (arrest sequences), cause translating ribosomes to pause or stall (13,14). For ex- gression models which explain about half of the vari- ample, positively charged amino acid residues and proline- ance of protein solvent accessibility. These ﬁndings rich sequences have been reported to substantially affect suggest a model in which the mRNA structure, partic- the rate of translation and hence protein production, appar- ularly exceptionally stable RNA structural elements, ently by partially blocking the ribosomal exit channel (14– act as gauges of protein co-translational folding by 18). It has been shown that in the yeast Saccharomyces cere- reducing ribosome speed when the nascent peptide visiae, positively charged amino acid residues in the growing needs time to form and optimize the core structure. translated peptide interact with the negatively charged inner surface of the ribosomal exit tunnel and slow down translation (15). Ribosome stalling caused by the ‘arrest peptides’ INTRODUCTION that block the ribosomal exit channel affects the structure The primary function of a mRNA is to encode the sequence of the mRNA, resulting in specific biological effects includ- of a specific protein. However, an mRNA is not an abstract ing regulation of protein production, maturation or local- sequence of codons but rather a molecule with its own com- ization (19). Notably, it has been shown that to overcome plex structure various features of which can be subject to stalling at polyproline sequences, the ribosome requires a selection. In particular, mRNA molecules form secondary specific translation elongation factor (17). The availability structure elements (stems and loops) of broadly varying sta- of tRNAs also can affect the rate of elongation of nascent bility that can affect both the stability of the mRNA and the polypeptide chains (20). However, the actual contributions *To whom correspondence should be addressed. Tel: +1 301 435 5913; Fax: +1 301 435 7793; Email: [email protected] Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US. Nucleic Acids Research, 2016, Vol. 44, No. 22 10899 of each of these potential mechanisms to the control of lation (27,45). Indeed, mRNA structure-dependent changes translation speed remain uncertain and controversial (21– in translation rates can dramatically affect protein abun- 23). dance and cause major phenotypic effects including human Is the efficiency of translation and/or protein folding disease (46). encoded in the mRNA sequence, and if so, how? The 5- Additionally, several recent studies have demonstrated terminal 30–50 nucleotides of the protein-coding regions that variations in translation speed mediated by mRNA sec- in prokaryotes and eukaryotes are generally devoid of ondary structure can lead to changes in post-translational stable secondary structure that conceivably could impede modifications of the nascent polypeptide, a level of pro- mRNA interaction with the ribosomes (24–27). However, tein regulation previously believed not to be connected with 5-terminal parts of coding regions also contain structural the RNA level regulation (27). In particular, translation- elements that function as various translation regulatory sig- dependent regulation of post-translational protein arginy- nals. For example, the secondary structure in the region be- lation mediated by synonymous codon usage has been tween codons 14 and 34 downstream of the start codon has demonstrated for the purine nucleotide biosynthesis en- been variously proposed to facilitate the recognition of the zyme PRPS2 (47) and for actins (48). start codon and/or to prevent ribosomal jamming (13,28– A rapidly growing body of experimental data indicates 30). Furthermore, it has been shown that the 5-terminal re- that folding of many if not most proteins is predominantly gions of mRNAs form a ‘translational ramp’ which is trans- co-translational, i.e. individual protein domains fold before lated substantially slower than the rest of the coding region the synthesis of the respective polypeptide chain is com- (31,32). plete (49–53). Recently, the process of co-translational fold- The relationships between mRNA folding and transla- ing of several proteins has been dissected experimentally tion appear to be complex and involve opposite effects. Nu- (54–57). For example, cystic fibrosis transmembrane con- merous, independent experiments indicate that stable sec- ductance regulator has been shown to fold in discrete steps, ondary structure elements in mRNA decrease the rate of namely sequential compaction of the N-terminal, ␣-helical translation, especially in vitro (33–36). However, in an ap- and ␣/␤-core domains. The sequence of these events is criti- parent contradiction to these findings, recent progress in cal for the overall folding completion as premature ␣-helical the experimental determination of RNA secondary struc- domain folding hampered the subsequent formation of the tures has led to the demonstration of significant positive core domain. The synthesis of this particular protein is fa- correlations between mRNA folding (the prevalence of sta- cilitated by intrinsic folding propensity modulation in three ble structures) and protein abundance (37–39). These or- distinct ways: delaying ␣-subdomain compaction, facilitat- thogonal findings imply that multiple, still poorly under- ing ␤-strand intercalation and optimizing translation kinet- stood mechanisms that involve interactions between ribo- ics via codon usage (58). somes and different structural elements in translated mR- Evidence that folding of at least some proteins is modu- NAs differentially modulate the rate and efficiency of translated by translational pauses caused by mRNA secondary lation (24,27,40). structure has been reported, for example, for the coat pro- A pronounced periodic pattern of mRNA secondary tein of the RNA bacteriophage MS2 (59,60). Synonymous structure, stability and nucleotide base-pairing has been single-nucleotide polymorphisms within the same gene that identified in coding regions of diverse eukaryotes24 ( ,41,42). affect the secondary structure of mRNA can create varia- Although all codon positions are important for the forma- tions in translation speed, leading to dramatic differences in tion of the secondary structure of mRNA, synonymous po- protein folding between individuals. A striking example of sitions that are free from selection on protein sequence make this effect involves synonymous polymorphisms in the mul- the greater contribution to the evolution of mRNA sec- tidrug resistance 1 (MDR1 or ABCB1)

Role of Mrna Structure in the Control of Protein Folding Guilhem Faure, Aleksey Y

Translation Readthrough Mitigation Joshua A

Non-Synonymous to Synonymous Substitutions Suggest That Orthologs Tend to Keep Their Functions, While Paralogs Are a Source of Functional Novelty

Unbiased Estimate of Synonymous and Non-Synonymous Substitution Rates with Non-Stationary Base Composition

The KA /KS Ratio Test for Assessing the Protein

Molecular Phylogeny and Evolution

The Selection-Mutation-Drift Theory of Synonymous Codon Usage

Does Mrna Structure Contain Genetic Information for Regulating Co-Translational Protein Folding?

Analysis of Stop Codons Within Prokaryotic Protein-Coding Genes Suggests Frequent Readthrough Events

Cotranslational Assembly Imposes Evolutionary Constraints on Homomeric Proteins

Male-Driven Evolution Wen-Hsiung Li*, Soojin Yi and Kateryna Makova

Molecular Evolution

Wolf 2009 Genome Biol Evol {2F