BIOPHYSICS AND COMPUTATIONAL BIOLOGY

Cotranslational folding allows misfolding-prone proteins to circumvent deep kinetic traps

Amir Bitran(a,b), William M. Jacobs(c), Xiadi Zhai(a), and Eugene Shakhnovich(a,1)

(a) Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138; (b) Harvard University Program in Biophysics, Harvard University, Cambridge, MA 02138; and (c) Department of Chemistry, Princeton University, Princeton, NJ 08544

Edited by William A. Eaton, National Institutes of Health, Bethesda, MD, and approved December 12, 2019 (received for review July 31, 2019)

Many large proteins suffer from slow or inefficient folding in vitro. It has long been known that this problem can be alleviated in vivo if proteins start folding cotranslationally. However, the molecular mechanisms underlying this improvement have not been well established. To address this question, we use an all-atom simulation-based algorithm to compute the folding properties of various large protein domains as a function of nascent chain length. We find that for certain proteins, there exists a narrow window of lengths that confers both thermodynamic stability and fast folding kinetics. Beyond these lengths, folding is drastically slowed by nonnative interactions involving C-terminal residues. Thus, cotranslational folding is predicted to be beneficial because it allows proteins to take advantage of this optimal window of lengths and thus avoid kinetic traps. Interestingly, many of these proteins' sequences contain conserved rare codons that may slow down synthesis at this optimal window, suggesting that synthesis rates may be evolutionarily tuned to optimize folding. Using kinetic modeling, we show that under certain conditions, such a slowdown indeed improves cotranslational folding efficiency by giving these nascent chains more time to fold. In contrast, other proteins are predicted not to benefit from cotranslational folding due to a lack of significant nonnative interactions, and indeed these proteins' sequences lack conserved C-terminal rare codons. Together, these results shed light on the factors that promote proper protein folding in the cell and how biomolecular self-assembly may be optimized evolutionarily.

protein folding | cotranslational folding | codon usage | evolution | self-assembly

Significance

Many proteins must adopt a specific structure to perform their functions, and failure to do so has been linked to disease. Although small proteins often fold rapidly and spontaneously to their native conformations, larger proteins are less likely to fold correctly due to the myriad incorrect arrangements they can adopt. Here, we provide mechanistic insights into how this problem can be alleviated if proteins start folding while they are being translated by the ribosome. This process of cotranslational folding biases certain proteins away from misfolded states that tend to hinder spontaneous refolding. Signatures of unusually slow translation suggest that some of these proteins have evolved to fold cotranslationally.

Author contributions: A.B., W.M.J., and E.S. designed research; A.B. and X.Z. performed research; A.B. analyzed data; and W.M.J. and E.S. wrote the paper.
The authors declare no competing interest.
This article is a PNAS Direct Submission.
Published under the PNAS license.
Data deposition: A dataset containing folding rates and free energies for all protein constructs included in this publication has been deposited in Figshare (https://figshare.com/articles/Analyzed data/11496954).
1 To whom correspondence may be addressed. Email: shakhnovich@chemistry.harvard.edu.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1913207117/-/DCSupplemental.
First published January 7, 2020.
www.pnas.org/cgi/doi/10.1073/pnas.1913207117 PNAS | January 21, 2020 | vol. 117 | no. 3 | 1485–1495

Many large proteins refold from a denatured state very slowly in vitro (on timescales of minutes or slower) while others do not spontaneously refold at all (1–6). Given that proteins must rapidly and efficiently fold in the crowded cellular environment, how is this conundrum resolved? The answer involves a number of factors that affect cellular folding, but which are absent in vitro. For example, molecular chaperones such as GroEL in Escherichia coli and TriC and HSP90 in eukaryotes may substantially improve folding efficiency by passively confining unfolded chains to promote their folding or by expending energy to repeatedly anneal misfolded chains until the correct structure is attained (6–12). A second factor that can improve in vivo folding efficiency is cotranslational folding on the ribosome (13–23), which may affect the folding of as much as 30% of the E. coli proteome (20). A recent set of works (13, 14) suggests that protein synthesis rates in various organisms may be under evolutionary selection to allow for cotranslational folding. Namely, these works show that conserved stretches of rare codons, which are typically translated more slowly than their synonymous counterparts, are significantly enriched roughly 30 amino acids upstream of chain lengths at which folding is predicted to begin. This 30 amino acid gap is expected given that the ribosome exit tunnel sequesters the last ∼30 amino acids of a nascent chain and generally impedes their folding. The observed enrichment of conserved translation pauses at folding-competent chain lengths suggests that cotranslational folding may be under positive evolutionary selection. However, the specific mechanisms by which cotranslational folding is beneficial have not been elucidated.

Here, we address this question using an all-atom computational method for inferring protein-folding pathways and rates while accounting for the possibility of nonnative conformations. We apply this method to compute protein-folding properties at various nascent chain lengths to investigate how vectorial synthesis affects cotranslational folding efficiency. We find that for certain large proteins, vectorial synthesis is beneficial because it allows nascent chains to fold rapidly at shorter chain lengths, prior to the synthesis of C-terminal residues which stabilize nonnative kinetic traps. Many of these proteins' sequences contain conserved rare codons ∼30 amino acids downstream of these faster-folding intermediate lengths, suggesting these sequences may have evolved to provide enough time for cotranslational folding. We also identify counterexamples: proteins without conserved rare codons that do not misfold into deep kinetic traps and for which vectorial synthesis thus confers no advantage. Together, these results provide a detailed molecular picture of how vectorial synthesis may improve in vivo folding speed and efficiency and how cotranslational folding may be optimized evolutionarily.

[Fig. 1 panel labels: Replica exchange simulations (Top Left); Unfolding simulations (Bottom Left); Compute folding rates from detailed balance (Top Right); Repeat for multiple chain lengths, incorporate into kinetic model (Bottom Right)]

Fig. 1. (Top Left) We run replica exchange atomistic simulations with a knowledge-based potential and umbrella sampling to compute a protein's free-energy landscape. (Bottom Left) To obtain barrier heights, we run high-temperature unfolding simulations and extrapolate unfolding rates down to lower temperatures assuming Arrhenius kinetics. (Top Right) The principle of detailed balance is then used to compute folding rates. (Bottom Right) The process is repeated at multiple chain lengths and incorporated into a kinetic model of cotranslational folding. For details, see Materials and Methods.

Results

Predicting Folding Properties of Nascent Chains. To compute cotranslational folding pathways and rates, we developed a simulation-based method and analysis pipeline described in Fig. 1 and Materials and Methods. The method utilizes an all-atom Monte Carlo simulation program with a knowledge-based potential and a realistic move set described previously (24–26). In essence, rather than simulating a protein's folding ab initio from an unfolded ensemble (which is intractable for large proteins at reasonable simulation timescales), we simulate unfolding and, in tandem, calculate the free energies of the folded, unfolded, and various intermediate states from simulations with enhanced sampling. Given rates of sequential unfolding between these states and their free energies, the reverse folding rates can be computed from detailed balance. Importantly, our sequence-based potential energy function is not biased toward the native state, as in native-centered (Gō) models, and allows for the possibility of nonnative interactions. Thus we can account for

ferent intermediates computed here and is not included in our simulations. 2) Unfolding rates obey Arrhenius kinetics, such that rates computed at high temperatures can be readily extrapolated to lower temperatures. This holds as long as the barriers between intermediates are large so that a local equilibrium is reached in each free-energy basin prior to unfolding. 3) Nonnative contacts form on timescales faster than the timescales of native folding transitions. This condition, which has previously been verified in lattice simulations (29, 30), is also satisfied for the misfolded states observed here which are dominated by short-range interactions that form rapidly compared to the long-range native contacts. This implies that a protein's folding landscape can be described by macrostates characterized by certain