Markov State Models for Protein and RNA Folding
MARKOV STATE MODELS FOR PROTEIN AND RNA FOLDING

A DISSERTATION SUBMITTED TO THE PROGRAM IN BIOPHYSICS AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Gregory R. Bowman
July 2010

© 2010 by Gregory Ross Bowman. All Rights Reserved. Re-distributed by Stanford University under license with the author.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License. http://creativecommons.org/licenses/by-nc/3.0/us/

This dissertation is online at: http://purl.stanford.edu/ky974bm1455

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Vijay Pande, Primary Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Russ Altman

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Daniel Herschlag

Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.

ABSTRACT

Understanding the molecular bases of human health could greatly augment our ability to prevent and treat diseases. For example, a deeper understanding of protein folding would serve as a reference point for understanding, preventing, and reversing protein misfolding in diseases like Alzheimer’s.
Unfortunately, the small size and tremendous flexibility of proteins and other biomolecules make it difficult to simultaneously monitor their thermodynamics and kinetics with sufficient chemical detail. Atomistic Molecular Dynamics (MD) simulations can provide a solution to this problem in some cases; however, they are often too short to capture biologically relevant timescales with sufficient statistical accuracy. We have developed a number of methods to address these limitations. In particular, our work on Markov State Models (MSMs) now makes it possible to map out the conformational space of biomolecules by combining many short simulations into a single statistical model.

Here we describe our use of MSMs to better understand protein and RNA folding. We chose to focus on these folding problems because of their relevance to misfolding diseases and the fact that any method capable of describing such drastic conformational changes should also be applicable to less dramatic but equally important structural rearrangements like allostery. One of the key insights from our folding simulations is that protein native states are kinetic hubs. That is, the unfolded ensemble is not one rapidly mixing set of conformations. Instead, there are many non-native states that can each interconvert more rapidly with the native state than with one another. In addition to these general observations, we also demonstrate how MSMs can be used to make predictions about the structural and kinetic properties of specific systems. Finally, we explain how MSMs and other enhanced sampling algorithms can be used to drive efficient sampling.

ACKNOWLEDGMENTS

Thanks to my family and my God for giving me the passion, intellect, and opportunity to do this work. It is difficult to imagine life without the love, support, and training my parents, brother, and wife have given me. Graduate school, and life in general, have been much more enjoyable with the companionship of my beautiful wife Angela.
Thanks to my advisor, Vijay Pande, for being such a superb guide, for creating such an intellectually invigorating environment, and for being so generous with resources of all kinds. My lab-mates have also been great. I’m especially indebted to Xuhui Huang for helping to jump-start my progress by working so closely with me during my rotation and the early years of my PhD. Sergio Bacallado, Kyle Beauchamp, John Chodera, Dan Ensign, Imran Haque, Peter Kasson, Yu-Shan Lin, Paul Novick, and Vince Voelz were all great collaborators. Thanks to Jason Wagoner and Del Lucent for all the conversations about science, religion, politics, and philosophy.

Thanks to my committee members, Russ Altman and Dan Herschlag, for making time to help me along the way. Dan was especially generous in including me in his group and getting me into the wet-lab. Seb Doniach has also been like a co-advisor.

Table of Contents

List of tables
List of figures
Introduction

Chapter 1: Using generalized ensemble simulations and Markov state models to identify conformational states
    Abstract
    Introduction
    Description of Method
    Conclusions

Chapter 2: Progress and challenges in the automated construction of Markov state models for full protein systems
    Abstract
    Introduction
    Materials & Methods
    Results & Discussion
    Conclusions

Chapter 3: Molecular simulation of ab initio protein folding for a millisecond folder NTL9(1-39)
    Abstract
    Introduction
    Materials & Methods
    Results & Discussion
    Conclusions

Chapter 4: Protein folded states are kinetic hubs
    Abstract
    Introduction
    Results & Discussion
    Conclusions
    Materials & Methods

Chapter 5: Atomistic folding simulations of the five helix bundle protein Lambda6-85
    Abstract
    Introduction
    Results & Discussion
    Conclusions

Chapter 6: Enhanced modeling via network theory: Adaptive sampling of Markov state models
    Abstract
    Introduction
    Theoretical Underpinnings
    Results & Discussion