University of Cincinnati
Homology Modeling of Bovine Rhodopsin: Investigation of the Effect of Lipid Composition and Equilibration on Predicted Structure

A thesis submitted to the Division of Graduate Studies and Research of the University of Cincinnati in partial fulfillment of the requirement for the degree of MASTER OF SCIENCE in the Department of Chemical and Materials Engineering of the College of Engineering

2005

by Jonathan Burkhardt
B.S., University of Cincinnati, Cincinnati, OH, 2003

Committee:
Joel R. Fried, Ph.D.
Vadim V. Guliants, Ph.D.
Matthew D. Wortman, Ph.D.

ABSTRACT

Structure prediction techniques aim to bridge the gap between protein sequencing and the more complicated crystallographic techniques for creating three-dimensional protein structures. This work used homology modeling techniques to estimate a structure for bovine rhodopsin, the first G-protein-coupled receptor (GPCR) to have a known 3D structure, using bacteriorhodopsin as a template. Molecular dynamics simulations were performed with DMPC and POPC membranes to investigate the effect of bilayer composition. Homology modeling proved to be too limited for predicting the structure of bovine rhodopsin. The initial model differed from the crystal structure by an RMSD value of 9.46 Å for helical heavy atoms, which compares to 4.04 Å from MembStruk²³⁻²⁸, and 8.647 Å for just the alpha carbons (Cα) within the helices, which compares to 3.87 Å from PREDICT²⁹,³⁰. Results from simulations with a bilayer suggest that systems that contain DMPC show a minimal improvement relative to their POPC counterparts.
ACKNOWLEDGEMENTS

I would like to thank Dr. Joel Fried for all his time and assistance throughout this project. His guidance and knowledge helped me to better understand the topics that were outside of my previous academic experiences. I would like to dedicate this work to Dr. Rod Bush from P&G Pharmaceuticals, who was a tremendous help prior to his passing. His previous work had been very similar to the content of this project, and he gladly shared his insight and knowledge into the subject. I would also like to thank Dr. Vadim Guliants and Dr. Matthew Wortman for their time in serving as members of my thesis committee. I would also like to thank the Ohio Supercomputer Center and their technical help staff, specifically Troy Baer. Without their assistance in working through some software issues, it would not have been possible to complete this project in a timely fashion.

TABLE OF CONTENTS

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Introduction
Structure Determination & Prediction
Background: Bacteriorhodopsin & Bovine Rhodopsin
Methodology
Results & Discussion
Conclusion
References
Appendix

LIST OF TABLES

Table 1: Helix Sequence Alignment Table
Table 2: Averaged RMSD Values
Table 3: Averaged Values Relative to 1F88-prRh
Table 4: Control Values Within Systems and Final Energy
Table A-1: Averaged RMSD Values and Changes with Std. Dev.

LIST OF FIGURES

Figure 1: 2D Sketches of Lipids Used in this Work
Figure 2: Helices Viewed from Extracellular Side
Figure 3: Helices Viewed from the Side
Figure 4: Surface Nature of 1F88 & prRh
Figure A-1: Energy vs. Time
Figure A-2: Energy vs. Time (First 250 ps). This graph is a zoomed view of Figure A-1, shown to highlight the differences in energies during minimization.
Figure A-3: Heavy Atom RMSD for Equilibrated Systems that Contain 1F88 Relative to 1F88
Figure A-4: Backbone Atom RMSD for Equilibrated Systems that Contain 1F88 Relative to 1F88
Figure A-5: Heavy Atom RMSD for Equilibrated Systems with the Same Protein but Different Lipids
Figure A-6: Backbone Atom RMSD for Equilibrated Systems with the Same Protein but Different Lipids
Figure A-7: Heavy Atom RMSD for Equilibrated Systems that Contain prRh Relative to 1F88
Figure A-8: Backbone Atom RMSD for Equilibrated Systems that Contain prRh Relative to 1F88
Figure A-9: Heavy Atom RMSD for Equilibrated Systems that Contain prRh Relative to prRh
Figure A-10: Backbone Atom RMSD for Equilibrated Systems that Contain prRh Relative to prRh
Figure A-11: Heavy Atom RMSD for Equilibrated Systems that Contain the Same Lipid Bilayer 1F88 → prRh
Figure A-12: Backbone Atom RMSD for Equilibrated Systems that Contain the Same Lipid Bilayer 1F88 → prRh
Figure A-13: Change Relative to 1F88-prRh RMSD in Heavy Atom RMSD for Equilibrated Systems that Contain prRh Relative to 1F88
Figure A-14: Change Relative to 1F88-prRh RMSD in Backbone Atom RMSD for Equilibrated Systems that Contain prRh Relative to 1F88
Figure A-15: Change Relative to 1F88-prRh RMSD in Heavy Atom RMSD for Equilibrated Systems that Contain prRh Relative to the Equilibrated System that Contains 1F88 and the Same Lipid
Figure A-16: Change Relative to 1F88-prRh RMSD in Heavy Atom RMSD for Equilibrated Systems that Contain prRh Relative to the Equilibrated System that Contains 1F88 and the Same Lipid
Figure A-17: Chain A – Chain B RMSD

INTRODUCTION

The superfamily of proteins called G-protein-coupled receptors (GPCRs) makes up only ~3-4% of the human genome¹, yet it represents almost 50% of all drug targets². This makes GPCRs of great interest to both pharmaceutical and academic research focused on drug discovery and the function and malfunction of various human systems.
GPCRs respond to external stimuli, including ions, hormones, neurotransmitters, odors, and light³,⁴. There are three main subfamilies of GPCRs, designated A, B, and C; family A is the largest, accounting for nearly 90% of all GPCRs⁴. GPCRs range in length from fewer than 300 residues to more than 1100 in the case of the metabotropic glutamate receptor⁵. The majority of family A GPCRs range in length from 310 to 470 residues⁶. All GPCRs have seven transmembrane (TM) α-helices. It is hypothesized that helix motion is similar throughout all GPCRs. These family-wide characteristics allow researchers to predict structures and functions of other GPCRs based on information gathered from known GPCRs. Loop regions, the segments that connect the TM helices, have greater mobility than the helices and may contribute to GPCR function and G-protein coupling.

The three-dimensional (3D) structure of bacteriorhodopsin (bR), a light-sensitive protein found in bacteria, was initially available with a resolution of 3.5 Å in 1990⁷, with subsequent structures having higher resolutions. Although bR is not a GPCR, it has the same heptahelical structure found in GPCRs. The structure of bovine rhodopsin (bovR), a light-activated GPCR and currently the only GPCR with an available structure, was released with a resolution of 2.8 Å in 2000 [PDB code 1F88]⁴. Additional structures of bovR have been released to improve resolution [1HZX]⁸ or to correct the internal waters within bovR [1L9H]⁹. Both proteins have a permanently bound chromophore, retinal, that reorients and activates the protein when exposed to light. These 3D structures offer researchers an opportunity to run dynamic simulations, to calculate system and local energies in an attempt to explain function or malfunction, and to look at potential drug docking sites within the system. These structures can also be used as templates for other GPCRs when used in conjunction with structure estimation techniques.
Research on GPCRs is inhibited by two main factors. The first is the difficulty of obtaining a representative crystal structure of the protein of interest in a natural environment. As techniques for isolating and crystallizing other GPCRs are improved or developed, more structures will be available for analysis. The second is that computers are still not capable of simulating systems with full atomistic detail for a length of time that is long enough to incorporate the receptor, its coupled G-protein, and other molecules including water, ions, lipids, and drugs. For example, the total time of rhodopsin’s photocycle, the time it takes for activation and for protein and lipid reconfiguration, is on the order of milliseconds¹⁰, while the longest reported simulation is less than 100 ns. This maximum simulation time reflects molecular dynamics (MD) calculations only. Simulations involving quantum mechanical (QM) calculations in addition to MD, called hybrid methods, typically have a maximum time of 1-2 ns. Current simulations include either bovR with lipids and water, or arrestin, a regulatory protein that binds bovR, with only the cytoplasmic tail portion of bovR¹¹.

Methods for estimating protein structures from their sequences have been developed in order to study proteins that do not have existing structural data. These techniques overcome the difficulties of experimental structure determination by relying only on a protein’s sequence and the fact that a protein will form a structure that minimizes its total free energy¹². A protein’s sequence can be determined through a purely chemical process that only requires the protein to be intact. Many proteins are not surfactant/solvent resistant and will denature when they are exposed to the wrong separation media. This is a challenge in crystallographic techniques, but does not affect sequencing techniques.

There are various approaches to transitioning from a sequence to a 3D structure.
These approaches can be categorized as: ab initio; threading/potential energy calculations; homology modeling or sequence folding; and hybrid methods that rely on additional information, such as hydrophobicity, to build a model. Many molecular modeling software packages are currently available.