A Concise Guide to CHARMM and the Analysis of Protein Structure and Function
Robert Schleif
Biology Department
Johns Hopkins University
3400 N. Charles St.
Baltimore, MD 21218
1/8/06 9/17/13

Preface

Increasingly, biologists and biochemists are faced with understanding how their favorite proteins work. The structures of many of these proteins have been determined, and the structures of many more will be determined in the next few years. Once a protein's structure has been determined, it becomes possible, and also enticing, to design experiments probing the protein's mechanism of action. Tools for the graphical display of structure, the manipulation of the structure, and the calculation of various interaction energies all become interesting and important. Additionally, some properties of a protein may best be revealed by modeling the protein in water and simulating its molecular thermal motion at 300 K.

A researcher interested in protein structure and function faces the question of whether to use one of the complete, but expensive, computer programs for the manipulation and analysis of protein structure, to use a number of the highly specialized but almost completely undocumented programs that are available on the web, or to learn and use a powerful and general program that can perform most of the manipulations and calculations one might need.

This book is written for those who decide to follow the last course and to learn the program CHARMM (Chemistry at Harvard Macromolecular Mechanics), which was initiated in the laboratory of Dr. Martin Karplus. The program has been continuously refined and extended by many workers over the years since the initial publication, "CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations", J. Comp. Chem. 4, 187-217 (1983), by B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus.
This book describes the use of the program for structure analysis, model building, energy calculations, and dynamics simulations. Additionally, the program can perform Monte Carlo calculations, normal mode analysis, and free energy calculations, and can incorporate quantum mechanical calculations.

Revision Notes

In 2011 the PDB modified the format of coordinate files, placing segment identifiers on lines for water molecules. This necessitated changing the fixpdb.awk script. This change was made in 9/2013.

Contents

CHAPTER 1 FUNDAMENTALS

Introduction
Required Hardware, Software, and Computer Expertise
The Flavor of Linux
Sources of Information
Installing, Testing, and Basic Operation of CHARMM
Cartesian and Internal Coordinate Systems
Forces and Potential Energy
Hydrogen Bonds and CHARMM
Methods of Dynamics Calculations
The Verlet Propagation Algorithm
Achieving Precise but Convenient Structural Description of Systems
Description of Polymer Units, the Residue Topology File, RTF
Definition of Atom Properties and Interactions, the Parameters File, PARA
Coordinate Files
Description of a Complete System, The Principal Structure File, PSF
Explicit and Implicit Representation of Water
Arrays, and Built-in Substitution Parameters
Atom Selection
Units
More Useful Linux Commands
Some Refinements to CHARMM Scripts
Problems
Bibliography
Related Web Sites

CHAPTER 2 INPUTTING FILES AND COORDINATE CALCULATIONS

Reformatting Protein Data Bank Files for Input to CHARMM
Using awk to Reformat Protein Data Bank Files
Providing Missing Atoms and Coordinates
Reading AraC into CHARMM
Phi-Psi Angles in Proteins
Determining Phi-Psi Angles in AraC
Coordinate Manipulation Commands--Using CHARMM Documentation
Surface Area, Cavities and Holes in Proteins
Solvent Exposure of Residues in AraC
Looping, Loop Counters, and Calculation of Unfolded Surface Area
Finding Cavities and Holes in AraC
Handling Multisubunit Proteins and Reading in Multiple Coordinate Files
Identifying Residues Constituting a Dimerization Interface
RMS Overlaying Structurally Similar Molecules
Asymmetric Units, Biological Molecules and Unit Cells
Translating and Rotating a Subunit or Protein With Awk and With CHARMM
Constructing the Biological Dimer of Apo-AraC Protein and a Linux-CHARMM TRICK
Area of the Dimerization Interface of AraC
Distance Maps-Secondary Structure Identification in AraC
Distance Difference Maps, Application to Hemoglobin
Problems
Bibliography
Related Web Sites

CHAPTER 3 ENERGY MINIMIZATION AND RUNNING DYNAMICS SIMULATIONS

Methods of Energy Minimization
Energy Minimizing the Dimerization Domain of AraC
Considerations for a Dynamics Simulation
A Dynamics Run with the AraC Dimerization Domain
Langevin Dynamics
A Langevin Simulation of the AraC Dimerization Domain
A Simulation with Periodic Boundary Conditions
Reading Trajectories
Calculating an Interaction Energy at Intervals During a Trajectory
Writing out PDB Format Coordinates from a Trajectory File
Time Series Analysis, Reading Rotamer Angles