IMPLEMENTATION OF METHODS TO ACCURATELY PREDICT TRANSITION PATHWAYS AND THE UNDERLYING POTENTIAL ENERGY SURFACE OF BIOMOLECULAR SYSTEMS

By DELARAM GHOREISHI

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA 2019

© 2019 Delaram Ghoreishi

I dedicate this dissertation to my mother, my brother, and my partner, for their endless love, support, and encouragement.

ACKNOWLEDGMENTS

I am thankful to my advisor, Adrian Roitberg, for his guidance during my graduate studies. I am grateful for the opportunities he provided me and for allowing me to work independently. I also thank my committee members, Rodney Bartlett, Xiaoguang Zhang, and Alberto Perez, for their valuable input. I am grateful to the University of Florida Informatics Institute for providing financial support in 2016, allowing me to take a break from teaching and focus more on research. I would like to acknowledge my group members and friends for their moral support and technical assistance. Natali di Russo helped me become familiar with Amber. I thank Pilar Buteler, Sunidhi Lenka, and Vinicius Cruzeiro for daily conversations about science and life. Pancham Lal Gupta was my cpptraj encyclopedia. I thank my physicist colleagues, Ankita Sarkar and Dustin Tracy, who went through the intense physics coursework with me during the first year. I thank Farhad Ramezanghorbani, Justin Smith, Kavindri Ranasinghe, and Xiang Gao for helpful discussions regarding ANI and active learning. I also thank David Cerutti from Rutgers University for his help with the NEB implementation. I thank Pilar Buteler and Alvaro Gonzalez for the good times we had camping and climbing. Lastly, I express my sincere gratitude to Farhad Ramezanghorbani for always being there for me, for encouraging me, and for his significant scientific input. I also thank my mother, Fatemeh Kaheh, and my brother, Ramin Ghoreishi, for their love and encouragement at every step of my life. I am forever grateful to all three of them.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
  1.1 Minimum Energy Path Sampling
  1.2 Molecular Dynamics with Machine Learned Potentials

2 THEORY AND METHODS
  2.1 Statistical Mechanics
    2.1.1 Statistical Ensembles
    2.1.2 Microcanonical Ensembles: Constant N-V-E
    2.1.3 Canonical Ensembles: Constant N-V-T
    2.1.4 Isothermal-Isobaric Ensembles: Constant N-P-T
    2.1.5 Grand Canonical Ensembles: Constant µ-V-T
  2.2 Nudged Elastic Band
  2.3 String Method
  2.4 Free Energy and Transition Rate Calculations
  2.5 Computational Methods of Free Energy Calculations
    2.5.1 Free Energy Perturbation
    2.5.2 Thermodynamic Integration
    2.5.3 Bennett Acceptance Ratio
  2.6 Indirect Approach to Free Energy Calculations
  2.7 Feed-Forward Neural Networks
  2.8 Active Learning
  2.9 Transfer Learning
  2.10 ANI Neural Network Potentials
    2.10.1 Network Architecture
    2.10.2 Sampling the Chemical Space
      2.10.2.1 Normal Mode Sampling
      2.10.2.2 Molecular Dynamics Sampling

3 IMPLEMENTATION
  3.1 Implementation of Nudged Elastic Band in Amber
  3.2 Implementation of ANI-Amber Interface
  3.3 Sample Amber Input Files
    3.3.1 Sample Input File for NEB Simulations
    3.3.2 Sample Input File for ANI Simulations

4 NUDGED ELASTIC BAND: VALIDATION AND RESULTS
  4.1 Computational Details
    4.1.1 Test Case 1: Conformational Change of Alanine Dipeptide
    4.1.2 Test Case 2: α-helix to β-sheet Transition in Polyalanine
    4.1.3 Test Case 3: Base Eversion Pathway of the OGG1–DNA Complex
  4.2 Accuracy Tests
    4.2.1 Test Case 1: Conformational Change of Alanine Dipeptide
    4.2.2 Test Case 2: α-helix to β-sheet Transition in Polyalanine
    4.2.3 Test Case 3: Base Eversion Pathway of the OGG1–DNA Complex
  4.3 Timing Benchmarks

5 FREE ENERGY METHODS WITH MACHINE LEARNING
  5.1 Two Dimensional Energy Surface with ANI-Amber
  5.2 End-State Free Energy Corrections
    5.2.1 Conformational Free Energy with ANI-Amber
    5.2.2 Hydration Free Energy with ANI-Amber
      5.2.2.1 Data Preparation and Network Training
      5.2.2.2 Energy Prediction Results

6 CONCLUDING REMARKS AND FUTURE DIRECTIONS
  6.1 Final Remarks on Nudged Elastic Band
  6.2 Final Remarks on Free Energy Calculations with ANI-Amber

APPENDIX

A KABSCH ALGORITHM
B PARAMETERIZATION OF A CURVE
  B.1 Re-parameterization of a Curve
  B.2 Arclength of a Curve
  B.3 Arclength Parameterization
C DERIVATION OF EQUATION (2-42)
D PENALTY METHOD
E TWO DIMENSIONAL TEST POTENTIALS
  E.1 LEPS Potential
  E.2 LEPS Harmonic Oscillator Potential
F ALANINE DIPEPTIDE CONFORMATIONAL CHANGE

REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

5-1 Free energy difference of cis/trans conformational transition

LIST OF FIGURES

2-1 Mass and spring representation of NEB
2-2 Force decoupling representation of NEB
2-3 Thermodynamic cycle of indirect approach
2-4 Feed-forward neural network
2-5 Active learning work-flow
2-6 Transfer learning work-flow
2-7 Radial symmetry functions
2-8 ANI neural network potential
3-1 MPI implementation of NEB in Amber
3-2 GPU implementation of NEB in Amber
3-3 Amber molecular dynamics
3-4 ANI-Amber molecular dynamics
4-1 Alanine dipeptide potential energy surface
4-2 Energy of NEB replicas in alanine dipeptide
4-3 End to end distance in polyalanine
4-4 Glycosidic angle vs. eversion distance
4-5 Performance comparison between sander and pmemd
4-6 Performance comparison for different nebfreq values
4-7 Performance dependence on the size of the data transfers
5-1 Ethylene glycol
5-2 Two-dimensional energy surface with GAFF
5-3 Two-dimensional energy surface with ANI-1x
5-4 cis conformer of N-methyl acetamide
5-5 trans conformer of N-methyl acetamide
5-6 Umbrella sampling
5-7 Endpoint correction
5-8 Data preparation
5-9 Energy prediction results
5-10 Cumulative number of data-points
F-1 φ dihedral angle change of NEB replicas in alanine dipeptide
F-2 ψ dihedral angle change of NEB replicas in alanine dipeptide

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

IMPLEMENTATION OF METHODS TO ACCURATELY PREDICT TRANSITION PATHWAYS AND THE UNDERLYING POTENTIAL ENERGY SURFACE OF BIOMOLECULAR SYSTEMS

By Delaram Ghoreishi

December 2019
Chair: Adrian Roitberg
Major: Physics

This thesis focuses on a fast implementation of the nudged elastic band (NEB) method and on the interfacing of a deep neural network potential with the AMBER molecular dynamics package. The details of the implementations and validation results are explored within this document. The reliability of physics-based simulations is restricted by the accuracy of the potential energies that govern the dynamics of a system of particles, as well as by the efficiency and precision of the advanced sampling techniques. Biological systems often experience transitions that completely change their conformations and functionalities. Locating the minimum energy pathways (MEP) of such transitions provides an insightful understanding of their properties. Experimental and conventional computational methods, however, are limited to sampling structures around the minimum energy states of the system. Replicating the transition path requires methods that can correctly identify the unstable conformations along the path, including the transition states. In this document, we explain how NEB overcomes these sampling issues and successfully predicts the MEP, and we provide a graphics processing unit (GPU) accelerated implementation of NEB in the particle mesh Ewald molecular dynamics (pmemd) module of AMBER. This GPU-accelerated implementation significantly enhances MEP predictions for biomolecules experiencing

conformational transitions in high dimensional phase space, without a priori knowledge of the reaction coordinates. On another note, the applicability of precise ab initio methods is limited in scenarios that demand fast and cost-effective predictions of molecular interactions in complex systems. Advances in computing hardware, specifically GPUs, along with automated data-driven machine learning (ML), have significantly changed the way scientific research is conducted over the past decade. ANAKIN-ME (ANI) is a deep neural network potential trained to reproduce high-precision quantum mechanics (QM) energies and forces at significant speedups over ab initio methods. The interface between ANI and the AMBER software suite allows computational scientists to perform molecular dynamics simulations that are as accurate as QM methods at speeds comparable to classical force fields. Through this interface, it is also possible to use other features implemented in the AMBER package, such as NEB, constant pH, and replica exchange molecular dynamics.

CHAPTER 1 INTRODUCTION 1.1 Minimum Energy Path Sampling∗

The statistical behavior gleaned from simulations of biomolecules yields detailed information about observed biochemical phenomena. Computational biologists can probe the frequency and mechanism of rare transitions that are hard to observe in experiments by locating the minimum free energy pathways. However, traditional molecular dynamics of proteins and biopolymers often fail to sample these important transitions, as the systems are thermally limited to low energy states on a rugged free energy surface. Precision and reproducibility in the results have been limited by the cost of the calculations. The use of graphics processing units (GPUs) significantly accelerates these intensive calculations, offering a base multiplier for enhanced sampling strategies that can be implemented on their advanced architecture, but the multiplier by itself is not enough. The community needs a set of efficient simulation algorithms implemented on vector-accelerated architectures that enhance the exploration of free energy surfaces for detecting the multitude of rare transition pathways. Different methods have been developed for finding transition pathways1;2;3;4. Some depend only on the initial structure and follow a minimum ascent path to reach a final structure5;6;7;8. It is not guaranteed, however, that the desired final structure is reached2. Other methods use the second derivatives of the potential energy function to locate the saddle points9;10;11. Once the saddle points are identified, local minimum states can be found using the steepest descent algorithm. But, since calculation and diagonalization of the second derivative matrix at each step of the simulation is expensive, these methods are applicable only to small systems. Other approaches determine the path when both initial and final states are identified12;13. Among these methods, chain-of-states algorithms12;13;14;15

∗ Section 1.1 was reprinted/adapted with permission from Ghoreishi, D.; Cerutti, D. S.; Fallon, Z.; Simmerling, C.; Roitberg, A. E. Fast Implementation of the Nudged Elastic Band Method in AMBER. J. Chem. Theory Comput. 2019, 15, 4699–4707.

are compelling as they adjust and scale with resources to produce the desired precision and efficiency. The nudged elastic band (NEB)14 method in combination with simulated annealing, for instance, has proven successful in determining minimum energy paths (MEP) for rugged energy surfaces16;17. A minimum energy path is a transition passage connecting the initial and final states, for which any point along the path is at the minimum energy compared to other positions in the hyperplane perpendicular to the MEP at that point. Hence, the perpendicular component of the gradient of the potential energy at any point along the path is zero. NEB14, first proposed by Jónsson et al., is an evolution of methods such as self penalty walk (SPW)18, locally updated planes (LUP)19, and elastic band20. NEB results in a continuous representation of the MEP by simultaneous energy minimization of a series of connected replicas. This continuity is not guaranteed in LUP, in which the initial choice of the pathway could affect convergence to a connected path. In SPW, the converged path is not the MEP per se, but a path along which the averaged potential energy is minimized20. Moreover, the elastic band method can suffer from corner cutting and sliding down, problems that are easily excluded in NEB by force decoupling14. A more sophisticated approach to finding the MEP is the string method15, which results in a smoother path than NEB since it uses higher order interpolation schemes. In the string method, it is also possible to change the number of replicas on the fly. Moreover, methods such as temperature-dependent NEB (tNEB)21 and the finite temperature string method4 account for temperature corrections to the MEP. The purpose of this work is to accelerate the current implementation of NEB in the AMBER suite of molecular dynamics software22, in a way that is easily extendable to other chain-of-states methods for searching minimum energy paths, such as the string method.
Partial nudged elastic band (PNEB)16, which applies the NEB forces only to a user-defined subset of atoms, is the supported implementation of NEB in AMBER18. PNEB allows the use of NEB in systems with explicit solvent molecules, which are left alone to relax and adapt to the conformational changes of the system, free of any additional restraints

aside from their physical interactions with other molecules defined by force field parameters. Furthermore, this method decreases the communication overhead between the nodes by incorporating fewer atoms in the NEB calculations, which leads to better scaling of the code. We have extended PNEB by incorporating the routines into the particle mesh Ewald molecular dynamics (PMEMD) module of AMBER and further accelerated it by implementing those routines in CUDA to run on NVIDIA GPUs. In this implementation, a shuttle transfer was developed that minimizes the amount of data that must traverse the message passing interface (MPI) between GPUs. Additional performance enhancement has been made possible through a flag that lets users control the frequency at which NEB forces are computed. Running on NVIDIA Tesla P100 GPUs in parallel, the AMBER18 GPU-accelerated PNEB executes simulations more than 60 times faster than a two-core Intel Xeon Platinum 8160 CPU processor tasked with the same problem, with uncompromised numerical precision. The new implementation facilitates the study of biomolecules undergoing conformational transitions in a multidimensional phase space of thousands of atoms, which was otherwise hard to study with current CPU architectures. Our implementation of NEB for multi-GPU execution within the AMBER software suite could aid computational scientists in developing new drug compounds and novel materials by applying these powerful algorithms on commodity hardware. 1.2 Molecular Dynamics with Machine Learned Potentials

Computational simulations have revolutionized chemistry and physics by providing insight into physical phenomena at the atomic and molecular levels. Molecular properties can be obtained through a description of the electronic structure of any molecular system, which can be derived from high-level ab initio quantum mechanics (QM)23;24. As the number of electrons increases, the simple one-electron wavefunction needs to be replaced by a many-electron wavefunction that incorporates many-electron interactions25;26;27. The cost of these calculations becomes prohibitively high for all but the smallest systems. Various numerical approximations and computational techniques have been proposed to

accommodate the substantial cost of these calculations and generate results in a reasonable time frame. Hartree-Fock28;29;30;31 and post-Hartree-Fock32;33 methods, different forms of density functional theory (DFT)34;35, and empirical and classical physics-based methods36;37 are all widely used to approximate the exact solution. These approximations come at a cost: generally, the more accurate the method, the more computationally expensive it is. Among these techniques, coupled cluster theory is considered the gold standard in computational chemistry, delivering an accurate solution for many systems by accounting for electron correlation38;39;40. High accuracy coupled cluster methods such as CCSD(T)/CBS, however, are computationally expensive, with a scaling of O(N^7), and are applicable only to systems of tens of atoms41;42. On the other end of the spectrum, we find fast but less accurate classical physics-based methods, which parametrize a set of variables with a predefined functional form (a so-called force field) to reproduce experimental or quantum mechanical results36;43. These methods have been widely used to study large biomolecular systems, involving dynamical processes of proteins and drug molecules44;45. Computational techniques have become integral to the early stages of drug discovery, where the selected methods need to match the fast pace of drug design in a cost-effective manner. The applicability of highly accurate methods is limited in such scenarios, as they require massive computational resources. A revolution will come from a method that combines the two ends of the spectrum: as fast as classical force fields and as accurate as high-level QM models.

CHAPTER 2 THEORY AND METHODS 2.1 Statistical Mechanics

Two different approaches can be utilized to describe the properties of a thermodynamic system: macroscopic or microscopic. It is from these approaches that the two branches of thermodynamics, classical and statistical, emerge. In the macroscopic approach, the large scale properties of the system are considered. These properties can be perceived by human senses without the aid of magnifying devices; temperature, pressure, and volume are examples of macroscopic properties. The microscopic approach deals with the statistical properties of a large number of particles, on the order of Avogadro's number. Velocity, momentum, and kinetic energy are examples of microscopic properties. These two approaches, however distinct, are connected through a function called the partition function, from which all the macroscopic properties of a system in equilibrium can be derived. In a system with N particles, a microstate is a specific configuration that the particles of the system may occupy. If the system is in equilibrium, each microstate will occur with a certain probability, and the system is free to switch between different microstates as long as its macroscopic properties are unchanged; that is, even though the macroscopic properties are constant, the system itself is dynamic. Given a defined set of macroscopic properties, the various possible microstates of the system (considered all at once) form an ensemble. In other words, the statistical properties of a system depend on the distribution of the possible microstates of the system, and the knowledge of statistics can help us perceive a system consisting of discrete particles as a whole. In this picture, the partition function is a sum of weighted energetic functions of the allowed microstates, where the weights account for the probability of occurrence of the individual microstates.
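To make the weighted-sum picture concrete, the following Python sketch evaluates Boltzmann weights and their normalizing partition function for a toy two-level system (the energies and temperature are illustrative assumptions, not a system from this work):

```python
import math

def boltzmann_probabilities(energies, kT):
    """Microstate probabilities from Boltzmann weights; the partition
    function Z is the normalizing sum of the weights."""
    weights = [math.exp(-E / kT) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights], Z

# Hypothetical two-level system: energies 0 and 1 (in units of kT).
probs, Z = boltzmann_probabilities([0.0, 1.0], kT=1.0)
```

The lower-energy microstate carries the larger probability, and the probabilities sum to one by construction.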
The rest of this section establishes the basis of statistical ensembles and their partition functions for classical systems.

2.1.1 Statistical Ensembles

The Hamiltonian of a system consisting of N particles with decoupled kinetic and potential energies can be written as:

\[ H(p, q) = \sum_{i=1}^{3N} \frac{p_i^2}{2m} + U(q_1, q_2, \ldots, q_{3N}) \tag{2-1} \]

where q and p are the 3N sets of coordinates and momenta, respectively. From Hamilton's equations

\[ \dot{q}_i = \frac{\partial H}{\partial p_i} \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i} \tag{2-2} \]

it is possible to study the time evolution of a system with a known initial position (q_0, p_0) in the 6N-dimensional phase space. However, these sets of equations can be solved analytically only in a few cases. Most of the time, numerical methods are required to investigate the evolution of the system, and the accuracy of numerical solutions is limited by machine precision. Nevertheless, we are more concerned with the macroscopic properties of a thermodynamic system than with the exact solution of its constituents' states over time. For an ergodic Hamiltonian46;47, the time average is the same as the ensemble average. Hence, the macroscopic properties can be derived from a computer simulation run over a sufficiently long time to allow the system to pass through all possible microstates corresponding to a specific ensemble. For any physical quantity f, the ensemble average ⟨f⟩ is equal to:

\[ \langle f \rangle = \frac{\int f(q, p)\, p(q, p)\, d\omega}{\int p(q, p)\, d\omega} \tag{2-3} \]

with p(q, p) being the probability of the appearance of the representative microstates in the phase space and

\[ d\omega = \frac{dq\, dp}{N!\,(2\pi\hbar)^{3N}} = \frac{d^{3N}q\, d^{3N}p}{N!\,(2\pi\hbar)^{3N}} \tag{2-4} \]

being the normalized volume element of the phase space, which is equivalent to the volume of the shell between the limits (p − Δ/2, q − Δ/2) and (p + Δ/2, q + Δ/2) divided by the volume corresponding to a single microstate, ω_0 = (2πħ)^{3N}. The N! term is

18 a correction factor to account for the permutation of the N indistinguishable particles, and

ħ is the reduced Planck constant. According to the Heisenberg uncertainty principle48, the minimum uncertainty of a simultaneous measurement of the 6N-dimensional coordinates and momenta is of the order of:

\[ \Delta q\, \Delta p = \prod_{i=1}^{3N} \Delta q_i\, \Delta p_i \simeq (\hbar/2)^{3N} \tag{2-5} \]

which is of the order of the volume assigned to one microstate, ω_0. Other considerations contributing to the derivation of the exact value of the volume of one microstate are out of the scope of this work and can be pursued in the work of Bose49. The normalizing factor in the denominator of equation 2-3 is the partition function.

2.1.2 Microcanonical Ensembles: Constant N-V-E

Consider an isolated system of N identical and indistinguishable particles, confined to a volume V with a given total energy E. This system evolves on a (6N−1)-dimensional constant-energy hypersurface. As a result of the equal a priori probabilities, the system can be at any point on this surface with equal likelihood, so the probability of finding a particular microstate can be written as:

\[ p(q, p) = \frac{\delta(H(q, p) - E)}{\int \delta(H(q, p) - E)\, d\omega} \tag{2-6} \]

The denominator of the above equation is the partition function of the microcanonical ensemble,

\[ \Omega(N, V, E) = \frac{1}{N!\,(2\pi\hbar)^{3N}} \int \delta(H(q, p) - E)\, d^{3N}q\, d^{3N}p \tag{2-7} \]

The entropy of the system is related to the partition function by:

\[ S(N, V, E) = k_B \ln \Omega(N, V, E) \tag{2-8} \]

in which k_B is the Boltzmann constant.

2.1.3 Canonical Ensembles: Constant N-V-T

The canonical ensemble is a closed system of N particles confined to volume V that is in equilibrium with a heat reservoir with temperature T. In this system, the canonical probability of finding a particular microstate at a certain point in the phase space depends on the corresponding energy value of the Hamiltonian at that point.

\[ p(q, p) = \frac{\exp(-H(q, p)/k_B T)}{\int \exp(-H(q, p)/k_B T)\, d\omega} \tag{2-9} \]

The canonical ensemble partition function acts as a normalizing factor.

\[ Q(N, V, T) = \frac{1}{N!\,(2\pi\hbar)^{3N}} \int \exp(-H(q, p)/k_B T)\, d^{3N}q\, d^{3N}p \tag{2-10} \]

The Helmholtz free energy is related to the partition function by:

\[ A(N, V, T) = -k_B T \ln Q(N, V, T) \tag{2-11} \]
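As a sanity check of equations 2-10 and 2-11, the sketch below evaluates the canonical partition function of a single one-dimensional harmonic oscillator by direct quadrature and compares it with the analytic classical result Q = k_BT/(ħω). The reduced units (m = ω = ħ = k_B = 1) are an assumption chosen for illustration:

```python
import numpy as np

# Assumed reduced units: m = omega = hbar = 1.
m, omega, hbar = 1.0, 1.0, 1.0

def canonical_Q(kT, n=200001, lim=20.0):
    """Q = (1/(2*pi*hbar)) * integral of exp(-H/kT) dq dp for a 1D
    harmonic oscillator (eq. 2-10 with a single degree of freedom and
    no indistinguishability factor)."""
    q = np.linspace(-lim, lim, n)
    dq = q[1] - q[0]
    # Position and momentum integrals factorize for H = p^2/2m + m w^2 q^2/2.
    Iq = np.sum(np.exp(-0.5 * m * omega**2 * q**2 / kT)) * dq
    Ip = np.sum(np.exp(-q**2 / (2.0 * m * kT))) * dq  # reuse grid for p
    return Iq * Ip / (2.0 * np.pi * hbar)

kT = 1.0
Q = canonical_Q(kT)
A = -kT * np.log(Q)  # Helmholtz free energy, eq. 2-11
# Classical analytic value: Q = kT/(hbar*omega) = 1, hence A = 0.
```

The quadrature reproduces the analytic value to high accuracy, illustrating that the partition function is nothing more than the normalized phase-space integral of the Boltzmann factor.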

2.1.4 Isothermal-Isobaric Ensembles: Constant N-P-T

In an isothermal-isobaric ensemble, the system has a fixed number of particles N at a constant pressure P and is in equilibrium with a heat reservoir with temperature T. The probability of a microstate of this system having pressure P and total energy E corresponding to Hamiltonian H(q, p) is equal to:

\[ p(q, p) = \frac{\exp(-(H(q, p) + PV)/k_B T)}{\int_0^{\infty} \int \exp(-(H(q, p) + PV)/k_B T)\, d\Omega\, dV} \tag{2-12} \]

in which

\[ d\Omega = \frac{dq\, dp}{N!\,\lambda^{3N}} = \frac{d^{3N}q\, d^{3N}p}{N!\,\lambda^{3N}} \tag{2-13} \]

with λ being the thermal de Broglie wavelength, acting as a normalization factor coming from the 3N integrals over the momenta:

\[ \int_{-\infty}^{\infty} \exp(-p^2/2m k_B T)\, d^{3N}p = (2\pi m k_B T)^{3N/2} \tag{2-14} \]

\[ \lambda = \frac{h}{\sqrt{2\pi m k_B T}} \tag{2-15} \]

The isothermal-isobaric partition function is equal to:

\[ \Delta(N, P, T) = \frac{1}{N!\,\lambda^{3N}} \int_0^{\infty} \int \exp(-(H(q, p) + PV)/k_B T)\, d^{3N}q\, d^{3N}p\, dV \tag{2-16} \]

The Gibbs free energy is related to the partition function by:

\[ G(N, P, T) = -k_B T \ln \Delta(N, P, T) \tag{2-17} \]
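As a concrete numeric example of the thermal de Broglie wavelength in equation 2-15 (argon at 300 K is an arbitrary illustrative choice, not a system from this work):

```python
import math

h = 6.62607015e-34   # Planck constant, J*s
kB = 1.380649e-23    # Boltzmann constant, J/K

def thermal_wavelength(mass_kg, T):
    """Thermal de Broglie wavelength, eq. 2-15."""
    return h / math.sqrt(2.0 * math.pi * mass_kg * kB * T)

m_Ar = 39.948 * 1.66053906660e-27  # mass of an argon atom in kg
lam = thermal_wavelength(m_Ar, 300.0)  # roughly 0.16 angstrom
```

The result is far smaller than typical interatomic distances, which is the usual justification for treating such systems with the classical partition functions used throughout this section.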

2.1.5 Grand Canonical Ensembles: Constant µ-V-T

Consider a system confined to volume V that is allowed to exchange particles and energy with a large reservoir. At equilibrium, the reservoir and the system have a common temperature T and chemical potential µ. The probability of a microstate of this system having

N particles and total energy E_N corresponding to Hamiltonian H(q, p) is equal to:

\[ p(q, p) = \frac{\exp(-(H(q, p) - \mu N)/k_B T)}{\sum_{N=0}^{\infty} \int \exp(-(H(q, p) - \mu N)/k_B T)\, d\omega} \tag{2-18} \]

with the grand canonical partition function being:

\[ \Xi(\mu, V, T) = \sum_{N=0}^{\infty} \frac{1}{N!\,(2\pi\hbar)^{3N}} \int \exp(-(H(q, p) - \mu N)/k_B T)\, d^{3N}q\, d^{3N}p \tag{2-19} \]

The pressure of the system is related to the partition function by:

\[ P(\mu, V, T) = \frac{k_B T}{V} \ln \Xi(\mu, V, T) \tag{2-20} \]
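As a consistency check of equations 2-19 and 2-20, the sketch below sums the grand canonical partition function for the special case of an ideal gas (U = 0), where the configurational integral contributes V^N and the momentum integration of equation 2-14 supplies the λ^{3N} factor. The reduced-unit parameter values are arbitrary assumptions for illustration:

```python
import math

def grand_partition_ideal_gas(mu, V, kT, lam, nmax=200):
    """Xi = sum_N (1/N!) * (exp(mu/kT) * V / lam**3)**N, i.e. eq. 2-19
    evaluated for an ideal gas.  The sum is accumulated term by term
    to avoid overflowing factorials."""
    x = math.exp(mu / kT) * V / lam**3
    term, total = 1.0, 1.0  # N = 0 term
    for N in range(1, nmax):
        term *= x / N
        total += term
    return total

mu, V, kT, lam = -1.0, 10.0, 1.0, 1.0   # arbitrary reduced units
Xi = grand_partition_ideal_gas(mu, V, kT, lam)
P = (kT / V) * math.log(Xi)             # pressure from eq. 2-20
N_avg = math.exp(mu / kT) * V / lam**3  # mean particle number
# For the ideal gas, P * V reproduces N_avg * kT (the ideal gas law).
```

Since the sum collapses to Xi = exp(e^{µ/kT} V/λ³) for this case, equation 2-20 immediately recovers PV = ⟨N⟩k_BT, a useful sanity check that the signs and prefactors above are consistent.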

2.2 Nudged Elastic Band∗

NEB applies a series of harmonic restraints between replicas (or images), which are first generated along a putative pathway. Replicas are linked to their nearest neighbors via springs,

∗ Section 2.2 was reprinted/adapted with permission from Ghoreishi, D.; Cerutti, D. S.; Fallon, Z.; Simmerling, C.; Roitberg, A. E. Fast Implementation of the Nudged Elastic Band Method in AMBER. J. Chem. Theory Comput. 2019, 15, 46994707.

such that the entire system represents a discrete pathway, from reactants to products. The purpose of the springs is to distribute the replicas along the path and prevent them from sliding down to the minimum states14. The replicas evolve into a discrete representation of MEP by simultaneous energy minimization of the entire chain. Setting N − 2 replicas between the initial and final states, positions of the discrete points can be denoted by the array

[R_1, R_2, R_3, ..., R_N], where R_1 and R_N are the two endpoints, which are kept fixed in phase space throughout the simulation. Some variants of the NEB algorithm do not require fixed endpoints50;51;52. Figure 2-1 shows a mass and spring representation of NEB. Each replica is an atomic representation of the system at a certain position along the pathway that connects the initial and final states.

Figure 2-1. Mass and spring representation of the nudged elastic band method. Each circle represents an individual simulation called replica which is bound to its nearest neighbors by harmonic potentials modeled as springs in the figure.

If no guess for the reaction coordinate is available, this pathway can be constructed by placing half of the replicas on or close to the initial structure and the other half on or close to the final structure. This way of initializing the path requires shorter timesteps and weaker springs at the beginning of the simulation to ensure that the very stretched central spring does not exert strong forces on the particles of the system, allowing the ensemble to slowly approach a smooth path. Translational and rotational differences between adjacent replicas should be minimized before the calculation of the spring forces17. First, the translational differences are removed by placing the origin of the coordinate system at the corresponding center of mass (COM) coordinates. Then an optimal rotation matrix is applied to the coordinates of the neighboring replicas to minimize the root mean square deviation (RMSD) between the two sets of atomic coordinates53.
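A minimal NumPy sketch of this alignment step (COM removal followed by the optimal rotation of the Kabsch algorithm, described in appendix A) is shown below. Equal atomic masses are assumed for simplicity, and this is an illustration rather than the actual AMBER implementation:

```python
import numpy as np

def kabsch_align(P, Q):
    """Align replica coordinates P (n_atoms x 3) onto neighbor Q:
    remove the translational difference via the centroids, then apply
    the optimal rotation from an SVD of the covariance matrix."""
    P0 = P - P.mean(axis=0)
    Q0 = Q - Q.mean(axis=0)
    H = P0.T @ Q0                       # 3x3 covariance matrix
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T                  # optimal proper rotation
    return P0 @ R.T + Q.mean(axis=0)    # rotated P in Q's frame
```

Using the aligned coordinates in place of the raw neighbor positions removes the rigid-body contribution to the inter-replica distance before the spring forces are evaluated.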

A tangent vector at each image position (τ_i) is responsible for decoupling the spring forces and the potential forces to prevent them from interfering. Only the perpendicular component of the forces defined by the force field parameters, and the parallel component of the spring forces, are considered in the equations of motion (refer to the inset in figure 2-2). The tangents are defined only for the subspace of the atoms included in the NEB force calculations, following previous work by Bergonzo et al.16. The total force acting on these atoms hence includes two orthogonal components:

\[ F_i^{NEB} = F_i^{\parallel} + F_i^{\perp} \tag{2-21} \]

\[ F_i^{\perp} = -\nabla V(R_i) + (\nabla V(R_i) \cdot \tau_i)\, \tau_i \tag{2-22} \]

\[ F_i^{\parallel} = (F_i^{s} \cdot \tau_i)\, \tau_i \tag{2-23} \]

where F_i^s is the spring force at the position of the i-th replica, and ∇V(R_i) is the potential force described by the force field. The tangents are defined based on the energy of the replica itself and its neighbors as:

\[ \tau_i = \begin{cases} R_{i+1} - R_i & \text{if } V_{i+1} > V_i > V_{i-1} \\ R_i - R_{i-1} & \text{if } V_{i+1} < V_i < V_{i-1} \end{cases} \tag{2-24} \]

in which V_i = V(R_i). In this definition, only the position of the neighboring replica with higher energy is considered. If both of the neighboring replicas have either higher or lower energy with respect to replica i, that is, if

\[ V_{i+1} > V_i < V_{i-1} \qquad \text{or} \qquad V_{i+1} < V_i > V_{i-1} \]

then a weighted average will be used to define the tangent estimates:

\[ \tau_i = \begin{cases} (R_{i+1} - R_i)\,\Delta V_i^{max} + (R_i - R_{i-1})\,\Delta V_i^{min} & \text{if } V_{i+1} > V_{i-1} \\ (R_{i+1} - R_i)\,\Delta V_i^{min} + (R_i - R_{i-1})\,\Delta V_i^{max} & \text{if } V_{i+1} < V_{i-1} \end{cases} \tag{2-25} \]

where

\[ \Delta V_i^{max} = \max(|V_{i+1} - V_i|,\, |V_{i-1} - V_i|) \]

\[ \Delta V_i^{min} = \min(|V_{i+1} - V_i|,\, |V_{i-1} - V_i|) \]

For more detailed information on the definition of tangents, the reader can refer to reference 54. The decoupling ensures a smooth convergence to the path and prevents the images from corner-cutting or sliding down14. The force projection decouples the dynamics of the images from the discrete distribution of the images along the path, such that only the true forces are responsible for relaxation of the images while the spring forces keep the images away from the minimum states. Figure 2-2 shows a schematic representation of the NEB force decoupling on a two-dimensional LEPS harmonic potential energy surface. For more information regarding this potential model refer to appendix A of reference 14.
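The tangent rule and force decoupling above can be sketched in a few lines of NumPy. This is an illustrative toy, not the AMBER implementation: the spring force magnitude follows the common |R_{i+1} − R_i| − |R_i − R_{i−1}| form, and the endpoints are simply held fixed:

```python
import numpy as np

def neb_forces(R, V, gradV, k=1.0):
    """NEB forces (eqs. 2-21 to 2-25) for a chain of replicas R[i]
    (flattened coordinate arrays) with energies V[i].  Endpoint
    replicas receive zero force (kept fixed)."""
    n = len(R)
    F = [np.zeros_like(R[0]) for _ in range(n)]
    for i in range(1, n - 1):
        # Energy-based tangent, eq. 2-24 / 2-25.
        if V[i + 1] > V[i] > V[i - 1]:
            tau = R[i + 1] - R[i]
        elif V[i + 1] < V[i] < V[i - 1]:
            tau = R[i] - R[i - 1]
        else:
            dmax = max(abs(V[i + 1] - V[i]), abs(V[i - 1] - V[i]))
            dmin = min(abs(V[i + 1] - V[i]), abs(V[i - 1] - V[i]))
            if V[i + 1] > V[i - 1]:
                tau = (R[i + 1] - R[i]) * dmax + (R[i] - R[i - 1]) * dmin
            else:
                tau = (R[i + 1] - R[i]) * dmin + (R[i] - R[i - 1]) * dmax
        tau = tau / np.linalg.norm(tau)
        # Perpendicular true force, eq. 2-22.
        g = gradV(R[i])
        f_perp = -g + np.dot(g, tau) * tau
        # Parallel spring force along the tangent (eq. 2-21/2-23).
        f_spring = k * (np.linalg.norm(R[i + 1] - R[i])
                        - np.linalg.norm(R[i] - R[i - 1]))
        F[i] = f_perp + f_spring * tau
    return F
```

For a chain of equally spaced replicas lying exactly on a straight MEP of a quadratic surface, both components vanish and the chain is stationary, which is the converged state the projection is designed to reach.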

Figure 2-2. Two-dimensional potential energy surface of LEPS potential coupled to a harmonic oscillator14. LEPS harmonic oscillator potential represents the energy of a system of four atoms. Atoms A and C are fixed. Atom B is allowed to move on the line connecting A and C. Another degree of freedom in terms of a harmonic oscillator is introduced by adding atom D that is coupled to B (refer to appendix A of reference 14). The X and Y axes in the figure represent the distance between atoms A and B and atoms B and D, respectively. The figure on the left shows a schematic representation of an initial path (the dashed straight line) and the replicas along that path (the circles). The force decoupling of one of the replicas is shown. The figure on the right represents a schematic MEP.

The minimization step would take the chain towards the local minimum that is most accessible to the initial path. Simulated annealing can increase the chance of NEB simulations

converging to the global MEP rather than a local MEP. Supervision may be required to prevent temperature increases from producing unphysical structures during the simulation. In the final phase of the simulation, a gradual decrease of the temperature to zero freezes the replicas along the minimum transition path. It is possible that the biological system of interest has multiple pathways connecting the two metastable states. This necessitates statistical analysis of multiple independent simulations resulting in different pathways55;56, which once again illustrates the importance of fast simulation techniques.

2.3 String Method

An alternative method for predicting transition pathways is the use of a parametrized string that evolves to the most probable path between two locations on the energy surface, i.e., the MEP. The original string method15 (the zero temperature string method) was developed for smooth energy landscapes. Within two years of their original paper, E et al. published the finite temperature string method4, which incorporates temperature effects for rough energy surfaces. The string evolves to the MEP under the potential forces while a specific parametrization is imposed. Arc-length parameterization results in equally spaced points along the string, while energy-weighted arc length yields better resolution around the transition state region. For specifics regarding the parametrization of a curve, refer to Appendix B. The mathematical derivations for the zero and finite temperature string methods from E et al.15;4 are presented below. To derive the equations for the dynamics of the string, consider the Langevin equation with a friction coefficient γ and a white noise ζ satisfying:

\[ F = -\nabla V(q) - \gamma \dot{q} + \zeta(t) \tag{2-26} \]

\[ \langle \zeta_i(t)\, \zeta_j(t') \rangle = 2 \gamma k_B T\, \delta_{ij}\, \delta(t - t') \tag{2-27} \]
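Equations 2-26 and 2-27 can be integrated numerically. The sketch below uses a minimal Euler-Maruyama scheme for the overdamped limit (the inertial term is dropped, so γq̇ = −∇V + ζ); all function names and parameter values are illustrative choices of our own:

```python
import numpy as np

def overdamped_langevin(grad_v, q0, n_steps, dt, gamma=1.0, kt=1.0, seed=0):
    """Euler-Maruyama trajectory for gamma*dq/dt = -grad V(q) + zeta(t),
    with <zeta_i(t) zeta_j(t')> = 2*gamma*kB*T*delta_ij*delta(t-t')."""
    rng = np.random.default_rng(seed)
    q = np.atleast_1d(np.array(q0, dtype=float))
    traj = np.empty((n_steps + 1,) + q.shape)
    traj[0] = q
    for n in range(n_steps):
        noise = rng.standard_normal(q.shape)
        # drift from the potential plus thermal noise of the right strength
        q = q - grad_v(q) * dt / gamma + np.sqrt(2.0 * kt * dt / gamma) * noise
        traj[n + 1] = q
    return traj
```

For a harmonic well V(q) = q²/2, the stationary distribution approaches the Boltzmann distribution with variance k_BT, which is a convenient sanity check.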

Assuming the potential V(q) has at least two minima, the purpose is to find the MEP going from an initial minimum state (A) to a final minimum state (B). A string φ* lies on the MEP if it satisfies

\[ (\nabla V)^{\perp}(\varphi^{*}) = 0 \tag{2-28} \]

where (∇V)⊥ is the component of the potential gradient in the hyperplane perpendicular to the string. The MEP is the stationary solution of the evolution of the string in the perpendicular hyperplane, whose normal velocity is

\[ u^{\perp} = -(\nabla V)^{\perp}(\varphi) \tag{2-29} \]

Assuming an intrinsic parameterization of the string for numerical purposes, equation 2-29 can be written as

\[ \varphi_t = -(\nabla V)^{\perp}(\varphi) + r \hat{t} \tag{2-30} \]

with \(\hat{t} = \varphi_\alpha / |\varphi_\alpha|\) being the unit tangent along φ. The scalar term r = r(α, t) is a Lagrange multiplier responsible for maintaining the intrinsic parameterization through an imposed constraint. If the parameterization is normalized arc length, with α = 0 at the initial state and α = 1 at the final state, then the constraint is

\[ \bigl( |\varphi_\alpha| \bigr)_\alpha = 0 \tag{2-31} \]
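As an illustration of this constraint, a toy zero-temperature string update can alternate a steepest-descent move with a redistribution of the nodes to equal arc length, a simplified stand-in for the Lagrange multiplier term of equation 2-30 (the double-well potential and all names below are our own):

```python
import numpy as np

def string_step(phi, grad_v, dt):
    """One string iteration: move interior nodes downhill, then
    reparametrize all nodes to uniform arc length (endpoints fixed)."""
    phi = phi.copy()
    phi[1:-1] -= dt * np.array([grad_v(p) for p in phi[1:-1]])
    seg = np.linalg.norm(np.diff(phi, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])      # cumulative arc length
    s_new = np.linspace(0.0, s[-1], len(phi))        # uniform arc-length grid
    return np.column_stack([np.interp(s_new, s, phi[:, d])
                            for d in range(phi.shape[1])])

# double well V = (x^2 - 1)^2 + y^2 with minima at (-1, 0) and (1, 0)
grad = lambda p: np.array([4.0 * p[0] * (p[0] ** 2 - 1.0), 2.0 * p[1]])
# bowed initial guess connecting the two minima
phi = np.column_stack([np.linspace(-1.0, 1.0, 21),
                       np.sin(np.linspace(0.0, np.pi, 21))])
for _ in range(2000):
    phi = string_step(phi, grad, 0.01)
```

After the loop, the interior nodes have relaxed onto the straight MEP y = 0 and are equally spaced along it.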

Energy-weighted arc-length parameterization, however, is achieved using the constraint

\[ \bigl[ f\bigl(V(\varphi)\bigr)\, |\varphi_\alpha| \bigr]_\alpha = 0 \tag{2-32} \]

where f(V(φ)) is a function with a negative first derivative. The finite temperature string method is a generalization of the zero temperature string method which, instead of evolving one string, evolves an ensemble of strings \(\{\varphi^w\}\), each of which obeys

\[ \varphi_t^w = -(\nabla V)^{\perp}(\varphi^w) + (\eta^w)^{\perp} + r \hat{t} \tag{2-33} \]

Here ηw is a white noise satisfying

\[ \langle \eta_i^w(\alpha, t)\, \eta_j^w(\alpha', t') \rangle = 2 k_B T\, \delta_{ij}\, \delta(t - t')\, \delta_{\alpha \alpha'} \tag{2-34} \]

The ensemble mean, φ(α), is defined as the averaged curve over the ensemble of strings:

\[ \langle \varphi^w(\alpha) \rangle \equiv \varphi(\alpha) \tag{2-35} \]

The density function for this system is given by

\[ \rho(q, \alpha) = Q^{-1}(\alpha)\, \exp(-\beta V(q))\, \delta_{S(\alpha)}(q) \tag{2-36} \]

with the partition function Q acting as the normalization constant:

\[ Q(\alpha) = \int_{S(\alpha)} \exp(-\beta V(q))\, d\sigma \tag{2-37} \]

The ensemble mean φ(α) and the effective transition tube radius (the standard deviation) R(α) can then be determined from

\[ \varphi(\alpha) = Q^{-1}(\alpha) \int_{S(\alpha)} q\, \exp(-\beta V(q))\, d\sigma \tag{2-38} \]

and

\[ R^2(\alpha) = \lambda\, Q^{-1}(\alpha) \int_{S(\alpha)} | q - \varphi(\alpha) |^2\, \exp(-\beta V(q))\, d\sigma \tag{2-39} \]

with λ being a free constant parameter of order unity.

2.4 Free Energy and Transition Rate Calculations

Once the MEP has been identified, the free energy profile along the path can be determined through various free energy calculation techniques, such as umbrella sampling57. A rough estimate of the MEP often suffices to identify the region that needs to be sampled via umbrella sampling. The free energy difference between two points along the path can be obtained from

\[ F(\alpha) - F(0) = -k_B T \ln\!\left[ \frac{Q(\alpha)}{Q(0)} \right] \tag{2-40} \]

where Q(α) is the partition function restricted to the hyperplane S(α) normal to the MEP:

\[ Q(\alpha) = \int_{S(\alpha)} \exp(-\beta V(q))\, d\sigma \tag{2-41} \]

Combining equations 2-40 and 2-41 results in (for the derivation refer to Appendix C):

\[ F(\alpha) - F(0) = \int_0^{\alpha} \bigl\langle (\nabla V \cdot \hat{t}) \bigl[ (\hat{t} \cdot \varphi)_{\alpha} - \hat{t}_{\alpha} \cdot \varphi \bigr] \bigr\rangle\, d\alpha \tag{2-42} \]

where the angle bracket denotes the expectation value, i.e., the ensemble average over the distribution restricted to the hyperplane S(α). Transition rates can be calculated in terms of the free energy difference between the initial state and the transition state:

\[ k = K e^{-\beta \Delta F} \tag{2-43} \]

where K can be a constant defining the rate based on the frequency of collisions, as in the Arrhenius equation, or, if the assumptions of transition state theory hold58, it can take the value \(k_B T / h\). More accurate expressions for K can be derived based on Kramers' argument (see Chapter 9 of reference 59). Since our interest lies in finding the relative transition rates of mutated biomolecules with respect to each other, a discussion of how to derive K accurately is beyond the focus of this study.

2.5 Computational Methods of Free Energy Calculations

This section establishes the theoretical background for some of the most popular computational methods of free energy calculations. As explained in section 2.1.3, the Helmholtz free energy of a canonical ensemble can be calculated from its partition function:

\[ A(N,V,T) = -k_B T \ln Q(N,V,T) \tag{2-44} \]

In most systems, the contribution of the kinetic energy to the canonical partition function can be calculated analytically by performing a trivial integration. Hence, it suffices to devote our attention to the contribution of the potential energy to the free energy. This contribution depends on the configuration of the system and is accordingly called the configurational free energy. In this section, the terms Hamiltonian and potential energy may thus be used interchangeably; likewise, free energy and partition function refer to the configurational contributions to the free energy and the partition function.

2.5.1 Free Energy Perturbation

A transformation in a system can be described as a change in its Hamiltonian. Consider a system transforming from an initial state 'a' with the Hamiltonian H_a to a final state 'b' with the Hamiltonian H_b, for which the change in potential energy is ΔV = V_b − V_a. This transformation causes a change in the configurational free energy of the system that can be calculated from:

\[
\begin{aligned}
\Delta A &= A_b - A_a \\
&= -k_B T \ln \frac{Q_b(N,V,T)}{Q_a(N,V,T)} \\
&= -k_B T \ln \frac{\int \exp(-V_b(q)/k_B T)\, dq}{\int \exp(-V_a(q)/k_B T)\, dq} \\
&= -k_B T \ln \frac{\int \exp(-\Delta V(q)/k_B T)\, \exp(-V_a(q)/k_B T)\, dq}{\int \exp(-V_a(q)/k_B T)\, dq} \\
&= -k_B T \ln \bigl\langle \exp(-\Delta V(q)/k_B T) \bigr\rangle_a
\end{aligned}
\tag{2-45}
\]

where \(\langle \exp(-\Delta V(q)/k_B T) \rangle_a\) is the ensemble average of the Boltzmann factor in the reference state 'a'. Equation 2-45 is the free energy perturbation (FEP) method. It is also called exponential averaging, or the Zwanzig equation, after Robert Zwanzig, who first derived it60. Equation 2-45 is usually solved numerically for systems that include thousands of interacting particles. For this purpose, thousands of structures are sampled using molecular dynamics or Monte Carlo simulations, and single-point energies of these structures are calculated with the two different Hamiltonians. If the phase space overlap between the two Hamiltonians is adequate, i.e., if the structures sampled with one Hamiltonian are representative of structures created with the other, the FEP formulation is advantageous: it is then possible to generate numerous statistically independent structures at the cheaper Hamiltonian and calculate the free energy differences at the expensive Hamiltonian. However, the Boltzmann factors in equation 2-45 are often dominated by the contributions of a few structures with large energy values, which leads to large statistical instabilities. These instabilities can be suppressed by expanding the exponential term using either the Taylor series or the cumulant expansion61. Expanding the exponential term with a Taylor series results in

\[ \langle \exp(-\beta \Delta V) \rangle_a = \Bigl\langle \sum_{n=0}^{\infty} \frac{(-\beta)^n}{n!}\, \Delta V^n \Bigr\rangle_a = \sum_{n=0}^{\infty} \frac{(-\beta)^n}{n!}\, \langle \Delta V^n \rangle_a \tag{2-46} \]

in which β = 1/(k_B T). If the difference in the potential energies, ΔV, is small, the expansion can be approximated using the first few terms of the series. Using the cumulant expansion theorem61, the exponential term can be written as

\[ \langle \exp(-\beta \Delta V) \rangle_a = \exp\Bigl\{ \sum_{n=1}^{\infty} \frac{(-\beta)^n}{n!}\, \langle \Delta V^n \rangle_{a,c} \Bigr\} \tag{2-47} \]

in which \(\langle \Delta V^n \rangle_{a,c}\) is the nth cumulant of ΔV in the reference state 'a'. The cumulants can be written as averages of powers of ΔV; up to third order they are given by

\[
\begin{aligned}
\langle \Delta V \rangle_{a,c} &= \langle \Delta V \rangle_a \\
\langle \Delta V^2 \rangle_{a,c} &= \langle \Delta V^2 \rangle_a - \langle \Delta V \rangle_a^2 \\
\langle \Delta V^3 \rangle_{a,c} &= \langle \Delta V^3 \rangle_a - 3 \langle \Delta V^2 \rangle_a \langle \Delta V \rangle_a + 2 \langle \Delta V \rangle_a^3
\end{aligned}
\tag{2-48}
\]
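The exponential average of equation 2-45 and its second-order cumulant truncation can be compared numerically. For a Gaussian-distributed ΔV the cumulant expansion terminates at second order, so the two estimates should agree (a sketch on synthetic data; function names are our own):

```python
import numpy as np

def fep_exponential(dv, beta=1.0):
    """beta * Delta A from the Zwanzig relation (exponential averaging)."""
    return -np.log(np.mean(np.exp(-beta * dv)))

def fep_cumulant2(dv, beta=1.0):
    """Second-order cumulant truncation: beta*<dV> - beta^2 * var(dV) / 2."""
    return beta * np.mean(dv) - 0.5 * beta ** 2 * np.var(dv)

rng = np.random.default_rng(0)
dv = rng.normal(1.0, 0.5, size=200_000)   # synthetic Gaussian energy gaps
# exact result for a Gaussian gap: beta*mu - beta^2*sigma^2/2 = 0.875
```

Both estimators land near 0.875 here; for strongly non-Gaussian or poorly overlapping distributions, the exponential average is instead dominated by a few rare low-energy samples, which is the instability discussed above.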

Including only the first order of equation 2-47 recovers equation 2-46:

\[
\begin{aligned}
\exp(-\beta \langle \Delta V \rangle_{a,c}) &= \exp(-\beta \langle \Delta V \rangle_a) \\
&= \sum_{n=0}^{\infty} \frac{(-\beta)^n}{n!}\, \langle \Delta V^n \rangle_a \\
&= \Bigl\langle \sum_{n=0}^{\infty} \frac{(-\beta)^n}{n!}\, \Delta V^n \Bigr\rangle_a \\
&= \langle \exp(-\beta \Delta V) \rangle_a
\end{aligned}
\tag{2-49}
\]

Hence, the cumulant expansion truncated at first order is equivalent to the complete Taylor expansion.

2.5.2 Thermodynamic Integration

The Hamiltonian of a system transitioning from an initial state 'a' to a final state 'b' can be written as a function of a parameter λ that varies between 0 and 1, producing a Hamiltonian that transforms from V_a to V_b:

\[ H(\lambda) = \lambda V_b + (1 - \lambda) V_a \tag{2-50} \]

For intermediate values of λ, the Hamiltonian is a mix of the initial and final states and may not represent a physical system. This mathematical construct, however, allows the calculation of the free energy. The derivative of the free energy with respect to λ is:

\[
\begin{aligned}
\frac{d}{d\lambda} A(\lambda) &= -k_B T\, \frac{d}{d\lambda} \ln Q(\lambda) \\
&= -k_B T\, \frac{d}{d\lambda} \ln \int \exp(-V(q,\lambda)/k_B T)\, dq \\
&= \frac{\int \frac{\partial V(q,\lambda)}{\partial \lambda} \exp(-V(q,\lambda)/k_B T)\, dq}{\int \exp(-V(q,\lambda)/k_B T)\, dq} \\
&= \left\langle \frac{\partial V(q,\lambda)}{\partial \lambda} \right\rangle_{\lambda}
\end{aligned}
\tag{2-51}
\]

Hence the change in the free energy can be calculated from:

\[
\Delta A = \int_{\lambda=0}^{\lambda=1} \frac{d A(\lambda)}{d\lambda}\, d\lambda
= \int_{\lambda=0}^{\lambda=1} \left\langle \frac{\partial V(q,\lambda)}{\partial \lambda} \right\rangle_{\lambda} d\lambda
\tag{2-52}
\]
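Equation 2-52 can be sketched numerically: the window averages ⟨∂V/∂λ⟩_λ are integrated over λ with the trapezoidal rule. Here a callable stands in for the per-window simulation averages (an illustration under that assumption, not a production work-flow):

```python
import numpy as np

def thermodynamic_integration(mean_dvdl, lambdas):
    """Trapezoidal quadrature of <dV/dlambda> over the lambda windows.
    mean_dvdl : callable returning the ensemble average at a given lambda
    lambdas   : increasing sequence of lambda window values in [0, 1]
    """
    lambdas = np.asarray(lambdas, dtype=float)
    vals = np.array([mean_dvdl(l) for l in lambdas])
    # sum of trapezoid areas between consecutive windows
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(lambdas)))
```

For the linear mixing of equation 2-50, ∂V/∂λ = V_b − V_a, so a toy case where ⟨V_b − V_a⟩_λ = 2λ integrates to ΔA = 1.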

2.5.3 Bennett Acceptance Ratio

The Bennett acceptance ratio (BAR) method, proposed by Charles Bennett in 197662, calculates the free energy difference between two states using configurations from both. Starting from equation 2-45, for a function W(q) that is everywhere finite, it follows that:

\[
\begin{aligned}
\Delta A &= A_b - A_a = -k_B T \ln \frac{Q_b}{Q_a} \\
&= -k_B T \ln \left[ \frac{Q_b}{Q_a} \cdot \frac{\int W \exp(-U_b - U_a)\, dq}{\int W \exp(-U_a - U_b)\, dq} \right] \\
&= -k_B T \ln \frac{\langle W \exp(-U_b) \rangle_a}{\langle W \exp(-U_a) \rangle_b}
\end{aligned}
\tag{2-53}
\]

in which, for simplicity, U(q) = V(q)/k_B T is the scaled potential energy.

Equation 2-53 can be estimated by finite-sample averages over n_a and n_b statistically independent samples from the U_a and U_b ensembles, respectively. The standard error of the mean for this estimate can be calculated from the first-order approximation of the variance using Taylor series expansions:

\[
\begin{aligned}
\overline{(\Delta A - \Delta A_{est})^2} = \sigma^2_{\Delta A}
&= \mathrm{var}\!\left( -k_B T \ln \frac{\langle W \exp(-U_b) \rangle_a}{\langle W \exp(-U_a) \rangle_b} \right) \\
&= \mathrm{var}\bigl( -k_B T \ln \langle W \exp(-U_b) \rangle_a \bigr) + \mathrm{var}\bigl( -k_B T \ln \langle W \exp(-U_a) \rangle_b \bigr) \\
&= (k_B T)^2 \left[ \frac{\langle W^2 \exp(-2U_b) \rangle_a - \langle W \exp(-U_b) \rangle_a^2}{n_a \langle W \exp(-U_b) \rangle_a^2} + \frac{\langle W^2 \exp(-2U_a) \rangle_b - \langle W \exp(-U_a) \rangle_b^2}{n_b \langle W \exp(-U_a) \rangle_b^2} \right] \\
&= (k_B T)^2 \left[ \frac{\int \bigl( Q_b/n_b\, \exp(-U_a) + Q_a/n_a\, \exp(-U_b) \bigr)\, W^2 \exp(-U_b - U_a)\, dq}{\bigl[ \int W \exp(-U_b - U_a)\, dq \bigr]^2} - \frac{1}{n_a} - \frac{1}{n_b} \right]
\end{aligned}
\tag{2-54}
\]

Taking the functional derivative of the above expression with respect to W and setting it to zero, the W that minimizes the error is found to be:

\[ W(q) = \frac{\mathrm{const}}{Q_b/n_b\, \exp(-U_a) + Q_a/n_a\, \exp(-U_b)} \tag{2-55} \]

and equation 2-53 can be written as:

\[ \Delta A = -k_B T \left[ \ln \frac{\langle f(U_b - U_a + C) \rangle_a}{\langle f(U_a - U_b - C) \rangle_b} + C \right] \tag{2-56a} \]

\[ C = \ln \frac{Q_b\, n_a}{Q_a\, n_b} \tag{2-56b} \]

in which

\[ f(x) = \frac{1}{1 + \exp(x)} \tag{2-57} \]

is the Fermi function. For finite sample sizes, equations 2-56a and 2-56b can be solved self-consistently:

\[ \Delta A_{est} = -k_B T \left[ \ln \frac{\sum_a f(U_b - U_a + C)}{\sum_b f(U_a - U_b - C)} + C + \ln \frac{n_b}{n_a} \right] \tag{2-58a} \]

\[ \Delta A_{est} = -k_B T \left[ C + \ln \frac{n_b}{n_a} \right] \tag{2-58b} \]

The self-consistency criterion is satisfied when equation 2-58a equals equation 2-58b, and hence:

\[ \sum_a f(U_b - U_a + C) = \sum_b f(U_a - U_b - C) \tag{2-59} \]
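The self-consistency condition 2-59 can be solved by bisection on C, after which equation 2-58b yields ΔA. The sketch below (function and variable names are our own) operates on scaled potential differences evaluated on samples from the two ensembles:

```python
import numpy as np

def fermi(x):
    return 1.0 / (1.0 + np.exp(x))

def bar_beta_delta_a(du_f, du_r, lo=-50.0, hi=50.0):
    """beta * Delta A from the BAR self-consistency condition.
    du_f : U_b - U_a evaluated on samples from ensemble 'a'
    du_r : U_a - U_b evaluated on samples from ensemble 'b'
    """
    na, nb = len(du_f), len(du_r)
    # g(C) is strictly decreasing in C, so bisection finds the root of eq. 2-59
    def g(c):
        return np.sum(fermi(du_f + c)) - np.sum(fermi(du_r - c))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    # eq. 2-58b: Delta A_est = -kT * (C + ln(nb/na)); return it in units of kT
    return -(c + np.log(nb / na))
```

Two properties make a convenient sanity check: identical forward and reverse distributions must give ΔA = 0, and shifting the two energy-gap arrays by ±s shifts the answer by exactly s.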

2.6 Indirect Approach to Free Energy Calculations

Predicting accurate free energy differences is limited by the accuracy of the computational method and by adequate sampling of the configurational space. High-level quantum mechanical (QM) techniques, however accurate, are computationally expensive and are not widely used for free energy simulations (FES). But since the calculation of free energy differences depends only on the end states, it is possible to conduct these calculations at a reference potential and correct the end states with a higher level of theory. This indirect strategy was first developed by Gao63;64 and Warshel65;66 and further refined by others67;68;69;70. The method uses a thermodynamic cycle to find the free energy difference between two states with a more affordable method and then applies a correction to the end states. Figure 2-3 shows the thermodynamic cycle involved in indirect free energy calculations. To calculate the free energy difference between the two states 'a' and 'b' at a high level, it suffices to find this quantity at a low level and apply the end-state corrections. That is:

\[ \Delta A(a_{high} \to b_{high}) = \Delta A(a_{low} \to b_{low}) + \Delta A(a_{high} \to a_{low}) + \Delta A(b_{low} \to b_{high}) \tag{2-60} \]

Here, a high level refers to an accurate but computationally expensive QM or QM/MM method, while a low level indicates a fast but less accurate molecular mechanical (MM) method.

Figure 2-3. Thermodynamic cycle involved in the indirect approach of free energy calculations.

This approach reduces the cost and complexity of the calculations, since it decreases the number of QM computations significantly. However, obtaining an accurate value for the end-state corrections is not straightforward. The FEP method requires the simulations to be performed only at the reference MM potential. Recalculating the potential energies at the QM level for the independent structures, which is significantly faster than generating the distribution itself, then provides the answer. However, FEP is accurate only in the limit of infinite sampling. If the sampled structures do not cover the potential surface, or if the two levels of theory do not overlap significantly, FEP has convergence issues. Different procedures have been suggested to tackle this problem, such as using interaction energy differences rather than total potential energy differences71;72, or fixing the internal coordinates of the QM region73;74. The BAR method is more efficient than FEP and converges even when the overlap is small, but it requires performing simulations at the QM level, which is an immediate setback. Only a method that is both fast and accurate is worth considering for such simulations.
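The bookkeeping of the cycle in equation 2-60, with one-sided exponential-averaging end-state corrections evaluated on low-level samples, can be sketched as follows (synthetic inputs; function names are our own):

```python
import numpy as np

def zwanzig(dv, beta=1.0):
    """beta * Delta A for switching the Hamiltonian on fixed samples
    (exponential averaging, eq. 2-45)."""
    return -np.log(np.mean(np.exp(-beta * dv)))

def indirect_beta_delta_a(beta_da_low, dv_a, dv_b):
    """beta * Delta A(a_high -> b_high) assembled via the cycle of eq. 2-60.
    beta_da_low : beta * Delta A(a_low -> b_low), from cheap simulations
    dv_a, dv_b  : (V_high - V_low) on low-level samples of states a and b
    """
    corr_a = zwanzig(dv_a)   # beta * Delta A(a_low -> a_high)
    corr_b = zwanzig(dv_b)   # beta * Delta A(b_low -> b_high)
    # Delta A(a_high -> a_low) = -Delta A(a_low -> a_high)
    return beta_da_low - corr_a + corr_b
```

If the two end-state corrections are identical, they cancel and the high-level result equals the low-level one, which reflects the cycle closure.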

2.7 Feed-Forward Neural Networks

A feed-forward neural network is a computational model composed of multiple units called neurons, grouped into layers. The layers are connected such that the output of one layer is the input to the next. The term neural network originates in neurophysiology and denotes a model that imitates the brain's ability at pattern recognition.


Figure 2-4. A densely connected feed-forward representation of a neural network. Each layer is comprised of multiple neurons. The output of one layer acts as an input of the next layer. The arrows are the links between the neurons, and their direction is the direction of the forward propagation.

Figure 2-4 shows a diagram representing a three-layered neural network. These three layers consist of an output layer and two hidden layers, which are confined between the input values and the output layer. The arrows connecting the neurons are the links between consecutive nodes. Each link carries one of the weights of a linear regression model, as in equation 2-61a, and each neuron adds a bias.

\[ z_j^{(l)} = \sum_i w_{ji}^{(l)}\, a_i^{(l-1)} + b_j^{(l)} \tag{2-61a} \]

\[ a_j^{(l)} = g^{(l)}\bigl( z_j^{(l)} \bigr) \tag{2-61b} \]

in which z_j is the jth output variable, a_i is the ith activation or input variable, w_{ji} is the weight, b_j is the bias, and g is the nonlinear activation function; the superscripts indicate the layer number. The logistic sigmoid (σ), the hyperbolic tangent (tanh), and rectified linear units (ReLU) are three of the numerous possible activation functions75. The nonlinearity of the activation function is what allows the network to learn complex behaviors. A linear combination of multiple linear functions is itself linear; hence, if the activations were linear, all the hidden layers could be merged into the output layer, collapsing the network into a simple linear regression problem. Equation 2-61 can be written in a compact form as:

\[ Z^{(l)} = W^{(l)T} A^{(l-1)} + b^{(l)} \tag{2-62a} \]

\[ A^{(l)} = g^{(l)}\bigl( Z^{(l)} \bigr) \tag{2-62b} \]

From this equation it is apparent that the activations of each layer act as input to the next layer. The process of evaluating equation 2-62 from the input variables to the output layer is called forward propagation; this is the direction of information flow in the network. Training a neural network is the process of optimizing the weights and biases such that the output variables become close to the target values for a given set of input variables. This task is achieved through the minimization of a cost function, which measures the difference between the output and target values. That is, for a set of input variables \(\{x_i\}_{i=1}^{N}\) corresponding to a set of target values \(\{y_i\}_{i=1}^{N}\), the sum-of-squares error function can be written as:

\[ E(W, b) = \frac{1}{2} \sum_{i=1}^{N} \bigl\{ y_i - \hat{y}_i(x_i, W, b) \bigr\}^2 \tag{2-63} \]

in which \(\{\hat{y}_i\}_{i=1}^{N}\) are the output variables, i.e., the estimates of the target values. This cost function corresponds to the maximum likelihood solution for target values with a Gaussian distribution75. If the output variables of the network are nonlinear, the sum-of-squares error function can be non-convex, which makes finding the global minimum through gradient descent difficult. For classification problems, cross-entropy error functions are therefore better suited76;77. Optimizing the network parameters is achieved by finding a solution to:

∇E(W, b) = 0 (2-64)

Solving equation 2-64 is an optimization problem for a continuous nonlinear function in a high-dimensional parameter space, which can be tackled with an iterative procedure. First, the cost function is evaluated with randomly chosen initial values for the weights and biases. Second, the derivatives of the cost function with respect to these parameters are calculated:

\[ \frac{\partial E(W,b)}{\partial W^{(l)}} = \frac{\partial E(W,b)}{\partial Z^{(l)}}\, \frac{\partial Z^{(l)}}{\partial W^{(l)}} = \frac{\partial E(W,b)}{\partial Z^{(l)}}\, A^{(l-1)T} \tag{2-65a} \]

\[ \frac{\partial E(W,b)}{\partial b^{(l)}} = \frac{\partial E(W,b)}{\partial Z^{(l)}}\, \frac{\partial Z^{(l)}}{\partial b^{(l)}} \tag{2-65b} \]

in which:

\[ \frac{\partial E(W,b)}{\partial Z^{(l)}} = \frac{\partial E(W,b)}{\partial A^{(l)}}\, \frac{\partial A^{(l)}}{\partial Z^{(l)}} = \frac{\partial E(W,b)}{\partial A^{(l)}}\, g^{(l)\prime}(Z^{(l)}) = W^{(l+1)T}\, \frac{\partial E(W,b)}{\partial Z^{(l+1)}}\, g^{(l)\prime}(Z^{(l)}) \tag{2-66} \]

This process is called back-propagation. The parameters are then updated accordingly; the simplest approach is gradient descent:

\[ W^{(t+1)} = W^{(t)} - \alpha\, \frac{\partial E(W,b)}{\partial W^{(t)}} \tag{2-67a} \]

\[ b^{(t+1)} = b^{(t)} - \alpha\, \frac{\partial E(W,b)}{\partial b^{(t)}} \tag{2-67b} \]

In equation 2-67, the superscripts denote the iteration steps, and α is the learning rate, or step size. This iterative procedure is repeated until convergence is achieved.

2.8 Active Learning

Active learning is a procedure in which the learner generates its own training data based on some action selection criteria; in other words, the learner actively cooperates in the learning. The action selection incorporates new data into the training from regions where the model prediction is poor or has low confidence, among other factors78. A sophisticated selection strategy allows the network to explore the regions where it lacks information and drastically reduces the amount of training data required to improve the accuracy of prediction79: the more informative the sampled data, the cheaper the training becomes. One common approach to selective sampling is query by committee, henceforth referred to as the QBC algorithm80. In query learning, an ensemble of models cooperates in estimating the learner's uncertainty in prediction. The concept of QBC is as follows. The members of the committee are trained independently of one another. After every training iteration, each member predicts the values of the samples in the test set, and the members then compare their predicted values. If the predictions lie within a certain threshold of each other, the model has learned the pattern. Otherwise, a data generation algorithm produces more data of the same nature and incorporates them into the training set for the next iteration. This cycle is repeated until the desired accuracy is reached. This general work-flow is demonstrated in figure 2-5.
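The committee-disagreement test at the heart of QBC reduces to a few lines; here the spread is measured as the standard deviation across member predictions (a toy sketch, not the ANI implementation; names are our own):

```python
import numpy as np

def qbc_select(predictions, threshold):
    """Query-by-committee selection.
    predictions : (n_members, n_samples) array of committee outputs
    Returns the indices of samples whose committee spread exceeds the
    threshold; these are the samples to label and add to the training set.
    """
    spread = np.std(predictions, axis=0)
    return np.where(spread > threshold)[0]
```

One active-learning cycle then trains the members, runs this selection on newly generated conformations, labels the flagged ones with the reference method, and repeats until nothing is flagged.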

Figure 2-5. Active learning work-flow. After each round of training, the QBC criterion is tested on the newly generated data. If the predictions are within tolerance, the training is terminated. Otherwise, the labeled data on which the QBC criterion failed are included in the next training iteration.

2.9 Transfer Learning

In machine learning, a common assumption is that the training and the test data belong to the same feature space and distribution. This assumption, however convenient, may not be practical. It may be beneficial to reuse the available training data instead of building a new set. The ability to convey knowledge from a previously trained model in one domain to trigger the learning process of a different task is called transfer learning81. The purpose is to generalize the acquired knowledge such that it is applicable beyond a specific task and domain. Domain and task are common keywords in this field. A domain \(\mathcal{D}\) is a two-component object consisting of the feature space \(\mathcal{X}\) and the marginal probability P(X). A task \(\mathcal{T}\) also consists of two components: the label space \(\mathcal{Y}\) and the objective probability P(Y|X). Hence:

\[ \mathcal{D} = \{ \mathcal{X},\, P(X) \} \tag{2-68a} \]

\[ \mathcal{T} = \{ \mathcal{Y},\, P(Y \mid X) \} \tag{2-68b} \]

In the above definitions, X and Y are the n-dimensional input data and their corresponding label information:

\[ X = \{ x_i \}_{i=1}^{n}, \quad x_i \in \mathcal{X} \tag{2-69a} \]

\[ Y = \{ y_i \}_{i=1}^{n}, \quad y_i \in \mathcal{Y} \tag{2-69b} \]

In this notation, the definition of transfer learning is as follows: given a source domain \(\mathcal{D}_S\) specific to a source task \(\mathcal{T}_S\), and a different target domain \(\mathcal{D}_T\) coupled to a different target task \(\mathcal{T}_T\), transfer learning tries to incorporate the model knowledge gained from the source to improve the predictions performed on the target. Figure 2-6 shows a schematic representation of transfer learning. In this scenario, the knowledge transferred from the source work-flow reduces the amount of data required for training the target network.

2.10 ANI Neural Network Potentials

This section explains the theory and model architecture of the ANI neural network potentials, as well as the specifics of training a new model from the available ANI models. More information regarding the ANI potentials and models can be found in the related references82;83;84;85.

2.10.1 Network Architecture

In ANI, the local environment of each atom is captured by what are called atomic environment vectors (AEVs)82. These AEVs are generated from symmetry functions originating from the work of Behler and Parrinello86. The radial and angular nature of these symmetry functions encodes information about the local neighborhood of each atom. Figure 2-7 shows a representation of the radial symmetry functions. The element-wise radial functions indicate the strength of the effect of each atom on its surroundings. In figure 2-7, to find the contributions of the adjacent atoms to the energy of the carbon atom, the coordinates of the carbon atom should be input to the neighboring atoms' radial functionals.

Figure 2-6. Transfer learning work-flow. The knowledge acquired from training the source network reduces the size of the target database.

Figure 2-7. A representation of the radial symmetry functions centered at each atom's position. As the radial distance increases, the effect of the environment becomes less significant, which is illustrated by the dimming of the colors.

For a system containing N atoms, the radial (\(G_m^R\)) and angular (\(G_m^A\)) terms are as follows:

\[ G_m^R = \sum_{j \ne i}^{N} e^{-\eta (R_{ij} - R_s)^2}\, f_C(R_{ij}) \tag{2-70a} \]

\[ G_m^A = \sum_{j,k \ne i}^{N} \bigl( 1 + \cos(\theta_{ijk} - \theta_s) \bigr)^{\zeta}\, e^{-\eta \left( \frac{R_{ij} + R_{ik}}{2} - R_s \right)^2} f_C(R_{ij})\, f_C(R_{ik}) \tag{2-70b} \]

In equation 2-70, η is a parameter dictating the width of the Gaussian distributions, R_s and θ_s are radial and angular shift parameters, respectively, and ζ is a parameter that controls the width and the peaks of the angular distributions. f_C(R_{ij}) is a continuous cutoff function with a continuous first derivative:

\[ f_C(R_{ij}) = \begin{cases} 0.5 \cos\!\left( \dfrac{\pi R_{ij}}{R_C} \right) + 0.5 & R_{ij} \le R_C \\[4pt] 0.0 & R_{ij} > R_C \end{cases} \tag{2-71} \]

in which R_C is a cutoff radius, set to 4.6 Å and 3.1 Å for the radial and angular symmetry functions, respectively. There is one radial symmetry function for each atomic number and one angular symmetry function for each atomic number pair. For the ANI-1 potential82, 32 radial shift values are used in the radial part, and 8 radial plus 8 angular shift values are used in the angular part. For a network trained on four elements (H, C, N, O), this adds up to a total of 768 components in the AEVs: 128 radial and 640 angular components. Figure 2-8 shows a diagram of the ANI neural network potentials. The coordinates of the atoms generate the corresponding AEVs, which are the input variables to the element-specific networks. The outputs are the atomic energies, which add up to the total energy.

2.10.2 Sampling the Chemical Space

Currently, there are two commonly used ANI potentials: ANI-1x85 and ANI-1ccx84. ANI-1x has been trained to generate molecular energies and forces resembling the accuracy of the ωB97X density functional87 with the 6-31G(d) basis set88. Its data set originated from the ANI-1 data set83. ANI-1 was created by sampling near-equilibrium structures from a subset of the GDB-11 database. The GDB database, which stands for the generated and collected database, is a product of the Chemical Space Project89. This group has computationally enumerated all possible organic molecules that contain certain chemical elements up to a certain total number of atoms. For instance, the GDB-1790 database includes 166.4 billion structures that contain up to 17 atoms from C, N, O, S, and the halogens, and GDB-1191;92 contains 26.4 million molecules with up to 11 atoms from C, N, O, and F. The ANI-1 data set accommodates molecules from the GDB-11 database that consist of 1 to 8 heavy atoms limited to C, N, and O. The ANI-1x data set is a reduced version of the ANI-1 data set with only 5.5 million structures. This model has been trained with the active learning work-flow explained in section 2.8, which results in a considerable reduction in the data set size and higher precision. ANI-1ccx approaches the accuracy of coupled cluster theory with single, double, and perturbative triple excitations at the complete basis set limit, CCSD(T)/CBS39;38;41;40. ANI-1ccx has been trained via the transfer learning method explained in section 2.9 from the ANI-1x network. A portion of the parameters obtained from training ANI-1x was transferred and held fixed during the training of ANI-1ccx. The training set is comprised of 500 thousand data points labeled with an approximation of CCSD(T)/CBS energies. This data set is named CCSD(T)*/CBS84; it uses a linear-scaling domain-localized DPLNO-CCSD(T) method93;94 to reduce the computational cost of the calculations. Through the active learning work-flow, it is possible to train an ANI model to meet the specific needs of the system. Two common ways of generating samples to incorporate more data into each training cycle are explained below.

Figure 2-8. Schematic representation of ANI neural network potentials. Atomic numbers and coordinates generate the AEVs, which capture information regarding the local surroundings of each atom. The AEVs are input to the element-specific networks to generate their contributions to the total energy.

2.10.2.1 Normal Mode Sampling

An N-atom system has 3N − 6 normal mode (NM) coordinates, after excluding the translational and rotational degrees of freedom. These coordinates can be calculated by diagonalizing the Hessian matrix. A slight displacement of the atoms along the normal mode coordinates creates samples that are moderately different from the minimized structure, which allows sampling of the potential energy surface surrounding the minimum energy structure. As a result, the database includes a more diverse set of structures that can improve the quality of the training.

2.10.2.2 Molecular Dynamics Sampling

Normal mode sampling generates near-equilibrium structures and cannot incorporate the effects of thermal fluctuations. If a more diverse set that captures the qualities of the potential energy surface is desired, molecular dynamics should be used. Additionally, with molecular dynamics, it is possible to impose structural restraints or constraints.
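The normal mode sampling of section 2.10.2.1 can be sketched as follows: diagonalize the Hessian at the minimum, drop the near-zero modes, and displace along random combinations of the remaining modes, with smaller amplitudes along stiffer modes (a toy example with a mass-unweighted Hessian; all names are our own):

```python
import numpy as np

def normal_mode_samples(x0, hessian, n_samples, scale=0.1, seed=0):
    """Generate displaced structures around the minimized geometry x0."""
    rng = np.random.default_rng(seed)
    evals, evecs = np.linalg.eigh(hessian)
    keep = evals > 1e-8              # discard translational/rotational modes
    modes, k = evecs[:, keep], evals[keep]
    out = np.empty((n_samples, len(x0)))
    for n in range(n_samples):
        # displacement ~ scale / sqrt(k): softer modes are displaced more
        c = rng.standard_normal(keep.sum()) * scale / np.sqrt(k)
        out[n] = x0 + modes @ c
    return out

x0 = np.zeros(2)
hess = np.diag([0.0, 4.0])     # one zero mode, one stiff mode (toy Hessian)
samples = normal_mode_samples(x0, hess, 5)
```

In the toy case above, the zero mode is never displaced while the stiff mode is, mimicking how translations and rotations are excluded from the sampling.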

CHAPTER 3
IMPLEMENTATION

3.1 Implementation of Nudged Elastic Band in Amber∗

The PNEB routines have been implemented in the pmemd module of the Amber package and the parallel-MPI simulations can be performed with either CPU or GPU processors (i.e., with pmemd.MPI and pmemd.cuda.MPI). The CPU implementation was straight forward and does not need a detailed explanation. Figure 3-1 shows the MPI framework of the CPU code. Replicas have to communicate their coordinates to their nearest neighbors for harmonic force calculations. The two arrows connecting the replicas in the figure represent the MPI calls that regulate the data transfer. A higher number of replicas leads to a higher resolution of the path but demands more computational resources. Benchmarking can indicate the best balance between precision and cost. The number of replicas, however, usually goes above what a single cluster node can accommodate. Each replica offloads its computation task to multiple MPI processes. Process zero, which is the master rank, computes the NEB forces and broadcasts them to the other processes. Afterward, a molecular dynamics step is carried out to update the coordinates and velocities. GPU programming is inevitably susceptible to the latency of the data transfer between the device and the host memory which hinders the performance of the GPU-accelerated code. In the PNEB routines, specifically, the coordinate exchange between the replicas at each step of the simulation can aggravate the performance. Hence the GPU implementation required a data transfer optimization other than programming the PNEB routines with CUDA. Figure 3-2 shows the framework of the GPU implementation of PNEB in Amber. Replicas are denoted in this figure by white squares, with a blue polyalanine complex in the middle transitioning from an α-helix to β-sheet conformation. The figure also illustrates our optimized transfer scheme

∗ Section 3.1 was reprinted/adapted with permission from Ghoreishi, D.; Cerutti, D. S.; Fallon, Z.; Simmerling, C.; Roitberg, A. E. Fast Implementation of the Nudged Elastic Band Method in AMBER. J. Chem. Theory Comput. 2019, 15, 46994707.

45 Figure 3-1. Each NEB replica offloads its computation to M threads. The MPI thread with rank zero calculates the NEB forces and broadcasts them to all other threads. named shuttle transfer. In practice, the information regarding the coordinates of all the atoms could be transferred between the processing units. However, not all coordinates are needed, and an exchange of a smaller set is sufficient. Our shuttle transfer reduces the amount of data that must traverse between the computing units by selectively coalescing the memory sections that only correspond to the two atom masks provided by the user for NEB calculations. These two masks contain the atoms that are included in NEB force calculations and the atoms involved in performing the root mean square fitting to the neighboring structures. Data communication routines are overlapped with the main force computation tasks to obtain further performance improvement. It is possible to increase the performance by applying multiple timesteps to compute NEB forces less frequently. The NEB forces are only updated every nth step of the simulation, while for the other steps, the most recent calculated NEB forces are applied to the replicas. Since for a small number of steps the coordinates do not change drastically, it is safe to skip the spring force calculations on the off-steps and apply the most recent NEB forces instead. The attempt for updating the NEB forces, however, should happen as frequently as possible or else the NEB forces might change too much for two consecutive steps and cause instability in the

simulations. The flag that enables this functionality is nebfreq, which has a default value of 1, corresponding to performing NEB on every step.

Figure 3-2. The NEB replicas, which are shown with a polyalanine complex transitioning from α-helix to β-sheet, communicate their coordinates (crd) with their nearest neighbors for the harmonic force calculations. In the shuttle transfer scheme, only the atoms needed in the NEB force calculations are transferred between the host and the device.

3.2 Implementation of ANI-Amber Interface

NeuroChem is a highly efficient NVIDIA GPU-driven software package that trains and tests ANI potentials. This package has been interfaced with Amber22 by incorporating the modifications required to take advantage of Amber's highly parallelizable code. The motivation for this work was to create a suitable framework for using ANI potentials in dynamical simulations to quickly and accurately study the time evolution of biological systems. This interface paves the way for practical use of models that are as accurate as QM methods yet have performance comparable to classical force fields. Other available features of Amber, like NEB95 and the different types of replica exchange molecular dynamics96;97;98;99, could likewise be employed.

Figure 3-3 shows the general workflow of a molecular dynamics simulation. At the beginning of the simulation, the coordinates and velocities of the atoms are initialized. At each iteration, the forces on each atom are calculated using the force field parameters, and the coordinates are updated using an integration algorithm with a user-defined timestep. This process is repeated until the total number of simulation steps is completed.
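The loop described above can be sketched in a few lines. This is an illustrative velocity-Verlet integrator in reduced units, not Amber's actual integration code:

```python
def md_loop(x, v, m, force_fn, dt, nstlim):
    """Minimal velocity-Verlet MD loop: obtain forces from the potential
    (a callback standing in for the force field), then update velocities
    and coordinates with a user-defined timestep, for nstlim steps."""
    f = force_fn(x)
    for _ in range(nstlim):
        v += 0.5 * dt * f / m      # first half-kick
        x += dt * v                # drift
        f = force_fn(x)            # new forces from the potential
        v += 0.5 * dt * f / m      # second half-kick
    return x, v

# usage: one particle in a harmonic well, F = -k x with k = m = 1
x_new, v_new = md_loop(1.0, 0.0, 1.0, lambda x: -x, dt=0.01, nstlim=10)
```

For a harmonic well this integrator conserves the total energy to second order in the timestep, which is the usual sanity check for an MD loop.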


Figure 3-3. Molecular dynamics workflow in Amber. After initializing coordinates and velocities, the forces on each atom are calculated through the force field. An integration scheme then updates the coordinates. The dynamics step is repeated until the total number of steps is reached.

In the ANI-Amber interface, the usual workflow of the dynamics is modified so that the forces are no longer determined from the force field parameters. As illustrated in figure 3-4, the Amber codebase is updated to transfer the coordinates to NeuroChem right after the integration step and to allow the network to calculate and send the energies and forces back. This interface is available in both the sander and the pmemd modules. The new code is written in a directive-based format and needs to be compiled using the '-ani' flag, which generates the sander.ANI and pmemd.ANI executable files. One GPU and one CPU processor are required to run this interface. The program offloads the dynamics calculations to the CPU processor and reserves the GPU for the network computations, which are the time-consuming part of the simulation. This way, the GPU power is used where it is required the most.

Figure 3-4. Molecular dynamics workflow with the ANI-Amber interface. After initializing coordinates and velocities, the forces on each atom are received from the ANI network: NeuroChem computes the atomic environment vectors (AEVs) and per-atom energies, sums them into the total energy ET, and obtains the forces by backpropagation (FT = −∇ET). An integration scheme then updates the coordinates, and the dynamics step is repeated until the total number of steps is reached.

3.3 Sample Amber Input Files

This section provides examples of Amber input files for the NEB and ANI simulations, with each specific flag explained. For information regarding the general Amber MD flags, the reader can refer to the latest Amber manual.

3.3.1 Sample Input File for NEB Simulations

A sample Amber input file for NEB simulations is provided below. Setting the 'ineb' flag to 1 activates the NEB calculations. The 'nebfreq' flag has a default value of 1, which corresponds to performing NEB every step. It is possible to update the NEB forces less frequently by setting 'nebfreq' to values higher than 1. This additional functionality improves performance at the expense of detail, so caution must be exercised. The 'skmin' and 'skmax' flags set the spring constants used to link the NEB replicas to each other, with default values of 50 and 100, respectively. The 'tgtfitmask' flag indicates the atoms used for performing the RMS fitting of the replicas to their neighbors, and 'tgtrmsmask' defines the atoms involved in the NEB force calculations; these are the atoms that are linked via springs to the neighboring replicas.
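The force caching implied by 'nebfreq' can be sketched as follows. The function and variable names here are hypothetical; this is an illustration of the behavior, not the Amber source:

```python
def neb_forces_with_caching(step, nebfreq, coords, cached, compute_neb_forces):
    """Return the NEB forces for this step: recompute only when the step is
    a multiple of nebfreq (or when nothing is cached yet); otherwise reuse
    the most recently computed forces."""
    if cached is None or step % nebfreq == 0:
        cached = compute_neb_forces(coords)
    return cached

# usage: count how often the expensive spring-force routine actually runs
calls = []
fake_neb = lambda crd: (calls.append(1), [0.0] * len(crd))[1]
cached = None
for step in range(10):
    cached = neb_forces_with_caching(step, 5, [0.0, 0.0], cached, fake_neb)
```

With nebfreq=5 over ten steps, the expensive routine runs only on steps 0 and 5, while the cached forces are applied on the off-steps.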

NEB sample input file
&cntrl
  imin=0,                   ! run MD
  ntc=1, ntf=1,             ! turn off shake
  ntpr=1, ntwx=1,           ! output settings
  ntb=0,                    ! non-periodic MD
  cut=999.0,                ! non-bond cut off
  igb=6,                    ! vacuum MD
  nstlim=50,                ! total MD steps
  dt=0.001,                 ! time step, in ps
  ig=-1,                    ! use random seed
  ntt=3, temp0=300.0, gamma_ln=75.0, ! temperature control
  ineb=1,                   ! perform NEB
  nebfreq=1,                ! perform NEB every step
  skmin=10, skmax=10,       ! NEB spring constants
  tgtfitmask=":*&!@H=",     ! atoms used in NEB RMS fittings
  tgtrmsmask=":*@N,CA,C,O", ! atoms involved in NEB force calculations
 /

3.3.2 Sample Input File for ANI Simulations

A sample Amber input file is provided below. The flag that incorporates the ANI-Amber interface is 'iani'. Setting this flag to 1 means the forces are retrieved from the ANI network; the default value for this option is 0. The path to the network directory should also be provided via a string input to 'ani_net_path', which informs NeuroChem where the network parameters are stored. The 'ani_ensemble_size' flag should be set to the number of networks in the QBC ensemble. The default value for this flag is 8, which means the energies and forces are the averaged values of eight separate predictions. The 'ani_cnst_file' is the

address of the file that stores the required constants for the calculation of the AEV vectors, and the list of the atomic elements supported by the network.
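The ensemble averaging implied by the ensemble-size flag can be illustrated with a minimal sketch. The toy "networks" below stand in for NeuroChem's trained models; this is not the interface's actual code:

```python
def ensemble_predict(coords, networks):
    """Average the energy and the per-atom forces over an ensemble of
    networks, each returning an (energy, forces) pair for the coordinates
    (an illustration of ensemble averaging; NeuroChem's internals differ)."""
    energies, forces = zip(*(net(coords) for net in networks))
    n = len(networks)
    mean_energy = sum(energies) / n
    mean_forces = [sum(fs) / n for fs in zip(*forces)]
    return mean_energy, mean_forces

# usage with two toy "networks" that disagree slightly
nets = [lambda c: (-1.0, [0.1, 0.2]), lambda c: (-3.0, [0.3, 0.4])]
energy, forces = ensemble_predict(None, nets)
```

Averaging over the ensemble smooths out the disagreement between the individual networks, which is also what makes the ensemble spread useful as a query-by-committee (QBC) uncertainty signal.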

ANI sample input file
&cntrl
  imin=0,                   ! run MD
  ntc=1, ntf=1,             ! turn off shake
  ntpr=1, ntwx=1,           ! output settings
  ntb=0,                    ! non-periodic MD
  cut=999.0,                ! non-bond cut off
  igb=6,                    ! vacuum MD
  nstlim=50,                ! total MD steps
  dt=0.001,                 ! time step, in ps
  ig=-1,                    ! use random seed
  ntt=3, temp0=300.0, gamma_ln=2.0, ! temperature control
  iani=1,                   ! perform ANI MD
  ani_net_path='/home/user/network_directory', ! ANI network path
  ani_ensemble_size=8,      ! ANI ensemble size
  ani_cnst_file='/home/user/ani_cnst_file.params', ! AEV parameters file
 /

CHAPTER 4
NUDGED ELASTIC BAND: VALIDATION AND RESULTS∗

4.1 Computational Details

Three different test cases have been selected for precision and performance examinations. All simulations were performed using the Amber18 and AmberTools18 suite of programs22. The structures for the first two test cases were built using the leap module of Amber16, with the ff14SB100 forcefield parameters. The structures for the third test case were obtained from Li et al.56. The TIP3P water model101 was used for explicit solvent simulations. As previously illustrated in the work of Bergonzo et al.16 and Li et al.56, a Langevin thermostat with a high collision frequency is required to control the temperature when performing NEB simulations. High values of the collision frequency may not be appropriate for normal MD simulations, but they are recommended for NEB simulations: a strongly coupled thermostat reduces instabilities due to the projection of the potential forces and the addition of the unrealistic springs. For this reason, a collision frequency of 1000 ps−1 was used to control the temperature for simulations performed in implicit solvent. For explicit solvent simulations, a lower collision frequency was used (refer to test case 3 for values), since the presence of water molecules increases the viscosity of the system. These values of the collision frequency are in line with previous NEB studies16;55;102;56. A strong thermostat, however, would increase the system's temperature too fast; to prevent this, a slow, linear change in the temperature is recommended.

4.1.1 Test Case 1: Conformational Change of Alanine Dipeptide

A capped alanine dipeptide with a total of 22 atoms was built. In this test case, the pathway between the so-called αR and the αL basins on the Ramachandran plot of alanine

∗ Chapter 4 was reprinted/adapted with permission from Ghoreishi, D.; Cerutti, D. S.; Fallon, Z.; Simmerling, C.; Roitberg, A. E. Fast Implementation of the Nudged Elastic Band Method in AMBER. J. Chem. Theory Comput. 2019, 15, 4699–4707.

dipeptide is explored103;16. The impose command of tleap was used to initialize the two structures close to the two basins. The GB-Neck2104 generalized Born implicit solvent model, with a sodium chloride salt concentration of 0.2 M, was used for this test case. The initial structures were then minimized by performing 1500 steps of steepest descent followed by 1500 steps of conjugate gradient with a default tolerance of 0.001. After the minimization, the initial structure had (φ, ψ) = (−78.33◦, −10.58◦), while the final structure had (φ, ψ) = (57.50◦, 20.55◦). NEB calculations were performed using 16 replicas in implicit solvent. NEB forces were applied only to the backbone atoms, while all the atoms were included for fitting the neighboring structures to calculate the NEB forces. Simulated annealing was used along with NEB to improve the exploration of the energy landscape. First, 20 ps of simulation with timesteps of 0.5 fs was performed to heat the system to 300 K with spring constants of 10 kcal · mol−1 · Å−2. Afterward, the spring constants were raised to 50 kcal · mol−1 · Å−2 and 100 ps of simulation followed with timesteps of 1 fs, during which the temperature was held at 300 K. Next, 300 ps of simulated annealing105 with timesteps of 0.5 fs was performed to gradually increase the temperature up to 500 K and back down to 300 K. Short timesteps and a gradual increase of the temperature at this step are critical for the stability of the system. Following the simulated annealing, the temperature of the system was gradually decreased to zero over 120 ps of simulation with timesteps of 1 fs, followed by quenched MD for 200 ps.

4.1.2 Test Case 2: α-helix to β-sheet Transition in Polyalanine

Twelve alanine residues were created from the ACE-ALA(12)-NME sequence for a total of 112 atoms. The impose command in tleap was used to create the initial and final structures in an α-helix and a β-sheet conformation, respectively. Minimization was performed on the endpoint structures for 5000 steps of steepest descent followed by 5000 steps of conjugate gradient. The GBn model106 with a sodium chloride salt concentration of 0.2 M was used to model the implicit solvent. NEB calculations and the selection of the atoms included in the NEB region were performed as described in the previous test case. Hence, NEB forces

were applied only to the backbone atoms, while all the atoms were involved in fitting the neighboring structures.

4.1.3 Test Case 3: Base Eversion Pathway of the OGG1–DNA Complex

The OGG1–DNA complexes with intrahelical (initial) and extrahelical (final) endpoints, together with the additional intermediate structures along the major groove path56, were generated as described in the supplementary information of reference 56. The parameters for running NEB were the same as in the work of Li et al.56. The system contains 49534 atoms, 43698 of which belong to the solvent. The NEB simulations were performed with 32 replicas in explicit solvent. First, the replicas were equilibrated at 310 K, with spring constants of 1 kcal · mol−1 · Å−2 and a collision frequency of 100 ps−1, for 100 ps with timesteps of 1 fs. For the rest of the NEB simulations, the spring constants were raised to 20 kcal · mol−1 · Å−2 and the collision frequency was brought down to 75 ps−1. The system was equilibrated at 310 K for another 500 ps with 1 fs timesteps. The system's temperature was then raised to 380 K over 100 ps with timesteps of 1 fs. Further, the system was equilibrated at 380 K for an extra 200 ps, and finally the temperature was lowered to 310 K over 100 ps. For the last phase of the NEB simulation, the temperature was kept fixed at 310 K for an extra 500 ps in order to equilibrate the system at that temperature.

4.2 Accuracy Tests

We use the three test cases to demonstrate that numerical accuracy is not compromised in the new implementations. These tests compare the numerical values of a specific reaction coordinate along the path for the different implementations. The choice of the reaction coordinate was based on a priori knowledge for the alanine dipeptide conformational change along the φ and ψ angles, or on prior NEB simulations connecting the initial and final states of the second and third test cases.

4.2.1 Test Case 1: Conformational Change of Alanine Dipeptide

The first test case studies the conformational change of alanine dipeptide. The pathway between two stable conformations corresponding to two minimum regions on the potential

energy surface of the φ and ψ dihedral angles103;16 is explored. Figure 4-1 shows the energy landscape of this system along with the positions of the two metastable conformations and one of the possible transition pathways. Another choice for the pathway would be to proceed through the barrier around φ = 0◦ and ψ = 100◦. In this test case, we performed two sets of simulations: one set on sander (previous implementation) and the other on pmemd-CPU (recent CPU implementation). From each set, 20 independent simulations that proceeded through the pathway shown in figure 4-1 were selected. When running NEB simulations it is important to perform a set of independent simulations to identify all the possible transition pathways. Free energy profiles along the paths can then be obtained through various available free energy calculation techniques, such as umbrella sampling57, which can yield more insight into the preferred pathway in terms of the energetics along the path.


Figure 4-1. Two-dimensional potential energy landscape of alanine dipeptide in the φ and ψ dihedral angles space. The initial and final conformations are displayed on top of the energy surface, positioned close to their corresponding minimum states. The bottom pathway indicates the averaged results from two independent sets of simulations performed with the pmemd (red) and the sander (light blue). The two transition pathways are lined up close to each other, indicating that the two implementations agree.

Figure 4-2 shows the potential energies of the NEB replicas along the path. In order to have statistically reliable results, each point in the plot is averaged over the individual simulations. Replicas 1 and 16 correspond to the initial and final minimum energy states, respectively. As we move away from the initial state along the path, the energy of the replicas increases until it reaches a transition state with an energy value of ∼6.3 kcal · mol−1. Vertical bars show the standard deviation for each replica. The errors can be represented by dividing the standard deviations by the square root of the ensemble size. The two lines plotted in figure 4-2 show that the results obtained from pmemd-CPU are in good agreement with those obtained from sander.
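The error estimate described above (standard deviation divided by the square root of the ensemble size) is the standard error of the mean; a minimal sketch with illustrative numbers:

```python
import math

def ensemble_stats(values):
    """Per-replica statistics over an ensemble of independent NEB runs:
    mean, (population) standard deviation, and the standard error of
    the mean, std / sqrt(N)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, std, std / math.sqrt(n)

# usage: one replica's energy (kcal/mol) across 4 hypothetical independent runs
mean, std, sem = ensemble_stats([6.1, 6.5, 6.2, 6.4])
```

With more independent runs, the standard error shrinks as 1/sqrt(N) even though the per-run standard deviation does not.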


Figure 4-2. The potential energy of the replicas, with the end replicas representing minimum states and replicas 9-11 representing the transition region. The light blue line represents simulations done with the sander and the red line represents simulations done with the pmemd-CPU. The lines in the plot are averaged over individual simulations and the vertical bars represent the standard deviations along the transition region.

For the plots demonstrating the φ and ψ dihedral angle changes along the path, refer to Appendix F.

4.2.2 Test Case 2: α-helix to β-sheet Transition in Polyalanine

The second test case is a small alanine peptide transitioning from α-helix to β-sheet. Figure 4-3 shows the peptide end to end distance along the path and the initial and the

final conformations. The blue and red lines are averaged over 50 individual simulations performed with pmemd-CPU and pmemd-GPU, respectively. Vertical bars represent the standard deviation. This test case demonstrates that the GPU implementation of PNEB provides results that are statistically equivalent to those achieved using the pmemd-CPU implementation.

Figure 4-3. End to end distance in the 12 alanine residue complex transitioning from α-helix to β-sheet. The blue line represents simulations done with pmemd-CPU while the red line shows simulations done with pmemd-GPU.

4.2.3 Test Case 3: Base Eversion Pathway of the OGG1–DNA Complex

8-Oxoguanine (8-oxoG) is a result of oxidation of guanine and one of the most common products of oxidative damage in DNA, which can lead to mutations in cells if not excised prior to DNA replication55;102;56. Human 8-oxoguanine–DNA glycosylase (OGG1) excises 8-oxoG from damaged DNA in base excision repair. Several studies have been reported that address different aspects of this transition to better understand the base eversion pathways and the preferred binding conformation107;108;55;102;56. The two endpoints correspond to the intrahelical and the extrahelical conformations in the base eversion pathway. Figure 4-4 illustrates the endpoint structures of the region involved in the transition plus the glycosidic angle versus the

eversion distance change of this transition. This eversion distance has been reported previously as a reaction coordinate for nucleic acid base eversion56;109. The red points are the results of simulations performed with nebfreq equal to 1. The light blue points are the results of simulations performed with nebfreq equal to 5 in all stages of the simulation but the first and the final, in which the value of nebfreq was set to 2. The data points are averaged values extracted from the trajectories of the last stage of 10 independent NEB simulations. The two sets agree, which shows that even when the NEB forces are updated every 5 steps the transition pathway still falls into the correct region. A perfect agreement between individual sets of simulations for systems that contain many degrees of freedom is not possible, and it is reasonable to expect a transition region rather than a single pathway. Once the MEPs are identified, advanced sampling techniques such as umbrella sampling57 can be used to calculate the free energy changes along the path. PNEB provides substantial computational savings by allowing one to focus only on the portion of the configuration space surrounding the MEPs56.

4.3 Timing Benchmarks

Figures 4-5 and 4-6 show the performance of the different implementations of PNEB in Amber. All the benchmarks were performed using the third test case, consisting of 49534 atoms. The GPU simulations were performed on NVIDIA P100 GPUs and Intel Xeon E5-2680v3 CPU processors linked via Mellanox FDR InfiniBand interconnects. The CPU simulations were performed on Intel Xeon Platinum 8160 processors. Figure 4-5 shows the scalability of the sander and pmemd-CPU implementations using different numbers of CPU processors per NEB replica. Porting the code from sander to pmemd provides a performance gain of about 1.6X, commensurate with the speed of the CPU pmemd engine relative to sander. Figure 4-6 shows the average increase in the performance of the CPU and GPU code with different GPU precision models110. The currently available GPUs have higher processing power for single precision floating-point arithmetic operations than for double precision.

Figure 4-4. Glycosidic angle vs. eversion distance involved in transitioning from intrahelical to extrahelical conformations in the OGG1–DNA complex. The blue points represent simulations performed with nebfreq=5 for all the stages but the first and the last, which had nebfreq=2. The red points represent simulations performed with nebfreq=1.


Figure 4-5. Performance comparison between sander and pmemd. All simulations have been performed on CPUs with timesteps of 1 fs.

The Amber SPFP precision model replaces double precision arithmetic with single precision combined with 64-bit fixed-point integer arithmetic for the accumulation of forces and energies. The SPFP precision model hence results in higher-performance simulations compared to the slow double precision DPFP model, without loss of accuracy110. For both the SPFP and DPFP models, the data transferred between the GPU and the CPU has a 64-bit double precision floating-point format. CPU benchmarks were performed for sander with nebfreq equal to 1, and for pmemd with nebfreq equal to 1, 2, 5, and 10. GPU benchmarks were performed for a full transfer with nebfreq equal to 1, and for a shuttle transfer with nebfreq equal to 1, 2, 5, and 10. The performance of a regular MD simulation has been included as a control for gauging the performance penalty caused by activating the NEB option. In this test case, the simulations were performed with timesteps of 1 fs; as a result, setting nebfreq=n corresponds to updating the NEB forces every n fs. For the shuttle transfers, the coordinates of 2630 out of the 49534 atoms traveled between the CPU host and GPU device memory every nebfreq steps. For the shuttle transfer with nebfreq equal to 1, roughly 61% of the total simulation time spent on the NEB calculations belongs to the data transfer. The use of the shuttle transfer with this system achieves more than a 2X performance gain in the case of SPFP. Compared to the initial implementation (sander), the GPU code results in more than a 10X performance enhancement over 32 CPU processors. For this test case, a further two-fold acceleration was observed by increasing nebfreq to 5. Another set of benchmarks is provided in Figure 4-7, which compares the speed of the GPU code for various numbers of atoms included in the shuttle transfer.
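A key property of the fixed-point accumulation used by SPFP is that integer addition is associative and commutative, so accumulated forces are reproducible regardless of summation order, unlike naive floating-point accumulation. A simplified sketch (the scaling factor is illustrative, not Amber's actual constant):

```python
FXP_SCALE = 1 << 40  # illustrative fixed-point scaling factor

def to_fixed(x):
    """Convert a force contribution to a 64-bit-style fixed-point integer."""
    return int(round(x * FXP_SCALE))

def accumulate(contributions):
    """Accumulate contributions as integers, then convert back. Because
    integer addition is order-independent, the total is bit-identical for
    any summation order of the same contributions."""
    total = 0
    for c in contributions:
        total += to_fixed(c)
    return total / FXP_SCALE

# two opposite orderings of the same contributions give identical totals
fwd = accumulate([1e-3, -2.5, 1e-3, 2.5])
rev = accumulate([2.5, 1e-3, -2.5, 1e-3])
```

This order-independence is what lets massively parallel GPU threads accumulate forces deterministically while still doing the expensive arithmetic in fast single precision.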
For each replica, during each transfer, the coordinates of a certain number of atoms are transferred: first from the device to the host of that replica, second through an MPI routine between the hosts of the neighboring replicas, and third from the hosts of the neighboring replicas to their corresponding devices. Data transfer speed depends on many factors, including the bandwidth of the hardware, the type and bandwidth of the interconnect linking the hardware, and the latency of each transfer call. As a general rule, however, minimizing


the size of the transferred data results in better performance. As shown in figure 4-7, the performance of the PNEB code is also dependent on the data transfer size. The use of peer-to-peer communication and interconnect architectures such as NVLink is likely to improve the performance further, but these have not been tested here.

Figure 4-6. Performance comparison of different nebfreq values for various CPU and GPU implementations. The performance of a regular MD simulation has been included to illustrate the cost of activating the NEB option. The simulations were performed with timesteps of 1 fs.


Figure 4-7. Performance dependence of the different GPU precision models on the size of the data transfers. "Shuttle atoms" specifies the number of atoms whose coordinates are transferred between the neighboring replicas.

CHAPTER 5
FREE ENERGY METHODS WITH MACHINE LEARNING

This chapter presents the results of studies performed with the ANI-Amber interface. The goal is to provide information regarding the accuracy of the implementation and to highlight possible applications that could benefit from this interface. Of particular interest is the use of these potentials in conformational and hydration free energy predictions.

5.1 Two Dimensional Energy Surface with ANI-Amber

This section provides a two-dimensional potential energy surface of ethylene glycol, with the formula (CH2OH)2. This molecule is represented in figure 5-1. The two-dimensional scan is generated by performing a full torsional scan of two dihedral angles of this molecule, denoted by Φ1 and Φ2 and containing the O-C-C-O and C-C-O-H atoms, respectively. The dihedral angles were incremented by 2◦ to create the initial structures. The single point energies of each structure are measured once with the general Amber forcefield (GAFF)111 and once with the ANI-1x neural network potentials.
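The 2° increments above define the grid of target angles for the scan; a sketch of the grid generation (the restrained minimizations and energy evaluations themselves are performed in Amber and NeuroChem):

```python
def scan_grid(step_deg=2):
    """Enumerate (phi1, phi2) dihedral targets for a full two-dimensional
    torsional scan, stepping each angle by step_deg over [-180, 180)."""
    angles = range(-180, 180, step_deg)
    return [(p1, p2) for p1 in angles for p2 in angles]

# 2-degree increments give a 180 x 180 grid of restrained-minimization targets
grid = scan_grid(2)
```

Each grid point then yields one single-point energy per potential, so the full scan requires 32400 evaluations with GAFF and another 32400 with ANI-1x.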

Figure 5-1. Ethylene glycol with the formula of (CH2OH)2. The two dihedral angles for torsional scan are angles containing O-C-C-O and C-C-O-H atoms.

Figures 5-2 and 5-3 demonstrate the two-dimensional energy surfaces represented by the GAFF and ANI-1x potentials, respectively. Both plots have been generated with Amber. The initial structures were created by performing a restrained minimization with GAFF to generate the desired Φ1 and Φ2 angles. Single point energies of these structures were then captured with GAFF and ANI-1x. The two-dimensional surface generated with ANI-1x is not as symmetric as the one generated with GAFF. Two reasons could cause this dissimilarity. First, the initial structures were created with GAFF, and it is expected to see different patterns in regions where the two potential energy surfaces do not overlap. Second, the ANI-1x network has not been trained to produce accurate energies for rotational conformers of the molecules in the training set. Increasing the energy prediction accuracy for rotational conformers is one of the aims of the ANI-2x potentials, which are currently in the final stages of release.


Figure 5-2. Two-dimensional potential energy surface of ethylene glycol. The Φ1 and Φ2 dihedral angles contain O-C-C-O and C-C-O-H atoms, respectively. This surface is generated using the general Amber forcefield.

5.2 End-State Free Energy Corrections

The machine-learned potentials can speed up dynamical calculations while maintaining the desired precision. The time-consuming free energy calculations, for instance, can benefit considerably from these potentials. In free energy calculations, it is necessary to allow the system to visit the less likely explored regions. Hence, long simulations are pivotal for the convergence of the results. Therefore, it is common to switch to less accurate classical force fields and sacrifice precision to achieve convergence. This section demonstrates how more accurate calculations can be achieved through the use of the ANI-Amber interface.

Figure 5-3. The same figure as 5-2, except that the potential energy surface has been generated using the ANI-1x potential.

5.2.1 Conformational Free Energy with ANI-Amber

The goal is to perform umbrella sampling57 to calculate the conformational free energy of N-methyl acetamide transforming from the cis to the trans conformation. Figures 5-4 and 5-5 demonstrate the cis and trans conformers of N-methyl acetamide.

Figure 5-4. cis conformer of N-methyl acetamide. This is the less favorable conformer, with a dihedral angle of ∼0◦ consisting of the Cα-C-N-Cα atoms.

66 Figure 5-5. trans conformer of N-methyl acetamide with the dihedral angle of ∼180◦ consisting of Cα-C-N-Cα atoms.

The parameter and coordinate files for N-methyl acetamide were built using the leap module of Amber1822, with the ff14SB100 forcefield parameters. This initial structure was then minimized for a total of 1000 steps, which included 500 steps of steepest descent followed by 500 steps of conjugate gradient. The resulting minimized structure has a backbone

dihedral angle ω = 180.0◦ corresponding to the Cα-C-N-Cα atoms (trans conformation). The umbrella sampling simulations were performed with windows 3◦ apart. To speed up the calculations, four equally spaced starting structures with ω dihedral angles of 180◦, 120◦, 60◦, and 0◦ were initially created and used for different sets of the umbrella sampling windows. Harmonic restraints with force constants of 200 kcal · mol−1 · Å−2 were applied to create the initial starting structures. For each window of the simulation, a restrained minimization including 500 steps of steepest descent and 1500 steps of conjugate gradient was performed to create the desired conformation. Next, 50 ps of restrained simulation with timesteps of 0.5 fs was performed to increase the temperature to 300 K. Then, 100 ps of production run with timesteps of 0.5 fs was performed at 300 K. During this stage, the value of the dihedral angle was measured every 50 steps. At the end of the simulations, a histogram of the umbrella sampling windows was plotted to check that they have appropriate overlap. The weighted histogram analysis method (WHAM)112 implementation of Alan Grossfield was used to generate the one-dimensional potential of mean force. The result of this

calculation is demonstrated in figure 5-6. The figure conveys that the trans conformation is more stable than the cis conformation by approximately 1.43 kcal · mol−1.
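The window layout and harmonic bias used in umbrella sampling can be sketched as follows. The force constant and the absence of a 1/2 prefactor are illustrative conventions here, not necessarily those of the restraint definition used in the simulations:

```python
def window_centers(start=0, stop=180, spacing=3):
    """Umbrella-window centers for the omega dihedral, spaced `spacing`
    degrees apart over the cis-to-trans range."""
    return list(range(start, stop + 1, spacing))

def bias_energy(omega, center, k=200.0):
    """Harmonic umbrella bias U = k * d^2 on a periodic dihedral (degrees),
    where d is the minimum-image angular difference to the window center
    (k is an illustrative force constant)."""
    d = (omega - center + 180.0) % 360.0 - 180.0
    return k * d * d

centers = window_centers()      # 0, 3, ..., 180 degrees
u = bias_energy(-179.0, 180.0)  # periodic wrap: only 1 degree from 180
```

The minimum-image wrap matters for a dihedral coordinate: −179° is only 1° away from the 180° window center, so the bias must not treat it as 359° away.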


Figure 5-6. Potential of mean force (PMF) plot of N-methyl acetamide transitioning from the cis conformer to the trans conformer. At a 0◦ peptide bond (cis conformation) the free energy corresponds to 1.43 kcal · mol−1. At a 180◦ peptide bond (trans conformation) the free energy equals 0.0 kcal · mol−1. The energy barrier height from cis to trans equals 12.58 kcal · mol−1.

The indirect free energy approach is used to calculate the free energy difference between these two conformations at a higher level of theory. For this purpose, 102.5 ns of production run with timesteps of 0.5 fs was performed at 300 K for each of the two conformers. The simulation frames were saved every 1 ps, which resulted in a trajectory consisting of 205 k frames. The first 5 k frames of the simulation were discarded to ensure that the systems were in a relaxed state. These two trajectories were then provided as input trajectories to the sander module for post-processing. Single point energies of each frame were recorded once with the forcefield and once with the ANI-1x potential. The endpoint correction calculation of the free energy is demonstrated in figure 5-7. In this figure, 'MM' stands for molecular mechanics forcefields, and 'ANI' stands for ANI potentials trained to predict quantum mechanics energies. The cumulant expansion method explained in section 2.5.1 was used to calculate the endpoint corrections. Table 5-1 shows the results of the free energy calculations.
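The second-order cumulant expansion used for the endpoint corrections estimates the MM-to-ANI free energy difference from the per-frame energy differences as ΔA ≈ ⟨ΔU⟩ − (β/2)·Var(ΔU). A minimal sketch under that assumption (the synthetic data and function name are illustrative, not the analysis script used in this work):

```python
def cumulant_correction(delta_u, temperature=300.0):
    """Second-order cumulant estimate of the free energy difference between
    two potentials, from per-frame energy differences delta_u = U_ANI - U_MM
    in kcal/mol: dA ~= <dU> - (beta/2) * Var(dU)."""
    kB = 0.0019872041                     # Boltzmann constant, kcal/(mol K)
    beta = 1.0 / (kB * temperature)
    n = len(delta_u)
    mean = sum(delta_u) / n
    var = sum((u - mean) ** 2 for u in delta_u) / n
    return mean - 0.5 * beta * var

# usage with a tiny synthetic set of per-frame energy differences
dA = cumulant_correction([-1.2, -1.0, -1.1, -0.9])
```

Evaluating this correction at both endpoints (cis and trans) and taking their difference is what converts the MM umbrella sampling result into the ANI-level estimate in the indirect cycle.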

∆A(cisMM→transMM) is the result obtained from umbrella sampling at the MM level, ∆A(cisANI→transANI)

Figure 5-7. The indirect approach to correct the free energy difference of N-methyl acetamide transitioning from the cis to the trans conformation. The free energy difference between a molecular mechanics (MM) forcefield and ANI potentials trained to replicate quantum mechanics (ANI) energies is calculated at both endpoints. The difference of these two terms is added to the conformational free energy change from cis to trans calculated at the MM level.

is the result obtained from the indirect approach by applying the endpoint corrections with

ANI-1x potential, and ∆A(cisQM→transQM) is the result of the ab initio calculations performed by Jorgensen and Gao in reference 113. All numbers are reported in units of kcal · mol−1. As demonstrated, including the QM correction brings the result close to the calculations performed at the ab initio level.

Table 5-1. Free energy difference for cis to trans conformational transition. All energies are in kcal · mol−1. The first, second, and third columns are the results obtained from MM umbrella sampling, MM umbrella sampling with ANI endpoint corrections, and ab initio calculations performed in reference 113, respectively.

∆A(cisMM→transMM)   ∆A(cisANI→transANI)   ∆A(cisQM→transQM)
1.43                2.51                  2.50
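The endpoint corrections follow the cumulant expansion of section 2.5.1 truncated at second order, ∆A(MM→ANI) ≈ ⟨∆E⟩ − (β/2)Var(∆E), with ∆E = E_ANI − E_MM evaluated on the MM-sampled frames. A minimal sketch of the arithmetic (hypothetical array names, not the production script):

```python
import numpy as np

def cumulant_correction(e_mm, e_ani, temperature=300.0):
    """Second-order cumulant estimate of dA(MM -> ANI), from single-point
    energies (kcal/mol) of the same MM-sampled frames under both potentials."""
    kB = 0.0019872041                  # Boltzmann constant, kcal/(mol K)
    beta = 1.0 / (kB * temperature)
    dE = np.asarray(e_ani) - np.asarray(e_mm)
    # dA ~= <dE> - (beta/2) Var(dE), the expansion truncated at second order
    return dE.mean() - 0.5 * beta * dE.var()
```

In the indirect cycle of figure 5-7, the corrected result is ∆A(MM) plus the trans-endpoint correction minus the cis-endpoint correction.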

5.2.2 Hydration Free Energy with ANI-Amber

The goal of this section is to perform end-state free energy calculations to correct hydration free energies. However, none of the ANI potentials so far has been trained to predict the potential energies of solute molecules in bulk water. Long-range interactions are currently absent from ANI network training, which disqualifies these potentials from accurately predicting the energetic features of solvated systems. To overcome this limitation, we used active learning

to train a potential that predicts correct energy values for clustered methane-water systems. The following sections present the details of the training and the results of this work.
5.2.2.1 Data preparation and network training

Molecular dynamics simulations of bulk water and of methane in water were conducted in Amber using the flexible simple point charge water model (SPC/Fw)114. This model introduces flexibility into the rigid simple point charge (SPC) water molecules115. ANI does not apply any constraints to internal degrees of freedom (DoF); therefore, sampling these DoFs is crucial for providing unbiased examples for training ANI neural network potentials (NNP). The SPC/Fw model allows sampling various conformations of the water molecules, in contrast to rigid water models such as TIP3P101 and extended SPC (SPC/E)116. Multiple MD simulations were performed in the NVT ensemble at temperatures chosen from 290-450 K to scan a wide range of the potential energy surface in the generated samples. The trajectories were stripped to generate snapshots of water clusters and methane-in-water clusters of 10-20 molecules. In the case of methane-in-water, the water molecules closest to methane resemble the first hydration shell around methane. Figure 5-8 demonstrates the workflow for creating the training dataset.


Figure 5-8. Data preparation workflow for creating clustered structures for AL training. First, a water box with a solute molecule is generated. Minimization, NPT, and NVT simulations are performed to relax the system and set the temperature to the desired value. Then, the excess water molecules are stripped to create the clustered solute-solvent structures. In the end, the box information is removed. The atom types and coordinates are stored with the corresponding QM energy values as labels.
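The stripping step of this workflow reduces, in essence, to keeping the n water molecules nearest the solute. A schematic numpy sketch (hypothetical helper and array names, not the cpptraj commands actually used):

```python
import numpy as np

def extract_cluster(solute_xyz, water_oxy_xyz, n_waters=15):
    """Indices of the n_waters water molecules closest to the solute,
    ranked by minimum oxygen-to-solute-atom distance (no periodic images:
    the box information has already been removed at this stage)."""
    # distance of every water oxygen to its nearest solute atom
    d = np.linalg.norm(
        water_oxy_xyz[:, None, :] - solute_xyz[None, :, :], axis=-1
    ).min(axis=1)
    return np.argsort(d)[:n_waters]
```

The selected waters plus the solute form one clustered training sample; repeating over frames and temperatures builds the data-pool.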

This sampling scheme results in a data-pool containing 200k sample coordinate files for the active learning (AL) process. After each iterative AL step, the QBC model-ensemble disagreement criterion is tested against this data-pool. A random subset of the samples that do not satisfy the QBC criterion is selected, with their corresponding QM energies as labels. The labeled data are appended to the data-set from which the AL process was initialized (i.e., ANI-1x); therefore, the level of theory for labeling the new data should be in line with the ANI-1x data-set. The QM single-point energies were calculated with Gaussian 09117 using the ωB97X density functional87 and the 6-31G(d) basis set88. The newly labeled data are appended to the existing data-set before the start of the next training iteration. This iterative process of data selection and labeling is repeated until the desired accuracy is reached.
5.2.2.2 Energy prediction results

The training process explained above has been repeated five times (i.e., five active learning iterations). To test whether the network is learning to predict correct energy values and can generalize its predictions to larger systems, a test set of 20 individual water-methane cluster structures, each containing 70 water molecules, was created. The single-point energies of these structures were calculated with Gaussian 09117 and compared with the predictions of the trained networks. Figure 5-9 shows the mean absolute error (MAE) between the network predictions and the QM energy values versus the iteration number. Iteration 0 is the ANI-1x network; this initial iteration yields an MAE of 19.24 kcal · mol−1, the direct result of using ANI-1x for the predictions. As expected, as the appropriate data-points are appended to the training set, the MAE decreases after each iteration. By the 5th iteration, the MAE drops to 3.49 kcal · mol−1. This significant decrease in the MAE indicates that the network is learning to predict the correct energy values. Continuing the training for a few more iterations is likely to produce even more accurate predictions.

[Plot: Mean absolute error (MAE, kcal/mol) versus active learning iteration, 0-5]

Figure 5-9. Mean absolute error (MAE) values of AL network predictions on the test set and their corresponding QM energy differences versus training iterations. The test set includes 20 individual structures containing water-methane clusters with 70 water molecules. Iteration 0 is the ANI-1x network, which initializes the AL process. After 5 repetitions, the MAE value has dropped from 19.24 kcal · mol−1 to 3.49 kcal · mol−1.

Figure 5-10 illustrates the cumulative number of structures in the data-pool that failed the QBC criterion after each round of training. These data-points are added to the training set before the next iteration begins. A limit of 7200 is set to avoid adding too many data-points per iteration; if more structures fail the QBC criterion in an iteration, 7200 of them are selected randomly to join the training set. The network training should continue until the cumulative sum plateaus. This state is achieved when the ensemble reaches agreement on all the structures in the data-pool. The agreement of the network ensemble on a prediction does not necessarily mean it is predicting the correct value: all the networks in the ensemble may have similarly inaccurate predictions for a sample and therefore satisfy the QBC criterion. However, failing to satisfy the QBC criterion is a definite sign that the network training has not yet succeeded. This is the main reason to include a test set to assess the accuracy of the predictions even while the QBC criterion is satisfied.
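The capped QBC selection described above can be sketched as follows (a minimal illustration using the ensemble standard deviation as the disagreement measure; the exact metric and threshold of the AL framework are not reproduced here):

```python
import numpy as np

def qbc_select(ensemble_preds, threshold, cap=7200, seed=0):
    """Indices of samples whose ensemble disagreement (std. dev. of the
    per-network energy predictions) exceeds `threshold`, capped at `cap`
    randomly chosen samples per iteration.
    ensemble_preds: array of shape (n_networks, n_samples)."""
    disagreement = np.std(ensemble_preds, axis=0)      # (n_samples,)
    failed = np.flatnonzero(disagreement > threshold)
    if len(failed) > cap:
        rng = np.random.default_rng(seed)
        failed = rng.choice(failed, size=cap, replace=False)
    return failed
```

The selected indices are the structures sent for QM labeling before the next training iteration.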

[Plot: Cumulative number of data-points versus active learning iteration, 0-5]

Figure 5-10. The cumulative sum of the number of structures added to the training set after each iteration. To avoid additions of too many data-points per iteration, the maximum limit for the total number of structures allowed to fail the QBC criteria is 7200.

CHAPTER 6
CONCLUDING REMARKS AND FUTURE DIRECTIONS
Parallel programming on graphics processing units, together with machine learning methods and the ability to quickly generate training data, has become routine across the sciences, and in computational science in particular. This thesis presents a fast implementation of the nudged elastic band method and the ANI-Amber interface, both efforts to make more time-efficient and accurate methods available for computational studies with the Amber molecular dynamics suite.
6.1 Final Remarks on Nudged Elastic Band∗

A fast implementation of the nudged elastic band (NEB) method in the particle mesh Ewald molecular dynamics (pmemd) module of the Amber software package, for both central processing units (CPU) and graphics processing units (GPU), has been presented. The accuracy of the new implementation has been validated on three cases: a conformational change of alanine dipeptide, the α-helix to β-sheet transition in polyalanine, and a large conformational transition in human 8-oxoguanine-DNA glycosylase in complex with DNA (OGG1-DNA). Timing benchmarks were performed on the explicitly solvated OGG1-DNA system containing ∼50k atoms. The GPU-optimized implementation of NEB achieves more than two orders of magnitude speedup over the previous CPU implementation running on two CPU cores. The speed and scalability of this implementation will enable NEB applications on larger and more complex systems. Fast and accurate studies of conformational transitions in biological systems can advance computational drug discovery by assisting the rational design of novel drug molecules. Efficient sampling algorithms are pivotal in identifying transition pathways on multidimensional rugged energy surfaces with numerous degrees of freedom.

∗ Section 6.1 was reprinted/adapted with permission from Ghoreishi, D.; Cerutti, D. S.; Fallon, Z.; Simmerling, C.; Roitberg, A. E. Fast Implementation of the Nudged Elastic Band Method in AMBER. J. Chem. Theory Comput. 2019, 15, 4699–4707.

Chain-of-states methods like NEB can adjust to the available computational resources to efficiently predict transition pathways with the required precision. The new implementation takes advantage of the efficient vectorization of commodity GPUs and is easily extendable to other chain-of-states methods for transition path sampling.
6.2 Final Remarks on Free Energy Calculations with ANI-Amber

The ANI-Amber interface has been introduced in this thesis. This interface allows computational scientists to perform molecular dynamics with fast neural network potentials. Efficient sampling is no longer a concern, as long as the selected network represents the potential energy surface accurately. We intend to use the ANI-Amber interface to predict hydration free energies at the QM level of theory. The absence of long-range interactions from ANI potentials makes them ineligible for such applications as-is; however, training the network to produce correct energy values for clustered structures makes it possible to replicate the intermolecular interactions. The active learning framework for training a network for this purpose is presented in section 5.2.2.1. The results presented in section 5.2.2.2 are part of an ongoing project, and it is our intention to continue this research. After the AL training is complete and the network is prepared, a workflow similar to that of section 5.2.1 will be performed. The free energy calculation process is automated and ready to be used. The BAR method explained in section 2.5.3 is expected to result in better convergence, since it requires less overlap between the structures sampled by the two potentials. It is common practice to use FEP instead, since performing simulations on a system containing hundreds of atoms is computationally demanding; however, ANI potentials are significantly faster, and hence more affordable, than traditional QM methods. The inclusion of ANI potentials in Amber for hybrid QM/MM calculations could be an additional future direction. The challenging aspect of such an implementation is the correct representation of the electrostatic interactions between the QM and MM regions.

The ongoing efforts to train a network that can predict partial charges could make this implementation possible.
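The equal-sample self-consistent form of the BAR estimator (section 2.5.3) can be sketched as follows; this simplified version omits the ln(n_F/n_R) term and the variance estimate of the full method, and the names are hypothetical:

```python
import numpy as np

def bar_delta_a(w_f, w_r, beta, tol=1e-8):
    """Bennett acceptance ratio, equal-sample special case: solve
    sum_i f(beta*(w_f_i - dA)) = sum_j f(beta*(w_r_j + dA)) for dA,
    where f is the Fermi function, w_f = U1 - U0 on frames from state 0,
    and w_r = U0 - U1 on frames from state 1 (energies in kcal/mol)."""
    f = lambda x: 1.0 / (1.0 + np.exp(x))
    # g(dA) is monotonically increasing, so bisection finds the unique root
    g = lambda dA: f(beta * (w_f - dA)).sum() - f(beta * (w_r + dA)).sum()
    lo, hi = -50.0, 50.0      # bracket (kcal/mol); widen for larger |dA|
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0.0 else (lo, mid)
    return 0.5 * (lo + hi)
```

With MM and ANI single-point energies for frames sampled under each potential, this yields the endpoint free energy differences of the indirect cycle.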

APPENDIX A
KABSCH ALGORITHM
The Kabsch algorithm aims to find an orthogonal matrix U that minimizes the objective function

E = (1/2) Σ_{i=1}^{N} ω_i (U y_i − x_i)²    (A-1)

in which {x_i}_{i=1}^{N} and {y_i}_{i=1}^{N} are two sets of N-dimensional vectors, and {ω_i}_{i=1}^{N} is a set of predefined weights. The orthogonality of matrix U imposes the constraint:

Σ_{k=1}^{N} u_ki u_kj = δ_ij    (A-2)

Hence, by the use of the Lagrange multiplier matrix L, the restricted minimization problem turns into:

G = E + F    (A-3)

in which

F = (1/2) Σ_{i,j=1}^{N} l_ij ( Σ_{k=1}^{N} u_ki u_kj − δ_ij )    (A-4)

Following Kabsch's solution53 we can show that U should obey:

U(S + L) = R    (A-5)

in which r_ij and s_ij, the elements of R and S, are defined via:

r_ij = Σ_{k=1}^{N} ω_k y_ki x_kj    (A-6a)

s_ij = Σ_{k=1}^{N} ω_k x_ki x_kj    (A-6b)

The solutions for the Lagrange multipliers can be obtained from equation A-7 and the orthogonality of U:

RᵀR = [U(S + L)]ᵀ U(S + L) = (S + L)ᵀ UᵀU (S + L) = (S + L)(S + L)    (A-7)

RᵀR is symmetric and positive definite, with positive eigenvalues λ_i and corresponding eigenvectors a_i. Similarly, S + L is symmetric and positive definite, with eigenvalues √λ_i and the same eigenvectors a_i, and

l_ij = Σ_{k=1}^{N} √λ_k a_ki a_kj − s_ij    (A-8)

We can write:

R a_k = U(S + L) a_k = √λ_k U a_k = √λ_k b_k    (A-9)

from which the solution for U can be derived:

u_ij = Σ_{k=1}^{N} b_ki a_kj    (A-10)
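In practice the optimal rotation is often obtained through a singular value decomposition, which yields the same U while allowing an explicit guard against improper rotations. A minimal numpy sketch (a generic illustration, not the Amber implementation):

```python
import numpy as np

def kabsch(x, y, w=None):
    """Rotation U minimizing sum_i w_i |U y_i - x_i|^2 over proper
    rotations, for N x 3 coordinate arrays x and y (already centered)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    w = np.ones(len(x)) if w is None else np.asarray(w, float)
    c = (w[:, None] * x).T @ y              # 3x3 weighted correlation matrix
    u_svd, _, vt = np.linalg.svd(c)
    d = np.sign(np.linalg.det(u_svd @ vt))  # -1 would indicate a reflection
    return u_svd @ np.diag([1.0, 1.0, d]) @ vt
```

Applied to NEB image fitting, x and y would be the reference and current image coordinates after removing the centers of mass.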

APPENDIX B
PARAMETERIZATION OF A CURVE

Definition: A parametrized curve in Rⁿ is a function γ : I → Rⁿ from an interval I to Rⁿ.
B.1 Re-parameterization of a Curve

A curve in Rⁿ can be traced out in different ways, going from one parameterization to another. Consider a curve γ : [a, b] → Rⁿ. Also consider a function α : [a′, b′] → [a, b], where α is differentiable with a continuous derivative α′ and α′ ≠ 0. Then α is a one-to-one function from τ ∈ [a′, b′] to t ∈ [a, b] such that α(τ) = t. Now α can be used to re-parameterize the curve γ:

γ(t) = γ(α(τ)) = (γ∘α)(τ) = γ̃(τ),   α′(τ) = dt/dτ    (B-1)

γ̃ is a re-parameterization of γ. Using the chain rule gives

γ̃′(τ) = (dγ̃₁/dτ, ..., dγ̃ₙ/dτ)
       = (γ₁′(t) dt/dτ, ..., γₙ′(t) dt/dτ)    (B-2)
       = α′(τ) (γ₁′(t), ..., γₙ′(t))
       = α′(τ) γ′(t)

B.2 Arclength of a Curve

Assuming a particle moves on a curve γ, its position along the curve can be written as γ(t). Then γ′(t) represents the velocity of the moving particle and |γ′(t)| its speed.

Choosing t0 as the starting point, the arclength of the curve from t0 to t is defined as

s = l(t) = ∫_{t₀}^{t} |γ′(t)| dt    (B-3)

and

ds/dt = |γ′(t)|    (B-4)

Representing the curve as γ(t) = (x₁, ..., xₙ), with γᵢ′(t) = dxᵢ/dt, we get

ds/dt = √( (dx₁/dt)² + ... + (dxₙ/dt)² )    (B-5)

B.3 Arclength Parameterization

Assume a particle moves along a curve γ(t) starting from t₀ ∈ I. Since the speed |γ′(t)| > 0, s = l(t) is an increasing one-to-one function from I to another interval J with

l(t₀) = 0. Now consider the inverse function α = l⁻¹, where α : J → I with α(s) = t. Then γ̃ = γ∘α is the arclength re-parameterization of the curve118, with

α′(s) = 1/l′(t) = 1/|γ′(t)|    (B-6)

From the chain rule we have

γ̃′(s) = γ′(t) α′(s) = γ′(t)/|γ′(t)|    (B-7)

hence

|γ̃′(s)| = 1    (B-8)

This means that whenever a curve is parametrized by its arclength, the speed of the movement always equals one. It follows that

∫_{t₀}^{t} |γ̃′(t)| dt = t − t₀    (B-9)

This means that the arclength of the curve equals the difference between the ending and starting parameter values.
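Numerically, the arclength parameterization above corresponds to resampling a discretized path at equal arclength intervals, the spacing that chain-of-states methods aim to maintain between images. A small numpy sketch (hypothetical helper name):

```python
import numpy as np

def reparam_by_arclength(points, n_out):
    """Resample a polyline (m x d array) to n_out points equally spaced
    in arclength, using linear interpolation per coordinate."""
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])     # cumulative arclength
    s_new = np.linspace(0.0, s[-1], n_out)          # equal spacing in s
    return np.column_stack(
        [np.interp(s_new, s, points[:, k]) for k in range(points.shape[1])]
    )
```

After resampling, consecutive output points are (to linear-interpolation accuracy) separated by equal arclength, i.e., the discrete analogue of |γ̃′(s)| = 1.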

APPENDIX C
DERIVATION OF EQUATION (2-42)
Using the identity

∫₀^α F′(α) dα = F(α) − F(0)    (C-1)

we can get

F(α) − F(0) = −k_B T [ln Z(α) − ln Z(0)]
            = −k_B T ∫₀^α (∂ ln Z(α)/∂α) dα
            = −k_B T ∫₀^α (1/Z(α)) (∂Z(α)/∂α) dα
            = −k_B T ∫₀^α [ ∫_{S(α)} (∂/∂α) exp(−βV(q)) dσ / ∫_{S(α)} exp(−βV(q)) dσ ] dα
            = −k_B T ∫₀^α [ ∫_{S(α)} (−β ∂V(q)/∂α) exp(−βV(q)) dσ / ∫_{S(α)} exp(−βV(q)) dσ ] dα    (C-2)
            = ∫₀^α ⟨ ∂V(q)/∂α ⟩ dα
            = ∫₀^α ⟨ ∇V · ∂q/∂α ⟩ dα
            = ∫₀^α ⟨ ∇V · t̂ (t̂ · φ_α) ⟩ dα
            = ∫₀^α ⟨ (∇V · t̂) ((t̂ · φ)_α − t̂_α · φ) ⟩ dα

which is equation 2-42.

APPENDIX D
PENALTY METHOD
The penalty method attempts to approximate a constrained optimization problem with an unconstrained problem119,120. A penalty term added to the objective function is responsible for imposing a high cost on any constraint violation. Assume the optimization problem

Minimize { f(x) : x ∈ S }    (D-1)

where f : Rⁿ → R and S ⊆ Rⁿ is the constraint set. The penalty method replaces the optimization problem in equation D-1 by

Minimize { f(x) + c P(x) }    (D-2)

where c, the penalty parameter, is a positive constant, and P is a continuous function with

P(x) ≥ 0 for all x ∈ Rⁿ,   P(x) = 0 ⟺ x ∈ S

If multiple constraints exist, it is advisable to scale them so that each contributes a comparable amount to the penalty function; otherwise, the method will steer toward solutions that satisfy the dominant constraint rather than toward the true minimum. Care must also be taken in choosing the penalty parameter. A small value weakens the penalty function, while a large value creates steep functions at the constraint boundaries, which can cause convergence difficulties unless the search starts from a point close to the local minimum. Moreover, what counts as a large or small penalty parameter depends entirely on the particular optimization problem, which is challenging because the penalized function changes dynamically with the value of x and with which constraints are violated.
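A minimal numeric illustration of the method (a hypothetical one-dimensional problem solved with plain gradient descent): minimize f(x) = (x − 2)² subject to x ≤ 1, with the quadratic penalty P(x) = max(0, x − 1)². The penalized minimizer is (2 + c)/(1 + c), which approaches the constraint boundary x = 1 as c grows.

```python
def penalty_minimize(c, x0=3.0, lr=0.004, steps=500):
    """Gradient descent on f(x) + c*P(x), with f(x) = (x - 2)^2 and the
    constraint x <= 1 enforced by the quadratic penalty P(x) = max(0, x-1)^2."""
    x = x0
    for _ in range(steps):
        # d/dx [ (x-2)^2 + c*max(0, x-1)^2 ]
        grad = 2.0 * (x - 2.0) + 2.0 * c * max(0.0, x - 1.0)
        x -= lr * grad
    return x
```

Note the trade-off discussed above: the step size lr must shrink as c grows, since the penalized function becomes increasingly steep near the boundary.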

APPENDIX E
TWO DIMENSIONAL TEST POTENTIALS
E.1 LEPS Potential

The LEPS14 model gives the energy of a system of three atoms constrained to move along a straight line. Only one bond, either between atoms A and B or between atoms B and C, can be formed. The potential function is

V^LEPS(r_AB, r_BC) = Q_AB/(1+a) + Q_BC/(1+b) + Q_AC/(1+c)
    − [ J²_AB/(1+a)² + J²_BC/(1+b)² + J²_AC/(1+c)²
        − J_AB J_BC/((1+a)(1+b)) − J_BC J_AC/((1+b)(1+c)) − J_AB J_AC/((1+a)(1+c)) ]^{1/2}    (E-1)

Q functions represent electron-nuclei interactions and J functions represent electron-electron exchange interactions.

Q(r) = (d/2) ( (3/2) exp(−2α(r − r₀)) − exp(−α(r − r₀)) )
J(r) = (d/4) ( exp(−2α(r − r₀)) − 6 exp(−α(r − r₀)) )    (E-2)

In the equations above the parameters are

a = 0.05,  b = 0.30,  c = 0.05,
d_AB = 4.746,  d_BC = 4.746,  d_AC = 3.445,    (E-3)
r₀ = 0.742,  α = 1.942
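A direct transcription of equations E-1-E-3 for the collinear case (r_AC = r_AB + r_BC); a sketch useful for plotting the surface or exercising path-finding methods, not code from this work:

```python
import numpy as np

# LEPS parameters from equation E-3
A, B, C = 0.05, 0.30, 0.05
D = {"AB": 4.746, "BC": 4.746, "AC": 3.445}
R0, ALPHA = 0.742, 1.942

def q(r, d):  # Coulomb-type term, equation E-2
    return d / 2.0 * (1.5 * np.exp(-2 * ALPHA * (r - R0)) - np.exp(-ALPHA * (r - R0)))

def j(r, d):  # exchange-type term, equation E-2
    return d / 4.0 * (np.exp(-2 * ALPHA * (r - R0)) - 6 * np.exp(-ALPHA * (r - R0)))

def leps(r_ab, r_bc):
    """Collinear LEPS energy (equation E-1) with r_AC = r_AB + r_BC."""
    r_ac = r_ab + r_bc
    qs = (q(r_ab, D["AB"]) / (1 + A) + q(r_bc, D["BC"]) / (1 + B)
          + q(r_ac, D["AC"]) / (1 + C))
    jab, jbc, jac = j(r_ab, D["AB"]), j(r_bc, D["BC"]), j(r_ac, D["AC"])
    cross = (jab**2 / (1 + A) ** 2 + jbc**2 / (1 + B) ** 2 + jac**2 / (1 + C) ** 2
             - jab * jbc / ((1 + A) * (1 + B))
             - jbc * jac / ((1 + B) * (1 + C))
             - jab * jac / ((1 + A) * (1 + C)))
    return qs - np.sqrt(cross)
```

For example, an A-B bond at its equilibrium length with C far away lies several units below the fully dissociated arrangement, reproducing the two bound valleys of the surface.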

E.2 LEPS Harmonic Oscillator Potential

The LEPS plus harmonic oscillator potential14 represents the energy of a system of four atoms. Atoms A and C are fixed. Atom B is allowed to move on the line connecting A and C and can form only one bond, either with A or with C. Another degree of freedom is introduced, in the form of a harmonic oscillator, by adding an atom D coupled to B. The potential function is

V(r_AB, y) = V^LEPS(r_AB, r_AC − r_AB) + 2k_c ( r_AB − r_AC/2 + y/c )²    (E-4)

The parameters are the same as in the LEPS model except for

r_AC = 3.742,  k_c = 0.2025,  c = 1.154,  b = 0.80    (E-5)

APPENDIX F
ALANINE DIPEPTIDE CONFORMATIONAL CHANGE

[Plot: Φ (Degree) versus Replica ID, 0-16, for pmemd and sander]

Figure F-1. The φ dihedral angle change in conformational transition of alanine dipeptide. The light blue line represents simulations done with the sander and the red line represents simulations done with the pmemd-CPU. The lines in the plot are averaged over individual simulations and the vertical bars represent the standard deviations along the transition region.

[Plot: Ψ (Degree) versus Replica ID, 0-16, for pmemd and sander]

Figure F-2. The ψ dihedral angle change in conformational transition of alanine dipeptide. The light blue line represents simulations done with the sander and the red line represents simulations done with the pmemd-CPU. The lines in the plot are averaged over individual simulations and the vertical bars represent the standard deviations along the transition region.

REFERENCES
[1] Mckee, M. L.; Page, M. Computing Reaction Pathways on Molecular Potential Energy Surfaces; John Wiley & Sons, Ltd, 2007; pp 35–65.
[2] Henkelman, G.; Jóhannesson, G.; Jónsson, H. Theoretical Methods in Condensed Phase Chemistry; Kluwer Academic Publishers: Dordrecht, 2002; pp 269–302.
[3] E, W.; Vanden-Eijnden, E. Towards a Theory of Transition Paths. Journal of Statistical Physics 2006, 123, 503–523.
[4] E, W.; Ren, W.; Vanden-Eijnden, E. Transition pathways in complex systems: Reaction coordinates, isocommittor surfaces, and transition tubes. Chemical Physics Letters 2005, 413, 242–247.
[5] Cerjan, C. J.; Miller, W. H. On finding transition states. The Journal of Chemical Physics 1981, 75, 2800–2806.
[6] Nguyen, D. T.; Case, D. A. On finding stationary states on large-molecule potential energy surfaces. The Journal of Physical Chemistry 1985, 89, 4020–4026.
[7] Quapp, W. A gradient-only algorithm for tracing a reaction path uphill to the saddle of a potential energy surface. Chemical Physics Letters 1996, 253, 286–292.
[8] Henkelman, G.; Jónsson, H. A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives. The Journal of Chemical Physics 1999, 111, 7010–7022.
[9] Malek, R.; Mousseau, N. Dynamics of Lennard-Jones clusters: A characterization of the activation-relaxation technique. Physical Review E 2000, 62, 7723–7728.
[10] Taylor, H.; Simons, J. Imposition of geometrical constraints on potential energy surface walking procedures. The Journal of Physical Chemistry 1985, 89, 684–688.
[11] Baker, J. An algorithm for the location of transition states. Journal of Computational Chemistry 1986, 7, 385–395.
[12] Sevick, E. M.; Bell, A. T.; Theodorou, D. N. A chain of states method for investigating infrequent event processes occurring in multistate, multidimensional systems. The Journal of Chemical Physics 1993, 98, 3196–3212.
[13] Gillilan, R. E.; Wilson, K. R. Shadowing, rare events, and rubber bands. A variational Verlet algorithm for molecular dynamics. The Journal of Chemical Physics 1992, 97, 1757–1772.
[14] Jónsson, H.; Mills, G.; Jacobsen, K. W. Nudged elastic band method for finding minimum energy paths of transitions. Classical and Quantum Dynamics in Condensed Phase Simulations, 1998; pp 385–404.

[15] E, W.; Ren, W.; Vanden-Eijnden, E. String method for the study of rare events. Physical Review B 2002, 66, 052301.
[16] Bergonzo, C.; Campbell, A. J.; Walker, R. C.; Simmerling, C. A partial nudged elastic band implementation for use with large or explicitly solvated systems. International Journal of Quantum Chemistry 2009, 109, 3781–3790.
[17] Herbol, H. C.; Stevenson, J.; Clancy, P. Computational Implementation of Nudged Elastic Band, Rigid Rotation, and Corresponding Force Optimization. Journal of Chemical Theory and Computation 2017, 13, 3250–3259.
[18] Czerminski, R.; Elber, R. Self-avoiding walk between two fixed points as a tool to calculate reaction paths in large molecular systems. International Journal of Quantum Chemistry 1990, 38, 167–185.
[19] Choi, C.; Elber, R. Reaction path study of helix formation in tetrapeptides: Effect of side chains. The Journal of Chemical Physics 1991, 94, 751–760.
[20] Elber, R.; Karplus, M. A method for determining reaction paths in large molecules: Application to myoglobin. Chemical Physics Letters 1987, 139, 375–380.
[21] Crehuet, R.; Field, M. J. A temperature-dependent nudged-elastic-band algorithm. The Journal of Chemical Physics 2003, 118, 9563–9571.
[22] Case, D. et al. AMBER 18; University of California, San Francisco, 2018.
[23] Szabo, A.; Ostlund, N. S. Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory; Dover Publications, 1996; p 466.
[24] Davies, M. et al. Challenges; MDPI AG, 2014; Vol. 5; pp 1–4.
[25] Löwdin, P.-O. Correlation Problem in Many-Electron Quantum Mechanics I. Review of Different Approaches and Discussion of Some Current Ideas; Wiley-Blackwell, 2007; pp 207–322.
[26] Dirac, P. A. M. Quantum Mechanics of Many-Electron Systems. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 1929, 123, 714–733.
[27] Møller, C.; Plesset, M. S. Note on an Approximation Treatment for Many-Electron Systems. Physical Review 1934, 46, 618–622.
[28] Kitaura, K.; Morokuma, K. A new energy decomposition scheme for molecular interactions within the Hartree-Fock approximation. International Journal of Quantum Chemistry 1976, 10, 325–340.
[29] Dalgarno, A.; Victor, G. A. The Time-Dependent Coupled Hartree-Fock Approximation. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 1966, 291, 291–295.

[30] Sellers, B. D.; James, N. C.; Gobbi, A. A Comparison of Quantum and Molecular Mechanical Methods to Estimate Strain Energy in Druglike Fragments. Journal of Chemical Information and Modeling 2017, 57, 1265–1275.
[31] Poater, J.; Solà, M.; Duran, M.; Fradera, X. The calculation of electron localization and delocalization indices at the Hartree-Fock, density functional and post-Hartree-Fock levels of theory. Theoretical Chemistry Accounts: Theory, Computation, and Modeling (Theoretica Chimica Acta) 2002, 107, 362–371.
[32] Johnson, E. R.; Becke, A. D. A post-Hartree-Fock model of intermolecular interactions. The Journal of Chemical Physics 2005, 123, 024101.
[33] Becke, A. D. A new mixing of Hartree-Fock and local density-functional theories. The Journal of Chemical Physics 1993, 98, 1372–1377.
[34] Ayers, P. W.; Yang, W. Density-Functional Theory. 2003, 103–132.
[35] Runge, E.; Gross, E. K. U. Density-Functional Theory for Time-Dependent Systems. Physical Review Letters 1984, 52, 997–1000.
[36] Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins: Structure, Function, and Bioinformatics 2006, 65, 712–725.
[37] MacKerell, A. D.; Banavali, N. K. All-atom empirical force field for nucleic acids: II. Application to molecular dynamics simulations of DNA and RNA in solution. Journal of Computational Chemistry 2000, 21, 105–120.
[38] Purvis, G. D.; Bartlett, R. J. A full coupled-cluster singles and doubles model: The inclusion of disconnected triples. The Journal of Chemical Physics 1982, 76, 1910–1918.
[39] Bartlett, R. J.; Musiał, M. Coupled-cluster theory in quantum chemistry. Reviews of Modern Physics 2007, 79, 291–352.
[40] Hobza, P.; Šponer, J. Toward True DNA Base-Stacking Energies: MP2, CCSD(T), and Complete Basis Set Calculations. 2002.
[41] Řezáč, J.; Riley, K. E.; Hobza, P. Extensions of the S66 Data Set: More Accurate Interaction Energies and Angular-Displaced Nonequilibrium Geometries. Journal of Chemical Theory and Computation 2011, 7, 3466–3470.
[42] Feller, D.; Dixon, D. A. Extended benchmark studies of coupled cluster theory through triple excitations. The Journal of Chemical Physics 2001, 115, 3484–3496.
[43] Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and testing of a general amber force field. Journal of Computational Chemistry 2004, 25, 1157–1174.

[44] Karplus, M.; McCammon, J. A. Molecular dynamics simulations of biomolecules. Nature Structural Biology 2002, 9, 646–652.
[45] Karplus, M.; Petsko, G. A. Molecular dynamics simulations in biology. Nature 1990, 347, 631–639.
[46] Liverani, C.; Wojtkowski, M. P. Ergodicity in Hamiltonian Systems; Springer: Berlin, Heidelberg, 1995; pp 130–202.
[47] Walters, P. An Introduction to Ergodic Theory; Springer-Verlag, 2000; p 250.
[48] Compton, A. H.; Heisenberg, W. The Physical Principles of the Quantum Theory; Springer Berlin Heidelberg: Berlin, Heidelberg, 1984; pp 117–166.
[49] Bose, S. N. Plancks Gesetz und Lichtquantenhypothese. Zeitschrift für Physik 1924, 26, 178–181.
[50] Chen, D.; Costello, L. L.; Geller, C. B.; Zhu, T.; McDowell, D. L. Atomistic modeling of dislocation cross-slip in nickel using free-end nudged elastic band method. Acta Mater. 2019, 168, 436–447.
[51] Zhu, T.; Li, J.; Samanta, A.; Leach, A.; Gall, K. Temperature and Strain-Rate Dependence of Surface Dislocation Nucleation. Phys. Rev. Lett. 2008, 100, 025502.
[52] Zhu, T.; Li, J.; Samanta, A.; Kim, H. G.; Suresh, S. Interfacial plasticity governs strain rate sensitivity and ductility in nanostructured metals. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 3031–3036.
[53] Kabsch, W. A solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A 1976, 32, 922–923.
[54] Henkelman, G.; Jónsson, H. Improved tangent estimate in the nudged elastic band method for finding minimum energy paths and saddle points. The Journal of Chemical Physics 2000, 113, 9978–9985.
[55] Bergonzo, C.; Campbell, A. J.; de los Santos, C.; Grollman, A. P.; Simmerling, C. Energetic Preference of 8-oxoG Eversion Pathways in a DNA Glycosylase. Journal of the American Chemical Society 2011, 133, 14504–14506.
[56] Li, H.; Endutkin, A. V.; Bergonzo, C.; Fu, L.; Grollman, A.; Zharkov, D. O.; Simmerling, C. DNA Deformation-Coupled Recognition of 8-Oxoguanine: Conformational Kinetic Gating in Human DNA Glycosylase. Journal of the American Chemical Society 2017, 139, 2682–2692.
[57] Torrie, G.; Valleau, J. Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. Journal of Computational Physics 1977, 23, 187–199.
[58] Pechukas, P. Transition state theory. Annual Review of Physical Chemistry 1981, 32, 159–177.

89 [59] Gardiner, C. W. Handbook of stochastic methods for physics, chemistry and the natural sciences, vol. 13 of. Springer series in synergetics 1985, [60] Zwanzig, R. W. HighTemperature Equation of State by a Perturbation Method. I. Nonpolar Gases. The Journal of Chemical Physics 1954, 22, 1420–1426. [61] Ma, S.-K. Statistical Mechanics. World Scientific 1985, [62] Bennett, C. H. Efficient estimation of free energy differences from Monte Carlo data. J. Comput. Phys. 1976, 22, 245–268. [63] Gao, J.; Freindorf, M. Hybrid ab Initio QM/MM Simulation of N-Methylacetamide in Aqueous Solution. 1997, [64] Gao, J. Absolute free energy of solvation from Monte Carlo simulations using combined quantum and molecular mechanical potentials. J. Phys. Chem. 1992, 96, 537–540. [65] Wesolowski, T.; Warshel, A. Ab Initio Free Energy Perturbation Calculations of Solvation Free Energy Using the Frozen Density Functional Approach. J. Phys. Chem. 1994, 98, 5183–5187. [66] Luzhkov, V.; Warshel, A. Microscopic models for quantum mechanical calculations of chemical processes in solutions: LD/AMPAC and SCAAS/AMPAC calculations of solvation energies. J. Comput. Chem. 1992, 13, 199–213. [67] K¨onig,G.; Hudson, P. S.; Boresch, S.; Woodcock, H. L. Multiscale Free Energy Simulations: An Efficient Method for Connecting Classical MD Simulations to QM or QM/MM Free Energies Using Non-Boltzmann Bennett Reweighting Schemes. J. Chem. Theory Comput. 2014, 10, 1406–1419. [68] Hudson, P. S.; Boresch, S.; Rogers, D. M.; Woodcock, H. L. Accelerating QM/MM Free Energy Computations via Intramolecular Force Matching. J. Chem. Theory Comput. 2018, 14, 6327–6335. [69] Giese, T. J.; York, D. M. Development of a Robust Indirect Approach for MM QM Free Energy Calculations That Combines Force-Matched Reference Potential and Bennett’s Acceptance Ratio Methods. J. Chem. Theory Comput. 2019, acs.jctc.9b00401. [70] Wang, M.; Mei, Y.; Ryde, U. 
HostGuest Relative Binding Affinities at Density-Functional Theory Level from Semiempirical Molecular Dynamics Simulations. J. Chem. Theory Comput. 2019, 15, 2659–2671. [71] Beierlein, F. R.; Michel, J.; Essex, J. W. A Simple QM/MM Approach for Capturing Polarization Effects in ProteinLigand Binding Free Energy Calculations. J. Phys. Chem. B 2011, 115, 4911–4926. [72] Fox, S. J.; Pittock, C.; Tautermann, C. S.; Fox, T.; Christ, C.; Malcolm, N. O. J.; Essex, J. W.; Skylaris, C.-K. Free Energies of Binding from Large-Scale First-Principles

Quantum Mechanical Calculations: Application to Ligand Hydration Energies. J. Phys. Chem. B 2013, 117, 9478–9485. [73] Rod, T. H.; Ryde, U. Accurate QM/MM Free Energy Calculations of Enzyme Reactions: Methylation by Catechol O-Methyltransferase. 2005. [74] Rod, T. H.; Ryde, U. Quantum Mechanical Free Energy Barrier for an Enzymatic Reaction. Phys. Rev. Lett. 2005, 94, 138302. [75] Bishop, C. M. Pattern Recognition and Machine Learning; Springer, 2006. [76] Kline, D. M.; Berardi, V. L. Revisiting squared-error and cross-entropy functions for training neural network classifiers. Neural Comput. Appl. 2005, 14, 310–318. [77] Falas, T.; Stafylopatis, A.-G. The impact of the error function selection in neural network-based classifiers. In IJCNN'99: International Joint Conference on Neural Networks Proceedings (Cat. No. 99CH36339); 1999; pp 1799–1804. [78] Cohn, D. A.; Ghahramani, Z.; Jordan, M. I. Active Learning with Statistical Models. J. Artif. Intell. Res. 1996, 4, 129–145. [79] Fukumizu, K. Statistical active learning in multilayer perceptrons. IEEE Trans. Neural Networks 2000, 11, 17–26. [80] Freund, Y.; Seung, H. S.; Shamir, E.; Tishby, N. Selective Sampling Using the Query by Committee Algorithm. Mach. Learn. 1997, 28, 133–168. [81] Pan, S. J.; Yang, Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359. [82] Smith, J. S.; Isayev, O.; Roitberg, A. E. ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 2017, 8, 3192–3203. [83] Smith, J. S.; Isayev, O.; Roitberg, A. E. ANI-1, A data set of 20 million calculated off-equilibrium conformations for organic molecules. Sci. Data 2017, 4, 170193. [84] Smith, J. S.; Nebgen, B. T.; Zubatyuk, R.; Lubbers, N.; Devereux, C.; Barros, K.; Tretiak, S.; Isayev, O.; Roitberg, A. Outsmarting Quantum Chemistry Through Transfer Learning. 2018. [85] Smith, J. S.; Nebgen, B.; Lubbers, N.; Isayev, O.; Roitberg, A. E.
Less is more: Sampling chemical space with active learning. J. Chem. Phys. 2018, 148, 241733. [86] Behler, J.; Parrinello, M. Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces. Phys. Rev. Lett. 2007, 98, 146401. [87] Chai, J.-D.; Head-Gordon, M. Systematic optimization of long-range corrected hybrid density functionals. J. Chem. Phys. 2008, 128, 084106.

[88] Hehre, W. J.; Ditchfield, R.; Pople, J. A. Self-Consistent Molecular Orbital Methods. XII. Further Extensions of Gaussian-Type Basis Sets for Use in Molecular Orbital Studies of Organic Molecules. J. Chem. Phys. 1972, 56, 2257–2261. [89] Reymond, J.-L. The Chemical Space Project. Acc. Chem. Res. 2015, 48, 722–730. [90] Ruddigkeit, L.; van Deursen, R.; Blum, L. C.; Reymond, J.-L. Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database GDB-17. J. Chem. Inf. Model. 2012, 52, 2864–2875. [91] Fink, T.; Reymond, J.-L. Virtual Exploration of the Chemical Universe up to 11 Atoms of C, N, O, F: Assembly of 26.4 Million Structures (110.9 Million Stereoisomers) and Analysis for New Ring Systems, Stereochemistry, Physicochemical Properties, Compound Classes, and Drug Discovery Libraries. 2007. [92] Fink, T.; Bruggesser, H.; Reymond, J.-L. Virtual Exploration of the Small-Molecule Chemical Universe below 160 Daltons. Angew. Chemie Int. Ed. 2005, 44, 1504–1508. [93] Riplinger, C.; Pinski, P.; Becker, U.; Valeev, E. F.; Neese, F. Sparse maps - A systematic infrastructure for reduced-scaling electronic structure methods. II. Linear scaling domain based pair natural orbital coupled cluster theory. J. Chem. Phys. 2016, 144, 024109. [94] Neese, F. The ORCA program system. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2012, 2, 73–78. [95] Ghoreishi, D.; Cerutti, D. S.; Fallon, Z.; Simmerling, C.; Roitberg, A. E. Fast Implementation of the Nudged Elastic Band Method in AMBER. J. Chem. Theory Comput. 2019, 15, 4699–4707. [96] Meng, Y.; Sabri Dashti, D.; Roitberg, A. E. Computing Alchemical Free Energy Differences with Hamiltonian Replica Exchange Molecular Dynamics (H-REMD) Simulations. J. Chem. Theory Comput. 2011, 7, 2721–2727. [97] Swails, J. M.; York, D. M.; Roitberg, A. E. Constant pH Replica Exchange Molecular Dynamics in Explicit Solvent Using Discrete Protonation States: Implementation, Testing, and Validation. J. Chem. Theory Comput. 2014, 10, 1341–1352.
[98] Bergonzo, C.; Henriksen, N. M.; Roe, D. R.; Swails, J. M.; Roitberg, A. E.; Cheatham, T. E. Multidimensional Replica Exchange Molecular Dynamics Yields a Converged Ensemble of an RNA Tetranucleotide. J. Chem. Theory Comput. 2014, 10, 492–499. [99] Cruzeiro, V. W. D.; Amaral, M. S.; Roitberg, A. E. Redox potential replica exchange molecular dynamics at constant pH in AMBER: Implementation and validation. J. Chem. Phys. 2018, 149, 072338. [100] Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.; Hauser, K. E.; Simmerling, C. ff14SB: Improving the Accuracy of Protein Side Chain and Backbone

Parameters from ff99SB. J. Chem. Theory Comput. 2015, 11, 3696–3713. [101] Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926–935. [102] Li, H.; Endutkin, A. V.; Bergonzo, C.; Campbell, A. J.; de los Santos, C.; Grollman, A.; Zharkov, D. O.; Simmerling, C. A dynamic checkpoint in oxidative lesion discrimination by formamidopyrimidine–DNA glycosylase. Nucleic Acids Res. 2016, 44, 683–694. [103] Chekmarev, D. S.; Ishida, T.; Levy, R. M. Long-Time Conformational Transitions of Alanine Dipeptide in Aqueous Solution: Continuous and Discrete-State Kinetic Models. J. Phys. Chem. B 2004, 108, 19487–19495. [104] Nguyen, H.; Roe, D. R.; Simmerling, C. Improved Generalized Born Solvent Model Parameters for Protein Simulations. J. Chem. Theory Comput. 2013, 9, 2020–2034. [105] Mathews, D. H.; Case, D. A. Nudged elastic band calculation of minimal energy paths for the conformational change of a GG non-canonical pair. J. Mol. Biol. 2006, 357, 1683–1693. [106] Mongan, J.; Simmerling, C.; McCammon, J. A.; Case, D. A.; Onufriev, A. Generalized Born Model with a Simple, Robust Molecular Volume Correction. 2006. [107] Cheng, X.; Kelso, C.; Hornak, V.; de los Santos, C.; Grollman, A. P.; Simmerling, C. Dynamic Behavior of DNA Base Pairs Containing 8-Oxoguanine. 2005. [108] Song, K.; Hornak, V.; de los Santos, C.; Grollman, A. P.; Simmerling, C. Computational Analysis of the Mode of Binding of 8-Oxoguanine to Formamidopyrimidine-DNA Glycosylase. 2006. [109] Song, K.; Campbell, A. J.; Bergonzo, C.; de los Santos, C.; Grollman, A. P.; Simmerling, C. An Improved Reaction Coordinate for Nucleic Acid Base Flipping Studies. J. Chem. Theory Comput. 2009, 5, 3105–3113. [110] Le Grand, S.; Götz, A. W.; Walker, R. C.
SPFP: Speed without compromise – A mixed precision model for GPU accelerated molecular dynamics simulations. Comput. Phys. Commun. 2013, 184, 374–380. [111] Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and testing of a general amber force field. J. Comput. Chem. 2004, 25, 1157–1174. [112] Kumar, S.; Rosenberg, J. M.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A. The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. J. Comput. Chem. 1992, 13, 1011–1021.

[113] Jorgensen, W. L.; Gao, J. Cis-trans energy difference for the peptide bond in the gas phase and in aqueous solution. J. Am. Chem. Soc. 1988, 110, 4212–4216. [114] Wu, Y.; Tepper, H. L.; Voth, G. A. Flexible simple point-charge water model with improved liquid-state properties. J. Chem. Phys. 2006, 124, 024503. [115] Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; Hermans, J. Interaction Models for Water in Relation to Protein Hydration. Intermolecular Forces 1981, 331–342. [116] Berendsen, H. J. C.; Grigera, J. R.; Straatsma, T. P. The missing term in effective pair potentials. J. Phys. Chem. 1987, 91, 6269–6271. [117] Frisch, M. J. et al. Gaussian 09, Revision D.01; Gaussian, Inc.: Wallingford, CT, 2013. [118] Shahshahani, S. Differential and Integral Calculus. 2008. [119] Di Pillo, G.; Grippo, L. Exact Penalty Functions in Constrained Optimization. SIAM J. Control Optim. 1989, 27, 1333–1360. [120] Tessema, B.; Yen, G. A Self Adaptive Penalty Function Based Algorithm for Constrained Optimization. 2006; pp 246–253.

BIOGRAPHICAL SKETCH
Delaram Ghoreishi was born in Tehran, Iran. She attended the Farzanegan middle and high schools, administered under the National Organization for Development of Exceptional Talents. She received her Bachelor of Science degree in physics from the Sharif University of Technology in 2013. Delaram began her graduate studies in the Department of Physics at the University of Florida in August 2013 and graduated in December 2019 with a Doctor of Philosophy degree. Her research involved implementing and validating methods in the Amber suite of biomolecular simulation programs.
