An Interacting Quantum Atoms Approach to Constructing A

AN INTERACTING QUANTUM ATOMS APPROACH TO CONSTRUCTING A CONFORMATIONALLY DEPENDENT BIOMOLECULAR FORCE FIELD BY GAUSSIAN PROCESS REGRESSION: POTENTIAL ENERGY SURFACE SAMPLING AND VALIDATION

A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Engineering and Physical Sciences

2016

Salvatore Cardamone
School of Chemistry

Contents

Abstract
Declaration
Copyright Statement
Acronyms and Abbreviations
    Acronyms
Acknowledgements

1 Introduction

2 Exposition
    2.1 Molecular Mechanics
        2.1.1 Molecular Dynamics
        2.1.2 Classical Force Fields
        2.1.3 ab initio Molecular Dynamics
    2.2 Atomic Partitioning
        2.2.1 Hirshfeld Partitioning
        2.2.2 Bader's Atoms in Molecules
        2.2.3 Interacting Quantum Atoms
    2.3 Multipole Moment Electrostatics
        2.3.1 Mathematical Details
        2.3.2 Implementation
    2.4 Machine Learning
        2.4.1 Overview
        2.4.2 Kriging
        2.4.3 Particle Swarm Optimisation
    2.5 The Quantum Chemical Topological Force Field
        2.5.1 Previous Work
        2.5.2 Proposed Implementation

3 Conformational Sampling
    3.1 Introduction
    3.2 Stationary Point Vibrations
        3.2.1 Kinetic Energy
        3.2.2 Potential Energy
        3.2.3 Equations of Motion
    3.3 Generalised Vibrations
    3.4 Tyche - Conformational Sampling Software
        3.4.1 Transformation to Normal Coordinates
        3.4.2 Dynamics
        3.4.3 Markov Chains on Tessellated PESs
        3.4.4 Algorithm
    3.5 Redundant Internal Coordinates
        3.5.1 Overview
        3.5.2 Valence Coordinates
        3.5.3 Implementation
    3.6 Results
        3.6.1 Validation and Benchmarking
        3.6.2 Markov Chain Conformational Sampling
        3.6.3 Conformational Sampling of Large Molecular Species
    3.7 Conclusion

4 A Novel Carbohydrate Force Field
    4.1 Introduction
    4.2 A Basis Set for Machine Learning
    4.3 Computational Details
    4.4 Results and Discussion
        4.4.1 Single Minimum
        4.4.2 Training Set Size Dependency
        4.4.3 Multiple Minima
    4.5 Conclusion

5 Subset Selection for Machine Learning
    5.1 Introduction
    5.2 Greedy Heuristics
        5.2.1 Experimental Details
        5.2.2 MaxMin
        5.2.3 Deletion
        5.2.4 Alternative Distance Metric
    5.3 Sequential Selection
    5.4 Iterative Voronoi Subset Selection
    5.5 Domain of Applicability
        5.5.1 Introduction
        5.5.2 The Convex Hull
        5.5.3 Test Case
        5.5.4 Largest d-Polytope
    5.6 Conclusion

6 Raman Optical Activity
    6.1 General Theory
        6.1.1 Classical Raman Scattering
        6.1.2 The Electromagnetic Field
        6.1.3 Hamiltonian in an Electromagnetic Field
        6.1.4 Wavefunction Perturbed by Periodic Field
        6.1.5 The Raman Optical Activity Tensors
        6.1.6 ROA Intensities
    6.2 Previous Computational Work
        6.2.1 Algorithmic Developments
        6.2.2 Solvation Effects
        6.2.3 Quantitative Measures of Similarity
        6.2.4 Large Biomolecular Systems
    6.3 Technical Details
        6.3.1 Zwitterionic Histidine
        6.3.2 Computing Spectra from a Number of Conformers
        6.3.3 Filtering of Similar Geometries
    6.4 Computational Details
        6.4.1 Molecular Dynamics
        6.4.2 Geometric Filtering
        6.4.3 Calculation of Spectra
    6.5 Conformer Optimisation
        6.5.1 Boltzmann Weighting
        6.5.2 Effects of Level of Optimisation
        6.5.3 Alternative Optimisation Schemes
    6.6 Microsolvation
        6.6.1 Neutral Histidine
        6.6.2 Protonated Histidine
    6.7 Conclusion

7 Conclusion and Future Work
    7.1 General Conclusions
    7.2 Future Work

Bibliography

A Density Functional Theory
    A.1 Thomas-Fermi Model
    A.2 Hohenberg-Kohn Theorems
        A.2.1 First Theorem
        A.2.2 Second Theorem
    A.3 Kohn-Sham Equations

B Storage of Kriging Models

List of Tables

2.1 The number of geometric minima predicted by a variety of force fields for serine and cysteine. Also given are the number of minima that the various force fields predicted but that were not represented in the set of minima generated by ab initio calculations, and the mean average deviations (MAD) for the molecular energies at each geometry.

2.2 Free energies of hydration for a variety of molecules predicted by use of AMOEBA and TIP3P-like water potentials. Corresponding experimental values are also given. Energies are in kcal mol⁻¹.

3.1 Memory requirements for tyche.

3.2 Conformational sampling of water at a number of values of nperiod.

3.3 Conformational sampling of NMA at a number of values of nperiod.

3.4 Conformational sampling of histidine at a number of values of nperiod.

3.5 Conformational sampling of water using the Jacobian and Hessian from a number of different levels of theory.

3.6 Conformational sampling of NMA using the Jacobian and Hessian from a number of different levels of theory.

3.7 Conformational sampling of histidine using the Jacobian and Hessian from a number of different levels of theory.

3.8 Sampling ranges of four prominent bond stretching degrees of freedom in zwitterionic alanine as the sampling temperature is increased. The ab initio Hessian and Jacobian are taken from gas phase frequency calculations. In the bottom row, we include the ranges of these degrees of freedom as obtained from a 300 K in vacuo MD trajectory. In red, we have refined these sampling ranges down so that they represent the sampling ranges taken by over 98% of conformers over the course of the MD trajectory.

3.9 Sampling ranges of four prominent valence angle bending degrees of freedom in zwitterionic alanine as the sampling temperature is increased. The ab initio Hessian and Jacobian are taken from gas phase frequency calculations. In the bottom row, we include the ranges of these degrees of freedom as obtained from a 300 K in vacuo MD trajectory. In red, we have refined these sampling ranges down so that they represent the sampling ranges taken by over 98% of conformers over the course of the MD trajectory.

3.10 Sampling ranges of four prominent bond stretching degrees of freedom in zwitterionic alanine as the sampling temperature is increased. The ab initio Hessian and Jacobian are taken from implicitly solvated (CPCM) frequency calculations. In the bottom row, we include the ranges of these degrees of freedom as obtained from a 300 K TIP3P-solvated MD trajectory. In red, we have refined these sampling ranges down so that they represent the sampling ranges taken by over 98% of conformers over the course of the MD trajectory.

3.11 Sampling ranges of four prominent valence angle bending degrees of freedom in zwitterionic alanine as the sampling temperature is increased. The ab initio Hessian and Jacobian are taken from implicitly solvated (CPCM) frequency calculations. In the bottom row, we include the ranges of these degrees of freedom as obtained from a 300 K TIP3P-solvated MD trajectory. In red, we have refined these sampling ranges down so that they represent the sampling ranges taken by over 98% of conformers over the course of the MD trajectory.

3.12 Sampling ranges of four prominent bond stretching degrees of freedom in zwitterionic alanine, as obtained through tyche with a single seeding geometry (tyche (Single)), with five seeding geometries (tyche (Multi)) and through a MD trajectory.

3.13 Sampling ranges

