An Interacting Quantum Atoms Approach to Constructing A

AN INTERACTING QUANTUM ATOMS APPROACH TO CONSTRUCTING A CONFORMATIONALLY DEPENDENT BIOMOLECULAR FORCE FIELD BY GAUSSIAN PROCESS REGRESSION: POTENTIAL ENERGY SURFACE SAMPLING AND VALIDATION A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Engineering and Physical Sciences 2016 Salvatore Cardamone School of Chemistry Contents Abstract 20 Declaration 22 Copyright Statement 23 Acronyms and Abbreviations 25 Acronyms 25 Acknowledgements 29 30 1 Introduction 31 2 Exposition 36 2.1 Molecular Mechanics.......................... 36 2.1.1 Molecular Dynamics...................... 36 2.1.2 Classical Force Fields...................... 41 2.1.3 ab initio Molecular Dynamics................. 52 2 2.2 Atomic Partitioning........................... 55 2.2.1 Hirshfeld Partitioning...................... 56 2.2.2 Bader's Atoms in Molecules.................. 58 2.2.3 Interacting Quantum Atoms.................. 64 2.3 Multipole Moment Electrostatics................... 70 2.3.1 Mathematical Details...................... 70 2.3.2 Implementation......................... 81 2.4 Machine Learning............................ 97 2.4.1 Overview............................ 97 2.4.2 Kriging............................. 101 2.4.3 Particle Swarm Optimisation................. 106 2.5 The Quantum Chemical Topological Force Field........... 108 2.5.1 Previous Work......................... 108 2.5.2 Proposed Implementation................... 110 3 Conformational Sampling 112 3.1 Introduction............................... 112 3.2 Stationary Point Vibrations...................... 116 3.2.1 Kinetic Energy......................... 116 3.2.2 Potential Energy........................ 117 3.2.3 Equations of Motion...................... 118 3.3 Generalised Vibrations......................... 121 3.4 Tyche - Conformational Sampling Software............. 124 3 3.4.1 Transformation to Normal Coordinates............ 124 3.4.2 Dynamics............................ 127 3.4.3 Markov Chains on Tessellated PESs.............. 134 3.4.4 Algorithm............................ 137 3.5 Redundant Internal Coordinates.................... 140 3.5.1 Overview............................ 140 3.5.2 Valence Coordinates...................... 142 3.5.3 Implementation......................... 147 3.6 Results.................................. 149 3.6.1 Validation and Benchmarking................. 149 3.6.2 Markov Chain Conformational Sampling........... 168 3.6.3 Conformational Sampling of Large Molecular Species.... 173 3.7 Conclusion................................ 176 4 A Novel Carbohydrate Force Field 178 4.1 Introduction............................... 178 4.2 A Basis Set for Machine Learning................... 182 4.3 Computational Details......................... 185 4.4 Results and Discussion......................... 193 4.4.1 Single Minimum......................... 193 4.4.2 Training Set Size Dependency................. 196 4.4.3 Multiple Minima........................ 200 4.5 Conclusion................................ 202 4 5 Subset Selection for Machine Learning 204 5.1 Introduction............................... 204 5.2 Greedy Heuristics............................ 206 5.2.1 Experimental Details...................... 208 5.2.2 MaxMin............................. 210 5.2.3 Deletion............................. 213 5.2.4 Alternative Distance Metric.................. 218 5.3 Sequential Selection........................... 225 5.4 Iterative Voronoi Subset Selection................... 233 5.5 Domain of Applicability........................ 239 5.5.1 Introduction........................... 239 5.5.2 The Convex Hull........................ 242 5.5.3 Test Case............................ 245 5.5.4 Largest d-Polytope....................... 247 5.6 Conclusion................................ 257 6 Raman Optical Activity 259 6.1 General Theory............................. 261 6.1.1 Classical Raman Scattering.................. 261 6.1.2 The Electromagnetic Field................... 264 6.1.3 Hamiltonian in an Electromagnetic Field........... 268 6.1.4 Wavefunction Perturbed by Periodic Field.......... 272 6.1.5 The Raman Optical Activity Tensors............. 276 5 6.1.6 ROA Intensities......................... 279 6.2 Previous Computational Work..................... 281 6.2.1 Algorithmic Developments................... 281 6.2.2 Solvation Effects........................ 284 6.2.3 Quantitative Measures of Similarity.............. 286 6.2.4 Large Biomolecular Systems.................. 288 6.3 Technical Details............................ 290 6.3.1 Zwitterionic Histidine...................... 290 6.3.2 Computing Spectra from a Number of Conformers...... 294 6.3.3 Filtering of Similar Geometries................ 295 6.4 Computational Details......................... 299 6.4.1 Molecular Dynamics...................... 299 6.4.2 Geometric Filtering....................... 301 6.4.3 Calculation of Spectra..................... 304 6.5 Conformer Optimisation........................ 305 6.5.1 Boltzmann Weighting...................... 307 6.5.2 Effects of Level of Optimisation................ 312 6.5.3 Alternative Optimisation Schemes............... 315 6.6 Microsolvation.............................. 320 6.6.1 Neutral Histidine........................ 321 6.6.2 Protonated Histidine...................... 325 6.7 Conclusion................................ 330 6 7 Conclusion and Future Work 331 7.1 General Conclusions.......................... 331 7.2 Future Work............................... 334 Bibliography 338 A Density Functional Theory 395 A.1 Thomas-Fermi Model.......................... 395 A.2 Hohenberg-Kohn Theorems...................... 397 A.2.1 First Theorem.......................... 397 A.2.2 Second Theorem........................ 398 A.3 Kohn-Sham Equations......................... 400 B Storage of Kriging Models 403 7 List of Tables 2.1 The number of geometric minima predicted by a variety of force fields for serine and cysteine. Also given are the number of minima that the various force fields predicted but that were not represented in the set of minima generated by ab initio calculations, and the mean average deviations (MAD) for the molecular energies at each geometry................................. 95 2.2 Free energies of hydration for a variety of molecules predicted by use of AMOEBA and TIP3P-like water potentials. Corresponding experimental values are also given. Energies are in kcal mol-1.... 97 3.1 Memory requirements for tyche.................... 140 3.2 Conformational sampling of water at a number of values of nperiod.. 150 3.3 Conformational sampling of NMA at a number of values of nperiod.. 150 3.4 Conformational sampling of histidine at a number of values of nperiod.151 3.5 Conformational sampling of water using the Jacobian and Hessian from a number of different levels of theory............... 152 3.6 Conformational sampling of NMA using the Jacobian and Hessian from a number of different levels of theory............... 152 3.7 Conformational sampling of histidine using the Jacobian and Hes- sian from a number of different levels of theory............ 152 8 3.8 Sampling ranges of four prominent bond stretching degrees of freedom in zwitterionic alanine as the sampling temperature is increased. The ab initio Hessian and Jacobian are taken from gas phase frequency calculations. In the bottom row, we include the ranges of these degrees of freedom as obtained from a 300 K in vacuo MD trajectory. In red, we have refined these sampling ranges down so that they represent the sampling ranges taken by over 98% of conformers over the course of the MD trajectory.................. 162 3.9 Sampling ranges of four prominent valence angle bending degrees of freedom in zwitterionic alanine as the sampling temperature is increased. The ab initio Hessian and Jacobian are taken from gas phase frequency calculations. In the bottom row, we include the ranges of these degrees of freedom as obtained from a 300 K in vacuo MD trajectory. In red, we have refined these sampling ranges down so that they represent the sampling ranges taken by over 98% of conformers over the course of the MD trajectory.......... 163 3.10 Sampling ranges of four prominent bond stretching degrees of freedom in zwitterionic alanine as the sampling temperature is increased. The ab initio Hessian and Jacobian are taken from implicitly solvated (CPCM) frequency calculations. In the bottom row, we include the ranges of these degrees of freedom as obtained from a 300 K TIP3P-solvated MD trajectory. In red, we have refined these sampling ranges down so that they represent the sampling ranges taken by over 98% of conformers over the course of the MD trajectory.165 3.11 Sampling ranges of four prominent valence angle bending degrees of freedom in zwitterionic alanine as the sampling temperature is increased. The ab initio Hessian and Jacobian are taken from implicitly solvated (CPCM) frequency calculations. In the bottom row, we include the ranges of these degrees of freedom as obtained from a 300 K TIP3P-solvated MD trajectory. In red, we have refined these sampling ranges down so that they represent the sampling ranges taken by over 98% of conformers over the course of the MD trajectory.166 3.12 Sampling ranges of four prominent bond stretching degrees of freedom in zwitterionic alanine, as obtained through tyche with a single seeding geometry (tyche (Single)), with five seeding geometries (tyche (Multi)) and through a MD trajectory............ 169 9 3.13 Sampling ranges

An Interacting Quantum Atoms Approach to Constructing A

Aqueous Pka Prediction for Tautomerizable Compounds Using Equilibrium Bond Lengths

Bader's Theory of Atoms in Molecules

FORCE FIELDS for PROTEIN SIMULATIONS by JAY W. PONDER

1 an Introduction to the Quantum Theory of Atoms in Molecules

Electronic Reprint Critical Examination of the Radial Functions in the Hansen

Some Notes On: What Is an Atom in a Molecule

1.3 Structures of Covalent Compounds 13

DFT and QTAIM Study of the Tetra-Tert ... -.:. Michael Pittelkow

Atoms in Molecules

Formation of Β-Cyclodextrin Complexes in an Anhydrous Environment

Description of Aromaticity with the Help of Vibrational Spectroscopy: Anthracene and Phenanthrene Robert Kalescky, Elﬁ Kraka, and Dieter Cremer*

Predikce Hodnot Pka Na Základě EEM Atomových Nábojů