Data-Driven Quantum Chemistry Predictions For
DATA-DRIVEN QUANTUM CHEMISTRY PREDICTIONS FOR UNIQUE LEWIS ACID-BASE PAIRS AND SMALL ORGANIC MOLECULES

A Thesis Submitted to the Graduate Faculty in Partial Fulfilment of the Requirements for the Degree of Doctor of Philosophy

Molecular and Macromolecular Sciences
Department of Chemistry
Faculty of Science
University of Prince Edward Island

QAMMAR ALMAS

Charlottetown, Prince Edward Island
December 2018

© 2018. Q. Almas

To all of my sincere teachers.

"It matters that you don’t just give up." - Dr. Stephen Hawking

Acknowledgements

First of all, I would like to acknowledge my supervisor, Dr. Jason K. Pearson, for his guidance in the field of computational chemistry and for his expertise in directing me through academic technicalities. I thank him for training me to the high standards of scientific research. I also express my gratitude for his constant support through all phases of my PhD studies.

I am grateful to the members of the Supervisory Committee, namely Dr. Rabin Bissessur, Dr. Brian Wagner and Dr. James Polson, for their productive direction. I am particularly thankful to Dr. Bissessur and Dr. Wagner for their guidance in improving myself in areas beyond computational chemistry and becoming a better researcher overall.

I am greatly thankful to all faculty members of the Chemistry Department. They have all been my teachers and guides in one way or another throughout the years. I would also like to express my gratitude to all the members of administration and management in the Chemistry Department. I particularly thank Ms. Janette Paquet, Ms. Jillian MacDonald and Mr. Stephen Scully, who have been kind and supportive in academic matters as well as in general. I thank them for making my life easier.

I am also thankful to all the members of the Pearson Lab during my studies. In particular, I would like to express my thanks to Dr. Adam Proud, Brenden Sheppard and Dalton K. Mackenzie, who have been supportive colleagues and friends.
I am also thankful to all the grad and senior students in the Chemistry Department. It was fun for the most part.

I am very grateful to the University of Prince Edward Island for allowing me to use all of the resources required to complete a number of tasks that would hardly have been possible without the facilities provided on campus as well as abroad via the internet. I am also very grateful to Compute Canada and other research-supporting agencies that allowed me to use high performance computing systems (ACE-NET as well as Westgrid) with abundantly available resources and support.

I would like to express my thanks to the Graduate Science Committee for their consistent support in academic matters. I thank Dr. Pedro Quijon, Ms. Colleen Gallant and Dr. Amy Hsiao, who have helped in many official matters where the Chemistry Department was not enough.

And last but not least, I thank all of my friends, family members and relatives all over the world. Their love and support encouraged me to complete my PhD studies. I am also thankful to my haters, for they have been a great source of motivation to improve myself.

Abstract

As in many other fields of science and technology, big-data discoveries and inventions have been emerging in quantum chemistry. This thesis presents two data-driven projects and one investigation that emerged from one of them. In Chapter 3, a test case of 24 computational chemistry models is assessed for how well the models reproduce the potential energy surfaces of 8 unique Lewis acid-base pairs, relative to high-accuracy reference calculations.
The assessment of density functionals (computational chemistry methods based on density functional theory, which states that all ground-state information of a molecular system is contained in its electron density and that the energy of the system can be calculated as a functional of that density) was carried out by an automated program written in Python, which collected the data in a central repository to which queries and analytics were then applied. The results reveal that density functionals in general are inaccurate for the prediction of the potential energy surfaces of these Lewis pairs.

During the analysis of the potential energy surfaces of Lewis acid-base pairs, an inflection in the potential energy surface was observed. The fourth chapter of the thesis is devoted to the investigation of this novel phenomenon. Several medium- and high-level computational models have been employed to reproduce the potential energy surfaces, which are then compared to standard reference calculations. It is shown that the inflection is the result of a competition between the energetic cost of the required pyramidalization of the Lewis acid and the stabilization from the electrostatic potential between the Lewis acid and base for the formation of the dative bond.

Chapter 5 also focuses on the power of big data; however, the methods employed combine ab initio quantum mechanics models with machine learning (QM/ML) algorithms. The position intracules (electron pair distributions) and the electron correlation energies of 5660 small organic molecules composed of hydrogen, carbon, nitrogen, oxygen and sulphur are calculated at the Hartree-Fock and G4 levels of theory, respectively. The dataset was split into training and test fractions of varying proportions for training the QM/ML model and evaluating its predictions. A kernel ridge regression algorithm has been employed to develop a QM/ML model for the prediction of the correlation energies of these molecules.
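As an illustration only, a kernel ridge regression model of this kind might be sketched as follows. The Gaussian kernel form, the small regularization constant `lam`, the function names and the toy sine data are all assumptions made for this sketch; they are not taken from the program listed in Appendix A, and the real descriptors are position intracules rather than random numbers.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Assumed kernel: k(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for all row pairs."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def krr_fit(X_train, y_train, sigma, lam=1e-6):
    """Solve (K + lam * I) alpha = y for the regression weights alpha.

    lam is a hypothetical, fixed regularizer added for numerical stability.
    """
    K = gaussian_kernel(X_train, X_train, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X_train)), y_train)


def krr_predict(X_train, alpha, X_test, sigma):
    """Predict targets as kernel-weighted sums over the training points."""
    return gaussian_kernel(X_test, X_train, sigma) @ alpha


# Toy stand-in data: a 1-D descriptor and a smooth target (not thesis data).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0])

# Split into a training fraction and a test fraction.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

sigma = 1.0  # illustrative value only
alpha = krr_fit(X_train, y_train, sigma)
y_pred = krr_predict(X_train, alpha, X_test, sigma)
mae = np.mean(np.abs(y_pred - y_test))
print(f"test mean absolute error: {mae:.2e}")
```

In a sketch like this, sigma would normally be chosen by scanning candidate values and keeping the one with the lowest error on held-out data.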
The regression model contains only a single hyperparameter, sigma, which was found to produce optimum results at a value of 0.00004. The results are then compared to G4 reference correlation energies. It is shown that the predictions approach the so-called "chemical accuracy" of 1 kcal/mol, and they show great potential for further improvement.

The inaccurate performance of density functionals in reproducing the potential energy surfaces of Lewis acid-base complexes led to the discovery of an anomalous behavior of the potential energy surface of a phosphine complex. This revelation and its investigation are anticipated to be crucial to Lewis acid-base chemistry. The other constructive outcome of the inaccuracy of density functionals in reproducing potential energy surfaces in Chapter 3 is the motivation to test an alternative approach to quantum chemistry, such as machine learning. The successful application of a kernel ridge regression based QM/ML model shows the promise of such a model for the prediction of electronic properties of more complex systems like frustrated Lewis pairs.

Table of contents

List of figures
List of tables
1 Introduction
  1.1 Frustrated Lewis Pairs
  1.2 Scope of Thesis
2 Theory and Methods
  2.1 Schrödinger Wave Equation
  2.2 Born-Oppenheimer Approximation
  2.3 Variational Theorem
  2.4 Basis Set Approximation
  2.5 Hartree-Fock Method
  2.6 Post Hartree-Fock Methods
    2.6.1 Møller-Plesset Perturbation Theory
    2.6.2 Configuration Interaction
    2.6.3 Multi-Configurational Self-Consistent Field
    2.6.4 Coupled Cluster Methods
  2.7 Density Functional Theory
    2.7.1 Hohenberg-Kohn Theorems
    2.7.2 Kohn-Sham DFT Formulation
    2.7.3 Density Functional Theory Methods
  2.8 Composite methods
  2.9 Potential Energy Surfaces
  2.10 Machine Learning and Quantum Chemistry
3 Automated Benchmark of Density Functionals for Stretched Dative Bond Complexes
  3.1 Introduction
  3.2 Methods
    3.2.1 Description of Workflow
    3.2.2 Computational Models
    3.2.3 Data Processing
  3.3 Results and Discussion
    3.3.1 Workflow Performance
    3.3.2 Model Chemistry Performance
  3.4 Conclusion
4 A Novel Bonding Mode in Phosphine-Haloboranes
  4.1 Introduction
  4.2 Computational Methods
  4.3 Results and Discussion
    4.3.1 PES of F3B-PH3
    4.3.2 Energy Decomposition
    4.3.3 Molecular Orbital Analysis
    4.3.4 Comparison with Analogous Systems
  4.4 Conclusion
5 Intracules as New Molecular Descriptor in QM/ML
  5.1 Introduction
  5.2 Methods
    5.2.1 Model
    5.2.2 Descriptor
    5.2.3 Dataset
    5.2.4 Preparation
  5.3 Results and Discussion
  5.4 Conclusion
6 Conclusion and Future Perspective
References
Appendix A Python program written for machine learning calculations
Appendix B Data of potential energy surface calculations of selected Lewis pairs

List of figures

1.1 (a) There was no reaction upon mixing trimethylboron and lutidine due to steric hindrance caused by bulky groups attached, and (b) No formation of classic Lewis acid-base adduct due to steric hindrance between bulky moieties attached to LA and LB. (c) The hindrance between LA and LB caused frustration which was overcome by breaking a hydrogen molecule bond and reaction was also reversible at 150 °C.
1.2 The scheme for the analysis of the reversible hydrogenation by Lewis acid-base pair with bulky groups attached.
2.1 HF or SCF optimization algorithm.
2.2 Electronic excitations from ground state to singly, doubly and triply excited states. Green represents electrons at ground state whereas red represents excited states. "G. state" in the figure means ground state.
2.3 Schematic of CASSCF and RASSCF depicting permitted electronic excitations.
2.4 Kohn-Sham density functional theory algorithm for energy optimization.
2.5 PES of an arbitrary diatomic molecule where r_AB represents the distance between two nuclei, E(r) represents the potential energy depending on AB bond distance and r_eq is the distance between two atoms at equilibrium structure.