Automated Calculation of Reaction Kinetics Via Transition State Theory

A Dissertation Presented by Pierre Lennox Bhoorasingh to The Department of Chemical Engineering, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the field of Chemical Engineering

Northeastern University, Boston, Massachusetts
August 2016

Dedication

I dedicate this thesis to AMT.

Acknowledgments

I have been able to complete this thesis work due to the help I have received from those who have found time in their busy schedules. This is my attempt to express my profound gratitude to those who have helped me during my thesis work.

Thanks to my advisor, Prof. Richard West, for the guidance over the 5 years. You also gave me the freedom to explore and that has only enhanced my thesis work, and it has been a pleasure to be your first graduate student.

I would also like to thank my thesis committee members, Dr. David Budil, Dr. Hicham Fenniri, Dr. C. Franklin Goldsmith, and Dr. Reza Sheikhi. They made the time to have engaging discussions that impacted this thesis, and were also very generous with their professional advice.

Thanks to the Computational Modeling group. Fariba Seyedzadeh Khanshan and Belinda Slakman, you were always helpful in our discussions and made the laboratory a fun working environment. I'd also like to thank Jason Cain for being a super helpful undergraduate who assumed nothing in pursuit of the right approach. I want to also thank Sean Troiano, Victor Lambert, Jacob Barlow, and Elliot Nash for their contributions to laboratory discussions.

Thanks to past and present RMG developers, who do a great job working on complex open-source software. I would like to thank Joshua Allen and Amrit Jalan for their scientific perspectives in the early stages of this thesis work. I'd also like to thank Shamel Merchant and Enoch Dames for their help with CanTherm.
I would like to thank Greg Landrum and the RDKit developers, for this thesis would be much more difficult without their work.

Thanks to Pat Rowe, Jessica Smith-Japhet, and Brandon Mennillo for their assistance over the years. I would like to express my gratitude to the Research Computing team at Northeastern University, and in particular Dr. Nilay Roy, for their work on the Discovery cluster. I would also like to thank Bill Sheehan for his help with the now retired Venture and Opportunity clusters.

I'd like to thank the Combustion Energy Frontier Research Center, especially Prof. Chung Law and Lilian Tsang, for organizing and hosting the Combustion Summer School, which I had the opportunity to attend twice (2012 and 2014).

I must thank Prof. David Beck and the organizers of the 2015 Data Science Workshop for hosting an enjoyable and intense discussion group on the role of data science in academia. I would also like to thank Michael Li and the team at the Data Incubator for running an informative and rigorous data science bootcamp that I had the opportunity to attend in the Spring of 2016.

Thank you to my classmates, Avinash, Dan, Dinara, Emily, and Nil. Your support has been important through the years. I want to also thank the friends I made in the Chemical Engineering Department.

Finally, thanks to my family, for their unending support as I take another step in life.

Abstract

Modeling complex chemical systems often requires knowledge of the elementary reactions involved, such as in combustion kinetics where models routinely contain thousands of reactions. Automated tools have been developed to construct such models, as manual methods have proven to be tedious and susceptible to human error. A large number of kinetic parameters are required to complete the construction of detailed kinetic models, but the available data are quite sparse.
As a result, estimation methods use existing data to predict the many unknown kinetics, but the accuracy of these kinetics suffers due to insufficient data to make good kinetic predictions. Theoretical calculations can be used to improve the kinetics in models, but these calculations require a transition state geometry estimate that is typically provided manually. Manual geometry estimation is slow and infeasible for automated construction of reaction mechanisms, so this thesis describes an automated method to estimate transition state geometries and calculate reaction kinetics.

The three-dimensional chemical structure for unreactive atoms at the transition state can be predicted with existing computational methods, but the geometry of the reaction center is unknown. The unknown section of the transition state must be predicted to create the transition state geometry. The reaction center distances are predicted using data from analogous transition state structures, and the transition state geometry prediction is constructed using an existing technique known as distance geometry. The transition state geometry prediction is optimized using a commercially available computational chemistry software package in order to calculate molecular properties of the transition state, such as bond vibrational frequencies. The molecular properties of reactants and products are also required to calculate reaction kinetics, and these are determined using an existing automated method. Molecular properties of the reactants, products, and transition state are used to calculate the kinetics of a reaction via classical transition state theory. The work in this thesis was initially developed for hydrogen abstraction reactions, and has been extended to β-scission and intra-hydrogen migration reactions.
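
As a rough illustration of the last step described above, the following Python sketch evaluates the classical transition state theory (Eyring) expression k(T) = (k_B T / h) (Q_TS / Q_reactants) exp(-E0 / (R T)). It is a minimal sketch under stated assumptions, not the thesis's implementation: the function name and arguments are hypothetical, the partition functions and barrier height are taken as inputs supplied by the caller, and tunneling corrections and reaction path degeneracy are ignored.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s
R = 8.314462618      # molar gas constant, J/(mol*K)


def tst_rate_coefficient(temperature, q_ts, q_reactants, barrier):
    """Classical TST rate coefficient (hypothetical helper, for illustration).

    temperature : temperature in K
    q_ts        : partition function of the transition state
                  (reaction coordinate excluded)
    q_reactants : iterable with one partition function per reactant
    barrier     : zero-point-corrected barrier height E0 in J/mol

    For a unimolecular reaction with dimensionless partition functions
    the result is in 1/s. Other cases assume the caller supplies
    partition functions per unit volume in consistent units.
    """
    q_r = math.prod(q_reactants)  # product over all reactants
    prefactor = K_B * temperature / H
    return prefactor * (q_ts / q_r) * math.exp(-barrier / (R * temperature))


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the thesis.
    print(tst_rate_coefficient(1000.0, q_ts=1.0, q_reactants=[1.0], barrier=50.0e3))
```

In practice the partition functions would come from the optimized geometries and vibrational frequencies of the reactants and the transition state; here they are plain numbers so that only the structure of the rate expression is shown.
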
The automatically determined kinetics and state-of-the-art estimation methods were compared to high accuracy theoretical calculations, and the automated calculations were shown to outperform the estimation methods. This enables improved mechanism generation, where high-fidelity complex chemical models can be constructed with minimal human intervention.

Contents

1 Introduction
  1.1 Background
    1.1.1 Automatic mechanism generation
    1.1.2 Kinetic and thermodynamic parameter estimation
    1.1.3 Theoretical rate calculation
    1.1.4 Statistical mechanics and quantum chemistry
    1.1.5 Stable geometry and transition state searches
    1.1.6 Automated transition state searches
    1.1.7 Kinetic Programs
  1.2 Thesis overview
2 Using double-ended methods to automate transition state searches
  2.1 Background
  2.2 Methods
    2.2.1 Generating 3-dimensional geometries for double-ended search methods
    2.2.2 Locating transition states with the automatic double-ended search
    2.2.3 Electronic Structure calculations
  2.3 Results and Discussion
  2.4 Conclusion
  2.5 Recommendations
    2.5.1 Semi-empirical methods are insufficient for transition state searches
    2.5.2 Consider more robust double-ended search methods
3 Automatic transition state geometry estimation using group contributions
  3.1 Background
  3.2 Methods
    3.2.1 Geometry estimation and optimization
    3.2.2 Method evaluation
  3.3 Results and Discussion
    3.3.1 Transition state geometries were successfully estimated using the distance estimates
    3.3.2 Increasing training data improves the group value predictions
    3.3.3 Geometry estimation needs improvement to make best use of predicted values
    3.3.4 Algorithm optimization
  3.4 Conclusion
  3.5 Recommendations
    3.5.1 Conformer recognition
4 Improving the group contribution transition state search method
  4.1 Background
  4.2 Methods
    4.2.1 Modifying the transition state geometry prediction
    4.2.2 Modifying the transition state optimization sequence
  4.3 Results and Discussion
    4.3.1 Tree structure and data diversity affect prediction accuracy
    4.3.2 Manipulating distance limits and force constants can improve UFF optimization
    4.3.3 Replacing the UFF optimization with more robust calculations may improve transition state prediction
  4.4 Conclusion
  4.5 Recommendations
    4.5.1 Efficient calculation of the molecular group contributions
    4.5.2 UFF optimization with constrained optimization
5 Method extension to new reaction families and automated kinetic parameter calculation
  5.1 Background
  5.2 Methods
    5.2.1 Computational chemistry
    5.2.2 Automated geometry searches
    5.2.3 Kinetic calculations
    5.2.4 Comparison of Automated TST calculations and Rate Rules
    5.2.5 Comparison to benchmark calculations
  5.3 Results
    5.3.1 Comparison of automated TST calculations and rate rules
    5.3.2 Comparing predictions to benchmark calculations
    5.3.3 Sources of error in the automated calculations
  5.4 Discussion
  5.5 Conclusion
  5.6 Recommendations
    5.6.1 Improve symmetry number calculation
    5.6.2 Automate hindered rotor calculations
6 Summary
Appendices
Appendix A Double-ended method
Appendix B Group contribution method
  B.1 Group Training Regression Details
  B.2 Predicted vs Optimized distances
  B.3 Group Naming Convention
  B.4 Group values for original tree
  B.5 Group values for new tree
  B.6 List of test reactions
  B.7 Effect of increasing force constants and reducing the difference between upper and lower limits
Appendix C Kinetic calculations
  C.1 Molecular group trees
    C.1.1 Hydrogen Abstraction
    C.1.2 Intra-hydrogen migration
    C.1.3 β-scission

List of Figures

1.1 Beta-scission reaction template from RMG.
1.2 Potential energy profile for a typical reaction.
2.1 The molecular bounds matrix. (A) Bonded atom limits are set by bond length rules, while connectivity limits non-bonded atoms in the same molecule.
