Property Biased-Diversity Guided Explorations of Chemical Spaces by Chetan Raj Rupakheti Graduate Program in Computational Biolo

Property Biased-Diversity Guided Explorations of Chemical Spaces by Chetan Raj Rupakheti Graduate Program in Computational Biology and Bioinformatics Duke University Date:_______________________ Approved: ___________________________ David Beratan, Supervisor ___________________________ Weitao Yang ___________________________ Qiu Wang ___________________________ Patrick Charbonneau Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate Program in Computational Biology and Bioinformatics in the Graduate School of Duke University 2015 ABSTRACT Property Biased-Diversity Guided Explorations of Chemical Spaces by Chetan Raj Rupakheti Graduate Program in Computational Biology and Bioinformatics Duke University Date:_______________________ Approved: ___________________________ David Beratan, Supervisor ___________________________ Weitao Yang ___________________________ Qiu Wang ___________________________ Patrick Charbonneau An abstract of a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate Program in Computational Biology and Bioinformatics in the Graduate School of Duke University 2015 Copyright by Chetan Raj Rupakheti 2015 Abstract Discovering functionally useful structures and materials by exploring the vastness of chemical space is an exciting undertaking. If done efficiently, one can discover structures that can have therapeutic value (such as drug like organic molecules) or technological value (such as organic light emitting diodes). While mining of chemical space has the potential to generate libraries of functional structures and materials, one can also easily be lost in its vastness (~1060 theoretically possible small organic molecule ~500 Da molecular weight or less). We have developed a strategy that allows efficient explorations of vast chemical spaces to generate libraries of functional organic molecules. The method, at its core, applies physical properties and structural diversity biased sampling of chemical space to search for new structures. We demonstrate the soundness and efficiency of this approach by searching through the known and enumerated databases to discover diverse organic molecules with optimum electronic and biophysical properties, and we also compare it to various existing approaches used for molecular search and property optimization. We also show a practical application of this approach by designing libraries of chromophores that emit light in the blue region of the spectrum as well as potential leads for protein and RNA binding. iv Dedication To my family, mentors, and friends. v Contents Abstract ......................................................................................................................................... iv List of Tables .................................................................................................................................. xi List of Figures .............................................................................................................................. xii Acknowledgements .................................................................................................................... xix Chapter 1. Background ................................................................................................................. 1 1.1 Chemical space of small organic molecules .................................................................. 1 1.1.1 Motivation for exploration ......................................................................................... 1 1.2 Experimental Approaches for chemical space exploration ........................................ 3 1.2.1 Fragment based drug design ...................................................................................... 3 1.2.2 Diversity oriented drug design .................................................................................. 5 1.3 Computational approaches for chemical space exploration ....................................... 6 1.3.1 Generated databases (GDB) of enumerated small organic molecules ................. 6 1.3.1.1 Methodologies and Principle to Virtually Screen Drug-Like Molecular Libraries ............................................................................................................................. 9 1.3.2 Existing Complementary Computational Approaches for Material Designs ... 10 1.3.2.1 AFLOW Material Database ............................................................................... 11 1.3.2.2 Harvard Clean Energy Project (CEP) .............................................................. 12 1.3.3 Diversity-oriented stochastic search of chemical Spaces ..................................... 15 1.4 Thesis contribution: Property-biased exploration of diverse regions of chemical space ........................................................................................................................................ 17 1.5 Outlines ............................................................................................................................ 19 vi Chapter 2. A strategy to discover diverse optimal molecules in the small molecule universe. ........................................................................................................................................ 21 2.1 Introduction ..................................................................................................................... 21 2.2 Methods ............................................................................................................................ 22 2.2.1 ACSESS algorithm to search molecular space ....................................................... 25 2.2.2 Software Used for Property-Optimizing ACSESS ................................................ 27 2.2.3 Optimization within GBD9 property landscape ................................................... 27 2.2.4 Optimization in the NKp fitness landscape model ............................................... 28 2.3 Results and Discussions ................................................................................................. 30 2.3.1 ACSESS exploration of a NKp model space .......................................................... 30 2.3.2 Comparing property-optimizing ACSESS with simple genetic algorithm searches ................................................................................................................................. 31 2.3.3 ACSESS explorations in GDB-9 ............................................................................... 33 2.3.4 Comparisons of property-optimizing ACSESS and genetic algorithms methods for property optimization .................................................................................................. 34 2.3.5 Self-organizing maps of GDB-9 and of sampled solutions .................................. 37 2.4 Conclusions ...................................................................................................................... 41 Chapter 3. Diverse optimal molecular libraries for organic light emitting diodes. ........... 43 3.1 Introduction ..................................................................................................................... 43 3.1.1 OLED structures ......................................................................................................... 44 3.2 Method .............................................................................................................................. 46 3.2.1 Using property-optimizing ACSESS to design fluorescent emitters .................. 46 vii 3.2.2 Quantum chemical computation of singlet excitation energies and oscillator strengths ............................................................................................................................... 49 3.2.3 Design of fluorescent emitters that utilizes only singlet excitation state ........... 50 3.2.4 Designing the TADF library by optimizing the singlet - triplet gap .................. 50 3.3 Results and Discussions ................................................................................................. 53 3.3.1 Validating the Semiempirical Excitation Energies ................................................ 53 3.3.2 Property-biased ACSESS-based library of singlet emitters ................................. 55 3.3.3 Validating the ZINDO/S Excitation Energy Calculations with TDDFT for the ACSESS-designed Library ................................................................................................. 59 2.3.4 Property-optimized ACSESS Library of TADF Emitters ..................................... 60 3.4 Conclusions ...................................................................................................................... 64 Chapter 4. Efficiently charting astronomically large chemical spaces to search drug leads ........................................................................................................................................................ 66 4.1 Introduction ..................................................................................................................... 66 4.2 Methods ........................................................................................................................... 68 4.2.1 Benchmarking Docking Methods for the COX-2 Receptor .................................. 69 4.2.2 Comparing Property-Optimized ACSESS with Simple

Property Biased-Diversity Guided Explorations of Chemical Spaces by Chetan Raj Rupakheti Graduate Program in Computational Biolo

Visualization of Chemical Space Using Principal Component Analysis

Exploring Chemical Space

A Thermodynamic Atlas of Carbon Redox Chemical Space

Cheminformatic Analysis of Natural Products and Their Chemical Space

Data-Driven Astrochemistry: One Step Further Within the Origin of Life Puzzle

Navigating Chemical Space by Interfacing Generative Artificial Intelligence and Molecular Docking

Mapping the Molecular Tree of Life Using Assembly Spaces

Alexander Ruf Dissertation

The Next Level in Chemical Space Navigation: Going Far Beyond Enumerable Compound Libraries

Feature Optimization in High Dimensional Chemical Space: Statistical and Data Mining Solutions Jinuraj K

A Close-Up Look at the Chemical Space of Commercially Available Building Blocks for Medicinal Chemistry

Quest for the Rings. in Silico Exploration of Ring Universe to Identify Novel Bioactive Heteroaromatic Scaffolds