FROM QSAR to QNAR, DEVELOPING ENHANCED MODELS for DRUG DISCOVERY By

FROM QSAR TO QNAR, DEVELOPING ENHANCED MODELS FOR DRUG DISCOVERY by WENYI WANG A dissertation submitted to the Graduate School-Camden Rutgers, the State University of New Jersey In partial fulfillment of the requirements For the degree of Doctor of Philosophy Graduate Program in Computational and Integrative Biology Written under the direction of Dr. Hao Zhu And approved by Dr. Hao Zhu Dr. Sunil Shende ________________________________ ________________________________ Dr. Bing Yan Dr. Suneeta Ramaswami ________________________________ ________________________________ Dr. Jinglin Fu Dr. Joseph Martin ________________________________ ________________________________ Camden, New Jersey October 2018 ABSTRACT OF THE DISSERTATION From QSAR to QNAR, Developing Enhanced Models for Drug Discovery by WENYI WANG Dissertation Director: Dr. Hao Zhu Exploring new chemical entities in drug discovery requires extensive investigations on libraries of thousands of molecules. While conventional animal-based tests in drug discovery procedure are expensive and time consuming, the evaluation of a drug candidate can be facilitated by alternative computational methods. For example, the Quantitative Structure Activity Relationship (QSAR) model has been widely used to predict bioactivities for drug candidates. However, traditional QSAR models are solely based on chemical structures, and are less effective in the drug discovery procedure due to various limitations related to complicated structures or bioactivities. In this thesis, we aimed to establish high quality and predictive models by using novel modeling approaches beyond QSAR. First, we developed a methodology for predicting the Blood- Brain Barrier permeability of small molecules by incorporating biological assay information (e.g. transporter interactions) into the modeling process. This method can be further extended to modeling and predicting in vivo bioactivities of drug candidates. Second, we created a new Quantitative Nanostructure Activity Relationship (QNAR) modeling strategy to extend the applicability of QSAR to predict bioactivities of ii nanomaterials. The research presented in this thesis opens a new path to the precise prediction of bioactivities of molecules in the drug discovery procedure. iii Acknowledgements To my advisor, Dr. Hao Zhu, thank you so much. It was the best choice I ever made to work with you. Your guidance, support and encouragement have been my greatest motivation. Your intelligence, enthusiasm, and ambition have been a great source of inspiration throughout my graduate study. I couldn’t have been who I am now without you. No word can express my gratefulness to you. Thank you! To my committee members, Dr. Bing Yan, Dr. Jinglin Fu, Dr. Sunil Shende, Dr. Suneeta Ramaswami, Dr. Joseph Martin, thank you for your generous support and guidance. To my peer, mentor and friend, Marlene Kim, thank you for your guidance and help. You are my first English-speaking friend. Thank you for walking me through every single detail of research, while tolerating my Chinglish. Our friendship will be my treasure of life. To my lab members and friends, Abena Boison, Marlene Kim, Chris Mayer- Bacon, Dan Pinolini, Kathryn Ribay, Dan Russo, Alexander Sedykh, Yutang Wang, Min Wu, Xiliang Yan, Jun Zhang, Linlin Zhao, thank all of you for making my graduate study enjoyable. I learned from each of you and we accomplished great projects that would not have been realized had we did it ourselves. I will miss spending time with you all. To all my friends at Rutgers-Camden, friends at Society of Toxicology, friends at National Center for Toxicological Research U.S. FDA, friends at Genentech, friends at iv Sanofi, thank all of you for making my graduate life colorful and memorable. Every one of you is very important to me. Call me if you see this! Last but not least, to my dearest family, thank you for everything. To Chengyue, thank you for getting me into graduate school, for the many scientific discussions, your support and compromise. To my dad, thank you for immersing me with science ever since I was born, from genetic to epigenetic. To my mom, thank you for your continuous thoughtfulness, support and care whenever I am down. Love is LOVE. v Dedication To my husband, Chengyue Zhang, and My parents, Min Wang and Guoting Chen vi ABSTRACT OF THE DISSERTATION ........................................................................... ii Acknowledgements ............................................................................................................ iv Dedication .......................................................................................................................... vi TABLE OF CONTENTS .................................................................................................... x List of Figures ..................................................................................................................... x List of Tables .................................................................................................................... xii Chapter 1 Introduction ........................................................................................................ 1 1.1 Computer aided drug design ................................................................................. 1 1.2 QSAR principles and workflow ............................................................................ 2 1.2.1 Data curation .............................................................................................. 2 1.2.2 Chemical descriptors .................................................................................. 3 1.2.3 Machine learning models ........................................................................... 6 1.2.4 Statistical Evaluation of Model Performance ............................................ 7 1.2.5 Validation of models .................................................................................. 8 1.3 Limitation of conventional QSAR and solutions .................................................. 9 Chapter 2 Developing Enhanced Blood-Brain Barrier Permeability Models: Integrating External Bio-assay Data in QSAR Modeling ................................................................... 11 Chapter Overview ..................................................................................................... 11 vii 2.1 Introduction ......................................................................................................... 12 2.2 Methods............................................................................................................... 17 2.2.1 Dataset...................................................................................................... 17 2.2.2 Overview of the Workflow in this Study ................................................. 18 2.2.3 Chemical Descriptors ............................................................................... 19 2.2.4 Modeling and Approaches ....................................................................... 20 2.2.5 Integration of Biological Descriptors ....................................................... 21 2.3 Results ................................................................................................................. 23 2.3.1 Overview of the BBB Permeability Database ......................................... 23 2.3.2 Consensus QSAR Results ........................................................................ 25 2.3.3 Bio-assay Data Improve Model Predictive Power ................................... 26 2.4 Discussion ........................................................................................................... 31 2.5 Conclusion .......................................................................................................... 42 Chapter 3 Predicting Nano-bio Interactions by Integrating Nanoparticle Libraries and Quantitative Nanostructure Activity Relationship Modeling ........................................... 43 Chapter Overview ..................................................................................................... 43 3.1 Introduction ......................................................................................................... 44 3.2 Methods/experimental......................................................................................... 46 3.2.1 GNP library synthesis .............................................................................. 46 3.2.2 GNP library characterization ................................................................... 46 viii 3.2.3 Experimental logP measurement ............................................................. 47 3.2.4 Quantification of HO-1 level ................................................................... 47 3.2.5 Cellular uptake ......................................................................................... 48 3.2.6 Virtual GNP construction and structure optimization ............................. 48 3.2.7 Virtual GNP chemical descriptor calculation .......................................... 49 3.2.8 QNAR modeling ...................................................................................... 50 3.3 Results and discussion ........................................................................................ 51 3.3.1 Workflow of experimental testing, QNAR modeling and rational nanomaterial design. ......................................................................................... 51 3.3.2 Design and synthesis

FROM QSAR to QNAR, DEVELOPING ENHANCED MODELS for DRUG DISCOVERY By

Customs Tariff - Schedule

Report on an NIH Workshop on Ultralarge Chemistry Databases Wendy A

Qsar Methods Development, Virtual and Experimental Screening for Cannabinoid Ligand Discovery

(12) United States Patent (10) Patent No.: US 6,264,917 B1 Klaveness Et Al

Open Chemoinformatic Resources to Explore the Structure, Properties and Chemical Space of Cite This: RSC Adv.,2017,7,54153 Molecules

Ovid MEDLINE(R)

Chromatographic Analysis of Pharmaceuticals Second Edition, Revised and Expanded

Pharmaceuticals

Selective Reaction Monitoring (SRM) Daten Von Mehr Als 900 Xenobiotika Für Aufbau Und Validierung Von LC-MS/MS Analysen

Sustained-Release Compositions Containing Cation Exchange Resins and Polycarboxylic Polymers

Retro Drug Design: from Target Properties to Molecular Structures

A Chemaxon/KNIME Based Tool for Designing Chemical Libraries