Toxicity Prediction Using Multi-Disciplinary Data Integration and Novel Computational Approaches
Total Page:16
File Type:pdf, Size:1020Kb
TOXICITY PREDICTION USING MULTI-DISCIPLINARY DATA INTEGRATION AND NOVEL COMPUTATIONAL APPROACHES Yen Sia Low A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Environmental Sciences and Engineering, Gillings School of Global Public Health. Chapel Hill 2013 Approved by: Alexander Tropsha, Ph.D. Ivan Rusyn, M.D., Ph.D. Louise Ball, Ph.D. Avram Gold, Ph.D. David Dix, Ph.D. David Leith, Ph.D. © 2013 Yen Sia Low ALL RIGHTS RESERVED ii ABSTRACT YEN SIA LOW: TOXICITY PREDICTION USING MULTI-DISCIPLINARY DATA INTEGRATION AND NOVEL COMPUTATIONAL APPROACHES (Under the direction of Alexander Tropsha and Ivan Rusyn) Current predictive tools used for human health assessment of potential chemical hazards rely primarily on either chemical structural information (i.e., cheminformatics) or bioassay data (i.e., bioinformatics). Emerging data sources such as chemical libraries, high throughput assays and health databases offer new possibilities for evaluating chemical toxicity as an integrated system and overcome the limited predictivity of current fragmented efforts; yet, few studies have combined the new data streams. This dissertation tested the hypothesis that integrative computational toxicology approaches drawing upon diverse data sources would improve the prediction and interpretation of chemically induced diseases. First, chemical structures and toxicogenomics data were used to predict hepatotoxicity. Compared with conventional cheminformatics or toxicogenomics models, interpretation was enriched by the chemical and biological insights even though prediction accuracy did not improve. This motivated the second project that developed a novel integrative method, chemical-biological read-across (CBRA), that led to predictive and interpretable models amenable to visualization. CBRA was consistently among the most accurate models on four chemical-biological data sets. It highlighted chemical and biological features for interpretation and the visualizations aided transparency. iii Third, we developed an integrative workflow that interfaced cheminformatics prediction with pharmacoepidemiology validation using a case study of Stevens Johnson Syndrome (SJS), an adverse drug reaction (ADR) of major public health concern. Cheminformatics models first predicted potential SJS inducers and non-inducers, prioritizing them for subsequent pharmacoepidemiology evaluation, which then confirmed that predicted non-inducers were statistically associated with fewer SJS occurrences. By combining cheminformatics’ ability to predict SJS as soon as drug structures are known, and pharmacoepidemiology’s statistical rigor, we have provided a universal scheme for more effective study of SJS and other ADRs. Overall, this work demonstrated that integrative approaches could deliver more predictive and interpretable models. These models can then reliably prioritize high risk chemicals for further testing, allowing optimization of testing resources. A broader implication of this research is the growing role we envision for integrative methods that will take advantage of the various emerging data sources. iv This dissertation is dedicated to my family, especially my parents, whose unwavering love and support throughout my life have encouraged me to dream and made this research possible. v ACKNOWLEDGEMENTS I wish to thank the following people who have enriched my life as a doctoral student at UNC-Chapel Hill. First and foremost, I thank advisors Alex Tropsha and Ivan Rusyn whose generous support have allowed me to pursue interests beyond cheminformatics. More importantly, their gift of time rounded my scientific apprenticeship during which I was invited to observe, participate and contribute to the discourse between two great minds. I thank members of the Molecular Modeling Lab (MML) whose presence have made MML a wonderful place to work and play. I am grateful for their friendship and generous feedback without which my research experience would have been much poorer. In particular, I thank Alec Sedykh, Denis Fourches, Sasha Golbraikh and Regina Politi for being such patient sounding boards to my ideas both good and bad. I also thank dear friends Liying Zhang and Guiyu Zhao for their company on too many late nights. I am grateful to friends near and afar for their inspiration and support, filling my PhD journey with great memories. I thank my family for always believing in me. Thank you Mummy for being the voice of reason and comfort. A special thank you too, Daniel Oreper, my boyfriend, for patiently sharing in my joys and pains. Finally, I thank the grace of God for putting all the above people in my life to make this dissertation possible. vi TABLE OF CONTENTS LIST OF TABLES ....................................................................................................................... ix LIST OF FIGURES ...................................................................................................................... x LIST OF ABBREVIATIONS .................................................................................................... xii LIST OF SYMBOLS ................................................................................................................. xiv CHAPTER 1. INTRODUCTION ................................................................................................ 1 1.1. Chemical structural data and cheminformatics in predictive toxicology ..................... 1 1.2. Bioassay data and bioinformatics in predictive toxicology ......................................... 5 1.3. Motivation for integrative chemical-biological modeling in predictive toxicology .... 8 1.4. Review of integrative chemical-biological modeling for predictive toxicology ....... 10 1.5. Human health data and epidemiology in predictive toxicology ................................ 16 1.6. Dissertation outline .................................................................................................... 18 CHAPTER 2. INTEGRATIVE CHEMICAL-BIOLOGICAL MODELING WITH EXISTING METHODS: PREDICTING DRUG-INDUCED HEPATOTOXICITY USING QSAR AND TOXICOGENOMICS APPROACHES ................................................................................................... 20 2.1. Overview .................................................................................................................... 20 2.2. Introduction ................................................................................................................ 21 2.3. Materials and Methods ............................................................................................... 24 2.4. Results ........................................................................................................................ 29 2.5. Discussion .................................................................................................................. 42 2.6. Conclusions ................................................................................................................ 49 CHAPTER 3. INTEGRATIVE CHEMICAL-BIOLOGICAL MODELING WITH NEW METHODS: INTEGRATIVE CHEMICAL AND BIOLOGICAL READ- ACROSS (CBRA) FOR TOXICITY PREDICTION ...................................... 50 3.1. Overview .................................................................................................................... 50 vii 3.2. Introduction ................................................................................................................ 51 3.3. Materials and methods ............................................................................................... 53 3.4. Results ........................................................................................................................ 61 3.5. Discussion .................................................................................................................. 70 3.6. Conclusions ................................................................................................................ 77 CHAPTER 4. INTEGRATIVE STUDY OF ADVERSE DRUG REACTIONS: CHEMINFORMATICS PREDICTION AND PHARMACOEPIDEMIOLOGY EVALUATION OF DRUG-INDUCED STEVENS JOHNSON SYNDROME ................................................................ 79 4.1. Introduction ................................................................................................................ 79 4.2. Results ........................................................................................................................ 82 4.3. Discussion .................................................................................................................. 92 4.4. Conclusions ................................................................................................................ 95 4.5. Methods...................................................................................................................... 96 CHAPTER 5. GENERAL DISCUSSION AND CONCLUSIONS ....................................... 103 5.1. Summary of key findings ......................................................................................... 104 5.2. Contributions and practical implications ................................................................. 106 5.3. Limitations and future solutions .............................................................................. 110 5.4. Epilogue ..................................................................................................................