Classification of Organic Compounds Into Modes of Toxic Action
Total Page:16
File Type:pdf, Size:1020Kb
Classification of Organic Compounds into Modes of Toxic Action Den Naturwissenschaftlichen Fakultäten der Friedrich-Alexander-Universität Erlangen-Nürnberg zur Erlangung des Doktorgrades vorgelegt von Simon L. Spycher aus Grabs/Schweiz Als Dissertation genehmigt von den Naturwissenschaftlichen Fakultäten der Universität Erlangen-Nürnberg Tag der mündlichen Prüfung: 18. Oktober 2005 Vorsitzender der Promotionskommission: Prof. Dr. D.-P. Häder Erstberichterstatter: Prof. Dr. J. Gasteiger Zweitberichterstatter: Prof. Dr. T. Clark This work would not have been possible without the know-how and encouragement of specialists from the most different fields of science. There is a large number of people I would like to thank. First of all my supervisor Prof. Dr. J. Gasteiger who gave me the opportunity to work on this fascinating subject in the first place and then for the support and inspiring discussions during the years in his group. Then, I would like to thank my coauthors of the publications originating from this work for the excellent collaboration: Dr. Monika Nendza of the Analytisches Laboratorium Luhnstedt, Dr. Eric Pellegrini of our group, and PD Dr. Beate Escher of the Swiss Federal Institute of Aquatic Science and Technology (EAWAG). I would also like to thank my colleagues of the team working on the multimedia teaching project Vernetztes Studium–Chemie (VS-C): Dr. Thomas Engel, Angelika Hofmann and especially Dr. Axel Schunk for the good collaboration in the Chemie für Mediziner multimedia module. Additionally I would like to express thanks to my present and former colleagues in this group for the great working atmosphere and their help with numerous programming questions and especially those who adminstrated our large network of Unix- and Windows-machines: Prof. Dr. Joao Aires de Sousa, Prof. Dr. Fernando Batista Da Costa, Ulrike Burkard, Stephan Grell, Dr. Yongquan Han, Markus Hemmer, Dr. Achim Herwig, Alexander von Homeyer, Dimitar Hristozov, Dr. Qian-Nan Hu, Dr. Wolf-Dietrich Ihlenfeldt, Thomas Kleinöder, Dr. Thomas Kostka, Rastislav Krajcik, Dr. Sung Kwang Lee, Dr. Giorgi Lekishvili, Gisela Martinek, Jörg Marusczyk, Dr. Frank Oellien, Udo Ottmann, Dr. Matthias Pförtner, Martin Reitz, Dr. Oliver Sacher, Dr. Christian Scholten, Dr. Christof Schwab, Dr. Thomas Seidel, Dr. Markus Sitzmann, Dr. James Smith, Vladimir Sykora, Dr. Alexei Tarkhov, Dr. Andreas Teckentrup, Dr. Lothar Terfloth, Dr. Jaroslaw Tomczak, Thomas Tröger, Dr. Dietrich Trümbach, Dr. Dusica Vidovic, Dr. Jörg Wegner, Dr. Ai-Xia Yan, Dr. Jinhua Zhang. Special thanks also to the following poeple from outside this group: Dr. Bjoern Reineking of the ETH-Zürich for his extensive support with R-calculations, Dr. Johannes Ranke of the University of Bremen, Dr. Said Hilal of the US-EPA, Dr. Aptula Aynur and Prof. Dr. Marc Cronin of the University of Liverpool, and especially Dr. Jean-Pierre Kocher of Molecular Networks GmbH for his help with the second publication and for his encouragements to scientific curiosity and debate. I am also very indebted to our Secretaries Angela Döbler, Ulrike Scholz and Karin Holtzke who always made sure our group runs smoothly. A big thank you to my girlfriend Cornelia for not despairing with her scientific boyfriend and finally to my parents Alfred and Hella and my brother Boris whose example and support guided me through my studies and always gave me new energy. ii Contents 1 Introduction 1 1.1 Motivation . .............................. 1 1.2 Scientific Background . ....................... 3 1.2.1 Chemoinformatics . ....................... 3 1.2.2 The Mode of Toxic Action Approach ................ 6 1.3 Objectives and Structure of this Work . ................ 11 1.3.1 Data Analysis . ....................... 12 1.3.2 Chemistry . .............................. 12 1.3.3 Toxicology .............................. 13 References . ..................................... 14 2 Comparison of Different Classification Methods Applied to a Mode of Toxic Action Data Set 21 2.1 Introduction . .............................. 22 2.2 Materials and Methods . ....................... 24 2.2.1 Data . .............................. 24 2.2.2 Preprocessing . ....................... 27 2.2.3 Variable Selection and Pattern Recognition Methods . ..... 27 2.2.3.1 Linear Discriminant Analysis (LDA) . ..... 27 2.2.3.2 Multiple logistic regression (multinom) . ..... 28 2.2.3.3 Partial Least Squares (PLS) ................ 28 2.2.3.4 Counter Propagation Neural Networks (CPG NN) . 29 2.2.4 Multiple MOA . ....................... 29 2.2.5 Performance criteria . ....................... 30 2.2.6 Model validation . ....................... 31 iii iv CONTENTS 2.3 Results . ............................... 33 2.3.1 Preliminary studies . ........................ 33 2.3.2 Reassessment . ........................ 34 2.3.3 Optimization . ........................ 36 2.3.4 Multiple MOA . ........................ 40 2.4 Discussion . ............................... 41 2.5 Conclusions . ............................... 44 References ...................................... 45 2.6 Further Discussion of Data Analysis Methods . .......... 50 3 Use of Structure Descriptors to Discriminate between Modes of Toxic Action of Phenols 53 3.1 Introduction . ............................... 54 3.2 Materials and Methods . ........................ 55 3.2.1 The Data Sets . ........................ 55 3.2.1.1 Training Set ........................ 55 3.2.1.2 Test Sets . ........................ 57 3.2.2 Calculation of Atomic Physicochemical Properties . ...... 58 3.2.3 Encoding of Atomic Physicochemical Properties .......... 58 3.2.4 Classification Methods ........................ 59 3.2.4.1 Counter-Propagation Neural Networks .......... 59 3.2.4.2 Logistic Regression . ................. 60 3.2.4.3 Model Validation . ................. 61 3.2.4.4 Determination of Prediction Space . .......... 62 3.3 Results . ............................... 63 3.3.1 CPG NN-Models Based on Structure Descriptors .......... 63 3.3.1.1 Models Based on one Physicochemical Property . 63 3.3.1.2 Models Based on an Additional Physicochemical Property 63 3.3.1.3 Improvement of Proelectrophile Classification by Adding Surface Autocorrelation Descriptors . .......... 64 3.3.2 Multinomial Logistic Regression-Models Based on Structure De- scriptors . ............................... 67 3.3.3 Models Based on Previously Published Descriptors . ...... 68 CONTENTS v 3.3.4 Predictions of External Data Sets . ................ 68 3.4 Discussion . .............................. 71 3.5 Conclusion . .............................. 74 References . ..................................... 75 3.6 Further Discussion of Structure Representation . ............ 82 4 A QSAR Model for the Intrinsic Activity of Uncouplers of Oxidative Phospho- rylation 85 4.1 Introduction . .............................. 86 4.2 Materials and Methods . ....................... 88 4.2.1 Toxic Mechanisms . ....................... 88 4.2.2 Data Set . .............................. 96 4.2.3 Calculation of Descriptors . ................ 98 4.2.4 Statistical Methods and Model Validation . ............ 100 4.3 Results ..................................... 102 4.3.1 Intrinsic Activity, ECm ........................ 102 4.3.2 Aqueous Toxicity, ECw ........................ 104 4.4 Discussion . .............................. 106 4.5 Conclusion . .............................. 109 References . ..................................... 110 4.6 Further Discussion of Toxicological Aspects and Mechanistic Descriptors . 116 5 Classification Study for the Uncoupling Activity Model 119 5.1 Derivation of a classification method for uncoupling activity . ..... 119 5.1.1 Toxic Mechanism . ....................... 119 5.1.2 Data . .............................. 120 5.1.3 Results . .............................. 121 5.2 Evaluation of Descriptor Quality ....................... 125 5.2.1 Gas Phase Acidity . ....................... 126 5.3 Discussion . .............................. 127 5.4 Conclusion . .............................. 129 References . ..................................... 129 6 Conclusion and Outlook 133 vi CONTENTS References ...................................... 136 7 Summary 139 8 Zusammenfassung 143 A Full Equations used in Chapter 4 147 B Data set used for the classification study of Chapter 5 151 C Publications 157 D Lebenslauf 159 Chapter 1 Introduction 1.1 Motivation Over the last century, chemistry has been one of the strongest driving forces for innovation in science, technology and economy. The enormous progress especially in the field of synthesis allowed the development of tens of thousands of new compounds and products with previ- ously unknown efficiency and often with completely new properties. The innovations lead to new drugs, textiles, dyes, pesticides just to name a few applications. However, in the sixties and seventies it became increasingly clear that some products also had unwanted side effects not only on a local but also on a global scale [1]. In some cases these side effects came as complete surprise to the inventors, as an example the chlorofluorocarbons (CFC) were introduced as cooling gases especially because of their low toxicity and non-flammability. Decades later they revealed the unwanted property to catalytically reduce the concentration of stratospheric ozone and their use had to be severely restricted and partially banned. The increasing public pressure and scientific concern led most countries to introduce laws for the protection of the environment and specifically to require