Université De Strasbourg École Doctorale Des
Total Page:16
File Type:pdf, Size:1020Kb
UNIVERSITÉ DE STRASBOURG ÉCOLE DOCTORALE DES SCIENCES CHIMIQUES UMR 7140 THÈSE présentée par : Kyrylo KLIMENKO soutenue le: 14 mars 2017 pour obtenir le grade de : Docteur de l’université de Strasbourg Discipline/Spécialité : Chimie/Chémoinformatique Computer-aided drug design of broad-spectrum antiviral compounds THÈSE dirigée par : M. VARNEK Alexandre Professeur, Université de Strasbourg M. KUZMIN Victor Professeur, Institut de Chimie Physique, Odessa, Ukraine RAPPORTEURS : M. ERTL Peter Docteur, HDR, société Novartis M. TABOUREAU Olivier Professeur, Université Paris Diderot AUTRES MEMBRES DU JURY : Mme. KELLENBERGER Esther Professeur, Université de Strasbourg Mr Kyrylo Klimenko was a member of the European Doctoral College of the University of Strasbourg during the preparation of his PhD thesis (2013-2016), class Dmitri Mendeleev. In addition to his mainstream research topic, he has attended lectures and seminars dedicated to general European policies and values presented by international experts. This PhD research project was carried out in scope of collaboration between the University of Strasbourg, France and A.V. Bogatsky Physico-Chemical Institute, Ukraine. Acknowledgements First of all, I would like to express my gratitude to my academic advisors, Prof Alexandre Varnek and Prof Victor Kuz`min for their brilliant guidance in completing the academic challenges I have faced. I am also thankful to the French Embassy in Ukraine for providing financial means to the project’s completion. I would like to thank all the members of the jury, Prof Olivier Taboureau, Dr Peter Ertl and Prof Esther Kellenberger for accepting to judge and review my work. I would also like to express my gratitude to my senior colleagues Dr. Dragos Horvath, Dr Gilles Marcou, Dr Pavel Polishchuk, as well as, former and current PhD students Dr. Helena Gaspar, Dr. Fiorela Ruggiu, Grace Delouis, Pavel Sidorov, Timur Gimadiev and Julien Denos for their advice and support. I am thankful to my collaborators from Physico-chemical Institute, Ukraine and Institute of Chemical Biology and Fundamental Medicine Russia who provided experimental evidence to my theoretical findings. Special thanks to Dr. Olga Klimchuk and Soumia Hnini for their help on accommodation and administration tasks. I am deeply and forever indebted to my parents and friends for their love and support. ABSTRACT Virtual screening and cartography of chemical space approaches have been used for design of broad-spectrum antivirals acting as nucleic acids intercalators. The 1st part of thesis reports QSPR model for aqueous solubility of organic molecules within the wide temperature range. This model was later used for solubility assessment of antiviral compounds. In the second part of work, structural filters, QSAR and pharmacophore models were developed then used to screen a database containing some 3.2 M compounds. This resulted in 55 hits which were synthesized and experimentally tested. Two lead compounds displayed high activity against Vaccinia virus and low toxicity. In the 3d part of the thesis, Generative Topographic Mapping (GTM) approach was used to build 2D maps of chemical space of antiviral compounds. Experimental data on antiviral compounds were extracted from ChEMBL database, curated and annotated by major virus Genus. Selected dataset was used to build maps on which all other ChEMBL compounds were projected. Analysis of the maps revealed structural motifs characterizing particular types of antivirals. DECLARATION • This is my own work • Information sources were acknowledged and fully referenced. Kyrylo Klimenko March, 2017 Abbreviations AD : Applicability domain BA: Balanced accuracy BVDV : Bovine viral diarrhea virus CMV : Cytomegalovirus FN : False positives FP : False positives GFP : Green fluorescent protein GTM : Generative topographic mapping HBV : Hepatitis B virus HCV : Hepatitis C virus HIV : Human immunodeficiency virus HSV : Herpes simplex virus ISIDA-SMF : In silico design and data analysis substructure molecular fragments MRV : Mammalian orthoreovirus OOB : Out-of-bag PASS : Prediction of activity spectra for substances PFU : Plaque forming unit PSM : Privileged structural motifs PRP : Privileged responsibility patterns QSA(P)R : Quantitative structure-activity(property) relationship RF : Random forest RMSE : Root-mean-square-error SiRMS : Simplex representation of molecular structure TP : True positives TN : True negatives VSV : Vesicular stomatitis virus XV : Cross-validation Contents Résumé en français .................................................................................................................... 15 INTRODUCTION ......................................................................................................................... 25 PART 1 REVIEW ON VIRUS PROBLEMATICS ................................................................... 27 1.1 Overview of the virus structure and reproduction ........................................... 27 1.2 Current antiviral treatment strategies ................................................................ 32 1.3 Earlier studies on computer-aided design of antiviral drugs .......................... 35 PART 2 COMPUTATIONAL TECHNIQUES USED IN THIS STUDY ................................ 39 2.1 (Q)SA(P)R approach - The (Quantitative) Structure-Activity (Property) Relationship ................................................................................................................. 39 2.1.1 SiRMS (Simplex representation of molecular structure) ............................ 39 2.1.2 ISIDA substructure molecular fragments ...................................................... 40 2.1.3 Random Forest .................................................................................................. 42 2.1.4 Generative Topographic Method .................................................................... 43 2.1.5 Statistics used for QSAR models performance assessment ..................... 48 2.1.6 Pharmacophore modeling ............................................................................... 50 2.1.7 Third-party predictive models used in Virtual Screening ............................ 53 2.2 Databases ........................................................................................................................ 53 2.3 Data curation tool............................................................................................................ 54 2.4 Scaffold analysis tool ..................................................................................................... 56 PART 3 QSPR MODEL FOR AQUEOUS SOLUBILITY PREDICTION ............................. 61 3.1 Dataset preparation ........................................................................................................ 61 3.2 Model development ........................................................................................................ 63 3.2.1 Determination of solubility-temperature equation ........................................ 63 3.2.2 QSPR model for prediction ......................................................................... 65 ............................................................. 65 3.2.3 QSPR solubility model development 3.3 Model validation on external test set ........................................................................... 65 3.4 Conclusions ..................................................................................................................... 67 PART 4 COMPUTER-AIDED DESIGN OF BROAD-SPECTRUM ANTIVIRALS ............. 69 4.1 Data preparation ............................................................................................................. 69 4.2 Modeling ........................................................................................................................... 71 4.2.1 Filters .................................................................................................................. 71 4.2.2 Pharmacopore models ..................................................................................... 72 4.2.3 Classification SAR models .............................................................................. 75 4.3 Virtual screening ............................................................................................................. 75 4.4 Conclusions ..................................................................................................................... 80 PART 5 VISUALIZATION AND ANALYSIS OF CHEMICAL SPACE OF ANTIVIRAL COMPOUNDS ................................................................................................................... 81 5.1 Data curation ................................................................................................................... 82 5.2 Model development ........................................................................................................ 89 5.3 Chemical space analysis ............................................................................................... 93 5.3.1 Visualization of the chemical space ............................................................... 93 5.3.2. Analysis of the privileged patterns ................................................................ 99 5.4 Activity prediction of antiviral CADD compounds .................................................... 106 5.5 Conclusions ................................................................................................................... 109 GENERAL CONCLUSIONS