Development and Application of Ligand-Based Cheminformatics Tools for Drug Discovery from Natural Products

Development and application of ligand-based cheminformatics tools for drug discovery from natural products Entwicklung und Anwendung von ligandenbasierten Cheminformatik-Programmen für die Identifizierung von Arzneimitteln aus Naturstoffen INAUGURALDISSERTATION zur Erlangung des Doktorgrades der Fakultät für Chemie und Pharmazie der Albert-Ludwigs-Universität Freiburg im Breisgau vorgelegt von Kiran Kumar Telukunta aus Hyderabad, Indien Juli 2018 Dekan: Prof. Dr. Manfred Jung Universität Freiburg Institut für Pharmazeutische Wissenschaften Chemische Epigenetik Albertstraße 25 79104 Freiburg Vorsitzender: Prof. Dr. Stefan Weber des Promotionsausschusses Universität Freiburg Institut für Physikalische Chemie Physikalishe Chemie Albertstraße 21 79104 Freiburg Referent: Prof. Dr. Stefan Günther Universität Freiburg Institut für Pharmazeutische Wissenschaften Pharmazeutische Bioinformatik Hermann-Herder-Straße 9 79104 Freiburg Korreferent: Prof. Dr. Rolf Backofen Universität Freiburg Institut für Informatik Lehrstuhl für Bioinformatik Georges-Köhler-Allee 106 79110 Freiburg Drittprüfer: Prof. Dr. Andreas Bechthold Universität Freiburg Institut für Pharmazeutische Wissenschaften Pharmazeutische Biologie und Biotechnologie Stefan-Meier-Straße 19 79104 Freiburg Prüfungsdatum: 30 Aug 2018 Eidesstattliche Erklärung Hiermit erkläre ich, dass ich diese Arbeit allein und ausschließlich unter Nutzung der direkt oder sinngemäß gekennzeichneten Zitate geschrieben habe. Weiterhin versichere ich, dass diese Arbeit in keinem anderen Prüfungsverfahren eingereicht wurde. Kiran Kumar Telukunta Juli 2018 Acknowledgements “Learning gives creativity; Creativity leads to thinking; Thinking provides knowledge; Knowledge makes you great." -A. P.J. Abdul Kalam • Primarily I thank athletic Prof. Dr. Stefan Günther for giving me the opportunity to be part of Pharmaceutical Bioinformatics group, where I enjoyed my journey of knowledge and challenges of science expedition. I am happy to see his journey from Junior Professor to a Professor which gives me enormous motivation both personally and professionally. His diverse group of knowledge and peer PhDs from various countries making it a unique international research group provides a set platform with great exposure. I am fortunate to have Prof. Dr. Rolf Backofen again as supervisor to my Ph.D. and my special thanks to him. My gratitude to Prof. Dr. Andreas Bechthold for accepting to be my third supervisor. • I sincerely thank my funding institutions German National Research Foundation (DFG) and University of Freiburg. • One of the high motivation for me pursuing Ph.D. and completing it is the philosophy of IndiaYouth.info which I have quoted in my master thesis, and I am blissful that this envision still sought and it evolved in me in every Planck length (lp) of the moment. This philosophy gave me the opportunity to thank Rama Rao Madarapu who joined me in this philosophy and gradually became part of my life. His substantial support is an essential asset during this period. viii • I’m happy that Dr. Stephan Flemming and Dr. Xavier Lucas born in different countries because now I can express this easily that if somebody asks me if I were to born in Germany or Spain then probably I would have been like these guys respectively. They were everything to me in my new country of living. Both of them were always accessible and available for my ideas, queries and all topics of discussion. More importantly, several exciting activities with these guys will always be great memories in my life. • Dr. Björn Grünning and Dr. Fidele Ntie-Kang were great inspiring colleagues during my Ph.D. Their maximum utilization of their resources in contributing to scientific progress is admirable. • It was enormous pleasant feeling working and contributing to science along with Dr. Kersten Döring, Dr. Dennis Klementz, and Paul Zierep. I had a lovely environment working with Dr. Anika Erxleben, Pankaj Mishra, Dr. Gwang-Jin Kim, Mehrosh Pervaiz, Ammar Qaseem, Jianyu Li, Stefan Bleher, Dr. Martin Gotthardt, Dr. Nan Liu, Daniel Eckhardt, Dr. Maria Hörnke, and Lars Rösch. • I thank Moritz Konrad, Laura van Hazendonk, and Lea Purschke for being part of my work team during my stay at Pharmaceutical Bioinformatics. • My respect to the source of my life is my parents for their love and support. I gratefully appreciate and thank the unconditional support of my sister, brother-in-law, and especially my Aunt and Uncle during the journey of my Ph.D. Meinen großen Respekt und meine Liebe zu meiner deutschen Mutter Gudrun Lorenz und Familie, die mir ein besonderes Familiengefühl in Deutschland vermittelt hat. • In the end but the extraordinarily special person in my life is my life partner Sheetal Arya Telukunta who gave me the opportunity of going through all my emotions during this period gives me immense pleasure in my joyride, and I am remarkably thankful to her for all sensations she has given to me. • I thank my second home Germany for all its support during my Masters and Doctorate. Abstract In the drug-discovery identification of small molecules that selectively bind to a biological target from virtually infinite chemical space is a time-consuming crucial step among many other critical steps in drug development. Identified molecules need to possess adequate residence times of drug-target complexes to modulate the function of the target protein and must affect the desirable phenotype. Furthermore, examination of the pharmacological activity of compounds in in vivo studies is required which are characterized by pharmacoki- netic and pharmacodynamic properties. Finally, the efficacy of the drug has to be validated in human. The huge number of natural products being approved drugs indicates the importance of natural compounds for drug discovery. Genome-mining tools can be applied to identify a substantial number of novel natural products and ligand- or structure-based virtual screening methods will further increase the pace of therapeutic compound discovery. The present doctoral thesis focuses on developing cheminformatic tools which aid basic research for lead identification in drug development. The following applications were co- developed within the scope of this work: Tools evaluating existing literature by applying text-mining in natural language processing are becoming an essential part of identifying compound-protein, drug-drug, and protein- protein interactions along with their associations to diseases in literature. PubMedPortable is a framework developed for accessing large-scale biomolecule associated data and bridges the gap between natural language processing components and relation extraction methods by providing a local queryable and searchable instance of the literature. NANPDB annotates thousands of compounds from the Northern African region; Strep- x tomeDB is an updated database of molecules produced by actinobacteria. Both developed libraries contribute significantly to the biologically relevant natural chemical space. Fur- thermore, provided web services allow for the retrieval of information about a therapeutic application, physicochemical properties, and synthesis routes. Structural elucidation of biosynthetic substances is a hurdle. The developed web tool SeMPI provides a pipeline to identify encoding gene clusters from genomic data and predicts the basic structure of related natural products. DVS offers an algorithm that serves to narrow down the chemical space that has to be screened to identify putative drugs. FragPred provides a solution in another direction by predicting the activity of compounds based on the knowledge of contained active substruc- tures. The cheminformatic tools presented in the thesis are useful for creating hypotheses for the discovery of novel drugs to certain diseases. A case study, diabetes mellitus, illustrates these tools and their operation. Starting with finding literature on diabetes mellitus and the identification of existing drugs for treatment, proposes alternative compounds from the presented natural databases. Finally, for the alternative compounds, putative targets are predicted. The manifested drug-discovery cheminformatic tools demonstrate the importance of in silico methods in modern drug discovery. Zusammenfassung Die Identifizierung kleiner Moleküle aus dem nahezu unendlichen chemischen Raum, die selektiv an ein biologisches Zielprotein binden, ist neben vielen anderen kritischen Schritten in der Arzneimittelentwicklung ein zeitaufwändiger Prozess. Identifizierte Moleküle müssen über ausreichende Verweilzeiten am Wirkstoff-Zielkomplex verfügen, um auf die Funktion des Zielproteins einzuwirken und den gewünschten Phänotyp zu beeinflussen. Darüber hinaus charakterisieren pharmakokinetische und pharmakodynamische Eigenschaften die pharmakologische Aktivität von Molekülen, die in in vivo Studien weiter untersucht werden muss. Schließlich muss die Wirksamkeit des Medikaments im Menschen validiert werden. Die große Zahl der zugelassenen Naturstoffe zeigt deren Bedeutung für die Arzneimit- telforschung. Genome-mining kann eingesetzt werden, um eine Vielzahl neuartiger Natur- produkte zu identifizieren. Liganden- oder strukturbasierte virtuelle screening-Methoden unterstützen die Entdeckung neuer therapeutischer Wirkstoffe. Die vorliegende Dissertation beschäftigt sich mit der Entwicklung cheminformatischer Werkzeuge, mit deren Hilfe die Grundlagenforschung zur Identifizierung von Leitstrukturen unterstützt werden kann. Die folgenden Anwendungen wurden im Rahmen dieser Arbeit

Development and Application of Ligand-Based Cheminformatics Tools for Drug Discovery from Natural Products

Proquest Dissertations

Molecular Biology for Computer Scientists

Ontology-Based Methods for Analyzing Life Science Data

The International Society for Computational Biology 10Th Anniversary Lawrence Hunter, Russ B

Biocuration 2016 - Posters

Scalable Approaches for Auditing the Completeness of Biomedical Ontologies

Pacific Symposium on Biocomputing 2008

2016 National Library of Medicine Informatics Training Confernece

Pacific Symposium on Biocomputing 2020 January 3-7, 2020 Big Island of Hawaii

I S C B N E W S L E T T

Simulation and Graph Mining Tools for Improving Gene Mapping Efficiency

Emidio Capriotti Phd CURRICULUM VITÆ