Improving Posing and Ranking of Molecular Docking by Izhar Wallach

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Computer Science
University of Toronto

Copyright © 2012 by Izhar Wallach

Abstract

Molecular docking is a computational tool commonly applied in drug discovery projects and in fundamental biological studies of protein-ligand interactions. Traditionally, molecular docking is used to address one of the following three questions: (i) given a ligand molecule and a protein receptor, predict the binding mode (pose) of the ligand within the context of the receptor; (ii) screen a collection of small molecules against a receptor and rank the ligands by their likelihood of being active; and (iii) given a ligand molecule and a target receptor, predict the binding affinity of the two. Here, we focus on the first two questions, namely pose prediction and ranking. Currently, state-of-the-art docking algorithms predict poses within 2 Å of the native pose at a rate lower than ~60% and, in many cases, below 40%. In ranking, their ability to identify active ligands is inconsistent and generally suffers from a high false-positive rate. In this thesis we present novel algorithms to enhance the ability of molecular docking to address these two questions. These algorithms do not replace traditional docking algorithms but rather are applied on top of them to provide a synergistic effect. Our algorithms improve pose predictions by 0.5-1.0 Å and the ranking order for 23% of the targets in gold-standard benchmarks. Just as importantly, the algorithms improve the consistency of the posing and ranking predictions over diverse sets of targets and screening libraries.
In addition to posing and ranking, we present the pharmacophore concept. A pharmacophore is an ensemble of physicochemical descriptors associated with a biological target that elucidates common interaction patterns of ligands with that target. We introduce a novel pharmacophore inference algorithm and demonstrate its utilization in molecular docking.

This thesis is organized as follows. First, we introduce the molecular docking approach for pose prediction and ranking. Second, we discuss the pharmacophore concept and present algorithms for pharmacophore inference. Third, we demonstrate the utilization of pharmacophores for pose prediction by re-scoring candidate poses generated by docking algorithms. Finally, we present algorithms to improve ranking by reducing bias in the scoring functions employed by docking algorithms.

Contents

1 Introduction to Molecular Docking
  1.1 Introduction
  1.2 Overview of docking and scoring approaches
    1.2.1 Conformational Search
    1.2.2 Scoring Functions
  1.3 Pose prediction using molecular docking
    1.3.1 Consensus scoring
    1.3.2 Ligand-based constraints
  1.4 Ranking using molecular docking
    1.4.1 Descriptor-based methods
    1.4.2 Receptor-based methods
    1.4.3 Ligand-based methods
  1.5 Pharmacophore Inference and its application in molecular docking
    1.5.1 Protein–small-molecule binding patterns
    1.5.2 Pharmacophore inference methods
2 Sub-Cavity-based Pharmacophore Inference
  2.1 Introduction
  2.2 Methods
    2.2.1 Dataset generation
    2.2.2 Ligand chemical analysis
    2.2.3 Shape identification and characterization of binding sites
    2.2.4 Binding site division into sub-cavities
    2.2.5 Sub-cavity similarity
    2.2.6 Sub-cavity clustering and reshaping
  2.3 Results & Discussion
    2.3.1 Simulated data
    2.3.2 Clustering different protein classes
    2.3.3 PDB sub-cavity analysis and inference
    2.3.4 Extensions and limitations
  2.4 Conclusion
3 Pose Prediction using Pharmacophore Hypotheses
  3.1 Introduction
  3.2 Methods
    3.2.1 Dataset generation
    3.2.2 Binding mode prediction
  3.3 Results & Discussion
    3.3.1 Scoring function analysis
    3.3.2 Predictions
    3.3.3 Robustness of the algorithm
  3.4 Conclusion & Future Work
4 Virtual Decoy Sets for Molecular Docking Benchmarks
  4.1 Introduction
  4.2 Results
    4.2.1 Benchmarks vs. DUD
    4.2.2 "Self-DUD" experiment
    4.2.3 Controlled bias experiments
    4.2.4 Benchmarks using EA-Inventor as decoy generator
  4.3 Discussion & Conclusion
  4.4 Methods
    4.4.1 Docking Procedure
    4.4.2 Self-Generated DUD
    4.4.3 Decoy generation algorithm
5 Normalizing Molecular Docking Rankings
  5.1 Introduction
  5.2 Results
    5.2.1 Experiments using a 0.8 Tanimoto coefficient threshold
    5.2.2 Experiments using a 0.5 Tanimoto coefficient threshold
    5.2.3 Results with two validated systems
    5.2.4 Experiments using randomly selected drug-like libraries
  5.3 Discussion
  5.4 Conclusion
  5.5 Methods
    5.5.1 Generation of datasets
    5.5.2 Docking
    5.5.3 Fitting and normalization
6 Summary & Conclusion
Bibliography

List of Tables

2.1 Inference success rate by homogeneity score cutoffs. Clusters having higher homogeneity scores demonstrate better inference precision. The iterative clustering-reshaping process increases the precision for clusters with higher homogeneity scores. This supports the assumption that the similarity within a cluster comes from the sharing of a common substructure: the reshaping process uncovers this structure and increases prediction accuracy. Clusters with lower homogeneity scores are less likely to share a common substructure and benefit less from the reshaping process.

2.2 Inference results for HIV-1 Protease active site. Sub-cavity inference results for the binding site of HIV-1 Protease. Using three different homogeneity score thresholds (0.65, 0.75, 0.85), the predicted sub-cavity labels were compared to a set of nine ligands. R-groups refer to Figure 2.5A. The predictions made by our algorithm appear in the 'Prediction' row (HBD: hydrogen bond donor, Arom: aromatic). No prediction was made when the cluster's homogeneity score did not pass the set threshold (indicated by a '-'). An entry X/Y indicates X correct predictions made for the Y ligands with a corresponding R-group (i.e., not all ligands have a substituted chemical group at each R position).

2.3 Sub-cavity inference results for the binding site of Thrombin. Using three different homogeneity score thresholds (0.65, 0.75, 0.85), the predicted sub-cavity labels were compared to a set of nine ligands. The predictions made by our algorithm appear in the 'Prediction' row (HBD: hydrogen bond donor, HBA: hydrogen bond acceptor). No prediction was made when the cluster's homogeneity score did not pass the set threshold (indicated by a '-'). An entry X/Y indicates X correct predictions made for the Y ligands with a corresponding R-group (i.e., not all ligands have a substituted chemical group at each R position).

3.1 Pearson correlation coefficient of configuration score vs. RMSD using the Sigmoid and Euclidean similarity functions and Thrombin (1TOM) as the target protein (the p-values of all correlations are < 10^-10).

3.2 The fraction of ligands with a correct solution ranked first (< 2.5 Å) which also have a solution τ percent better within the top 50 ranked candidate binding modes (see text). For 87% of the tested cases, when a correct binding mode appeared at the top of the ranked list, a pose more similar to the native binding mode appeared elsewhere in the top 50 poses.

3.3 Average ligand similarity of 1200 randomly generated prediction experiments. The table lists the average Tanimoto coefficient and standard deviation (std. dev.) of the ligand sets using Daylight and MACCS-like fingerprints.

4.1 A comparison of different enrichments attained using the VDS and the DUD decoy sets over the 40 DUD protein targets. AUC denotes the area under the ROC curve. A larger AUC value indicates better enrichment. In our case, datasets that produce smaller AUC values are generally considered better for benchmarking. EF1, EF20, and EFmax correspond to the enrichment factors at 1% of the decoys, 20% of the decoys, and the maximal enrichment over the whole set of decoys.

4.2 Using the eHiTS docking algorithm. A comparison of different enrichments attained using the VDS and the DUD decoy sets over the 13 targets from Andrew Good's DUD clustering. AUC denotes the area under the ROC curve. EF3 and EF20 correspond to the early enrichment at 3% of the decoys and late enrichment at 20% of the decoys.

4.3 Using the Glide docking algorithm. A comparison of different enrichments attained using the VDS and the DUD decoy sets over the 13 targets from Andrew Good's DUD clustering. AUC denotes the area under the ROC curve. EF3 and EF20 correspond to the early enrichment at 3% of the decoys and late enrichment at 20% of the decoys.

4.4 A comparison between the VDS and the DUD datasets over five physical descriptors used as indicators of physical similarity between active ligands and decoys. For every target and every physical descriptor, the properties of the VDS and DUD decoys are compared to the properties of the corresponding active ligands.

5.1 Using Glide. Distance between the distributions of docking scores of decoys generated for active (active-VDS) and non-active (non-active-VDS) molecules. Active- and non-active-VDS were generated with a 0.8 and a 0.5 TC penalty threshold.
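
The enrichment measures named in the captions of Tables 4.1-4.3 (AUC, and EF at a given percentage of the decoys) can be illustrated with a small sketch. This is not the thesis's implementation. It assumes that a lower docking score means a better-ranked molecule, and it reads "EF at x% of the decoys" as the fraction of active ligands that score at least as well as the best x% of decoys, divided by x%. The function names `roc_auc` and `enrichment_factor` and the toy scores are hypothetical.

```python
from typing import Sequence


def roc_auc(active_scores: Sequence[float], decoy_scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen active ligand scores better (lower) than a randomly chosen decoy.
    Ties count as half a win. Assumption: lower score = better rank."""
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


def enrichment_factor(
    active_scores: Sequence[float],
    decoy_scores: Sequence[float],
    decoy_fraction: float,
) -> float:
    """Enrichment factor at a given fraction of the decoys (assumed reading):
    the fraction of actives scoring at least as well as the best
    `decoy_fraction` of decoys, divided by `decoy_fraction`."""
    # Number of decoys admitted at this cutoff (at least one).
    n_allowed = max(1, int(round(decoy_fraction * len(decoy_scores))))
    # Score of the worst decoy that is still inside the admitted set.
    cutoff = sorted(decoy_scores)[n_allowed - 1]
    found = sum(1 for a in active_scores if a <= cutoff)
    return (found / len(active_scores)) / decoy_fraction


if __name__ == "__main__":
    # Toy docking scores, invented for illustration only.
    actives = [-10.0, -9.0, -5.0, -4.0]
    decoys = [-8.0, -7.0, -6.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
    print("AUC :", roc_auc(actives, decoys))
    print("EF20:", enrichment_factor(actives, decoys, 0.20))
```

Under this reading, an AUC near 0.5 and an EF near 1 mean that the docking scores separate actives from decoys no better than chance, which is why the Table 4.1 caption treats decoy sets that yield smaller AUC values as the more demanding benchmarks.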