Molecular Similarity and Xenobiotic Metabolism

Samuel Edward Adams
Trinity College
University of Cambridge

This dissertation is submitted for the degree of Doctor of Philosophy

Preface

This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration except where specifically indicated in the text. The dissertation does not exceed the word limit for the Degree Committee.

Copyright © 2010 Samuel Edward Adams

This work is licensed under a Creative Commons Attribution-Share Alike 2.0 UK: England & Wales License. This means that you are free:

  • to copy, distribute, display, and perform the work
  • to make derivative works

Under the following conditions:

  • Attribution. You must give the original author credit.
  • Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a licence identical to this one.

For any reuse or distribution, you must make clear to others the licence terms of this work. Any of the above conditions can be waived if you get permission from the copyright holder. Nothing in this license impairs or restricts the author's moral rights.

To view the full text of this license, visit http://www.creativecommons.org; or, send a letter to Creative Commons, 171 2nd Street, Suite 300, San Francisco, California, 94105, USA.

Summary

MetaPrint2D, a new software tool implementing a data-mining approach for predicting sites of xenobiotic metabolism, has been developed. The algorithm is based on a statistical analysis of the occurrences of atom-centred circular fingerprints in both substrates and metabolites.
This approach has undergone extensive evaluation and been shown to be of comparable accuracy to current best-in-class tools, but is able to make much faster predictions, for the first time enabling chemists to explore the effects of structural modifications on a compound’s metabolism in a highly responsive and interactive manner. MetaPrint2D is able to assign a confidence score to the predictions it generates, based on the availability of relevant data and the degree to which a compound is modelled by the algorithm.

In the course of the evaluation of MetaPrint2D, a novel metric for assessing the performance of site-of-metabolism predictions has been introduced. This overcomes the bias introduced by molecule size and the number of sites of metabolism inherent to the most commonly reported metrics used to evaluate site-of-metabolism predictions.

This data-mining approach to site-of-metabolism prediction has been augmented by a set of reaction type definitions to produce MetaPrint2D-React, enabling prediction of the types of transformations a compound is likely to undergo and the metabolites that are formed. This approach has been evaluated against both historical data and metabolic schemes reported in a number of recently published studies. Results suggest that the ability of this method to predict metabolic transformations is highly dependent on the relevance of the training set data to the query compounds.

MetaPrint2D has been released as an open source software library, and both MetaPrint2D and MetaPrint2D-React are available for chemists to use through the Unilever Centre for Molecular Science Informatics’ website.

Acknowledgements

Firstly, I would like to thank my supervisor Professor Robert Glen for giving me the opportunity to undertake these studies, and for all of his help and support throughout the course of my research.
My thanks go to Dr Scott Boyer and the members of his Computational Toxicology group at AstraZeneca, Mölndal, for their welcome and the help they have given me, in particular Lars Carlsson. I would also like to thank Ola Spjuth of Uppsala University for his assistance in working with Bioclipse.

I am grateful to all the members of the Unilever Centre for Molecular Science Informatics for making my time there so interesting and enjoyable. Particular thanks have to go to Charlotte and Phil for keeping the computers working and to Susan and Emma for keeping the centre running!

Finally, I would like to express my gratitude to those who have supported me and borne with me during the writing of this thesis.

This work was funded by Boehringer Ingelheim and Unilever.

Contents

Preface
Summary
Acknowledgements
Contents
1. Introduction
    1.1 The drug discovery process
    1.2 The role of computational methods
    1.3 Virtual screening methods
    1.4 Current challenges and developments
2. Prediction of xenobiotic metabolism
    2.1 Introduction
    2.2 Effects of metabolism
    2.3 Mechanisms of metabolism
    2.4 Predicting xenobiotic metabolism
    2.5 Conclusion
3. Development of MetaPrint2D: a tool for predicting sites of xenobiotic metabolism
    3.1 Substrate/Product Occurrence Ratio Calculator
    3.2 Development of MetaPrint2D
    3.3 The Symyx® Metabolite database
    3.4 MetaPrint2D’s implementation
    3.5 Software availability
4. Evaluation and optimization of MetaPrint2D
    4.1 Reaction centre identification
    4.2 Pre-processing of Symyx® Metabolite data
    4.3 Evaluating metabolic site predictions
    4.4 Evaluation of MetaPrint2D and the effects of data pre-processing options
    4.5 Analysis of MetaPrint2D’s performance
    4.6 Speed of predictions
    4.7 Parameterization of MetaPrint2D
    4.8 Isoform specific models
    4.9 Comparison with other tools
    4.10 Accuracy of the test data
    4.11 Conclusions
5. Extension of MetaPrint2D to the prediction of transformation types and the generation of metabolites
    5.1 Introduction
    5.2 Identifying transformations
    5.3 Predicting transformations
    5.4 Generating product structures
    5.5 User interface
    5.6 Evaluation
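
The Summary describes a statistical analysis of how often atom-centred environments occur in substrates and metabolites, and Section 3.1 is titled "Substrate/Product Occurrence Ratio Calculator". The excerpt does not give the fingerprint definition, the counting rules, or the normalisation, so the following is only a minimal sketch of one way an occurrence ratio could be computed. It assumes the ratio is the number of times an environment is seen at a reaction centre divided by the number of times it is seen in substrates overall. The function names, the string labels standing in for fingerprints, and the toy counts are all hypothetical and are not taken from MetaPrint2D.

```python
from collections import Counter


def occurrence_ratios(substrate_envs, reaction_centre_envs):
    """Return, per environment, reaction-centre count / substrate count.

    Assumption (not from the thesis excerpt): each environment is a
    hashable label standing in for an atom-centred circular fingerprint.
    """
    substrate_counts = Counter(substrate_envs)
    centre_counts = Counter(reaction_centre_envs)
    # Counter returns 0 for environments never seen at a reaction centre.
    return {env: centre_counts[env] / n for env, n in substrate_counts.items()}


def rank_sites(atom_envs, ratios):
    """Score each atom of a query molecule, scaled so the best atom is 1.0.

    Environments absent from the training data score 0.0 here; a real
    tool might instead flag them as low-confidence.
    """
    raw = [ratios.get(env, 0.0) for env in atom_envs]
    top = max(raw, default=0.0)
    return [r / top if top > 0 else 0.0 for r in raw]


# Hypothetical toy training data: every atom environment seen in the
# substrates, and the subset observed at sites of metabolism.
training_substrate_envs = ["C.ar-H"] * 4 + ["N-CH3"] * 2 + ["O-CH3"]
training_centre_envs = ["C.ar-H", "N-CH3", "N-CH3"]

ratios = occurrence_ratios(training_substrate_envs, training_centre_envs)

# Hypothetical query molecule described by one environment per atom.
query_envs = ["N-CH3", "C.ar-H", "O-CH3", "unseen"]
scores = rank_sites(query_envs, ratios)
print(scores)
```

Scaling by the highest-scoring atom is a design choice made for this sketch so that scores are comparable within a molecule; it should not be read as the normalisation the thesis uses.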