Value Through Innovation


Chemistry Infrastructure Migration in a Global Pharmaceutical Company: Concerns and Reality

Zhenbin (Benjamin) Li, Ph.D.
Research Data Integration & Logistics Service

Zhenbin Li, Boehringer Ingelheim, ChemAxon UGM 2012

Chemistry Infrastructure

• Chemistry Infrastructure: Computer systems, applications or software that store, search, manipulate, calculate, and visualize chemical or biological entities and their properties. Chemistry infrastructure is indispensable computer support in the drug discovery and development processes of the pharmaceutical industry.
• Examples of Chemistry Infrastructure: Chemistry cartridges, chemistry drawing tools, structure standardization, chemical reaction and molecule visualization, etc.
• Vendors of Chemistry Cartridges: MDL Direct (Accelrys), Accord (Accelrys), JChem (ChemAxon), ICCartridge (InfoChem), Daylight (Daylight), Bingo (GGA), etc.

History of Chemistry Infrastructure

• 1955 CAS laid the groundwork for a computer-based chemical information database
• 1957 Ray and Kirsch: substructure searching algorithm (atom-by-atom matching), later modified by Sussenguth (1965)
• 1959 Opler and Baird: first graphical display of chemical structure
• 1965 Gluck, Morgan: chemical storage and search system (Du Pont); canonical form of connection (bond-by-bond), later modified by Morgan to become the Gluck-Morgan algorithm
• 1967 Armitage and Lynch: structure similarity
• 1970 Crowe et al.: fragment-based screening
• 1971 Hamilton established the Protein Data Bank (PDB) at Brookhaven National Lab
• 1971 Gund et al.: 3D structure searching
• 1972 Wipke et al.: 3D model from 2D drawing with stereochemistry
• 1977 Marson, Peacock, Wipke: Molecular Design Limited (MDL); first database, MACCS
• 1979 Chevron Chemical Company, first company to license MDL
• 1981 Lynch et al.: Markush structures; two patent databases, Markush DARC (Derwent) and MARPAT (CAS)
• 1985 First commercial sale of Rubenstein's ChemDraw to Stu Schreiber and Yale Univ.
• 1986 ChemDraw 1.0 was released
• 1987 Dolata et al.: 2D-3D converter; Hiller and Gasteiger: CORINA
• 1987 MDL listed on NASDAQ
• 1988 Weininger: SMILES notation
• 1988 Downs et al.: parallel computing system (Transputer)
• 1989 Gasteiger and Weiske: ChemInform, ChemoData, InfoChem reaction database, digitized Beilstein Handbook
• 1991 MDL ISIS client/server application
• 1992 Dalby et al. introduced MOLfile, SDfile, RDfile, RXNfile, CTAB (V2000), the de facto standard
• 1996 MDL introduced the V3000 format
• 1997 MDL acquired by Reed Elsevier
• 1998 ChemAxon formed
• 2003 Elsevier MDL introduced XDfile
• 2004 Launch and adoption of ChemAxon's JChem Cartridge for Oracle to medium-sized CRO
• 2005 Neurogen completely migrated its chemistry infrastructure from MDL to ChemAxon
• 2007 Symyx acquired Elsevier MDL
• 2010 Accelrys merged with Symyx
• 2012 ChemAxon JChem Cartridge globally licensed to 5 of the top 10 pharma companies

Sources: W. L. Chen, 2006, J. Chem. Inf. Model., and personal communications

When to Consider Chemistry Infrastructure Migration

Common Considerations of Chemistry Infrastructure Migration
• Certain legacy systems will not run in a new environment (hardware and operating systems), or require tremendous effort (or cost) to upgrade
• The current chemistry infrastructure technology lags behind the industry trend
• The current chemistry infrastructure cannot meet the increasing demand of in-house software development
• Technology and business demands for support and consulting are not being satisfied
• Long-term financial gain

Challenges in BI
• Historically, systems were built on the ISIS platform and needed to be migrated away from it
• Isentris-based alternative solutions did not offer performance advantages and were therefore temporarily shelved
• In-house chemistry systems demand robust APIs to integrate and manage global workflows

Chemistry Infrastructure Selection Criteria

• Reliability
• Consulting
• License model
• Company stability
• Extensibility
• Support
• Negotiation power
• Size
• User friendliness
• Expertise
• Short-term cost cut
• Culture/work ethics
• Clear path for migration
• Customization
• New upgrade according to customers' needs
• Long-term financial gain
• Familiarity with global pharmaceutical industry

ChemAxon as Chemistry Infrastructure

• JChemBase (Java and .NET)
• Marvin/MarvinSketch
• JChem Cartridge
• InstantJChem

Things to Consider for Chemistry Infrastructure Migration

• Legacy systems
• Timeline
• System interdependency
• On-going business demand
• Production interruption
• Re-training of developers
• User acceptance
• Re-training of end-users
• Financial commitment
• Resource
• Expertise
• Global alignment
• Reliability and flexibility of the new chemistry infrastructure

Roadmap of Chemistry Infrastructure Migration

[Diagram: roadmap stages] Business intention; market options; preliminary evaluation; consulting; in-depth evaluation; in-depth system analysis; pilot implementation; implementation planning; negotiation; financial commitment; business approval; acquire licenses; data migration; system migration; testing; completion.

External Request Management System (ERMS) as a Pilot Project

[Diagram: BI chemists and CRO chemists work through the ERMS DB and UI.]
• Fields tracked: initials, ordering dates, amount, completion dates, request date, quantity shipped, # of steps, difficulty
• Logistics Calculation and Reporting
• Inventory System Management: transfer compound info into inventory automatically
• Local / International shipping: FedEx/DHL
• Shipping Sheet: customs issues, shipment contents

ERMS as a Pilot Project

[Diagram: ERMS components]
• Request Management
• ERMS DB
• Logistics Calculations and Reporting
• User Login
• Authentication and Authorization DB
• Reagent and Compound DB
• Shipping and Status
• Structure Searching
• Commercial DB
• E-Notebook DB
Access levels shown in the legend: BI internal only; accessible by both BI and CROs; accessible by BI, but partially accessible by CROs.

Reaction Scheme

Aldol condensation of acetone and ethyl acetoacetate gave β-keto-ester 3. A Grignard reaction involving methylmagnesium bromide provided alcohol 4, which was subjected to acid-catalyzed elimination to give diene 5. Reduction and acylation gave diene 7 (Scheme 3, compound 1).

* http://en.wikipedia.org/wiki/Nicolaou_Taxol_total_synthesis

Reaction Scheme: Iteration of Compounds

• JChem allows parsing the reaction scheme into individual compounds.
• This can only be achieved when a regular arrow, instead of a reaction arrow, is used in the scheme.
• A Mol file containing a mixture of all the compounds in the reaction scheme can be separated using the getFragments() method.
• However, the order of the compounds is not necessarily consistent with the reaction scheme.
• Ideally, the developer should have some control over the order, or the ordering behavior should at least be understandable.

Steps of JChem Implementation

• JChem Oracle Cartridge installation
• Data migration using JChem Manager or pure SQL statements
• Create domain indices on the structures if data are created via SQL.
• A ChemAxon domain index can coexist with an MDL Direct index on the same database instance. This allows us to better plan the data migration with low impact on current production systems
• Rebuild the relationships in the database
• Change application code to implement ChemAxon technology
• Replace the user interfaces with ChemAxon user interfaces
• Testing and deployment

Common Cartridge Functions

• Search Structure:
    SELECT COUNT(*) FROM JCHEM_STRUCTURE
    WHERE JC_COMPARE(CD_STRUCTURE, 'C1CCCCC1', 'T:S') = 1;

  Search types:
    s: substructure search (default)
    na: substructure search fingerprint-only
    f: full structure search; query and target must have the same heavy atom network for matching.
    ff: full fragment search; query must fully match a target fragment.
    d: duplicate search
    i: similarity search
    u: superstructure search

• Insertion
    jchem_table_pkg.jc_insert('C1CCCCC1', 'JCHEM_Structure', null, null, null, null);

• Update
    jchem_table_pkg.jc_update('c1ccccc1', 'JCHEM_STRUCTURE', cd_id, null);

• Deletion
    jchem_table_pkg.jc_delete('JCHEM_STRUCTURE', 'where structure_id = 1001', null);

http://www.chemaxon.com/jchem/doc/dev/cartridge/cartapi.html

ERMS Fully ChemAxon-enabled

MDL:
    select structure_id, molfile(molecule) as mole, molwt(molecule) as mw, molfmla(molecule) as formula, smiles, …..
    from structure
    where flexmatch(molecule, ?, 'match=all')=1
ChemAxon:
    select structure_id, jc_molconvertb(cd_structure, 'mol') as mole, cd_molweight as mw, cd_formula as formula, cd_smiles as smiles, …..
    from jchem_structure
    where jc_compare(cd_structure, ?, 't:ff') = 1

MDL:
    select structure_id, molfile(molecule) as mole, chime(molecule) as chime, molwt(molecule) as mw, molfmla(molecule) as formula, smiles, …..
    from structure
    where similar(molecule, ?, ?)=1
ChemAxon:
    select structure_id, jc_molconvertb(cd_structure, 'mol') as mole, cd_molweight as mw, cd_formula as formula, cd_smiles as smiles, …..
    from jchem_structure
    where jc_compare(cd_structure, ?, 't:i simThreshold:?') = 1

MDL:
    select structure_id, molfile(molecule) as mole, chime(molecule) as chime, molwt(molecule) as mw,
ChemAxon:
    select structure_id, jc_molconvertb(cd_structure, 'mol')
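
The jc_compare queries on these slides differ only in the option string ('t:s', 't:ff', 't:i simThreshold:…'), so application code that is being migrated can generate them from one place. The following is a minimal Python sketch of that idea, not code from the talk. The helper names (`jc_compare_options`, `build_search_sql`) and the dictionary keys are hypothetical; the table and column names and the option syntax are copied from the slides and should be checked against the cartridge documentation for the installed JChem version. The 'na' code is left out because the slide does not show how it combines with the 't:' prefix.

```python
# Search-type codes as listed on the "Common Cartridge Functions" slide.
# The dictionary keys are hypothetical, chosen here only for readability.
SEARCH_TYPES = {
    "substructure": "s",
    "full": "f",
    "full_fragment": "ff",
    "duplicate": "d",
    "similarity": "i",
    "superstructure": "u",
}


def jc_compare_options(search, sim_threshold=None):
    """Build the option string passed as the third argument of jc_compare."""
    if search not in SEARCH_TYPES:
        raise ValueError(f"unknown search type: {search!r}")
    options = f"t:{SEARCH_TYPES[search]}"
    if sim_threshold is not None:
        if search != "similarity":
            raise ValueError("sim_threshold only applies to similarity search")
        threshold = float(sim_threshold)
        # Assumption: the threshold is a similarity fraction between 0 and 1.
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("sim_threshold must be between 0 and 1")
        options += f" simThreshold:{threshold}"
    return options


def build_search_sql(search, sim_threshold=None,
                     table="jchem_structure", column="cd_structure"):
    """Return a query with one bind placeholder (?) for the query structure.

    Table and column defaults follow the slides; they are interpolated as
    identifiers, so they must come from trusted configuration, not user input.
    """
    options = jc_compare_options(search, sim_threshold)
    return (f"select structure_id from {table} "
            f"where jc_compare({column}, ?, '{options}') = 1")


if __name__ == "__main__":
    print(build_search_sql("full_fragment"))
    print(build_search_sql("similarity", sim_threshold=0.8))
```

The query structure itself stays a bind parameter, as in the slides, while the numeric threshold is validated and written into the option string, because a placeholder inside a quoted SQL literal is not bound by the database driver.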