CHEMICAL REPRESENTATION GUIDE 2016 Copyright Notice

©2015 Dassault Systèmes. All rights reserved. 3DEXPERIENCE, the Compass icon and the 3DS logo, CATIA, SOLIDWORKS, ENOVIA, DELMIA, SIMULIA, GEOVIA, EXALEAD, 3D VIA, BIOVIA and NETVIBES are commercial trademarks or registered trademarks of Dassault Systèmes or its subsidiaries in the U.S. and/or other countries. All other trademarks are owned by their respective owners. Use of any Dassault Systèmes or its subsidiaries trademarks is subject to their express written approval.

Acknowledgments and References

To print photographs or files of computational results (figures and/or data) obtained using BIOVIA software, acknowledge the source in an appropriate format. BIOVIA may grant permission to republish or reprint its copyrighted materials. Requests should be submitted to BIOVIA Support, either through electronic mail to [email protected], or in writing to:

BIOVIA Support
5005 Wateridge Vista Drive, San Diego, CA 92121 USA

Contents

Overview 1
Audience for this Guide 1
Prerequisite Knowledge 1
Related BIOVIA Documentation 1
Molecule Representation 3
Substances, Structures and Fragments 3
The BIOVIA Periodic Table 3
Atom Properties 3
Charges, Radicals and Isotopes 3
Valences and Implicit Hydrogens 4
Default Valences 4
Explicit Valence 5
Rules for Calculation of Valence and Implicit Hydrogen 5
Bond Types 5
Aromaticity 7
Tautomers 8
Salts 8
Tetrahedral Stereochemistry 9
Grouping Related Stereogenic Centers 9
Structures with One Stereogenic Center 10
Structures with Multiple Stereogenic Centers 11
Chiral Labels 13
Rules for Unambiguous Representation of Tetrahedral Stereochemistry 15
Original Accelrys Representation of Tetrahedral Stereochemistry 16
Examples of Stereogroups 16
Example 1: Racemic Mixture 16
Example 2: Acquiring Increasing Amounts of Information on the Stereochemistry of a Sample 17
Example 3: A Mixture of Epimers in a Reaction Product 20
Stereochemistry of Allenes and Biaryls 23
Cis and Trans Stereochemistry 24
Meso Compounds 24
Spiro Compounds 25
Chemical and Data Substance Groups (Sgroups) 25
Data Sgroups 25
Abbreviated Structures 26
Multiple Groups 26
Other Chemical Sgroups 27
Structural Uncertainty 27
Star Atoms (*) 27
No-Structures 28
Reaction Representation 29
Introduction 29
Reaction Mapping 29
Mapping Reactions Automatically 30
Mapping Reactions Manually 31
Stereoconfiguration Atom Properties in Mapped Reactions 32
Properties of Bonds in Mapped Reactions 32
Simple Bond Properties 32
Combined Bond Properties 33
Example: Ambiguity in the Fate of Reacting Atoms and Bonds 33
Example: Multiple Fates of Bonds in Products 34
Stereogroup Information in Reactants and Products 34
Markush (Rgroup) Structures 36
Library Representation 36
The Root Structure 37
Rgroups and Rgroup Members 38
Nested Rgroups 38
One Attachment Point 40
Two Attachment Points 40
Unconnected Rgroup Atom 41
Null Members 42
Markush Structures Differ From Markush Queries 43
Enumeration of Markush Structures 43
Scaffold-based Enumeration 43
Reaction-based Enumeration 44
Rgroup Decomposition 44
Homology Groups 47
About Homology Groups 47
Creating Substructure Search (SSS) Queries 47
Limitations 48
Biopolymer Representation 49
Configuring Databases for Biopolymer Registration and Searching 49
Condensed Representation of Biopolymers 49
Hybrid Representation 49
*Atoms Alone 49
Pseudoatoms Alone 50
*Atoms and Pseudoatoms 51
Summary of Biopolymer Structure Conventions 51
Compatibility of Sequences Created in ISIS/Draw 52
ISIS/Draw Sequences Use the Full Structure Convention 52
ISIS/Draw Residue Templates Lack Explicit Attachment Atoms 52
Stereochemistry of ISIS/Draw Residue Template 52
Compatibility of Condensed and Full Structure Conventions 53
Required Sgroup Fields for Biopolymer Representation 53
Sgroup Field for Identifying Attachment Atoms 53
Sgroup Field for *Atom Representation 54
Structures Used in Biopolymer Representation 54
Biopolymer Residues 55
Single-attachment Groups 55
Protecting Groups 55
PEG Molecules 55
Special Features of Abbreviated Structures 55
Template Format for Biopolymer Residues 56
Abbreviation Class 56
Terminal Leaving Groups 57
Order and Bond Matching Attributes of Attachment Atoms 58
Attachment Atoms on Reactive Chains 58
Explicit Hydrogen Leaving Groups on Attachment Atoms of Reactive Chains 59
Explicit Hydrogen Leaving Groups on Histidine 61
Molfile Features in Biopolymer Templates 62
Abbreviation Class (SCL) 62
Sgroup Subscript (SMT) 62
Abbreviation Attachment Atom (SAP) 62
Creating and Enforcing Conventions for Biopolymer Representation 63
Choose One Convention for Each Chemical Entity 63
Use Standard Abbreviations 63
Subsequence Searching in Isentris Applications 64
Subsequence Search Differs from Substructure Search 64
BIOVIA Draw Programming Interface (API) for subsequence search 65
Implementing Subsequence Search for Full Structure Representations 65
Related Documentation for BIOVIA Draw 65
Mixture Representation 66
Ordered and Unordered Mixtures 66
Using * atoms for Unspecified Structures in Mixtures 66
Examples of Mixtures 67
Unordered Mixtures 67
Reaction Product 67
Mixture of Stereoisomers 67
Aspirin Tablet 68
Ordered Mixture 69
Polymer Representation 70
Introduction to Polymer Representation 70
Structure-based and Source-based Representation 70
Polymer Bracket Types 71
Structural Repeating Unit (SRU) Brackets 71
Monomer Brackets (mon) 71
Mer Brackets (mer) 71
Copolymer Brackets (co) 71
Additional Polymer Brackets 72
Cyclization and Phase Shifting 72
Polymer Repeat Pattern 73
Polymers with Two Crossing Bonds 73
Ladder-type Polymers 74
Polymers with Three or More Brackets 76
Polymer End Groups 77
Why Are Representation Conventions Important? 77
Guidelines for Graphical Representation of Polymers 78
Structure-based and Source-based Representation 78
Stereoregularity in Polymers 79
Using Attached Data in Polymer Representation 79
Required Attached Data for Polymer Representation 79
Polymer or Copolymer Type 79
Stereoregularity 79
Guidelines for Defining Additional Attached Data 80
Examples of Structure-Based Representation 81
Regular Homopolymers 81
Simple Homopolymers 81
Stereoregularity 82
Ladder-type Polymers 82
Irregular Homopolymers 82
Alternating and Periodic Polymers 83
Statistical, Random, and Unspecified Copolymers 84
Unspecified Copolymers 84
Random Copolymers 85
Statistical Copolymers 86
Regular Block Copolymers 86
Ordered Diblock 86
Block Copolymer with Junction Unit 87
Segmented Block Copolymers 87
Star Block Copolymers 89
Irregular Block Copolymers 89
Chemically Modified Polymers 90
Graft Polymers and Copolymers 91
Single Graft at a Known Site 92
Mixed Graft at a Known Site 92
Cross-linked Polymers 92
Examples of Source-based Representation 94
Guidelines for Source-based Representation 94
Homopolymers 95
Alternating and Other Periodic Polymers 95
Copolymers from Monomers that Do Not Homopolymerize 96
Statistical, Random, and Unspecified Copolymers 96
Unspecified Copolymers 96
Random Copolymers 97
Statistical Copolymers 98
Block Copolymer 98
Exact Search (Flexmatch) 100
Flexmatch Switches 100
How to Specify Switches 100
Description of Switches 101
Isotopes (MAS) 101
Bonds (BON, STE, TAU) 101
Bond (BON) 101
Stereochemistry (STE) 101
Tautomer Bonds (TAU) 102
Salts and Parent Compounds (CHA, ION, FRA, SAL) 102
Charge (CHA and ION) 102
Fragments (FRA) 102
Hydrogen Count (HYD) 103
Salts (SAL) 103
Alternative Structure Representations (MET, RAD, VAL, IgnoreChargesInPiSystems) 103
Metal Bonds (MET) 103
Radicals (RAD) 103
Valence (VAL) 104
Substance Groups (Sgroups) 104
Attached Data (DAT) 104
Polymer End Groups (end) 104
Mixtures (MIX) 104
Monomer/SRU Uniqueness (MSU) 104
Polymers (POL) 105
Polymer Type (TYP) 106
Dependencies Between Switches 106
Examples of the Interactivity of Flexmatch Switches 106
Sulfones 106
Sulfoxides 107
Thiocarboxylic Acid Salts 107
Organometallic Complexes 107
Flexmatch Search of Structures with Tetrahedral Stereochemistry 109
Absolute Configuration 111
Relative Stereoconfigurations (OR groups) 112
Example 1 112
Example 2 112
Example 3 113
Example 4 115
Example 5 115
Example 6 117
Mixtures of Relative Stereoconfigurations (AND groups) 117
Flexmatch Search of Polymers 119
Useful Combinations of Flexmatch Switches 121
Exact Match/As Drawn 121
Exact Match/As Drawn plus Tautomers 122
Tetrahedral Stereochemistry 122
Cis/Trans Geometric Stereochemistry 122
Exact Match/As Drawn plus Stereoisomers 123
Tetrahedral Stereochemistry 123
Cis and Trans Stereochemistry 124
Original Tautomer Search 124
Exact Match/As Drawn plus Salts 124
The Least Restrictive Flexmatch Switches 126
Exact Match/As Drawn Polymer Search 126
Source-based and Structure-based Polymer Search 126
Copolymer Search 127
Substructure Search 128
Definition of Substructure Search 128
Query Features on Atoms and Bonds 129
Allowing or Excluding Specific Atoms 130
Atom Query Feature: Any Atom (A) 130
Atom Query Feature: Heteroatoms (Q) 130
Atom Query Feature: List 130
Atom Query Feature: Not List 130
Atom Query Feature: H0 131
Atom Query Feature: Unsaturated Atom (u) 131
Allowing a Specific Number of Attachments 132
Atom Query Feature: Substitution Count of Zero (s0) 132
Atom Query Feature: Substitution Count of One (s1) 132
Atom Query Feature: Substitution Count of Two (s2) 133
Atom Query Feature: Substitution Count of Three (s3) 133
Atom Query Feature:
Reserved Names for Sgroup Fields 146
Guidelines for Defining Your Own Attached Data 147
Examples of Attached Data 147
Distinguishing Relative Stereoisomers 147
Isotopic Purity 149
Creating Sgroup Fields in Your Database 149