Modelling of Stereoisomerism for Generative Chemistries

Modelling of Stereoisomerism for Generative Chemistries

Modelling of Stereoisomerism for Generative Chemistries Jakob Lykke Andersen, Christoph Flamm, Daniel Merkle, Peter F. Stadler Department of Mathematics and Computer Science University of Southern Denmark Bled, February 2015 1/25 A Note on Terminology External Representation The external storage format used for data exchange. (E.g., a molecule is stored as a SMILES string, or InChI string) Internal Representation (Implementation) The data structures used to represent the the model, and the algorithms to manipulate the data. (E.g., a molecule is an adjacency list with . ) Model The abstract mathematical description of objects and their semantics. (E.g., a molecule is an connected, undirected, simple, labelled graph.) Reality ??? 2/25 Isomers Isomers H F H H-C-C-C-H Stereoisomers H H H Constitutional (structural) isomers (spatial isomers) H H H H-C-C-C-F H H H H H Diastereomers Enantiomers C C F F Br Br Cl Cl cis/trans isomers 1 4 4 2 3 2 3 Conformers 1 cis-2-butene trans-2-butene Rotamers “Isomers are molecules with the same chemical formula but different chemical structures.” [Wikipedia, Isomer] | {z } (not to be confused with the “structure” in “structural isomers”) 3/25 Our Current Molecule Model A molecule is a connected, undirected, simple, labelled graph. HO O H H H - - OH H H H - C - O H - H - - - - - HO P O OH H O C C C C - - - - - - - - - = O O P - O O O O = - H - H OH O (a) A molecule (b) A model of a molecule We can distinguish between constitutional (structural) isomers. 4/25 Our Current Molecule Model . but not stereoisomers. HO O HO O OH OH HO P O OH HO P O OH H H H - - - O HO H H C - O H - H - - - - OH OH- H O C C C C - - - - - (a) Ribulose 5-phosphate - (b) Xylulose- - 5-phosphate- = O P - O O O O = - H - H O (c) The model for both molecules. 5/25 Our Current Molecule Model Data Structures I Graphs with labels (adjacency lists with strings). Only local information about atoms and bonds. Algorithms I Graph isomorphism Are two data structures representations of the same graph? I Subgraph monomorphism Pattern matching for graphs. Substructure search. I Composition of transformation rules Generalised graph transformation. Computing reactions. I Graph canonicalisation Faster isomorphism check. Making “comfortable” storage formats. We would like to model stereochemistry as well. 6/25 Data Structures I Each edge (bond): A behaviour (usually the bond type) I Each vertex (atom): I Number of incident lone pairs. I A geometry tag. I An ordered list of incident edges and lone pairs. Based on “the ordered list method”. [Petrarca et al., J. Chem. Doc., 1967] [Wipke and Dyott, J. Am. Chem. Soc., 1974] Extension for Modelling (Some) Stereochemistry Goals I Molecules are still graphs, but now with more information. I Information is still localised on atoms and bonds. I It should really be an extension: the current algorithms are simplifications of the new algorithms. Limitation I Only local geometry (or derived thereof) can be modelled. 7/25 Extension for Modelling (Some) Stereochemistry Goals I Molecules are still graphs, but now with more information. I Information is still localised on atoms and bonds. I It should really be an extension: the current algorithms are simplifications of the new algorithms. Limitation I Only local geometry (or derived thereof) can be modelled. Data Structures I Each edge (bond): A behaviour (usually the bond type) I Each vertex (atom): I Number of incident lone pairs. I A geometry tag. I An ordered list of incident edges and lone pairs. Based on “the ordered list method”. [Petrarca et al., J. Chem. Doc., 1967] [Wipke and Dyott, J. Am. Chem. Soc., 1974] 7/25 Lone Pair-Augmentation Lone pairs contribute to the geometry. [Wikipedia, Lone Pair] Add a virtual edge and vertex for each lone pair: e H H H O e H N e O N H H H H H H (a) Before (b) After (a) Before (b) After Oxygen in water. Nitrogen in ammonia. (A virtual edge has single bond behaviour as default.) 8/25 “up” is where hte first neighbour points in positive order from above Example: TetrahedralFree Tetrahedral ≡ 4 neighbours, tetrahedron shape Free ≡ all have single bond behaviour (E.g., a carbon with 4 bonds) Ordering Semantics z}|{ [ , , , ] | {z } 9/25 in positive order from above Example: TetrahedralFree Tetrahedral ≡ 4 neighbours, tetrahedron shape Free ≡ all have single bond behaviour (E.g., a carbon with 4 bonds) Ordering Semantics “up” is where hte first neighbour points z}|{ [ , , , ] | {z } 9/25 Example: TetrahedralFree Tetrahedral ≡ 4 neighbours, tetrahedron shape Free ≡ all have single bond behaviour (E.g., a carbon with 4 bonds) Ordering Semantics “up” is where hte first neighbour points z}|{ [ , , , ] | {z } in positive order from above 9/25 Example: TetrahedralFree [ , , , ][ , , , ][ , , , ][ , , , ] [ , , , ][ , , , ][ , , , ][ , , , ] [ , , , ][ , , , ][ , , , ][ , , , ] [ , , , ][ , , , ][ , , , ][ , , , ] [ , , , ][ , , , ][ , , , ][ , , , ] [ , , , ][ , , , ][ , , , ][ , , , ] Equivalence permutation group: G≡ = h(1)(2 3 4), (1 2)(3 4)i Non-equivalence permutations: G6≡ = G≡ ◦ (1)(2)(3 4) 10/25 Induced permutation: (1 3)(2)(4) ∈ G6≡ Example: TetrahedralFree, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d [a, b, c, d] b c s [s, t, u, v] v t u 11/25 ∈ G6≡ Example: TetrahedralFree, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d [a, b, c, d] b c Induced permutation: (1 3)(2)(4) s [s, t, u, v] v t u 11/25 Example: TetrahedralFree, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d [a, b, c, d] b c Induced permutation: (1 3)(2)(4) ∈ G6≡ s [s, t, u, v] v t u 11/25 Edge Behaviours (Bond Types) I Single: no rotational constraints. I Double: inhibits rotation, 1 reference half-plane. Formation of π-bond: [Wikipedia, Pi bond] I Triple: inhibits rotation, 2 reference half-planes. I Conjugated: inhibits rotation, 1 reference half-plane. Formation of conjugated bonds: [Wikipedia, Conjugated system] 12/25 Example: TrigonalD Trigonal ≡ 3 neighbours, planar shape D ≡ 1 double bond, 2 single bonds Ordering Semantics The ordering defines a reference half-plane. always the double bond z}|{ [ , , ] | {z } in positive order from above I Incident reference half-planes are equal. I [ , , ] and [ , , ] have opposite half-planes. Half-plane-swapping permutation(s): G = {(1)(2 3)}. I G≡ = {(1)(2)(3)(4)}, G6≡ = ∅ 13/25 Example: TrigonalD Either : [ , , ] : [ , , ] or : [ , , ] : [ , , ] Either : [ , , ] : [ , , ] or : [ , , ] : [ , , ] 14/25 Induced permutations: : (1)(2 3) : (1)(2)(3) swaps half-plane, does not ⇒ invalid isomorphism. Example: TrigonalD, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d : [ , a, b] : [ , c, d] b c s v : [ , s, t] : [ , u, v] t u 15/25 swaps half-plane, does not ⇒ invalid isomorphism. Example: TrigonalD, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d : [ , a, b] : [ , c, d] b c Induced permutations: : (1)(2 3) : (1)(2)(3) s v : [ , s, t] : [ , u, v] t u 15/25 Example: TrigonalD, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d : [ , a, b] : [ , c, d] b c Induced permutations: : (1)(2 3) : (1)(2)(3) s v : [ , s, t] : [ , u, v] t u swaps half-plane, does not ⇒ invalid isomorphism. 15/25 Example : [ , , ] : [ , , ] Example: LinearDD Linear ≡ 2 neighbours, linear shape DD ≡ 2 double bonds Ordering Semantics The ordering does not matter: [ , ] and [ , ] mean the same. I.e., G≡ = h(1 2)i Half-plane Propagation The other half-plane is at 90◦, seen from either end. 16/25 Example: LinearDD Linear ≡ 2 neighbours, linear shape DD ≡ 2 double bonds Ordering Semantics The ordering does not matter: [ , ] and [ , ] mean the same. I.e., G≡ = h(1 2)i Half-plane Propagation The other half-plane is at 90◦, seen from either end. Example : [ , , ] : [ , , ] 16/25 Data Structure I Attach a “fixed”-flag to each ordering element. (they may not be mutually independent) Semantics I Fixed elements may still be moved by G≡. I Moves permutations with the non-stabilised elements all being non-fixed into G≡. I Variations of the same molecule are now partially ordered by generality/specificity. (Partially) Unspecified Stereo Information I Our current molecules have no information. (absence of information ≡ completely unspecified information) I Parts of a molecule may have unspecified information. I Part of a local configuration may be unspecified (e.g., in trigonal bipyramidal geometry). 17/25 (Partially) Unspecified Stereo Information I Our current molecules have no information. (absence of information ≡ completely unspecified information) I Parts of a molecule may have unspecified information. I Part of a local configuration may be unspecified (e.g., in trigonal bipyramidal geometry). Data Structure I Attach a “fixed”-flag to each ordering element. (they may not be mutually independent) Semantics I Fixed elements may still be moved by G≡. I Moves permutations with the non-stabilised elements all being non-fixed into G≡. I Variations of the same molecule are now partially ordered by generality/specificity. 17/25 ? is incomparable (by specificity) to ? is not isomorphic to ? can unify with ? (Partially) Unspecified Stereo Information ? ? is less (specific) than ? is not isomorphic to ? ? can unify with ? 18/25 (Partially) Unspecified Stereo Information ? ? is less (specific) than ? is not isomorphic to ? ? can unify with ? ? is incomparable (by specificity) to ? is not isomorphic to ? can unify with ? 18/25 ? ? ? ? chemically equivalent Partial Order of Graphs (Specificity) ? ? ? ? 19/25 ? ? chemically equivalent Partial Order of Graphs (Specificity) ? ? ? ? ? ? 19/25 chemically equivalent Partial Order of Graphs (Specificity) ? ? ? ? ? ? ? ? 19/25 chemically equivalent Partial Order of Graphs (Specificity) ? ? ? ? ? ? ? ? 19/25 chemically equivalent Partial Order of Graphs (Specificity) ? ? ? ? ? ? ? ? 19/25 Partial Order of Graphs (Specificity) ? ? ? ? ? ? ? ? chemically equivalent 19/25 Substructures with Stereo Information Modelling status I In progress.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    44 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us