
Modelling of Stereoisomerism for Generative Chemistries Jakob Lykke Andersen, Christoph Flamm, Daniel Merkle, Peter F. Stadler Department of Mathematics and Computer Science University of Southern Denmark Bled, February 2015 1/25 A Note on Terminology External Representation The external storage format used for data exchange. (E.g., a molecule is stored as a SMILES string, or InChI string) Internal Representation (Implementation) The data structures used to represent the the model, and the algorithms to manipulate the data. (E.g., a molecule is an adjacency list with . ) Model The abstract mathematical description of objects and their semantics. (E.g., a molecule is an connected, undirected, simple, labelled graph.) Reality ??? 2/25 Isomers Isomers H F H H-C-C-C-H Stereoisomers H H H Constitutional (structural) isomers (spatial isomers) H H H H-C-C-C-F H H H H H Diastereomers Enantiomers C C F F Br Br Cl Cl cis/trans isomers 1 4 4 2 3 2 3 Conformers 1 cis-2-butene trans-2-butene Rotamers “Isomers are molecules with the same chemical formula but different chemical structures.” [Wikipedia, Isomer] | {z } (not to be confused with the “structure” in “structural isomers”) 3/25 Our Current Molecule Model A molecule is a connected, undirected, simple, labelled graph. HO O H H H - - OH H H H - C - O H - H - - - - - HO P O OH H O C C C C - - - - - - - - - = O O P - O O O O = - H - H OH O (a) A molecule (b) A model of a molecule We can distinguish between constitutional (structural) isomers. 4/25 Our Current Molecule Model . but not stereoisomers. HO O HO O OH OH HO P O OH HO P O OH H H H - - - O HO H H C - O H - H - - - - OH OH- H O C C C C - - - - - (a) Ribulose 5-phosphate - (b) Xylulose- - 5-phosphate- = O P - O O O O = - H - H O (c) The model for both molecules. 5/25 Our Current Molecule Model Data Structures I Graphs with labels (adjacency lists with strings). Only local information about atoms and bonds. Algorithms I Graph isomorphism Are two data structures representations of the same graph? I Subgraph monomorphism Pattern matching for graphs. Substructure search. I Composition of transformation rules Generalised graph transformation. Computing reactions. I Graph canonicalisation Faster isomorphism check. Making “comfortable” storage formats. We would like to model stereochemistry as well. 6/25 Data Structures I Each edge (bond): A behaviour (usually the bond type) I Each vertex (atom): I Number of incident lone pairs. I A geometry tag. I An ordered list of incident edges and lone pairs. Based on “the ordered list method”. [Petrarca et al., J. Chem. Doc., 1967] [Wipke and Dyott, J. Am. Chem. Soc., 1974] Extension for Modelling (Some) Stereochemistry Goals I Molecules are still graphs, but now with more information. I Information is still localised on atoms and bonds. I It should really be an extension: the current algorithms are simplifications of the new algorithms. Limitation I Only local geometry (or derived thereof) can be modelled. 7/25 Extension for Modelling (Some) Stereochemistry Goals I Molecules are still graphs, but now with more information. I Information is still localised on atoms and bonds. I It should really be an extension: the current algorithms are simplifications of the new algorithms. Limitation I Only local geometry (or derived thereof) can be modelled. Data Structures I Each edge (bond): A behaviour (usually the bond type) I Each vertex (atom): I Number of incident lone pairs. I A geometry tag. I An ordered list of incident edges and lone pairs. Based on “the ordered list method”. [Petrarca et al., J. Chem. Doc., 1967] [Wipke and Dyott, J. Am. Chem. Soc., 1974] 7/25 Lone Pair-Augmentation Lone pairs contribute to the geometry. [Wikipedia, Lone Pair] Add a virtual edge and vertex for each lone pair: e H H H O e H N e O N H H H H H H (a) Before (b) After (a) Before (b) After Oxygen in water. Nitrogen in ammonia. (A virtual edge has single bond behaviour as default.) 8/25 “up” is where hte first neighbour points in positive order from above Example: TetrahedralFree Tetrahedral ≡ 4 neighbours, tetrahedron shape Free ≡ all have single bond behaviour (E.g., a carbon with 4 bonds) Ordering Semantics z}|{ [ , , , ] | {z } 9/25 in positive order from above Example: TetrahedralFree Tetrahedral ≡ 4 neighbours, tetrahedron shape Free ≡ all have single bond behaviour (E.g., a carbon with 4 bonds) Ordering Semantics “up” is where hte first neighbour points z}|{ [ , , , ] | {z } 9/25 Example: TetrahedralFree Tetrahedral ≡ 4 neighbours, tetrahedron shape Free ≡ all have single bond behaviour (E.g., a carbon with 4 bonds) Ordering Semantics “up” is where hte first neighbour points z}|{ [ , , , ] | {z } in positive order from above 9/25 Example: TetrahedralFree [ , , , ][ , , , ][ , , , ][ , , , ] [ , , , ][ , , , ][ , , , ][ , , , ] [ , , , ][ , , , ][ , , , ][ , , , ] [ , , , ][ , , , ][ , , , ][ , , , ] [ , , , ][ , , , ][ , , , ][ , , , ] [ , , , ][ , , , ][ , , , ][ , , , ] Equivalence permutation group: G≡ = h(1)(2 3 4), (1 2)(3 4)i Non-equivalence permutations: G6≡ = G≡ ◦ (1)(2)(3 4) 10/25 Induced permutation: (1 3)(2)(4) ∈ G6≡ Example: TetrahedralFree, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d [a, b, c, d] b c s [s, t, u, v] v t u 11/25 ∈ G6≡ Example: TetrahedralFree, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d [a, b, c, d] b c Induced permutation: (1 3)(2)(4) s [s, t, u, v] v t u 11/25 Example: TetrahedralFree, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d [a, b, c, d] b c Induced permutation: (1 3)(2)(4) ∈ G6≡ s [s, t, u, v] v t u 11/25 Edge Behaviours (Bond Types) I Single: no rotational constraints. I Double: inhibits rotation, 1 reference half-plane. Formation of π-bond: [Wikipedia, Pi bond] I Triple: inhibits rotation, 2 reference half-planes. I Conjugated: inhibits rotation, 1 reference half-plane. Formation of conjugated bonds: [Wikipedia, Conjugated system] 12/25 Example: TrigonalD Trigonal ≡ 3 neighbours, planar shape D ≡ 1 double bond, 2 single bonds Ordering Semantics The ordering defines a reference half-plane. always the double bond z}|{ [ , , ] | {z } in positive order from above I Incident reference half-planes are equal. I [ , , ] and [ , , ] have opposite half-planes. Half-plane-swapping permutation(s): G = {(1)(2 3)}. I G≡ = {(1)(2)(3)(4)}, G6≡ = ∅ 13/25 Example: TrigonalD Either : [ , , ] : [ , , ] or : [ , , ] : [ , , ] Either : [ , , ] : [ , , ] or : [ , , ] : [ , , ] 14/25 Induced permutations: : (1)(2 3) : (1)(2)(3) swaps half-plane, does not ⇒ invalid isomorphism. Example: TrigonalD, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d : [ , a, b] : [ , c, d] b c s v : [ , s, t] : [ , u, v] t u 15/25 swaps half-plane, does not ⇒ invalid isomorphism. Example: TrigonalD, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d : [ , a, b] : [ , c, d] b c Induced permutations: : (1)(2 3) : (1)(2)(3) s v : [ , s, t] : [ , u, v] t u 15/25 Example: TrigonalD, Isomorphism Given a graph isomorphism. Decide if it is still valid with stereo information. a d : [ , a, b] : [ , c, d] b c Induced permutations: : (1)(2 3) : (1)(2)(3) s v : [ , s, t] : [ , u, v] t u swaps half-plane, does not ⇒ invalid isomorphism. 15/25 Example : [ , , ] : [ , , ] Example: LinearDD Linear ≡ 2 neighbours, linear shape DD ≡ 2 double bonds Ordering Semantics The ordering does not matter: [ , ] and [ , ] mean the same. I.e., G≡ = h(1 2)i Half-plane Propagation The other half-plane is at 90◦, seen from either end. 16/25 Example: LinearDD Linear ≡ 2 neighbours, linear shape DD ≡ 2 double bonds Ordering Semantics The ordering does not matter: [ , ] and [ , ] mean the same. I.e., G≡ = h(1 2)i Half-plane Propagation The other half-plane is at 90◦, seen from either end. Example : [ , , ] : [ , , ] 16/25 Data Structure I Attach a “fixed”-flag to each ordering element. (they may not be mutually independent) Semantics I Fixed elements may still be moved by G≡. I Moves permutations with the non-stabilised elements all being non-fixed into G≡. I Variations of the same molecule are now partially ordered by generality/specificity. (Partially) Unspecified Stereo Information I Our current molecules have no information. (absence of information ≡ completely unspecified information) I Parts of a molecule may have unspecified information. I Part of a local configuration may be unspecified (e.g., in trigonal bipyramidal geometry). 17/25 (Partially) Unspecified Stereo Information I Our current molecules have no information. (absence of information ≡ completely unspecified information) I Parts of a molecule may have unspecified information. I Part of a local configuration may be unspecified (e.g., in trigonal bipyramidal geometry). Data Structure I Attach a “fixed”-flag to each ordering element. (they may not be mutually independent) Semantics I Fixed elements may still be moved by G≡. I Moves permutations with the non-stabilised elements all being non-fixed into G≡. I Variations of the same molecule are now partially ordered by generality/specificity. 17/25 ? is incomparable (by specificity) to ? is not isomorphic to ? can unify with ? (Partially) Unspecified Stereo Information ? ? is less (specific) than ? is not isomorphic to ? ? can unify with ? 18/25 (Partially) Unspecified Stereo Information ? ? is less (specific) than ? is not isomorphic to ? ? can unify with ? ? is incomparable (by specificity) to ? is not isomorphic to ? can unify with ? 18/25 ? ? ? ? chemically equivalent Partial Order of Graphs (Specificity) ? ? ? ? 19/25 ? ? chemically equivalent Partial Order of Graphs (Specificity) ? ? ? ? ? ? 19/25 chemically equivalent Partial Order of Graphs (Specificity) ? ? ? ? ? ? ? ? 19/25 chemically equivalent Partial Order of Graphs (Specificity) ? ? ? ? ? ? ? ? 19/25 chemically equivalent Partial Order of Graphs (Specificity) ? ? ? ? ? ? ? ? 19/25 Partial Order of Graphs (Specificity) ? ? ? ? ? ? ? ? chemically equivalent 19/25 Substructures with Stereo Information Modelling status I In progress.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages44 Page
-
File Size-