How IJC Is Adding Value to a Molecular Design Business

James Mills, Sandexis LLP
ChemAxon TechTalk, Stevenage, Nov 2012

Overview
● Introduction to Sandexis
● Sandexis and IJC use cases
  – Data visualisation
  – Post-processing of virtual screening
  – Monomer selection for library design
  – PDB ligand database
● Future directions

Sandexis LLP
Partnership of experienced PhD medicinal and computational chemists.
Provide a variety of flexible and bespoke medicinal and computational chemistry services to biotech, pharma, CRO and not-for-profit drug discovery sectors:
● Medicinal and computational chemistry design support
  – from hit generation through to candidate identification
● Management of integrated drug discovery projects
  – including outsourced synthetic chemistry
● Generation and optimisation of intellectual property
● Consultancy for due diligence, project reviews, proposal advice, problem solving, expert witness, grant writing and literature searches
● Bespoke computational chemistry solutions
● Design and delivery of medicinal and computational chemistry training

Sandexis and IJC
● Novice users (6 months)
  – Only explored and made use of basic functionality
  – But have still garnered added value
● Translate Comp Chem geekery
  – into means for Med Chem decision-making
● Set up IJC databases for each piece of work
  – Access via password protected files on cloud
● Variety of use cases
  – SAR analysis
  – Library Design
  – Virtual Screening
  – PDB ligand database

SAR analysis
Analysis of lipophilic ligand efficiency. Scatter plot sized by MW, coloured by TPSA.
Simple use case: calculated properties of data to afford pEC50 values and LipE (pEC50 – ClogP).
Histogram analysis by LipE allows simple selection and analysis of efficient ligands, all within IJC.
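The LipE calculation on the SAR analysis slide can be sketched outside IJC as follows. This is a minimal illustration, not the IJC calculated-field setup used in the talk: the compound records, the nM units for EC50 and the LipE cutoff of 5 are all assumptions made for the example.

```python
import math


def pec50_from_ec50_nm(ec50_nm: float) -> float:
    """pEC50 = -log10(EC50 in mol/L); EC50 is supplied in nM here (assumed unit)."""
    return 9.0 - math.log10(ec50_nm)


def lipe(pec50: float, clogp: float) -> float:
    """Lipophilic ligand efficiency as defined on the slide: pEC50 - ClogP."""
    return pec50 - clogp


# Hypothetical records for illustration: (compound id, EC50 in nM, ClogP).
compounds = [
    ("cpd-1", 10.0, 2.5),
    ("cpd-2", 1000.0, 4.5),
    ("cpd-3", 100.0, 1.5),
]

LIPE_THRESHOLD = 5.0  # assumed cutoff, not taken from the talk

# Keep the compounds whose LipE meets the threshold, mirroring the
# "select efficient ligands from the histogram" step.
efficient = [
    cid
    for cid, ec50_nm, clogp in compounds
    if lipe(pec50_from_ec50_nm(ec50_nm), clogp) >= LIPE_THRESHOLD
]
print(efficient)
```

In IJC itself the same arithmetic would live in calculated fields, with the selection made interactively from the histogram rather than by a fixed cutoff.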
Diverse acid selection
● Require 150 acids to cover chemical space (from 4500 available)
  – SAR exploration from singleton initial hit
● Want to bias clustering to consider environment of acid
  – e.g. benzoic and phenoxyacetic acids in different clusters
● Use atom pairs, sequences and circular fingerprint bits
  – weighted by proximity to acid functional group
● Single-linkage clustering with multiple Tanimoto cutoffs
  – Cluster representative = cpd with most neighbours
● Start with clustering at Tanimoto > 0.7
  – If cluster too big, iteratively move up a Tanimoto level
  – If cluster too small, iteratively move down a Tanimoto level

Acid selection workflow
● Calculate properties of acids (Mwt, clogP)
● Representatives ranked by size of cluster they represent
  – Select clusters by quality of rep
  – Pick alternative rep as lowest Mwt cpd
● Boring bit: wade through singletons

Acid selection: learnings
● Novel clustering method, more intuitive for a chemist!
  – Would have been more intuitive if defined cluster rep as lowest Mwt
    ● Simplest, unadorned structure
● More efficient to identify clusters of interest
  – Then select compounds
● Not easy to avoid singletons
  – About 1 in 8 (600 / 4800) for this dataset
● Will get more efficient as IJC skills improve
  – More use of working lists and structure matrices
● Happy with output
  – Human input obviates post-processing filtering to remove uglies

Post-processing virtual screening
● Docked ligand database with multiple scoring functions
  – Obtained many thousands of putative virtual hits
  – Reduce to 1000 for screening
● Cluster all hits
● Calculate all interactions between ligand and protein
  – Remove examples with anti-H-bonds
● For each scoring function, pick best cpd from each cluster
  – Subsequent results will define best scoring function

Virtual Screening Analysis
Form views give a simple and intuitive way of visualising each compound/cluster vs physicochemical properties.
Easy to customise form views tailored to each individual dataset.

Virtual Screening Analysis
[Form view screenshot. Labels as extracted: Scoring methodology; Rank based on cluster number; Rank based on Prioritisation based docking; Yes/No on structure]

PDB ligand database
Workflow steps (in order of appearance on the slide):
● Extract ligands
● Ensure ligand/protein criteria met
● Generate cavity
● Generate and validate SMILES
● Canonicalise protein name
Field labels shown: Cav vol; # ligand heavy atoms
Example interaction listing:
HYDROGEN BOND: parameters 2.09 167.95 160.57: GLY 48 H and 2
HYDROGEN BOND: parameters 1.95 170.31 121.79: ASP 29 OD1 and 14
HYDROGEN BOND: parameters 1.81 175.82 173.50: ASP 29 H and 10
HYDROGEN BOND: parameters 1.94 154.96 157.98: GLY 48 O and 30
HYDROGEN BOND: parameters 1.99 161.84 105.66: ASP 25 OD1 and 46
HYDROGEN BOND: parameters 2.07 162.05 167.57: ASP 229 H and 68
HYDROGEN BOND: parameters 2.20 148.53 130.84: GLY 248 O and 87
HYDROGEN BOND: parameters 1.72 142.50 148.80: GLY 248 H and 82
HYDROGEN BOND: parameters 3.30 119.64 96.43: GLY 248 O and 82
HYDROGEN BOND: parameters 2.28 126.46 140.54: ASP 230 OD1 and 86
HYDROGEN BOND: parameters 2.76 90.63 158.59: ILE 247 HG2 and 86
HYDROGEN BOND: parameters 2.52 137.05 128.08: GLY 48 O and 15
HYDROGEN BOND: parameters 2.98 132.03 113.12: GLY 27 O and 31
HYDROGEN BOND: parameters 2.79 136.12 156.38: GLY 27 O and 54
HYDROGEN BOND: parameters 2.37 150.52 111.75: ASP 225 OD1 and 61
HYDROGEN BOND: parameters 2.21 145.15 159.18: GLY 248 O and 71
HYDROGEN BOND: parameters 2.59 156.02 104.23: ASP 229 OD1 and 88
EDGE-TO-FACE: parameters 4.74 91.84 164.81: ASP 229 and 67
EDGE-TO-FACE: parameters 4.77 153.48 93.59: ASP 229 and 84
DONOR-PI: parameters 2.80 166.77 104.79: ASP 29 CG and 14
DONOR-PI: parameters 2.73 154.21 122.39: GLY 27 C and 50
DONOR-PI: parameters 4.05 113.68 168.21: GLY 27 C and 46
DONOR-PI: parameters 3.65 138.55 97.06: GLY 248 C and 82
DONOR-PI: parameters 3.27 92.12 159.30: ASP 230 OD1 and 84

Uses of PDB ligand database
● Identification of all ligands hitting target or family
● Substructure search to identify
  – conformational preferences in binding sites
  – interaction preferences
  – starting points for bioisostere generation

Isosteres from PDB mining
[Structure drawings omitted]
● IJC substructure search
● Overlay all CDK2 and similar sites
● Identify isosteres as groups occupying same space as PhOH

Getting more from IJC
● Make more use of functionality
  – Groovy script for jumping to next cluster in forms
    ● Equivalent to trellis from Spotfire
● Or request increased functionality
  – Visualisations with pie charts
  – Radio slider to simplify “queries”

The Sandexis Team
Karl Gibson PhD FRSC, Medicinal Chemist
Experienced leader of medicinal chemistry and multi-disciplinary project teams delivering candidates to the clinic. Scientific expertise in target validation approaches, HTS triage, hit finding and delivering optimal programs to find clinical candidates rapidly.
He has experience of ion channels, GPCRs, kinases, enzymes, PPIs and nuclear receptors across Pain, Anti-infectives, Genitourinary and CNS disease areas. Author of over 10 papers and book chapters; named inventor on 20 patent applications.

Gavin Whitlock PhD FRSC, Medicinal Chemist
Areas of expertise include designing molecules for oral, CNS, topical or inhaled routes of administration across multiple gene families (enzymes, GPCRs, transporters, kinases) and disease areas, including Anti-infectives, Anti-parasitics, Genitourinary, Respiratory and Tissue Repair. Author of over 25 papers and book chapters; named inventor on 14 patent applications.

James Mills PhD FRSC, Computational Chemist
Areas of expertise include writing and applying novel algorithms to support all stages of the drug discovery process; for example target analysis, HTS triage, lead and candidate molecule design, SAR analysis and visualisation, virtual screening, analysis of molecular interactions, ligand superposition and structure-based drug design. Author of over 15 papers; named inventor on 3 patent applications.

Medicinal Chemistry
● We can work with our clients to:
  – Assess and prioritise screening hits to generate quality hit-to-lead programs
    ● Efficient decision making for series progression or termination
  – Optimise lead series using a holistic approach to deliver preclinical candidates that meet required biological, ADME, Pharm Sci and drug-safety criteria
● Experience across multiple gene families and disease areas:
  – Enzymes, GPCRs, nuclear receptors, ion channels, transporters and protein-protein interactions
  – Allergy & Respiratory, Anti-infectives, CNS, Gastrointestinal, Genitourinary, Obesity, Oncology, Pain, Regenerative Medicine
● Experience of projects with a wide variety of requirements, e.g.:
  – Designing molecules to cross the blood-brain barrier or to be peripherally restricted whilst retaining drug-like properties
  – Identifying and designing kinase inhibitors with slow binding kinetics
  – Designing molecules that can be delivered by inhaled or topical routes

Computational Chemistry
● In-house proprietary algorithms to carry out:
  – Molecular superposition, Structure-Based Drug Design
  – SAR analysis and visualisation, cutting-edge HTS triage
  – Pharmacophore generation and searching
  – Bioisosteres
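
Returning to the "Diverse acid selection" slide: the clustering it outlines (single-linkage at a Tanimoto cutoff, representative = compound with most neighbours, move up a Tanimoto level when a cluster is too big) can be sketched roughly as below. This is not the Sandexis in-house algorithm. The fingerprints are plain sets of bit indices with no weighting by proximity to the acid group, the levels (0.7, 0.8, 0.9) and `max_size` are hypothetical, and the "move down a level for small clusters" step is left out.

```python
from itertools import combinations


def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity between two fingerprints held as sets of on-bits."""
    shared = len(a & b)
    union = len(a) + len(b) - shared
    return shared / union if union else 0.0


def single_linkage(fps: dict, ids: list, cutoff: float) -> list:
    """Single-linkage clusters: connected components of the graph that links
    every pair with Tanimoto > cutoff. Returns (representative, members)."""
    neighbours = {i: set() for i in ids}
    for i, j in combinations(ids, 2):
        if tanimoto(fps[i], fps[j]) > cutoff:
            neighbours[i].add(j)
            neighbours[j].add(i)
    clusters, seen = [], set()
    for start in ids:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:  # depth-first walk over one connected component
            node = stack.pop()
            members.append(node)
            for nb in neighbours[node] - seen:
                seen.add(nb)
                stack.append(nb)
        members.sort()
        # Representative = compound with most neighbours (ties: first id).
        rep = max(members, key=lambda m: len(neighbours[m]))
        clusters.append((rep, members))
    return clusters


def adaptive_clusters(fps: dict, levels=(0.7, 0.8, 0.9), max_size=3) -> list:
    """Re-cluster any oversized cluster at the next, stricter Tanimoto level."""
    def split(ids, level_idx):
        out = []
        for rep, members in single_linkage(fps, ids, levels[level_idx]):
            if len(members) > max_size and level_idx + 1 < len(levels):
                out.extend(split(members, level_idx + 1))
            else:
                out.append((rep, members))
        return out
    return split(sorted(fps), 0)


# Hypothetical fingerprints: a-b and b-c are similar (7/9), a-c is not (6/10).
example_fps = {
    "a": {1, 2, 3, 4, 5, 6, 7, 8},
    "b": {1, 2, 3, 4, 5, 6, 7, 9},
    "c": {1, 2, 3, 4, 5, 6, 9, 10},
    "d": {20, 21, 22},
}
print(adaptive_clusters(example_fps))
```

In the example, "a" and "c" end up in the same cluster only because both are linked to "b", which is the chaining behaviour characteristic of single linkage; "d" remains a singleton, the case the talk describes as hard to avoid.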
