Linking Chemistry to Bioactivity

George Papadatos, PhD
Senior Technical Officer, ChEMBL group
[email protected]
Information Sources in Biotechnology, 24/07/2012

Outline
• Chemistry and bioactivity resources at the EBI
• Small molecule databases
  • ChEBI database
  • ChEMBL database
• Integrated enzyme/protein search
  • Enzyme Portal

ChEBI Database

What is ChEBI?
• Chemical Entities of Biological Interest
• Freely available
• Focused on small chemical entities
• Illustrated dictionary of chemical nomenclature
• High quality, manually annotated
• Provides ontologies
  • Structural and functional
• http://www.ebi.ac.uk/chebi/

ChEBI data overview (example entry: caffeine)
• Nomenclature: caffeine; 1,3,7-trimethylxanthine; methyltheobromine
• Ontology: metabolite; CNS stimulant; trimethylxanthines
• Chemical data: Formula: C8H10N4O2; Charge: 0; Mass: 194.19
• Database Xrefs: MSDchem: CFF; KEGG DRUG: D00528
• Chemical Informatics:
  • InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3
  • SMILES: CN1C(=O)N(C)c2ncn(C)c2C1=O
• Visualisation

[Figure: ChEBI home page]

[Figure: ChEBI entry view]

[Figure: Automatic cross-references]

The ChEBI ontology
Organised into two main sub-ontologies
• Molecular structure ontology
• Role ontology
[Figure: (R)-adrenaline]

[Figure: Molecular structure ontology]

[Figure: Role ontology]

[Figure: Browsing the ChEBI ontologies]

Further ChEBI resources
• Help & contact
  • [email protected]
• SourceForge
  • https://sourceforge.net/projects/chebi/
• User Manual
  • http://www.ebi.ac.uk/chebi/userManualForward.do
• Tutorial
  • https://www.ebi.ac.uk/chebi/tutorialForward.do
• RSS Feed

The ChEMBL Database

What is ChEMBL?
• Open access database for drug discovery
• Freely available (searchable and downloadable)
• Contents:
  • Bioactivity data manually extracted from the primary medicinal chemistry literature
  • Deposited data from neglected disease screening (e.g. malaria)
  • Subset of data from PubChem
• Bioactivity data is associated with a biological target and a chemical structure
• Compounds are stored in a structure-searchable format
• Updated regularly with new data

What is in ChEMBL?
[Figure: panels labelled Compounds, Targets (>Thrombin, shown as a protein sequence), Assay, and Compound Bioactivities / SAR Data, with the example values Ki = 4.5 nM and APTT = 11 min]

Drug discovery process
[Figure: pipeline from Target Discovery through Lead Discovery, Lead Optimisation, Preclinical Development and Clinical trials (Phase I, Phase II, Phase III) to Launch, grouped as Discovery, Development and Use]
• Activities named on the figure: target identification; microarray profiling; target validation; assay development; biochemistry; clinical/animal disease models; high-throughput screening (HTS); fragment-based screening; focused libraries; screening collection; medicinal chemistry; structure-based drug design; selectivity screens; ADMET screens; cellular/animal disease models; pharmacokinetics; toxicology; in vivo safety pharmacology; formulation; dose prediction
• Clinical-stage labels: safety and tolerability; PK; efficacy; indication discovery and expansion
• ChEMBL database coverage:
  • Medicinal chemistry SAR: >1,000,000 distinct compounds, ~25,000 distinct lead series
  • Clinical candidates: ~12,000 candidates
  • Drugs: ~1,400 drugs

What is in ChEMBL? (ChEMBL13)
• Compounds: 1,143,682
• Activities: 6,933,068
• Publications: 44,682
• Targets: 8,845
  • 5,674 proteins
  • 1,475 organisms
  • 1,431 cell lines

ChEMBL targets (target type: example)
• Protein: PDE5
• Protein complex: Nicotinic acetylcholine receptor
• Protein family: Muscarinic receptors
• Nucleic Acid: DNA
• Cell Line: HEK293 cells
• Tissue: Nerve
• Subcellular fraction: Mitochondria
• Organism: Drosophila

ChEMBL assays
• ChEMBL contains >6.9 million data points relating compounds to targets or effects
• Assays can be classified as:
  • Binding measurements (40%), e.g. IC50, Ki
  • Functional assay endpoints (51%), e.g. vasodilation, growth inhibition
  • ADMET data (9%), e.g. LD50, half-life

ChEMBL compounds
• Chemical structures extracted as .mol files
• Including stereochemistry, if known
• No separation of tautomers
• Other rules for standardising NO2 groups, HCl salts
• Based on FDA Substance Registration System User's Guide
• Uniqueness is ensured by standard InChI identifiers
• Both salts and parent molecules are kept
• Bioactivity is linked to salt form

Marketed drugs
[Figure: Marketed drugs. Select set of interest; Export to Excel or Export SDF]

How to access ChEMBL?
1. Web interface
  • Intuitive and secure
  • Compound, assay, target search
2. SQL dumps and flat files
  • Oracle, MySQL, Postgresql* dumps and .sd file
3. RESTful web services
  • Exact, substructure & similarity search
    • https://www.ebi.ac.uk/chemblws/compounds/substructure/CC(=O)Oc1ccccc1C(O)=O
  • Bioactivities for compound, assay and target id
    • https://www.ebi.ac.uk/chemblws/compounds/CHEMBL25/bioactivities
  • https://www.ebi.ac.uk/chembldb/index.php/ws

[Figure: Other ChEMBL tools]

What do people do with ChEMBL?
• Chemical space visualisation
• (Q)SAR analysis
• Data modeling, activity cliffs, FW, MMP analysis
• Bioisosteric replacement mining
• Target selection
• (off-)target prediction and ADR analysis
• Polypharmacology networks
• Neglected tropical disease research

ChEMBL resources
• ChEMBL blog: http://chembl.blogspot.com
• If you would like help: [email protected]
• For ChEMBL news and data releases: http://listserver.ebi.ac.uk/mailman/listinfo/chembl-announce

The ChEBI and ChEMBL databases
ChEBI
• Dictionary of molecular entities, focused on small molecules
• Incorporates an ontological classification
• Uses nomenclature, symbolism and terminology endorsed by international scientific bodies, such as IUPAC
ChEMBL
• Database of bioactive, drug-like small molecules
• Contains 2D structures, calculated properties and abstracted bioactivities
• Curates structures from published primary literature

The ChEBI and ChEMBL databases
• All ChEMBL compounds have been submitted into ChEBI
• Searching on either database will give you the same results
• ChEBI and ChEMBL compounds link out to each other via hyperlinks
• Both databases encourage user suggestions and comments on quality, errors and new features
• ChEBI is focused on the ontology of the compounds
• ChEMBL is focused on the associated bioactivity data
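
The two RESTful URL patterns shown under "How to access ChEMBL?" can be assembled programmatically. The following is a minimal Python sketch, not an official client: the function names are hypothetical, the base URL and paths are copied from the slide (these 2012 endpoints may no longer be served), and percent-encoding the SMILES string is an assumption made here for URL safety, since the slide shows the query unencoded. The sketch only builds the URLs and makes no network request.

```python
import re
from urllib.parse import quote

# Base URL exactly as printed on the slide (ChEMBL web services, 2012).
CHEMBLWS_BASE = "https://www.ebi.ac.uk/chemblws"


def substructure_url(smiles: str) -> str:
    """Build a substructure-search URL for a SMILES query (hypothetical helper)."""
    if not smiles:
        raise ValueError("SMILES string must not be empty")
    # Assumption: encode every reserved character so that symbols such as
    # "(", ")", "=" and "#" cannot be misread as URL syntax.
    return f"{CHEMBLWS_BASE}/compounds/substructure/{quote(smiles, safe='')}"


def bioactivities_url(chembl_id: str) -> str:
    """Build the bioactivities URL for a compound id such as CHEMBL25 (hypothetical helper)."""
    # Ids on the slide have the form "CHEMBL" followed by digits.
    if not re.fullmatch(r"CHEMBL\d+", chembl_id):
        raise ValueError(f"not a ChEMBL id: {chembl_id!r}")
    return f"{CHEMBLWS_BASE}/compounds/{chembl_id}/bioactivities"


if __name__ == "__main__":
    # The aspirin SMILES and the compound id are the examples from the slide.
    print(substructure_url("CC(=O)Oc1ccccc1C(O)=O"))
    print(bioactivities_url("CHEMBL25"))
```

Keeping URL construction separate from fetching makes the query strings easy to inspect before pointing them at whichever service version is currently available.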
The Enzyme Portal
• A one-stop shop which integrates
  • Protein information (UniProt)
  • 3D structures (PDBe)
  • Protein-catalyzed reactions (Rhea)
  • Biochemical pathways (Reactome)
  • Enzyme nomenclature (IntEnz)
  • Small molecule chemistry (ChEBI and ChEMBL)
  • Cofactors and reaction mechanisms (CoFactor and MACiE)
• https://www.ebi.ac.uk/enzymeportal/

[Figure: Searching the portal for enzymes]

[Figure: Result tabs]

Acknowledgements
• ChEMBL group
  • John Overington
  • Anne Hersey
  • Kazuyoshi Ikeda
• ChEBI group & Enzyme Portal
  • Christoph Steinbeck
  • Paula de Matos
• EBI
  • Dominic Clark
  • Jennifer McDowall
• All of you for listening!

Back-up slides

Data statistics
• Focused towards compounds with drug-like properties by extraction from medicinal chemistry journals
• Includes small molecules (~92%) and peptides (~7%)
• Abstracted from 43,418 papers across 34 journals
• 1,222,969 compound records
• 1,077,189 distinct compound structures
• 5,654,847 activities
  • Binding, functional and ADMET
• 8,703 targets, incl. 5,420 protein targets and 2,442 human targets
• Deposition of PubChem Substances and Bioassay assays
