
ChEMBL resources and KNIME
George Papadatos [email protected]

Outline
• ChEMBL data
• ChEMBL nodes
• Web services v2.0
• UniChem
• Cheminformatics utilities
• myChEMBL
• SureChEMBL and Open PHACTS

ChEMBL: Data for drug discovery
1. Scientific facts
2. Organization, integration, curation and standardization of pharmacology data
3. Insight, tools and resources for translational drug discovery
[Figure: a target (the thrombin protein sequence) and a compound, linked through assays/targets to bioactivity data such as Ki = 4.5 nM and APTT = 11 min.]

KNIME at the EBI
• Access ChEBI and ChEMBL databases via KNIME nodes
• Trusted community nodes
• Algorithm development
• Document classification
• Share example workflows and use cases
• Provide KNIME training to scientists and researchers
• Wellcome Trust drug discovery courses, EMBL courses
• CDK community nodes development
http://tech.knime.org/book/embl-ebi-nodes

ChEMBL KNIME nodes

Example: All bioactivities for hERG
• Output: all bioactivities for hERG, with activity value, assay description, compound and reference

Example: Compound searching in ChEMBL
• Input: query
• Output: list of nearest neighbours (NNs)

Example: Polypharmacology profile
[Workflow figure; stage labels: Query compounds; Find NNs; Retrieve bioactivities; Filter, summarise & pivot]

Web services v2.0
• Many more entities → granularity
• Pagination, filtering, ordering

UniChem integration

EMBL-EBI chemistry resources
RDF and REST API interfaces
• Atlas: ligand-induced transcript response (750)
• PDBe: ligand structures from structurally defined protein complexes (15K)
• ChEBI: nomenclature of primary and secondary metabolites; chemical ontology (24K)
• ChEMBL: bioactivity data from literature and depositions (1.5M)
• SureChEMBL: chemical structures from patent literature (~17M)
• 3rd party data: ZINC, PubChem, ThomsonPharma, DOTF, IUPHAR, DrugBank, KEGG, NIH NCC, eMolecules, FDA SRS, PharmGKB, Selleck, …. (~70M)
UniChem: InChI-based chemical resolver (full + relaxed 'lenses'), >90M
REST API interface: https://www.ebi.ac.uk/unichem/

Novelty checking with UniChem

Cheminformatics utilities (aka 'Beaker')
• Chemical format conversions
• Dynamic image generation
• Image processing (via OSRA)
• Descriptors and property calculations
• Chemical modifications and standardization
https://www.ebi.ac.uk/chembl/api/utils/docs

Example: Image to Structure
• Input: image URL

myChEMBL integration

Accessing local data with myChEMBL

Using KNIME to connect to myChEMBL

    SELECT mr.*, md.chembl_id, cp.full_mwt, cp.alogp
    FROM mols_rdkit mr, molecule_dictionary md, compound_properties cp
    WHERE mr.m @> '$${SMolecule}$$'::qmol
      AND mr.molregno = md.molregno
      AND md.molregno = cp.molregno;

SureChEMBL and Open PHACTS
• Components: SureChEMBL, SciBite Termite, Open PHACTS API
https://dev.openphacts.org/docs/develop
https://github.com/openphacts/OPS-Knime/

Patent example
http://rdf.ebi.ac.uk/resource/surechembl/patent/US-8877786-B2
• US-8877786-B2: Substituted carbamoylmethylamino acetic acid derivatives as novel NEP inhibitors
[Figure panels: MCS scaffold; Most relevant targets and diseases]

Molecule example
http://rdf.ebi.ac.uk/resource/surechembl/molecule/SCHEMBL371804
• Foretinib, a kinase inhibitor in clinical phase II
• Found in 89 EP, WO and US patents
[Figure panels: Patent publication date histogram; Most relevant diseases; Most relevant targets]

Summary
• KNIME: democratizes access to data and tools
• Access public domain structure and bioactivity data and services with KNIME
• ChEMBL KNIME Nodes
• UniChem
• Cheminformatics services
• myChEMBL
• SureChEMBL

Publications

Acknowledgements
• Francis Atkinson
• Thorsten Meinl
• Louisa Bellis
• KNIME
• Jon Chambers
• KNIME community
• Michał Nowotka
• Anne Hersey
• Stefan Beisken
• Edmund Duesbury
• Daniela Digles

All workflow examples are available on request.
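
The "Web services v2.0" slide lists pagination, filtering and ordering without showing a request. The sketch below illustrates how those three features might be combined outside KNIME. It is not taken from the slides: the base URL, the `limit`/`offset`/`order_by` parameters, the `page_meta.next` field and the target ID CHEMBL240 for hERG are assumptions based on the public ChEMBL web services and should be checked against the current API documentation. The helper names `build_url`, `fetch_json` and `iter_records` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the ChEMBL data web services (not given on the slides).
BASE = "https://www.ebi.ac.uk/chembl/api/data"


def build_url(resource, filters=None, order_by=None, limit=20, offset=0):
    """Build the URL for one page of a resource such as 'activity'."""
    params = dict(filters or {})      # filtering: field=value pairs
    if order_by:
        params["order_by"] = order_by  # ordering: '-' prefix assumed to mean descending
    params["limit"] = limit            # pagination: page size
    params["offset"] = offset          # pagination: index of the first record
    return f"{BASE}/{resource}.json?{urlencode(params)}"


def fetch_json(url):
    """Download one page and decode it as JSON."""
    with urlopen(url) as resp:
        return json.load(resp)


def iter_records(resource, list_key, filters=None, order_by=None,
                 limit=20, fetch=fetch_json):
    """Yield records page by page until the service reports no next page."""
    offset = 0
    while True:
        page = fetch(build_url(resource, filters, order_by, limit, offset))
        records = page.get(list_key, [])
        yield from records
        # Assumption: each page carries a 'page_meta' object whose 'next'
        # entry is empty on the last page.
        if not records or not page.get("page_meta", {}).get("next"):
            break
        offset += limit


# Example call (needs network access; CHEMBL240 is assumed to be hERG):
# for act in iter_records("activity", "activities",
#                         filters={"target_chembl_id": "CHEMBL240"},
#                         order_by="-pchembl_value", limit=100):
#     print(act.get("molecule_chembl_id"), act.get("standard_value"))
```

Passing `fetch` as a parameter keeps the paging logic testable without network access; in a KNIME workflow the same loop would typically be replaced by the ChEMBL nodes or a REST node.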