In Silico Screening and Methodologies
Dominique Douguet
Ecole Thématique de Criblage (Thematic School on Screening), 2-4 October 2018, Carry-le-Rouet
Pharmacology & Neurosciences
Genomic platform, ion channels, small G proteins, vesicular transport, immunology
Institut de Pharmacologie Moléculaire et Cellulaire, UMR 7275 CNRS - Université Nice Sophia Antipolis, France
ChemInfoScreen, Chimiothèque Nationale, ChemBioScreen, ADME-Tox Evaluation
• HTS assay optimization
• Hit identification
• Structure-Activity Relationships (SAR)
• Pharmacokinetics
• Hit-to-Lead optimization
Programs to explore and cure living systems
ChemInfoScreen: Cheminformatics Platform
To date: 8 sites + coordinating site at UGCN
Strasbourg: LIT (Didier Rognan), CMC (Alexandre Varnek)
Paris: BFA (Pierre Tufféry), Institut Pasteur (Olivier Sperandio)
Orléans: ICOA (Pascal Bonnet)
Montpellier: UGCN (Philippe Jauffret), CBS (Gilles Labesse)
Marseille: CRCM (Xavier Morelli)
Nice: IPMC (Dominique Douguet)
Target Drug Discovery (TDD)
Pathology -> Identified target -> Protein sequence -> 3D structure -> Known ligands? (No / Yes) -> Identified hits -> Lead optimisation -> Clinical trials -> Approved drug
~10-14 years / ~1 billion $
Target Drug Discovery (TDD)
Pathology -> Identified target -> Protein sequence -> 3D structure -> Known ligands? (No / Yes) -> Identified hits -> Lead optimisation -> Clinical trials -> Approved drug (~10-14 years / ~1 billion $)
Experimental resources along the pipeline: ChemBioScreen, IR FRISBI, Chimiothèque Nationale, Medicinal Chemistry, ADME-Tox
ChemInfoScreen contributions:
• Virtual screenings: ligand-based, structure-based
• Libraries design: CN subset, approved drugs, commercial compounds
• Raw data analysis and properties prediction: PAINS/reactivity alerts, analogs in catalogs, SAR building, LogP, solubility, Kd, Kon/Koff
• Synthesis: prioritizing compounds, prioritizing reactants, scaffold hopping
• Bio-profiling and ADME-Tox prediction: metabolism (site, CYP450s…), off-targets
Ligand- and Structure-based Screenings
Ligand-based (from known ligands):
2D
• Graph/substructure
• Fingerprint (e.g., ECFP4): a bit is set if the feature is present
3D
• Pharmacophore (example with O, N and Ar features: d1 = 9-10 Å, d2 = 3-4 Å, d3 = 6-7 Å)
• Shape
Structure-based (from the 3D experimental structure/model of the target):
• Docking
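The fingerprint idea above (a bit is set if the feature is present) can be illustrated with a minimal sketch. It assumes fingerprints are already available as sets of on-bit indices; the bit indices, compound names and the choice of the Tanimoto coefficient as similarity measure are illustrative assumptions, not taken from the slides, and no real ECFP4 computation is performed here.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient between two fingerprints stored as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_library(query: set, library: dict) -> list:
    """Return (name, similarity) pairs sorted from most to least similar to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical on-bit indices for a known ligand and three library compounds.
query_fp = {1, 5, 9, 42}
library_fps = {
    "cpd_A": {1, 5, 9, 42},
    "cpd_B": {1, 5, 77},
    "cpd_C": {100, 200},
}

ranking = rank_library(query_fp, library_fps)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In a ligand-based screen, compounds at the top of such a ranking would be the first candidates to inspect; in practice the fingerprints would come from a cheminformatics toolkit.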
2D/3D QSAR model (requires a large dataset)
Chemical space
Chemical universe: 10^20-10^60 ‘druglike’ molecules
Barnaby Roper
« The chemist as astronaut: Searching for biologically useful space in the chemical universe », D. Triggle, Biochem. Pharmacol., 2009.
Weininger D., Encyclopedia of Computational Chemistry, Vol. 8, p. 1056; Bohacek R.S. et al., Med. Res. Rev., 1996; Ertl P., J. Chem. Inf. Comput. Sci., 2003; Walters W.P., J. Med. Chem., 2018.
‘Druglike’: C, N, O, S, P, H, Cl, Br, F, I and MW ≤ 500 (Dobson C.M., Nature, 2004)
Chemical Space & Screening
• 10^80 atoms in the Universe¹
• 10^60 ‘druglike’ molecules²
• 10^20 sand grains (picture: Dune of Pilat)
• 10^17 seconds: age of the Earth
• 10^8 isolated molecules
● CAS: ~100 × 10^6 (organics/inorganics)
● Dedicated to Pharmacology:
Commercial: 10^6 (screening libraries)
Naturals: 10^6 (theoretically), < 0.1 × 10^6 isolated (10%)³
Toxins: 20 × 10^6 (theoretically), ~0.2 × 10^6 in UniProtKB (1%)⁴
Drugs: < 2000 FDA approved small-molecule drug structures (MW ≤ 2000)
¹ Source: C. Magnan, Collège de France, http://www.lacosmo.com/dixpuissance80.html
² ‘Druglike’: C, N, O, S, P, H, Cl, Br, F, I and MW ≤ 500 (Dobson C.M., Nature, 2004)
³ Harvey A., Drug Discovery Today, 2000
⁴ Zhang Y., Dongwuxue Yanjiu, 2015
Chemical space
- What is the usable size of a chemical library?
- Experimental High Throughput Screening (HTS)
* A screening campaign may assay up to 500 000 compounds / week
a low cost estimate is ~0.40 $ / compound¹ (1 million compounds = 400 000 $); this includes the cost of chemical synthesis, high-throughput-screening disposables, capital costs and human resources
several side issues: molecule re-supply, solubility, chemical stability, presence of PAINS (false positives)… as well as the management of waste products!
* It is commonly accepted that a library size of ~250 000 compounds is suitable to optimize the likelihood of finding a hit²,³
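The cost and throughput figures above lend themselves to a quick back-of-the-envelope calculation. The sketch below only reuses the numbers quoted on the slide (0.40 $ per compound, up to 500 000 compounds per week); the function names are hypothetical and the estimate ignores the side issues listed above.

```python
COST_PER_COMPOUND_USD = 0.40    # low cost estimate quoted on the slide
MAX_COMPOUNDS_PER_WEEK = 500_000  # upper bound for one screening campaign


def hts_cost_usd(n_compounds: int) -> float:
    """Estimated screening cost for a library of n_compounds."""
    return n_compounds * COST_PER_COMPOUND_USD


def hts_weeks(n_compounds: int) -> float:
    """Minimum number of weeks needed at the maximum quoted throughput."""
    return n_compounds / MAX_COMPOUNDS_PER_WEEK


for size in (250_000, 1_000_000):
    print(f"{size:>9} compounds: {hts_cost_usd(size):>10.0f} $, "
          f"{hts_weeks(size):.1f} week(s)")
```

With these assumptions, the commonly accepted library size of ~250 000 compounds corresponds to roughly a quarter of the cost of a one-million-compound campaign.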
¹ Lipinski C. and Hopkins A., Nature, 2004, 432, 855-860. ² Hibert M. and Haiech J., médecine/sciences, 2000, 16, 1332-9. ³ Baell J., ACS Med Chem Lett, 2018.
Chemical space
- What is the usable size of a chemical library?
- Virtual Screening
* Building chemical structures
Example of the GDB-13¹ database (13 atoms [C, N, O, S, Cl], << mean drug size):
- Combinatorial enumeration of structures
- 3D building, minimizing and validating structures
Results (example structure: pyridoxine):
- 910 111 673 structures (13 atoms)
- 39 882 h of CPU time (= 1661 days of computation on 1 processor)
- ~0.16 s / molecule (540 000 molecules / day / processor)
* Evaluating properties and/or interactions (e.g., calculating the binding free energy ∆G of a ligand-protein complex)
- Using an empirical method (docking): ~20 s to 3 min / molecule, followed by visual inspection
- Using molecular dynamics (MD): hours to a few days of calculation / molecule
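The timings above can be cross-checked and extrapolated with a short calculation. This is a sketch only: the GDB-13 numbers and the docking time range come from the slide, while the library size of one million compounds and the single processor are illustrative assumptions.

```python
SECONDS_PER_DAY = 86_400

# Figures quoted for GDB-13 structure building.
GDB13_STRUCTURES = 910_111_673
GDB13_CPU_HOURS = 39_882

seconds_per_molecule = GDB13_CPU_HOURS * 3600 / GDB13_STRUCTURES
cpu_days = GDB13_CPU_HOURS / 24
molecules_per_day = SECONDS_PER_DAY / 0.16  # using the rounded 0.16 s / molecule


def cpu_days_needed(n_molecules: int, seconds_each: float, n_processors: int = 1) -> float:
    """Wall-clock days to process n_molecules at seconds_each per molecule."""
    return n_molecules * seconds_each / SECONDS_PER_DAY / n_processors


print(f"Building: {seconds_per_molecule:.3f} s/molecule, {cpu_days:.0f} CPU days")
print(f"Throughput: {molecules_per_day:.0f} molecules/day/processor")

# Hypothetical library of one million compounds docked on a single processor.
library_size = 1_000_000
fast = cpu_days_needed(library_size, 20)    # 20 s per molecule
slow = cpu_days_needed(library_size, 180)   # 3 min per molecule
print(f"Docking {library_size} compounds: {fast:.0f} to {slow:.0f} days on 1 processor")
```

The same helper shows why docking campaigns are usually distributed over many processors, and why molecular dynamics, at hours to days per molecule, is reserved for a handful of compounds.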
¹ Blum L.C., Reymond J.L., J. Am. Chem. Soc., 2009, 131(25), 8732-3.
Target Drug Discovery (TDD)
Pathology -> Identified target -> Protein sequence -> 3D structure -> Known ligands? (No / Yes) -> Identified hits -> Hit-to-Lead -> Lead optimisation -> Clinical trials -> Approved drug (~10-14 years / ~1 billion $)
A hit ~ a molecule with µM range of activity
Hit-to-Lead: MW [1-200], LogP [0.5-4]
Lead optimisation -> drug: MW < 500, LogP < 5, nbHD < 5, nbHA < 10
Identifying good, progressable hits (Teague et al., Angew. Chem. Int. Ed., 1999):
• Drug-like hits: > 0.1 µM, MW > 350, LogP > 3 (unfavored)
• Lead-like hits: > 0.1 µM, MW < 350, LogP < 3 (polar)
• High affinity hits: << 0.1 µM, MW >> 350, LogP < 3
LE > 0.35 ; LLE > 5 ; PFI < 7
LE = pX50 × 1.37 / #heavy atoms (kcal/mol/atom)
LipE = LLE = pX50 - cLogP
PFI = Chrom LogD(pH 7.4) + #Ar rings
iPFI = Chrom LogP + #Ar rings
Leeson and Springthorpe, Nat. Rev. Drug Discov., 2007. Leeson and Young, ACS Med. Chem. Lett., 2015. Young and Leeson, J. Med. Chem., 2018.
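The efficiency metrics defined on this slide (LE, LLE, PFI) are simple enough to compute directly. The sketch below assumes that pX50, cLogP, the chromatographic LogD at pH 7.4 and the ring and atom counts have already been measured or calculated elsewhere; the function names and the example values are hypothetical.

```python
def ligand_efficiency(px50: float, n_heavy_atoms: int) -> float:
    """LE = pX50 * 1.37 / number of heavy atoms (kcal/mol/atom)."""
    return px50 * 1.37 / n_heavy_atoms


def lipophilic_ligand_efficiency(px50: float, clogp: float) -> float:
    """LLE (LipE) = pX50 - cLogP."""
    return px50 - clogp


def property_forecast_index(chrom_logd_ph74: float, n_aromatic_rings: int) -> float:
    """PFI = chromatographic LogD at pH 7.4 + number of aromatic rings."""
    return chrom_logd_ph74 + n_aromatic_rings


def meets_thresholds(le: float, lle: float, pfi: float) -> bool:
    """Thresholds quoted on the slide: LE > 0.35, LLE > 5, PFI < 7."""
    return le > 0.35 and lle > 5 and pfi < 7


# Hypothetical hit: pX50 = 7.5, 25 heavy atoms, cLogP = 2.0, Chrom LogD = 2.5, 2 aromatic rings.
le = ligand_efficiency(7.5, 25)
lle = lipophilic_ligand_efficiency(7.5, 2.0)
pfi = property_forecast_index(2.5, 2)
print(f"LE = {le:.3f}, LLE = {lle:.1f}, PFI = {pfi:.1f}, "
      f"progressable: {meets_thresholds(le, lle, pfi)}")
```

Treating the three thresholds as a joint pass/fail test is a simplification made for this example; in practice they are guidelines weighed together with the other criteria.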
e-Drug3D
Pharmacokinetic data set
Searches by: names, substructures, keywords, target name…
http://chemoinfo.ipmc.cnrs.fr
Drug-like Fragments and Frameworks
Rings, fused rings and acyclics (linkers and substituents)
X : anchoring point for substituents
(Bemis & Murcko definition)
Most populated frameworks in approved drugs
[Figure: total number of drugs containing the framework, by framework type; frameworks are labelled with years from 1939 to 1960]
• 47% of drug structures are represented by only 24 frameworks
• Most frameworks are unique (represented by 1 drug structure only)
[Figure: total number of new frameworks per decade]
• After the 1980s, a larger number of new frameworks appeared, but the most populated frameworks are the oldest ones
Pihan et al., Bioinformatics, 2012; Douguet D., ACS Med Chem Lett, 2018. http://chemoinfo.ipmc.cnrs.fr
Drug Frameworks
• Butabarbital (1939) (GABA receptor; 20S proteasome (Ixazomib))
• Meperidine (1942) (Mu type opioid receptor; Neprilysin (Sacubitril))
• Mephenytoin (1946) (Nav ion channel; HIV reverse transcriptase (Zidovudine))
Drugs per framework, in slide order: 227, 49, 41
• Theophylline (1940) (Phosphodiesterase; Calcium-activated K+ (SK) channel (Riluzole))
• Histamine (1939) (H1 receptor; Angiotensin-converting enzyme (Captopril))
• Desoxycorticosterone (1939) (Aldosterone synthase; Glucocorticoid receptor (Halobetasol))
Drugs per framework, in slide order: 48, 38, 91
• Chloroquine (1949) (Lactate dehydrogenase; Beta adrenergic receptor (Propranolol))
• Diphenhydramine (1946) (H1 receptor; Prostaglandin H synthase (Amfenac/Nepafenac))
• Sulfapyridine (1939) (Unknown target; Estrogen receptor (Diethylstilbestrol))
Drugs per framework, in slide order: 69, 47, 37
e-Drug3D: release of July 2016 (1557 princeps / 1822 different structures) - 1189 different scaffolds (out of 1697) - 512 different frameworks. Source: http://chemoinfo.ipmc.cnrs.fr ; Pihan et al., Bioinformatics, 2012.
Drug Frameworks
• Promethazine (1951) (H1 receptor)
• Thiamine (1953) (Vitamin B1; Alpha adrenergic receptor (Clonidine))
• Imipramine (1959) (Noradrenaline transporter)
Drugs per framework, in slide order: 24, 14, 12
• Diazepam (1963) (GABA receptor)
• Sulfathiazole (1945) (Unknown target; Dihydropteroate synthase (Sulfamethizole); NKCC1, CFTR (Furosemide))
• Cephalothin (1974) (Penicillin-Binding Protein)
Drugs per framework, in slide order: 18, 13, 11
• Vidarabine (1976) (Adenosine deaminase)
• Folic acid (1946) (Vitamin B9; DHFR (Methotrexate))
• Prochlorperazine (1956) (D2 Dopamine receptor)
Drugs per framework, in slide order: 15, 12, 11
Drug Frameworks
• Clomiphene (1967) (Estrogen receptor)
• Tropicamide (1960) (Muscarinic acetylcholine receptor; Noradrenaline transporter (Benzphetamine))
Drugs per framework, in slide order: 10, 11
• Benzquinamide (1974) (P-glycoprotein receptor; Cannabinoid receptor (Dronabinol))
• Phenoxybenzamine (1953) (Alpha adrenergic receptor; Platelet glycoprotein (Tirofiban))
Drugs per framework, in slide order: 10, 10
• Triamcinolone acetonide (1960) (Glucocorticoid receptor): 10 drugs
∑ (represented drugs) = 828/1822 = 45.4% of drug structures are represented by 23 frameworks
The simplest frameworks appeared first and are the most populated
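The coverage figure quoted above (828/1822 = 45.4%) can be reproduced from the per-framework drug counts given on the three preceding Drug Frameworks slides. The sketch below simply sums those counts, taken in slide order; the variable names are hypothetical.

```python
# Number of drugs per framework, as listed on the three "Drug Frameworks" slides.
drugs_per_framework = [
    227, 49, 41, 48, 38, 91, 69, 47, 37,   # first slide
    24, 14, 12, 18, 13, 11, 15, 12, 11,    # second slide
    10, 11, 10, 10, 10,                    # third slide
]
TOTAL_STRUCTURES = 1822  # e-Drug3D, release of July 2016

n_frameworks = len(drugs_per_framework)
represented = sum(drugs_per_framework)
coverage_percent = 100 * represented / TOTAL_STRUCTURES

print(f"{n_frameworks} frameworks cover {represented}/{TOTAL_STRUCTURES} "
      f"structures ({coverage_percent:.1f}%)")
```

Against the 512 different frameworks in this release, these 23 frameworks account for close to half of all drug structures.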
Drug Frameworks
Discontinued (each framework is represented by 1 or 2 drugs)
• Candicidin (): large polyene structure (membrane)
• Cycrimine () (Muscarinic acetylcholine receptor M1)
• Ceruletide (): large structure (Cholecystokinin type A)
• Protokylol () (Beta 1/Beta 2 adrenergic receptor)
• Prazepam () (GABA receptor)
• Hexafluorenium () (Cholinesterase)
• Deslanoside () (Sodium/potassium ATPase)
• Methixene () (Muscarinic acetylcholine receptor)
• Viomycin (): large ring (70S ribosome)
• Trilostane () (Estrogen receptor)
• Clidinium () (Muscarinic acetylcholine receptor)
• Pentolinium () (antihypertensive)
• Dezocine () (Kappa/Mu opioid receptor)
• Beta carotene () (Beta carotene monooxygenase)
• Pyrvinium () (anthelmintic)
• Clidinium () (Muscarinic acetylcholine receptor)
• Hetacillin () (Penicillin-Binding Proteins 1A/1B)
• Saralasin () (Angiotensin II receptor)
• Amdinocillin () (Penicillin-Binding Proteins 2B)
• Gentian violet () (NADPH oxidase)
• Quinestrol () (Estrogen receptor)
• Meclocycline () & Methacycline () (Ribosome)
Drug Frameworks
Discontinued
• Frame 13: Dicumarol (1944) (Xanthine oxidase)
• Frame 14: Metocurine/Tubocurarine (1945) (5-HT3 receptor)
• Frame 41: Hydroxystilbamidine (1953), antiparasitic (Unknown target); very close to frame 1 (sulfapyridine)
• Frame 47: Ambenonium (1956) (Cholinesterase)
• Frame 53: Rescinnamine (1956) (Angiotensin-converting enzyme)
• Frame 56: Diphenidol (1967)/Oxyphencyclimine (Muscarinic acetylcholine receptor)
• Frame 64: Biperiden (1959) (Muscarinic acetylcholine receptor)
• Frame 66: Benzthiazide (1960) (Antihypertensive)
• Frame 74: Chlorprothixene (1967) (D2 Dopamine receptor)
• Frame 79: Cyclothiazide (1982) (Glutamate receptor 2)
• Frame 90: Testolactone (1969) (CYP450 19-aromatase)
• Frame 98: Pentagastrin (1974) (Cholecystokinin type B receptor)
• Frame 100: Mazindol (1973) (Noradrenaline & Dopamine transporter)
• Frame 119: Guanadrel (1982) (Antihypertensive)
• Frame 131: Antazoline (1990) (Cav channel)
• Frame 136: Bepridil (1990) (Ca channel)
• Frame 161: Doxacurium (1991) (Muscarinic acetylcholine receptor M1)
• Frame 173: Trovafloxacin/Alatrofloxacin (1997) (DNA gyrase)
• Frame 174: Cisapride (1993) (5-HT4 receptor)
• Frame 175: Levocabastine (1993) (Histamine H1 receptor)
• Frame 213: Troglitazone (1997) (PPAR gamma)
• Frame 239: Pemirolast (1999) (antiinflammatory)
• Frame 245: Telithromycin (2004), large ring (50S ribosome)
Drug Frameworks
Discontinued
• Frame 297: Cyclacillin / Methicillin (1979) (Penicillin-Binding Proteins 1A/1B)
• Frame 303: Troleandomycin (1969), large ring (ribosome)
• Frame 304: Novobiocin (1964) (DNA gyrase)
• Frame 306: Cephapirin (1974) (Penicillin-Binding Proteins 1A/1B)
• Frame 309: Guanethidine (1960) (Nitric oxide synthase)
• Frame 314: Cefoperazone/Cefpiramide (1982) (Penicillin-Binding Proteins 1A/1B)
• Frame 315: Azlocillin/Mezlocillin (1981) (Penicillin-Binding Proteins 1A/1B)
• Frame 325: Cefmetazole (1989) (Penicillin-Binding Proteins 1A/1B)
• Frame 327: Dirithromycin (1995), large ring (ribosome)
• Frame 343: Ticarcillin (1976) (Penicillin-Binding Proteins 1A/1B)
• Frame 346: Bitolterol (1984) (Beta 2 adrenergic receptor)
• Frame 377: Hydrocortisone cypionate (1955) (Glucocorticoid receptor)
• Frame 380: Nandrolone phenpropionate (1959) (Androgen receptor)
• Frame 381: Sulfaphenazole (1974) (CYP450)
• Frame 386: Protirelin (1976) (Hormone analog)
• Frame 392: Nalmefene (1995) (Opioid receptor)
• Frame 396: Plicamycin (1970) (Unknown target)
• Frame 397: Carbenicillin indanyl (1972) (Penicillin-Binding Proteins)
• Frame 417: Telaprevir (2011) (NS3/4A protease)
Privileged Structures/Fragments
Mean number of:
- legos in drug structures
- legos in frameworks (rings + linkers)
- substituents in drug structures
Pihan et al., Bioinformatics, 2012; Douguet D., ACS Med Chem Lett, 2018. http://chemoinfo.ipmc.cnrs.fr
Physico-Chemical Properties
Statistics on approved drugs*: evolution of drug properties
[Figure: molecular weight vs. year; polar surface area vs. year]
[Figure: LogP vs. year; Fsp3** vs. year]
* e-Drug3D: release of March 2015 (1746 different structures)
** Fsp3 = number of C(sp3) / number of C
Privileged Structures/Fragments
Drug structures have gained weight over the years and are more complex (more legos)
An increase in the complexity of new frameworks (highly branched structures)
Example: Venetoclax (2016) (protein-protein inhibitor)
On average, the number of legos in a drug = 5.3
- 3.6 legos in the framework (rings + linkers)
- 1.7 legos in ‘decoration’ (substituents)
Pihan et al., Bioinformatics, 2012; Douguet D., ACS Med Chem Lett, 2018. http://chemoinfo.ipmc.cnrs.fr
Good practices for Hit Identification
Lead-like compounds:
MW 150-350
LogP ≤ 3
Rings 1-4
H-bond donors (nbHD) < 5
H-bond acceptors (nbHA) < 8
Exclude PAINS (Pan-Assay Interference Compounds) -> false positive hits = apparent biological activity:
• molecules that interfere with the assays (aggregation, micelles, autofluorescence…)
• molecules that interfere with the function of the protein (chemical reactivity (aldehydes, epoxides, acid halides…), metal chelation, redox activity…)
e.g., phenotypic assays and amphiphilic molecules (!! non-specific activity through membrane binding)
PAINS classes: rhodanines, quinones, catechols… are well-known frequent hitters
[Figure: structures of example frequent hitters, including curcumin]
Reproducible activity with Re-synthesized or Repurified molecule
Additional biological assays (SPR, structural biology…)
Demonstration of Structure-Activity Relationships (SAR) and Hit-to-Lead optimization
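The lead-like criteria listed at the top of this slide can be applied as a simple property filter before or after screening. The sketch below assumes the descriptors (MW, LogP, ring count, H-bond donors and acceptors) have already been computed with a cheminformatics toolkit; the `Compound` class, the compound names and their property values are hypothetical, and the LogP bound is read as ≤ 3.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mw: float      # molecular weight
    logp: float    # calculated LogP
    rings: int     # number of rings
    nb_hd: int     # number of H-bond donors
    nb_ha: int     # number of H-bond acceptors


def is_lead_like(c: Compound) -> bool:
    """Lead-like window from the slide: MW 150-350, LogP <= 3, 1-4 rings, nbHD < 5, nbHA < 8."""
    return (
        150 <= c.mw <= 350
        and c.logp <= 3
        and 1 <= c.rings <= 4
        and c.nb_hd < 5
        and c.nb_ha < 8
    )


# Hypothetical compounds with precomputed descriptors.
library = [
    Compound("cpd_1", mw=285.3, logp=2.1, rings=2, nb_hd=2, nb_ha=4),
    Compound("cpd_2", mw=462.5, logp=4.8, rings=4, nb_hd=1, nb_ha=6),
    Compound("cpd_3", mw=120.1, logp=0.3, rings=1, nb_hd=1, nb_ha=2),
]

lead_like = [c.name for c in library if is_lead_like(c)]
print(lead_like)
```

Such a filter only addresses the property window; PAINS removal, re-synthesis and orthogonal assays remain separate steps.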