Drug Target Informatics … and answers to the questions: What is a target? How many such targets are there?*)

Tudor I. Oprea, Division of Biocomputing, University of New Mexico School of Medicine

*) P. Imming, C. Sinning, A. Meyer, Nature Rev. Drug Discov. 2006, 5: 821-834
*) J. Overington, B. Al-Lazikani, A.L. Hopkins, Nature Rev. Drug Discov. 2006, 5: 993-996

Drug Discovery Informatics 2
The University of New Mexico, Division of Biocomputing
Copyright © Tudor I. Oprea, 2008. All rights reserved

Outline

• Phases of Drug Discovery - recap
• Target Informatics: How many Drug Targets?
  – Examples by Drug
  – Examples by Target Class
• Oral Drug Targets: Can Literature be corrected?
• Errors in Drug Target Informatics
  – What X-ray crystallographers won’t tell you
• The Physical Basis of the Rule of Five

Target Identification in Preclinical Discovery

Target Identification → Hit Identification → Lead Identification → Lead Optimization → Clinical Candidate

• Human genetics
• Mouse genetics

Identification, Validation, Production

The key in target identification is mass production of pure protein for structural studies

Modern Technologies in Preclinical Discovery

Target Identification → Hit Identification → Lead Identification → Lead Optimization → Clinical Candidate

• Synthetic Compounds
• Natural Products

High Throughput Synthesis and Screening

Modern Technologies in Preclinical Discovery

Target Identification → Hit Identification → Lead Identification → Lead Optimization → Clinical Candidate

• Combinatorial & Medicinal Chemistry
• Toxicogenomics

Modern Technologies in Preclinical Discovery

Target Identification → Hit Identification → Lead Identification → Lead Optimization → Clinical Candidate

• Structure-based Drug Design
  – Zanamivir (inhaled; Relenza®): first anti-influenza drug (1999)
  – Oseltamivir (orally available prodrug, Tamiflu®): launched in 2001
• Transgenic animals
  – Clinically relevant disease models
  – Study metabolism & toxicology in “human”-ized conditions

The Attrition Rate in Drug Discovery

[Figure: attrition funnel, number of compounds remaining at each stage]
  HTS: 1,000,000
  HTS Hits: 2,000
  HTS Actives: 1,200
  Lead Series: 50-200
  Drug Candidates: 1
  Drug: 0.1

The Molecular Pharmacopoeia: Drug Targets & Dis-ease

• Literature estimates of the number of drug targets range from 500 (targets hit by current drugs) to 5,000 (high estimate)
  – Definition: A target is a macro-molecular structure (defined by at least a molecular mass) that undergoes a specific interaction with therapeutics (chemicals administered to treat or diagnose a disease). The target-drug interaction results in clinical effect(s).
  – Imming, Sinning & Meyer considered the ’intended’ (not side-effect) targets for drugs; validation in knock-out models was a plus; receptor (ant)agonism and enzyme inhibition were also considered proof; 1-3 targets/drug were considered [was this OK?!].
  – Overington, Al-Lazikani & Hopkins considered protein targets for FDA-approved drugs only (~1200 drugs from the Orange Book). They did make allowances for ”non-intended” drug targets for, e.g., ritonavir, an HIV-protease inhibitor given in combination with other such inhibitors because it slows down their metabolism via CYP3A4 inhibition (thus CYP3A4 was considered a drug target for ritonavir). [this was better].
• Part of the problem: there is no “right” definition for health (e.g., free from dis-ease). In the case of sickness, do we “cure”, do we “treat” patients, or do we heal them?

Aspirin – the “first drug”

• COX-1; Prostaglandin G/H synthase 1

• COX-2; Prostaglandin G/H synthase 2
  Acts as suicide inhibitor
• Platelet glycoprotein IIb of IIb/IIIa complex, or antigen CD41
  Acts as competitive antagonist (μM inhibitor) (used as Baby Aspirin as antiaggregant)
• Phospholipase A2 (PDB code 1OXR)
  Acts as competitive antagonist (μM inhibitor)

History: Felix Hoffmann was believed to have developed aspirin for F. Bayer & Co., to help his rheumatic father. Arthur Eichengrün claimed in 1949 that the work had been done under his direction. Walter Sneader analyzed archival data from Bayer, as well as published material, and concluded that Eichengrün's claim is valid. Acetylsalicylic acid was synthesised under Eichengrün's direction, and it would not have been introduced in 1899 without his intervention.
W. Sneader, British Medical Journal 2000, 321:1591–1594

Indomethacin – an anti-inflammatory

Anti-inflammatory; antipyretic; analgesic
• COX-1; Prostaglandin G/H synthase 1 (6.9)
• COX-2; Prostaglandin G/H synthase 2 (6.05)
• PLA2; Phospholipase A2 (8.0)
  acts as reversible, competitive inhibitor, with affinity in the sub-micromolar to nanomolar range
• IL-1; interleukin 1 (6.5)
  acts as antagonist of PGE2 production (sub-μM)
(the above targets clearly related to inflammation)
• Prostanoid DP2 receptor; GPR44 (7.5)
Indomethacin is clinically used as a tocolytic agent, effective in preventing pre-term labour, because it acts as full agonist on Prostanoid DP2 receptors

Dexamethasone – another anti-inflammatory

History: G.E. Arth discovered (at Merck) that 16α-substituted steroids are metabolically more stable. He soon discovered that this substitution also reduced the mineralocorticoid activity and enhanced the anti-inflammatory activity. Dexamethasone (1958) is more potent than hydrocortisone, with a longer duration of action.

Glucocorticoid; anti-inflammatory; diagnostic aid (Cushing Syndrome, depression)
• Glucocorticoid receptor (8.28) – as agonist
• IL-4; interleukin 4 (8.33) & IL-5; interleukin 5 (8.46)
• TNF-α; tumor necrosis factor alpha (7.7); IFN-γ; interferon-gamma (10.0)
(the above targets clearly related to inflammation)
• Anti-Target (?): Mineralocorticoid receptor (7.48)
• Anti-Target: PXR; Pregnane X receptor, as agonist
Dexamethasone is used to treat inflammatory and autoimmune conditions, e.g., rheumatoid arthritis; to counteract the development of brain edema; and to prevent virilisation of a female fetus in congenital adrenal hyperplasia. Also used to counteract side-effects of antitumor treatment in cancer patients undergoing chemotherapy.

Aripiprazole – a “dirty drug” example

Target         Meas  Value    Activity
D2             Ki    0.34 nM  partial agonist
D3             Ki    0.8 nM   antagonist
D4             Ki    44 nM    antagonist
5HT1A          Ki    1.7 nM   partial agonist
5HT2A          Ki    3.4 nM   antagonist
5HT2C          Ki    15 nM    antagonist
5HT7           Ki    39 nM    antagonist
alpha1 AR      Ki    57 nM    antagonist
H1             Ki    61 nM    antagonist
5HT reuptake   Ki    98 nM    antagonist

• Aripiprazole is an antipsychotic and neuroleptic with efficacy in schizophrenia and bipolar disorder. Its mechanism of action is unknown (as per FDA label), although the above activities were observed.

Tamoxifen – a “clean drug” example

• Estrogen receptor – intended drug target. TAM & metabolites antagonize dimer formation; ERα monomer + TAM can act as agonist (NFkB, AP-1)
• GPR30 – 4-OH TAM agonist
• ERRγ (estrogen-related response receptors, also class 3 NHRs) – 4OHTAM, antagonist
• Emopamil binding protein; 3β-hydroxysteroid-Δ7-8 isomerase; cholestenol delta-isomerase (TAM, inhibitor)
• Type I sigma receptor (TAM & metab., antagonists)
• PXR; Pregnane X receptor

[Figure: tamoxifen metabolism scheme showing TAM, 4OHTAM, N-desmethylTAM and Endoxifen; enzyme labels: CYP2D6, 2B6, 2C9, 2C19, 3A; CYP3A4/5; CYP3A4/5; CYP2D6. Desta, Z et al JPET 2004, 310:1062-1075]

Tamoxifen is the gold standard “antiestrogen” therapy, used as first-line therapy in estrogen receptor-positive breast cancers. Although its mechanism of action is “known” (as per FDA label), TAM has nanomolar affinity to all the above targets.

Amantadine – a “simple drug” example

• D1 dopamine receptor agonist
• D2 dopamine receptor agonist
• N-methyl D-aspartate receptor subtype 2D (Glutamate [NMDA] receptor subunit epsilon 4) - antagonist at the Phencyclidine binding site
Used in Parkinson’s disease
• Antiviral against Influenza A virus by interfering with the viral M2 membrane ion channel; appears effective on all Influenza A viral strains
• Antiviral against feline immunodeficiency virus
Used as antiviral
• Side effect 1: hERG (probably). Demonstrated to produce QT-prolongation (with risk for congenital long QT patients)
• Side effect 2: anticholinergic-like effects (dry mouth, urinary retention, and constipation), which do not appear to be mediated by direct binding to cholinergic receptors

Acyclovir – Using Viral Machinery

• DNA polymerase from Herpes Simplex Virus
• DNA polymerase from Herpes Zoster Virus
In vitro and in vivo inhibitor against herpes simplex virus types 1 (HSV-1), 2 (HSV-2), and varicella-zoster virus (VZV).
However, Acyclovir is a prodrug that requires conversion by viral thymidine kinases (TK), as encoded by HSV and VZV. These convert acyclovir into acyclovir monophosphate; this is further converted into diphosphate by cellular guanylate kinase, and into triphosphate by cellular enzymes.
• KITH_HHV1 (Q9QNF7)
• KITH_HHV23 (P04407)
• KITH_VZV7 (P14342)
The above are SwissProt identifiers for the 3 TK enzymes that are targeted by the prodrug

Chlorpromazine – Another anti-Viral?

Antiemetic; antipsychotic; neuroleptic
Chlorpromazine blocks postsynaptic mesolimbic dopaminergic receptors in the brain; it was the first “neuroleptic” (introduced in 1953).
At least 15 possible drug targets with sub-micromolar affinity:
It acts as antagonist on α-2A (6.2), α-2B (7.6), α-2C (7.2) adrenoceptors, on 5-HT1A (6.2), 5-HT2C (7.9) serotonin receptors, on D1 (7.64), D2 (7.55), D3 (8.22), D4 (8), D5 (7.34) dopamine receptors, and on H1 (8.2) and H4 (8) histamine receptors. It also acts as inverse agonist on 5-HT2A (8.1), 5-HT6 (7.9), and 5-HT7 (7.6) serotonin receptors.
• Anti-Target: hERG; potassium voltage-gated channel subfamily H member 2
  It causes QT prolongation (risk for congenital long QT patients)
• Possible anti-target: CAR; constitutive androstane receptor (data on mouse only)
Recent peer-reviewed literature suggests that chlorpromazine is an effective viral entry inhibitor.

Is hERG Binding Important?

• Clemastine (1967), an antihistaminic, competes with histamine for H1-receptor sites on effector cells in the GI-tract, blood vessels and respiratory tract
• Clemastine is a potent hERG inhibitor (12 nM), but does not cause QT-prolongation

[Structures: Ciprofloxacin (1986); Grepafloxacin (1998)]
• Grepafloxacin, launched as Vaxar in Germany and Denmark (1998) by Otsuka (Japan), was withdrawn in 1999, following reports of severe cardiovascular events (binds to hERG at the micromolar level, but causes QT prolongation which may lead to fatal ventricular arrhythmias)
• Ciprofloxacin has not been associated with QT prolongation!

Hydralazine – new Uses for an old Drug

Used as antihypertensive since 1953
• Prolyl 4-hydroxylase subunit α-1 (aka Procollagen-proline,2-oxoglutarate-4-dioxygenase α-1 subunit)
• Voltage-dependent L-type calcium channel subunit α-1C
Clinical effect: direct vasodilatation of arterioles with decreased systemic resistance.
• Hydralazine causes lupus in genetically predisposed individuals (J. Immunology 2005, 174: 6212-6219)
• It also induces DNA demethylation and re-expression of the ER, RAR-beta, and p16 genes and other tumor suppressor genes in cultured cells (Clinical Cancer Research 2003, 9: 1596-1603)
• Hence it is currently evaluated as adjuvant therapy for current chemotherapeutic agents.
Potentially novel targets:
• DNA (cytosine-5)-methyltransferase 1; Dnmt1
• DNA (cytosine-5)-methyltransferase 3A; Dnmt3a
• DNA (cytosine-5)-methyltransferase 3B; Dnmt3b

Caffeine – a “stimulant drug” example

• Cyclic AMP-inhibited phosphodiesterase 4A
Used as CNS stimulant, bronchial smooth muscle relaxant
• adenosine A1, A2A, A2B and A3 receptors
Used as cardiac muscle stimulant and (?) diuretic
Note: Caffeine is a weak binder (μM range) of adenosine receptors; its activity is due to high dosage as well as active metabolites (e.g., theobromine, theophylline)
Also used to combat apnea of premature newborns
• PDE4A; Cyclic AMP-inhibited phosphodiesterase 4A
• intermediate conductance calcium-activated potassium channel protein 4; KCNN4; KCa2-3.4
• CYP 1A2 (liver) metabolizes caffeine: ~80% is metabolized to paraxanthine (1,7-dimethylxanthine), ~10% to theobromine (3,7-dimethylxanthine), and ~4% to theophylline (1,3-dimethylxanthine).

Disclaimer: The practical advice given below does not constitute endorsement of substance abuse of any kind.

Some practical advice for Caffeine and Ethanol users:
Ethanol is volatile, and 60% is eliminated through exhalation.
DO NOT drink caffeine when heavily intoxicated (it slows Ethanol metabolism). Exhale, slooowly. Exhale. Let it go ☺
If you want to prolong the effects of caffeine, combine it with tea (slows Caffeine metabolism)

The Fitness Landscape

Similar molecules act in a similar manner …or do they?!
We’re beginning to realize that similar molecules may have very different activities, leading to what Gerry Maggiora calls activity cliffs.

Multi-Target Binding Affinity Cliffs

MolName      α1 AR   α2 AR   β1 AR   β2 AR   MW      AlogS  AlogP  ClogP
Dopamine     4.8539  7.2366  5       4.301   153.18  -1.27  -0.4   0.169
Epinephrine  5.7959  7.8861  5.9586  6.1805  183.21  -0.97  -0.6   0.685
(R)-1b       4.6021  6.2596  6.8861  7.1739  187.17  -1.23  -0.97  -0.66
(S)-1d       4.9208  6.4437  3.9208  3.5686  201.2   -1.36  -0.43  0.156

[Chemical structure column not reproduced]

Source: J. Med. Chem. 43(8)-2000 1611-1619

Multi-Target Drug Activity Cliffs

MolName       MW      AlogS  AlogP  ClogP  Targets 1-4
Norgestrel    312.46  -4.74  3.25   3.5    Progesterone receptor; Estrogen receptor
Progesterone  314.47  -4.77  3.58   3.96   Progesterone receptor; Estrogen receptor; Membrane progestin receptor alpha; Mineralocorticoid receptor
Alphaxalone   332.49  -4.15  3.28   3.73   Chloride channel protein, skeletal muscle, ClC-1; GABA-A receptor alpha-1 subunit; GABA-A receptor alpha-2 subunit; GABA-A receptor alpha-5 subunit
Danazol       337.47  -4.27  3.63   3.93   Progesterone receptor; Estrogen sulfatase; Androgen receptor

[Chemical structure column not reproduced]

So…

• Are there any “magic bullets” that hit a single target, or is every drug acting on multiple targets?

Mono-Target Drugs

• Histamine H2 receptor antagonists (H2-blockers), e.g., cimetidine, ranitidine, famotidine, nizatidine, roxatidine
• Gastric proton-pump (H+/K+ ATPase) inhibitors, e.g., omeprazole, lansoprazole, pantoprazole
• Serotonin 5-HT3 receptor antagonists (antiemetics), e.g., granisetron, ondansetron, tropisetron, dolasetron
• HMG-CoA reductase inhibitors (to lower blood cholesterol), “statins”, e.g., atorvastatin, rosuvastatin

Current Drugs Classification

• Classification by therapeutic action, e.g., Cardiovascular Drugs: Debrisoquine, Quinidine, Flecainide, Mexiletine, Captopril, Lidocaine, Indoramin.
• Classification by intended drug target, e.g., Beta-adrenergic Blockers: Propranolol, Timolol, Atenolol, Metoprolol.
• Classification by chemical structure and mode of action, e.g., Tricyclic Antidepressants: Amitriptyline, Nortriptyline, Imipramine.
• Classification by “natural source”, e.g., Ergot Alkaloids: Bromocriptine, Ergotamine, DihydroErgocristine, or Opiates: Morphine, Codeine, Naloxone.
• Imming, Sinning & Meyer state that it is necessary to move away from single-target classification and consider the entire biochemical pathway as the drug target, due to the dynamic aspect of drug-organism interactions.

Imming, Sinning & Meyer: Drug Target Classification

• Enzymes
• Substrates, metabolites and proteins
• Receptors
• Ion channels
• Transport proteins
• DNA/RNA and the ribosome
• Targets of monoclonal antibodies
• Various physicochemical mechanisms
• Unknown mechanism of action

Target: Enzymes

Types / Drug examples

Oxidoreductases
• Cyclooxygenases (COXs): ACETYLSALICYLIC ACID (COX1 inhibitor); ACETAMINOPHEN (COX2 inhibitor)
• Aromatase: EXEMESTANE
• Lipoxygenases: MESALAZINE

Target: Enzymes (continued)

Types / Drug examples

Oxidoreductases
• HMG-CoA reductase: LIPITOR (ATORVASTATIN)

Target: Enzymes (continued)

Types / Drug examples

Transferases
• Tyrosine kinases:
  – IMATINIB (PDGFR/ABL/KIT inhibitor)
  – SUNITINIB (VEGFR2/PDGFRβ/KIT/FLT3)
  – ERLOTINIB (EGFR inhibitor)
  – VEGFR2/PDGFRβ/RAF

Target: Enzymes (continued)

Types / Drug examples

Hydrolases (proteases)
• Aspartyl proteases (viral): SAQUINAVIR (HIV protease inhibitor); INDINAVIR (HIV protease inhibitor)

Target: Enzymes (continued)

Types / Drug examples

Hydrolases (metalloproteases)
• Human ACE: CAPTOPRIL

Hydrolases (other)
• 26S proteasome: BORTEZOMIB
• Esterases: PAPAVERINE (PDE4 inhibitor); SILDENAFIL (PDE5 inhibitor)

Target: Receptors

Types / Drug examples

Direct ligand-gated ion channel
• GABAA receptors: BARBITURIC ACID (binding site agonists)
• Acetylcholine receptors: GALANTAMINE (nicotinic receptor allosteric modulators)
• Glutamate receptors (ionotropic): KETAMINE (NMDA subtype phencyclidine binding site antagonists)

Target: Receptors (continued)

Types / Drug examples

G-protein-coupled receptors
• Adrenoceptors: PROPRANOLOL (β1-receptor antagonists); ATENOLOL (β1-receptor antagonists)
• Cysteinyl-leukotriene receptors: MONTELUKAST (antagonists)
• Histamine receptors: CIMETIDINE (H2 - antagonists)

Target: Receptors (continued)

Types / Drug examples

G-protein-coupled receptors
• Opioid receptors: MORPHINE (μ - opioid agonists); BUPRENORPHINE (μ - opioid agonists)
• Purinergic receptors: CLOPIDOGREL (P2Y12 antagonists)

Target: Receptors (continued)

Types / Drug examples

Nuclear receptors (steroid hormone receptors)
• Mineralocorticoid receptor: SPIRONOLACTONE (antagonists)
• Estrogen receptor: FULVESTRANT (antagonists)

Target: Ion Channels

Types / Drug examples

K+ channels
• Voltage-gated K+ channels: AMIODARONE

Na+ channels
• Voltage-gated Na+ channels: CARBAMAZEPINE

RIR Ca2+ channel family
• Ryanodine receptors: DANTROLENE

Unique Drug Targets by Class

[Pie chart, N = 492] Enzymes, 45% (N = 221); Ion Channels, 16% (N = 77); GPCRs, 16% (N = 78); Proteins, 12% (N = 57); Transporters, 5% (N = 24); NHRs, 4% (N = 21); Receptors, 1% (N = 7); Nucleic Acid, 1% (N = 6); Bone, 0%

Target Name (Examples)                          Nr. Drugs  Example
COX-1 (human)                                   36         Piroxicam
ACE (human)                                     10         Trandolapril
α2/δ1 Ca channel (human)                        9          Amlodipine
HIV-1 protease (viral)                          8          Tipranavir
HIV-1 RT (viral)                                11         Nevirapine
14-α demethylase (fungal)                       7          Voriconazole
α1A adrenoceptor (human)                        42         Dapiprazole
GABAB subunit 2 (human)                         11         Zaleplon
GABA-A receptor (worm)                          2          Ivermectin
Na-dependent serotonin re-uptake pump (human)   29         Escitalopram
K+ transporter (bacterial)                      1          Clofazimine
penicillin binding protein (bacterial)          39         Amoxicillin
Annexin A1 (human)                              14         Hydrocortisone

Unique Human Drug Targets by Class

[Pie chart, N = 379] Enzymes, 37% (N = 142); GPCRs, 21% (N = 78); Ion Channels, 20% (N = 75); Proteins, 8% (N = 31); Transporters, 6% (N = 23); NHRs, 6% (N = 21); Receptors, 2% (N = 7); Nucleic Acid, 0% (N = 1); Bone, 0% (N = 1)

Target Name (Examples)                 Nr. Drugs  Example
COX-1                                  36         Piroxicam
COX-2                                  35         Celecoxib
Carbonic anhydrase 1                   13         Acetazolamide
α1A adrenoceptor                       42         Dapiprazole
D2 dopaminic receptor                  41         Cabergoline
H1 histaminic receptor                 39         Fexofenadine
GABAA receptor (α1)                    42         Zaleplon
Nav1.5 sodium channel                  34         Lidocaine
glutamate [NMDA] receptor subunit 3A   13         Ketobemidone
Glucocorticoid receptor                19         Mometasone
Estrogen receptor                      15         Tamoxifen
Progesterone receptor                  13         Mifepristone
Annexin A1                             14         Hydrocortisone
Calmodulin, CaM                        6          Trifluoperazine
Hemoglobin                             5          Quinine
Benzodiazepine (peripheral)            17         Diazepam
σ1 type (opioid) receptor              3          Dextromethorphan
Serotonine reuptake pump               29         Sertraline
Norepinephrine reuptake                28         Atomoxetine
ABCC8 transporter                      11         Nateglinide
Human DNA                              17         Cisplatin

Unique Oral Drug Targets by Class

[Pie chart, N = 418] Enzymes, 44% (N = 184); GPCRs, 17% (N = 72); Ion Channels, 16% (N = 68); Proteins, 10% (N = 41); Transporters, 6% (N = 24); NHRs, 5% (N = 19); Receptors, 1% (N = 4); Nucleic Acid, 0% (N = 6)

Target Name (Top 3 by Class)            Nr. Drugs  Example
COX-1 (h)                               35         Piroxicam
COX-2 (h)                               34         Celecoxib
Carbonic anhydrase 1 (h)                13         Acetazolamide
D2 dopaminic receptor (h)               41         Cabergoline
H1 histaminic receptor (h)              39         Fexofenadine
α1A adrenoceptor (h)                    36         Dapiprazole
GABAA receptor (α1) (h)                 36         Thiopental
Nav1.5 sodium channel (h)               34         Procainamide
K+ channel Kir6.2 (h)                   14         Chlorpropamide
Glucocorticoid receptor (h)             14         Prednisone
Estrogen receptor (h)                   14         Tamoxifen
Progesterone receptor (h)               11         Mifepristone
penicillin binding protein (bacterial)  22         Amoxicillin
50S ribosomal protein L10               8          Clarithromycin
Benzodiazepine (peripheral)             17         Diazepam
σ1 type (opioid) receptor               3          Dextromethorphan
Serotonine reuptake pump                29         Sertraline
Norepinephrine reuptake                 28         Atomoxetine
Na+/K+/Cl- cotransporter (h)            11         Torsemide
16S rRNA                                12         Isepamicin
Viral DNA                               3          Stavudine

Human Oral Drug Targets by Class

[Pie chart, N = 333] Enzymes, 37% (N = 123); GPCRs, 22% (N = 72); Ion Channels, 20% (N = 66); Proteins, 8% (N = 25); Transporters, 7% (N = 23); NHRs, 6% (N = 19); Receptors, 1% (N = 4); Nucleic Acid, 0% (N = 1)

Target Name (Top 3 by Class)   Nr. Drugs  Example
COX-1                          34         Piroxicam
COX-2                          34         Celecoxib
Carbonic anhydrase 1           11         Acetazolamide
D2 dopaminic receptor          38         Cabergoline
α1A adrenoceptor               36         Dapiprazole
H1 histaminic receptor         35         Fexofenadine
GABAA receptor (α1)            36         Alprazolam
Nav1.5 sodium channel          28         Disopyramide
Potassium channel, Kir6.2      14         Glyburide
Glucocorticoid receptor        14         Budesonide
Estrogen receptor              14         Estradiol
Progesterone receptor          11         Progesterone
Annexin A1                     11         Betamethasone
Calmodulin, CaM                6          Trifluoperazine
Hemoglobin                     5          Mefloquine
Benzodiazepine (peripheral)    16         Clorazepate
σ1 type (opioid) receptor      3          Dextromethorphan
Serotonine reuptake pump       29         Paroxetine
Norepinephrine reuptake        28         Venlafaxine
ABCC8 transporter              11         Torsemide

Human Oral Drug Targets by Class (2)

[Figure: five pie charts of target-class shares, grouped by number of drugs per target: 1 Drug/Target (N = 171); 2 - 4 Drugs/Target (N = 80); 5 - 9 Drugs/Target (N = 33); 10 - 19 Drugs/Target (N = 34); ≥ 20 Drugs/Target (N = 14). In the ≥ 20 Drugs/Target group: GPCRs 57%, Enzymes 14%, Ion Channels 14%, Transporters 14%.]

≥ 20 Drugs/Target
Target Name (All Targets)   Nr. Drugs  Example
5-HT2A receptor             33         Aripiprazole
M1 muscarinic receptor      31         Atropine
α1A adrenoceptor            36         Carvedilol
β1 adrenoceptor             28         Metoprolol
β2 adrenoceptor             22         Salbutamol
D2 dopaminic receptor       38         Cabergoline
GABAA receptor (α1)         36         Zaleplon
H1 histaminic receptor      35         Fexofenadine
μ - type opioid receptor    23         Hydrocodone
COX-1                       34         Acetylsalicylate
COX-2                       34         Celecoxib
Norepinephrine reuptake     28         Duloxetine
Serotonine reuptake         29         Sertraline
Nav1.5 sodium channel       28         Riluzole

5 - 9 Drugs/Target
Target Name (Examples)                  Nr. Drugs  Example
5-HT3 receptor                          9          Dolasetron
Thiazide-sensitive Na-Cl cotransporter  8          Hydrochlorothiazide
Aromatase                               5          Exemestane
Acetylcholinesterase                    5          Donepezil

10 - 19 Drugs/Target
Target Name (Examples)        Nr. Drugs  Example
D3 dopaminic receptor         17         Risperidone
Benzodiazepine (peripheral)   17         Diazepam
Cav1.2 Ca2+ channel           12         Nifedipine
Progesterone receptor         11         Desogestrel

Urban Legend…

• “40-50% of marketed drugs target G-protein coupled receptors”

• This trend holds true when examining “popular” drug targets (more than 5 drugs per target)

• In terms of unique targets, enzymes as a class outnumber GPCRs, while ion channels are a close second

Drug Targets Revisited

• Imming, Sinning & Meyer counted 218 drug targets; Overington, Al-Lazikani, & Hopkins suggest 186 small-molecule targets
  – Discrepancy: Drug targets, as counted by these authors, do not consider unique protein classes, and do not capture each high-affinity target.
• An analysis of 1030 drugs (WOMBAT-PK database) shows 492 unique drug targets, of which 379 are human:
  – 142 enzymes; 78 GPCRs; 75 ion channels; 31 proteins; 23 transporters; 21 NHRs; 7 ’other’ receptors; 1 nucleic acid, and 1 ”bone” (hydroxyapatite)
• The 333 Oral Drug Targets by class:
  – 123 enzymes; 72 GPCRs; 66 ion channels; 25 proteins; 23 transporters; 19 NHRs; 4 ’other’ receptors; and 1 nucleic acid
• From WOMBAT (nearly 200,000 medicinal chemistry substances): at least 68 additional targets, of which 43 are human, are reported in the medicinal chemistry literature, with affinity > 10 nM for 171 launched drugs (revisit!)
• In total, 492 targets, of which 379 are human, were found
• So: How many Drug Targets? And how many small molecules can we develop to therapeutically manipulate them?
• Part of the difficulty: there is no unique, standardized source to capture information related to small molecules (including drugs) and the macromolecules (proteins, nucleic acids) that interact with them.

Errors in Drug Target Informatics

• Errors of Target Structure
• Errors of Chemical Structure
• Errors of Biological Activity

…watch out for these

Using X-ray Crystal Structures

“An X-ray crystal structure is one crystallographer’s subjective interpretation of an observed electron-density map expressed in terms of an atomic model”

• Selection of Common assumptions
  – The structure is correct
  – It is at perfect resolution
  – No errors: conformation of ligand and D-R interactions correct
  – Water structure known
  – Crystallisation conditions relevant to biology
  – Protein conformation fixed, movement understood
  – We can use the structure to design potent ligands
  – A protein structure really aids drug design

A. Davis, S. Teague, G. Kleywegt, Angew. Chem. Int. Ed. 2003

Is the X-Ray Structure Correct?

1PHY (1989, 2.4 Å); 2PHY (1995, 1.4 Å)

• Photoactive yellow protein from E. halophila
  – First structure, 1PHY: wrong
  – Subsequently corrected at higher resolution

Do we still have Errors?

Erratum…

Structural basis for BABIM Inhibition in Botulinum Neurotoxin Type B Protease, J. Am. Chem. Soc. 2000, 122, 11268

“After a detailed structural analysis of the electron density maps … we have concluded that the maps do not support the placement of the inhibitor as stated in the paper.”

Detailed Structural Analysis

“Where there is no chicken wire, there’s no electrons..atoms”

• 1FQH now withdrawn from PDB

Make sure the protein is where they say it is

• D98N nitrite reductase
  – Nitrite disordered
  – Asn density shows 2 confs
  – Only 1 reported in PDB file

Conformation Correctly Assigned

[Structures: alternative orientations of amide side chains (e.g. glutamine, asparagine) and of the imidazole ring (e.g. histidine)]

• NH2 & O can’t be distinguished from density as they are isoelectronic
• PDBREPORT suggests 15% in the Protein Data Bank are likely incorrect
• N/C cannot normally be distinguished from density

The 22nd Amino Acid

• gene product of in-frame amber (UAG) codon in Methanosarcina barkeri monomethylamine methyltransferase
• X-ray structure at 1.55 Å
  – X is either Me, NH2 or OH

B. Hao, W. Gong, T. Ferguson, C. James, J. Krzycki, M. Chan, Science 2002, 296, 1462.

Where’s the Chlorine?

ATOM    AtomNr  Type  Res.Type  Res.Nr  X       Y        Z       Occupa  B Factor  LineNr
ATOM    2474    CB    GLU       310     52.404  81.843   26.409  1       88.06     2DHC2614
ATOM    2475    CG    GLU       310     51.346  82.83    26.95   1       88.24     2DHC2615
ATOM    2476    CD    GLU       310     50.029  82.289   27.457  1       88.23     2DHC2616
ATOM    2477    OE1   GLU       310     49.688  81.081   27.504  1       87.72     2DHC2617
ATOM    2478    OE2   GLU       310     49.297  83.219   27.855  1       88.37     2DHC2618
ATOM    2479    OXT   GLU       310     55.988  81.334   25.998  1       90.28     2DHC2619
TER     2480          GLU       310                                                2DHC2620
HETATM  2481    CL1   DCE       600     26.746  104.755  31.15   1       17.69     2DHC2621
HETATM  2482    C1    DCE       600     28.195  105.019  30.194  1       21.52     2DHC2622
HETATM  2483    C2    DCE       600     29.452  105.069  31.044  1       20.78     2DHC2623
HETATM  2484    CL2   DCE       600     30.694  105.43   29.883  0.3     19.6      2DHC2624
HETATM  2485    O     HOH       401     20.972  82.672   23.517  1       1         2DHC2625
HETATM  2486    O     HOH       402     50.86   105.546  30.27   1       1         2DHC2626
HETATM  2487    O     HOH       403     22.358  99.271   35.482  1       1         2DHC2627

Literature QC

• Chirality: what chemists can interpret, computers are not always able to (the “above/below the plane” convention must be strictly enforced)
  [Structures: the same nucleoside drawn in a form that is not machine-readable and in a machine-readable form]
• Missing/altered atoms/substituents: overall error rate above 9%
  – Incorrectly drawn or written structures (3.4%); incorrect molecular formula or molecular weight (3.4%);
  – Unspecified binding position for substituents or ambiguous numbering scheme for the heterocyclic backbone (0.91%);
  – Structures with the incorrect backbone (0.71%);
  – Incorrect generic names or chemical names (0.24%);
  – Incorrect biological activity (0.34%);
  – Incorrect references (0.2%).

JMC Errors…

Reference             Comment
JMC 37-476, chart 1   rolipram: incorrect N atom position
JMC 43-2217, chart 1  A-85380: incorrect ring size
JMC 36-2645, -||-     tropisetron: methyl & O group in plus
-||-                  DAU-6285: missing methoxy; N instead O
JMC 37-758, chart 1   Ro-15-4513: methyl group missing
JMC 37-787, figure 1  epalrestat: E/Z config: E instead Z

[Published Structure and Corrected Structure columns not reproduced]

Other Errors… Merck Index

[Structures: "Carisoprodol" as drawn in Merck Index 13th ed #1854, next to the correct structure of Carisoprodol]

Disclaimer: The above error has been corrected in the Merck Index 14th edition. In general, the Merck Index is a reliable source of information.

The Physical Basis for the Rule of Five (Safe Play by the FDA)

Tudor Oprea, Scott Boyer

Additional support: Igor Tetko (GSF, Munchen) CrashCrash CourseCourse inin PharmacokineticsPharmacokinetics • Oral bioavailability describes the rate and extent to which the active drug ingredient is absorbed from a drug product and becomes available at the site of drug action. It takes into account digestive- tract absorption and first-pass metabolism (reaching systemic circulation). %Oral ranges from zero (none) to 100 (complete). • Clearance: the hypothetical volume of a fluid from which a substance is totally and irreversibly removed per unit time t. Systemic (total) clearance = removal via excretion AND metabolism (includes hepatic, renal, other). Renal clearnace = elimination (via excretion). CrashCrash CourseCourse inin PharmacokineticsPharmacokinetics (2)(2) Half-life T1/2 estimates the amount of time it takes for the active drug ingredient to reach half- concentration in (usually) plasma. Half-life influences the dosing regimen: e.g., drugs with short half-life require more frequent administration. Volume of Distribution at steady-state, VDss, measures the relative partitioning between plasma and the tissues (it does not distinguish between various compartments). 693.0 ∗VDss Cltotal = T 2/1 Understanding Cl, VDss and T1/2 tells us how much and how often we should admininister the drugs CrashCrash CourseCourse inin PharmacokineticsPharmacokinetics (3)(3) Plasma Protein Binding describes the fraction (usually %) of the active drug that is bound to plasma proteins. The fraction unbound (fu) relates to pharmacological activity; the fraction bound (fb) acts as a (slow) release mechanism. Maximum recommended therapeutic dose, MRTD: Usually, the maximum daily dose in mg/kg-body weight (kg- bw) per day for which the desired pharmacological effect is achieved. 
Sometimes MRTD is the lowest dose for which the desired pharmacological effect is achieved… And sometimes MRTD is the highest dose that achieves an optimal pharmacological effect without reaching the maximal therapeutic effect, because it is very close to the toxic dose.
MRTD_U: the MRTD corrected for fraction unbound = MRTD * (1 - %PPB/100); a crude way to relate more closely to therapeutic and toxic effects.

Half-Life (222 compounds): G & G vs. Avery's
[Bar chart: % of compared compounds whose half-life differs between the two sources by more than a given threshold. >10%: 75.23; >30%: 43.69; >100%: 15.77; >1000%: 2.70]
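The relations from the pharmacokinetics crash course can be written out as a minimal Python sketch. It assumes VDss in L/kg, T1/2 in hours and %PPB on a 0-100 scale; the function names and the example numbers are illustrative assumptions, not values from the slides or from any database.

```python
import math

LN2 = math.log(2)  # ~0.693, the constant in the slide's formula


def total_clearance(vdss_l_per_kg: float, half_life_h: float) -> float:
    """Cl_total = 0.693 * VDss / T1/2, returned in L/h/kg."""
    if half_life_h <= 0:
        raise ValueError("half-life must be positive")
    return LN2 * vdss_l_per_kg / half_life_h


def half_life(vdss_l_per_kg: float, clearance_l_per_h_kg: float) -> float:
    """Same relation rearranged: T1/2 = 0.693 * VDss / Cl_total, in hours."""
    if clearance_l_per_h_kg <= 0:
        raise ValueError("clearance must be positive")
    return LN2 * vdss_l_per_kg / clearance_l_per_h_kg


def mrtd_unbound(mrtd_mg_per_kg_day: float, ppb_percent: float) -> float:
    """MRTD_U = MRTD * (1 - %PPB/100), assuming %PPB is given as 0-100."""
    if not 0.0 <= ppb_percent <= 100.0:
        raise ValueError("%PPB must lie between 0 and 100")
    return mrtd_mg_per_kg_day * (1.0 - ppb_percent / 100.0)


# Illustrative (made-up) values, not taken from any source.
cl = total_clearance(vdss_l_per_kg=1.0, half_life_h=4.0)
print(f"Cl_total = {cl:.3f} L/h/kg")
print(f"T1/2 recovered = {half_life(1.0, cl):.1f} h")
print(f"MRTD_U = {mrtd_unbound(10.0, 90.0):.2f} mg/kg/day")
```

Because the three quantities are tied together by one equation, an error in any one literature value propagates directly into the other two.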

• There is poor agreement between the sources on half-life: over 43% of the compounds differ by more than 30%.
Kim Fejgin (G.U.) and Péter Várkonyi (AZ) contributed to this analysis.

Clearance (202 compounds): G & G vs. Avery's
[Bar chart: % of compared compounds whose clearance differs between the two sources by more than a given threshold. >10%: 71.29; >30%: 42.57; >100%: 13.86; >1000%: 2.97]

• There is poor agreement between the sources on clearance: over 42% of the compounds differ by more than 30%.
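A source-to-source comparison of this kind can be sketched as follows. The slides do not say how "% difference" was defined; this sketch assumes the absolute difference relative to the smaller of the two values (which is what allows differences above 100%), and the paired values are made up for illustration.

```python
def percent_difference(a: float, b: float) -> float:
    """Absolute difference relative to the smaller value, in percent.

    Assumption: the original analysis may have used a different denominator.
    """
    smaller = min(a, b)
    if smaller <= 0:
        raise ValueError("values must be positive")
    return abs(a - b) / smaller * 100.0


def percent_exceeding(pairs, thresholds=(10, 30, 100, 1000)):
    """Percent of compound pairs whose difference exceeds each threshold."""
    diffs = [percent_difference(a, b) for a, b in pairs]
    return {t: 100.0 * sum(d > t for d in diffs) / len(diffs) for t in thresholds}


# Hypothetical (source A, source B) half-lives in hours for four compounds.
example_pairs = [(1.0, 1.05), (2.0, 3.0), (1.0, 2.5), (0.1, 2.0)]
summary = percent_exceeding(example_pairs)
for threshold, pct in summary.items():
    print(f">{threshold}% difference: {pct:.1f}% of compounds")
```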

Kim Fejgin (G.U.) and Péter Várkonyi (AZ) contributed to this analysis.

The Mis-Use of RO5 Scores

• Pharmaceutical lead discovery world-wide applies Lipinski's Rule-of-Five (Ro5): MW ≤ 500, cLogP ≤ 5, HDO ≤ 5, HAC ≤ 10. Any two violations = poor %Oral.
• Ro5 does not discriminate “druglikeness”. Its use is intended as a filter in early HTS hit analysis/discovery.
• Do any Ro5 criteria have implications downstream?
[Bar chart: % of compounds (0% to 80%) in ACD, MDDR and PDR classified by Ro5 as PASS, FAIL or SKIPPED]
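The Ro5 filter as stated on this slide can be sketched in a few lines of Python. The `Descriptors` container and the example values are hypothetical; in practice the four descriptors would come from a cheminformatics toolkit, which is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Hypothetical container for the four Ro5 descriptors."""
    mw: float      # molecular weight
    clogp: float   # calculated logP
    hdo: int       # hydrogen-bond donors
    hac: int       # hydrogen-bond acceptors


def ro5_violations(d: Descriptors) -> list:
    """Return the list of Ro5 criteria that the compound violates."""
    checks = {
        "MW > 500": d.mw > 500,
        "cLogP > 5": d.clogp > 5,
        "HDO > 5": d.hdo > 5,
        "HAC > 10": d.hac > 10,
    }
    return [name for name, violated in checks.items() if violated]


def ro5_fail(d: Descriptors) -> bool:
    """Slide's rule: any two violations suggest poor %Oral."""
    return len(ro5_violations(d)) >= 2


# Made-up descriptor values for illustration only.
small = Descriptors(mw=320.0, clogp=2.1, hdo=2, hac=5)
large = Descriptors(mw=780.0, clogp=6.3, hdo=3, hac=12)
print(ro5_violations(small), ro5_fail(small))
print(ro5_violations(large), ro5_fail(large))
```

As the slide stresses, a PASS here is an early-stage filter result and says nothing about "druglikeness".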

T.I. Oprea, J Comput-Aided Mol Des 2000, 14: 251-264

FDA Maximum Recommended Therapeutic Dose Dataset
MRTDs extracted from Martindale and PDR
MRTD taken to represent MTD, although many compounds are limited by lack of additional efficacy
MRTD ranges from 0.00001 to 1000 mg/kg/d
Total of 1235 compounds
FDA divided the dataset into Active (“unsafe”), Marginal, Inactive (“safe”): 576:120:613
MRTD data in the WOMBAT-PK database: 620 drugs (283:51:286)
WOMBAT-PK allows overlap of MRTD with VDss, %Oral, %PPB

Matthews et al., Current Drug Discovery Technologies, 2004, 1, 61-76

MRTD “Active” vs “Inactive”: Not a function of %Oral
[Figure: two histogram panels (Inactive, Active) over %Oral bioavailability bins 0.1-20, 20-80, 80-100 (380 drugs)]

MRTD “Active” vs “Inactive”: Not a function of %PPB
[Figure: two histogram panels (Inactive, Active) over %plasma protein binding bins 0-20, 20-80, 80-95, 95-99, 99-99.98 (377 drugs)]

MRTD Activity Class vs. clogP (n=443)

[Figure: distribution of ClogP (y-axis from -10 to 10) by MRTD class (Active, Inactive, Marginal), annotated with the values 2.63, 1.08 and 1.84; Dunnett's p<0.001]

MRTD Class vs. VDss (n=451)

[Figure: fraction of compounds (y-axis 0.00 to 1.00) in each MRTD class (Active, Inactive, Marginal) per VDss bin (L/kg): 0.0167-0.5, 0.5-1, 1-10, 10-250]

The Physical Basis for Ro5
• The single most important Lipinski Ro5 criterion that relates to MRTD_U is cLogP:
• High cLogP (cLogP > 4.5), high LogD7.4 (LogD7.4 > 3) and low TLogSw (TLogSw < -5) are associated with the low MRTD_U region. This also relates to low %Oral and high %PPB, to P450 metabolism (A. Davis et al.), and to VDss (F. Lombardo et al.).
• This simple cut-off relates to how effective/toxic drugs can be… Kudos to Al Leo (Biobyte Corporation)
• MW, HDO & HAC do not influence trends in the current MRTD_U dataset. These Ro5 criteria were introduced
  – to trim down combinatorial libraries (MW)
  – to improve %Oral (HDO, HAC)
• The CDER Division @ FDA uses mg/kg/day, rooted in clinical practice. This is arbitrary when trying to link molecular properties to it. More realistic is the use of μmol/kg/day.

Urban Legend: Bioavailability and Flexibility
[Figure: axes labeled %oral and Flexibility; see the paper by D. Veber et al., J. Med. Chem., 45, 2615-2623, 2002]
Flexibility is not directly related to passive diffusion. Highly flexible lipids are major components of the membrane.
The lack of flexible molecules in our therapeutic arsenal reflects the bias towards inhibitors (~80%), where we aim at rigid molecules.
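Two points from the "Physical Basis for Ro5" slide lend themselves to a short Python sketch: the property cut-offs associated with the low MRTD_U region, and the conversion of a mass-based dose into a molar dose. The function names and example values are assumptions made for illustration; the flags only restate the slide's cut-offs and are not a validated predictive model.

```python
def low_mrtd_u_flags(clogp: float, logd74: float, tlogsw: float) -> list:
    """List which of the slide's cut-offs a compound meets.

    Cut-offs (from the slide): cLogP > 4.5, LogD7.4 > 3, TLogSw < -5.
    """
    flags = []
    if clogp > 4.5:
        flags.append("cLogP > 4.5")
    if logd74 > 3.0:
        flags.append("LogD7.4 > 3")
    if tlogsw < -5.0:
        flags.append("TLogSw < -5")
    return flags


def mg_to_umol_per_kg_day(dose_mg_per_kg_day: float, mol_weight: float) -> float:
    """Convert a dose in mg/kg/day to umol/kg/day (mol_weight in g/mol)."""
    if mol_weight <= 0:
        raise ValueError("molecular weight must be positive")
    return 1000.0 * dose_mg_per_kg_day / mol_weight


# Made-up example: the same 1 mg/kg/day dose for two molecular weights.
print(mg_to_umol_per_kg_day(1.0, 250.0))   # lighter compound, more moles
print(mg_to_umol_per_kg_day(1.0, 500.0))   # heavier compound, fewer moles
print(low_mrtd_u_flags(clogp=5.2, logd74=3.4, tlogsw=-4.0))
```

The conversion illustrates why the slide calls mg/kg/day arbitrary for structure-property work: equal mass doses of compounds with different molecular weights correspond to different numbers of molecules.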

Additional $0.02 Comments on Pharma
• The drug discovery machine via HTS/combichem is over-rated and has failed to produce tangible results in ~2 decades. There has been no significant increase in the number of NCEs in the past 5 years.
• Faster drug discovery is stupid. Better design of clinical trials is what’s missing; better competence in the personnel running the clinical trials is what’s missing.
• A return to “traditional values” (in vivo experiments) is very likely to help. Viagra, Aripiprazole – even systems chemical biology
• Part of what’s wrong is the fact that everyone is hiding results from failed clinical trials. FDA has the info, but big pharma is not mature enough to consider it as pre-competitive knowledge (cell phone companies swap patents by the dozen, to improve individual products)
• Big pharma houses should just dispense drugs & outsource research → there’s a genuine need for the NIH-funded Molecular Libraries Initiative and its followers


The Tale of the Unknown Unknowns

• “There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know.”
• Known Knowns
  – Sets & models that we have, which were validated externally.
• Known Unknowns
  – Sets & models that we think we can prove, but lack external validation.
• Unknown Unknowns
  – Sets & models that we do not have, which lack any validation. These include that part of model space that we don’t even know exists

Ridiculed by the media, Sec. Donald Rumsfeld (once CEO of a pharma company, Searle) was, in fact, making a very serious point about impossible-to-predict situations.

Black Swans and White Tablets

Tudor I. Oprea, UNM School of Medicine
Andrew L. Hopkins, University of Dundee College of Life Sciences
Manuscript in preparation for Nature Rev. Drug Discov.

“We don’t learn that we don’t learn. We don’t learn rules, just facts.” NNT

The Impact of the Unpredictable

• On November 30, 2006, Jeff Kindler, Pfizer’s CEO, was quoted as saying about Torcetrapib that “…this will be one of the most important compounds of our generation.”
• On December 2, 2006, Pfizer cut off Torcetrapib's ILLUMINATE trial because of "an imbalance of mortality and cardiovascular events" associated with its use (82 vs 51 deaths in a 15,000-patient clinical trial comparing Lipitor/Torcetrapib vs Lipitor alone)
• This was the most important late-stage compound in Pfizer’s portfolio. The event not only threw the financial projections and plans of the world’s largest pharmaceutical company into flux (e.g., closing Ann Arbor), but it even raised questions about the utility of raising HDL, the main hypothesis governing anti-atherosclerosis therapy for the past decade.
• This event was a Black Swan: an unpredictable event, which had a massive impact.
[Figure: chemical structure of Torcetrapib] Cholesteryl ester transfer protein inhibitor... Is this still a valid drug target?...

Source of Inspiration
• NNT (Nassim Nicholas Taleb) is a financial trader from Amioun, Lebanon, who learned through years of practice that life, war and science are unpredictable
• The Black Swan is a metaphor for the first sighting of the black swan, which (a) invalidated the assumption that all swans are white – based on millions of previous observations; (b) changed our perception of those birds and (c) was retrospectively “assimilated” as a highly predictable event

One Thousand and One Days Of History
• The Turkey Surprise (NNT), also referred to as [David] “Hume’s Problem”, was contemplated by Sextus Empiricus, Al-Ghazali, P.D. Huet, Francis Bacon, Bertrand Russell...
• Sir Francis extended it to silent evidence (we can’t learn from the hidden; we don’t learn that we never learn)
Past data as a disincentive for believing that we can truly predict the unknown

Quality of Life (seen by a Turkey)
[Figure: quality of life over time] Fig. 1: A turkey before and after Thanksgiving. The history of a process over a thousand days tells you nothing about what is to happen next. This naïve projection of the future from the past can be applied to anything

Adapted from “The Black Swan: the impact of the highly improbable”, by Nassim Nicholas Taleb, Random House, Inc., New York: 2007, pg. 41

Predictions are always difficult, especially about the future. Niels Bohr
The future ain’t what it used to be. Yogi Berra
• But in all my experience, I have never been in any accident… of any sort worth speaking about. I have seen but one vessel in distress in all my years at sea. I never saw a wreck and never have been wrecked nor was I ever in any predicament that threatened to end in disaster of any sort. E.J. Smith, Captain, RMS Titanic, 1907

FIGURE 3: Data & Model, part 1

A series of a seemingly growing bacterial population (or of sales records, or of any variable observed through time – such as the total feeding of the turkey in Chapter 4)

Taken from “The Black Swan: the impact of the highly improbable”, by Nassim Nicholas Taleb, Random House, Inc., New York: 2007, pg. 186

FIGURE 4: Data & Model, part 2
Easy to fit the trend – there is one and only one linear model that fits the data. You can project a continuation into the future

Taken from “The Black Swan: the impact of the highly improbable”, by Nassim Nicholas Taleb, Random House, Inc., New York: 2007, pg. 186

FIGURE 5: Data & Model(s), part 3
We look at a broader scale. Hey, other models also fit it rather well

Taken from “The Black Swan: the impact of the highly improbable”, by Nassim Nicholas Taleb, Random House, Inc., New York: 2007, pg. 187

FIGURE 6: Data & Models, part 4
And the real “generating process” is extremely simple but it had nothing to do with a linear model! Some parts of it appear to be linear and we are fooled by extrapolating in a direct line.*
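The point of Figures 3 to 6 can be reproduced with a small Python sketch: fit a straight line to an early window of data, then extrapolate. The generating process used here (a sine wave) is an assumption chosen only for illustration; it is not claimed to be the process in the book's figure.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def generating_process(x):
    """Hypothetical 'real' process: simple, but not linear."""
    return math.sin(x)


# Observe only an early window, where the process looks linear.
xs = [i * 0.05 for i in range(11)]          # x from 0.0 to 0.5
ys = [generating_process(x) for x in xs]
slope, intercept = fit_line(xs, ys)

# The line fits the observed window very well ...
in_sample_error = max(abs(slope * x + intercept - y) for x, y in zip(xs, ys))

# ... and fails badly once we project into the unseen region.
x_future = 3.0
extrapolated = slope * x_future + intercept
actual = generating_process(x_future)

print(f"max in-sample error: {in_sample_error:.4f}")
print(f"extrapolated: {extrapolated:.2f}, actual: {actual:.2f}")
```

A good in-sample fit says nothing about the region that was never sampled, which is the slide's argument in miniature.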

Taken from “The Black Swan: the impact of the highly improbable”, by Nassim Nicholas Taleb, Random House, Inc., New York: 2007, pg. 187

A Past-time Prediction (is it True?)

[Figure: Projected sales for Losec™ (orange) based on annual sales 1988-1998; y-axis 0 to 18000, x-axis 1988 to 2008]
In 2001, Losec went off-patent. AstraZeneca managed to maintain its patent portfolio until mid-2002 (pediatric use; nice trick!). Generic omeprazole started selling by Q2/02. AstraZeneca replaced Losec with Nexium™. Half the drug, 2x the $

Never Tell Me the Odds!
Han Solo, aboard the Millennium Falcon
• We focus on preselected segments of the seen and generalize from them to the unseen: Error of confirmation (absence of evidence is not evidence of absence)
• We fool ourselves with stories that cater to our “idealized” thirst for patterns: Narrative fallacy (people have selective memory & tend to generalize from single observations)
• We behave as if the Black Swan does not exist (people go on living as if death is the unlikeliest of events. Mors certa, hora incerta)
• What we see is not all there is… The odds are not always evaluated properly: Silent evidence distortion (e.g., people play the Lottery focusing on winners, & forget the odds)
• We “tunnel” on specific Black Swans (e.g., floods, fires) and rarely prepare for the unknown unknowns

Adapted from “The Black Swan: the impact of the highly improbable”, by Nassim Nicholas Taleb, Random House, Inc., New York: 2007, pg. 50

We think we live in Mediocristan… but live in Extremistan

Mediocristan | Extremistan
• Non-Scalable | • Scalable
• One datapoint does not influence the average | • One datapoint can have high impact and great influence
• Tyranny of the collective | • Tyranny of the accidental
• Winner-get-a-slice | • Winner-takes-the-pizza
• Ancestral environment | • Modern environment
• Applies to humans in physical context (weight) | • Applies to humans in social context (money)
• Easy to predict from what you see to what you don’t | • Past information rarely assists in making predictions
• 80/20 principle not obvious | • 80/20 principle rules
• “Bell curve” (Gaussian) | • Power law / Mandelbrotian
• Impervious to Black Swans | • Vulnerable to Black Swans

Adapted from “The Black Swan: the impact of the highly improbable”, by Nassim Nicholas Taleb, Random House, Inc., New York: 2007, pg. 36

We think we live in Mediocristan… but live in Extremistan
Mediocristan | Extremistan
• Solar system during our lifetime (except for comets) | • Solar system in the long run (comets, supernovas, aliens, etc.)
• People’s weight & height | • People’s money & social networks
• People’s 1:1 conversations | • TV & radio show-hosts
• The world before Gutenberg | • The world of printed books
• Your untold stories | • Tom Clancy, JK Rowling
• Your hamburger & coffee | • McDonald’s & Starbucks
• The world before Bell | • The world after the Internet
• Your website | • Google
• Your “usual” scientific journals | • Science, Nature – high-impact
• Your emails | • The blogosphere
• Your local music band | • Pink Floyd
• Your hotdog | • The one with everything

Adapted from “Black Swans & White Tablets”, by T.I.O. & A.L.H. (the point is: once you get it, you got it)
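The Mediocristan/Extremistan contrast (one datapoint does not influence the total vs. one datapoint can dominate it) can be illustrated with a short simulation. This is a sketch under stated assumptions: a Gaussian stands in for a weight/height-like quantity, a Pareto distribution with shape 1.1 stands in for a money-like quantity, and all parameters are arbitrary illustrative choices.

```python
import random


def largest_share(values):
    """Fraction of the total contributed by the single largest datapoint."""
    return max(values) / sum(values)


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 10_000

# Mediocristan: bell-curve quantity (e.g. height in cm; parameters made up).
mediocristan = [rng.gauss(170.0, 10.0) for _ in range(n)]

# Extremistan: heavy-tailed, power-law quantity (Pareto, shape 1.1; made up).
extremistan = [rng.paretovariate(1.1) for _ in range(n)]

print(f"largest share, Gaussian sample: {largest_share(mediocristan):.6f}")
print(f"largest share, Pareto sample:   {largest_share(extremistan):.6f}")
```

In the Gaussian sample the largest observation is a negligible part of the total; in the power-law sample a single observation can account for a visible fraction of it, which is why averages and past data are so much less informative there.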

Failed Predictions, Rationalized Post-Hoc
• Post hoc ergo propter hoc is a logical fallacy that permeates science: the appearance of correlation is often thought to imply causality (just because the R-square is high does not mean that variables X & Y share a causal relationship)
• We blame outliers (chemical; biological; statistical), we blame the descriptor system (or the length of the simulation time), sometimes we blame the experiments, we forgot to take water or hydrophobics into account; even worse, we delete the data when it does not fit the model!
• We rarely query the outliers. That wealth of data, the unexpected, which we’re unfortunately trained to ignore, has given the world many “serendipitously” discovered medicines such as Penicillin and Viagra.

The Impact of the Highly Improbable
• Recall the financial outlook that Pfizer’s CEO Jeffrey Kindler presented at an analysts’ meeting 2 days before clinical data forced them to stop the Torcetrapib ILLUMINATE trial (2006).
• “Pfizer, Pfizer, Pfizer. Depending on your point of view, it's ironic, inspiring, or merely interesting that the company that staggered out of 2006 with its every vulnerability and vanity exposed in the media glare nonetheless finishes in Pharm Exec's winner's circle for the eighth year running.”
Pfizer ranked #1, with $45.08 Billion USD sales in 2006
• Imagine you worked at Merck on September 26, 2004, benefiting from the $2.5 billion/year sales of Vioxx. On September 30, Merck announced a voluntary worldwide withdrawal of Rofecoxib (Vioxx) following results from the 3-year APPROVe trial, which showed an increased risk of cardiovascular events such as heart attack and stroke beginning after 18 months of treatment. Can you predict chronic (ab)use in clinical trials?!
Merck ranked #7, with $22.64 Billion USD sales in 2006
• Bayer voluntarily withdrew Baycol (cerivastatin) on August 7, 2001, following reports of potentially fatal side-effects (myopathy and rhabdomyolysis), particularly in patients co-treated with Gemfibrozil.
Bayer ranked #15, with $9.87 Billion USD sales in 2006

What do these Drugs have in Common?

[Figure: chemical structures of six drugs]
• Ximelagatran (Exanta) - AstraZeneca
• Rofecoxib (Vioxx) - Merck
• Cerivastatin (Baycol) - Bayer
• Dexfenfluramine (Redux) - Servier
• Bromfenac (Duract) - Wyeth
• Troglitazone (Rezulin) - Sankyo
All of them were withdrawn globally in the past decade

Market Activity Cliffs

Drug name (country first launched), generic name | Company | Period launched | Reason for withdrawal | Therapeutic category | Total sales 1996-2006
DURACT (USA), Bromfenac | Wyeth | 1997-1998 | following several cases of severe liver failure in patients taking bromfenac for >10 days | anti-inflammatory; analgesic | 169.76 million USD
ADIFAX (France); REDUX (USA), Dexfenfluramine | Servier | 1970-1997 | following a US FDA recommendation due to abnormalities in heart valves | anorexic, for the treatment of obesity | 346.27 million USD
VAXAR (Germany and Denmark), Grepafloxacin | Otsuka | 1998-1999 | following reports of severe cardiovascular events | antibacterial, quinolone | 47.236 million USD
VIOXX (several countries), Rofecoxib | Merck & Co | 1999-2004 | due to increased risk of cardiovascular events such as heart attack and stroke beginning after 18 months of treatment | anti-inflammatory; analgesic; COX-2 inhibitor | 13,319.98 million USD
[The original table also shows the chemical structure of each active compound]

What do these Drugs have in Common?

[Figure: chemical structures of six drugs]
• Omeprazole (Losec) - AstraZeneca
• Imatinib (Gleevec) - Novartis
• Atorvastatin (Lipitor) - Pfizer
• Sildenafil (Viagra) - Pfizer
• Cimetidine (Tagamet) - GSK
• Penicillin G (generic)
All of them were positive Black Swans

Predictions And Drug Discovery
• One cannot travel forward in time with speeds > 1 sec/sec (did I miss anything?)
• While informatics (machine learning) specialists assume that the chemical space related to drug discovery belongs to Mediocristan, in fact it belongs to Extremistan because of the highly hierarchical structure of that space (winner-takes-it-all)
• Hint: The definition of a drug (an unknowable quantity) is done by a regulatory body based on available evidence submitted by a company
Drug discovery as a business is highly impacted by the social aspect: the industry (people) decides to petition the regulatory agencies for an NDA, and the agency (people) VOTES on that drug's safety based on filed data
• “Drug” is not a natural property of chemicals
• Do not attach too much significance to “drug-likeness”

Take Home Message 3
• Predictions Are Based on the Past
• What follows is a highly personal opinion:
  – The known chemical and biological space can be mapped.
  – This implies that, within limits and given appropriate descriptors, we can generate (wrong) models that are useful in small increments
  – These models do not allow us to understand what Gerry Maggiora calls “affinity cliffs” – the “turkey surprise” in medicinal chemistry space
  – There needs to be a balanced choice between good coverage of chemical space (“diversity”) and biological space (“targets”) in order to build reliable models
  – As reliable as these models may be, within the constraints of the already-mapped (i.e., the past) chemical space,
  – … no machine learning method can predict the previously-unmapped chemical and biological space [the unknown unknown]
• We keep pretending that Machine Learning / QSAR works
• We are biased because most scientific papers we read are success stories! We are left to our own devices when it comes to failures – we have to learn the hard way (experience!)