Chemogenomics:Chemogenomics 19/4/07 16:30 Page 57

Genomics

CHEMOGENOMICS: a gene family approach to parallel drug discovery

Currently available drugs target only around 500 different proteins [4]. Recent reports from efforts to sequence the human genome suggest there are tens of thousands of genes [1,2] and many more different proteins. Popular estimates of the number of ‘new’ drug targets that will emerge from genomic research range from 2,000 to 5,000 [3]. A critical question as we enter the post-genomic world is: how can the pharmaceutical industry rapidly discover and develop medicines for these new targets to improve the human condition?

By Dr Paul R. Caron, Dr Michael D. Mullican, Dr Robert D. Mashal, Dr Keith P. Wilson, Dr Michael S. Su and Dr Mark A. Murcko

In the pharmaceutical industry to date, research and early development activities have typically been organised according to therapeutic area. In organising their drug discovery efforts in this way, companies have sought to create efficiency by building a critical mass of expertise and experience in the biology of related diseases. Over the past 20-30 years this organisational approach has proved successful for many companies. While there is no doubt that this strategy produces some synergies in early stage research, greater efficiency in late-stage clinical development and marketing is the main driver for the organisation of research and development resources along therapeutic area lines.

Pharmaceutical companies have also traditionally tackled one protein target at a time in drug discovery. Over the years, increasingly sophisticated technologies and approaches have increased the efficiency of drug discovery based on single targets at a pace sufficient to keep the pipelines of many major companies well-stocked with promising development candidates. The development and application of high-throughput chemical synthesis and in vitro biological screening, for example, as well as new computational methods applied to QSAR, structure-based design and informatics, have accelerated the process [4].

Dramatically new and different drug discovery approaches, however, are needed to take full advantage of the massive influx of targets being elucidated through genomic research. Simply stated, a therapeutic area focus and a single target drug discovery approach do not create enough efficiency to allow companies to keep pace with the massive inflows of new target information. An ideal solution would be to accelerate drug discovery by processing multiple related targets in parallel, reusing information and know-how across targets in a way that allows chemistry to be broadly leveraged. Drug discovery approaches that focus on structurally similar protein families, and thus leverage the way in which particular classes of chemical compounds will interact with targets within the same family, may be just such an ideal solution.

Just as the fields of genomics and proteomics are broadly characterised as the identification and classification of all the genes and proteins in a genome, the field of chemogenomics may be characterised as the discovery and description of all possible drugs to all possible drug targets [5].

Drug Discovery World Fall 2001


Figure 1 Scaffold morphing and target hopping are two key concepts in chemogenomics. Scaffold morphing is the generation of multiple, chemically distinct lead classes (‘scaffolds’) against any particular target. Target hopping is the ability of compounds from the same lead classes to interact with multiple targets – in effect, to be ‘reused’. Importantly, while the scaffold class is reusable, different specific compounds from the same scaffold class will be optimal for different targets in the family
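The two concepts in Figure 1 can be read as two views of the same screening table. The sketch below is only an illustration under stated assumptions: the scaffold names, target names, IC50 values and the 1,000 nM activity cutoff are all invented, and the helper functions `targets_hit` and `scaffolds_per_target` are hypothetical names, not part of any method described in this article.

```python
# Hypothetical screening results: scaffold class -> {target: best IC50 in nM}.
# All names and values are invented for illustration only.
SCREEN = {
    "scaffold_A": {"target_1": 40.0, "target_2": 900.0, "target_3": 15000.0},
    "scaffold_B": {"target_1": 12000.0, "target_2": 75.0, "target_3": 300.0},
    "scaffold_C": {"target_1": 250.0, "target_2": 20000.0, "target_3": 60.0},
}

ACTIVE_BELOW_NM = 1000.0  # assumed activity cutoff, not taken from the article


def targets_hit(screen, cutoff=ACTIVE_BELOW_NM):
    """Target hopping view: for each scaffold, the targets it is active against."""
    return {
        scaffold: sorted(t for t, ic50 in results.items() if ic50 < cutoff)
        for scaffold, results in screen.items()
    }


def scaffolds_per_target(screen, cutoff=ACTIVE_BELOW_NM):
    """Scaffold morphing view: for each target, the distinct active scaffolds."""
    by_target = {}
    for scaffold, results in screen.items():
        for target, ic50 in results.items():
            if ic50 < cutoff:
                by_target.setdefault(target, []).append(scaffold)
    return {target: sorted(names) for target, names in by_target.items()}


if __name__ == "__main__":
    print(targets_hit(SCREEN))
    print(scaffolds_per_target(SCREEN))
```

A scaffold that appears against several targets in the first view is a candidate for reuse, while a target with several scaffolds in the second view has more than one chemically distinct starting point. As the caption notes, the best individual compound within a scaffold class would still differ from target to target, which this table of per-scaffold best values does not capture.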

Analogous to genomics and proteomics, success in chemogenomics will require not only highly integrated technology and computational advances, but will necessitate fundamental changes in the pharmaceutical drug discovery process. Any significant progress towards this goal could generate a formidable package of patentable drug molecules.

Chemogenomics is distinct from chemical genetics. Chemical genetics (sometimes called ‘reverse chemical genetics’) entails the use of defined chemical probes to help understand biological targets and pathways. The fundamental premise is that chemical probes, if of sufficient potency and selectivity in cellular or animal models, can be used to help understand and to prioritise those targets of the greatest therapeutic relevance. Thus chemical genetics as currently described in the literature is essentially a ‘target validation’ technology. Chemogenomics, on the other hand, is principally a ‘chemical’ technology which aims to produce new chemical entities (NCEs), that is clinical development candidates, as efficiently as possible. The molecules which derive from the chemogenomics approach can of course be used as chemical probes in a ‘target validation’ sense as well.

Organising research by gene family

Industrialising parts of the drug discovery process, by incorporating parallel processing, miniaturisation and robotics, has helped to increase the efficiency of drug discovery in important ways. The requirement for practical expertise in many parts of the drug discovery process, however, suggests that there is a limit to the efficiency that will be created by automation.

A major potential source of efficiency in any process lies in the reuse of information and know-how. A gene family approach to drug discovery seeks to exploit this efficiency to its maximum. Targets within a gene family (defined by homology at the protein sequence level) will often have very similar in vitro assays and properties, providing some leverage of biology resources. In addition, a significant percentage of compounds designed and synthesised against one family member will be active against other family members, which can allow drug discovery on multiple targets to have a common starting point. In addition to creating efficiency, reuse of chemical and biological information may produce intellectual property that is transferable among related targets. Some of these concepts have been touched upon by other groups [7-11].

Based on our experience with employing structural biology and modelling approaches together with combinatorial and medicinal chemistry, we have found that it is possible to design multiple classes of compounds to inhibit each target within a gene family. We refer to this as scaffold morphing.



Figure 2 Sequence homology is often a good predictor of three-dimensional structural homology. On the left panel is the crystal structure of caspase-1 (ICE) colour coded by the sequence homology of a set of 10 different caspases. Blue = highly conserved sequences, white = intermediate homology, and red = low homology. On the right panel is the crystal structure of caspase-1 colour coded by the three-dimensional structural variation in the C-α positions taken from a superposition of five different caspases. Blue = highly conserved C-α positions, white = intermediate and red = low structural conservation
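The sequence colouring in the left panel of Figure 2 rests on a per-position conservation score across an aligned family. The sketch below shows one simple way such a score could be computed; it is an illustration only. The alignment is a toy, invented set of strings (not real caspase sequences), the score used here (fraction of sequences sharing the most common residue) and the two colour cutoffs are assumptions, and alignment gaps are not treated specially.

```python
from collections import Counter

# Toy, invented alignment of equal-length sequences; NOT real caspase data.
ALIGNMENT = [
    "QACRGTE",
    "QACRGDS",
    "QGCRGKE",
    "QACHGTA",
    "QSCRGLE",
]


def column_conservation(alignment):
    """Fraction of sequences sharing the most common residue in each column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


def colour_bin(score, high=0.8, low=0.5):
    """Map a score to the caption's three colour bins (cutoffs are assumed)."""
    if score >= high:
        return "blue"   # highly conserved
    if score >= low:
        return "white"  # intermediate homology
    return "red"        # low homology


if __name__ == "__main__":
    for position, score in enumerate(column_conservation(ALIGNMENT), start=1):
        print(position, round(score, 2), colour_bin(score))
```

In practice the per-position labels would be mapped on to the residues of a reference structure, as the figure does with caspase-1; that mapping step is outside the scope of this sketch.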

Once created, the molecular libraries may be screened against multiple targets within the family, and the breadth of activity of each active chemical scaffold may be rapidly explored. This ‘compound reuse’ strategy is sometimes called target hopping. The combination of ‘morphing and hopping’ is essential for the rapid generation of multiple development candidates against multiple targets within a family (Figure 1).

The need for therapeutic area knowledge

A central tenet of the chemogenomic approach is that efficiencies will result from reuse of information and compounds, driven by the overlap between chemical space and the active sites of the protein family members. Maximal increases in efficiency and productivity can only occur when this knowledge is concentrated in a single discovery unit which is organised along areas of ‘chemogenomic space’. We believe it is possible to have this organisational structure from project initiation through first-in-man studies.

A distinct advantage of this discovery structure arises for targets with multiple potential therapeutic indications. Often, the optimal compound characteristics (intravenous versus oral dosing; formulation; specificity) will differ across potential therapeutic indications. During lead optimisation, differing compound characteristics across multiple therapeutic indications can be explored fully, whereas in a traditional pharmaceutical organisation discovery efforts are usually confined to only a single therapeutic indication. While the therapeutic area organisational structure is sub-optimal for a chemogenomic approach to discovery, there remains a clear rationale for organising late stage development (phases II-IV) and commercial operations along therapeutic lines.

How genomic data can drive drug discovery

Having complete knowledge of all the members of a given gene family provides a new perspective on the drug discovery process. The completed genome sequence enables the identification and classification of all members of a gene family into subfamilies based on a number of criteria: overall sequence homology, domain structure, and/or transcriptional regulation. This genome-wide perspective distinguishes chemogenomics from traditional gene family research.

The three-dimensional structures of approximately 15,000 proteins are now publicly available and there is a recent surge of interest in the public and private sectors in actively obtaining representative structures for novel proteins. The combination of this raw data and refined homology modelling tools is now enabling the structures of a large number of pharmaceutically relevant protein targets to be predicted, as well as the shapes and physical properties of potential binding sites. The ability to map genomic data on to protein structures provides the framework linking three billion As, Cs, Gs and Ts to drug design chemistry.

The availability of complete gene sequence information for an entire target family, together with a representative subset of protein structures, allows one to build three-dimensional models for the entire protein family and to map the interactions between a given substrate or inhibitor and specific residues in the target even when detailed structural data is unavailable. This provides a firm understanding of the subset of residues which provide key inhibitor interactions, and enables the prediction of inhibitor specificity. For example, inhibitors which make strong interactions with unique or ‘rare’ residues are likely to demonstrate more target specificity. We and others have demonstrated that single amino acid changes are sufficient to generate specificity in protein kinases [12-15].

It must be appreciated that high-resolution structural data is not required in the chemogenomics strategy. Naturally, the ability to map protein sequence on to inhibitor-target co-complex structures provides a fundamental link between the genomic sequence information and the medicinal chemistry required for drug design. However, at the amino acid level, it is possible to utilise various protein folding prediction methods and mutagenesis data to build respectable models of most proteins. These models would be sufficient to provide guidance for chemistry both with respect to potency and specificity. The recent publication of the structure of a mammalian GPCR and the increasing number of publications on membrane-bound proteins [18-20] suggest that sufficient structural information to construct such models will be available.

Chemogenomics: the caspase example

An example of an interesting gene family is the caspases, cysteine proteases with specificity for cleavage after aspartyl residues. Interleukin-1β converting enzyme (ICE) has been shown to be essential for cytokine processing and is currently being pursued as a drug target [25]. There are also roughly a dozen caspases with sequence homology to ICE, and while the exact function of all of these is not known, it is clear that some of these have an important role in regulation of apoptosis [26]. Structural insights through x-ray crystallography facilitated the rapid identification of selective inhibitors of these other potential drug targets. Experience with ICE has also been applied to the other caspase family members: expression, purification, assay development, crystallisation, and structure determination of these homologs [27].

The x-ray crystal structures of the caspases also nicely highlight the way in which sequence conservation can be a good predictor of three-dimensional structural conservation (Figure 2). This is important for the chemogenomics approach because the combination of structural and sequence information enables rapid drug design progress within families even without having the high-resolution structures of every target of interest within those families.

Measures of success

The success of a chemogenomics approach should be quantifiable using a number of parameters, such as the number of patents, the number of compounds synthesised to get to the clinic, and ultimately increased numbers of approved drugs and sales. It is expected that patent applications generated with a chemogenomic background would be able to support claims which cover a broader range of chemical space than average and would be able to include a comprehensive list of defined molecular targets and indications. Increased efficiency in identifying potent chemical leads with ‘drug-like’ properties should accelerate the process of driving these leads into clinical development candidates with the desired pharmacologic and pharmacokinetic parameters without compromising quality. As the first of the candidate molecules generated using this integrated approach enter the clinic in the near future, the ability of chemogenomics to increase overall productivity in the pharmaceutical industry will start to be apparent in the number of new molecular entities approved by the FDA within five years.

Summary

The chemogenomics approach, where gene sequence information is combined with protein structure and/or models to link to chemical inhibitors, is designed to fully utilise the sequence information by considering large families of gene targets at once. This highly parallel approach depends on well-established methods such as combinatorial chemistry, high-throughput screening, computational chemistry, structural biology and [...], all of which drive the efficient re-use of information, reagents, methods and know-how as research teams move from one group of targets to the next.

It is still an open question whether a gene family focus is more efficient than a ‘traditional’ approach. However, early results from our research into the protein kinases and the caspases support our opinion that a gene family approach can provide a more efficient process for generating late-stage development candidates. Clinical and marketing expertise in specific therapeutic areas is also of great importance to the success of any new drug, and traditionally pharmaceutical companies have organised their R&D efforts to capture the advantages of this expertise. It is essential that gene family initiatives fully utilise such specialised knowledge in particular therapeutic areas while at the same time not limiting the pursuit of targets in other therapeutic areas. These particular challenges, while complex, are not insurmountable with foresight and skilful planning.

Acknowledgement

We would like to thank Michael Partridge for his help with revising this manuscript.

Paul Caron is the Director of Informatics and leads the kinase target selection group at Vertex. Paul holds a PhD from Johns Hopkins University in biochemistry and was a post-doctoral fellow at Harvard. He joined Vertex in 1994.

Michael Su is the Senior Research Fellow and Head of Biology, and co-project head of the Vertex kinase programme. Michael holds a PhD from Duke University in molecular biology and was a post-doctoral fellow at Harvard. He joined Vertex in 1990.

Keith Wilson holds the title of Senior Research Fellow and Head of [...]. Along with Dr Su, Keith is co-project head of the Vertex kinase programme. Dr Wilson received his PhD in structural biology from the University of Oregon and joined Vertex in 1992.

Robert Mashal holds the title of Program Executive for Multi-Drug Resistance and is responsible for clinical oncology for kinase inhibitors. Robert received his MD degree from Johns Hopkins University and was an instructor at the Harvard Medical School. Prior to joining Vertex in 1998, Dr Mashal worked at McKinsey & Company.

Michael Mullican is a Principal Investigator in the Department of Medicinal Chemistry and co-ordinates all chemistry activities on the Vertex kinase programme. Mike received his PhD in organic chemistry from the University of Kansas. Prior to joining Vertex in 1992, Dr Mullican worked at Parke-Davis Pharmaceuticals in Ann Arbor, Michigan.

Mark Murcko holds the position of Vice-President and Chief Technology Officer, and also chairs the Vertex Scientific Advisory Board. Mark received his PhD in organic chemistry from Yale. Prior to joining Vertex in 1990, Mark worked at Merck Sharp & Dohme in West Point, Pennsylvania.

References

1 Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, et al. Initial sequencing and analysis of the human genome. Nature 2001, 409:860-921.
2 Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. The sequence of the human genome. Science 2001, 291:1304-1351.
3 Drews J. In Human Disease – from Genetic Causes to Biochemical Effects. Edited by Drews J, Ryser S. Blackwell; 1997:5-9.
4 Drews J, Ryser S. The role of innovation in drug development. Nat Biotechnol 1997, 15:1318-1319.
5 Caron PR, Mullican MD, Mashal RD, Wilson KP, Su MS, Murcko MA. Chemogenomic approaches to drug discovery. Curr Opin Chem Biol 2001, 5:464-470.
6 Schreiber SL. Chemical genetics resulting from a passion for synthetic organic chemistry. Bioorg Med Chem 1998, 6:1127-1152.
7 Ohlstein EH, Ruffolo RR, Elliott JD. Drug discovery in the next millennium. Annu Rev Pharmacol Toxicol 2000, 40:177-191.
8 Lehman J, Baxter A, Brown D, Connolly P, Geysin M, Hayes M, Howard R, Knowles J, Lee M, Lyall A, et al. Systematization of Research. Nature 1996, 384 Supp 7:5.
9 Frye SV. Structure-activity relationship homology (SARAH): a conceptual framework for drug discovery in the genomic era. Chem Biol 1999, 6:R3-7.
10 Thorpe DS. Forecasting roles of combinatorial chemistry in the age of genomically derived drug discovery targets. Comb Chem High Throughput Screen 2000, 3:421-436.
11 Debouck C, Metcalf B. The impact of genomics on drug discovery. Annu Rev Pharmacol Toxicol 2000, 40:193-207.
12 Wilson KP, McCaffrey PG, Hsiao K, Pazhanisamy S, Galullo V, Bemis GW, Fitzgibbon MJ, Caron PR, Murcko MA, Su MS. The structural basis for the specificity of pyridinylimidazole inhibitors of p38 MAP kinase. Chem Biol 1997, 4:423-431.
13 Lisnock J, Tebben A, Frantz B, O’Neill EA, Croft G, O’Keefe SJ, Li B, Hacker C, de Laszlo S, Smith A, et al. Molecular basis for p38 protein kinase inhibitor specificity. Biochemistry 1998, 37:16573-16581.
14 Fox T, Coll JT, Xie X, Ford PJ, Germann UA, Porter MD, Pazhanisamy S, Fleming MA, Galullo V, Su MS, et al. A single amino acid substitution makes ERK2 susceptible to pyridinyl imidazole inhibitors of p38 MAP kinase. Protein Sci 1998, 7:2249-2255.
15 Gum RJ, McLaughlin MM, Kumar S, Wang Z, Bower MJ, Lee JC, Adams JL, Livi GP, Goldsmith EJ, Young PR. Acquisition of sensitivity of stress-activated protein kinases to the p38 inhibitor, SB 203580, by alteration of one or more amino acids within the ATP binding pocket. J Biol Chem 1998, 273:15605-15610.
16 Doyle DA, Morais Cabral J, Pfuetzner RA, Kuo A, Gulbis JM, Cohen SL, Chait BT, MacKinnon R. The structure of the potassium channel: molecular basis of K+ conduction and selectivity. Science 1998, 280:69-77.
17 Palczewski K, Kumasaka T, Hori T, Behnke CA, Motoshima H, Fox BA, Le Trong I, Teller DC, Okada T, Stenkamp RE, et al. Crystal structure of rhodopsin: A G protein-coupled receptor. Science 2000, 289:739-745.
18 Sukharev S, Betanzos M, Chiang CS, Guy HR. The gating mechanism of the large mechanosensitive channel MscL. Nature 2001, 409:720-724.
19 Dinarello CA. Interleukin-1 beta, interleukin-18, and the interleukin-1 beta converting enzyme. Ann N Y Acad Sci 1998, 856:1-11.
20 Marks N, Berg MJ. Recent advances on neuronal caspases in development and neurodegeneration. Neurochem Int 1999, 35:195-220.
21 Wei Y, Fox T, Chambers SP, Sintchak J, Coll JT, Golec JM, Swenson L, Wilson KP, Charifson PS. The structures of caspases-1, -3, -7 and -8 reveal the basis for substrate and inhibitor selectivity. Chem Biol 2000, 7:423-432.
