
Informatics

predictive chemoinformatics applications to the

While significant advances in chemoinformatics present tremendous opportunities to improve human health, the future of chemoinformatics in the pharmaceutical industry is not without significant challenges.

By Dr Leslie J. Browne and Laurie L. Taylor

Drug Discovery World, Fall 2002

Chemoinformatics is the result of collective advances in […], biology, computer sciences and statistics, and refers to the electronic tools, methods and data used for analysis and predictive computation of drug effects on complex biological processes (Figure 1). Scientific milestones over the last 50 years that have contributed to the evolution of predictive chemoinformatics include the development of the DNA double helix model by Watson and Crick (1953); sequencing of the first protein, bovine insulin, by Sanger (1955); protein crystallography by Perutz (1954); the first integrated circuit by Kilby at Texas Instruments (1958); recombinant DNA technology by Berg et al (1972); conception of the Internet by Cerf and Kahn (1974); development of 2-D gel electrophoresis (1975); identification of protein structure by NMR by Wüthrich (1980); creation of the first personal computers by IBM (1981); polymerase chain reaction technology by Mullis et al (1985); creation of the SWISS-PROT database (1986); the founding of the NCBI (1988); creation of BLAST by Altschul (1990); development of WWW protocols by CERN (1991); identification and significance of ESTs by Venter (1991); sequencing of the entire genomes of H. influenzae, S. cerevisiae (12Mb) and E. coli (1995-1996); D. melanogaster (180Mb) in 2000; and the human genome (3,000Mbp) in 2001.

Figure 1: Milestones in predictive chemoinformatics (timeline of the milestones listed above, 1953 to 2001).

Advances in chemoinformatics related to […], and computer-assisted chemical modelling hold tremendous promise to improve human health. For pharmaceutical research and development, chemoinformatics provides the tools to compare expression of genes and proteins, as well as complex signalling processes, in diseased and normal tissues, and impacts concretely on selection of therapeutic targets (Figure 2). Differential gene and protein expression profiles related to a disease state (eg cancer) promise to help fine-tune diagnoses and improve the accuracy of prognostic indicators to best serve individual patient needs. […] can now design compounds with improved drug-like qualities through computerised structure/activity modelling, often reducing the number of compounds tested compared with conventional trial-and-error methods. Drugs themselves affect expression of a wide variety of genes and proteins, and individual patient responses to drugs differ in metabolism and toxicity.

Figure 2: Cellular checkpoints for therapeutic intervention. Panel title: Genes and drug response. Labels: DNA, nucleus, cell membrane, drug targets, DNA bases, gene, mRNA, ribosome, chain of amino acids, altered protein; DNA variants and protein variants give variable responses (efficacy vs toxicity).

Pharmaceutical companies are highly motivated to reduce the discovery-to-market time and cost. Increased R&D dollars dedicated to the business of discovering new therapeutics have not resulted in a correspondingly increased number of successful drugs on the market. The pre-market failure rate of drug candidates has been measured and remeasured from varying perspectives, but always leads to the unavoidable conclusion that the process is inefficient. More than 50% of failures are due to lack of efficacy or unexpected animal toxicity (Figure 3). It now costs an average of $800 million to bring a new product to market¹. This includes, of course, the cost of the numerous failures and their consumption of R&D dollars, a cost that is passed on to the consumer. Since the failure rate is so high (about 1 in 10 drug candidates survives from initiation of clinical evaluation to market launch; Figure 4), even a modest improvement to 1 in 5 halves the development cost. Many drug failures are the result of ‘off target’ activity, ie poor side effect profiles that offset the potential therapeutic effect. Structure-based design and structure-activity data of existing bioactive compounds facilitate the design of new compounds with the critical ‘drug-like’ qualities, in addition to potency and efficacy at the therapeutic target: a necessity for successful pre-clinical and clinical development. The ability to project the in vitro effects of a candidate drug into predictive models of broader in vivo systemic effects earlier in the discovery process will benefit the industry by reducing failure rates, the developer by reducing costs and the consumer by helping get better drugs to the market.

Figure 3: Causes of drug failure. Efficacy 46%; animal toxicity 17%; adverse effects 16%; pharmacokinetics 7%; commercial 7%; miscellaneous 7%.

Despite the expense and time committed to drug development, approved drugs have frequently been withdrawn from the market due to severe adverse drug reactions (ADR). Between October 1997 and September 1998, a number of FDA-approved drugs were withdrawn, but not before being prescribed to 20 million patients in the US alone². Importantly, the side effects that resulted in the ADR might have been measured and potentially designed out of the drug candidates had there been a means of identifying in advance the full spectrum of their potential side effects. While additional pre-market animal and human evaluation might decrease the number of drugs withdrawn from the market, the additional cost would be significant. In contrast, new chemoinformatics tools can be used to identify potential liabilities and benefits much earlier in the discovery process. Identifying and eliminating likely failures earlier permits efforts to be focused on higher quality compounds, resulting in more efficacious drugs produced at lower overall cost.
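The remark that improving survival from 1 in 10 to 1 in 5 halves the development cost can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes every clinical candidate costs the same whether it succeeds or fails, and the $80 million per-candidate figure is hypothetical, chosen so that the 1-in-10 case reproduces the $800 million average quoted in the text. The function name `cost_per_launch` is likewise invented for this example.

```python
from fractions import Fraction


def cost_per_launch(cost_per_candidate, success_rate):
    """Expected spend per marketed drug, with the cost of failures included.

    Assumes each clinical candidate costs the same regardless of outcome,
    so on average 1 / success_rate candidates are funded per launch.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_candidate / success_rate


# Hypothetical per-candidate cost in millions of dollars (an assumption,
# not a figure from the article).
COST_PER_CANDIDATE = Fraction(80)

baseline = cost_per_launch(COST_PER_CANDIDATE, Fraction(1, 10))
improved = cost_per_launch(COST_PER_CANDIDATE, Fraction(1, 5))

print(f"1 in 10 survive: ${baseline}M per launch")
print(f"1 in 5 survive:  ${improved}M per launch")
print(f"improved / baseline = {improved / baseline}")
```

Under this simple model the ratio does not depend on the per-candidate cost: doubling the survival rate halves the expected cost per launch whatever value is assumed.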

Chemogenomics applied to the discovery of new therapeutic agents

Overview: While the physiological response of animals to drug treatment is the mainstay of efficacy and safety evaluation for drug development, the nature of conventional pre-clinical evaluation methods means that only a few important physiological parameters can be assessed at a time. The new option provided by genomics and proteomics is to assess broadly the effect of a compound on the system as a whole by looking at the […] and the proteome. As the tools are developed, it will be possible to look not only at mRNA in high throughput but also at the resultant individual protein, its conformation, its phosphorylation state, etc, to get the fullest possible picture of what is happening at the molecular level in response to compound treatment. Chemogenomics, or ‘pharmacology with genomics tools’, combines the strengths of traditional pharmacology and the mechanistic approach to drug discovery. Since an intact biological system is the focus of the evaluation, it is contextually information-rich. The effects of a compound are examined in the context of other biological processes it affects in addition to the target for which it was designed. For example, this approach allows for compensatory and regulatory mechanisms to influence the phenotypic outcome, as measured by the genomic response of the system. Furthermore, since the analysis views all, or at least a large proportion of, induced genomic changes within an organism, an improved understanding of the breadth of compound action on target-related genes, as well as unrelated genes, is possible.

Figure 4: Better efficacy and toxicology predictions will reduce attrition. Fraction of programmes surviving (log scale, 1 to 0.001) at each stage: validated idea, lead compound, candidate, IND/Phase I, Phase II, Phase III, NDA, third year on market. Pie charts give the share of failures due to ADME, toxicity and efficacy in the discovery, pre-clinical and clinical phases.

While the immediate promise of chemogenomics is to increase the efficiency of drug discovery and development by eliminating failures early, it also offers the potential to improve drug quality by treating disease pathophysiology rather than symptoms. The use of gene expression profiles involving multiple genes, whose misregulation has been implicated in a disease state, represents a novel, although unproven, approach to drug discovery. Since many of the most important unmet medical needs are polygenic diseases, in which several genes contribute to the disease in a complex way, a drug discovery approach that identifies modulators of multiple genes in concert has the potential to uncover treatments for the underlying cause of the disease.

Chemogenomics, then, is the interaction of chemical compounds and living systems in terms of the induced genomic response. For example, instead of examining only a few changes in mRNA expression in a single experiment, an entire transcriptional state (10,000 or more changes in mRNA levels) of an organism may be analysed using microarray chips. The challenge is to interpret what these changes mean and how to use the information effectively to make key drug discovery decisions. One approach is to characterise the effects of existing, well-understood drugs in chemogenomic terms and translate the knowledge to prediction and interpretation of the effects of new drug candidates.

Applications: The key to the successful application of chemogenomics is interpreting the enormous amount of information obtained from each experiment. Although it is seductive to try to analyse the genomic profile of individual compounds, understanding the biology of classes of well-known compounds in genomic terms offers an improved platform on which to base understanding of the profiles of new drug candidates.

Chemogenomic profiles: Diverse ways of extracting information from chemogenomic data are being developed using a variety of statistical and computational approaches. Certainly one approach to achieving this objective is to collect the genomic and pharmacological response of the target tissue or cell type to treatment with a chemical compound. Each compound profile is the compound’s own signature of transcriptional and molecular pharmacological effects (Figure 5). While this has utility, extending the analysis to look at compound families, eg related by a common therapeutic use, mechanism or structural similarity, has even more value since it provides a means of extracting the biomarkers associated with a class effect, eg the therapeutic signature, as well as compound-specific effects, eg side-effect signatures. The total activity profile of a compound comprises multiple signatures representing its structure, on- and off-target mechanistic effects, side effects and therapeutic effects. While such profiles clearly exist, the challenge is how to identify them and make use of them in making decisions that improve the quality of drug discovery and development. The principal challenge for chemo- and […] is to develop computational methods capable of deciphering information contained in chemogenomic profiles and effectively displaying the results for more effective ‘next-step’ decisions in drug candidate selection and development.

Figure 5: Gene expression class signatures. Axes: compounds and signatures. Signatures: DNA crosslinking, statin, PPARα, NSAID and hepatotox signatures. Compounds: Sulindac, Busulfan, Cisplatin, Ibuprofen, Naproxen, Clofibrate, Lovastatin, Fenofibrate, Dicumarol, Fluvastatin, Diclofenac, Bezafibrate, Simvastatin, Gemfibrozil, Carboplatin, Atorvastatin, Progesterone, Indomethacin, Clofibric Acid, Beta Estradiol, Norethindrone, Ethinylestradiol, Ethylene Glycol, Diethylstilbestrol, Carbon Tetrachloride, 1 Naphthyl Isothiocya…, Bis 2 Ethylhexyl Phtha…

Toxicogenomics: A series of compounds known to cause a particular toxicity is employed in an in vivo or in vitro experiment to induce genomic changes, eg transcriptional, to derive the genomic profile. The resulting set of biomarkers, or signature, reflects genomic changes that represent the compound-induced phenotype. The signature contained in the chemogenomic profile of a drug candidate may indicate that the candidate possesses similar properties to compound classes from whose profiles a specific class signature was derived. When the signature is derived from known toxicants, it may be useful in predictive toxicology. This type of application, known as toxicogenomics (toxicology with genomics tools), is becoming a generally accepted approach in the pharmaceutical industry to identifying compounds with potential safety problems before they are evaluated in costly regulatory toxicology studies³⁻⁵.

To date, investigations in toxicogenomics frequently involve in vivo studies in rats since this species is commonly employed as the primary model by the pharmaceutical industry. The presence of biomarkers of toxicity can alert investigators to potential overt toxicity, such as necrosis or organ pathology. By comparing an investigational compound’s chemogenomic profile with known signatures of toxicity, it is possible to:

● Match drug candidate profiles against known toxicity profiles.
● Compare compounds by degree of toxicity.
● Anticipate compound-induced pathology.
● Elucidate the mechanism of toxicity in target organs.

In vivo and in vitro gene expression: Gene expression profiles can be measured in both in vivo and in vitro experiments. The in vivo approach has the advantage for […] that the pathological outcome can be measured in the intact animal and correlated to the genomic response. The principal disadvantages are the cost of whole animal experiments and the requirement for relatively large quantities of compound.

In vitro systems include whole organs, tissue slices, primary cells, and conditionally immortalised and immortalised cell lines, and generally require less test compound than in vivo test systems. The divergence of an in vitro surrogate system from an in vivo system has to be carefully considered depending on the application. While whole organ preparations and tissue slices may correlate better with in vivo models physiologically and metabolically, reproducibility and availability represent significant hurdles, especially for high throughput gene expression analysis.

The principal disadvantages of in vitro cell systems are the lack of the cellular heterogeneity and integrity of the whole organs from which they were derived, and the fact that they suffer almost universally from the lack of full metabolic capability. We have found in vitro experiments to be particularly useful for mechanistic studies when a phenotypic endpoint can often be measured (eg cell death as the functional endpoint of apoptotic gene expression).

For mechanism of action studies where metabolic competency is less important, in vitro cell lines offer considerable advantages:

● Reduced compound need: less than 100mg may be adequate for in vitro work.
● Ready access to human cellular systems: may be preferable to non-human mammals.
● Faster turnaround: 24-hour treatments may be sufficient.
● Higher throughput: cell culture is amenable to miniaturisation.

Drug discovery: In the same way toxicity signatures are derived, biomarkers for other compound effects, eg mechanism of action, can be deduced. By assessing the effects of a broad range of chemical compounds, chemogenomics has yielded signatures for mechanistic, structural and therapeutic classes and even subtle off-target effects. Recent studies performed at Iconix and MDS Pharma Services demonstrate the predictive power of this approach. By comparing the genomic profile for Gemfibrozil to those of the class of fibrates to which it belongs, it has been possible to identify the genomic responses uncommon to the class, but specific to Gemfibrozil. This approach identified a pathway regulated by Gemfibrozil but not by other members of the fibrate class, which may account for the effects of Gemfibrozil that differentiate it from other fibrates⁶ (Figure 6).

Figure 6: Genes differentially induced by Gemfibrozil. Axes: compound treatments and regulated genes. Marked groups: genes expressed only with Gemfibrozil; genes common to all fibrates.

In summary, the chemogenomics advantage applied to understanding the broad spectrum of effects of chemical compounds on a living system is to characterise, evaluate and prioritise compounds for further optimisation or, if necessary, elimination from further consideration. Future directions include extending our reach to develop surrogate genomic and proteomic markers for drug optimisation and design. Chemogenomic studies of drug effects on differential gene expression as well as post-translational protein expression will help address current challenges in the integration of our understanding of genomics and proteomics, eg correlation of gene to protein and the complexity of intracellular signalling, processing and regulatory pathways in health and disease.

Future considerations

The future of chemoinformatics in the pharmaceutical industry is not without significant challenges. Despite multiple public-access genomics, proteomics, sequence and functional pathway search engines and databases, large amounts of data reside in proprietary databases with restricted access and proprietary search algorithms. Collation, analysis and meaningful interpretation of disparate biological data present an arduous challenge to computer scientists and biologists alike. This fact is evidenced by the growing number of bioinformatics services and products offering database search, analysis and reporting tools as well as proprietary databases populated with gene, protein, pharmacological and molecular descriptors. In addition, many databases are populated with data generated under a variety of experimental conditions with varying degrees of accuracy and/or relevance to a given biological model or target. Analysis of complex biological systems from the gene to the organism level will necessarily require development of mathematical and statistical algorithms to analyse very large data sets through close collaboration between biologist and mathematician.
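As one small illustration of the kind of algorithm involved, the class-signature ideas described earlier (a signature shared by a compound class, effects specific to one member such as Gemfibrozil among the fibrates, and matching a candidate against a known signature) can be sketched with simple set operations. This is a toy sketch, not the method used in the studies cited: the compound names, gene names, log-ratio values, the threshold of 1.0 and all function names are hypothetical, and real analyses would use thousands of genes and proper statistics.

```python
# Toy chemogenomic signature analysis. Every name and number here is invented.

THRESHOLD = 1.0  # assumed cut-off on |log2 fold change| for calling a gene regulated


def regulated(profile, threshold=THRESHOLD):
    """Map each regulated gene to its direction: +1 (induced) or -1 (repressed)."""
    return {gene: (1 if value > 0 else -1)
            for gene, value in profile.items() if abs(value) >= threshold}


def class_signature(profiles):
    """Genes regulated in the same direction by every compound in the class."""
    calls = [regulated(p) for p in profiles.values()]
    shared = set.intersection(*(set(c) for c in calls))
    first = calls[0]
    return {g: first[g] for g in shared if all(c[g] == first[g] for c in calls)}


def compound_specific(name, profiles):
    """Genes regulated by one compound and by no other member of its class."""
    others = set()
    for other, profile in profiles.items():
        if other != name:
            others |= set(regulated(profile))
    return {g: d for g, d in regulated(profiles[name]).items() if g not in others}


def signature_score(profile, signature):
    """Fraction of signature genes the profile regulates in the same direction."""
    calls = regulated(profile)
    hits = sum(1 for g, d in signature.items() if calls.get(g) == d)
    return hits / len(signature) if signature else 0.0


# Hypothetical log2 expression ratios for three members of one compound class.
compound_class = {
    "compound_X": {"geneA": 2.1, "geneB": -1.6, "geneC": 1.8, "geneD": 0.2},
    "compound_Y": {"geneA": 1.7, "geneB": -1.2, "geneC": 0.1, "geneD": 0.3},
    "compound_Z": {"geneA": 2.4, "geneB": -2.0, "geneC": -0.2, "geneD": 0.1},
}
# Hypothetical profile of a new drug candidate.
candidate = {"geneA": 1.5, "geneB": 0.4, "geneC": 0.0, "geneD": -1.3}

signature = class_signature(compound_class)
specific = compound_specific("compound_X", compound_class)
score = signature_score(candidate, signature)

print("class signature:", sorted(signature.items()))
print("specific to compound_X:", sorted(specific.items()))
print("candidate match to class signature:", score)
```

With these made-up values, geneA and geneB form the class signature, geneC is regulated only by compound_X, and the candidate matches half of the class signature. Ranking several candidates by such a score against a toxicity signature would be one crude way to compare compounds by degree of toxicity.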


Clearly, predictive modelling, based on compound structure and target molecule motifs using chemoinformatics alone, will not obviate the need for validation through in vitro and in vivo experimentation. Similarly, in vitro approaches to studying key cellular disease pathways and drug effects on molecular targets afford critical but limited ‘views’ of complex interactions at any given point in time. Through in silico mathematical and computer analysis of thousands of these in vitro ‘views’, complex molecular interactions may be displayed simultaneously, providing the ability to understand the effect of a single entity on a complex system.

In conclusion, key future applications of predictive chemoinformatics for the pharmaceutical industry are:

1 Identification, validation, testing and functional annotation of new protein and gene drug targets in disease and health.
2 Ensuring a healthy pipeline of new, improved drug leads for development.
3 Expanded understanding of complex biological systems.
4 Optimisation of structure-related activities for new as well as known compounds to reduce the number of compounds tested and therefore the cost and time to market.
5 Empowering clinicians and patients with informatics tools, such as differential gene and protein expression profiles as a function of patient age, health, family history and environment, to effectively tailor individual drug treatment regimes for improved patient care. DDW

Dr Leslie Browne is currently Chief Operating Officer for Iconix, a company he joined in October 2001 from Gene Trace Systems where he held the same position. Before that Dr Browne spent more than a decade at Berlex/Schering AG, most recently as Corporate Vice-President, Berlex Laboratories, Inc and President of Schering Berlin Venture Corporation. Prior to this he was Vice-President, Head of Discovery Research, at Berlex Biosciences, having responsibility for drug discovery, including cell and […], protein chemistry, screening, medicinal chemistry and molecular and animal pharmacology. Before Berlex, Dr Browne was with Ciba-Geigy, where he invented Fadrozole, the first marketed non-steroidal aromatase inhibitor for the treatment of estrogen-dependent breast cancer. He also managed the cardiovascular research programme at Ciba-Geigy Ltd in Basle, where one of the group’s achievements was the discovery of Diovan, the second angiotensin II antagonist ever to be marketed. Dr Browne received his PhD from the University of Michigan, with a postdoctoral fellowship at Harvard University with the Nobel laureate Professor R.B. Woodward.

Laurie Taylor has more than 20 years of experience in pharmaceutical research, with special emphasis in lead discovery and pharmacology services. Ms Taylor joined MDS Panlabs (now MDS Pharma Services) in 1990 as an associate scientist in assay development. Today, she serves as Director, Lead Discovery, and is responsible for negotiating and implementing client contracts, identifying and developing new product areas and guiding the business unit’s sales efforts. Ms Taylor’s background includes management and scientific positions with Signal Pharmaceuticals and the departments of pharmacology and pathology at the University of Washington, Seattle. She has authored or co-authored numerous scientific articles and abstracts for peer-reviewed publications, including Tetrahedron, the Journal of Medicinal Chemistry and the Journal of the American Chemical Society. She has also presented papers and posters before the Annual Conference on the Biotechnology of Microbial Products, the Annual Meeting of the American Society for Cell Biology, the Organic Chemistry Symposium and other associations. Ms Taylor earned her BA in zoology from the University of Washington. She is also a certified electron microscopist, San Joaquin Delta College.

References
1 Outlook 2002. Tufts Center for the Study of Drug Development.
2 Lasser, KE et al. JAMA 287, 2215-2220 (2002).
3 Waring, JF et al. Clustering of hepatotoxins based on mechanism of toxicity using gene expression profiles. Toxicol. Appl. Pharmacol. 175, 28-42 (2001).
4 Waring, JF et al. Microarray analysis of hepatotoxins in vitro reveals a correlation between gene expression profiles and mechanisms of toxicity. Toxicol. Lett. 120, 359-368 (2001).
5 Furness, LM et al. Chemogenomics for Predictive Drug Assessment. In press. Toxicogenomics, Springer-Verlag (2002).
6 Browne, LJ, Furness, LM, Natsoulis, G, Pearson, C and Jarnagin, K. Chemogenomics: pharmacology with genomics tools. Targets 1, 59-65 (2002).

Other references
Hansch, C, Hoekman, D, Leo, A, Weininger, D, Selassie, C. Chemo-Bioinformatics: Comparative QSAR at the Interface between Chemistry and Biology. Chem. Rev. 2002, vol 102, pp783-812, American Chemical Society.
Kitano, H. […]: A brief Overview. Science, March 1, 2002, Vol 295, pp1662-1664.
Noble, Denis. The rise of […]. Nature, Volume 3, June 2002, pp 460-463.
Richon, Allen B. A Short History of Bioinformatics. Network Science, www.netsciorg/Science/bioinform/feature06.html August 2002.
