CORRESPONDENCE

PharmGKB: a logical home for knowledge relating genotype to drug response

To the Editor:
The Pharmacogenomics and Pharmacogenetics Knowledge Base (PharmGKB, found at http://www.pharmgkb.org/) is devoted to cataloging information about pharmacogenes, those genes involved in modulating the response to drugs. Genes may be pharmacogenes because they are involved in the pharmacokinetics of a drug (how the drug is absorbed, distributed, metabolized and eliminated) or the pharmacodynamics of a drug (how the drug acts on its target and its mechanisms of action). PharmGKB's goal is to be a comprehensive resource on pharmacogenes, their variations, their pharmacokinetics and pharmacodynamic pathways and their effects on drug-related phenotypes. Whereas the older field of pharmacogenetics often focused on the effect of single dominant genes on drug response, pharmacogenomics connotes the study of the multigenic influences on drug response, often using modern high-throughput experimental techniques.

The sequencing of the human genome and the study of [...] have opened up great opportunities for understanding the association between genotypes and phenotypes. There is an active debate about the merits of single, centralized databases to hold all genetic variation information and associated phenotypes. Recently, there has been a movement toward this model with the introduction of the National Center for Biotechnology Information (NCBI) dbGAP databases to hold genotype and phenotype data for large genome association trials (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gap). The success of GenBank and the suite of NCBI data resources has demonstrated clearly that the NCBI is capable of meeting the demand for high volumes of raw data and serving as a reliable repository.

In addition to the need for providing raw data in standard formats, there is a major, continuing need for integration, aggregation and curation of the information contained within these data stores for the purpose of supporting specific areas of scientific enquiry. For example, locus-specific databases (LSDBs) discussed in the accompanying correspondence¹ are thriving, because their curators carefully scrutinize the raw genotype and phenotype measurements and present the data along with summaries of the literature and additional information about rare mutants, particularly those with important phenotypes. In addition, model organism databases (MODs) provide extensive curation of the genomic sequence and functional information associated with those organisms. Curators comb the literature and create summaries of function in both textual and controlled terminologies, and they integrate high-throughput data sets relevant to their mission.

Although not an LSDB or a MOD, PharmGKB has elements of both. Like the LSDB curators, the PharmGKB curators constantly survey the literature and other databases for reports of important genetic variations in known pharmacogenes. PharmGKB accepts and/or integrates information from the central data warehouses about sequence polymorphisms in order to provide a single location for high-quality information about the location and population frequencies of variations in pharmacogenes. In particular, curators look for variations that have functional phenotypic consequences related to drug response. They create summaries of the pharmacogenomics literature to create a definitive list of gene-drug interactions and characterize those interactions. Like MOD curators, the PharmGKB curators attempt to provide annotations of the functions and phenotypes of pharmacogenes. Thus, PharmGKB also accepts and/or integrates information from the central warehouses about drug-related phenotypes. Curators integrate high-throughput data sets relevant to drug response. They work to define these phenotypes with controlled terminologies to facilitate indexing, searching and aggregation of these data. They also work with members of the US National Institutes of Health (NIH) Pharmacogenetics Research Network (PGRN) to generate summaries of important genes and their phenotypes and create pathway diagrams relevant to drug response.

Today, the PharmGKB has curated evidence for 1,994 genes involved in drug response. PharmGKB has high-quality genotype variation data (in many cases with population frequencies) for 240 genes, and 1,671 literature entries have been curated to create gene-drug associations that are labeled with respect to the [...] of information contained in the papers. There are 38 manually created drug-related pathways, developed in collaboration with PGRN investigators and others. Finally, we have introduced a new Very Important Pharmacogenome (VIP) initiative to create structured summaries of key pharmacogenes, their important polymorphisms, [...] and alternative splices (if relevant). Sixteen such summaries are available today. Usage statistics (as summarized on my 'PharmGKBlog') show more than 2,000 registered users (who can gain access to individual-level data) with more than 50,000 unique internet address visitors per month. All data and knowledge contents are available for download. PharmGKB provides a cross-reference file that associates all SNPs in PharmGKB with identifiers from the Golden Path human genome browser, dbSNP, HapMap, jSNP, Illumina, Affymetrix, SeattleSNP and ALFRED.

Thus, PharmGKB is a resource that provides both primary data and curated knowledge about pharmacogenomics. It benefits from the existence of data warehouses, because although it is capable of accepting and presenting primary data, it will increasingly depend on reliable archival resources for basic primary data storage. More importantly, PharmGKB will provide the aggregation, integration of literature and summaries of knowledge (through pathways and VIP genes) that still require PhD-level human curation. Our software developers are charged with building tools to help curators work more effectively in managing the increasing volume of pharmacogenomics science and to help users in searching, visualizing and analyzing the data and knowledge contained in the knowledge base.

The five-year goal of PharmGKB is to be a comprehensive store of information about pharmacogenes and their associated phenotypes and to catalyze research in pharmacogenomics. In the longer term, the knowledge contained within PharmGKB will be used as a starting point for implementing genome-informed drug prescribing decisions. With sufficient information, we may be able to move toward predictive pharmacogenomics, where patterns of variation in pharmacokinetics and pharmacodynamics for existing drugs are used to predict the variation in response to new drugs.

Russ B Altman
Department of Bioengineering and Department of Genetics, Stanford University, Stanford, California 94305-5120, USA.
e-mail: [email protected]

COMPETING INTERESTS STATEMENT
The author declares no competing financial interests.

1. Horaitis, O. et al. Nat. Genet. 39, 425 (2007).

426 VOLUME 39 | NUMBER 4 | APRIL 2007 | NATURE GENETICS