Perspectives
Total Page:16
File Type:pdf, Size:1020Kb
PERSPECTIVES increased statistical efficiency of association OPINION analysis of complex diseases and the biologi- cal understanding of the phenotype, tissues, genes and proteins that are likely to be Candidate-gene approaches for involved in the disease. In spite of their promise, candidate-gene studying complex genetic traits: studies have been subject to two important criticisms. First, the significant findings of practical considerations association in many candidate-gene studies have not been replicated when followed up in subsequent association studies. Second, Holly K. Tabor, Neil J. Risch and Richard M. Myers because candidate-gene studies are based on the ability to predict functional candidate Association studies with candidate genes Because of these features, researchers genes and variants, some critics argue that have been widely used for the study of have begun to apply other approaches to current knowledge is insufficient to make complex diseases. However, this approach identify genes that are involved in complex these predictions. These critics believe that has been criticized because of non- diseases. For example, an association study ‘hypothesis-driven’ genetic approaches are replication of results and limits on its ability using a candidate-gene approach looks for a therefore less likely to yield results than to include all possible causative genes and statistical correlation between specific ‘anonymous’ approaches, which use markers polymorphisms. These challenges have led genetic variants and a disease. Association throughout the genome and do not depend to pessimism about the candidate-gene studies are likely to be more effective tools on the ability to select biologically plausible approach and about the genetic analysis of than linkage studies for studying complex candidates. complex diseases in general. We believe traits because they can have greater statisti- In this article, we argue that the pessimism that these criticisms can be usefully cal power to detect several genes of small that is sometimes aimed at the candidate- countered with an appeal to the principles of effect1. The candidate-gene approach can be gene approach is perhaps too extreme. We epidemiological investigation. defined as the study of the genetic influences suggest that application of the rigorous prin- on a complex trait by: generating hypotheses ciples that are used in epidemiology might In the past two decades, many genes that were about, and identifying candidate genes that help respond to some of the criticisms and implicated in simple (Mendelian) diseases might have a role in, the aetiology of the dis- improve the chances of successfully elucidat- have been identified by using genetic linkage ease; identifying variants in or near those ing genetic components of complex diseases. and positional cloning methods. Although genes that might either cause a change in the Although the intersection between candi- these methods have been remarkably success- protein or its expression, or be in LINKAGE date-gene and epidemiological approaches ful in identifying high RELATIVE RISK genes, they DISEQUILIBRIUM with functional changes; geno- has been addressed by other authors, particu- have not been successful in identifying genes typing the variants in a population; and by larly in relation to cancer epidemiology2,the that are involved in the complex forms of dis- using statistical methods to determine use of epidemiological principles in the selec- ease. This failure is the result of three main whether there is a correlation between those tion, analysis and interpretation of candidate features of complex diseases. First, such dis- variants and the phenotype. genes and DNA sequence variants has not eases typically vary in severity of symptoms Rather than rely on markers that are been described. and age of onset, which results in difficulty in evenly spaced throughout the genome with- defining an appropriate phenotype and out regard to their function or context in a Genetic studies and epidemiology selecting the best population to study. Second, specific gene, candidate-gene studies focus on The field of epidemiology is based on they can vary in their aetiological mecha- genes that are selected because of a priori observing and measuring disease patterns nisms, which might involve various biological hypotheses about their aetiological role in in populations, and using association and pathways. Third, and perhaps most impor- disease. Furthermore, a candidate-gene study statistical correlation to identify factors that tantly, complex diseases are more likely to be is usually conducted in a population-based affect those patterns. Epidemiology allows caused by several, and even numerous, genes, sample of affected and unaffected individuals the detection of small-to-moderate, but sig- each with a small overall contribution and (a case–control study). A candidate-gene nificant, relative risks of disease that are relative risk. study therefore takes advantage of both the contributed by several heterogeneous risk NATURE REVIEWS | GENETICS VOLUME 3 | MAY 2002 | 1 PERSPECTIVES There are many reasons for the lack of Box 1 | Epidemiological criteria for gene-association studies reproducibility seen with some candidate-gene • Biological plausibility of association and its consistency with existing knowledge about biology studies. These reasons indicate caution in both and disease aetiology are evaluated. Is the candidate gene likely to be involved in the the design and interpretation of such studies, phenotype? Are the single-nucleotide polymorphisms (SNPs) likely to have functional effects but do not condemn the approach as unreli- on the protein? able. Discrepant findings are often due to vari- • The strength of the association between the risk factor and the disease is examined. When ations in study design. For example, candidate- considering multiple SNPs in a candidate gene, the ones with strongest association are most gene studies might differ in the study likely to be causally related. population and in the definition of the pheno- • The dose–response relationship of the association is considered. For example, individuals with type (examples can be found in REFS 9,10). The two copies of a variant might be at greater risk of disease than individuals with one copy of the same candidate gene or DNA variants might variant. be associated with different relative risks in dif- • The consistency of the association across past and future studies, and across different ferent populations, and the non-replication populations, is an important consideration. Consistent replication in different populations is might result from real biological differences. strong evidence of causality. Lack of replication does not necessarily imply lack of causality, but Non-replication might also be due to the small might point to the need for more studies in certain populations or more detailed study of the magnitude of relative risks that are likely to be function of a particular gene. detected in candidate-gene studies of complex These issues are explored further in REFS 3,8,53. diseases. The aetiological heterogeneity that is inherent in the label ‘complex disease’ chal- lenges the measurement and analysis of multi- factors. Similarly, candidate-gene studies associations can also be used to indicate ple genetic and environmental components, aim to detect small-to-moderate relative molecular or biochemical mechanisms for and the interpretation of multiple and conflict- risks in the context of aetiological and the sequence variants, which allows experi- ing findings of association11. Confounding, genetic heterogeneity. ments to be designed to test their functional bias and misclassification are more likely to Association in epidemiological or candi- roles in biological processes and disease obscure small-to-moderate relative risks than date-gene studies can be defined as statisti- pathology. Such studies can provide strong larger relative risks. cal dependence or correlation between two support for causality; for example, in the Another possible explanation for non- or more events, characteristics or other recent case of identifying a gene that is replication across candidate-gene studies variables. This dependence is greatly influ- involved in Crohn disease6,7. relates to the selection of polymorphisms that enced by the characteristics of the study, are not likely to be causal. If a polymorphism including the size of the population studied Criticisms is selected because of the ease of genotyping and the number of variables analysed. The criticisms of candidate-gene and is not likely to affect the function of the Findings of association can be influenced approaches are rooted in a fundamental protein — such as a restriction fragment by problems such as SELECTION BIAS, RECALL challenge to the study of the genetics of length polymorphism in a deep intronic BIAS, MISCLASSIFICATION and CONFOUNDING (for complex diseases: in the absence of other region — then it is assumed, or hoped, that more detailed discussion, see REFS 3–5). effective methods and techniques, how can the polymorphism will be in linkage disequi- Significant associations might be causal, or investigators balance the use of available librium with a ‘disease-causing’ variant in the might simply be the result of coincidence or data about candidate genes and polymor- gene, and that this will be reflected in a find- one of the biases listed above. phisms with the desire to minimize the ing of association. Some