Positive Natural Selection in the Human Lineage REVIEW
P. C. Sabeti,1,2* S. F. Schaffner,1*† B. Fry,1 J. Lohmueller,1,3 P. Varilly,1 O. Shamovsky,1 A. Palma,1 T. S. Mikkelsen,1 D. Altshuler,1,4,5 E. S. Lander1,6,7,8

1Broad Institute of MIT and Harvard, Cambridge, MA, USA. 2Harvard Medical School, Boston, MA, USA. 3Brown University, Providence, RI, USA. 4Departments of Genetics and Medicine, Harvard Medical School, Boston, MA, USA. 5Department of Molecular Biology, Center for Human Genetic Research, and Diabetes Unit, Massachusetts General Hospital, Boston, MA, USA. 6Department of Biology, MIT, Cambridge, MA, USA. 7Whitehead Institute for Biomedical Research, Cambridge, MA, USA. 8Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
*These authors contributed equally to this work.
†To whom correspondence should be addressed. E-mail: [email protected]

Science, vol. 312, 16 June 2006, p. 1614. www.sciencemag.org

Positive natural selection is the force that drives the increase in prevalence of advantageous traits, and it has played a central role in our development as a species. Until recently, the study of natural selection in humans has largely been restricted to comparing individual candidate genes to theoretical expectations. The advent of genome-wide sequence and polymorphism data brings fundamental new tools to the study of natural selection. It is now possible to identify new candidates for selection and to reevaluate previous claims by comparison with empirical distributions of DNA sequence variation across the human genome and among populations. The flood of data and analytical methods, however, raises many new challenges. Here, we review approaches to detect positive natural selection, describe results from recent analyses of genome-wide data, and discuss the prospects and challenges ahead as we expand our understanding of the role of natural selection in shaping the human genome.

Homo sapiens, like all species, has been shaped by positive natural selection. As first articulated by Darwin and Wallace in 1858, positive selection is the principle that beneficial traits, those that make it more likely that their carriers will survive and reproduce, tend to become more frequent in populations over time (1). In the case of humans, these beneficial traits likely included bipedalism, speech, resistance to infectious diseases, and other adaptations to new and diverse environments. Understanding the traits (and genes underlying them) that have undergone positive selection during human evolution can provide insight into the events that have shaped our species, as well as into the diseases that continue to plague us today.

Until very recently, the only practical way to identify cases of positive selection in humans was to examine individual candidate genes. Allison noted in 1954 that the geographical distribution of sickle cell disease was limited to Africa and correlated with malaria endemicity (2); this observation led to the identification of the sickle cell mutation in the Hemoglobin-B gene (HBB) as having been the target of selection for malaria resistance (3, 4). Since then, approximately 90 different loci have been proposed as possible targets for selection (table S1 provides a review of this literature).

Some of the proposed candidates for selection, like HBB, have strong support in the form of a functional mutation with an identified phenotypic effect that is a likely target of selection. In the case of HBB, the selected mutation creates a glutamate to valine amino acid change, but the target of selection need not be in the protein-coding region of a gene. For example, the Duffy antigen (FY) gene encodes a membrane protein used by the Plasmodium vivax malaria parasite to enter red blood cells. A mutation in the promoter of FY that disrupts protein expression confers protection against P. vivax malaria and was proposed to be selected for in regions of Africa where P. vivax malaria has been endemic (5). Another example is a mutation in a regulatory region near the gene for lactase (LCT) that allows lactose tolerance to persist into adulthood. This particular variant was apparently selected in parts of Europe after the domestication of cattle (6).

Often, however, the functional target of selection is not known. In some cases, candidate genes gain support because they lie in functional pathways, such as spermatogenesis and the immune response, that are known to be frequent targets for selection in other species. One example is protamine 1 (PRM1), a sperm-specific protein that compacts sperm DNA (7, 8). Such cases, however, are the exception. Most proposed candidates lack compelling biological support. Rather, the argument for selection has relied solely on comparative and population genetic evidence.

Despite its great potential to illuminate new biological mechanisms, identification of selected loci by genetic evidence alone is fraught with methodological challenges. Studies based on comparisons between species suffer from limited power to detect individual incidents of selection, whereas studies based on human genetic variation have suffered from difficulties with assessing statistical significance. The evidence for positive selection has traditionally been evaluated by comparison with expectations under standard population genetic models, but the model parameters (especially those relating to population history) have been poorly constrained by available data, leading to large uncertainties in model predictions. One solution would be to assess significance by comparing empirical results from different studies, but this has been challenging because of the varied statistical tests, sizes of genomic region, and population samples used (see table S2 for examples).

The advent of whole-genome sequencing and increasingly complete surveys of genetic variation represent a turning point in the study of positive selection in humans. With these advances, humans can now join model organisms such as Drosophila (9) at the forefront of evolutionary studies. Newly available tools allow systematic survey of the genome to find the strongest candidate loci for natural selection, as well as to reevaluate previously proposed candidate genes, in comparison with genetic variation in the genome as a whole (the genome-wide empirical distribution). Although they permit us to make progress even while working out remaining theoretical issues, they also bring analytical challenges of their own, because they represent imperfect samples of genetic variation.

Here, we review genetic methods for detecting natural selection, discuss initial results about positive selection based on recent whole-genome analyses, and outline the potential and the challenges ahead in going from candidates of selection to proven examples of adaptive evolution.

Methods for Detecting Selection

When alleles (genetic variations) under positive selection increase in prevalence in a population, they leave distinctive "signatures," or patterns of genetic variation, in DNA sequence. These signatures can be identified by comparison with the background distribution of genetic variation in humans, which is generally argued to evolve largely under neutrality (10). This is in accord with the neutral theory, which proposes that most observed genetic variation, both within and between species, is neutral (i.e., has no effect on an individual's fitness), so that its population prevalence changes over time by chance alone (so-called "genetic drift") (11). A great challenge for population genetics–based signatures (sections ii to v below) is determining whether a signature is due to selection or to the confounding effects of population demographic history, such as bottlenecks (periods of reduced population size), expansions, and subdivided populations.

Many specific statistical tests have been proposed to detect positive selection (table S3 provides a review), but they are all based broadly on five signatures. Below, we describe the nature of each signature, an estimate of the window of evolutionary time in which it can be used to detect moderately strong selection in humans (Fig. 1), and its strengths and weaknesses in human studies. Several excellent reviews (12–18) provide more information, as well as background on coalescent modeling and on other types of selection (e.g., purifying selection and balancing selection). It should be noted that many instances of selection are likely not detectable by any currently proposed method; for example, if the selective advantage is too small or selection acts on an allele that is already at an appreciable frequency in the population (19).

[Fig. 1. Labels recovered from the figure: Proportion of functional changes; Heterozygosity/rare alleles; High frequency derived alleles; Population differences.]

(i) High proportion of function- [...]

[...] detect only ongoing or recurrent selection. In practice, when the human genome is surveyed in this manner, few individual genes will give statistically significant signals, after correction for the large number of genes tested. However, the signature can readily be used to detect positive selection across sets of multiple genes (25). For example, genes involved in gametogenesis clearly stand out as a class having a high proportion of nonsynonymous substitutions (25–27).

(ii) Reduction in genetic diversity (age <250,000 years). As an allele increases in pop- [...]

[...] of low overall diversity, with an excess of rare alleles.

Unlike excess functional changes, which involve differences between species, selective sweeps are detected in genetic variation within a species. The most common type of variant used is the single-nucleotide polymorphism (SNP). As an example, Akey et al. identified a 115-kb region containing four genes including the Kell blood antigen, which showed an overall reduction in diversity and more rare alleles in Europeans than expected under neutrality (Fig. 3) (28). Statistical tests commonly used to detect this signal include Tajima's D, the Hudson-Kreitman-Aguadé (HKA) test, and Fu and Li's D* (29–32).

Reduction in genetic diversity can be particularly useful because it persists longer than other population genetic signatures.
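To make the diversity-based signature concrete, the following is a minimal sketch of Tajima's D, one of the tests named above, computed with the standard textbook formula. The function name `tajimas_d` and the input encoding (a list of equal-length strings, one per sampled chromosome, one character per site) are assumptions made for illustration; they are not taken from the review or from any particular software package. The statistic compares two estimates of diversity: the average number of pairwise differences and the number of segregating sites scaled by sample size. An excess of rare alleles lowers the first relative to the second and gives a negative D.

```python
from collections import Counter
from math import sqrt


def tajimas_d(haplotypes):
    """Return Tajima's D for a list of equal-length haplotype strings.

    Hypothetical helper for illustration: each string is one sampled
    chromosome and each character is the allele at one site.
    """
    n = len(haplotypes)
    if n < 4:
        raise ValueError("need at least 4 haplotypes (variance is zero below that)")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")

    n_pairs = n * (n - 1) / 2
    seg_sites = 0   # S: number of segregating (polymorphic) sites
    pi = 0.0        # average number of pairwise differences
    for column in zip(*haplotypes):
        counts = Counter(column)
        if len(counts) < 2:
            continue  # monomorphic site, contributes nothing
        seg_sites += 1
        same_pairs = sum(c * (c - 1) / 2 for c in counts.values())
        pi += (n_pairs - same_pairs) / n_pairs

    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # Constants from Tajima's 1989 derivation; they depend only on n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy samples (made up): every variant is a singleton in the first,
# every variant is at intermediate frequency in the second.
rare_alleles = ["1000", "0100", "0010", "0001", "0000"]
common_alleles = ["11", "11", "00", "00"]
print(tajimas_d(rare_alleles))    # negative: excess of rare alleles
print(tajimas_d(common_alleles))  # positive: excess of intermediate-frequency alleles
```

In practice the statistic would be computed in windows along the genome and compared with the genome-wide empirical distribution, because demographic events such as population expansions can also produce negative values.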