Population models of common Anna Di Rienzo

The number and frequency of susceptibility for common This is because the relevant fitness effects are those that diseases are important factors to consider in the efficient acted in ancestral populations, whereas clinical design of association studies. These quantities are the are assessed in contemporary populations. results of the joint effects of , genetic drift and In addition, clinical severity does not necessarily imply selection. Hence, models, informed by effects on reproduction, viability and survival, especially empirical knowledge about patterns of disease variation, can for diseases with late age of onset. be used to make predictions about the allelic architecture of common disease susceptibility and to gain an overall Disease mapping strategies rely — implicitly or explicitly understanding about the evolutionary origins of such diseases. — on population genetics quantities, such as allelic Equilibrium models and empirical studies suggest a role for heterogeneity, frequency spectrum and inter-eth- both rare and common variants. In addition, increasing nic differentiation; for example, standard association evidence points to changes in selective pressures on methods are most powerful for detecting common sus- susceptibility genes for common diseases; these findings are ceptibility variants with limited heterogeneity. Likewise, likely to form the basis for further modeling studies. the evidence for association is most likely to be replicated in additional populations if the susceptibility variants are Addresses Department of Human Genetics, University of Chicago, Chicago, the same and similarly common across populations. IL 60637, USA Because these quantities are functions of the underlying population genetics model, modeling studies might help Corresponding author: Di Rienzo, Anna to optimize disease-mapping strategies to the outcomes of ([email protected]) specific evolutionary scenarios.

Current Opinion in Genetics & Development 2006, 16:630–636 In general, modeling studies aim at analyzing a complex biological process (e.g. the of disease suscep- This review comes from a themed issue on tibility) by reducing it to more basic components (e.g. Genomes and evolution Edited by Chris Tyler-Smith and Molly Przeworski mutation rate or strength of selection etc). The ultimate goal is to understand the properties of the process and to Available online 19th October 2006 make quantitative predictions about its outcomes. Even 0959-437X/$ – see front matter though most models are too simple to be realistic, they # 2006 Elsevier Ltd. All rights reserved. can still have practical utility. For example, they can generate hypotheses that can be tested by empirical DOI 10.1016/j.gde.2006.10.002 studies of genetic variation or they can point to new approaches to disease mapping. Conversely, as we learn more about genetic susceptibility variants, modeling stud- Introduction ies might use the available empirical information to fill in Alleles that contribute to the genetic susceptibility to the gaps of knowledge regarding the genetic architecture human diseases are the focus of intense attention in human of common diseases and how it was shaped by natural genetics. They can be studied by a variety of approaches selection. Hence, population genetics modeling informed (e.g. linkage analysis and case-control studies) that aim to by empirical studies of risk variation — and the reverse — identify their role in the etiology of clinically defined will be necessary if a comprehensive understanding of the phenotypes. Disease alleles are also a specific subset of genetic bases of common diseases is to be obtained. all genetic variation present in human populations. As such, their fate is influenced jointly by the history and Here, I review the results of modeling studies performed strength of the selective pressures acting in human popu- to date, in light of the available data on susceptibility lations as well as by demographic history, including popu- variants for common diseases. Several hypotheses have lation size changes and geographic structure. Therefore, also been formulated about how changes in selective the properties of disease alleles can also be investigated by pressures acted on susceptibility genes for common dis- population genetics modeling based on evolutionary prin- eases. These hypotheses are providing a framework for ciples. An important difference between disease mapping further empirical and modeling studies. and population genetics studies is that the latter focus on the effects of susceptibility alleles on evolutionary fitness Mutation–selection balance: models and data rather than focusing on parameters of clinical severity. Different types of population genetics models are Fitness is only loosely correlated with clinical severity. likely to apply to Mendelian versus common diseases.

Current Opinion in Genetics & Development 2006, 16:630–636 www.sciencedirect.com Population genetics models of common diseases Di Rienzo 631

Mendelian diseases, which are usually due to highly In a related study, Reich and Lander [2] also modeled penetrant and deleterious alleles that segregate in susceptibility alleles as being (slightly) deleterious and , probably fit relatively simple equilibrium models showed that, conditional on a specified total frequency of of mutation–selection balance in which disease alleles are the susceptibility allele class (20%) at a single locus and removed by purifying selection and continually replen- for a ‘typical’ disease mutation rate, there is a single ished through the mutation process. Mutation–selection predominant allele within the susceptibility class [6]. balance predicts high allelic heterogeneity and low dis- Importantly, this study focused on the role of population ease prevalence, which are both common features of growth on the dynamics of disease alleles. More specifi- Mendelian diseases. Models of positive selection also cally, the authors showed that, although the number of apply to a minority of Mendelian disease alleles, most disease alleles increases very quickly if the frequency of notably sickle cell anemia, G6PD deficiency and the the susceptibility class before the onset of growth is low, thalassemias, which are caused by the pleiotropic effects the allelic heterogeneity for loci with intermediate fre- of disease alleles on resistance to malaria. For other quencies also increases, but much more slowly. For diseases (e.g. cystic fibrosis), a selective advantage has plausible times for the onset of population growth, the been invoked, but generating definitive evidence has met increase in allelic heterogeneity for loci with intermediate with considerable difficulties [1–4]. equilibrium frequency is expected to be negligible. Hence, if indeed common diseases are due to suscep- When it comes to common diseases with complex pat- tibility loci with intermediate equilibrium frequencies, terns of inheritance, however, the formulation of evolu- the allelic heterogeneity is predicted to be relatively low. tionary models is more complicated. Owing to incomplete penetrance, late age of onset, gene-by-environment inter- Owing to the different parameterization, the conclusions actions and polygenic inheritance, the connection of these two studies cannot be directly compared. Never- between common disease susceptibility and fitness is theless, they highlight important issues in modeling the not clear. Moreover, although most recessive Mendelian evolution of common disease susceptibility, and they diseases are due to loss-of-function alleles, this expecta- indicate new directions for further investigation. One tion might not apply to common diseases, thus adding to important issue is the choice of parameter values. For the uncertainty regarding the rate at which susceptibility example, an obvious difference between the two studies variants are generated by the mutation process. Perhaps concerns the assumptions about the rate at which new more importantly, part of the selective pressures acting on susceptibility alleles are created by the mutation process. susceptibility alleles for common diseases might have Pritchard [1] considered a broad range of mutation rates changed dramatically during human history, as a result (i.e. 2.5 Â 10–6 to 1.25 Â 10–4 per gene per generation), of major cultural and/or environmental transitions, imply- whereas Reich and Lander [2] considered a single and low ing that non-equilibrium models might be needed. mutation rate (i.e. 3.2 Â 10–6). The rate at which suscept- ibility alleles are generated and the rate at which they are The first models of common disease susceptibility based repaired or compensated for depend on the functional on population genetics principles were formulated within effects of causing common diseases, which in the mutation–selection balance framework. Pritchard [1] turn depend on a variety of factors, including the disease modeled the evolution of a disease susceptibility locus in pathophysiology and the biological function of the gene. a randomly mating population of constant size, incorpo- For example, the mutation rate for variants leading to loss rating the effects of mutation, genetic drift and selection of function is likely to be higher than the rate for variants [5]. An important aspect of this analysis is that the total resulting in gain of function. Moreover, partial or com- frequency of the susceptibility allele class was treated as plete loss of gene function can have similar phenotypic an outcome of the evolutionary process and was not fixed effects and selection coefficients depending on the to a specified value. Under neutrality and for a broad redundancy of the gene. More importantly, because range of mutation rates, he found that the susceptibility the current disease allele frequency and heterogeneity class is unlikely to be polymorphic (i.e. have a frequency are the result of a long-term history, the phenotypic and between 1% and 99%), but rather it tends to be either rare fitness effects relevant to population genetics modeling or near fixation. Accordingly, neutral susceptibility alleles are those existing in ancestral human populations and are expected to make only a minor contribution to the might not be easily extrapolated from the effects of the genetic variance for the disease. However, if susceptibil- same mutations in contemporary populations. ity alleles are deleterious (and if the mutation rate is relatively high), the probability that the susceptibility The comparison of model predictions under constant and class is polymorphic is much higher than under neutrality; growing population size, as discussed by Reich and hence, deleterious alleles are likely to explain a larger Lander [2], is an important reminder that susceptibility fraction of the genetic variance. These findings suggest loci evolve in populations with complex histories and that common disease susceptibility is unlikely to be due that demographic histories may differ across major to selectively neutral variation. ethnic groups. To date, population genetics models of www.sciencedirect.com Current Opinion in Genetics & Development 2006, 16:630–636 632 Genomes and evolution

disease susceptibility have relied on a small set of highly regulatory sequence elements is far less reliable than simplified demographic models. Thus, though much is for nonsynonymous variants. still to be learnt about plausible models for human history, future modeling should include some of the complexities Because standard approaches based on typing common emerging from recent inferences about human demogra- variants are thought not to be powerful for the identifica- phy. Several lines of evidence point to a history of rapid tion of rare risk-alleles, the development of new statistical recent growth from an equilibrium population for sub- methods is an area of active investigation. A promising Saharan Africans [7–9], whereas genetic variation data new method exploits the property that rare variants are on from non-African populations are compatible with bottle- average younger than common variants and, hence, are neck models [8,10] and, possibly, serial founder-events associated with longer homogeneous haplotypes [19]. [11,12]. In addition to population expansion, human Because the method considers all haplotypes of all variation data are also consistent with significant geo- lengths, it uses the information on the long-range haplo- graphic structure on a continental scale as well as on a type structure typical of rare variants. Computer simula- finer scale [13]. Future modeling studies incorporating tions showed that this method has reasonable power for some of these demographic complexities might show variants with frequency as low as 1%, as long as the size of that the predicted allelic heterogeneity and frequency the phenotypic effect is at least moderate. Additional spectrum vary to an appreciable extent across human work is necessary to investigate the power of this method populations — even if selective pressures were the same. for less common variants and as a function of the relevant evolutionary parameter values, such as selection coeffi- Despite the uncertainties regarding plausible parameters cient, mutation rate and demographic parameters. values, the results of the above modeling studies have stimulated much empirical and analytical work. System- Advantageous disease variants and atic re-examination and meta-analysis of 25 reported changes in selective pressures disease associations with common variants revealed a Several additional non-neutral frameworks have been significant excess of replications of the original reports, proposed to explain the evolution of common disease suggesting that some of these associations are real and susceptibility. Although these were not developed into that common variants with modest effects — odds ratios formal population genetics models of disease susceptibil- between 1.07 and 2.28 — do exist [14]. Evidence support- ity, they provide a useful set of hypotheses for further ing a role for heterogeneous and rare susceptibility var- investigation. In one scenario, which can be thought of as a iants is also rapidly growing as strategies for the detection case of antagonistic pleiotropy, alleles that increase risk to of these alleles are being developed. Testing for hetero- common late-onset diseases are advantageous in early life geneous and rare susceptibility variants presents consid- [20]. Because the strength of selection decreases with the erable analytical challenges, because established disease age of onset of the , the allele will be favored association methods are tailored to common susceptibility and selection will maintain the optimum for younger ages. variants and are unlikely to be powerful for the case of This scenario, originally proposed to explain the evolution multiple rare variants. A strategy specifically developed to of senescence, might also apply to some of the more meet this challenge relies on performing variation dis- common late-onset diseases (e.g. type 2 diabetes and heart covery either in affected and unaffected subjects or in the disease). The finding that mutations affecting the insulin– extremes of a distribution (e.g. the top and bottom 5%) for IGF (insulin-like growth factor) signaling pathway lead to a disease-related phenotype, followed by testing for an an increased lifespan in worms, flies and mice supports a excess of a class of variants in one group compared with link between type 2 diabetes and ageing. the other. The class of variants to be tested is defined a priori on the basis of criteria that are likely to identify The evolutionary scenarios discussed so far envision the subset of all variation with effects on gene function selection regimes in which selective pressures do not (e.g. nonsynonymous variants). When applied to candi- change substantially over time; hence, they are essentially date genes for variation in plasma lipid levels [15] equilibrium models. However, decades of anthropologi- and risk to colorectal cancer [16], this approach detected cal research have established that the environment in an excess of nonsynonymous variants with frequency which we now live is radically different from the one that lower than 0.5%. Reliable identification of the class ancestral human populations adapted to. A number of of variants with functional effects is a critical aspect transitions have resulted in shifts in the selective pres- of this approach. For this reason, to date it has been sures acting on the biological processes that, in contem- applied to coding regions and nonsynonymous variants, porary populations, underlie major classes of common for which information on structural and sequence diseases. Crucial junctures include the dispersal of mod- conservation can be used to calculate the probability of ern out of Africa, and the shift away from a an effect on function [17,18]. In principle, this approach hunting and gathering life-style to a subsistence based could be extended to non-coding sequences, but the on agriculture and animal husbandry, in addition to the prediction of functional effects for variants affecting more recent transitions associated with industrialization

Current Opinion in Genetics & Development 2006, 16:630–636 www.sciencedirect.com Population genetics models of common diseases Di Rienzo 633

and globalization. These transitions were associated with with the adoption of a Western life-style and diet is too major changes in environmental factors — including diet, rapid to be due to a change in the gene pool. Rather, it is pathogen exposure, life style and social organization etc likely that the environmental change simply uncovers the — that are likely to have shaped the evolution of suscept- latent genetic susceptibility of human populations to ibility genes for common diseases. diseases that today account for a disproportionate fraction of the public health burden. Consistent with the idea that The impact of these selective shifts on susceptibility common disease susceptibility might be the result of genes for many common diseases has been formalized ancient human adaptations to a long-term steady envi- in a series of hypotheses, which are based on epidemio- ronment, a number of alleles that increase risk to common logical trends and the disease pathophysiology. Perhaps diseases are ancestral (i.e. they are identical to the allele the most popular is the ‘thrifty genotype’ hypothesis first found at the orthologous position in chimpanzee [28,29]). proposed in 1962 to explain the high prevalence of type 2 The new (derived) allele at these variants confers protec- diabetes and [21]. This hypothesis posits that tion against the disease. A possible population genetics since ancestral hunter-gatherer populations underwent model for this class of variants is that the ancestral alleles seasonal cycles of feast and famine, they would have were maintained by purifying selection in the ancestral benefited from being ‘thrifty’ (i.e. having extremely effi- environment, whereas the derived alleles were slightly cient fat and carbohydrate storage). With changes in deleterious (see Figure 1). Upon the change in selective subsistence strategies, this ‘thriftiness’ became detrimen- pressures, the ancestral alleles increase risk to disease tal as food production, processing and storage resulted in whereas the derived alleles become either neutral or more reliable availability across seasons and changes in advantageous. dietary patterns. More recently, it was proposed that inter-ethnic differences in the prevalence of type 2 dia- A signature of recent positive selection has been reported betes and obesity, in particular the low prevalence in for several derived alleles that have protective effects Europeans, might be due to adaptations to cold climates, against heart disease and diabetes [30,31,32,33]; this whereby adaptive variation in a subset of susceptibility observation is somewhat surprising because, given the genes results in increased thermogenesis and lower risk to late age of onset of these diseases, protective alleles are type 2 diabetes and obesity [22]. Consistent with the idea expected to have little or no fitness consequences. One that cold stress was an important selective force during possible explanation is that alleles that protect against the human dispersal, there is substantial evidence that body primary disease phenotypes also protect against pheno- mass index (BMI) and basal metabolic rate (BMR) vary types with stronger fitness effects, such as those that clinically around the globe [23,24]; high BMI and low affect reproductive function more directly. For example, BMR are correlated with increased risk of type 2 diabetes the ancestral Thr235 allele in the Angiotensinogen (AGT) and obesity. A related scenario, referred to as the ‘sodium gene was reported to increase risk to both hypertension retention’ hypothesis, was proposed to explain the inter- [34,35] and pre-eclampsia [36]; accordingly, the haplo- ethnic differences in the prevalence of hypertension, type carrying the derived (protective) allele exhibits a particularly the salt-sensitive type [25]. Under this sce- haplotype structure consistent with the action of recent nario, ancestral populations living in equatorial Africa positive selection [31]. Moreover, as predicted by the adapted to hot and humid climates by increasing the rate sodium retention hypothesis, the worldwide distribution of sodium and water retention; as populations moved out of allele frequency at the AGT Thr235Met variant, as well of Africa into cooler, drier environments, this high level of as other variants influencing risk to hypertension, is sodium retention probably became deleterious. Energy consistent with the action of spatially varying selection: metabolism and sodium homeostasis are not the only the protective alleles decrease gradually in frequency biological processes associated with currently common with increasing distance from the equator [32,37]. A diseases that have undergone major selective shifts. Para- selective advantage for protective alleles through direct site loads and pathogen exposure are likely to have effects on reproduction might also be at work in type 2 increased dramatically with the switch to the sedentary diabetes. Over the past 25 years, a decline in diabetes life-style, higher population densities and greater proxim- prevalence has been reported in the Pacific island of ity with domesticated animals resulting from the spread Nauru, which experienced a dramatic epidemic starting of agriculture and animal husbandry [26]. The reduced in the 1950s. The lower number of live births observed in exposure to childhood infections, and the overall higher Nauruan women with diabetes or impaired glucose tol- sanitary conditions typical of industrial societies are erance suggests a link between diabetes risk and reduced thought to be responsible for the rapid increase in inflam- fertility that is consistent with the observed rapid decline matory and allergic diseases, such as asthma and inflam- in diabetes prevalence [38]. matory bowel disease [27]. The above frameworks open the appealing prospect that What is common to the above scenarios is the notion that approaches based on detecting the signature of selection the observed increase in disease prevalence associated might be added to the arsenal of disease mapping www.sciencedirect.com Current Opinion in Genetics & Development 2006, 16:630–636 634 Genomes and evolution

Figure 1

The ancestral susceptibility model as it might apply to the ‘thrifty’ genotype and the sodium-retention hypotheses. In ancestral human populations, those alleles that today increase risk to type 2 diabetes or hypertension were maintained by purifying selection. As human populations transitioned to a Western life-style and diet or as they moved out of Africa to colder climates, the ancestral alleles were no longer advantageous and increased disease susceptibility. The derived alleles at those sites were slightly deleterious in ancestral populations. Since the environmental change, they have conferred protection against the disease and have been either neutral or advantageous. Neutral variation also segregates throughout the history of human populations. strategies. Because population genetics studies and the assessment of statistical significance for signals of disease association studies are based on independent selection [41]. theoretical frameworks, evidence converging on an over- lapping set of candidate susceptibility variants will Conclusions increase the confidence that the true causative variation Although the development of formal population genetics has been identified. Consistent with this proposal, a scan models for common diseases is still in its infancy, it is for selection signals in 132 genes involved in inflamma- already clear that no single model can explain the evolu- tion, blood clotting, and blood pressure regulation found tion of susceptibility alleles even for the same disease or signatures of selection in several genes previously impli- the same susceptibility gene. For example, analyses cated in common disease risk [39]. A practical advantage aimed at detecting rare susceptibility variants for a given of population genetics studies over disease mapping phenotype showed that there is strong evidence for such studies is that they can be performed on random samples variants in some candidate genes, but not in others of unrelated individuals and do not require the costly and [15,16]. Moreover, both common and rare variants time-consuming collection of extensive phenotypic data. have been shown to contribute to the susceptibility to For example, to the extent that susceptibility to common the same common disease. This might even apply to diseases might be related to phenotypes for which selec- variants within the same gene; for example, the ABCA1 tion varies according to measurable aspects of geography (ATP-binding cassette A1) gene, which codes for a choles- (e.g. phenotypes related to sodium homeostasis or energy terol efflux pump, was shown to harbor multiple rare metabolism), mapping susceptibility alleles could be variants as well as a common variant (Ile883Met) that conducted by genotyping individuals from geographically influence plasma HDL cholesterol levels [15]. Given diverse populations and testing for correlations between the complex history of selective pressures acting on population allele frequencies and environmental vari- humans, and the composite pathways underlying the ables. Along the same lines, a recent genome-wide scan pathophysiology of common diseases, it is not surprising for signals of recent positive selection developed a set of that a diverse spectrum of plausible evolutionary models SNPs that tag the haplotypes carrying the strongest is emerging. Further modeling studies rooted in popula- signatures of selection; it was proposed that this set of tion genetics theory and data are necessary to achieve a tag SNPs be included in whole-genome scans for complex more complete understanding of the evolutionary causes traits [40]. There is, however, uncertainty regarding of common diseases. Moreover, by broadening the range

Current Opinion in Genetics & Development 2006, 16:630–636 www.sciencedirect.com Population genetics models of common diseases Di Rienzo 635

of plausible models, they might lead to mapping One of two studies that abandoned the standard linkage disequilibrium- based approach to disease mapping and used instead a study design approaches that are powerful for the identification of that enables the identification of heterogeneous and rare susceptibility the full gamut of susceptibility variants. variants. The first of several studies showing a role for rare variants in the susceptibility to common diseases. Acknowledgements 16. Fearnhead NS, Wilding JL, Winney B, Tonks S, Bartlett S,  Bicknell DC, Tomlinson IP, Mortensen NJ, Bodmer WF: Multiple I would like to thank M Przeworski, J Cohen, N Freimer, H Hobbs, rare variants in different genes account for multifactorial D Nicolae and J Pritchard for comments on the manuscript. I was inherited susceptibility to colorectal adenomas. Proc Natl Acad supported in part by National Institutes of Health grant DK56670. Sci USA 2004, 101:15992-15997. Together with the study by Cohen et al.[15], this is an important study based on resequencing of cases and controls to identify an excess of rare References and recommended reading variants with functional effects in affected individuals. Papers of particular interest, published within the annual period of review, have been highlighted as: 17. Ng PC, Henikoff S: SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res 2003, 31:3812-3814.  of special interest  of outstanding interest 18. Ramensky V, Bork P, Sunyaev S: Human non-synonymous SNPs: server and survey. Nucleic Acids Res 2002, 30:3894-3900. 1. Cuthbert AW, Halstead J, Ratcliff R, Colledge WH, Evans MJ: 19. Lin S, Chakravarti A, Cutler DJ: Exhaustive allelic transmission The genetic advantage hypothesis in cystic fibrosis disequilibrium tests as a new approach to genome-wide heterozygotes: a murine study. J Physiol 1995, 482:449-454. association studies. Nat Genet 2004, 36:1181-1188. 2. Gabriel SE, Clarke LL, Boucher RC, Stutts MJ: CFTR and outward 20. Wright A, Charlesworth B, Rudan I, Carothers A, Campbell H: rectifying chloride channels are distinct proteins with a A polygenic basis for late-onset disease. Trends Genet 2003, regulatory relationship. Nature 1993, 363:263-268. 19:97-106. 3. Hogenauer C, Santa Ana CA, Porter JL, Millard M, Gelfand A, 21. Neel JV: Diabetes Mellitus: a ‘thrifty’ genotype rendered Rosenblatt RL, Prestidge CB, Fordtran JS: Active intestinal detrimental by ‘progress’? Am J Hum Genet 1962, 14:353-362. chloride secretion in human carriers of cystic fibrosis mutations: an evaluation of the hypothesis that heterozygotes 22. Fridlyand LE, Philipson LH: Cold climate genes and the have subnormal active intestinal chloride secretion. Am J Hum prevalence of type 2 diabetes mellitus. Med Hypotheses 2006, Genet 2000, 67:1422-1427. 67:1034-1041. 4. Pier GB, Grout M, Zaidi T, Meluleni G, Mueschenborn SS, 23. Katzmarzyk PT, Leonard WR: Climatic influences on human Banting G, Ratcliff R, Evans MJ, Colledge WH: Salmonella typhi  body size and proportions: ecological adaptations and secular uses CFTR to enter intestinal epithelial cells. Nature 1998, trends. Am J Phys Anthropol 1998, 106:483-503. 393:79-82. A rigorous re-analysis of data on the worldwide distribution of human body size phenotypes, supporting a role for spatially varying selection on 5. Pritchard JK: Are rare variants responsible for susceptibility to a phenotype related to obesity and type 2 diabetes. complex diseases? Am J Hum Genet 2001, 69:124-137. 24. Roberts DF: Climate and Human Variability. Menlo Park: 6. Reich DE, Lander ES: On the allelic spectrum of human disease. Cummings; 1978. Trends Genet 2001, 17:502-510. 25. Gleibermann L: Blood pressure and dietary salt in human 7. Adams AM, Hudson RR: Maximum-likelihood estimation of populations. Ecol Food Nutr 1973, 2:143-156. demographic parameters using the frequency spectrum of unlinked single- polymorphisms. Genetics 2004, 26. Barrett R, Kuzawa CW, McDade T, Armelagos GJ: Emerging and 168:1699-1712. re-emerging infectious diseases: the third epidemiologic transition. Annu Rev Anthropol 1998, 27:247-271. 8. Voight BF, Adams AM, Frisse LA, Qian Y, Hudson RR, Di Rienzo A: Interrogating multiple aspects of variation in a full re- 27. Strachan DP: Hay fever, hygiene, and household size. sequencing data set to infer human population size changes. BMJ 1989, 299:1259-1260. Proc Natl Acad Sci USA 2006, 102:18508-18513. 28. Chimpanzee Sequencing and Analysis Consortium: Initial 9. Wall JD, Przeworski M: When did the human population size sequence of the chimpanzee genome and comparison with start increasing? Genetics 2000, 155:1865-1874. the human genome. Nature 2005, 437:69-87. 10. Reich DE, Cargill M, Bolk S, Ireland J, Sabeti PC, Richter DJ, 29. Di Rienzo A, Hudson RR: An evolutionary framework for Lavery T, Kouyoumjian R, Farhadian SF, Ward R et al.: common diseases: the ancestral-susceptibility model. Linkage disequilibrium in the human genome. Nature 2001, Trends Genet 2005, 21:596-601. 411:199-204. 30. Fullerton SM, Clark AG, Weiss KM, Nickerson DA, Taylor SL, 11. Prugnolle F, Manica A, Balloux F: Geography predicts neutral Stengard JH, Salomaa V, Vartiainen E, Perola M, Boerwinkle E genetic diversity of human populations. Curr Biol 2005, et al.: Apolipoprotein E variation at the sequence haplotype 15:R159-R160. level: implications for the origin and maintenance of a major human . Am J Hum Genet 2000, 67:881-900. 12. Ramachandran S, Deshpande O, Roseman CC, Rosenberg NA, Feldman MW, Cavalli-Sforza LL: Support from the relationship 31. Nakajima T, Wooding S, Sakagami T, Emi M, Tokunaga K, of genetic and geographic distance in human populations for a Tamiya G, Ishigami T, Umemura S, Munkhbat B, Jin F et al.: serial founder effect originating in Africa. Proc Natl Acad Sci and population history in the human USA 2005, 102:15942-15947. Angiotensinogen gene (AGT): 736 complete AGT sequences in chromosomes from around the world. Am J Hum Genet 2004, 13. Rosenberg NA, Pritchard JK, Weber JL, Cann HM, Kidd KK, 74:898-916. Zhivotovsky LA, Feldman MW: Genetic structure of human populations. Science 2002, 298:2381-2385. 32. Thompson EE, Kuttab-Boulos H, Witonsky D, Yang L,  Roe BA, Di Rienzo A: CYP3A variation and the evolution 14. Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN: of salt-sensitivity variants. Am J Hum Genet 2004, Meta-analysis of genetic association studies supports a 75:1059-1069. contribution of common variants to susceptibility to common A study showing a signature of spatially varying selection on risk variants disease. Nat Genet 2003, 33:177-182. for hypertension and variable salt homeostasis, on the basis of allele frequency distributions and patterns of linked neutral variation. 15. Cohen JC, Kiss RS, Pertsemlidis A, Marcel YL, McPherson R,  Hobbs HH: Multiple rare alleles contribute to low plasma levels 33. Vander Molen J, Frisse LM, Fullerton SM, Qian Y, of HDL cholesterol. Science 2004, 305:869-872. del Bosque-Plata L, Hudson RR, Di Rienzo A: www.sciencedirect.com Current Opinion in Genetics & Development 2006, 16:630–636 636 Genomes and evolution

Population genetics of CAPN10 and GPR35: implications for hypertension is due to selection during the out-of-Africa the evolution of type 2 diabetes variants. Am J Hum Genet 2005, expansion. PLoS Genet 2005, 1:e82. 76:548-560. A survey of allele frequencies for variants involved in risk to hypertension, showing a latitudinal cline. The results of this study challenge the notion 34. Wu SJ, Chiang FT, Chen WJ, Liu PH, Hsu KL, Hwang JJ, Lai LP, that increased risk to hypertension is ancestral for all risk variants. Lin JL, Tseng CD, Tseng YZ: Three single-nucleotide polymorphisms of the Angiotensinogen gene and 38. Dowse GK, Zimmet PZ, Finch CF, Collins VR: Decline in susceptibility to hypertension: single locus genotype vs. incidence of epidemic glucose intolerance in Nauruans: haplotype analysis. Physiol Genomics 2004, 17:79-86. implications for the ‘thrifty genotype’. Am J Epidemiol 1991, 133:1093-1104. 35. Lifton RP, Warnock D, Acton RT, Harman L, Lalouel JM: High prevalence of hypertension-associated Angiotensinogen 39. Akey JM, Eberle MA, Rieder MJ, Carlson CS, Shriver MD, variant T235 in African Americans. Clin Res 1993, 41:A260. Nickerson DA, Kruglyak L: Population history and natural selection shape patterns of genetic variation in 132 genes. 36. Ward K, Hata A, Jeunemaitre X, Helin C, Nelson L, Namikawa C, PLoS Biol 2004, 2:E286. Farrington PF, Ogasawara M, Suzumori K, Tomoda S et al.: A molecular variant of angiotensinogen associated with 40. Voight BF, Kudaravalli S, Wen X, Pritchard JK: A map of recent preeclampsia. Nat Genet 1993, 4:59-61. positive selection in the human genome. PLoS Biol 2006, 4:e72. 37. Young JH, Chang YP, Kim JD, Chretien JP, Klag MJ, Levine MA, 41. McVean GA, Spencer CCA: Scanning the human genome for  Ruff CB, Wang NY, Chakravarti A: Differential susceptibility to signals of selection. Curr Opin Genet Dev 2007, in press.

Current Opinion in Genetics & Development 2006, 16:630–636 www.sciencedirect.com