Copyright © 2008 by the Genetics Society of America
DOI: 10.1534/genetics.107.083691

Population Genetic Analysis of the N-Acylsphingosine Amidohydrolase Associated With Mental Activity in Humans

Hie Lim Kim and Yoko Satta¹

Department of Biosystems Science, The Graduate University for Advanced Studies (Sokendai), Hayama, Kanagawa 240-0193, Japan

Manuscript received October 25, 2007
Accepted for publication December 21, 2007

ABSTRACT

To understand the evolution of human mental activity, we performed population genetic analyses of nucleotide sequences (11 kb) of the N-acylsphingosine amidohydrolase (ASAH1) gene from a worldwide sample of 60 chromosomes. ASAH1 hydrolyzes ceramides and regulates neuronal development, and its deficiency often results in mental retardation. In the region (4.4 kb) encompassing exons 3 and 4 of this gene, two distinct lineages (V and M) have been segregating in the human population for 2.4 ± 0.4 million years (MY). The persistence of these two lineages is attributed to ancient population structure of humans in Africa. However, all haplotypes belonging to the V lineage exhibit strong linkage disequilibrium, a high frequency (62%), and small nucleotide diversity (π = 0.05%). These features indicate a signature of positive Darwinian selection for the V lineage. Compared with the orthologs in mammals and birds, it is only Val at amino acid site 72 that is found exclusively in the V lineage in humans, suggesting that this Val is a likely target of positive selection. Computer simulation confirms that demographic models of modern humans, except for the ancient population structure, cannot explain the presence of two distinct lineages, and that neutrality is incompatible with the observed small genetic variation of the V lineage at ASAH1. On the basis of the above observations, it is argued that positive selection is possibly operating on ASAH1 in the modern human population.

Homo sapiens has evolved to adapt to new and diverse environments, showing rapid population expansion since the exodus from Africa, 60–80 thousand years (KY) ago (Watson et al. 1997; Macaulay et al. 2005; Kivisild et al. 2006; Mellars 2006). At the same time, modern humans have acquired specific mental activity (e.g., language, symbols, culture, arts, etc.) (Henshilwood et al. 2002; Mellars 2006; Bouzouggar et al. 2007), a possible driving force for subsequent dispersal around the world (Klein 1999; Mellars 2006). In this process of modern human evolution, it is likely that some genes, especially those related to mental activity, have evolved under natural selection (Nei 1983). Recent studies have reported positively selected genes for mental activity and/or brain development in the human lineage: Abnormal spindle-like microcephaly associated and Microcephalin in relation to brain size (ASPM, Zhang 2003; MCPH1, Evans et al. 2004; Wang and Su 2004, but see Currat et al. 2006; Yu et al. 2007); Dopamine receptor D4 and Monoamine oxidase A in relation to emotional activity (DRD4, Ding et al. 2002; MAOA, Gilad et al. 2002); Forkhead box P2 in relation to language (FOXP2, Enard et al. 2002; Zhang et al. 2002); and Spinocerebellar ataxia type 2 and Pituitary adenylate cyclase-activating polypeptide in relation to neurodegenerative disorders (SCA2, Yu et al. 2005; PACAP, Wang et al. 2005). In addition, many causal genes for several types of mental retardation, possibly related to brain and cognitive development, have recently been reported (Inlow and Restifo 2004; Mervis and Becerra 2007; Schumacher et al. 2007). These genes are further thought to influence mental activity.

Our interest lies in lipid storage diseases (LSDs) such as Gaucher, Tay-Sachs, Farber, and Niemann-Pick diseases, all of which are the result of inherited deficiency of genes whose products are related to sphingolipid metabolism. Deficiency causes intralysosomal accumulation of unmetabolized sphingolipids, ubiquitous components of eukaryotic cell membranes that play important roles in intracellular signaling and membrane structure (Futerman and Hannun 2004; Futerman and Riezman 2005). Sphingolipids regulate neuronal growth rates, differentiation, and death of neurons. This regulation depends on the concentration of sphingolipids in their pathway (Buccoliero et al. 2002; Buccoliero and Futerman 2003).

Ceramides are at the hub of sphingolipid metabolism and serve as the first point of significant sphingolipid accumulation in the de novo pathway (Hannun and Obeid 2002; Merrill 2002). This pathway involves acid

The nucleotide sequence data reported in this article have been deposited in the DDBJ/EMBL/GenBank nucleotide sequence databases under accession nos. AB371370–AB371406.

¹Corresponding author: Department of Biosystems Science, The Graduate University for Advanced Studies (Sokendai), Hayama, Kanagawa 240-0193, Japan. E-mail: [email protected]

Genetics 178: 1505–1515 (March 2008)

ceramidase (ASAH1) (also known as N-acylsphingosine amidohydrolase, ASAH) (AC, MIM 228000, EC 3.5.1.23) (Rother et al. 1992), which hydrolyzes ceramides into sphingosine and free fatty acids (Gatt 1963). Catalysis of ASAH1 is highly related to neuronal development (Schwarz and Futerman 1997; Ruvolo 2003), and inherited deficiency leads to accumulation of ceramides in various tissues, resulting in Farber disease, also known as Farber lipogranulomatosis (Sugita et al. 1972). Farber disease is a rare disorder with an autosomal recessive mode of inheritance. Typical symptoms include painful swelling of joints, hoarseness, and premature death, and depending on the tissues affected by the storage of ceramides, severe nervous system dysfunction (Moser 1995). Several mutations for Farber disease have been reported: a single nucleotide deletion (V96del, Muramatsu et al. 2002) and nine single nonsynonymous mutations (Y36C, V97E, E138V, L182V, T222K, G235R, R254G, N320D, and P362R, Koch et al. 1996; Li et al. 1999; Bär et al. 2001; Muramatsu et al. 2002; Devi et al. 2006). The gene is located on the short arm of chromosome 8 (8p22–p21.3), is 28.5 kb long, appears to be a single-copy gene, and encodes 14 exons. Depending on the splicing pattern, ASAH1 is translated into either 395 or 411 amino acids (NP_808592 and NP_004306).

In this article, we applied the long-range haplotype (LRH) test to eight genes associated with sphingolipid metabolism and found a probable signature of selective sweep in ASAH1. We therefore examined the tempo and mode of ASAH1 evolution in human populations in more detail. Fifteen African and 15 non-African samples were used to analyze genetic variation. The region sequenced is 11 kb long and encompasses exons 3–10, where strong linkage disequilibrium (LD) is manifested. The results of LD and polymorphism analyses suggest that a particular group of haplotypes in ASAH1 represents a signature of recent positive Darwinian selection.

MATERIALS AND METHODS

The LRH test: The LRH test (Sabeti et al. 2002) for eight LSD-associated genes was conducted using the HapMap Project data, which was released in June 2006 (http://www.hapmap.org) (International HapMap Consortium 2005). The eight genes are ASAH1, Glucosidase acid beta (GBA), β-Galactosidase 1 (GLB1), GM2-activator (GM2A), Hexosaminidase A (HEXA), Hexosaminidase B (HEXB), Niemann-Pick disease type C1 (NPC1), and Niemann-Pick disease type C2 (NPC2). For each of Yoruba in Ibadan from Nigeria (YRI) and Northern and Western Europeans in Utah (CEU), the HapMap data of 60 unrelated individuals were analyzed. For Han Chinese from Beijing (CHB) and Japanese from Tokyo (JPT), similar data of 45 and 44 unrelated individuals were used, respectively. To estimate haplotype phases accurately, chromosomes with ≥10% missing genotypes (undetermined genotypes) were excluded. The haplotype phase of these trimmed HapMap data was estimated using the program PHASE v2.1 (Stephens et al. 2001; Stephens and Donnelly 2003). After the phase estimation, haplotypes with a low probability were further excluded. Thus the total number of SNPs and chromosomes used in the LRH test ranges from 83 to 100 and 82 to 120, respectively, as given in supplemental Table 1 at http://www.genetics.org/supplemental/. The extended haplotype homozygosity (EHH) and relative EHH (REHH) in 200 kb surrounding specified genomic regions of interest were measured using the software Sweep 1.0 (Sabeti et al. 2002). The significance of the LRH test results was examined with the simulation program ms (Hudson 2002). In the simulation, neutral polymorphism data in a sample of 120 DNA sequences, each 200 kb in length, were generated without recombination to make the test conservative. One hundred segregating sites were randomly sampled so as to imitate the actual data for the eight LSD-associated genes (supplemental Table 1). One thousand replications were carried out for each of the eight genes. For each gene in each population, the observed and simulated EHH and REHH were compared within the bin that contained haplotypes of the same frequency. The standard deviations of the observed values from the mean in their bin were calculated using the EHH significance calculator option of Sweep 1.0.

DNA samples, PCR, and sequencing: The 30 human genomic DNA samples used in this study come from 15 Africans (10 Pygmies, 2 African Americans, and 3 Yoruba) and 15 non-Africans (4 Amerinds, 5 Europeans, and 6 Asians). The repository numbers of these samples in the Coriell Cell Repositories are NA10470–10473, 10492–10496, 10469, 10965, 10970, 10975, 11197, 11322, 11324, 11373, 11521, 11587, 13597, 13607, 13617–13618, 13820, 13838, 14537, 14661, 18523, 18853, and 19208.

PCR was used to amplify the part of the ASAH1 gene (12.5 kb), ranging from position 17969623 to 17982155 on chromosome 8 (NCBI build 36.2). The primers were designed using the program Primer3 (Rozen and Skaletsky 2000) and are given in supplemental Table 2 at http://www.genetics.org/supplemental/. PCR was performed with 4 pmol of each primer, 150 ng of human genomic DNA, 0.2 mM dNTPs, 0.7 μl of Elongase mix (Invitrogen, Carlsbad, CA), and 4 μl of PCR buffer containing 1.9 mM MgCl2 in a total volume of 20 μl. A RoboCycler Gradient 96 (Stratagene, La Jolla, CA) and TGradient (Whatman Biometra, Goettingen, Germany) were used under the following conditions depending on primer pairs: denaturation at 94° for 2 min followed by 40 amplification cycles of 94° for 30 sec, 55°–59° for 30 sec, and 68° for 10 min, and ending with an extension at 68° for 20 min. The amplified products were purified using ExoSAP-IT (United States Biochemical, Cleveland) and sequenced directly. Except for repeated sequences and nucleotides with low quality peaks, the 11-kb region was used for subsequent analyses. Sequencing reactions were performed using BigDye Terminator v1.1 and v3.1 cycle sequencing kits (Applied Biosystems, Foster City, CA) and analyzed on an ABI PRISM 377, 3100, and 3730 DNA sequencer (Applied Biosystems). To avoid sequencing errors, the PCR products were read at least twice in both directions. The accession numbers of sequences determined in this study are as follows: AB371370–AB371406.

Population analyses: From the DNA sequence data of the 11-kb region of ASAH1, haplotype phases were inferred using the program PHASE v.2.1 (Stephens et al. 2001; Stephens and Donnelly 2003) and fastPHASE 1.0.1 (Scheet and Stephens 2006). All estimated haplotypes were used for further analyses. The nucleotide diversity (π) (Nei and Li 1979) and Tajima's D (Tajima 1989) were computed, and the HKA test (Hudson et al. 1987) was applied using the program DnaSP 4.10 (Rozas et al. 2003). A gene tree for the strong LD region (after excluding two possible recombinants) was constructed and the time to the most recent common ancestor (TMRCA) was estimated using the software Genetree (Griffiths and Tavaré 1995), assuming the effective population size (Ne) of 10^4 and the generation time (g) of 20 years (Takahata 1993; Klein and Takahata 2002). To estimate the TMRCA in an alternative way, the average p distance between the V and M lineages (πB) was used. The chimpanzee sequence was downloaded from the Ensembl database (ENSPTRT00000037110). The average per-site pairwise distance (d = 0.0174 ± 0.0003) between 60 human and one chimpanzee chromosomes was calculated in the strong LD region of 4.4 kb using the program MEGA 3.1 (Kumar et al. 2004). Under the assumption that humans and chimpanzees diverged 5–7 MYA, the between-lineage TMRCA was estimated as (πB/d) × 5–7 MY. The average nucleotide difference between the root haplotype and every individual in the sample was calculated in each of the V and M lineages. The ratio of these differences to d was then used to estimate the within-lineage TMRCA.

Simulations under various demographic models: We assumed a model of a panmictic population with bottleneck or expansion as well as that of both recent and ancient structured populations. Under the assumption of selective neutrality, the software ms (Hudson 2002) efficiently evaluated the extent of neutral variation such as the π-value. Twenty-two different sets of demographic parameters were examined with 50,000 replications for each (supplemental Figure 1 and supplemental Table 3 at http://www.genetics.org/supplemental/) by commonly specifying the following parameters: the number of chromosomes, 60; the number of segregating sites, 49 (the observed value in the region of strong LD); and the generation time (g), 20 years. The other parameters for models followed the previous studies (Takahata 1995; Marth et al. 2004; Voight et al. 2005; Williamson et al. 2005).

To evaluate the effect of an advantageous mutation on the pattern of polymorphism in comparison with neutral cases, a forward simulation was also carried out (supplemental Figure 2 at http://www.genetics.org/supplemental/). An ancestral sequence of L = 1000 bp length was generated at random. Two hundred copies (2N0 = 200) of the sequences evolved according to a finite-site model of neutral mutations and random sampling, and this process was repeated till equilibrium. Each of the two subpopulations (popS and popN) was composed of these 100 sequences (N0 = 2N1 = 100, where N1 is the size of subpopulation). To trace a particular lineage (N lineage) in a subpopulation, a single mutation for labeling is introduced into a single gene in the popN. It was assumed that neutral mutations scaled by N0 occur at a rate of N0Lμ = 0.4 per generation, where μ is the neutral mutation rate per site per generation, followed by migration (at a rate of N0m = 0.1, 0.01, and 0.001, where m is the migration rate per gene) and random sampling of N0 sequences for the next generation. No recombination was assumed. Repeating this process for an additional 10N0 generations, two subpopulations admix with each other and the two become a single panmictic one. At 0.25N0 generations before this unification, a single advantageous mutation of N0s = 50 was introduced into a non-N lineage in the popS and a new lineage (S lineage) is generated. The time of 0.25N0 generations is long enough for the mutation to fix in the popS. The fixation time of a single advantageous mutation with N0s = 50 within a subpopulation is 0.2N0 generations (Takahata 1991). Simulation was terminated when the frequency of the S lineage in the entire population reached 0.7. Then the π-values within the S and N lineages and in the entire population were measured in each replication and 100 such π-values were collected.

For a neutral case, the population size of N0 was set to be 20, because the simulation of the same scheme takes an enormously long time to collect sufficient data. The simulation was ceased at 0.1N0 generations after the unification of two subpopulations. At any segregating site, the lineage with a nucleotide of its frequency 0.7 is defined as the S lineage, while if the frequency is 0.3, the lineage with the nucleotide is defined as the N lineage. The π-values within the S or N lineage and in the entire population were calculated in the same way for the case of selection, and 1000 π-values within each lineage were collected.

Phylogenetic analysis: ASAH1 orthologous DNA sequences in mammals and birds were retrieved from the NCBI and Ensembl databases. The accession numbers of the sequences used are as follows: humans (NM_177924), chimpanzees (ENSPTRT00000037110), orangutans (CR859721), rhesus monkeys (ENSMMUT00000021738), mice (NM_019734), rats (NM_053407), cows (ENSBTAT00000014960), dogs (ENSCAFT00000039154), hedgehogs (ENSETET00000012263), opossums (ENSMODT00000024178), and chickens (NM_001006453). These nucleotide sequences were aligned by Clustal X (Thompson et al. 1997) and the alignment was further checked manually. A neighbor-joining (NJ) tree (Saitou and Nei 1987) was constructed by MEGA 3.1 (Kumar et al. 2004). The number of nonsynonymous sites was counted by a modified Nei–Gojobori method (Nei and Gojobori 1986; Ina 1995; Zhang et al. 1998) with the complete-deletion option.

RESULTS AND DISCUSSION

Linkage disequilibrium of LSD-associated genes: To identify target genes that showed a plausible signature of positive selection, we focused on eight LSD-associated genes whose deficiency clearly results in symptoms of mental retardation. The LRH test (Sabeti et al. 2002) was applied to these genes using the HapMap data of four populations, YRI, CEU, CHB, and JPT (International HapMap Consortium 2005). The test is intended to detect the signature of positive selection, i.e., selective sweep. Positive selection results in such rapid spread of a selected site within a population that recombination does not take place frequently enough to break down LD at nearby sites. A haplotype containing the selected site becomes dominant while maintaining long-range LD (Sabeti et al. 2006), resulting in significantly higher EHH and REHH than under neutrality (Aquadro et al. 1994; Sabeti et al. 2002).

The core region of each of the eight LSD-associated genes is determined separately in YRI, CEU, CHB, and JPT. For each gene, the core region turns out to be almost the same in the four populations; that is, the pattern of LD does not differ among populations. Both the EHH and the REHH of each haplotype in the core region (core haplotype) were measured in either the 200-kb region upstream or downstream from the core region and compared with simulation data under a neutral model without recombination (MATERIALS AND METHODS). Among these 16 surrounding regions of the eight genes, a particular core haplotype including ASAH1 shows a significantly high EHH and REHH in the 200-kb downstream region for all four populations (P < 0.05; Figures 1 and 2, and supplemental Table 5 at http://www.genetics.org/supplemental/). The EHH and REHH

associated with the ASAH1 core haplotype are kept high throughout (Figure 2A), whereas no other core regions chosen in the 200-kb region exhibit such a tendency. The most parsimonious explanation for the sharing of this pattern across all four populations is that the sweep occurred prior to the radiation of modern humans out of Africa.

Coalescence analysis of ASAH1 sequences: About an 11-kb region of the ASAH1 gene was sequenced for 15 Africans and 15 non-Africans. The region includes exons 3–10, which are part of the functional domain, and covers the strongest LD block with significantly high EHH and REHH (Figure 2B). There are 106 segregating sites and 37 haplotypes within this region (Figure 3). The small number of haplotypes relative to the large number of segregating sites directly indicates fairly strong LD in this region. Nonetheless, we classified the region into two in terms of LD, a 4.4-kb subregion under strong LD (SL) and the remaining 6.6-kb subregion under moderately strong LD (ML). With an ancestor sequence being inferred from a chimpanzee sequence, the gene tree constructed for the SL subregion (Figure 4) reveals two distinct lineages. They are distinguished at two nonsynonymous polymorphic sites, M72V and I93V. One lineage possesses Val (derived type) at amino acid site 72 and Ile (ancestral type) at amino acid site 93, whereas the other possesses Met (ancestral type) and Val (derived type) at these sites, respectively. The two lineages are named M and V with respect to this M72V dimorphism and are found in both the African and the non-African samples (supplemental Table 6 at http://www.genetics.org/supplemental/). The TMRCA between the V and M lineages is estimated as 2.4 ± 0.4 MY from the average pairwise nucleotide differences (πB) (MATERIALS AND METHODS). This is relatively old compared to the average TMRCA of 0.8 MY, if Ne = 10^4 and g = 20 years (Takahata 1993; Klein and Takahata 2002). However, there are several reports about genes with TMRCA >2 MY (Harris and Hey 1999; Barreiro et al. 2005; Garrigan et al. 2005a; Stefansson et al. 2005; Garrigan and Hammer 2006; Hayakawa et al. 2006). Except for one gene (Garrigan et al. 2005b), the ancient TMRCA of these genes is attributed to the presence of two distinct lineages in Africa (Satta and Takahata 2004). In particular, Hayakawa et al. (2006) show that the TMRCA at the CMP-N-acetylneuraminic acid hydroxylase (CMAH) is 2.9 MY and suggest that this rather ancient TMRCA may result from partially isolated populations in the Pleistocene period in Africa. To compare the TMRCA at ASAH1 with that at other loci, we applied the HKA test to the nucleotide diversity and divergence at 10 loci including CMAH (Hayakawa et al. 2006) and those at ASAH1, but no significant differences are detected in the test. This suggests that the TMRCA at ASAH1 is not exceptional (supplemental Table 7 at http://www.genetics.org/supplemental/).

The π-value (Nei and Li 1979) in the SL subregion of 60 chromosomes is high (0.37 ± 0.02%; Table 1), more than four times higher than the average value of 0.08% (Sachidanandam et al. 2001). We examined by computer simulation whether this large π is compatible with the demographic models that have been proposed so far for human evolution (see MATERIALS AND METHODS, and supplemental Figure 1). The probability of π > 0.37% from simulated polymorphism data was estimated under neutrality. The results of 50,000 replications showed that the probability was <0.03 except for the ancient population-structure model (supplemental Table 4 at http://www.genetics.org/supplemental/). It should be noted that this ancient population-structure model (supplemental Figure 1H) is consistent with π-values at other loci, i.e., more than half of the 10 loci used for the HKA test (data not shown).

The frequency of V and M in the total sample is 0.62 and 0.38, respectively (supplemental Table 6). Under the assumption of Hardy–Weinberg equilibrium, the expected heterozygosity with V and M is 0.47. However, the observed value in the sample is 0.23, which is significantly lower than the expectation (P < 0.005, chi-square test). This heterozygosity deficiency is also observed in both the African and non-African samples. Under overdominance selection, we may also expect an excess of heterozygotes over Hardy–Weinberg equilibrium, although the extent is not necessarily large if mating occurs at random every generation. The deficiency is inconsistent with the possibility of overdominance that might maintain two allelic lineages for a long time (Takahata 1990). Alternatively, this deficiency may suggest that the V and M lineages have been maintained in a partially isolated subpopulation until the exodus from Africa. When migration is limited, it is likely that genes within each subpopulation coalesce to a common ancestor and form a single cluster in a tree (Takahata 1991). Subpopulation-specific lineages tend to be reciprocally monophyletic and independent mutations

Figure 1.—The relative extended haplotype homozygosity (REHH) of ASAH1 genes. The REHH values (y-axis) are plotted against the core haplotype frequency (x-axis). Observed REHHs were compared with simulated data with 1000 replications. Brown dots represent simulation results, and blue dots represent the observations of each of the four populations.

Figure 2.—A map of genes and SNPs, REHH by distance, and a LD plot of ASAH1. (A) Blue rectangles indicate genes in the 200-kb region, with the official symbol of the gene shown on the rectangles. Under the genes, blue bars indicate the SNPs used for the LRH test and dark blue bars represent SNPs in the core region. The x-axis indicates the distance from the core region and the y-axis, the REHH values of each haplotype in the region. Core haplotype sequences are shown in supplemental Figure 3A at http://www.genetics.org/supplemental/. The data presented are based on the CEU panel from the HapMap. (B) A LD plot was drawn using the software Haploview 3.32 (Barrett et al. 2005) and analyzed on the basis of HapMap genotype data in which SNPs were shared in all four populations. The black line expresses the gene structure of ASAH1, and on the line, the dark blue rectangle represents the region sequenced in this study. The black bar in the light blue rectangle shows the location of the SNPs used for the LD analysis, with the reference SNP identification number indicated under each SNP. The triangle represents the LD plot calculated from these SNPs. Numbers in squares represent D′ values (×100, Lewontin 1964) and the color of each square expresses the extent of LD: bright red, LOD (Morton 1955) ≥ 2 and D′ = 1; red and shades of pink, LOD ≥ 2 and D′ < 1; blue, LOD < 2 and D′ = 1; and white, LOD < 2 and D′ < 1. The triangle surrounded by a thick black line shows the LD block according to the definition of Gabriel et al. (2002).

can accumulate in a lineage-specific manner. This pattern is consistent with the observed tree topology. In this context, it should be noted that the pattern of genetic diversity of ASAH1 and other loci is compatible with the proposal that the human population was once geographically structured and genetically differentiated in Africa (Takahata 1995; Satta and Takahata 2004).

ASAH1 polymorphism: The value of nucleotide diversity (π) within the V lineage (πVSL = 0.05 ± 0.01%; Table 1) is smaller than the overall π-value on chromosome 8 (0.08%) (Sachidanandam et al. 2001), on which ASAH1 is located. More importantly, the πVSL value is significantly smaller than the nucleotide diversity within the M lineage (πMSL = 0.13 ± 0.02%). Furthermore, the number of haplotypes in the V lineage is only 6, yet it is 11 in the M lineage. These hold true in both African and non-African samples (Table 1). We can therefore expect that the TMRCA within the V lineage is younger than that within the M lineage as far as the SL subregion is concerned (Figure 4). Indeed, the TMRCA of the V lineage is estimated as 200 ± 50 KY from Genetree analysis (Griffiths and Tavaré 1995) and 340 ± 80 KY on the basis of the average nucleotide diversity (MATERIALS AND METHODS). On the other hand, the TMRCA of the M lineage is 320 ± 70 KY from the Genetree analysis and 680 ± 180 KY from the nucleotide diversity. Compared with the M lineage, the relatively recent origin of the predominant V lineage implies that it has been rapidly increasing in frequency. In accordance with this, Tajima's D value is negative (−0.11) for the V lineage and positive (0.27) for the M lineage, although neither is statistically different from zero (P > 0.1) (Tajima 1989).

To see if the reduced level of polymorphism within the V lineage is restricted to the SL subregion, the π-value is compared with that of the ML subregion defined as moderately strong LD (Figure 5). Possible recombination in the ML subregion does not allow us to make phylogenetic analysis and the concept of lineage per se becomes equivocal. For this reason, we simply defined the V and M lineages in the ML subregion as those that are linked with V and M in the SL subregion. We then calculated the nucleotide diversity as πVML = 0.10 ± 0.02% and πMML = 0.18 ± 0.02%. They are not significantly different from each other (P > 0.05, Z-test), but πVML is significantly larger than πVSL (P < 0.01, Z-test; Table 1).

As mentioned earlier, the frequency of the V lineage in the total sample is higher (0.62) than that of the M lineage (0.38). The predominance of the V lineage is observed in both Africans and non-Africans (supplemental Table 6). The HapMap data also give similar results: 0.83 in YRI, 0.56 in CEU, and 0.67 in both CHB and JPT. Despite the relatively high frequency of the V lineage irrespective of data and populations, the π-value is smaller and the within-lineage TMRCA is shorter than the corresponding value in the M lineage. All these features are consistently explained by positive Darwinian selection operating for the V lineage. In addition, it should be noted that this small genetic diversity was limited to the SL subregion, suggesting that the target of the selection is located within the subregion.

Lineage-specific amino acid changes: We have also attempted to identify a target site of positive selection. In the SL subregion of ASAH1, there are 46 segregating sites in introns, one synonymous and two nonsynonymous segregating sites. The synonymous mutation is observed in only one chromosome (V040), whereas two nonsynonymous mutations (M72V and I93V) are "fixed" within each lineage. Regarding M72V, the Val is a derived and human-specific amino acid, because in chimpanzees, orangutans, and rhesus monkeys the amino acid at this site is exclusively occupied by Met. Regarding I93V, the Val is shared by rhesus monkeys, but the site is occupied by Ile in chimpanzees and orangutans (Figure 6). It is likely that site 93 is subject to recurrent substitution to Val.

The SL subregion contains exon 3 (31 amino acids) and exon 4 (29 amino acids). Although these exons do

Figure 3.—Segregating sites in 37 haplotypes and their frequencies in a worldwide sample of 60 chromosomes. Dots indicate nucleotides that are identical to those in the top line sequence. The number above the sequence indicates the position of each site, and sites in the coding region are represented by the larger font size. Nucleotides with an asterisk (*) indicate a nonsynonymous polymorphic site; the amino acid substitutions at these sites are shown above the top line sequence. Sequences in the rectangle are in complete linkage disequilibrium except for two haplotypes (M090 and M100). The numbers in the far right-hand column indicate the frequency of each haplotype.

TABLE 1
Genetic diversity at ASAH1 in a worldwide sample of 60 chromosomes

              Lineageᵃ   Nᵇ    Sᶜ    Hᵈ    πᵉ
Worldwide     All        60    49    17    0.37 (0.024)
              V          37     9     6    0.05 (0.006)
              M          23    19    11    0.13 (0.017)
African       All        30    44    11    0.36 (0.043)
              V          20     8     5    0.06 (0.010)
              M          10    12     6    0.13 (0.016)
Non-African   All        30    41     9    0.39 (0.026)
              V          17     4     3    0.04 (0.005)
              M          13    16     6    0.11 (0.034)

ᵃ Defined by V and M at site 72.
ᵇ Number of sequences.
ᶜ Number of segregating sites.
ᵈ Number of haplotypes.
ᵉ Percentage of nucleotide diversity per site (standard deviation).

Figure 4.—A gene tree of 13 human ASAH1 haplotypes in the SL subregion. The chimpanzee sequence was used to root the tree. Solid circles represent nucleotide substitutions. The haplotype names at the tip of the tree are provided in supplemental Table 6. The numbers below the haplotype names represent the frequency of each haplotype on the 58 examined chromosomes, excluding two recombinant sequences. Since the nucleotide at site 32 (Figure 3) in M080 is the same as that in the chimpanzee, parallel substitution was thought to have occurred at this site; the site is incompatible with others and therefore excluded from the tree. After exclusion of this site, M080 in the SL subregion is identical to M2. For the method of TMRCA estimation, see MATERIALS AND METHODS.

not encode the active center of ASAH1, mutations in these exons are associated with Farber disease. Three such nonsynonymous mutations are Y36C, V96del, and V97E (Bär et al. 2001; Muramatsu et al. 2002). It thus appears that these exons encode important functional parts of ASAH1. It is interesting to note that the 659-bp region encompassing M72V did not accumulate any mutations within the V lineage (Figure 5). This is again a signature of selective sweep on the region encompassing Val at site 72, suggesting the amino acid might be a likely target of positive selection for the V lineage.

Theoretical considerations of natural selection operating on ASAH1: The above results show that positive selection is likely to have favored the V lineage commonly sharing Val at amino acid site 72 and showing reduction of genetic diversity in the lineage despite its predominance (the large population size). To examine the relationship between π within a lineage and population size with and without selection, we carried out computer simulations under the ancient population-structure model (MATERIALS AND METHODS; supplemental Figure 2). To check the validity of the computer program for the model, we compare simulation results with the theoretical expectation for neutral cases.

To obtain the theoretical formulas for the coalescence time of two neutral genes at a locus, we use Equation 4 for a finite island model in Takahata (1995). The model assumes that the ancestral panmictic population of effective size M has been subdivided into l island populations with effective size N1 for ti generations. Our model is different in that the subpopulations have further admixed to form a panmictic population of effective size N0 for subsequent tm generations to the present. We sample two genes from this panmictic population and derive the formula of the mean coalescence time. Although the formula is complicated for any values of l, M, N1, and N0, it is much simplified for the case of l = 2, M = N0 = 2N1, and this simplified formula is sufficient for our present purpose.

The coalescence can occur in the admixed population during the period of tm generations. This conditional TMRCA, T0, is readily given by

\[
T_0 = \int_0^{t_m} t \, \frac{1}{2N_0} \, e^{-t/(2N_0)} \, dt
    = 2N_0 \left[ 1 - \left( 1 + \frac{t_m}{2N_0} \right) a \right], \tag{1}
\]

where $a = e^{-t_m/(2N_0)}$. On the other hand, if two genes do not coalesce during tm, the two ancestral gene lineages reside in the same subpopulation with probability P(0) and in two different subpopulations with probability Q(0) in Takahata's designations. Obviously, P(0) = Q(0) = 1/2 in the present model with l = 2. Equation 4 in Takahata (1995) then reduces to

\[
T_1 = 4N_1 + \frac{1}{4m} - \frac{Q(t_i)}{2m}, \tag{2}
\]

where

\[
Q(t_i) = \frac{1}{2R_2} \left[ \frac{R_1 + R_2}{2} \, e^{(R_2 - R_1) t_i}
       + \frac{R_2 - R_1}{2} \, e^{-(R_1 + R_2) t_i} \right],
\]

Figure 5.—Window analysis of p in the se- quenced region of 11 kb in length. The p-values were calculated in a window size of 500 bp with a step size of 250 bp. The x-axis indicates the dis- tance in the sequenced region and the y-axis, the p-value of each window. (A) The p-values ob- tained from a sample of 60 sequences. (B) The p- values within the V lineage. The arrow indicates the location of the M72V polymorphism. (C) The p-values within the M lineage.

rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi Under neutrality, simulation assumes N Lm ¼ 0.4, L ¼ 1 2 m 0 R1 ¼ 2m 1 and R2 ¼ R1 : 1000, and N ¼ 20. In a wide range of N m, p in 4N1 N1 0 0 Equations 4a and 4b agrees well with that in simulation Noting that this coalescence is conditional with proba- (Table 2). The same simulation also demonstrates that bility a and from Equations 1 and 2, we obtain the for- the p-value within the predominant lineage (S lineage) is mula of the unconditional coalescence time T ¼ T0 1 significantly larger than that within the other lineage (N aT1: lineage), even though with limited gene flow (4N0m ¼ , T ¼ 2N0 að2N0 1 tm T1Þ: ð3Þ 0.004–0.4) (Table 2; t-test, P 1E 130). Thus neutrality cannot explain the observed reduction of nucleotide For 4N0m . 1, T approaches to 2N0 generations, diversity in the V lineage in the SL subregion. On the whereas for small 4N0m, T becomes large in propor- other hand, in the presence of selection, the mean and tion to the reciprocal of m. variance of nucleotide diversity in the S lineage are Finally, to define the effective population size and the significantly smaller than those in the N lineage with any expected nucleotide diversity p for the present non- tested migration rate (Table 2; F-test, P , 1E 10; t-test, equilibrium demographic model, we set Ne ¼ T/2 as in P , 1E 6). Selective sweep can thus result in reduced Nei and Takahata (1993) and obtain nucleotide diversity within a lineage that carries an a advantageous mutation. This pattern is exactly what we N ¼ N ð2N 1 t T Þð4aÞ e 0 2 0 m 1 observed at ASAH1. Conclusions: The pattern and level of genetic vari- ability at ASAH1 cannot be explained by demographic p ¼ 4Ne m: ð4bÞ causes alone and/or overdominance selection. Rather, ASAH1 Polymorphism and Human Evolution 1513

Figure 6.—Alignment of ASAH1 orthologs from birds and mammals in the SL subregion (A) and the NJ tree obtained on the ba- sis of nonsynonymous sub- stitutions in the entire ASAH1 of these orthologs (B). (A) Dots indicate amino acids that are identi- cal to those in the top line sequence. Dashes represent a deletion. Boldface type in- dicates amino acid substitu- tions in the human lineage compared with chimpan- zees and birds and other mammals. (B) Numbers near the node indicate the bootstrap probability in 500 replications. The scale bar below the tree repre- sents the number of nonsy- nonymous substitutions per site. The nonsynonymous sites of 751 bp were com- pared. The chicken se- quence was used as an outgroup. Uppercase letters next to the species name indicate the amino acid at position 72.

they suggest positive Darwinian selection operating on curred and the V lineage has since spread in the entire the predominant V lineage of a relatively recent origin. population by possible positive selection. Although there It is likely that the human-specific Val at amino acid site is no functional assay of the Val substitution at present, 72 is a possible target for positive Darwinian selection the Val may influence the enzyme activity of ASAH1, and may have occurred before the most recent common thereby affecting catalysis of ceramides and possibly neu- ancestor of all V lineage haplotypes, 200–340 KYA. It is ronal development as well. Further comprehension of speculated that, when modern humans dispersed from ASAH1 and other genes could shed light on the evolution Africa, admixture of the distinct V and M lineages oc- of mental activity in modern human evolution.
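The window analysis described in the Figure 5 legend (500-bp windows moved in 250-bp steps) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `nucleotide_diversity` and `window_pi` are hypothetical, π is taken here as the mean number of pairwise differences per site, and the aligned sequences are assumed to have equal length with no gaps or missing data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site among aligned sequences.

    Assumption: all sequences have the same length and contain no gaps.
    """
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or length == 0:
        return 0.0
    # Count mismatching sites over every unordered pair of sequences.
    total_diff = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diff / (n_pairs * length)


def window_pi(seqs, window=500, step=250):
    """Return (start, pi) for every full window along the alignment."""
    length = len(seqs[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]
```

Restricting `seqs` to the haplotypes of one lineage would give within-lineage profiles analogous to panels B and C; multiplying the values by 100 expresses them as percentages, as in Table 1.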

TABLE 2
The distribution of nucleotide diversity under an ancient population-structure model with several migration rates

                       Total                  S lineage        N lineage
             N0m       Mean^a        SD^b     Mean     SD      Mean     SD
Neutral      0.1       0.33 (0.29)   0.17     0.18     0.15    0.05     0.08
             0.01      0.46 (0.47)   0.17     0.25     0.19    0.05     0.06
             0.001     0.51 (0.51)   0.16     0.28     0.21    0.05     0.07
Selection    0.1       0.37          0.18     0.01     0.01    0.12     0.07
             0.01      0.38          0.12     0.01     0.02    0.07     0.06
             0.001     0.38          0.15     0.01     0.02    0.07     0.06

a Percentage of mean of nucleotide diversity (theoretical expectation).
b Percentage of standard deviation of nucleotide diversity.
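The theoretical expectations shown in parentheses in Table 2 follow from Equations 1–4b. The sketch below evaluates those equations for the simplified case l = 2 and M = N0 = 2N1. The function names are hypothetical, and the durations `t_i` and `t_m` must be supplied by the user: their values are not stated in this section, so the code makes no attempt to reproduce the numbers in Table 2.

```python
import math


def q_different(t_i, m, n1):
    """Q(t_i): probability that two uncoalesced lineages are in different
    subpopulations after t_i generations, starting from P(0) = Q(0) = 1/2."""
    r1 = 2 * m + 1 / (4 * n1)
    r2 = math.sqrt(r1 ** 2 - m / n1)
    return (
        (r1 + r2) / 2 * math.exp((r2 - r1) * t_i)
        + (r2 - r1) / 2 * math.exp(-(r1 + r2) * t_i)
    ) / (2 * r2)


def expected_pi(n0, m, t_i, t_m, mu):
    """Evaluate Equations 1-4b for l = 2 and M = N0 = 2 * N1."""
    n1 = n0 / 2
    a = math.exp(-t_m / (2 * n0))
    t0 = 2 * n0 * (1 - (1 + t_m / (2 * n0)) * a)                    # Equation 1
    t1 = 4 * n1 + 1 / (4 * m) - q_different(t_i, m, n1) / (2 * m)   # Equation 2
    t = 2 * n0 - a * (2 * n0 + t_m - t1)                            # Equation 3
    ne = t / 2                                                      # Equation 4a
    return {"T0": t0, "T1": t1, "T": t, "Ne": ne, "pi": 4 * ne * mu}  # Equation 4b
```

With the simulation settings given in the text (N0 = 20, L = 1000, N0Lμ = 0.4), the per-site mutation rate is μ = 0.4/(20 × 1000) = 2 × 10^-5, and N0m = 0.1 corresponds to m = 0.005. A call such as `expected_pi(20, 0.005, t_i, t_m, 2e-5)` then returns π per site for the chosen, user-supplied values of `t_i` and `t_m`.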

We thank Naoyuki Takahata, Tatsuya Ota, and Andrew G. Clark for their critical reading of the manuscript and for their numerous suggestions and three anonymous reviewers for their helpful comments on an early version of the manuscript. This study was supported in part by an intramural grant from the Hayama Center for Advanced Studies.

LITERATURE CITED

Aquadro, C. F., D. J. Begun and E. C. Kindahl, 1994 Non-neutral evolution, pp. 46–56 in Selection, Recombination and DNA Polymorphism in Drosophila, edited by B. Golding. Chapman & Hall, London.
Bär, J., T. Linke, K. Ferlinz, U. Neumann, E. H. Schuchman et al., 2001 Molecular analysis of acid ceramidase deficiency in patients with Farber disease. Hum. Mutat. 17: 199–209.
Barreiro, L. B., E. Patin, O. Neyrolles, H. M. Cann, B. Gicquel et al., 2005 The heritage of pathogen pressures and ancient demography in the human innate-immunity CD209/CD209L region. Am. J. Hum. Genet. 77: 869–886.
Barrett, J. C., B. Fry, J. Maller and M. J. Daly, 2005 Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 15: 263–265.
Bouzouggar, A., N. Barton, M. Vanhaeren, F. d'Errico, S. Collcutt et al., 2007 82,000-year-old shell beads from North Africa and implications for the origins of modern human behavior. Proc. Natl. Acad. Sci. USA 104: 9964–9969.
Buccoliero, R., and A. H. Futerman, 2003 The roles of ceramide and complex sphingolipids in neuronal cell function. Pharmacol. Res. 47: 409–419.
Buccoliero, R., J. Bodennec and A. H. Futerman, 2002 The role of sphingolipids in neuronal development: lessons from models of sphingolipid storage diseases. Neurochem. Res. 27: 565–574.
Currat, M., L. Excoffier, W. Maddison, S. P. Otto, N. Ray et al., 2006 Comment on "Ongoing adaptive evolution of ASPM, a brain size determinant in Homo sapiens" and "Microcephalin, a gene regulating brain size, continues to evolve adaptively in humans." Science 313: 172.
Devi, A. R. R., M. Gopikrishna, R. Ratheesh, G. Savithri, G. Swarnalata et al., 2006 Farber lipogranulomatosis: clinical and molecular genetic analysis reveals a novel mutation in an Indian family. J. Hum. Genet. 51: 811–814.
Ding, Y. C., H. C. Chi, D. L. Grady, A. Morishima, J. R. Kidd et al., 2002 Evidence of positive selection acting at the human dopamine receptor D4 gene locus. Proc. Natl. Acad. Sci. USA 99: 309–314.
Enard, W., M. Przeworski, S. E. Fisher, C. S. Lai, V. Wiebe et al., 2002 Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418: 869–872.
Evans, P. D., J. R. Anderson, E. J. Vallender, S. S. Choi and B. T. Lahn, 2004 Reconstructing the evolutionary history of microcephalin, a gene controlling human brain size. Hum. Mol. Genet. 13: 1139–1145.
Futerman, A. H., and Y. A. Hannun, 2004 The complex life of simple sphingolipids. EMBO Rep. 5: 777–782.
Futerman, A. H., and H. Riezman, 2005 The ins and outs of sphingolipid synthesis. Trends Cell Biol. 15: 312–318.
Gabriel, S. B., S. F. Schaffner, H. Nguyen, J. M. Moore, J. Roy et al., 2002 The structure of haplotype blocks in the human genome. Science 296: 2225–2229.
Garrigan, D., and M. F. Hammer, 2006 Reconstructing human origins in the genomic era. Nat. Rev. Genet. 7: 669–680.
Garrigan, D., Z. Mobasher, S. B. Kingan, J. A. Wilder and M. F. Hammer, 2005a Deep haplotype divergence and long-range linkage disequilibrium at xp21.1 provide evidence that humans descend from a structured ancestral population. Genetics 170: 1849–1856.
Garrigan, D., Z. Mobasher, T. Severson, J. A. Wilder and M. F. Hammer, 2005b Evidence for archaic Asian ancestry on the human X chromosome. Mol. Biol. Evol. 22: 189–192.
Gatt, S., 1963 Enzymic hydrolysis and synthesis of ceramides. J. Biol. Chem. 238: 3131–3133.
Gilad, Y., S. Rosenberg, M. Przeworski, D. Lancet and K. Skorecki, 2002 Evidence for positive selection and population structure at the human MAO-A gene. Proc. Natl. Acad. Sci. USA 99: 862–867.
Griffiths, R. C., and S. Tavaré, 1995 Unrooted genealogical tree probabilities in the infinitely-many-sites model. Math. Biosci. 127: 77–98.
Hannun, Y. A., and L. M. Obeid, 2002 The ceramide-centric universe of lipid-mediated cell regulation: stress encounters of the lipid kind. J. Biol. Chem. 277: 25847–25850.
Harris, E. E., and J. Hey, 1999 X chromosome evidence for ancient human histories. Proc. Natl. Acad. Sci. USA 96: 3320–3324.
Hayakawa, T., I. Aki, A. Varki, Y. Satta and N. Takahata, 2006 Fixation of the human-specific CMP-N-acetylneuraminic acid hydroxylase pseudogene and implications of haplotype diversity for human evolution. Genetics 172: 1139–1146.
Henshilwood, C. S., F. d'Errico, R. Yates, Z. Jacobs, C. Tribolo et al., 2002 Emergence of modern human behavior: Middle Stone Age engravings from South Africa. Science 295: 1278–1280.
Hudson, R. R., 2002 Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 18: 337–338.
Hudson, R. R., M. Kreitman and M. Aguadé, 1987 A test of neutral molecular evolution based on nucleotide data. Genetics 116: 153–159.
Ina, Y., 1995 New methods for estimating the numbers of synonymous and nonsynonymous substitutions. J. Mol. Evol. 40: 190–226.
Inlow, J. K., and L. L. Restifo, 2004 Molecular and comparative genetics of mental retardation. Genetics 166: 835–881.
International HapMap Consortium, 2005 A haplotype map of the human genome. Nature 437: 1299–1320.
Kivisild, T., P. Shen, D. P. Wall, B. Do, R. Sung et al., 2006 The role of selection in the evolution of human mitochondrial genomes. Genetics 172: 373–387.
Klein, R. G., 1999 Anatomically modern humans, pp. 495–573 in The Human Career: Human Biological and Cultural Origins, Ed. 2. The University of Chicago Press, Chicago.
Klein, J., and N. Takahata, 2002 Where Do We Come From? The Molecular Evidence for Human Descent. Springer-Verlag, Berlin.
Koch, J., S. Gärtner, C. M. Li, L. E. Quintern, L. Bernardo et al., 1996 Molecular cloning and characterization of a full-length complementary DNA encoding human acid ceramidase. Identification of the first molecular lesion causing Farber disease. J. Biol. Chem. 271: 33110–33115.
Kumar, S., K. Tamura and M. Nei, 2004 MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform. 5: 150–163.
Lewontin, R. C., 1964 The interaction of selection and linkage. I. General considerations; heterotic models. Genetics 49: 49–67.
Li, C. M., J. H. Park, X. He, B. Levy, F. Chen et al., 1999 The human acid ceramidase gene (ASAH): structure, chromosomal location, mutation analysis, and expression. Genomics 62: 223–231.
Macaulay, V., C. Hill, A. Achilli, C. Rengo, D. Clarke et al., 2005 Single, rapid coastal settlement of Asia revealed by analysis of complete mitochondrial genomes. Science 308: 1034–1036.
Marth, G. T., E. Czabarka, J. Murvai and S. T. Sherry, 2004 The allele frequency spectrum in genome-wide human variation data reveals signals of differential demographic history in three large world populations. Genetics 166: 351–372.
Mellars, P., 2006 Why did modern human populations disperse from Africa ca. 60,000 years ago? A new model. Proc. Natl. Acad. Sci. USA 103: 9381–9386.
Merrill, Jr., A. H., 2002 De novo sphingolipid biosynthesis: a necessary, but dangerous, pathway. J. Biol. Chem. 277: 25843–25846.
Mervis, C. B., and A. M. Becerra, 2007 Language and communicative development in Williams syndrome. MRDD Res. Rev. 13: 3–15.
Morton, N. E., 1955 Sequential tests for the detection of linkage. Am. J. Hum. Genet. 7: 277–318.
Moser, H. W., 1995 Ceramidase deficiency: Farber lipogranulomatosis, pp. 2589–2599 in The Metabolic Basis of Inherited Disease, Ed. 7, edited by C. R. Scriver, A. L. Beaudet, W. S. Sly and D. Valle. McGraw-Hill, New York.
Muramatsu, T., N. Sakai, I. Yanagihara, M. Yamada, T. Nishigaki et al., 2002 Mutation analysis of the acid ceramidase gene in Japanese patients with Farber disease. J. Inherit. Metab. Dis. 25: 585–592.
Nei, M., 1983 Genetic polymorphism and the role of mutation in evolution, pp. 165–190 in Evolution of Genes and Proteins, edited by M. Nei and R. K. Koehn. Sinauer Associates, Sunderland, MA.
Nei, M., and T. Gojobori, 1986 Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3: 418–426.

Nei, M., and W. H. Li, 1979 Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. USA 76: 5269–5273.
Nei, M., and N. Takahata, 1993 Effective population size, genetic diversity, and coalescence time in subdivided populations. J. Mol. Evol. 37: 240–244.
Rother, J., G. van Echten, G. Schwarzmann and K. Sandhoff, 1992 Biosynthesis of sphingolipids: dihydroceramide and not sphinganine is desaturated by cultured cells. Biochem. Biophys. Res. Commun. 189: 14–20.
Rozas, J., J. C. Sanchez-DelBarrio, X. Messeguer and R. Rozas, 2003 DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19: 2496–2497.
Rozen, S., and H. Skaletsky, 2000 Primer3 on the WWW for general users and for biologist programmers, pp. 365–386 in Bioinformatics Methods and Protocols: Methods in Molecular Biology, edited by S. Krawetz and S. Misener. Humana Press, Totowa, NJ.
Ruvolo, P. P., 2003 Intracellular signal transduction pathways activated by ceramide and its metabolites. Pharmacol. Res. 47: 383–392.
Sabeti, P. C., D. E. Reich, J. M. Higgins, H. Z. Levine, D. J. Richter et al., 2002 Detecting recent positive selection in the human genome from haplotype structure. Nature 419: 832–837.
Sabeti, P. C., S. F. Schaffner, B. Fry, J. Lohmueller, P. Varilly et al., 2006 Positive natural selection in the human lineage. Science 312: 1614–1620.
Sachidanandam, R., D. Weissman, S. C. Schmidt, J. M. Kakol, L. D. Stein et al., 2001 A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409: 928–933.
Saitou, N., and M. Nei, 1987 The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406–425.
Satta, Y., and N. Takahata, 2004 The distribution of the ancestral haplotype in finite stepping-stone models with population expansion. Mol. Ecol. 13: 877–886.
Scheet, P., and M. Stephens, 2006 A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am. J. Hum. Genet. 78: 629–644.
Schumacher, J., P. Hoffmann, C. Schmäl, G. Schulte-Körne and M. M. Nöthen, 2007 Genetics of dyslexia: the evolving landscape. J. Med. Genet. 44: 289–297.
Schwarz, A., and A. H. Futerman, 1997 Distinct roles for ceramide and glucosylceramide at different stages of neuronal growth. J. Neurosci. 17: 2929–2938.
Stefansson, H., A. Helgason, G. Thorleifsson, V. Steinthorsdottir, G. Masson et al., 2005 A common inversion under selection in Europeans. Nat. Genet. 37: 129–137.
Stephens, M., and P. Donnelly, 2003 A comparison of Bayesian methods for haplotype reconstruction from population genotype data. Am. J. Hum. Genet. 73: 1162–1169.
Stephens, M., N. J. Smith and P. Donnelly, 2001 A new statistical method for haplotype reconstruction from population data. Am. J. Hum. Genet. 68: 978–989.
Sugita, M., J. T. Dulaney and H. W. Moser, 1972 Ceramidase deficiency in Farber's disease (lipogranulomatosis). Science 178: 1100–1102.
Tajima, F., 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585–595.
Takahata, N., 1990 A simple genealogical structure of strongly balanced allelic lines and trans-species evolution of polymorphism. Proc. Natl. Acad. Sci. USA 87: 2419–2423.
Takahata, N., 1991 Genealogy of neutral genes and spreading of selected mutations in a geographically structured population. Genetics 129: 585–595.
Takahata, N., 1993 Allelic genealogy and human evolution. Mol. Biol. Evol. 10: 2–22.
Takahata, N., 1995 A genetic perspective on the origin and history of humans. Annu. Rev. Ecol. Syst. 26: 343–372.
Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin and D. G. Higgins, 1997 The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25: 4876–4882.
Voight, B. F., A. M. Adams, L. A. Frisse, Y. Qian, R. R. Hudson et al., 2005 Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc. Natl. Acad. Sci. USA 102: 18508–18513.
Wang, Y. Q., and B. Su, 2004 Molecular evolution of microcephalin, a gene determining human brain size. Hum. Mol. Genet. 13: 1131–1137.
Wang, Y. Q., Y. P. Qian, S. Yang, H. Shi, C. H. Liao et al., 2005 Accelerated evolution of the pituitary adenylate cyclase-activating polypeptide precursor gene during human origin. Genetics 170: 801–806.
Watson, E., P. Forster, M. Richards and H. J. Bandelt, 1997 Mitochondrial footprints of human expansions in Africa. Am. J. Hum. Genet. 61: 691–704.
Williamson, S. H., R. Hernandez, A. Fledel-Alon, L. Zhu, R. Nielsen et al., 2005 Simultaneous inference of selection and population growth from patterns of variation in the human genome. Proc. Natl. Acad. Sci. USA 102: 7882–7887.
Yu, F., P. C. Sabeti, P. Hardenbol, Q. Fu, B. Fry et al., 2005 Positive selection of a pre-expansion CAG repeat of the human SCA2 gene. PLoS Genet. 1: e41.
Yu, F., R. S. Hill, S. F. Schaffner, P. C. Sabeti, E. T. Wang et al., 2007 Comment on "Ongoing adaptive evolution of ASPM, a brain size determinant in Homo sapiens." Science 316: 370.
Zhang, J., 2003 Evolution of the human ASPM gene, a major determinant of brain size. Genetics 165: 2063–2070.
Zhang, J., H. F. Rosenberg and M. Nei, 1998 Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc. Natl. Acad. Sci. USA 95: 3708–3713.
Zhang, J., D. M. Webb and O. Podlaha, 2002 Accelerated protein evolution and origins of human-specific features: Foxp2 as an example. Genetics 162: 1825–1835.

Communicating editor: A. Di Rienzo