Haplotypes and Linkage Disequilibrium at the Phenylalanine Hydroxylase Locus, PAH, in a Global Representation of Populations Judith R
Total Page:16
File Type:pdf, Size:1020Kb
Am. J. Hum. Genet. 66:1882±1899, 2000 Haplotypes and Linkage Disequilibrium at the Phenylalanine Hydroxylase Locus, PAH, in a Global Representation of Populations Judith R. Kidd,1 Andrew J. Pakstis,1 Hongyu Zhao,2 Ru-Band Lu,4 Friday E. Okonofua,5 Adekunle Odunsi,6 Elena Grigorenko,3, Batsheva Bonne-Tamir,7 Jonathan Friedlaender,8 Leslie O. Schulz,9 Josef Parnas,10 and Kenneth K. Kidd1 Departments of 1Genetics, 2Epidemiology and Public Health, and 3Psychology and Child Study Center, Yale University, New Haven, CT; 4Department of Psychiatry, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan; 5University of Benin, Faculty of Medicine, Benin City, Nigeria; 6Department of Gynecological Oncology, Roswell Park Cancer Institute, Buffalo; 7Department of Genetics, Sackler School of Medicine, Tel Aviv University, Tel Aviv; 8Department of Anthropology, Temple University, Philadelphia; 9Department of Health Sciences, University of Wisconsin, Milwaukee; and 10Institute of Preventative Medicine, Kommune Hospitalet, Copenhagen Because defects in the phenylalanine hydroxylase gene (PAH) cause phenylketonuria (PKU), PAH was studied for normal polymorphisms and linkage disequilibrium soon after the gene was cloned. Studies in the 1980s concentrated on European populations in which PKU was common and showed that haplotype-frequency variation exists between some regions of the world. In European populations, linkage disequilibrium generally was found not to exist between RFLPs at opposite ends of the gene but was found to exist among the RFLPs clustered at each end. We have now undertaken the ®rst global survey of normal variation and disequilibrium across the PAH gene. Four well-mapped single-nucleotide polymorphisms (SNPs) spanning »75 kb, two near each end of the gene, were selected to allow linkage disequilibrium across most of the gene to be examined. These SNPs were studied as PCR-RFLP markers in samples of, on average, 50 individuals for each of 29 populations, including, for the ®rst time, multiple populations from Africa and from the Americas. All four sites are polymorphic in all 29 populations. Although all but 5 of the 16 possible haplotypes reach frequencies 15% somewhere in the world, no haplotype was seen in all populations. Overall linkage disequilibrium is highly signi®cant in all populations, but disequilibrium between the opposite ends is signi®cant only in Native American populations and in one African population. This study demonstrates that the physical extent of linkage disequilibrium can differ substantially among populations from different regions of the world, because of both ancient genetic drift in the ancestor common to a large regional group of modern populations and recent genetic drift affecting individual populations. Introduction (Kazazian et al. 1984), as well as studies at the phenyl- alanine hydroxylase locus (PAH) to identify phenyl- Linkage disequilibrium or, more generally, gametic- ketonuria (PKU [MIM 261600]) mutants (DiLella et al. phase allelic association, is the nonrandom occurrence 1986a, 1987). of alleles on chromosomes (i.e., in gametes) in a popu- PKU is one of the most common genetic diseases in lation. Linkage disequilibrium has become an important people of northern-European descent, occurring in that tool in the end stages of positional cloning because a group at an average rate of »1/10,000 live births (Bickel recently arisen single deleterious allele will usually be et al. 1981). The PAH gene, coding for the enzyme nonrandomly associated with the alleles at nearby poly- phenylalanine hydroxylase (PAH), was implicated as the morphic sites that were on the chromosome on which etiologic gene, by the absence of PAH enzyme activity the mutation originally occurred. Among the earliest ap- in patients with PKU (Friedman et al. 1973). The human plications of this principle for identi®cation of distinct PAH cDNA was cloned (Woo et al. 1983; Kwok et al. mutations associated with a disease were studies at the 1985), RFLPs were identi®ed by use of the cDNA as b-hemoglobin cluster to identify thalassemia mutants the probe (Woo et al. 1983; Lidsky et al. 1985a), the gene was mapped to human chromosome 12q22-24 (Lidsky et al. 1985b), and the molecular structure of Received December 23, 1999; accepted for publication March 14, 2000; electronically published April 27, 2000. the gene was described (DiLella et al. 1986a), relatively Address for correspondence and reprints: Dr. Judith R. Kidd, early in the history of recombinant-DNA studies. When Department of Genetics, SHM I353, Yale University School of Med- the RFLPs encompassing the gene were used, it was icine, 333 Cedar Street, New Haven, CT 06520-8005. E-mail: kidd obvious that PKU mutations occurred on several dif- @biomed.med.yale.edu q 2000 by The American Society of Human Genetics. All rights reserved. ferent haplotypes. By 1986 it was recognized that PKU 0002-9297/2000/6606-0017$02.00 was mutationally heterogeneous; at least some different 1882 Kidd et al.: A Global Survey of Disequilibrium at PAH 1883 PKU haplotypes were found to have different mutations ground haplotypes is maintained at the PAHdb Web site (DiLella et al. 1986b, 1987). Interestingly, in a sample (Nowacki et al. 1997). from Denmark, no signi®cant association could be dem- Studies of normal chromosomes in European popu- onstrated between the disease alleles and any single al- lations have shown that disequilibrium exists among the lele of the normal polymorphisms spanning the gene sites at either end of the gene, at distances of 22 and (Chakraborty et al. 1987). However, when haplotypes 31 kb, but, in general, either does not exist or is much of these polymorphisms were examined, two haplotypes weaker between markers located at opposite ends of the were signi®cantly more common among PKU chro- gene, at distances >43 kb (Chakraborty et al. 1987; mosomes than among normal chromosomes. The rea- Daiger et al. 1989a). A similar pattern has been ob- son for the absence of allelic association with individual served in a sample of 44 chromosomes from China and RFLPs was also obvious: the marker-allele frequencies Japan (Daiger et al. 1989b) and in a sample of 1600 among PKU chromosomes were very similar to the fre- chromosomes from several Polynesian groups (Hertz- quencies in normal chromosomes. In the case of normal berg et al. 1989). Other analyses of various published chromosomes, the two most common haplotypes were PAH data have reached similar conclusions (Feingold approximately equally frequent and had alternative al- et al. 1993; Degioanni and Darlu 1994). This pattern leles at most sites. The PKU chromosome(s) included of the molecular extent of linkage disequilibrium agrees those two haplotypes, but two others were the most with the ®ndings of Jorde et al. (1994)Ðthat linkage common, accounting for 20% and 38% of the PKU disequilibrium in populations of European origin gen- 1 chromosomes in the Danish sample; these two ªPKUº erally does not extend to 50±60 kb. Our recent studies haplotypes also had alternative alleles at all the sites at of linkage disequilibrium in multiple populations have which the two common normal haplotypes differed and found that linkage disequilibrium can differ dramati- had the same allele at all the sites at which the other cally among populations from different regions of the two were the same. This allelic complementarity of the world (Tishkoff et al. 1996a, 1996b, 1998; Kidd et al. most common haplotypes in both normal and PKU 1998). The only substantive reports of PAH haplotype chromosomes greatly reduced the power to detect allelic frequencies in speci®c non-European populationsÐ association of PKU with any single RFLP, although the Polynesians (Hertzberg et al. 1989) and eastern Asians (Daiger et al. 1989b)Ðshow reduced levels of hetero- association was obvious when haplotypes were used. zygosity relative to that in Europeans. The one small The two associated haplotypes were the basis for the study of African Americans (Hofman et al. 1991) has ®rst identi®cations of speci®c PKU mutations (DiLella found that haplotype frequencies for normal and PKU et al. 1986b, 1987). chromosomes differ from each other and from frequen- As the etiologically relevant mutations for PKU and cies in Europeans and Asians. the phenylalaninemia states became known, PAH hap- All of the previous studies of linkage disequilibrium lotypes of patients with PKU often signaled which mu- at the PAH locus have used pairwise coef®cients. The tation(s) a patient carried and/or alerted the researchers resulting matrix of disequilibrium coef®cients can show to the existence of previously unknown mutations a clear pattern but also often contains some pairwise (Daiger et al. 1989a, 1989b: Hertzberg et al. 1989; values that do not ®t into the general pattern of high Stuhrmann et al. 1989; Apold et al. 1990; Dianzani et absolute values for pairs of markers within either cluster al. 1990; Jaruzelska et al. 1991; Konecki and Lichter- and low absolute values for pairs of markers that bridge Konecki 1991; Svensson et al. 1991; Zygulska et al. the two clusters. As we have begun to consider linkage 1991; Baric et al. 1992; Kozak et al. 1995). Thus, by disequilibrium in more-complex genetic systems, we 1989, polymorphisms, mutations, and haplotypes of the have introduced a new coef®cient to measure overall PAH region had ®nally become well characterized in nonrandomness across the entire haplotype (Kidd et al. patients with PKU (Woo 1988; Nowacki et al. 1997). 1998; Zhao et al. 1999). The coef®cient estimate is By 1996, the PKU mutations and their haplotypes were based on the permuted data of the observed samples. being used to infer the natural histories both of the A variant of the permutation test for overall signi®cance mutations themselves and of the populations carrying allows us to test signi®cance of the disequilibrium across those mutations (Scriver et al.