A Century of Hardy–Weinberg Equilibrium

Oliver Mayo CSIRO Livestock Industries, Adelaide, Australia

Twin Research and Human Genetics, Volume 11, Number 3, pp. 249–256. https://doi.org/10.1375/twin.11.3.249

Received 19 February, 2008; accepted 3 March, 2008.
Address for correspondence: Oliver Mayo, CSIRO Livestock Industries, PO Box 100401 Adelaide BC, SA 5000, Australia.

Hardy–Weinberg equilibrium (HWE) is the state of the genotypic frequency of two alleles of one autosomal gene locus after one discrete generation of random mating in an indefinitely large population: if the alleles are A and a with frequencies $p$ and $q$ ($= 1 - p$), then the equilibrium gene frequencies are simply $p$ and $q$ and the equilibrium genotypic frequencies for AA, Aa and aa are $p^2$, $2pq$ and $q^2$. It was independently identified in 1908 by G. H. Hardy and W. Weinberg after earlier attempts by W. E. Castle and K. Pearson. Weinberg, well known for pioneering studies of twins, made many important contributions to genetics, especially human genetics. Existence of this equilibrium provides a reference point against which the effects of selection, linkage, mutation, inbreeding and chance can be detected and estimated. Its discovery marked the initiation of population genetics.

Hardy–Weinberg equilibrium (HWE) is the state of the genotypic frequency of two alleles of one gene locus after one generation of random mating in an indefinitely large population with discrete generations, in the absence of mutation and selection: if the alleles are A and a with frequencies $p$ and $q$ ($= 1 - p$), then the equilibrium gene frequencies are just $p$ and $q$ and the equilibrium genotypic frequencies for AA, Aa and aa are $p^2$, $2pq$ and $q^2$. Thus, there is equilibrium at both the allelic and the genotypic level. Not excessively fancifully, one could compare this Hardy–Weinberg rule with Newton's first law of motion: a physical body will remain at rest, or continue to move at a constant velocity, unless an external force acts upon it. If such stability is the rule, then it provides the basis for the detection and estimation of the effects on the population of 'the thousand natural shocks the flesh is heir to', including natural and artificial selection, mutation, assortative mating, migration, inbreeding and random sampling (through finite population size).

Each word in the topic concept deserves explication: Hardy was a notable pure mathematician, Weinberg was a pioneering human geneticist and doctor to the poor who made a special contribution to twin studies, and the concept of equilibrium is simple and attractive to those in permanent disequilibrium, like human beings. The concept made immediately clear that human populations were essentially stable in genetical terms: a system, like the genome of a population, is at equilibrium in time when no net change occurs or is expected to occur from its state at that time. Furthermore, equilibria can be stable, meaning that when a small displacement occurs, the system is expected to return to the equilibrium.

Soon after the rediscovery of Mendel's remarkable work in 1900, interest arose in the properties of Mendelian genes in populations; this was the dawn of population genetics. Castle (1903) and Pearson (1903a, 1903b) were among the first to investigate these. As Edwards (in press) has pointed out, Castle did not derive a generalization equivalent to Pearson's, and will be considered no further in this paper.

Mendel (1865) had hypothesized that inheritance of a trait was particulate, that the units of inheritance did not change from generation to generation, that they were contributed equally by an organism's two parents, and that each parent contributed its unit at random from the two it contained. He had shown that a cross between two pure-breeding lines, termed A and a in regard to some trait, gave a first generation resembling one of the two parents (A) identically and that crosses among members of this first generation gave a ratio of 3:1 of the two parental types, the more frequent type being the same as this first generation (A). He had described the first generation type as dominant, the other as recessive.

Mendel produced a model of the following kind: the genetic make-up of the two parental lines was AA and aa respectively, and their offspring were Aa. Crossing two Aa gave, by the binomial expansion $(\tfrac{1}{2}A + \tfrac{1}{2}a)(\tfrac{1}{2}A + \tfrac{1}{2}a)$, the offspring $\tfrac{1}{4}AA$, $\tfrac{1}{4}Aa$, $\tfrac{1}{4}aA$, $\tfrac{1}{4}aa$. If Aa = aA (since the units of inheritance are unchanged), and if Aa resembles AA exactly (the phenomenon of dominance, deduced from the disappearance of a in the cross of the two lines), then the proportions of the phenotypes A:a will be $\tfrac{3}{4}:\tfrac{1}{4}$. Mendel made further confirmatory crosses, for example showing that two thirds of the A types were Aa and one third were AA.

Mendel did not consider directly what would happen in a population of an organism, but this was

essential if evolutionary phenomena were to be explained or even studied at the level of the hypothesized 'essential character', as Mendel called his fundamental particles of inheritance. (The name 'gene' was introduced in about 1905 by Wilhelm Johannsen, as an abbreviation of Darwin's and De Vries's 'pangen'.)

Pearson immediately saw that dominance/recessiveness was not essential to the dynamics of the model, but was rather an additional assumption of Mendel's. He generalized the model by removing this assumption, and also began the analysis of multiple independent genes.

On his model, the first two generations described above would be

(AA´) × (aa´) = (A+A´)(a+a´) = (Aa) + (Aa´) + (A´a) + (A´a´),

representing the parents, the gametes and the offspring in turn. If the gametes identified by the prime ´ are actually identical, that is A = A´ and so on, then this second generation is identical, Aa. This can then be extended to multiple independent genes. Pearson (1903b) wrote:

  If these hybrids now breed at random and are equally fertile among themselves, segregation takes place. If the process of random mating with equal fertility be continued generation by generation, what further changes, if any, take place, and what are the laws of inheritance within such a population? (p. 506)

His conclusion was 'that when the members of this segregating population cross at random the population accurately reproduces itself, and supposing no artificial, natural or reproductive selection takes place, a stable population or "race" is created, which is permanent and shows a permanent proportional frequency for each sub-class of the population'. From this important conclusion, Pearson went on to calculate parent-offspring correlations, $r_{OP}$, and other attributes of the quantitative inheritance which he was developing.

Unfortunately, having pointed out that he did not need Mendel's hypothesis of dominance, he calculated $r_{OP} = \tfrac{1}{3}$ and noted that this was not in agreement with empirical observations, which lay around 0.5. For this and other reasons he abandoned particulate inheritance of the Mendelian kind. Had he assumed that the heterozygote was intermediate, he would have obtained $r_{OP} = \tfrac{1}{2}$. After the publication of Hardy's note, Pearson (1909a, 1909b) obtained correct results, without referring to his earlier errors as such. It is perhaps unsurprising but nevertheless noteworthy that the teutonophile Pearson, aware of Weinberg's fine work on the familial incidence of cancer (Weinberg & Gaspar, 1904) at least through attendance at and participation in a major meeting on genetics of human disease in 1908 (see Church, 1908), did not cite Weinberg (1908).

The Initial Work of Hardy and Weinberg

Consider a population in which a diallelic gene, with alleles $A_1$ and $A_2$, is segregating. Suppose that the frequencies of the three possible genotypes are as shown:

Genotype     $A_1A_1$   $A_1A_2$   $A_2A_2$
Frequency    $P$        $Q$        $R$

Then the frequencies of the alleles $A_1$ and $A_2$ are $P + \tfrac{1}{2}Q$ and $\tfrac{1}{2}Q + R$ respectively. Call these $p$ and $q$ respectively. Call this population the parental generation. Some very simple algebra shows that the frequencies of the three genotypes in the offspring produced by random mating among this parental generation will be

Genotype     $A_1A_1$                       $A_1A_2$                                             $A_2A_2$
Frequency    $(P + \tfrac{1}{2}Q)^2 = p^2$    $2(P + \tfrac{1}{2}Q)(\tfrac{1}{2}Q + R) = 2pq$    $(\tfrac{1}{2}Q + R)^2 = q^2$

It is also simple to show that these genotypic frequencies, which are also those chosen by pairwise sampling of gametes at random in the population, will be the same in the next generation. (At this equilibrium, $Q^2 = 4PR$.) The process of pairwise sampling is simple binomial sampling with replacement, justifiable because the populations of gametes can be regarded as indefinitely large. For a careful and complete mathematical account of HWE, see Edwards (2000).

The English pure mathematician G. H. Hardy (1908), notable for contributions to number theory and analysis, simply showed that the relationship given above would hold; he had been asked what would happen to gene and genotype frequencies in a population mating at random, and gave the answer. He participated no further in population genetics. Diaconis (2002) has speculated that Hardy had 'a true antipathy to the subject' of probability, which could explain his failure to contribute further, but it could equally well be explained by his love of pure mathematics and total lack of interest in applications. Hardy's place in mathematical history is secure; that in genetical history minor but significant. For an accessible portrait, see Snow (1967) and Hardy's own memoir (1940), and for detailed comments on Hardy's (1908) paper, see Edwards (in press).

Weinberg (1908), who was a human geneticist of the first rank, though widely regarded in his own country as an Armenarzt (a doctor employed by a local authority to treat the indigent, an honorable calling, perhaps, but hardly a sign of success in his career), did much more work on the topic.

Weinberg's Contribution

Wilhelm Weinberg was born in […] in 1862, was educated in Stuttgart, Tübingen and […], worked as 'poor doctor', public health adviser and private practitioner in Stuttgart, and died after some years of poor health in Tübingen in 1937, though he remained scientifically productive until his death.

As well as demonstrating how HWE must arise in the diallelic case, Weinberg (1908, 1909a, 1909b) also considered multiple alleles and multiple independent genes. In this last case, he showed that the approach to multilocus HWE would be asymptotic, not the result of one generation of panmixia.
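Returning briefly to the single-locus result in 'The Initial Work of Hardy and Weinberg', the algebra can be checked numerically. The following is a minimal Python sketch under the stated assumptions (indefinitely large population, random mating, no selection or mutation); the function name `random_mating` and the starting frequencies are illustrative choices, not taken from Hardy's or Weinberg's papers.

```python
def random_mating(P, Q, R):
    """One generation of random mating at a diallelic autosomal locus.

    P, Q, R are the parental frequencies of A1A1, A1A2 and A2A2 (summing to 1).
    Returns the offspring genotype frequencies (p^2, 2pq, q^2).
    """
    p = P + Q / 2  # frequency of allele A1 in the parental generation
    q = Q / 2 + R  # frequency of allele A2 in the parental generation
    return p * p, 2 * p * q, q * q


# Hypothetical parental generation, deliberately far from HW proportions.
parents = (0.5, 0.1, 0.4)
generation_1 = random_mating(*parents)
generation_2 = random_mating(*generation_1)

print("generation 1:", generation_1)
print("generation 2:", generation_2)

# At equilibrium Q^2 = 4PR (up to floating-point rounding).
P1, Q1, R1 = generation_1
print("Q^2 - 4PR is ~0:", abs(Q1 * Q1 - 4 * P1 * R1) < 1e-12)
```

With these hypothetical starting values the allele frequencies are 0.55 and 0.45, the first generation of offspring is already in Hardy–Weinberg proportions, and a second round of random mating leaves the genotype frequencies unchanged apart from rounding.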

Twin Research and Human Genetics, June 2008

Apart from HWE, his major contributions were to quantitative inheritance (correlations between relatives), twin studies, segregation analysis and the merits (or otherwise) of eugenics. Crow (1999) gives a thoughtful appraisal of many aspects of his work.

Hill (1984) and Crow (1999) give an account of Weinberg's (1909a, 1909b, 1910) work on quantitative genetics, which in some ways anticipated developments by Fisher and Wright. See the next section for further discussion.

'Weinberg's studies on the frequency of twins and higher multiple births are the best studies ever published on this subject' (Bulmer, 2003). Bulmer (1970) had earlier shown how good these studies are.

Weinberg (1901) systematically developed his differential method for determining the frequencies of monozygotic and dizygotic twinning. In Bulmer's notation, suppose

  $L$ = number of like-sexed twin maternities in a total sample of $N$ maternities
  $U$ = number of unlike-sexed twin maternities in the sample.

Then the monozygotic twinning rate is given by

  $m = (L - U)/N$

and the dizygotic rate by

  $d = 2U/N$.

Bulmer shows that

  $\mathrm{Var}(m) = (m + d)/N$ and $\mathrm{Var}(d) = 2d/N$

approximately.

In an example given by Bulmer, in 791,584 maternities in Wales in 1960, $L$ = 5,894 and $U$ = 3,192. Then $m = 0.0034 \pm 0.0001$ and $d = 0.0081 \pm 0.0001$. As presented, a constant sex-ratio of unity is assumed, but the bias engendered thereby is very small, as Weinberg understood. Indeed, in 1934, he published a paper on this topic, including a method for estimating the precision of the estimates rather more complex than Bulmer's (see also James, 2007).

Weinberg (1909a, 1909b) investigated the inheritance of twinning, showing that a propensity to produce dizygotic twins is inherited, though he could not investigate this further by statistical methods. He also concluded that there was no inherited propensity to produce monozygotic twins. These results have largely been borne out by subsequent work (Bulmer, 1970; Fisher, 1928; Hoekstra et al., 2007).

In human genetics, breeding experiments not being possible (even ignoring ethical issues), methods have had to be developed to detect Mendelian inheritance, to determine its form, and to investigate linkage and interaction of Mendelian factors, by analysis of observed (ascertained) families. In these circumstances, random sampling and inference therefrom are not always possible, and bias and other statistical problems have to be carefully avoided. To take a very simple example, suppose that a deleterious trait is suspected to be a simple recessive. The segregation ratio in the offspring of two normal carriers is then expected to be Mendel's normal dominant:affected recessive ratio of 3:1. However, human families are generally small, and for sibships of size 1, 2, 3, … the probability that all children are dominants is $\tfrac{3}{4}$, $\tfrac{9}{16}$, $\tfrac{27}{64}$, … Thus, if the trait is ascertained through affected children, sibships with no affected child are never observed and the observed ratio will be lower than 3:1. As noted by Bailey (1961), the problem is to fit a binomial distribution with its initial term missing, and Weinberg was the first to analyze this case in human genetics. Weinberg's solution (the proband method) is the maximum likelihood solution if ascertainment has been both random and complete (Fisher, 1934). That this condition is not often met was recognized by Weinberg (1912a, 1912b, 1927), who consequently preferred what he called the sib method, whereby the segregation ratio in sibs of propositi is measured, weighting each sibship by the number of affected individuals. He also developed a method for the case where ascertainment was incomplete. In developing these methods, Weinberg was drawing on his experience of working with poor families, both in public employment and privately, recognizing how genetical studies were often not the mainspring for data collection, and how, consequently, data might be incomplete and biased. However, he also showed remarkable statistical insight and expertise.

Demmler (2003) has pointed out that Weinberg was one of the earliest German medical scientists to understand and apply Mendel's laws. Thus, he was always having to battle against others, such as the psychiatric geneticist Wilhelm Strohmeyer (1874–1936), who had accepted Darwinism, including Darwin's fallacious blending inheritance (Demmler, 2003, pp. 74–76). Strohmeyer, writing to his wife from a conference in 1912, reported that he had met his 'antagonist' Weinberg, who 'talked a great deal, but was clever and industrious' (Demmler, 2003, p. 75). Strohmeyer was also rather in favor of eugenics, more so as the years passed, whereas Weinberg was against it; he 'noted that for "race-hygienic" reasons tight boundaries would have to be drawn round any intervention. An improvement in "national efficiency" without reduction of population size Weinberg saw as the more favorable way; a sober statistical view leading thus to the result that eugenics does not require abortion and sterilization particularly strongly and indeed prominent personalities would not support such an approach' (Weinberg, 1918, as cited in Demmler, 2003, p. 81). Weinberg was right about the science, but not about the way that 'prominent personalities' would fail to support 'sterilization of the unfit.' For further detail, see Früh (1996, 1999).

Subsequent Work on HWE

The importance of HWE as a basis for investigation of the effects on gene frequencies of selection, mutation, inbreeding and other factors was immediately recognized, though not all workers in the next decade acknowledged the earlier workers.
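Looking back at Weinberg's differential method for twinning rates, the estimates and their approximate standard errors in Bulmer's notation amount to a few lines of arithmetic. The sketch below is only an illustration: the function name `weinberg_differential` is an assumption, the formulas are those quoted from Bulmer, and the counts are those of the Welsh example quoted in the text.

```python
import math


def weinberg_differential(like_sexed, unlike_sexed, maternities):
    """Weinberg's differential method in Bulmer's notation.

    like_sexed   (L): number of like-sexed twin maternities
    unlike_sexed (U): number of unlike-sexed twin maternities
    maternities  (N): total number of maternities in the sample
    Returns (m, d, se_m, se_d), assuming a sex ratio of unity.
    """
    m = (like_sexed - unlike_sexed) / maternities  # monozygotic rate
    d = 2 * unlike_sexed / maternities             # dizygotic rate
    se_m = math.sqrt((m + d) / maternities)        # sqrt of Var(m) = (m + d)/N
    se_d = math.sqrt(2 * d / maternities)          # sqrt of Var(d) = 2d/N
    return m, d, se_m, se_d


# Counts quoted in the text from Bulmer's example.
m, d, se_m, se_d = weinberg_differential(5894, 3192, 791584)
print(f"m = {m:.4f} +/- {se_m:.4f}")
print(f"d = {d:.4f} +/- {se_d:.4f}")
```

Rounded to four decimal places, these values agree with the figures quoted from Bulmer (m = 0.0034 ± 0.0001 and d = 0.0081 ± 0.0001).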


Suppose that a sample is obtained from a population, and the numbers and frequencies observed are:

Genotype     $A_1A_1$   $A_1A_2$   $A_2A_2$   Total
Number       $a$        $b$        $c$        $n$
Frequency    $P$        $Q$        $R$        1

Then an obvious test to determine whether HWE holds is Pearson's $\chi^2$, whereby

$$\frac{\left(a - n(P + \tfrac{1}{2}Q)^2\right)^2}{n(P + \tfrac{1}{2}Q)^2} + \frac{\left(b - 2n(P + \tfrac{1}{2}Q)(R + \tfrac{1}{2}Q)\right)^2}{2n(P + \tfrac{1}{2}Q)(R + \tfrac{1}{2}Q)} + \frac{\left(c - n(R + \tfrac{1}{2}Q)^2\right)^2}{n(R + \tfrac{1}{2}Q)^2}$$

will be distributed as $\chi^2$ with one degree of freedom if HWE holds. This test was available to the early workers, though there was uncertainty about degrees of freedom. (In this case, there are three classes, yielding two degrees of freedom, but one is associated with estimation of gene frequency in order to estimate genotypic expectations.) It is immediately clear that, on the assumption of HWE, the observed frequency of recessives $c/n$ is an estimator for $q^2$, and this relationship was used early, for example in estimating gene frequencies for the ABO blood group system (see Kempthorne, 1957). Later developments will be discussed in the next section.

Norton, in an appendix to Punnett (1915), appears to have been the first to have considered the effects of selection on a gene in a panmictic, indefinitely large population. Since Punnett had led Hardy to consider the problem, and Norton was a student of Hardy's (see Edwards, in press), Norton must have been familiar with Hardy (1908), but he did not cite the paper.

Fisher (1918) used HWE without comment; we do not know whether he knew of Hardy's or Weinberg's papers, though he was acquainted with Pearson's earlier work, cited above, and knew and valued Hardy highly (Fisher, 1958). HWE was the basis for Fisher's derivation of correlations between related individuals under Mendelian inheritance. In this notable paper, Fisher also presented one of the first analyses of departures from panmixia. His development of the concept of balanced polymorphism also required HWE (see Fisher, 1922; Lewontin, 1958; Mayo, 2007).

It is perhaps worth noting that Fisher wrote to Weinberg on August 29, 1930, in the course of a cordial correspondence mainly about ascertainment: 'I am sure you will always be honoured abroad, and I hope also in your own country for your pioneer work upon the Mendelian or other interpretation of human data' (Fisher, 1930, August 29). Even here, Fisher does not notice the contribution honoured as HWE.

Haldane (1924a) called HWE the Hardy–Pearson rule, and used it as the basis for all his important early work on selection in populations. He derived the recurrence relationship for the approach to equilibrium of gene and genotypic frequencies of an X-linked diallelic gene; both the gene frequencies in the two sexes and the genotypic frequencies in females approach equilibrium asymptotically (see Bennett & Oertel, 1965, for a definitive analysis). In a series of papers, Haldane (1924a, 1924b, 1926a, 1926b, 1927 etc.) also showed how selection and mutation could be taken into account, introducing the concept of mutation-selection balance. To take a very simple example, consider selection against a deleterious recessive:

Genotype     AA      Aa      aa
Frequency    $p^2$   $2pq$   $q^2$
Fitness      1       1       $1 - s$

Here, the frequency of A in the progeny is

$$p' = \frac{p^2 + pq}{p^2 + 2pq + q^2(1 - s)} = \frac{p}{1 - q^2 s}.$$

Then the change in $p$ through selection is $p' - p = spq^2/(1 - q^2 s)$. If gene frequency is not to change over time, this increase must be balanced by mutation at the rate $\mu$ from A to a, that is $\mu p = spq^2/(1 - q^2 s)$. If $s$ is not small relative to $\mu$, then $q = (\mu/s)^{1/2}$ approximately.

On a similar argument, in the X-linked case, $q = 3\mu/s$. If the mutation rate is $\mu$ in males and $\nu$ in females and these are unequal, $q = (\mu + 2\nu)/s$ to the same level of approximation. Using this argument, Haldane (1935, 1947) was the first to estimate human mutation rates ($3.2 \times 10^{-5}$ for haemophilia) with $\mu > \nu$, possibly by a factor of 10. Recent analyses show this to be of the correct order (e.g., Ellergren, 2002). Weinberg (1912b) was one of the first to recognize that mutation might maintain deleterious traits in this way, and also that mutation might be more frequent in males than in females, at a time when mutation was an underdeveloped and misunderstood concept.

Chetverikov (1926) referred to HWE as Hardy's law, and (possibly through misunderstanding of Pearson's (1903a, 1903b) work) a slightly different rule as Pearson's law. Despite his familiarity with the German literature, Chetverikov did not cite Weinberg. Chetverikov was one of the first experimental scientists to understand the implications of population genetics, especially the role of finite population size, and to try to investigate natural populations from this standpoint.

Fisher, Haldane, Wahlund (1928), Wright (1931) and others showed that departures from random mating such as inbreeding, whether systematic (as in plants, or in animal and plant breeding) or chance (as in the effect of finite population size), could induce departures from expected Hardy–Weinberg proportions. Perhaps it should be mentioned that such departures from random mating must influence genotypic frequencies for every gene, which is not the case for selection and mutation.

Wahlund showed that HWE proportions would not be found in a population composed of isolated subpopulations, even if each subpopulation were itself in HWE. The frequencies would be

Genotype     $A_1A_1$      $A_1A_2$         $A_2A_2$
Frequency    $p^2 + V_p$   $2(pq - V_p)$    $q^2 + V_p$

where $p$ is the mean frequency of $A_1$ in the whole population and $V_p$ the variance in gene frequency among the subpopulations.

Inbreeding at the rate $F$ will give rise to a similar increase in the frequencies of the homozygotes and decrease in that of heterozygotes. In the case of a finite


population of size $N$, the chance of identity by descent of two randomly chosen alleles is $1/(2N)$, and this is the increase in inbreeding each generation. See Wright (1922, 1931) for the original work. It should be noted once more that HWE arises from two phenomena: binomial sampling of gametes and panmixia. Hence, if subpopulations or lines are isolated so that inbreeding and differentiation arise but both phenomena continue to apply within a line or subpopulation, there can still be HWE within it.

Most methods for the analysis of gene and genotype frequencies in finite populations require HWE as a starting point, from which gene trajectories in time etc. can be pursued. Even when departures from HWE are caused by breeding systems, random mating generates, because of the independent binomial sampling of alleles in the two parents, genotypic regularities in the progeny that allow assessment of the effects of the breeding system on the genome. For example, gametophytically determined and other self-incompatibility systems which prevent selfing have been thoroughly analyzed at both the infinite and finite population level (Wright, 1939; Leach & Mayo, 2005).

HWE Today

Li (1988), followed and elaborated by Stark (2006a, 2006b), showed that panmixia is not the only breeding structure that can yield HW proportions, so that panmixia is a sufficient but not a necessary condition for HWE. However, no natural population is known to manifest the other possible breeding structures so that it appears unlikely that they need to be considered in data collection and analysis. HWE continues to be an important starting point for any population analysis. This will indeed be true even when what is being analyzed is something that must initially disrupt the regularity of the meiotic processes that provide the basis for HWE (e.g. the investigation of the fate of newly arising duplications; see Force et al., 1999; Hittinger & Carroll, 2007).

Testing for departure from HWE began, as noted above, with simple $\chi^2$ analysis. Problems inherent in such $\chi^2$ analyses, especially the dependence on sample size, and the low power of the $\chi^2$ HWE test (investigated thoroughly by Lewontin & Cockerham, 1959), meant that Haldane (1954) and others sought 'exact' tests based on the expectation that, under HWE, $Q^2 - 4PR = 0$. These tests have been extensively developed and used; see Mantel & Li (1974) and Rousset & Raymond (1995). Rousset & Raymond consider the issue of the alternative hypothesis: is one concerned with selection or a disturbance to panmixia? Guo & Thompson (1992) and Rousset & Raymond consider 'exact' tests for the case of multiple alleles; microsatellite markers are widely used in population studies, and multiple alleles are the norm with these markers. Many coding genes also exhibit multiple alleles and are significantly associated with disease. For example, consider the data on human APOE in Table 1. By Fisher's 'exact' test, the two ethnic classifications differ significantly in genotype frequencies. Both groups fit HWE by various tests. Given that APOE is known to be strongly associated with certain disease states (see Song et al., 2004; Kimmel et al., 2008, the source of these data), is this agreement surprising or not?

Table 1
Distribution of APOE Genotype by Ethnic Classification (adapted from Table 2 of Kimmel et al. 2008)

APOE genotype   African–American   Caucasian   Total
2–2             3                  0           3
2–3             15                 16          31
2–4             7                  3           10
3–3             51                 73          124
3–4             30                 28          58
4–4             5                  1           6
Total           111                121         232

Today, following the development of automated methods of DNA sequencing and the consequent numerous genomic analyses, we know that single nucleotide polymorphisms (SNPs) are prodigiously numerous in the genome. In the human genome, for example, there may be ten million. Since they arise from DNA misreplication at a rate of, perhaps, $10^{-8}$ per base per generation (Nachman & Crowell, 2000), the overwhelming majority of these will be diallelic. Suppose that there are ten million SNPs in humans. Then a mutation yielding a third allele is expected every ten generations. In cattle, more than 98% of SNPs are diallelic (Wade & Adelson, personal communication). Neutral variants are fixed at a rate approximately given by the mutation rate (Kimura, 1968, 1983).

Considered individually, each diallelic SNP may be expected to be in HWE. Obviously, inbreeding, selection, mutation etc. are certain to be present in every population studied, but one has no expectation a priori that they will influence any particular SNP. Since, to the contrary, each pair of linked SNPs is unlikely to be in linkage disequilibrium, testing individual SNPs for departure from HWE has been widely regarded as a useful first step in genome scans for regions associated with important traits (multifactorial disorders in humans, production traits in plants and animals). Since disturbances to panmixia are expected to affect all sites in the genome, rare departures from HWE found for a few SNPs in the early stages of a genome-wide scan are likely to arise by chance or possibly from mistyping.

However, Zou & Donner (2006) dispute the validity and hence utility of such testing in case-control studies. (See their paper for many references supporting such preliminary screening.)
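As an illustration of the simple $\chi^2$ test for HW proportions, extended to multiple alleles, the following Python sketch applies it to the APOE genotype counts of Table 1. It is a rough check only, not the 'exact' tests discussed above: several expected counts are small, so the $\chi^2$ approximation is poor here. The function name `hwe_chi_square`, the dictionary layout and the use of the 5% critical value 7.815 for three degrees of freedom are choices made for this sketch.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwe_chi_square(genotype_counts):
    """Pearson chi-square for Hardy-Weinberg proportions with any number of alleles.

    genotype_counts maps an allele pair such as ("2", "3") to an observed count.
    Returns (statistic, degrees_of_freedom).
    """
    n = sum(genotype_counts.values())

    # Allele frequencies by gene counting: each individual carries two alleles.
    allele_counts = Counter()
    for (a1, a2), count in genotype_counts.items():
        allele_counts[a1] += count
        allele_counts[a2] += count
    freqs = {a: c / (2 * n) for a, c in allele_counts.items() if c > 0}

    statistic = 0.0
    for a1, a2 in combinations_with_replacement(sorted(freqs), 2):
        observed = genotype_counts.get((a1, a2), 0)
        if a1 != a2:
            observed += genotype_counts.get((a2, a1), 0)
        # Expected count: n * p_i^2 for homozygotes, n * 2 * p_i * p_j otherwise.
        expected = n * freqs[a1] * freqs[a2] * (1 if a1 == a2 else 2)
        statistic += (observed - expected) ** 2 / expected

    k = len(freqs)
    # k(k+1)/2 classes, minus 1, minus (k - 1) estimated allele frequencies.
    return statistic, k * (k - 1) // 2


# Genotype counts from Table 1 (adapted in the text from Kimmel et al., 2008).
african_american = {("2", "2"): 3, ("2", "3"): 15, ("2", "4"): 7,
                    ("3", "3"): 51, ("3", "4"): 30, ("4", "4"): 5}
caucasian = {("2", "2"): 0, ("2", "3"): 16, ("2", "4"): 3,
             ("3", "3"): 73, ("3", "4"): 28, ("4", "4"): 1}

CRITICAL_5_PERCENT_3_DF = 7.815  # upper 5% point of chi-square with 3 df

for label, counts in (("African-American", african_american),
                      ("Caucasian", caucasian)):
    statistic, df = hwe_chi_square(counts)
    verdict = "no evidence against HWE" if statistic < CRITICAL_5_PERCENT_3_DF else "departure"
    print(f"{label}: chi-square = {statistic:.2f} on {df} df ({verdict})")
```

For a diallelic SNP the same function reduces to the three-class test with one degree of freedom. In a genome scan many such tests are run at once, so some allowance for multiple testing would be needed before any single SNP is declared to depart from HWE.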


In general, if a departure from HWE is shown to be of interest, it should indicate, through a deficiency of either or both homozygotes, or of heterozygotes, etc., the type of explanation that might be sought. Since an individual SNP is unlikely to be influential in selection (recognizing that exceptions, like the malaria-related polymorphisms, exist), patterns involving closely linked SNPs must then be examined. Weir et al. (2004) provide an appropriate statistical approach to follow in these circumstances.

While technical mistakes are more likely a priori than selection as a cause of disturbed segregation, nonconforming loci must be followed up. As Edwards (2007) wrote of one major continuing study:

  The HapMap set of data restricts analysis to loci conforming to the simple genetic background imposed by rejecting genotypes inconsistent with Mendel's first law, and consistent with what Stern termed the Hardy–Weinberg law (Hardy, 1908; Stern, 1943; Weinberg, 1908). The rejects – the golden dross for the recognition of recessive lethals – are not discussed in detail although they account for over 10% of loci even though rejection was based on very high levels of significance (P < .001). (p. 390)

As outlined in the previous section, HWE is the fundamental starting point for all population–genetical investigation, whether the goal is detection or estimation of the effects of all the forces that disrupt HWE. While Mendel conceived the independent binomial sampling of gametes from parents and hence could be regarded as the first to have considered a population–genetical example (the effects of crossing of pure lines) (Edwards, in press), the generalization to arbitrary gene frequencies to give HWE was the true foundation of population genetics.

Acknowledgments

I thank A. W. F. Edwards for an advance copy of his Hardy–Weinberg paper, G. R. Fraser for drawing Church (1908) to my attention, D. L. Adelson, N. G. Martin and their colleagues for helpful discussion of SNPs, A. W. F. Edwards, C. R. Leach, N. G. Martin, J. A. Sved and P. M. Visscher for useful comments on this note, and N. G. Martin for suggesting I write it. I also thank CSIRO for provision of a research fellowship during which the note has been written.

References

Bailey, N. T. J. (1961). Introduction to the Mathematical Theory of Genetic Linkage. New York: Oxford University Press.

Bennett, J. H., & Oertel, C. R. (1965). The approach to random association of genotypes with random mating. Journal of Theoretical Biology, 9, 67–76.

Bulmer, M. G. (1970). The biology of twinning in man. Oxford: Clarendon Press.

Bulmer, M. G. (2003). Francis Galton, pioneer of heredity and biometry. Baltimore, MD: Johns Hopkins University Press.

Castle, W. E. (1903). The laws of Galton and Mendel and some laws governing race improvement by selection. Proceedings of the American Academy of Arts and Science, 35, 233–242.

Chetverikov, S. S. (1926). On certain aspects of the evolutionary process from the standpoint of modern genetics. Zhurnal Eksperimental'noi Biologii, A2, 3–54. (Originally published in Russian; this translation by M. Barker originally published in 1961 in American Philosophical Society Proceedings, 105, 167–195 and reprinted in D. L. Jameson (Ed.) (1977), Benchmark papers in genetics, vol. 8 evolutionary genetics. Stroudsburg, PA: Dowden, Hutchinson & Ross, pp. 234–262.)

Church, W. B. (1908). The influence of heredity on disease, with special reference to tuberculosis, cancer and diseases of the nervous system (with discussion). Journal of the Royal Society of Medicine, 2, 8–142.

Crow, J. F. (1999). Hardy, Weinberg and language impediments. Genetics, 152, 821–825.

Demmler, A. (2003). Wilhelm Strohmeyer (1874–1936): Ein Wegbereiter der Kinder- und Jugendpsychiatrie. MD dissertation, Friedrich-Schiller-Universität Jena.

Diaconis, P. (2002). G. H. Hardy and probability??? Bulletin of the London Mathematical Society, 34, 385–402.

Edwards, A. W. F. (2000). Foundations of Mathematical Genetics (2nd ed.). Cambridge: Cambridge University Press.

Edwards, A. W. F. (in press). G. H. Hardy (1908) and Hardy–Weinberg Equilibrium. Genetics.

Edwards, J. H. (2007). Genome scans and the 'old genetics'. In O. Mayo and C. R. Leach (Eds.), Fifty years of human genetics: a Festschrift and liber amicorum to celebrate the life and work of George Robert Fraser (pp. 385–401). Adelaide, Australia: Wakefield Press.

Ellergren, H. (2002). Human mutation – blame (mostly) men. Nature Genetics, 31, 9–10.

Fisher, R. A. (1918). The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh, 52, 399–433.

Fisher, R. A. (1922). On the dominance ratio. Proceedings of the Royal Society of Edinburgh, 42, 321–341.

Fisher, R. A. (1928). Triplet children in Great Britain and Ireland. Proceedings of the Royal Society of London, Series B, 102, 286–311.

Fisher, R. A. (1930, August 29). Archived personal communication. Available from http://digital.library.adelaide.edu.au/coll/special/fisher/corres/weinberg/Weinberg300829a.html

254 Twin Research and Human Genetics June 2008 Downloaded from https://www.cambridge.org/core. IP address: 170.106.35.234, on 30 Sep 2021 at 04:27:46, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1375/twin.11.3.249 A Century of Hardy-Weinberg Equilibrium

Fisher, R. A. (1934). The effect of method of ascertainment upon the estimation of frequencies. Annals of Eugenics, 6, 13–25.
Fisher, R. A. (1958). The nature of probability. Centennial Review, 2, 261–274.
Force, A., Lynch, M. T., Pickett, F. B., Amores, A., Yan, Y., & Postlethwait, J. (1999). Preservation of duplicate genes by complementary, degenerative mutations. Genetics, 151, 1531–1545.
Früh, D. (1996). Wilhelm Weinberg (1862–1937), Armenarzt und Populationsgenetiker – Anmerkungen zu Leben und Werk. [Wilhelm Weinberg (1862–1937), doctor to the poor and population geneticist – notes on his life and work]. Biologisches Zentralblatt, 115, 112–119.
Früh, D. (1999). Die Genealogie als Hilfswissenschaft der Humangenetik. [Genealogy as supporting science for human genetics]. Jahrbuch für Geschichte und Theorie der Biologie, 6, 141–162.
Guo, S. W. & Thompson, E. A. (1992). Performing the exact test of Hardy–Weinberg proportions for multiple alleles. Biometrics, 48, 361–372.
Haldane, J. B. S. (1924a). Part I. A mathematical theory of natural and artificial selection. Transactions of the Cambridge Philosophical Society, 23, 19–41.
Haldane, J. B. S. (1924b). Part II. The influence of partial self-fertilization, inbreeding, assortative mating and selective fertilization on the composition of Mendelian populations and on natural selection. Proceedings of the Cambridge Philosophical Society, 1, 158–163.
Haldane, J. B. S. (1926a). Part III. Proceedings of the Cambridge Philosophical Society, 23, 363–372.
Haldane, J. B. S. (1926b). Part IV. Proceedings of the Cambridge Philosophical Society, 23, 607–615.
Haldane, J. B. S. (1927). Part V. Selection and mutation. Proceedings of the Cambridge Philosophical Society, 23, 838–844.
Haldane, J. B. S. (1935). The rate of spontaneous mutation of a human gene. Journal of Genetics, 31, 317–326.
Haldane, J. B. S. (1947). The mutation rate for haemophilia, and its segregation ratios in males and females. Annals of Eugenics, 13, 262–277.
Haldane, J. B. S. (1954). An exact test for randomness of mating. Journal of Genetics, 52, 631–635.
Hardy, G. H. (1908). Mendelian proportions in a mixed population. Science, 28, 49–50.
Hardy, G. H. (1940). A mathematician’s apology (reprint edition, 1992). Cambridge: Cambridge University Press.
Hill, W. G. (Ed.). (1984). Quantitative genetics, part I. New York: Van Nostrand Reinhold.
Hittinger, C. T. & Carroll, S. B. (2007). Gene duplication and the adaptive evolution of a classic genetic switch. Nature, 449, 677–682.
Hoekstra, C., Zhao, Z. Z., Lambalk, C. B., Willemsen, G., Martin, N. G., Boomsma, D. I. & Montgomery, G. W. (2007). Dizygotic twinning [Electronic Version]. Human Reproduction Update Advance Access, 14, 37–47.
James, W. H. (2007). The validity of Weinberg’s differential rule. Twin Research and Human Genetics, 10, 771–772.
Kempthorne, O. (1957). An introduction to genetic statistics. New York: John Wiley.
Kimmel, S. E., Christie, J., Kealey, C., Chen, Z., Price, M., Thorn, C. F., Brensinger, C. M., Newcomb, C. W. & Whitehead, A. S. (2008). Apolipoprotein E genotype and warfarin dosing among Caucasians and African Americans. The Pharmacogenomics Journal, 8, 53–60.
Kimura, M. (1968). Evolutionary rate at the molecular level. Nature, 217, 624–626.
Kimura, M. (1983). The Neutral Theory of Molecular Evolution. Cambridge: Cambridge University Press.
Leach, C. R. & Mayo, O. (2005). Outbreeding mechanisms in flowering plants: An evolutionary perspective from Darwin onwards. Stuttgart: Gebrüder Borntraeger Verlagsbuchhandlung.
Lewontin, R. C. (1958). A general method for investigating the equilibrium of gene frequency in a population. Genetics, 43, 419–434.
Lewontin, R. C., & Cockerham, C. C. (1959). The goodness-of-fit test for detecting natural selection in random mating populations. Evolution, 13, 561–564.
Li, C. C. (1988). Pseudo-random mating populations: In celebration of the 80th anniversary of the Hardy–Weinberg Law. Genetics, 119, 731–737.
Mantel, N. & Li, C. C. (1974). Estimation and testing of a measure of non-random mating. Annals of Human Genetics, 37, 445–454.
Mayo, O. (2007). The rise and fall of the common disease-common variant (CD-CV) hypothesis: How the sickle cell disease paradigm led us all astray (or did it?). Twin Research and Human Genetics, 10, 793–804.
Mendel, J. G. (1865). Versuche über Pflanzenhybriden. [Experiments in plant hybridisation]. Verhandlungen des naturforschenden Vereines in Brünn, Bd. IV für das Jahr 1865, 3–47.
Nachman, M. W. & Crowell, S. L. (2000). Estimate of the mutation rate per nucleotide in humans. Genetics, 156, 297–304.
Pearson, K. (1903a). Mathematical contributions to the theory of evolution. XI. On the influence of natural selection on the variability and correlation of organs. Philosophical Transactions of the Royal Society of London, Series A, 200, 1–66.
Pearson, K. (1903b). Mathematical contributions to the theory of evolution. XII. On a generalised theory of alternative inheritance, with special reference to Mendel’s laws. Proceedings of the Royal Society of London, 71, 505–509.
Pearson, K. (1909a). The theory of ancestral contributions in heredity. Proceedings of the Royal Society of London, 81, 219–224.
Pearson, K. (1909b). On the ancestral gametic correlations of a Mendelian population mating at random. Proceedings of the Royal Society of London, 81, 225–229.
Punnett, R. C. (1915). Mimicry in butterflies. Cambridge: Cambridge University Press.
Rousset, F. & Raymond, M. (1995). Testing heterozygote excess and deficiency. Genetics, 140, 1413–1419.
Snow, C. P. (1967). Variety of men. London: Macmillan.
Song, Y., Stampfer, M. J. & Liu, S. (2004). Meta-analysis: Apolipoprotein E genotypes and risk for coronary heart disease. Annals of Internal Medicine, 141, 137–147.
Stark, A. E. (2006a). A clarification of the Hardy–Weinberg law. Genetics, 174, 1695–1697.
Stark, A. E. (2006b). Stages in the evolution of the Hardy–Weinberg law. Genetics and Molecular Biology, 4, 589–594.
Stern, C. (1943). The Hardy–Weinberg law. Science, 97, 137–138.
Wahlund, S. (1928). Zusammensetzung von population und korrelationserscheinung vom standpunkt der vererbungslehre aus betrachtet. [Population composition and correlation structure considered from the viewpoint of genetics]. Hereditas, 11, 65–106.
Weinberg, W. (1901). Beiträge zur physiologie und pathologie der mehrlingsgeburten beim menschen. [Contributions to the physiology and pathology of human multiple births]. Pflügers Archiv ges. Physiologie, 88, 346–430.
Weinberg, W. (1908). Über den Nachweis der Vererbung beim Menschen. [On the demonstration of inheritance in humans]. Jahreshefte des Vereins für vaterländische Naturkunde in Württemberg, Stuttgart, 64, 369–382. Translation by R. A. Jameson printed in D. L. Jameson (Ed.), (1977). Benchmark papers in genetics, Volume 8: Evolutionary genetics (pp. 115–125). Stroudsburg, PA: Dowden, Hutchinson & Ross.
Weinberg, W. (1909a). Über Vererbungsgesetze beim Menschen I. [On laws of heredity in humans]. Zeitschrift für induktive Abstammungen- und Vererbungslehre, 1, 377–392, 440–460.
Weinberg, W. (1909b). Über Vererbungsgesetze beim Menschen II. [On laws of heredity in humans]. Zeitschrift für induktive Abstammungen- und Vererbungslehre, 2, 276–330.
Weinberg, W. (1910). Statistik und Vererbung in der Psychiatrie. [Statistics and genetics in psychiatry]. Klinik für psychische und nervöse Krankheiten, 5, 34–43.
Weinberg, W. (1912a). Methode und Fehlerquellen der Untersuchung auf Mendelschen Zahlen beim Menschen. [Methods and sources of error in the investigation of Mendelian ratios in humans]. Archiv für Rassen- und Gesundheits Biologie, 9, 165–174.
Weinberg, W. (1912b). Zur Vererbung der Anlage der Blutkrankheit mit methodologischen Ergänzungen meiner Geschwistermethode. [On the inheritance of predisposition to blood disease with methodological additions to my sib method]. Archiv für Rassen- und Gesundheits Biologie, 9, 694–709.
Weinberg, W. (1918). Künstliche Fehlgeburt und künstliche Unfruchtbarkeit vom Standpunkt der Statistik. [Artificial abortion and artificial infertility from the point of view of statistics]. In S. Placzek (Ed.), Künstliche Fehlgeburt und künstliche Unfruchtbarkeit: ihre Indikationen, Technik und Rechtslage (pp. 437–456). Leipzig: Thieme Verlag.
Weinberg, W. (1927). Grundlagen der Probandenmethode. [Foundations of the proband method]. Zeitschrift für induktive Abstammungen- und Vererbungslehre, 48, 179–228.
Weinberg, W. (1934). Differenzmethode und Geburtenfolge bei Zwillinge (nebst einem Anhang über dem mittleren Fehler der Geburtenfolgen-nummer). [Difference method and birth order in twins (together with an appendix on the mean error of the birth rank)]. Genetica, 16, 383–388.
Weinberg, W. & Gaspar, K. (1904). Die bösartigen Neubildungen in Stuttgart von 1878 bis 1902. [Malignant tumours in Stuttgart from 1878 to 1902]. Zeitschrift für Krebsforschung, 2, 195–260.
Weir, B. S., Hill, W. G. & Cardon, L. R. (2004). Allelic association patterns for a dense SNP map. Genetic Epidemiology, 27, 442–450.
Wright, S. (1922). Coefficients of inbreeding and relationship. American Naturalist, 56, 330–338.
Wright, S. (1931). Evolution in Mendelian populations. Genetics, 16, 97–159.
Wright, S. (1939). The distribution of self-sterility alleles in populations. Genetics, 24, 538–552.
Zou, G. Y. & Donner, A. (2006). The merits of testing Hardy–Weinberg equilibrium in the analysis of unmatched case-control data. Annals of Human Genetics, 70, 923–933.
