Proc. Natl. Acad. Sci. USA
Vol. 88, pp. 6716-6720, August 1991
Genetics

Role of diversifying selection and gene conversion in evolution of major histocompatibility complex loci
(major histocompatibility complex polymorphisms/multigene family)

Tomoko Ohta
National Institute of Genetics, Mishima 411, Japan

Communicated by Motoo Kimura, April 12, 1991 (received for review February 2, 1990)

ABSTRACT Genes at the major histocompatibility complex (MHC) in mammals are known to have exceptionally high polymorphism and linkage disequilibrium. In addition, these genes form highly complicated gene families that have evolved through gene conversion and unequal crossing-over. It has been shown recently that amino acid substitution at the antigen recognition site (ARS) is more rapid than synonymous substitution, suggesting some kind of positive selection working at the ARS. It is highly desirable to know the interactive effect of gene conversion and natural selection on the evolution and variation of MHC gene families. A population genetic model is constructed that incorporates both selection and gene conversion. Diversifying selection is assumed in which sequence diversity is enhanced not only between alleles at the same locus but also between duplicated genes. Expressed and nonexpressed loci are assumed as in the class I gene family of MHC, with gene conversion occurring among all loci. Extensive simulation studies reveal that very weak selection at individual amino acid sites in combination with gene conversion can explain the unusual pattern of evolution and polymorphisms. Here both gene conversion and natural selection contribute to enhancing polymorphism.

The exceptionally high levels of polymorphism at the class I and class II loci of the major histocompatibility complex (MHC) in human and mouse have been of great interest for many years (for reviews, see refs. 1 and 2). Based on recent discoveries of the effects of protein structure on antigen recognition and by using reported DNA sequences at these loci, Hughes and Nei (3) have shown that amino acid replacement substitutions occur more frequently than synonymous substitutions at the antigen recognition site (ARS). From this finding, these authors argue that heterozygote advantage (overdominant selection) is operating at the ARS. However, these genes are known to be evolving under various molecular interaction mechanisms such as gene conversion and unequal crossing-over (see refs. 2 and 4-7, for reviews), and overdominant selection at fixed loci would seem to be an insufficient mechanism. It is highly desirable to investigate how natural selection interacts with such molecular mechanisms. In this report, I show that a model that incorporates both selection and gene conversion fits the observed facts better than the model of simple overdominant selection.

Model and Simulation Procedure

In the genomes of human and mouse, there are usually three loci each of the class I and class II gene families (1, 2). All genes are expressed as important cell-surface molecules that participate in regulating immune reaction (ref. 8, see pages 1037-1054). In both class I and class II families, there is variation in the number of genes among different species and also in the level of polymorphism (see ref. 9). Apparently, the numbers of normally expressed genes are rather small, usually three but occasionally two or four, in each of the two class families of mouse and human. The expressed loci are called "classical" for the class I family. Exceptionally high levels of polymorphism exist only at the expressed loci, and nonexpressed loci are much less polymorphic. It has been speculated that the seemingly "dormant" genes may be useful as a donor repertory for gene conversion and help in enhancing polymorphisms (10, 11). Based on such unusual genetic organization at MHC loci, it has been suggested that these genes evolve with continuing formation, diversification, and degeneration of alleles and loci, presumably because of changing demands upon antigen presentation by a variable antigenic environment (12). This picture may be viewed as "genetic turnover" involving unequal crossing-over, gene conversion, and diversifying selection. It might also be regarded as a type of frequency-dependent selection in the sense of minority advantage within a population of a gene family.

From sequence comparisons, it is thought that the three presently expressed loci of each gene family were duplicated after the mouse-human divergence (9). At the time when the genes duplicated, their divergence would have been low. Thus, in my model, diversity among genes is assumed to be enhanced by selection. Two types of loci are assumed: l_s is the number of selected loci corresponding to the classical class I loci, and l_n is the number of nonselected loci corresponding to the nonclassical ones. All loci are assumed to be identical and free of mutations at the beginning. Each locus consists of 50 sites that correspond to the amino acid sites in the ARS. Mutation according to the infinite allele model (13) is assumed at each site. Fig. 1 shows the model for the case of l_s = 3 and l_n = 6.

A realistic value of the mutation rate per ARS, v, was chosen with respect to the product, Nv, where N is the effective population size. It is now known that the average heterozygosity per nucleotide site of man is around 0.002-0.004 (ref. 14, see page 267). This value can be set approximately equal to 4Nv_0, where v_0 is the selectively neutral mutation rate per site per generation (15). There are 57 amino acids in the ARS (16, 17), and they would correspond to roughly 100 amino acid replacement sites. Therefore, 4Nv of ARS should be about 0.2-0.4. I have used Nv = 0.1 (N = 50, and v = 0.002). I further assume that one generation roughly equals a year in the ancestral species of man and mouse in the subsequent discussion.

As in an ordinary gene family, gene conversion is assumed to occur among the loci, in addition to mutation and random genetic drift. Interlocus but intrachromosomal conversion is carried out by choosing two loci from (l_s + l_n) loci, and one

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

Abbreviations: MHC, major histocompatibility complex; ARS, antigen recognition site.
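The choice of the mutation parameter described in this section amounts to a short piece of arithmetic. The sketch below only restates the numbers quoted in the text; the variable names are illustrative and not taken from the paper.

```python
# Per-site heterozygosity in humans, taken as roughly 4*N*v0 (range quoted in the text).
het_per_site = (0.002, 0.004)

# About 100 amino acid replacement sites for the 57 ARS amino acids (from the text).
replacement_sites = 100

# Expected range of 4*N*v for the whole ARS.
four_Nv_range = tuple(h * replacement_sites for h in het_per_site)

# Values actually used in the simulations.
N = 50      # effective population size
v = 0.002   # mutation rate per ARS per generation
Nv = N * v

print("4Nv range for the ARS:", four_Nv_range)
print("Nv used:", Nv, " 2Nv:", 2 * Nv, " 4Nv:", 4 * Nv)
```

With N = 50 and v = 0.002, 4Nv sits at the upper end of the estimated range, and 2Nv matches the value listed under the tables.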

Genetics: Ohta, Proc. Natl. Acad. Sci. USA 88 (1991)

of the two, randomly chosen, converts the other. A site is randomly chosen from the 49 sites excluding the rightmost one, and either the region to its left including itself or the region to its right excluding itself is converted. The rate of occurrence of the above event is λ per gene per generation. Because only half of a gene is converted on the average, the effective conversion rate is λ_e = λ/2. This procedure produces various "recombinant" genes among loci. A small value of conversion rate (Nλ_e = 0.0-0.4) was chosen that was thought to be realistic based on gene diversity and gene trees.

FIG. 1. (A) Diagram of the gene family consisting of three expressed loci and six nonexpressed loci. Allelic (F) and nonallelic (C1) identity coefficients are also shown. (B) Diagram of three expressed loci. A gene contains 50 sites, and d_Fi and d_C1i are the numbers of different sites in comparison.

Intralocus but interchromosomal conversion is also incorporated. It is performed by choosing a locus from (l_s + l_n) loci of a diploid individual, and the gene at this locus on one chromosome converts that on the homologous chromosome of the individual. This is again done by choosing a site from 49 sites of the gene. The interchromosomal conversion has an effect similar to that of ordinary crossing-over. Various "recombinant" genes are again produced by this process, and recombination is between genes at the same locus this time. The rate of intralocus conversion is β, and the effective rate is β_e = β/2. Again a low rate (Nβ_e = 0.1) was chosen that was thought to be realistic from the data.

For selectively neutral mutations, theoretical predictions on genetic variability in the present model are possible (4, 18). When selection is involved, the process is more complicated, and extensive Monte Carlo simulations are required.

If the system starts from identical genes, natural selection should operate to increase diversity not only between alleles at the same locus but also between genes at different loci. Here it is convenient to use the identity coefficients (4, 18) of the multigene family model. Fig. 1A shows the two identity coefficients that represent the probabilities of gene identity of the illustrated relationships for the case of l_s = 3 and l_n = 6. Selection is assumed to work to lower F and/or C1. Let d_Fi and d_C1i be the numbers of different sites among the sequences of the illustrated relationships of a diploid individual as in Fig. 1B. Then the fitness of a gamete, w, from this individual with d_C1i and d_Fi is assumed to be given by

$$w = \exp(-A_F - A_C), \qquad [1]$$

where

$$A_F = s_F \sum_{i=1}^{l_s} (d_t - d_{F_i}) \quad \text{when } d_{F_i} < d_t, \qquad A_F = 0 \quad \text{when } d_{F_i} \ge d_t,$$

$$A_C = s_C \sum_{i=1}^{l_s - 1} (d_t - d_{C_1 i}) \quad \text{when } d_{C_1 i} < d_t, \qquad A_C = 0 \quad \text{when } d_{C_1 i} \ge d_t.$$

In this equation, s_F and s_C are selection coefficients and d_t is the number corresponding to the truncation point. This fitness function is motivated by the consideration that functional diversity among class I or class II molecules depends on amino acid differences in the ARS, and the greater the differences, the more beneficial a gene family is until each difference reaches its truncation point d_t. This model of selection treats only part of the turnover process of the whole gene family mentioned before, i.e., the process of differentiation of duplicated genes. The results are useful for understanding sequence comparison data. The meaning of "diversifying" selection is somewhat different from that of its ordinary usage, because diversity usually means phenotypic diversity. In the present study, environment is an antigenic world.

RESULTS

Our main interest is in how gene diversity is attained under selection and gene conversion. First, results of sequence divergence from the ancestral sequence are presented. Two cases are studied. In case 1, three expressed loci are present and the copy number remains constant (l_s = 3, l_n = 0). In case 2, three expressed and six nonexpressed loci are present (l_s = 3, l_n = 6). In all cases, both interlocus (intrachromosome) and intralocus (interchromosome) conversions are incorporated. In each case, several levels of selection intensity (s_F and s_C of Eq. 1) were considered, but the value of the truncation point was always assumed to be 10 (d_t = 10 in Eq. 1).

Sequence divergence from the original, measured by the distance -log_e(1 - p_d), where p_d is the fraction of different sites, is presented in Fig. 2 as a function of time. The straight line gives the case of neutral mutations. When 2Ns = 0.5, gene conversion appears to be more effective in case 2 than in case 1. However, when 2Ns = 1.0, the pattern of the accelerated divergence is similar in both cases. For 2Ns = 2.0, the acceleration slows down as time goes on in case 2, where gene conversion from the nonselected loci is incorporated. This is thought to be caused by the decreased selection effect produced by conversion from the nonexpressed loci that are not rapidly diverging. I have repeated simulations, and very similar figures were obtained. Thus, the result is repeatable and has an important bearing on the observed decrease of the acceleration of amino acid substitution at the antigen recognition site of the class I genes (3). This problem will be discussed further in a later section.

It can be seen from the figures that the selection is very efficient even when it is mild. The selection intensity of 2Ns ≈ 1 is sometimes called "near neutrality" (15). The reason for such efficient selection is the linkage disequilibrium among segregating sites, a situation similar to the Franklin-Lewontin effect of multilocus selection (19). Once "coadapted" mutant blocks are established, they are maintained in the population by selection (20). The appearance of such coadapted blocks is thought to be the result of interacting forces of selection and random drift (19).

Let us now turn our attention to the more general properties of the present model. Several interesting quantities such as allelic diversity or actual number of alleles were examined in the period from the (1/v)th generation to the end of each simulation experiment, and average values are presented in Table 1. These values do not pertain to the equilibrium situation but rather are the average for the transient phase. The actual MHC gene families turn over in the evolutionary
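The interlocus conversion step and the fitness function of Eq. 1 described in the model section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the simulation program used for the paper: genes are lists of 50 integer allele labels, nonallelic comparisons are taken between neighbouring expressed loci on the gamete's own chromosome (giving l_s - 1 terms), and the function names `convert`, `n_diff`, and `fitness` are hypothetical.

```python
import math
import random

N_SITES = 50  # sites per gene, as in the text


def convert(donor, recipient, rng=random):
    """Return a copy of `recipient` after one conversion event from `donor`.

    A site is drawn from the 49 sites excluding the rightmost one; then either
    the region to its left (site included) or the region to its right
    (site excluded) is overwritten with the donor sequence.
    """
    site = rng.randrange(N_SITES - 1)  # index 0..48
    out = list(recipient)
    if rng.random() < 0.5:
        out[: site + 1] = donor[: site + 1]   # left region, site included
    else:
        out[site + 1:] = donor[site + 1:]     # right region, site excluded
    return out


def n_diff(a, b):
    """Number of sites at which two genes differ."""
    return sum(x != y for x, y in zip(a, b))


def fitness(chrom, homolog, s_f, s_c, d_t=10):
    """Gamete fitness w = exp(-A_F - A_C) following Eq. 1.

    `chrom` and `homolog` are lists of the expressed genes on the two
    chromosomes of a diploid individual. Each comparison contributes
    (d_t - d) only while d < d_t, which is what max(0, ...) encodes.
    Assumption: nonallelic terms use neighbouring loci on `chrom`.
    """
    a_f = s_f * sum(max(0, d_t - n_diff(g1, g2))
                    for g1, g2 in zip(chrom, homolog))
    a_c = s_c * sum(max(0, d_t - n_diff(chrom[i], chrom[i + 1]))
                    for i in range(len(chrom) - 1))
    return math.exp(-a_f - a_c)


if __name__ == "__main__":
    identical = [[0] * N_SITES for _ in range(3)]
    print(fitness(identical, identical, s_f=0.01, s_c=0.01))
    print(sum(convert([1] * N_SITES, [0] * N_SITES)))
```

For three identical expressed loci with s_F = s_C = 0.01 and d_t = 10, every comparison is at its maximum penalty, so the exponent is -(0.3 + 0.2); once every comparison reaches d_t differences the fitness is 1.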

FIG. 2. Gene divergence, measured by -log_e(1 - p_d), as a function of time, where p_d is the fraction of sites that differ from the original sequence. Time (horizontal axis, 0 to 80) is measured in units of N generations. Solid line, case 1; dashed line, case 2. Parameters are 2Nv = 0.2, 2Nλ = 0.1, and 2Nβ = 0.4.
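The distance plotted in Fig. 2 is straightforward to compute from a sequence and its ancestor. The following is a small sketch; the function name `divergence` and the list-of-labels representation of a gene are assumptions made for illustration.

```python
import math


def divergence(seq, ancestor):
    """Distance -log_e(1 - p_d), with p_d the fraction of differing sites."""
    p_d = sum(a != b for a, b in zip(seq, ancestor)) / len(ancestor)
    if p_d >= 1.0:
        return math.inf  # every site differs; the distance is unbounded
    return -math.log(1.0 - p_d)


if __name__ == "__main__":
    ancestor = [0] * 50
    derived = [1] * 5 + [0] * 45   # 5 of 50 sites changed, p_d = 0.1
    print(divergence(derived, ancestor))
```

For small p_d the distance is close to p_d itself, and it grows faster than p_d as repeated changes at the same site become more likely.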

time scale as discussed previously, so the transient phase should be closer to the real situation.

Examined are nonallelic diversity, allelic diversity, age of nonidentity (polymorphism; both allelic and nonallelic), actual number of alleles, identity excess, allelic diversity at one of the nonexpressed loci (case 2), and distance from the original sequence at the end of each simulation. Allelic and nonallelic diversity are measured by the fraction of the different sites among the 50 sites. Age of nonidentity is measured by the age of the younger mutant at the two sites compared, whenever the two sites differ, and the value is averaged for all different sites. Both the diversity and the age for nonallelic comparisons are made for nonallelic genes on the same chromosome, corresponding to C1 in Fig. 1. The actual number of alleles is the one found in the simulated population of 2N = 100.

Identity excess is a measure of linkage disequilibrium. When there are many sites in a gene, this measure is convenient and hence is used here. Let F_A and F_B be the fractions of identical sites of two randomly chosen chromosomes at the first and the second loci, respectively. Let F_AB be the probability of having identical sites simultaneously at the first and second locus. Then the identity excess is F_AB - F_A F_B. I have suggested the following standardized measure (21):

$$\frac{F_{AB} - F_A F_B}{H_A H_B}, \qquad [2]$$

where H_A = 1 - F_A and H_B = 1 - F_B. The denominator is the product of the fraction of nonidentical sites at the first locus and that at the second locus. Hedrick (22) suggested standardization of each observed value, instead of using the means of F_A, F_B, and so on. When a multisite gene is treated, the present measure is more convenient. Both measures are dependent on the value of the denominator, and one has to be careful in evaluating the results (22).

Several interesting properties of the model can be found in the data of Table 1. (i) Very weak selection at individual sites causes a large increase of allelic and nonallelic diversities, as well as of the actual number of alleles. (ii) Although selection is effective in both cases, allelic and nonallelic diversity and the actual number of alleles are increased by interlocus gene conversion, especially when selection is very weak. (iii) Age of nonidentity becomes higher by the selection, and interlocus conversion again increases the age of nonidentity. (iv) Identity excess is large even when selection is very mild, i.e., fairly large linkage disequilibrium is expected under the present model. (v) Allelic diversity at the nonexpressed loci (case 2) is one-third to about one-half of that at the expressed loci. (vi) As pointed out before, in case 2, genetic divergence measured by distance from the original sequence is decreased by gene conversion from nonexpressed loci in the later period of the simulations when 2Ns = 1.0 or more. All of these properties of the model have significant implications for understanding the observed pattern of MHC polymorphisms, which will be discussed later.

Our next simulation experiments incorporate a different form of selection, in which diversifying selection is only for

Table 1. Properties of the simulated populations in the period from the (1/v)th to the 80Nth generation

Case  2Ns   Nonallelic     Age of         Allelic        Age of          Actual      Identity      Allelic        Divergence
            diversity*     nonidentity†   diversity*     polymorphism†   number of   excess,       diversity,*    at 80Nth
                                                                         alleles     standardized  nonexpressed   generation
1     0.0   0.152 ± 0.074  1037           0.010 ± 0.006  93              3.21        1.24          -              0.151
      0.5   0.176 ± 0.074  827            0.038 ± 0.025  443             4.30        1.50                         0.177
      1.0   0.259 ± 0.052  1253           0.078 ± 0.047  824             4.72        2.26                         0.2%
      1.5   0.249 ± 0.069  1115           0.086 ± 0.052  698             4.94        3.43                         0.269
      2.0   0.295 ± 0.060  1277           0.129 ± 0.036  1029            5.22        1.34                         0.292
2     0.0   0.145 ± 0.064  1042           0.024 ± 0.019  540             3.92        2.30          0.027          0.179
      0.5   0.257 ± 0.081  1142           0.070 ± 0.042  790             4.59        5.26          0.023          0.264
      1.0   0.258 ± 0.069  1039           0.088 ± 0.053  824             4.57        3.00          0.046          0.226
      1.5   0.261 ± 0.041  1193           0.120 ± 0.041  999             5.25        1.20          0.076          0.202
      2.0   0.290 ± 0.035  1288           0.135 ± 0.040  1152            5.95        1.02          0.071          0.214

Other parameters: 2Nv = 0.2, 2Nλ = 0.1, and 2Nβ = 0.4, with N = 50.
*Diversity is measured by the fraction of different sites among the 50 sites with the standard deviation between generations.
†Age is the average value in terms of number of generations for all nonidentical sites in diploid individuals.
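The standardized identity excess of Eq. 2, reported in the tables, can be estimated from a sample of chromosomes roughly as follows. The estimator is an assumption made for illustration: F_A and F_B are taken as averages over all pairs of distinct chromosomes, and F_AB as the average of the product of the two per-pair identities. The paper does not spell out its estimator, and the name `standardized_identity_excess` is hypothetical.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of sites at which two genes are identical."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def standardized_identity_excess(chromosomes):
    """Estimate (F_AB - F_A * F_B) / (H_A * H_B) from sampled chromosomes.

    `chromosomes` is a list of (gene_at_locus_A, gene_at_locus_B) tuples.
    Assumption: averages run over all unordered pairs of distinct chromosomes.
    """
    pairs = list(combinations(chromosomes, 2))
    f_a = [identity(c1[0], c2[0]) for c1, c2 in pairs]
    f_b = [identity(c1[1], c2[1]) for c1, c2 in pairs]
    F_A = sum(f_a) / len(pairs)
    F_B = sum(f_b) / len(pairs)
    F_AB = sum(x * y for x, y in zip(f_a, f_b)) / len(pairs)
    H_A, H_B = 1.0 - F_A, 1.0 - F_B
    if H_A * H_B == 0.0:
        return float("nan")  # no variation at one locus; measure undefined
    return (F_AB - F_A * F_B) / (H_A * H_B)


if __name__ == "__main__":
    # Two fully associated haplotypes, each present twice.
    x = ([0, 0], [0, 0])
    y = ([1, 1], [1, 1])
    print(standardized_identity_excess([x, x, y, y]))
```

Because the denominator shrinks as either locus approaches monomorphism, the standardized value can become large, in line with the caution in the text about evaluating this measure.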

allelic genes and does not work on nonallelic genes, i.e., s_F > 0 and s_C = 0 in Eq. 1. Also, the length of the simulation is extended to the 160Nth generation. As before, the average values of gene diversity, age, and so on, in the period from the (1/v)th generation to the 160Nth generation were examined. Table 2 gives the results for case 1. In the series of experiments, the intensity of selection is fixed (2Ns = 1.0), and the rate of interlocus conversion was varied from 2Nλ = 0.0 to 2Nλ = 0.4. In the table, C1 + F means that there is selection for diversity of both allelic and nonallelic genes (s_F = s_C = 0.01 in Eq. 1), and F means that selection is only on allelic genes (s_F = 0.01, s_C = 0 in Eq. 1). Let us call the former (C1 + F) selection and the latter F selection.

Table 2. Properties of the simulated populations in the period from the (1/v)th to the 160Nth generation, for case 1 with 2Ns = 1.0

Selection  2Nλ   Nonallelic     Age of         Allelic        Age of          Actual      Identity      Divergence
                 diversity*     nonidentity†   diversity*     polymorphism†   number of   excess,       at 160Nth
                                                                              alleles     standardized  generation
C1 + F     0.0   0.419 ± 0.101  2259           0.053 ± 0.024  622             4.10        0.77          0.506
           0.02  0.340 ± 0.094  2087           0.099 ± 0.046  1272            4.35        2.19          0.475
           0.04  0.336 ± 0.069  2096           0.119 ± 0.050  1577            4.79        2.11          0.442
           0.1   0.266 ± 0.059  1802           0.118 ± 0.039  1583            4.79        2.50          0.385
           0.2   0.249 ± 0.069  1529           0.101 ± 0.053  1276            5.39        2.16          0.404
           0.4   0.223 ± 0.049  1895           0.098 ± 0.053  1533            5.68        3.12          0.519
F          0.0   0.328 ± 0.152  1953           0.062 ± 0.031  758             4.14        0.24          0.439
           0.02  0.252 ± 0.092  1632           0.099 ± 0.058  1102            4.33        2.27          0.489
           0.04  0.198 ± 0.072  1635           0.090 ± 0.038  1332            4.49        1.96          0.318
           0.1   0.171 ± 0.058  1793           0.132 ± 0.033  1738            5.17        2.66          0.469
           0.2   0.173 ± 0.053  1471           0.131 ± 0.041  1361            5.73        2.78          0.539
           0.4   0.067 ± 0.027  992            0.103 ± 0.032  1359            6.03        5.67          0.395

C1 + F means that selection works for diversity of both allelic and nonallelic genes (s_F = s_C in Eq. 1), and F means that selection works only on allelic genes (s_F > 0, s_C = 0 in Eq. 1). Other parameters: 2Nv = 0.2 and 2Nβ = 0.4, with N = 50. See Table 1 for footnotes.

The results of Table 2 show that, as the conversion rate increases, the nonallelic diversity decreases, whereas the allelic diversity becomes larger in both selection models. As to the effect of the selection form, the nonallelic diversity is higher and the allelic diversity is lower with (C1 + F) selection than with F selection. This is just as expected. Thus, in the extreme situation of F selection with high conversion rate, the allelic diversity exceeds the nonallelic diversity. Such a relationship is not in accord with the real data of MHC polymorphisms. The age of polymorphism is affected by the type of selection and by the conversion rate in the same way as the diversity. The actual number of alleles also increases as the conversion rate increases in both selection models. The result has a significant bearing on understanding MHC polymorphisms, which will be discussed later. In both models, the identity excess is high.

The results of case 2 for the two models of selection are given in Table 3. General properties of data in Table 3 are quite similar to those of data in Table 2. However, there are significant differences between the two. The differences are caused by gene conversion involving nonexpressed loci in case 2. First, both diversity and age of nonidentity increase by conversion from nonexpressed loci. Second, in case 2, unlike the previous case 1, allelic diversity does not exceed nonallelic diversity even in the extreme situation of F selection with high conversion rate. Third, the actual number of alleles tends to be slightly larger, but the identity excess and the distance at the end tend to be smaller in case 2 than in case 1. In case 2, the allelic diversity at one of the nonexpressed loci was also measured and is given in the table. The diversity at the nonexpressed locus is 30-73% of that at the expressed loci. The difference between the expressed and the nonexpressed loci is insufficient compared with real data, and the problem will be discussed later.

Table 3. Properties of the simulated populations in the period from the (1/v)th to the 160Nth generation for case 2 with 2Ns = 1.0

Selection  2Nλ   Nonallelic     Age of         Allelic        Age of          Actual      Identity      Allelic        Divergence
                 diversity*     nonidentity†   diversity*     polymorphism†   number of   excess,       diversity,*    at 160Nth
                                                                              alleles     standardized  nonexpressed   generation
C1 + F     0.0   0.462 ± 0.146  2197           0.074 ± 0.032  804             4.34        1.25          0.022          0.521
           0.02  0.339 ± 0.087  2026           0.079 ± 0.041  1366            4.37        0.79          0.033          0.358
           0.04  0.332 ± 0.099  2331           0.105 ± 0.038  1541            4.45        1.70          0.055          0.309
           0.1   0.284 ± 0.063  1941           0.121 ± 0.055  1640            5.10        1.78          0.091          0.352
           0.2   0.274 ± 0.060  1898           0.146 ± 0.044  1684            5.92        1.29          0.095          0.384
           0.4   0.273 ± 0.067  1820           0.159 ± 0.059  1703            7.15        1.65          0.096          0.464
F          0.0   0.336 ± 0.141  1738           0.017 ± 0.012  143             3.36        1.95          0.013          0.450
           0.02  0.286 ± 0.105  1898           0.093 ± 0.043  1360            4.43        1.19          0.038          0.321
           0.04  0.278 ± 0.109  1872           0.117 ± 0.054  1610            4.86        1.87          0.056          0.354
           0.1   0.248 ± 0.082  2056           0.136 ± 0.044  1817            5.28        1.92          0.075          0.294
           0.2   0.213 ± 0.072  1687           0.133 ± 0.049  1604            6.17        1.90          0.093          0.422
           0.4   0.167 ± 0.050  1494           0.133 ± 0.034  1506            7.11        2.36          0.097          0.380

C1 + F, F, and other parameters are as in Table 2. See Table 1 for footnotes.

DISCUSSION

The present simulation studies have clearly shown that the interaction among diversifying selection, gene conversion, and random genetic drift is important for acquiring and maintaining MHC polymorphisms: diversifying selection as well as gene conversion is effective in increasing the allelic diversity, the actual number of alleles, and the age of polymorphism. The effect of conversion is particularly pronounced in case 2, where gene conversion from the nonexpressed loci is incorporated. Note that random drift is also important. In their simulation study of multilocus overdominance, Franklin and Lewontin (19) concluded that the establishment of complementary blocks of genes is caused by "finiteness" of population size. In our study, gene conversion makes the system more complex, and all three processes contribute to the development of polymorphisms.

The real data of MHC polymorphisms suggest that many polymorphisms are trans species, i.e., more ancient than recent speciation (23). The age of polymorphisms is often estimated to be (50-100)N generations (24). Our results suggest that an age of this order of magnitude may be explained by assuming weak selection at each site. In addition to the results given in Tables 1-3, I have checked the age of polymorphic alleles at the end of each simulation experiment. These values often exceed 100N generations at the 160Nth generation of the experiment even under weak selection (2Ns = 1.0). Note that the age of polymorphic sites given in Tables 2 and 3 is the average for all segregating sites in the period from the (1/v)th to the 160Nth generation, and the value is much smaller than the age of polymorphic alleles at the end of the experiment.

The actual number of alleles in a sample of 10-80 haplotypes is reported to be 5-20 in local Mus populations (25). Our results suggest that the number in the simulated populations is slightly smaller than such data. This problem may be overcome by making l_n larger or the region of conversion smaller in the simulation to induce more "recombinant" genes. The allele number may also be increased by bringing the subdivided population structure into the model. These problems are left to a future study.

Allelic diversity at the nonexpressed locus may not be small enough as compared with that at the expressed loci in the simulated populations. In other words, the difference between the expressed and the nonexpressed genes may be insufficient to account for the actually observed difference. In future analyses, the preferential conversion from the nonexpressed to the expressed loci should be incorporated.

Linkage disequilibrium is one of the most intensely studied quantities on MHC polymorphisms (26). Strong association among serologically detectable alleles of different loci is often observed, but the combination of strongly associated alleles is usually different between local populations, indicating some effects of random drift. In the present simulations, strong linkage disequilibrium occurs as shown by standardized identity excess in Tables 1-3. Although one needs more detailed numerical comparison with the actual data, the association of alleles between loci appears to be strong enough in the simulated populations. I incorporated interchromosomal conversion, which has an effect similar to the ordinary meiotic crossing-over, but it may not be quite sufficient.

Finally, I discuss the puzzling observation on the acceleration of amino acid substitution at ARS. Figure 2 of ref. 20 indicates that the acceleration of amino acid substitution disappears as genes become old in the genetic turnover process mentioned earlier. Hughes and Nei (3) suggest that the slowdown of the acceleration can be explained if only certain types of amino acid replacements are favored even at ARS with back and forth substitutions, which they call "saturation." For the class II gene family, this hypothesis may be appropriate, since the mean divergence is high. It is likely that the evolutionary pattern is considerably different between the class I and class II families. It has been suggested that the HLA-DR locus activates the immune reaction, whereas the HLA-DQ locus may suppress the reaction (27). Thus, the two loci may not be interchangeable, and the genetic turnover may be prevented in the class II gene family. For class I genes, the mean divergence is 40% at most, and one has to assume very limited types of amino acid replacements at ARS to explain the slowdown by saturation. I have mentioned that the diversifying selection for allelic and nonallelic genes results in rapid divergence at the beginning of the turnover process, followed by the slower divergence (20). The present simulations have shown that conversion from the nonexpressed loci strengthens this tendency. As mentioned before, this effect may be caused by the slow divergence at the nonexpressed genes from which genetic information transfers to the expressed loci.

I thank Professors Motoo Kimura, Takehiko Sasazuki, Philip W. Hedrick, Bruce S. Weir, Hiroshi Hori, and Kenichi Aoki for their many valuable comments on the manuscript. This work is supported by a Grant-in-Aid from the Ministry of Education, Science and Culture of Japan. This is contribution no. 1869 from the National Institute of Genetics, Mishima, Japan.

1. Klein, J. (1986) Natural History of the Major Histocompatibility Complex (Wiley, New York).
2. Bodmer, W. F. & Bodmer, J. G. (1989) in Mathematical Evolutionary Theory, ed. Feldman, M. W. (Princeton Univ. Press, Princeton, NJ), pp. 315-334.
3. Hughes, A. L. & Nei, M. (1988) Nature (London) 335, 167-170.
4. Ohta, T. (1983) Theor. Popul. Biol. 23, 216-240.
5. Ohta, T. (1988) in Oxford Surveys in Evolutionary Biology, V, eds. Harvey, P. H. & Partridge, L. (Oxford Univ. Press, Oxford, U.K.), pp. 41-65.
6. Kappes, D. & Strominger, J. L. (1988) Annu. Rev. Biochem. 57, 991-1028.
7. Lawlor, D. A., Zemmour, J., Ennis, P. D. & Parham, P. (1990) Annu. Rev. Immunol. 8, 23-63.
8. Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K. & Watson, J. D. (1989) Molecular Biology of the Cell (Garland, New York), 2nd Ed.
9. Klein, J. & Figueroa, F. (1986) CRC Crit. Rev. Immunol. 6, 295-386.
10. Bregegere, F. (1983) Biochimie 65, 229-237.
11. Ohta, T. (1984) Genetics 106, 517-528.
12. Parham, P. (1989) Nature (London) 324, 617-618.
13. Kimura, M. & Crow, J. F. (1964) Genetics 49, 725-738.
14. Nei, M. (1987) Molecular Evolutionary Genetics (Columbia Univ. Press, New York).
15. Kimura, M. (1983) The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, London).
16. Bjorkman, P. J., Saper, M. A., Samraoui, B., Bennett, W. S., Strominger, J. L. & Wiley, D. C. (1987) Nature (London) 329, 506-512.
17. Bjorkman, P. J., Saper, M. A., Samraoui, B., Bennett, W. S., Strominger, J. L. & Wiley, D. C. (1987) Nature (London) 329, 512-518.
18. Nagylaki, T. (1984) Genetics 106, 529-548.
19. Franklin, I. & Lewontin, R. C. (1970) Genetics 65, 707-734.
20. Ohta, T. (1991) in Evolution of Life, eds. Osawa, S. & Honjo, T. (Springer, Berlin), pp. 145-159.
21. Ohta, T. (1980) Genet. Res. 36, 181-197.
22. Hedrick, P. W. (1987) Genetics 117, 331-341.
23. Klein, J. (1987) Hum. Immunol. 19, 155-162.
24. Takahata, N. & Nei, M. (1990) Genetics 124, 967-978.
25. Nadeau, J. H., Wakeland, E. K., Gotze, D. & Klein, J. (1981) Genet. Res. 37, 17-31.
26. Bodmer, J. G. & Bodmer, W. F. (1970) Am. J. Hum. Genet. 22, 396-411.
27. Sasazuki, T. (1989) Prog. Immunol. 7, 853-860.