Copyright © 2008 by the Genetics Society of America DOI: 10.1534/genetics.107.082131
An Asymmetric Model of Heterozygote Advantage at Major Histocompatibility Complex Genes: Degenerate Pathogen Recognition and Intersection Advantage
Rick J. Stoffels*,†,1 and Hamish G. Spencer* *Allan Wilson Centre for Molecular Ecology and Evolution, Department of Zoology, University of Otago, Dunedin 9054, New Zealand and †The Murray-Darling Freshwater Research Centre, CSIRO Land and Water, Wodonga, Victoria 3689, Australia Manuscript received September 19, 2007 Accepted for publication November 29, 2007
ABSTRACT We characterize the function of MHC molecules by the sets of pathogens that they recognize, which we call their “recognition sets.” Two features of the MHC–pathogen interaction may be important to the theory of polymorphism construction at MHC loci: First, there may be a large degree of overlap, or degeneracy, among the recognition sets of MHC molecules. Second, when infected with a pathogen, an MHC genotype may have a higher fitness if that pathogen belongs to the overlapping portion, or intersection, of the two recognition sets of the host, when compared with a genotype that contains that pathogen in only one of its recognition sets. We call this benefit “intersection advantage,” $\gamma$, and incorporate it, as well as the degree of recognition degeneracy, $m$, into a model of heterozygote advantage that utilizes a set-theoretic definition of fitness. Counterintuitively, we show that levels of polymorphism are positively related to $m$ and that a high level of recognition degeneracy is necessary for polymorphism at MHC loci under heterozygote advantage. Increasing $\gamma$ reduces levels of polymorphism considerably. Hence, if intersection advantage is significant for MHC genotypes, then heterozygote advantage may not explain the very high levels of polymorphism observed at MHC genes.
Genetics 178: 1473–1489 (March 2008)
¹Corresponding author: The Murray-Darling Freshwater Research Centre, CSIRO Land and Water, P.O. Box 991, Wodonga, VIC 3689, Australia. E-mail: [email protected]
Heterozygote advantage has been a particularly appealing heuristic for major histocompatibility complex (MHC) polymorphism as it follows immediately from the biology of the system. That is, if we allow that the function of each MHC molecule is to present a set of pathogen epitopes to T-cells, then, because heterozygotes present two distinct sets of epitopes to T-cells, they may be immune to a more diverse set of pathogens over their lifetime than homozygotes. Evidence of heterozygote advantage at MHC loci is accumulating for populations of humans (Thursz et al. 1997; Carrington et al. 1999; Tang et al. 1999; Jeffery et al. 2000; Trachtenberg et al. 2003), other primates (Sauermann et al. 2001), and various other vertebrates (Penn et al. 2002; McClelland et al. 2003; Froeschke and Sommer 2005).
Some theoretical studies have shown that heterozygote advantage may lead to the levels of polymorphism that we see in real populations of MHC genes (Maruyama and Nei 1981; Takahata and Nei 1990; Takahata et al. 1992), but these models assume very high levels of symmetry in selection, which implies minimal or even no variance in the fitness of homozygotes and no variance in the fitness of heterozygotes. The biological validity of highly symmetric selection has been questioned, because it ignores the contribution made by individual MHC molecules (De Boer et al. 2004). Given that MHC alleles are codominantly expressed (Abbas and Lichtman 2006) and that individual alleles affect the host’s fitness when exposed to a particular pathogen (Jeffery and Bangham 2000; Nikolich-Žugich et al. 2004), significant variation, and hence asymmetry, in genotype fitness may exist within host populations. This variation is important with respect to the heterozygote advantage hypothesis of MHC polymorphism because models of asymmetric heterozygote advantage do not easily lead to a high level of polymorphism (Lewontin et al. 1978; Spencer and Marks 1988; Marks and Spencer 1991; Hedrick 2002). Consequently, evolutionary immunologists have called for fitness functions that more accurately capture the biology of the gene products, so that the validity of the hypotheses of MHC polymorphism may be more rigorously assessed.
Using allele-based fitness functions, De Boer et al. (2004) and Borghans et al. (2004) concluded that heterozygote advantage is not a valid explanation of MHC polymorphism. They showed that a high level of polymorphism is possible only if the fitnesses of all MHC alleles are very similar, which, they claimed, contradicts what we see in reality, and so heterozygote advantage fails to explain the high degree of polymorphism of the MHC. By contrast, we present a large body of evidence
implying that the fitness of MHC alleles may actually be quite similar. Before we present this evidence, however, we must introduce some terminology that we use to describe the relationship between a pathogen strain and an MHC molecule.
Throughout this article we will define an MHC molecule by its pathogen “recognition set.” By saying that an MHC molecule “recognizes” a pathogen strain, we mean that the pathogen has at least one epitope that binds to the peptide-binding groove of that MHC molecule with an appropriate affinity and/or conformation to activate clonal expansion of a T-cell lineage. We assume an MHC molecule has some finite “recognition set,” which is the set of pathogen strains recognized by that MHC molecule. Because two MHC alleles can have disjoint peptide-binding sets but both recognize the same pathogen strain and hence have the same fitness under single-strain infection (see Figure 1 and below), the fitness of an MHC molecule is (partially) defined by its recognition set and not by the set of peptides that it binds. If the recognition sets of MHC molecules are broad, then the specificity of the MHC–pathogen interaction is low and variation in MHC allele fitness is low. By contrast, if recognition sets are narrow, then the specificity of the MHC–pathogen interaction is high and variation in MHC allele fitness is high.
There is little direct empirical evidence to suggest that the pathogen recognition sets of MHC molecules are narrow and disjoint. In our view, recent advances in the understanding of MHC–pathogen interactions imply the opposite: First, each pathogen contains many epitopes, each of which is a viable target for an MHC molecule (see Figure 1, left; Nayersina et al. 1993; Koziel et al. 1995; Rehermann et al. 1995; Bertoni et al. 1997; Doolan et al. 1997; Jameson et al. 1998; Khanna et al. 1998; Rowland-Jones et al. 1998; Crotzer et al. 2000; Doolan et al. 2000; Gianfrani et al. 2000; Boon et al. 2002; Doolan et al. 2003; Schulze zur Wiesch et al. 2005). It is obvious that the potential number of MHC molecules that recognize the same pathogen will increase with the number of epitopes contained within that pathogen (Figure 1). Second, any given epitope may be bound by many different MHC molecules (Figure 1). Indeed, there is a very large quantity of evidence implying a large degree of degeneracy in the peptide-binding sets of MHC molecules (e.g., Sinigaglia et al. 1988; Panina-Bordignon et al. 1989; Barber et al. 1995; Koziel et al. 1995; Sidney et al. 1995; Bertoni et al. 1997; Doolan et al. 1997; Khanna et al. 1998; Southwood et al. 1998; Crotzer et al. 2000; Doolan et al. 2000; Gianfrani et al. 2000; Sidney et al. 2001; Diaz et al. 2005; Schulze zur Wiesch et al. 2005). Thus, the above two features of the pathogen–MHC interaction combine to imply that there may be a large degree of overlap in the pathogen recognition sets of MHC molecules. Consider pathogens A and B in Figure 1: MHC molecules 1–3 all recognize pathogen A while all four MHC molecules recognize pathogen B. It follows that there must exist subsets of MHC alleles with very similar fitnesses and that, while we do not know how similar the lifetime average fitnesses of MHC alleles actually are, we certainly do not have much evidence that implies that the fitnesses of different MHC alleles are very dissimilar.
Figure 1. Hypothetical interaction network among pathogens, MHC molecules, and T-cells illustrating potential for degeneracy in the pathogen recognition sets of MHC molecules.
Thus, a paradox emerges. Population geneticists have shown that selection shapes polymorphism in MHC genes, but at the same time immunologists have shown that they possess a great degree of functional redundancy. Here we provide a reappraisal of the heterozygote advantage hypothesis of MHC polymorphism using a simple, single-locus model of asymmetric selection. We build on the work of De Boer et al. (2004) and Borghans et al. (2004) by utilizing a fitness function that makes allowances for the dual requirements of allele-specific fitness and degeneracy in pathogen recognition sets. To this end, we employ a set-theoretic approach to defining the fitness of MHC alleles. This approach allows us to address two particular aspects of MHC polymorphism under heterozygote advantage.
The first is the effect of the degree of degeneracy in pathogen recognition sets among MHC alleles. If each MHC molecule recognizes and presents a large proportion of the total set of pathogens to T-cells, then the host population may not need a large number of MHC molecules to maintain immunity to the pathogen community. Here we test this hypothesis.
Second, we parameterize our model to control for the form of the fitness profile of genotypes under single-strain infection. Consider the following interaction between two pathogen strains, X and Y, and two MHC alleles, R and r. Suppose allele R contains only X in its recognition set while allele r contains only Y. Under single-strain infection with pathogen X, what are the relative fitnesses (as measured by, for example, pathogen density and blood cell counts) of the three host genotypes RR, Rr, and rr? Figure 2, A and B, presents two alternative fitness profiles under single-strain infection. For the fitness profile in Figure 2A we assumed that genotype RR obtains an advantage from expressing two alleles that both recognize X, while for the profile in Figure 2B we assumed that RR does not obtain any benefit from expressing two alleles that contain X in their recognition sets. Empirically derived fitness profiles under single-strain infection often vary between the two extremes of Figure 2, A and B (Penn et al. 2002; McClelland et al. 2003; Wedekind et al. 2005, 2006), so we introduce a parameter, $\gamma$, that enables us to control for the relative benefit that a genotype obtains by expressing two alleles that recognize a pathogen strain when infected with that strain. We call this benefit “intersection advantage” since it is the proportional benefit obtained from pathogen strains in the intersection (i.e., the overlapping portion) of the alleles’ recognition sets in a diploid genotype (see the model discussed below). Heterozygote advantage emerges under coinfection (Figure 2, E and F) when the corresponding alleles have opposite fitness profiles under single-strain infection, as has been experimentally demonstrated (McClelland et al. 2003). Also, the degree of heterozygote advantage under coinfection is dependent on the shape of the fitness profile under single-strain infection, and hence on the degree of intersection advantage, so the inclusion of this parameter is pivotal to the rigorous assessment of the heterozygote advantage hypothesis of MHC polymorphism maintenance. Finally, under single-strain infection, $\gamma = 1$ means that alleles have an additive effect, which corresponds to a dominance coefficient of $1/2$.
Figure 2. Hypothetical relative fitness profiles of genotypes to single-strain infection and coinfection generated under two levels of intersection advantage, $\gamma$. MHC allele R contains X within its recognition set while r contains Y.
THE MODEL
Suppose our population of MHC molecules is exposed to 100 pathogen strains. We assume that not all strains have equal virulence; thus, we assume that the virulence of a strain is not completely determined by its interaction with MHC molecules. Therefore, let $\Omega = \{\omega_1, \omega_2, \ldots, \omega_{100}\}$, the set of “weights” that defines the community of pathogen strains. Let $\omega_l$ be some arbitrary weight in $\Omega$; then, for $1 \le l \le 100$, $\omega_l$ is drawn from a uniform distribution, $U[0,1]$.
We denote the set of $n$ MHC alleles in the host population as $A = \{a_1, a_2, \ldots, a_n\}$. Suppose that allele $a_i$ codes for an MHC molecule that recognizes some subset of $\Omega$. Denote this subset as $\Omega_i$. This subset has size $m$ for all $i$; $m$ is a parameter. For ease of explanation, we refer to $\Omega_i$ as an MHC allele’s recognition set.
We assign fitnesses to individual alleles. Let $\omega_{i,k}$ be the $k$th element from the set $\Omega_i$ and $\omega_{i \cap j,b}$ be the $b$th element from $\Omega_i \cap \Omega_j$; then
$$w_{ij} = \sum_k (\omega_{i,k} + \omega_{j,k}) - (1 - \gamma) \sum_b \omega_{i \cap j,b}. \qquad (1)$$
The fitness of the homozygote follows immediately by letting $j = i$:
$$w_{i=j} = (1 + \gamma) \sum_k \omega_{i,k}. \qquad (2)$$
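The set-theoretic fitness in Equations 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `fitness_matrix` and `random_recognition_sets` are hypothetical, NumPy is an assumed dependency, and drawing each recognition set uniformly at random without replacement is an assumption (the text fixes only the size of each set).

```python
import numpy as np

def fitness_matrix(weights, recognition_sets, gamma):
    """Return the n x n matrix of genotype fitnesses w_ij (Equation 1).

    weights          -- 1-D array of pathogen weights
    recognition_sets -- list of sets of pathogen indices, one set per allele
    gamma            -- intersection advantage, between 0 and 1
    """
    weights = np.asarray(weights, dtype=float)
    n = len(recognition_sets)

    def total(indices):
        # Sum of the weights of the pathogens in an index set.
        return weights[np.array(sorted(indices), dtype=int)].sum()

    own = [total(s) for s in recognition_sets]
    w = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            shared = recognition_sets[i] & recognition_sets[j]
            # Both alleles contribute; shared pathogens are discounted
            # by (1 - gamma), as in Equation 1.
            w[i, j] = own[i] + own[j] - (1.0 - gamma) * total(shared)
    return w

def random_recognition_sets(n_alleles, m, n_pathogens, rng):
    """Assumed sampling scheme: each set is m pathogens, no repeats."""
    return [set(rng.choice(n_pathogens, size=m, replace=False).tolist())
            for _ in range(n_alleles)]

# Hypothetical example: 100 pathogens, 3 alleles, m = 50, gamma = 0.2.
rng = np.random.default_rng(1)
weights = rng.uniform(0.0, 1.0, size=100)
sets = random_recognition_sets(3, 50, 100, rng)
w = fitness_matrix(weights, sets, gamma=0.2)
```

Setting `j = i` in the loop gives a diagonal entry equal to one plus the intersection advantage times the sum of that allele's weights, which matches Equation 2.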
Therefore, when $\gamma = 0$, the fitness of each homozygote is equal to the sum of the weights in its allele’s recognition set, and the fitness of each heterozygote is equal to the sum of the weights in the union of its alleles’ recognition sets. Here, $\gamma$ is the degree of intersection advantage and represents the proportional benefit that a genotype obtains by having two alleles that recognize a pathogen strain when infected with that strain ($0 \le \gamma \le 1$). If $\gamma = 0$, the homozygote obtains no benefit from having two copies of an allele and a heterozygote obtains no improvements in fitness from the elements in $\Omega_i \cap \Omega_j$. By contrast, if $\gamma = 1$, the fitness benefit that a host genotype obtains in the presence of a pathogen strain, $\omega_i$, is directly proportional to the number of alleles that it carries that recognize that strain.
We assume a monoecious, randomly mating population with discrete, nonoverlapping generations. We also assume that the pathogen community, $\Omega$, is constant for each individual simulation. By making this assumption, we effectively assume that all hosts are infected by all pathogens before finding a mate, that there is no variance in pathogen abundance, and that pathogens do not evolve. Of course, this assumption is artificial, albeit necessary, since we wanted to isolate the effects of heterozygote advantage on polymorphism construction. That is, if we allowed the pathogen community to vary, then we would no longer be studying polymorphism maintenance due to heterozygote advantage alone, but instead studying the combined effects of heterozygote advantage and variation in selective pressures, which are separate hypotheses of polymorphism maintenance in the MHC (e.g., Hedrick 2002). Furthermore, if we allowed the pathogen community to evolve, then we would naturally have a coevolutionary model that would necessarily incorporate frequency-dependent fitness. Since frequency-dependent selection may also maintain polymorphism in the MHC (e.g., Borghans et al. 2004), it is a hypothesis that competes with the heterozygote advantage hypothesis of MHC polymorphism and we would then be confounding our treatment of heterozygote advantage.
Let $p_i$ and $p_i'$ be the frequencies of allele $a_i$ at times $t$ and $t + 1$, respectively; the allele dynamics are then described by the usual recursion equations:
$$p_i' = p_i \bar{w}^{-1} w_i,$$
where $w_i = \sum_j p_j w_{ij}$ and $\bar{w} = \sum_i w_i p_i$.
We conducted simulations with allele introduction and selection. This nonequilibrium, “constructionist” approach (following Spencer and Marks 1993) has proved very useful in the analysis of polymorphism maintenance in the past (Spencer and Marks 1988, 1992, 1993; Marks and Spencer 1991). Researchers utilizing this constructionist approach have shown that polymorphism is far more easily generated and maintained via a simple process of allele introduction and selection than equilibrium-based approaches would suggest (Spencer and Marks 1993). Therefore, simulations were initiated with a single allele and new alleles were introduced at one of two per-locus rates ($\mu_L = 2\mu$, where $\mu$ is the per-gene mutation rate): $10^{-5}$ and $10^{-6}$. Here we consider these allele-introduction rates to represent the combined effects of point mutation and recombination, both of which are important to the generation of MHC diversity (Martinsohn et al. 1999; Ohta 1999; Richman et al. 2003; Consuegra et al. 2005; Reusch and Langefors 2005; Schaschl et al. 2006). We ran simulations with three different effective population sizes ($N_e$) of $10^3$, $10^4$, and $10^5$, so that the rate at which new alleles were added to the population, $\mu_P$, was $\mu_P = \mu_L N_e$; new alleles were introduced when, for generation $t$, $t \bmod \mu_P^{-1} = 0$ (the combinations of $\mu_L$ and $N_e$ used here ensured that $\mu_P^{-1}$ was an integer). We also ran simulations in which mutations were introduced at random time intervals at the same mean rate, but there was no notable difference in results. New alleles were introduced with a frequency of $(2N_e)^{-1}$, and any $p_i$ that fell below $(2N_e)^{-1}$ was eliminated from the population. Here, we assume that all alleles in the host population recognize the same number, $m$, of pathogens and simulate allele introduction and selection with nine levels of $m$: 10, 20, ..., 90, which correspond to fractions 0.1, 0.2, ..., 0.9, respectively. The parameter $m$ represents the degree of degeneracy in pathogen recognition by MHC molecules. Although this model contains a finite number of alleles, there is an extremely large number of distinct combinations of the $\omega_i$’s for any given value of $m$: $100!/[m!(100 - m)!]$. Four values of the parameter $\gamma$ are simulated for each $m$-value: 0, 0.2, 0.4, and 0.8. All simulations were run with and without drift. Genetic drift in a population of $n$ alleles was simulated by taking a sequence of $n - 1$ conditional binomial samples each generation (see Gentle 2003, p. 198). Drift took place after selection. Twenty replicate simulations were run for each $m$–$\gamma$–$N_e$–$\mu_L$ combination, both with and without drift. A new pathogen community, $\Omega$, was drawn for each replicate simulation.
After each simulation was run for $10^5$ generations, we measured five quantities of particular interest. The first quantity is the number of alleles, $n(A)$. For the second quantity, we measured the mean pairwise strength of selection across all genotypes. We defined the relative fitness of a genotype as $\tilde{w}_{ij} = w_{ij}/\max(w_{ij})$. Selection strength, $s_{ij}$, is equal to $s_{ij} = 1 - \tilde{w}_{ij}$ and has domain $[0,1]$. We then take the average of the $n(n + 1)/2$ $s_{ij}$ values as our measure of the strength of selection. For the third quantity, as a measure of the average proportionate heterozygote advantage ($h_{ij}$) relative to the fittest homozygote, we defined $\tilde{w}_{ij,h} = w_{ij}/\max(w_{ii})$ and then $h_{ij} = \tilde{w}_{ij,h} - 1$, and then calculated the mean across the $n(n - 1)/2$ heterozygotes. For the fourth quantity, we calculated the expected heterozygosity: $H = 1 - \sum_i p_i^2$.
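One generation of the procedure described above (selection, then drift, then the summary measures) might be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, NumPy is an assumed dependency, and the drift step assumes that 2N_e gene copies are resampled each generation, which the text implies through the introduction frequency but does not state outright.

```python
import numpy as np

def selection_step(p, w):
    """Apply the recursion p_i' = p_i * w_i / w_bar for one generation."""
    w_i = w @ p              # marginal fitness of each allele
    w_bar = p @ w_i          # mean fitness of the population
    return p * w_i / w_bar

def drift_step(p, n_e, rng):
    """Resample allele counts with n - 1 conditional binomial draws.

    Assumption: the sample is 2 * n_e gene copies per generation.
    """
    remaining = 2 * n_e      # gene copies not yet assigned to an allele
    prob_left = 1.0          # probability mass not yet used up
    counts = np.zeros(len(p), dtype=int)
    for i in range(len(p) - 1):
        if remaining == 0 or prob_left <= 0.0:
            break
        # Probability of allele i given that earlier alleles were not drawn.
        q = min(1.0, max(0.0, p[i] / prob_left))
        counts[i] = rng.binomial(remaining, q)
        remaining -= counts[i]
        prob_left -= p[i]
    counts[-1] = remaining   # the last allele takes whatever is left
    return counts / (2 * n_e)

def summary_stats(p, w):
    """Mean selection strength, mean heterozygote advantage, and H."""
    n = len(p)
    upper = np.triu_indices(n)        # the n(n + 1)/2 genotypes
    s_mean = np.mean(1.0 - w[upper] / w.max())
    het = np.triu_indices(n, k=1)     # the n(n - 1)/2 heterozygotes
    h_mean = np.mean(w[het] / np.diag(w).max() - 1.0) if n > 1 else 0.0
    heterozygosity = 1.0 - np.sum(p ** 2)
    return s_mean, h_mean, heterozygosity

# Hypothetical two-allele example with symmetric heterozygote advantage.
w = np.array([[1.0, 2.0], [2.0, 1.0]])
p = np.array([0.9, 0.1])
rng = np.random.default_rng(0)
for _ in range(50):
    p = drift_step(selection_step(p, w), n_e=1000, rng=rng)
print(summary_stats(p, w))
```

The sequence of conditional binomial draws is equivalent to a single multinomial sample of the gene copies; it is written out explicitly here only to mirror the description in the text. Allele introduction and the removal of alleles below the introduction frequency are left out of the sketch.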
We compare levels of heterozygosity and polymorphism with those expected under neutrality. Levels of heterozygosity under neutrality can be obtained from Kimura and Crow (1964). In addition, we constructed a simple neutral computational model, which was similar to the model outlined above, in that alleles were introduced at a per-locus rate of $\mu_L$ to an originally monomorphic locus, which was then subject to genetic drift without selection. Because levels of $H$ and $n(A)$ are so variable in small populations under neutrality, we ran more replicates for the smaller population sizes: $10^4$, $10^3$, and 200 replicates for $N_e = 10^3$, $10^4$, and $10^5$, respectively. Our computational estimates of heterozygosity agree very well with analytic estimates from Kimura and Crow (1964), so we can have some confidence that our genetic drift algorithm is correct (Appendix A).
RESULTS
Recognition degeneracy: We first consider the effect of recognition degeneracy, $m$, on the level of polymorphism at $\gamma = 0$. Recall that the intersection advantage, $\gamma$, determines the proportionate benefit that a host obtains by having two alleles that recognize a pathogen when infected with that pathogen. Thus, at $\gamma = 0$, a host obtains no further benefit from having a second allele that also recognizes that pathogen. We therefore draw the reader’s attention to points connected by the solid line in Figure 3, which shows how mean levels of polymorphism, $n(A)$, and heterozygote advantage ($h_{ij}$) vary as a function of both recognition degeneracy ($m$) and degree of intersection advantage ($\gamma$).
We expected the level of polymorphism to be negatively related to recognition degeneracy. By contrast, the level of polymorphism was generally positively related to $m$ (Figure 3, a–f; Appendices A–D). This relationship was particularly strong in the absence of genetic drift (Figure 3, a–c) and was consistent across rates of allele introduction ($\mu_L$; Appendices A–D). The mechanisms underlying this relationship are as follows: At low levels of $m$ there is greater variance in the composition of the alleles’ recognition sets than that expected at high levels of $m$. Thus there is the potential for much more variance in allele fitnesses and a more asymmetric form of selection at low levels of $m$. Selection strength across genotypes is then strongest at low levels of $m$ (Figure 4; Appendix A), and small sets of alleles dominate the population of MHC molecules. As recognition degeneracy increases, the compositions of recognition sets become increasingly similar, which in turn lowers selection strength (Figure 4; Appendix A), makes selection more symmetric, and enables the coexistence of more alleles. However, while weak selection, or near neutrality, is apparently a requirement for high levels of MHC polymorphism under heterozygote advantage, complete selective neutrality generally results in very low levels of polymorphism (see “Neutral” expectations in Appendices A–D).
In the absence of genetic drift, the level of polymorphism increases nonlinearly to a maximum at $m = 90$ (Figure 3, a–c; Appendix A). Including genetic drift causes the maximum level of polymorphism to occur at lower levels of recognition degeneracy (Figure 3, d–f; Appendix A). As discussed above, weak selection across genotypes is required for the coexistence of large numbers of alleles. However, weak selection also leaves a polymorphism more susceptible to erosion by the forces of genetic drift and limits the ability of new alleles to invade (Crow and Kimura 1970, p. 422). Therefore, levels of MHC polymorphism may be maximized by increasing recognition degeneracy, but only to a threshold level of $m$, at which the erosive effect of genetic drift begins to take over (Figure 3, a–f).
Levels of polymorphism were severely affected by genetic drift, even for very large population sizes ($N_e = 10^5$; Figures 3 and 4; Appendix A). The highest mean level of MHC polymorphism recorded was 233 alleles; this occurred in the absence of genetic drift with $\mu_L = 10^{-5}$, $N_e = 10^5$, $\gamma = 0$, and $m = 90$, while the highest mean level recorded in the presence of drift was 43 alleles, which occurred at the same parameter values (compare Appendices C and D). As a consequence of the negative relationship between selection and recognition degeneracy, $m$, the loss of polymorphism due to genetic drift is greatest at high levels of recognition degeneracy. This relationship is clearly demonstrated in Figure 4. Interestingly, the greatest net loss of polymorphism due to genetic drift occurred at the largest population size ($N_e = 10^5$; Figure 4). This result may, at first, seem counterintuitive. However, at high levels of recognition degeneracy selection becomes very weak, which means that new alleles either do not easily invade (Crow and Kimura 1970, p. 422) or invade but are easily lost from the population. Because large, finite populations are subject to more frequent introductions of alleles, under such weak selection the proportion of successful invasions may be negatively correlated with population size in the presence of drift. Alternatively, allele invasion rates may not vary with population size, but the pull of the attractor about the polymorphic equilibrium may be negatively correlated with the number of alleles in the population and hence negatively correlated with population size also (e.g., Kimura and Crow 1964). Thus, the average lifetime of alleles may be negatively correlated with population size, which may result in a relatively greater loss of polymorphism to genetic drift in larger populations.
Intersection advantage: Intersection advantage, $\gamma$, had surprisingly complex effects on both the statistical properties of fitness sets and polymorphism. The most obvious effects of increasing $\gamma$ are to reduce polymorphism and mean levels of heterozygote advantage (Figure 3; Appendix A). It is obvious that $m$ and $\gamma$ have
Figure 3. Mean levels of polymorphism (±SD) as a function of recognition degeneracy, $m$, and intersection advantage, $\gamma$, without genetic drift (a–c) and with genetic drift (d–f). (g–i) Mean levels of average heterozygote advantage (±SE) as a function of recognition degeneracy and intersection advantage (with genetic drift). Solid line: $\gamma = 0$; dashed line: $\gamma = 0.2$; dotted line: $\gamma = 0.4$; dash–dot line: $\gamma = 0.8$. Data presented for all three population sizes and allele-introduction rate $\mu_L = 10^{-6}$.