Quick viewing(Text Mode)

The Evolution of Hybrid Incompatibilities

The Evolution of Hybrid Incompatibilities

Copyright 0 1995 by the Society of America

The Genetics of : The of Incompatibilities

H. Allen Orr

Department of , University of Rochester, Rochester, New York 14627 Manuscript received August 3, 1994 Accepted for publication January 3, 1995

ABSTRACT Speciation often results from the accumulationof “complementary ,”ie., from genesthat, while having no deleterious effect within ,cause inviability or sterility when brought togetherwith genes from another species. HereI model speciation as the accumulation of genic incompatibilities between diverging . Several results are obtained. First, and most important, the number of genic incompatibilities between taxa increases much faster than linearly withIn time. particular, the probability of speciation increasesat least as fast as the square of the time since separation betweentwo taxa. Second, as Muller realized, all hybrid incompatibilities must initially be asymmetric. Third, at loci that have diverged between taxa, evolutionarily derived allelescause hybrid problems far more often than ancestral . Last, it is “easier” to evolve complex hybrid incompatibilities requiring the simultaneous action of three or more loci thanto evolve simple incompatibilities between pairs of genes. These results have several important implications for genetic analyses of speciation.

T is well known that little of The Origzn of Species con- As MULLER (1942) pointed out, it makes no differ- I cerns the splitting of species. One of the reasons for ence whether the substitutions occur in both popula- this neglect is not generally appreciated, however. It tions, as above, or in one only. If one population retains was not simply that DARWINwas more interested in the the ancestral aabb andthe other becomes forces shaping change within lineages. Instead, the ori- AAbb and thenAABB, the B may wellbe incompat- gin of posed a serious problem ible with the a allele among the AaBb hybrids. In either to DARWIN:as he admitted(1859, p. 264), it was unclear case, reproductive isolation results from “complemen- how something as patently maladaptive as the sterility tary” or “reinforcing” between loci A and B or inviability of hybrids could evolve by natural selec- (CROWand KIMURA 1970, p. 81): the lethal or sterile tion. In modern parlance,it was unclear how two geno- effect of an allele at one depends on the back- types descended from a common ancestor could be- ground at otherloci. Of course, complemen- come separated by an adaptive valley unless one of the tary genes need not have a complete effect-a pair of lineages passed through the valley. This would not, of complementary genes might cause only partial hybrid course, be allowed by . sterility or inviability. This fundamental problemwas finally solvedby DOE DOBZHANSKYand MULLER’Smodel of speciation is ZHANSKY (1936) and MULLER(1939, 1940) early in the important fortwo reasons. First, it shows that the evolu- modern synthesis. Each produced geneticmodels show- tion of hybrid sterility or inviability need not involve ing that two populations could come to produce com- any intermediate, maladaptive step. Perhaps more im- pletely sterile or inviable hybrids even when no substitu- portant, it shows that, while the problem of the origin tion caused any sterility or inviability within either of species can be reduced to the origin of reproductive population. Their models were verysimple: two allopat- isolation, this in turn-at least for postzygoticisola- ric populations begin with identical genotypes at two tion-can be reduced to the building up of comple- loci (aa,66). In onepopulation, an A allele appears and mentary genes. is fixed; the Aabb and AAbb genotypes are perfectly via- It is nowclear that postzygotic isolation usually results ble and fertile. In the other population, a B from complementary genes (ignoring sterility resulting from appears andis fixed; aaBb and aaBB are also viable and ). Among ,complementary genes causing fertile. The critical point is that, although the B allele have been found in is compatible with a, it has not been “tested” on an A Mimulus (CHRISTIE genetic background. It is thus possible that B has a and MACNAIR 1984, 1987), Crepls (HOLLINGSHEAD1930), deleterious effect that appears only when A is present. and cotton (GERSTEL1954). Similarly, complementaryfac- If the two populations meetand hybridize, the resulting tors causing hybrid inviabilityor sterility havebeen found AaBb hybrid may be inviable or sterile. in many hybridizations(MULLER and PONTEG ORVO 1940; MULLER1942; PANTMIDISand ZOUROS 1988; Author e-mail: [email protected] H. A. Om, unpublished data).

Genetics 139 180.5-1813 (April, 1995) 1806 H. A. Orr

Given the success of this simple model in explaining the genetical facts of hybrid sterility and inviability, it is remarkablethat almost no one has exploredthe mathematical consequences of viewing speciation as an accumulation of complementary genes [see NEI (1976) for a notable exception]. Although population geneti- cists have constructed more or less formal models of speciation (TEMPLETON1981; SLATKIN1982; WALSH 1982; NEIet al. 1983; BARTONand CHARLESWORTH1984; CHARLESWORTHet al. 1987), these theories are largely concerned with the relative roles of and natural selection in speciation and almost entirely ig- nore the simple mechanism by which postzygotic isola- tion evolves: the accumulation of complementary genes. FIGURE1.-History of substitutions fixed between two pop- The reasons for this neglect are not clear, but, as will ulations. Time runs upward. Both populations were initially fixed for lowercase alleles at all loci. The first substitution become clear, they do not include the mathematical occurred atlocus a, the second at b, and the third atc. Arrows difficulty of the problem. show possible incompatibilities between loci from the two In this and in a later paper (M. TURELLIand H. A. populations. ORR,unpublished data), we analyze models in which speciation is treated as the accumulation of complemen- Because I consider the cumulative effects of interac- tary genes. In a sense, we try to see how far one can go tions between many loci-which quickly gets compli- in explaining the facts of postzygotic isolationby studying cated-it is useful to picture this process diagramatically. consequences of the accumulation of complementary Figure 1 offers a simple way to picture the accumulation genes. Several different problems are considered: the of complementary genes between two haploid popula- rate at which reproductive isolation arises, the roles of tions. Each of the two heavy lines represents a lineage ancestral us. derived alleles in reproductive isolation, the descended from a common ancestor.The two allopatric expected complexityof hybrid incompatibilities (will populations begin with identical “ancestral” lowercase most incompatibilities involve pairs or triplets, etc. of genotypes at all loci (ab c . . .). Time runs upward, with genes?), thegenetic basis ofHaldane’s rule and thelarge the first substitution occurring at thea locus, the second role of the Xchromosome in reproductive isolation. at the b locus and so on. The first several issues can be addressed with simple The first substitution involves the replacement of the models that ignore (this paper), while Hal- a allele by the A allele in population 1 (uppercase letters dane’s rule and the large X-effect require more com- indicate only that an allele is “derived”; nodominance plex models that take dominance into account(M. TUR- is implied). TheA allele cannot cause any hybrid sterility ELLI and H. A. ORR,unpublished data). Both classes of or inviability: because A is obviously compatible with model, however, are built around the same theme: the the genetic background of population 1, it must be evolution of postzygotic isolation can be reduced to the compatible with the identical background of popula- mechanics of genic incompatibilities. tion 2. The second substitution, at the Blocus (in popu- lation 2), could be incompatible with only one locus: THE BASIC MODEL A, as the Ballele has not been “tested”for compatibility The central assumption of the DOBZHANSKY-MULLERwith A. The third substitution, at C, could be incompati- model of speciation is that alleles cause no sterility or ble with the B or a alleles. As we continue this process, inviability on their normal “pure species” genetic back- it is clear that we can identify all possible (i.e., evolution- ground. Instead, an allele can lower only when arily allowed) incompatibilities by drawing an arrow brought together with genes fromanother species. Any from each derived allele to each “earlier” (lower) allele particular might cause partial or carried by the other species. Thus D can be incompati- complete hybrid sterility or inviability. For most of this ble with c, B, and a. This arrow-drawing device will re- paper, I assume that hybrid incompatibilties involve in- peatedly prove useful. teractions between pairs of genes, as in DOBZHANSKYand Several facts immediately emerge fromFigure 1. First, MULLER’Sverbal models. Later, I consider three-locus hybrid incompatibilities only occur between two loci andhigher interactions. I also assume that multiple that have both experienced a substitution: in Figure 1, substitutions do notoccur at thesame locus, an assump- arrows never run up toward loci that have not diverged tion that is reasonable during the early divergence of (e.g., locus e). This follows from the fact that the two taxa. I assume nothing about the evolutionary causes of populations carry identical alleles at all undiverged loci, substitutions. The DOBZHANSKY-MULLERmodel of spe- so that any substitution must be compatible with these ciation requires only that substitutions occur and as- loci in both species. sumes nothing about whether they are brought about Several other less trivial facts also emerge from Fig- by natural selection or genetic drift. ure 1: P opulation Genetics Population of Speciation 1807

All incompatibilities are asymmetric. For example, that does not arise is ancestral-ancestral (AA). The rea- although B might be incompatible with A, b cannot son is obvious: all ancestral alleles must be compatible be incompatible with a. as they represent the initial genotype. Evolutionarily derived (uppercase) alleles are in- Thus, restricting our attentionto those loci that have volved in more potentialincompatibilities than ances- experienced a substitution, the alleles causing postzy- tral (lowercase) alleles. gotic isolation will be derived more often than ances- Later substitutions cause more possible incompatibil- tral. We would like to know how much more often. ities than earlier ones (e.g., although the substitution Imagine that a fraction f of all substitutions occur in of B produces one possible incompatibility, the later population 1 and 1 - fin population 2. We consider substitution of D produces three). This suggests that only those alleles that areinvolved in a hybrid incompat- the strength of reproductive isolation might increase ibility. A new allele that arises in, say, population 1 can faster than linearly with time. only be incompatible with those loci that have already experienced substitutions. By assumption, a fraction f I consider each of these observations below. Because of these loci in population 2 carry ancestral alleles and the last point has the most important evolutionary con- a fraction 1 - fcarry derived alleles. Thus the probabil- sequences, I consider it in the most detail. ity that anew allele in population 1will be incompatible with an ancestral allele in population 2 is proportional THE ASYMMETRY OF INCOMPATIBILITIES to J Similarly, the probability that it will be incompati- As first noted by MULLER(1942), all hybrid incompat- ble with a derived allele in population 2 is proportional ibilities must be asymmetric. Figure 1 shows that, al- to 1 - $ By making all such comparisons it is easy to though B might be incompatible with A, a cannot be see that the probabilities of the various possible incom- incompatible with b. The reason is simple: aabb repre- patibilities are sents an ancestral step in the divergence of these taxa P(D,A2 incompatible) = f ', (i.e.,aabb is either the genotype of the common ances- tor or an intermediate step in the evolution of these P(A,D2 incompatible) = (1 - f)2, taxa). Thus the requiredfertility/viability of all interme- P(D& incompatible) = 2f(l - f), (1) diate steps in the divergence of two taxa places con- straints on which incompatibilities are possible and where D represents a derived allele, A an ancestral al- which are not, a point that will recur. lele, and the subscripts identify the two populations. Notice, however, that the required asymmetry does From this, we can tabulate the expected frequency not mean that both derived and ancestral alleles at a with which derived us. ancestral alleles from each popu- locus cannot cause hybrid problems. Instead, it means lation will be involved in hybrid incompatibilities: only that the ancestral and derived alleles at one locus (e.g., A and a) cannot be incompatible with alleles at the same other locus (e.g.,the A-B and a4 incompatibili- ties are not bothpossible). Thus, even if the A allele of Figure 1 were incompatible with B, a could still be in- volved in another incompatibility, for example, with C. Indeed, this latter incompatibility is just as likely as any other. The finding that both homologous alleles at a locus do notusually cause postzygotic isolation (e.g., Wu and P(A,) = f'-. BECKENBACH1983) merely reflects the fact that p2 is 2 much smaller than p, where p is the probability that an Obviously P(D,) + P(A,) = P(D2) + P(A,), i.e., the allele causes detectable problems in hybrids. total frequency with which alleles from population 1 are involved equals the frequency with which alleles THE ROLE OF DERIVED VS. ANCESTRALALLELES IN from population 2 are involved, as everyincompatibility HYBRID INCOMPATIBILITIES must involve an allele from each species. Figure 1 also shows that derived (uppercase) alleles Finally, the ratio P(D):P(A) of derived-to-ancestral al- tend to be involved in hybrid incompatibilities more of- leles causing hybrid incompatibilities is ten than ancestral (lowercase) alleles. Once again, this 1 + 2f(l - f):1 - 2f(l - f). is a trivial consequence of the types of incompatibilities (3) that are possible: when a new derived allele (say C) is In words, the number of derived us. ancestral alleles substituted, it might be incompatible with either another causing reproductive isolation depends on thepropor- derived allele (B) or with an ancestral allele (a).Thus, tion of substitutions thatoccur in each population. both derived-derived (DD) and derived-ancestral (DA) When there are equal rates of evolution in the two incompatibilities occur. The only type of incompatibility lineages ( f = '/?), derived alleles are three times more likely 1808 H. A. Orr to be involved in hybrid incompatibilities than ancestral alleles, depend on the proportionof substitutions that occur in as noted by ORR (1993). Indeed, derived alleles are each population noron the order in which substitutions always more likely to cause hybrid problems unless all arise in population 1 us. 2. evolution occurs in one lineage (f= 1). In that case, Thus the probability of speciation increases much P(D):P(A) = 1:l since the only possible type of incom- faster than linearly. Indeed, when S 4 1, the cumulative patibility is derived-ancestral. The ratio P(D):P(A) probability of speciation increases as the square of the number plays an important role in one possible explanation of of substitutions. How S varieswith time depends, of Haldane’s rule (see ORR1993). course, on how the number of substitutions increases with time. If K increases approximately linearly with THE RATE OF SPECIATION time, i.e., if there is a rough , the cumu- lative probability of speciation rises with the square of Later substitutions cause more potential incompati- time since divergence. In any case, S increases faster bilities than earlier ones (Figure 1). As already noted, with time than does K the first substitution at the A locus cannot cause any Although we have assumed that asingle incompatibil- hybrid incompatibility, while the second substitution ity can cause hybrid lethality or sterility, additional in- could be incompatible with only one locus: the B allele compatibilities will obviously continue to accumulate has not been tested with the A allele. In general, the after this initial speciation event. It is easy to show that Kth substitution can be incompatible with K - 1 loci because of the everincreasing probability of obtaining from the other population. hybrid incompatibilities, the expected number of in- It is obvious, then, that thetotal number of incompat- compatibilities separating two taxa also increases very ibilities separating two taxa increases faster than linearly rapidly. BecauseEquation 5 describes a statedependent with the number of substitutions that have occurred Poisson process, the mean number of incompatibilities between them. This, in turn, implies that the strength that have occurred up through substitution K is of reproductive isolation-or the probability of specia- tion-between two taxa increases faster than linearly K(K - l)p K2P I= M- (6) with time. This important effect is easily quantified. I 2 2’ consider two cases. First, I assume that complete repro- i.e., the expected number of complementary lethals/ ductive isolation results from a single incompatibility steriles increases as K2.This rapid increase in the num- between two complementary genes. Second, I assume ber of “speciation genes” has important implications that reproductive isolation results from the cumulative for genetic analyses of speciation (see DISCUSSION). effects of many small incompatibilities. As we will see, Interestingly, we can also find the expected “time”to both cases yield similar results. speciation. If K, is the numberof substitutions required Single-incompatibilityspeciation: Reproductive iso- until the appearance of a hybrid incompatibility, then lation here results from a single incompatibility be- P(K, > K) = (1 - p)K(K-1)/2. Thus the expectation of tween two complementary genes and the level of isola- K, is it, = X& (1 - p)K(K-1)/2,which is approximately tion suddenly leaps from zero to unity. Although it is unclear just how common this situation is, there is con- & e-K2f’/2 dK siderable evidence that reproductive isolation some- 1 times has a simple genetic basis (HOLLINGSHEAD1930; GERSTEL1954; WITTBRODTet al. 1989). I calculate the probability that speciation hasoc- =6. curred as a function of the number of substitutions Thus if the probability that any two genes are incom- separating two diverging populations. For simplicity,as- patible in hybrids is p = an average of 400 substi- sume that a new derived allele has a fixed probability, tutions is required for speciation. Because the time to p, of being incompatible with each of the loci that has speciation is an inverse function of& it is not as sensi- previously experienced asubstitution. Because substitu- tive to p as one might expect. Doubling theprobability tion K may be incompatible with K - 1 loci, the Kth that an incompatibility occurs, for example, does not substitution has a probability 1 - (1 - p)K-l of causing halve the time to speciation but reduces it by a factor speciation. The cumulative probability of speciation, S, of l/h. is simply the probability that atleast one incompatibility Multiple incompatibility speciation: I now consider occurs, given by the case where speciation results from the cumulative R effects of several to many smaller incompatibilities. We s = 1 - (1 - py-1 = 1 - (1 - p)f-1)/2. (4) n will see that the above results do not depend on the n= 1 assumption that speciation is caused by a single incom- When p is small (which it surely is) and K is large, patibility of complete effect. r I r s 1 - e-mK-’)P//n 1 - e-K2P/2* The reduction in hybrid fitness, (0 5 l),re- (5) sulting from an interactionbetween any two genes has It is important to note that Equations 4 and 5 do not some frequency distributionf( r) . Obviously, the mean of Speciation 1809

P must be very small or speciation would be nearly TABLE 1 instantaneous;indeed, most interactions between Number of substitutions having largeeffects on genes in hybrids may have no effect on hybrid fitness reproductive isolation increases faster than linearly (r = 0). I assume that different incompatibilities act indepen- Mean proportion dently: if one incompatibility reduces fitness to (1 - rl) f(3 (2var) Factor Replicates and another reduces fitness to (1 - r2), both together reduce fitness to (1 - TI) (1 - r2). Thus the effect of Gaussian 0.172 + 0.0008 5.8X 1000 0.174 0.0009 5.8X 1000 substitution Kon hybrid fitness is given by LK = 1 - WK Exponential + 0.052 + 0.0015 19.4X 1000 = 1 - nEl’ (1 - rJ, where uKis fitness, considering U-shaped only those incompatibilities that involve the Kth substi- The second column shows the proportionof substitutions tution. If the ri are all fairly small, this is roughly LK M of large effect that occur during the first half of the time to speciation (from simulations). The third column shows the 1 - (1 - P)‘-I. Considering the cumulative effects of factor by which the number of substitutions of large effect all K substitutions, increases when the time of divergence is doubled (reciprocal K of column two). If genes of large effect accumulated at a L = 1 - M 1 - n (1 - g-1 = 1 - e-K‘v2 7 (8) constant rate, the total number of large factors would be 2 j= I times the number fixed in the first half of speciation. For where L is the strength of reproductive isolation (or each substitution, K - 1 random numbers were drawn from the probability density,f(r), where r is the strength of repro- the fitness “load” among hybrids due to complemen- ductiveisolation between two loci. ir = lo-’. Substitutions tary interactions). Thus,early in the divergence of continued until hybrid fitness was <5% of that of the pure two taxa (L< l),the strength of reproductive isolation species. The number of substitutions of greater-than-thresh- increases as the square of the number of substitutions: oldeffect (threshold = 3%) was tabulatedfor the first vs L = K‘?7/2. second halves of the speciation process. Similar results were obtained when different periods in the time to speciation Thus the chanceof speciation increases much faster were studied (e.g., the first and second quarters of the time than linearly with K (or time) whether speciation typi- to speciation). Larger thresholds yield even larger values in cally results from avery small number of genes of large column three. “Gaussian” densities refer to right-hand tail effect (as in the first model) or a large number of genes of a Gaussian centered at zero with D = ira.U-shaped f( r) of smaller effect (as in the second). constructed by adding “spike” of alleles of large effect (r= 0.5) to an exponential density (0.05% of probability density The number of substitutions having a substantial ef- fell in this spike). fect on reproductive isolation also increases faster than linearly with time. Thus, if one were to double the time since divergence, one would morethan double the Formally, if K loci have diverged, the mth locus to number of genes having a large effect on hybrid fitness. diverge (m< K) has a probability Pb = 1 - e-(m”)P This result has importantconsequences for genetic of being incompatible with a locus that diverged be- analysis of reproductive isolation (see DISCUSSION). Al- fore it, and a probability P, = l - e-(K-m)f’of being though it is very difficult to analytically determine the incompatible with a locus that diverged after it. Thus expected size of this effect (the problem involves the the totalprobability that locus m affects hybrid fitness product of K - 1 integrals, each with different limits is approximately 1 - (e-(m-l)f’)(e-(K-m)P) = 1 - of integration), the effect is easily seen in Monte Carlo e-(K”)P, which is independent of m. Thus, the fact simulations (Table 1). If, for example,one could detect that a gene has a large effect on hybrid fitness tells any gene that decreased hybrid fitness by 23%, the us nothing aboutits place in the chronologyof substi- number of alleles with detectably large effects easily tutions between two species. increases by five- to 19-fold when the time involved is Complexincompatibilities: We have assumed that doubled (the exact size of the effect depends on both hybrid incompatibilities involve interactions between T and the shape of f( r) ; Table 1). The effect is even pairs of genes. Thisneed not be the case (MULLER more dramatic if one uses larger, more realistic thresh- 1942). An incompatibility might, for instance, require olds (i.e., genetic analysis can only detect genesof fairly an interaction among three loci: the hybrid genotype large effect). Abc might be completely or partially sterile while any The role of early us. late substitutions: The above genotype consisting of any other combination of alleles discussion might seem to imply that a gene of known at these loci might be perfectly fertile. large effect on hybrid fitness was more likely a later Remarkably, such complex interactions have been than earlier substitution. This is incorrect. Although repeatedly found among the few hybridizations that the probability that a substitutioncauses hybrid steril- have been adequately analyzed (MULLER1942; CABOT ity or inviability increases with time, any gene af- et al. 1994). In theDrosophila pseudoobscura Bogota-USA flicting hybrids is just as likely to have been the first hybridization,for example, hybrid sterility appears gene to diverge as the last. This is becausea late only when males carry the “right” combination of al- diverging gene must be incompatible with something, leles at at least three loci (H. A. ORR, unpublished inparticular with somelocus that diverged earlier. data). [See GOTTSCHEWSKI(1940) [reviewed in 1810 H. A. Orr

MULLER 19421, PONTECORVOAABB(1943, p. 60), Om and aaBB COYNE(1989) and CABOTet aZ. (1994) for similar re- sults.] It thus appears that complex interactions may be quite common. Here I ask how these complex in- compatibilities affect the rate of speciation. Later, I consider the problemaabb of why complex AABB incompatibili- aabb AABB ties are so common. FIGURE2.-Two imaginable paths for the evolution of If speciation involves such complex interactions, the two taxa from a common ancestor. One daughter specieshas ge- probability of speciation must rise even faster than K2. notype “AABB” and the other “aabb”. The pathway on the In particular, if speciation can result suddenly from an left is allowed by natural selection. The pathway on the right incompatibility between n loci (where n ranges between is not allowed by selection as it requires passing through the 2 and Kinclusive) with a probability of pn for each type sterile or inviable “AAbb” hybrid genotype. of incompatibility, then the cumulative probability of speciation is As Equation 6 suggests, the number, In,of incompati- bilities involving an interaction between n genes is

where the binomial coefficient (fi) = K!/(n![ K - n] !), the number of combinations of K objects taken n at a i.e., the probability that any randomly chosen n substitu- time. Equation 9 shows that S must increase at least as tions will be incompatible with each other multiplied by fast as K‘ whether pair-wise incompatibilities are the the numberof possiblecombinations of n substitutions, commonest or the rarest type of incompatibility: (f), where K is the total number of substitutions that have for instance,is larger than (F), so inclusion of “triplet” occurred between two taxa. I assume that the probabil- interactions can only increase the cumulative probabil- ity that any set of loci is incompatible is independent ity of speciation in Equation 9. A similar result holds of whether other such sets are incompatible. if speciation results from the accumulation of many One reason complex incompatibilities may be so incompatibilities of smaller effect (not shown). Equa- common, then, is simply that (f) increases with n (as tion 9 also makes it clear that reproductive isolation long as n < K/2, which includes the range of biological increases faster than linearly as a trivial consequence of interest). There are, for example,many more possible the fact that the number of pairs or triplets, and so on, combinations of three than two loci. of any object (including substitutions) increases faster There is, however, a second reason complex incom- than the number of objects itself. patibilities may be common that is not obvious from Equation 10 because it is buried in K: the evolution of simpler incompatibilities is often prevented because it THE FREQUENCY OF COMPLEX INCOMPATIBILITIES would require passing through the unfit hybrid geno- It is not obvious why complex incompatibilities arise type, i.e., because the next substitution would cause ste- so often (MULLER1942; CABOT et al. 1994). Why should rility or inviability. To see this, consider the simplest the genetic architecture of reproductive isolation be case-an incompatibility involving a single locus (n= constructed in such a way that hybrids suffer no de- 1). One species has the genotype AA and the other crease in fertility or viability until they carry the “right” aa; the sterile or inviable hybrid is Aa. As DOBZHANSKY alleles at three or more loci? Complex incompatibilities (1937) and MULLER(1942) pointed out, it is virtually could arise for purely biochemical reasons. For in- impossible to evolve such an incompatibility because, stance, physical interactions between protein products whether the ancestral genotype was AA or aa, the evolu- from four loci may be more common than interactions tion of the other species’ genotype would require pass- between two gene products. This seems unlikely, how- ing through the unfit Aa genotype. ever. On theother hand, complex incompatibilities The same problem confronts theevolution of incom- might reflect the redundancy of biochemical pathways: patibilities involving more than one gene: if an incom- it may be necessary to “knock out” both loci A and B patibilityinvolves two genes (e.g., species 1 = AABB, and redundant loci C and D in hybrids to obtain any species 2 = aabb and theunfit hybrid genotype = AAbb) , sterility or inviability (the redundantpathway might in- then there are six possible ways to evolve these geno- volve duplicate genes). types from some common ancestor, but only three of There are,however, other, more evolutionary, rea- these paths do not either begin with or pass through sons why complex incompatibilities may be so com- the unfit hybrid genotype. (I assume that we do not mon. First, there are simply more combinations of know the genotype of the common ancestor and that three(or more) than two substitutions.Second, it evolution takes the most parsimonious path between is “easier”to evolve complexinteractions without two genotypes, i.e., multiple substitutions at a locus are incurring selection against any of the intermediate not allowed. Different orders of substitutions count as steps involved. different paths.) Two sample paths are shown in Figure Popu lation Genetics Population Speciation of 1811

1 incompatibilities involving two genes than one, so it is easier to evolve incompatibilities involving three (or more) genes than 0.8 two. Complex incompatibilities may, therefore, be com- mon simply because it is easier for evolution to arrive 0.6 at such genotypes without passing through any mal- adaptive intermediate step. 0.4 CABOT et al. (1994) have also recently considered the evolution of complex incompatibilities and have inde- 0.2 pendently come to essentially the same conclusion. It should be noted, however, that CABOT et al. derived 0 the total number of possible pathways connecting two species; thisnumber increases very fast withthe number -0.2 012345670 of loci involved, suggesting that incompatibilities involv- ing 20 genes may be much easier to evolve than those n involving 10. Equation 11 shows, however,that thefruc- FIGURE3.-The fraction of imaginable paths,F, connecting tion of allowable paths levels off quickly. By the time common ancestors to two incompatible genotypes that are interactions involve six or seven loci, natural selection “allowed” by natural selection (from Equation 11). n is the almost never bars any imaginable path (Figure 3). number of loci involved in the incompatibility. For conve- nience, I assume that the hybrid genotype is intermediate There is no reason, therefore, to expect the evolution between the parental genotypes (i.., m is set equal to the of extraordinarily complex interactions involving a very integer closest to n/2). large number of genes.

2; the left one is allowable while the right one is not DISCUSSION because it would require passing through theunfit AAbb genotype. The alleles causing postzygotic isolation cannot have In general, there are (n + 1)! paths leading from caused sterility or inviability when they first arose. This all imaginable common ancestors to the present two simple fact constrains the ways in which poatzygotic species. Some (n- m) !m! of these paths, however, begin can evolve. Consequently, the genetics of speciation are with a common ancestor having the genotype of the expected to show several regularities. unfit hybrid, where m is the number of n loci at which First, as MULLER(1942) emphasized, hybrid incompat- the hybrid and one of the present species (it does not ibilities should be asymmetric: if allele A from one spe- matter which) carry the same alleles. Another n(n - cies is incompatible with allele B from the other, alleles m) !m!of all imaginable paths pass through the sterile a and b must be compatible (as least early in speciation or inviable hybrid genotype as an intermediate step. before multiple substitutions at loci are common). Sec- Thus, a total of (n+ 1)(n - m) !m!paths are proscribed ond, derived alleleswill be involved in hybridincompati- by selection. Therefore, the fraction of possible paths is bilities more often than ancestral. When two lineages experience equal substitution rates, derived alleles are F = 1 - l/(l). (11) three times more likely to cause hybrid problems than ancestral. Third, it is easier toevolve hybrid incompatibil- [Equation 11 also gives the fraction of possible paths if ities between three (or more) than two loci; in the first one begins with some fixed (known) ancestral genotype place, there are simply more possible combinations of and considers all ways of evolving two species separated three (or more) than two substitutions. In addition, by an incompatibility involving n genes.] while the evolution of pair-wiseincompatibilities is often The important point is simple: Equation 11 shows prevented by natural selection (because the next re- that the fraction of allowable paths is fairly smallwhen quired substitution would cause sterility or inviability), n is small but quickly increases as n grows (see Figure more complex incompatibilitiesare easily arrived at with- 3). Thus the evolution of incompatibilities involving a out incurring selection against any intermediate step. small number of genes is often prevented by selection Last, and most important, hybrid sterility and inviability because it would require passing through anunfit geno- should evolve faster than linearly with time. type, for example, when n = 2 loci, 50% of the possible This fact has several interesting consequences. First, waysof evolving these incompatible genotypes from the evolution of reproductive isolation may differ from imaginable common ancestors are barred. Conversely, the evolution of “normal” character differences be- when incompatibilities involve more genes, more of the tween species. It seemslikely that morphological or possible ways of evolving these incompatible genotypes physiological differences between species usually in- are allowed by evolution. When n = 4, for instance, crease in proportion to the number of substitutions only 17% of all imaginable ways of obtaining these in- affecting such characters [these characters can, how- compatible genotypes are prevented by selection. ever, diverge faster than linearly with time if there is In short, for the same reason that it is “easier” to evolve strong epistasis between thegenes involved (LYNCH 1812 H. A. Orr

1994)l. Reproductive isolation must, however, increase It is certainly possible that the number of “speciation faster than linearly with time. Although we should not genes” detected here could be far smaller. conclude that strong reproductive isolation will arise Finally, this work highlights several places where we before morphological or physiological differences (this remain remarkably ignorant about thegenetics of speci- depends on the value of p or F), this snowballing cer- ation. First, we have surprisingly little information about tainly improves the odds that substantial reproductive the rate at which reproductive isolation accumulates. isolation will precede or accompany the evolution of COYNEand Om’s (1989b) review of the Drosophila lit- other differences between taxa. Thus the genetics of erature, in which the strength of reproductive isolation hybrid sterility and inviability may fortuitously increase between taxa was compared with , prob- the chances that differences evolved in allopatry will ably provides the best data. Although these data are be preserved by reproductive isolation upon secondary rough (e.g., partial sterility/inviability of a hybrid sex geographic contact. As FUTUYMA(1987) has argued, was ignored), they provide some support for a “snow- this “preserving” effect of reproductive isolation may balling” of postzygotic isolation. In particular, an ap- cause anapparent association between speciation proximate doubling of the numberof incompatibilities events and morphological evolution in thefossil record. separating two taxa does not require a doublingof time. Second,the increasing rate of speciation requires COYNEand Om’s data show that the mean (+ SE) ge- that we interpret geneticanalyses ofreproductive isola- netic distances at which hybrid males are sterile/invia- tion with caution. One purpose of such experiments is ble in only one us. both directions of a cross do not to estimate the number of genes causing speciation. significantly differ: D = 0.369 + 0.183 (n= 5) us. D = Most experiments involvetaxa (usually Drosophila) 0.226 + 0.021 (n= 28), respectively (all female hybrids that produce sterile or inviable male hybrids only (re- are viable and fertile). Similarly, the D at which nonre- viewed by COMEand Om 1989a).A traditional concern ciprocal and reciprocal female sterility/inviability ap- about these experiments is that one might erroneously pear do not differ: D = 0.950 + 0.123 (n= 6) us. D = count genes that affect hybrid fertility or viability but 1.009 + 0.087 (n= 20), respectively (all male hybrids are sterile/inviable). Thus, for bothmales and females, that evolved after the appearance of complete male ste- reciprocal sterility/inviability (requiring at least two in- rility/inviability. indeed, in Drosophila, male sterility/ compatibilities) seems to appear onthe heels of nonre- inviability often arises quite early and there is a long ciprocal effects (requiring atleast one incompatibility). lag before the evolution of female sterility/inviability While the data are not voluminous, this result is con- (COYNEand Om 1989b). The likely reasons for this sistent with the notion that the numberof incompatibil- stalling at male effects are complex and need not con- ities separating taxa accumulates at an accelerating rate cern us here (see COWEand Om 1989a,b). (Equation 6). Unfortunately, a snowballing of isolation Nonetheless, this stalling poses a problem: genetic is not the only process that could yield this pattern: analysis cannot distinguish between the genes thatactu- if hybrid fitness falls off suddenly (as with truncation ally caused the appearance of hybrid sterility (say be- selection), theinterarrival times for one-way us. recipro- tween NEI’Sgenetic distance D = 0 and D = 0.25, where cal hybrid sterility/inviability become smaller even if we use D as a measure of time) and those that accumu- contributions to the underlying additive “hybrid break- lated after the evolution of complete male sterility (say down” scale occur at a steady, not accelerating, pace between D = 0.25 and D = 0.50). The important pointis (TUKELLIand Om, unpublished data). Thus, the ge- that this problem may be more serious than previously netic distance dataare consistent with, butdo not realized. in the single incompatibility model, for exam- prove, a snowballing of reproductive isolation. ple, in going from D = 0.25 to D = 0.50, we quadruple, Last, and most remarkably,we have verylittle informa- not double, the observed number of genes affecting tion about the probability, p, that two genes will be in- hybrid fertility (Equation 6). The same result holds compatible with each other, causing hybrid inviabilityor qualitatively if isolation is caused by the cumulative ef- sterility. Consequently, we have little feel for how “easy” fect of genes of smaller effect. Indeed, the simulation speciation is; if many pairs (or triplets, etc.) of substitu- results show that when doubling the time since diver- tions can complementary lethaldsteriles, specia- gence, it is possible to increase the number of genes of tion may not usually require a very large number of detectable effect by an order of magnitude. The genet- substitutions. if, on the other hand, complementary in- ics ofreproductive isolation will, therefore, become very teractions are very rare, an enormous numberof substi- complicated very quickly.Consequently, we might easily tutions may be required before speciation is likely. We overestimate the number of genes required to obtain know only that complementary lethals and steriles can hybrid male sterility. The same problem afflictsat- be found in nature (DOBZHANSKY1946; DOBZHANSKYet tempts to estimate the numberof genes causing species- al. 1959; KFUMBAS 1960) and that, at least in Drosophila level reproductive isolation between taxa producing mlanogaster, they are very rare. TEMINet al. (1969) found sterile or inviable hybrids of both sexes (MULLERand that =0.005 of all second and third combi- PONTECORVO1940; Om and COWE 1989). The only nations harbor complementary lethals. Unfortunately, solution, it seems, is to genetically analyze younger taxa. we have little idea about the number of (coding) allelic Population Genetics of Speciation 1813 differences between such and so cannot DOBZHANSKY,T., 1936 Studies on hybrid sterility. 11. Localization of sterility factors in Drosophila pseudoobsmra hybrids. Genetics 21: easily translate these figures into values of p. 113-135. Given this paucity of direct data, it may be worth DOBZHANSKY,T., 1937 Genetics and the Ongzn of Species. Columbia noting that indirectestimates of p are possible. In Dro- University Press, New York. DOBZHANSKY,T., 1946 Genetics of NaturalPopulations. XIII. Re- sophila, COYNEand ORR(1989b, Table 2) showed that combination andvariability in populations of Drosophilapseudoob hybrid male sterility/inviability appears at an average scura. Genetics 31: 269-290. genetic distance of D = 0.25. Male effectsnever appear DOBZHANSKY,T., H. LEVENE,B. SPASSKY and N.SPASSKY, 1959 Re- lease of genetic variability through recombination.111. Drosophila before D = 0.07. Because Drosophila (at least D. melano- prosaltans. Genetics 44: 75-92. gaster) appears to have -5000 loci (most of which are FUTW, D. J., 1987 On the role of species in . Am. Nat. essential) (ASHBURNER 1989), these taxa have diverged 130: 465-473. GERSTEL,D. U, 1954 A new lethal combination in interspecific cot- at roughly 1200 loci (we dangerously assume here that ton hybrids. Genetics 39: 628-639. allozyme data are representative of most loci). If most HOI.I.INGSHEAD, L., 1930 A lethal factor in c7epis effective only in cases of hybrid sterility/inviability havea simple genetic interspecific hybrids. Genetics 15: 114-140. KRIMBAS, C. B., 1960 Synthetic sterility in Drosophila zuillistoni. Proc. basis, Equation 6 suggests that p = lop6.This figure is, Natl. Acad. Sci. USA 46: 832-833. of course, the result of several wild assumptions and LYNCH,M., 1994 Neutral models of phenotypic evolution, pp. 86- must be treated with skepticism. Nonetheless, it seems 108 in , edited by L. Rw.. Princeton University Press, Princeton, NJ. unlikely that this estimate is off by several orders of MULLER,H. J., 1939 Reversibility in evolution considered from the magnitude: for example, ifp lop4,only 400 loci would standpoint of genetics. Biol. Rev. Camb. Philos. SOC.14 261-280. haveto diverge and hybrid male sterility/inviability MULLER,H. J., 1940 Bearing of the Drosophila work on , pp. 185-268 in The N~ISystematics, edited by J. HuxLEY. would typically appear long before D = 0.25. Clarendon Press, Oxford. Far more direct estimates of p and ?=will,of course, MULLER, H. J., 1942 Isolating mechanisms, evolution and tempera- be possible once the number of “speciation genes” ture. Biol. Symp. 6: 71-125. MULLER,H. J., and G. PONTECORVO,1940 Recombinants between separating young taxa of known genetic distance is de- Drosophila species, the F, hybrids of which are sterile. Nature termined experimentally. Until then we are left with 146: 199. only the roughest and most indirect estimates of these MULLER,H. J., and G. PONTECORVO,1942 Recessive genes causing interspecific sterility and other disharmonies between Drosophila critical parameters. melanogaster and simulans. Genetics 27: 157. NEI, M., 1976 Mathematical models of speciation and genetic dis- I thank L. ORRfor many very helpful discussions and for solving tance, pp. 723-766 in Population Cenetics and Ecology, edited by the problem of the number of allowable paths to speciation. I also S. KARLIN and E. NEVO.Academic Press, New York. thank N. BARTON,J. COWE, D. FUTUYMA,J. JAENIKE, M. LYNCH, T. NEI, M., T. MARUYAMA and C.4. WU, 1983 Models of evolution of PROUT, M. TURELLIand J. WERRENfor helpful comments. reproductive isolation. Genetics 103: 557-579. ORR,H. A,, 1993 A mathematical model of Haldane’s rule. Evolu- tion 47: 1606- 161 1. LITERATURE CITED ORR,H. A,, and J. A. COYNE,1989 The genetics of postzygotic isola- ASHBURNER,M., 1989 Drosophila: A Laboratory Handbook Cold Spring tion in the Drosophila vir& group. Genetics 121: 527-537. PANTAZIDIS,A. C., and E. ZOUROS,1988 Location of an autosomal Harbor Laboratory Press, Cold Spring Harbor, Ny. factor causing sterility in Drosophila mojavensis males carrying the BARTON, N. H., and B. CHARLESWORTH,1984 Genetic revolution, Drosophila arizonensis Y chromosome. 60: 299-304. founder effects, and speciation. Annu. Rev. Ecol. Syst. 15 133-164. PONTECORVO,G., 1943 Viability interactions between chromosomes CAROT, E. L., A. W. DAVIS,N. A. JOHNSON and C.-I. WU, 1994 Genetics of and Drosophila simulans. J. Genet. 45: of reproductive isolation in the Drosophila simulans : complex 51-66. epistasis underlying hybrid male sterility. Genetics 137: 175-189. SIATKIN,M., 1982 and parapatricspeciation. Evolution CHARLESWORTH,B., J. A. COWE and N. H. BARTON,1987 The rela- 36: 263-270. tive rates of evolution of sex chromosomes and autosomes. Am. TEMIN, G., H. U. MEER, P. S. DAWSONand J. F. CROW,1969 Nat. 130: 113-146. R. The influence of epistasis on homozygous viability depression in CHRISTIE,P., and M.R. MACNAIR,1984 Complementary lethal fac- Drosophila melanogaster. Genetics 61: 497-519. tors in two North American populations of the yellow monkey TEMPLETON,A.R., 1981 Mechanisms of speciation-a population flower. J. Hered. 75: 510-511. genetics approach. Annu. Rev. Ecol. Syst. 12: 23-48. CHRISTIE,P., and M. R. MACNAIR,1987 The distribution of postmat- WALSH, J. B., 1982 Rate of accumulation of reproductive isolation ing reproductive isolating genes in populations of the yellow by chromosome rearrangements. Am. Nat. 120: 510-532. monkey flower, Mimulus guttatus. Evolution 41: 571-578. WI?TBRODT,J., D. ADAM, B. MALITSCHEK,W. MAUELER, F. RAULF et COWE, J. A,, and H. A. 0% 1989 Patterns of speciation in Drosoph- al., 1989 Novel putative receptor tyrosine kinase encoded by ila. Evolution 43: 362-381. the melanoma-inducing Tu locus in Xzphophorus. Nature 341: COWE, J.A., and H.A. Om, 1989 Two rules of speciation, pp. 180- 415-421. 207 in Speciation and its Consequences, edited by D. O~Eand J. WU, C.-I., and A. T. BECKENBACH,1983 Evidence for extensive ge- ENDLER. SinauerAssociates, Sunderland, MA. netic differentiation between the sex-ratio and the standard ar- CROW,J. F., and M. KIMuRA, 1970 An Intmduction to Population Genet- rangement of Drosophila pseudoobscura and D. persimilis and identi- ics Themy. Harper and Row Publishers, New York. fication of hybrid sterility factors. Genetics 105: 71-86. DARWIN,C., 1859 Tlle Origin of Species. Avenel Books Edition, New York. Communicating editor: M. LYNCH