<<

INTERNATIONAL JOURNAL OF ORGANIC EVOLUTION PUBLISHED BY THE SOCIETY FOR THE STUDY OF EVOLUTION

Vol. 57 October 2003 No. 10

Evolution, 57(10), 2003, pp. 2197±2215

PERSPECTIVE: MODELS OF : WHAT HAVE WE LEARNED IN 40 YEARS?

SERGEY GAVRILETS Department of and Evolutionary , Department of Mathematics, University of Tennessee, Knoxville, Tennessee 37996 E-mail: [email protected]

Abstract. Theoretical studies of speciation have been dominated by numerical simulations aiming to demonstrate that speciation in a certain scenario may occur. What is needed now is a shift in focus to identifying more general rules and patterns in the dynamics of speciation. The crucial step in achieving this goal is the development of simple and general dynamical models that can be studied not only numerically but analytically as well. I review some of the existing analytical results on speciation. I ®rst show why the classical theories of speciation by peak shifts across adaptive valleys driven by random run into trouble (and into what kind of trouble). Then I describe the Bateson-Dobzhansky-Muller (BDM) model of speciation that does not require overcoming selection. I describe exactly how the probability of speciation, the average waiting time to speciation, and the average duration of speciation depend on the and migration rates, size, and selection for local . The BDM model postulates a rather speci®c genetic architecture of . I then show exactly why the genetic archi- tecture required by the BDM model should be common in general. Next I consider the multilocus generalizations of the BDM model again concentrating on the qualitative characteristics of speciation such as the average waiting time to speciation and the average duration of speciation. Finally, I consider two models of in which the conditions for sympatric speciation were found analytically. A number of important conclusions have emerged from analytical studies. Unless the population size is small and the adaptive valley is shallow, the waiting time to a stochastic transition between the adaptive peaks is extremely long. However, if transition does happen, it is very quick. Speciation can occur by mutation and random drift alone with no contribution from selection as different accumulate incompatible genes. The importance of and drift in speciation is augmented by the general structure of adaptive landscapes. Speciation can be understood as the divergence along nearly neutral networks and holey adaptive landscapes (driven by mutation, drift, and selection for adaptation to a local biotic and/or abiotic environment) accompanied by the accumulation of reproductive isolation as a by-product. The waiting time to speciation driven by mutation and drift is typically very long. Selection for (either acting directly on the loci underlying reproductive isolation via their pleiotropic effects or acting indirectly via establishing a genetic barrier to gene ¯ow) can signi®cantly decrease the waiting time to speciation. In the parapatric case the average actual duration of speciation is much shorter than the average waiting time to speciation. Speciation is expected to be triggered by changes in the environment. Once genetic changes underlying speciation start, they go to completion very rapidly. Sympatric speciation is possible if disruptive selection and/or assortativeness in mating are strong enough. Sympatric speciation is promoted if costs of being choosy are small (or absent) and if linkage between the loci experiencing disruptive selection and those controlling is strong.

Key words. Allopatric, mathematical, models, parapatric, speciation, sympatric, theory.

Received December 3, 2002. Accepted April 1, 2003.

Speciation, that is, the origin of , is one of the most divergence between different populations (or between parts intriguing evolutionary processes. Speciation is the process of the same population) resulting in substantial reproductive directly responsible for the diversity of . Understanding isolation. A suite of tools for modeling the dynamics of ge- speciation still remains a major challenge faced by evolu- netic divergence is provided by theoretical population ge- tionary biology even now after almost 150 years since the netics and ecology. These two theories form a foundation for publication of Darwin's book, . Al- the quantitative/mathematical study of speciation. though the argument on the appropriate way(s) to de®ne a Theoretical population has identi®ed a number of ``species'' is still unsettled (Wilson 1999), speciation is ul- factors controlling evolutionary dynamics such as mutation, timately a consequence of . Here, I will random genetic drift, recombination, natural and sexual se- view the dynamics of speciation as the dynamics of genetic lection, etc. A straight-forward approach for classifying dif- 2197 ᭧ 2003 The Society for the Study of Evolution. All rights reserved. 2198 SERGEY GAVRILETS

same point on the basis of empirical data. I conclude this discussion with a few comments some of which were made repeatedly in the past (e.g., Smith 1955, 1969; Endler 1977). The ®rst is that a consistent use of terminology is very im- portant. Sometimes the notion of sympatric speciation is used as an antonym of allopatric speciation, and thus includes both and sympatric speciation as de®ned above. This lays grounds for confusion. Even more mislead- ing is the usage of the term ``sympatric speciation'' as a synonym of ``the origin of species in a well de®ned geo- graphic area.'' For example, both in the empirical and the- oretical literature one can easily ®nd statements about ``sym- FIG. 1. Geographic modes of speciation. patric speciation of Lake Victoria '' which are only a little less misleading than something like ``sympatric spe- ciation of North American birds.'' One should also be ab- ferent mechanisms and modes of speciation is according to solutely clear that the de®nitions of the geographic modes of the type and strength of the factors controlling or driving speciation imply nothing about the forces that drive genetic genetic divergence. In principle, any of the factors listed divergence leading to speciation. Any given evolutionary fac- above can be used at any level of classi®cation. However, tor can play a role within each of the three geographic modes. traditionally in the discussions of spe- For example, sympatric speciation can be driven by mutation ciation mechanisms are framed in terms of a classi®cation in and random drift and allopatric speciation can be driven by which the primary division is according to the level of mi- ecological factors. Models describing both these examples gration between the diverging (sub)populations (Mayr 1942). exist. Finally, the traditional stress on the spatial structure In this classi®cation the three basic (geographic) modes of of (sub)populations as the primary factor of classi®cation speciation are allopatric, parapatric, and sympatric. rather than, say, on selection re¯ects both the fact that it is Allopatric speciation (from Greek words allos meaning most easily observed (relative to the dif®culties in inferring ``other'' and patra meaning ``fatherland'' or ``country'') is the type and/or strength of selection acting in natural pop- the origin of new species from geographically isolated ulations) and the growing realization that spatial structure of (sub)populations. In the allopatric case there is no migration populations is very important. Therefore, I do not ®nd com- of individuals (and gene ¯ow) between the diverging pelling recent suggestions to abandon this classi®cation in (sub)populations. At the other extreme of the highest possible favor of a classi®cation based on types of selection (Via 2001) migration level lies sympatric speciation (from Greek word or make it subordinate to a ``/prezygotic isolating sym meaning ``the same''). Sympatric speciation is usually mechanisms'' continuum (Kirkpatrick and Ravigne 2002). de®ned as the origin of new species from a single local pop- Speciation is a very complex process which is affected by ulation (Mayr 1942). This de®nition (as well as similar def- many different factors (genetical, ecological, developmental, initions based on the ``absence of geographic separation,'' environmental, etc.) interacting in nonlinear ways. Both this occurrence within a ``cruising range,'' occurrence within a complexity and the dif®culties of experimental approaches ``geographic range'' of ancestor, etc., although intuitively coming in particular from the very long time scales that are appealing, is not precise enough for modeling purposes. In- typically involved imply that mathematical models have to deed, to use such a de®nition one would need to additionally play a very important role in speciation research. The ability de®ne the exact meaning of ``local population,'' ``geographic of models to provide insights into the speciation process, to separation'' or ``cruising range,'' etc. I will de®ne sympatric train our intuition, to provide a general framework for study- speciation as the emergence of new species from a population ing speciation, and to identify key components in its dynam- where mating is random with respect to the place of birth of ics are invaluable. the mating partners (for a similar approach, see Kondrashov For reasons that are unclear, models of speciation were and Mina 1986). This de®nition is actually implied in most slow to appear. Paradoxically, none of the four greats who mathematical models of sympatric speciation. The interme- are usually viewed as the founders of modern theoretical diate cases when migration between diverging population geneticsÐFisher, Wright, Haldane, KimuraÐex- (sub)populations is neither zero nor maximum possibly fall pressed much interest in developing mathematical models within the domain of parapatric speciation (from Greek word explicitly dealing with speciation. (Among their work most para meaning ``beside'', ``side-by-side,'' ``next to''). closely related to speciation are Fisher's verbal models of To illustrate this classi®cation of speciation modes, let us runaway evolution caused by [1930], consider a system of two demes that exchange a proportion Wright's verbal theory of shifting balance [1931, 1982] and m (Յ1/2) of its members each generation. Assume that mating his model of assortative mating [1921], studies of stochastic follows migration and is random within each . Then the peak shifts [Haldane 1931; Kimura 1985; Wright 1941], and allopatric case corresponds to m ϭ 0, the sympatric case clines [Fisher 1950; Haldane 1948]). To my knowledge, the corresponds to mϭ1/2, and parapatric case corresponds to ®rst substantial discussions of speciation utilizing formal 0ϽmϽ1/2 (see Figure 1). Both this ®gure and biological in- mathematical models are those by Maynard Smith (1962, tuition suggest that parapatric speciation is the most general 1966) and Bazykin (1965, 1969). Although these papers have (geographic) mode of speciation. Endler (1977) made the been very important for setting the ®eld and one of them has MODELS OF SPECIATION 2199 become a ``citation classic,'' as far as concrete results are simulations as the basic tool. The most general feature of concerned, they were mostly about modeling the maintenance simulation models is that their results are very speci®c. In- of rather than about speciation per se. A terpretation of numerical simulations, the interpolation of part of the 1966 paper by Maynard Smith did treat speciation their results for other parameter values, and making gener- directly, but only in a somewhat awkward numerical way for alizations based on simulations are notoriously dif®cult. For a single set of parameter values assuming complete repro- these reasons the results of simulations are typically wide ductive isolation from the start. Systematic studies of spe- open to interpretation and narrowing of interpretation is not ciation utilizing mathematical models started only in the early possible without additional work. However, simulation re- 1970s with the pioneering papers of Crosby (1970), Dick- sults are usually impossible to reproduce because many tech- inson and Antonovics (1973), and Balkau and Feldman nical details are not described in original publications. At the (1973). Crosby (1973) was the ®rst to use an individual-based same time, the authors of the original publications (and this model for describing parapatric speciation. Dickinson and is not limited to theoretical research) almost never return to Antonovics (1973) were the ®rst to use numerical iteration their work to check the generality of their conclusions under of dynamic equations to explore a wide range of parameter more general or reasonable parameter values or assumptions. values for a speci®c model of parapatric and sympatric spe- What is missing in the theoretical speciation research are ciation. Balkau and Feldman (1973) were the ®rst to ®nd general and transparent analytical results comparable to those conditions for parapatric speciation in a certain model ana- in other areas of theoretical and ecology. lytically. For comparison, according to Rice and Hostert's It is simple mathematical models allowing for analytical in- (1993) review (the title of which I have adapted for this vestigation (rather than complex numerical models) that form paper), the earliest experimental paper on speciation is that the basis of most scienti®c theories, and there are no reasons of Koopman published in 1950, and by the early 1970s nine- why evolutionary biology, in general, and speciation re- teen more experimental papers had been published. search, in particular, should be an exception. The questions However now the situation has dramatically changed and that can be answered depend on the tools used in theoretical the theoretical studies well outnumber the experimental pa- research. The majority of existing models of speciation (es- pers. This imbalance only continues to grow. For example, pecially models of sympatric speciation) mostly demonstrate recent reviews of experiments on speciation (Florin and that speciation in a certain scenario may occur. What is need- OÈ deen 2002; OÈ deen and Florin 2000, 2002) list only four ed now is a shift in focus to identifying more general rules new experimental papers published since Rice and Hostert's and patterns. In principle, to learn that ``selection promotes 1993 review, while during the same time interval dozens of speciation'' or that ``peak shifts by random genetic drift are new modeling papers came out. At present there have been unlikely'' one does not need mathematical models because at least a hundred papers modeling some aspects of speciation this is what biological intuition already tells us. Mathematical (Kirkpatrick and Ravigne 2002). Unfortunately, theoretical models are needed only if one wants to get much more spe- speciation research provides an example where the second ci®c conclusions and predictions about the effects of genetic law of dialectics has failed, at least so far. (The second law architecture and different types and intensities of selection of dialectics is concerned with the transformation of quantity on the probability of speciation or peak shift and their dy- into quality [e.g., Engels 1940, Hegel 1975].) Instead of a namical characteristics. Such knowledge can be coupled with sound mathematical theory of speciation, there is what I can the estimates of relevant parameters from natural populations only interpret as a (justi®able) frustration among both em- to get a much deeper understanding of the process. Mathe- piricists and theoreticians with the theoretical speciation re- matical models are also needed if intuition is not working search. For example, Via (2001), an empiricist, who is very because of the complexity of the process under consideration. favorable to the idea of sympatric speciation and appears to My goal here is to describe and illustrate some existing be convinced it is very probable from a theoretical point of theoretical results on speciation that are both simple and gen- view still asks theoreticians (who have already published eral. The common wisdom is that a picture is worth a thou- dozens of papers on the subject) to ``generalize, by further sand words. In the exact sciences, an equation is worth a theoretical exploration and integration of existing models, thousand pictures. Equations and their interpretations are the the theoretical conditions under which sympatric speciation most concrete results a theoretician can come up with. There- can occur'' (p. 389). Theoreticians Kirkpatrick and Ravigne fore, equations and their interpretations are the focus of this (2002) talk about the ``balkanization'' of the theory of spe- paper. The extensive body of purely numerical work on spe- ciation and the absence of clear and general results worthy ciation is deemphasized. The equations to be discussed below to be included in the textbooks. Kirkpatrick and Ravigne are easily accessible to biologists who lack mathematical devised a scheme for classifying the published models of training. They represent a kind of a ``do-it-yourself'' kit speciation in a hope to better understand their results. Turelli which can be used by biologists to check or train their in- et al. (2001) paint a very gloomy picture of the ®eld of the- tuition about speciation by substituting speci®c numerical oretical research concluding in particular that because of the values of parameters and interpreting the results. complexity of the process of speciation . . . ``progress on ma- In the remainder of this paper, I ®rst will show why the jor issues . . . is more likely to emerge from empirical than classical theories of speciation by peak shifts across adaptive from mathematical analyses'' (p. 330). valleys driven by random genetic drift run into troubles (and I believe the current situation stems primarily from the into what kind of troubles). Then I will describe the Bateson- limitations of the methods used in theoretical speciation re- Dobzhansky-Muller (BDM) model of speciation that does not search. Most modeling papers on speciation use numerical require overcoming selection. I describe exactly how the 2200 SERGEY GAVRILETS probability of speciation, the average waiting time to spe- (Lande 1979). This equation shows that T grows exponen- ciation, and the average duration of speciation depend on the tially with the product Ns. (Note that although the term mutation and migration rates, population size, and selection 1/͙Ns does decrease with increasing Ns, it is dominated by for local adaptation. The BDM model postulates a rather the exponential term). For example, if ␮ϭ10Ϫ5 (which is a speci®c genetic architecture of reproductive isolation. I then common estimate of the mutation rate, e.g., Futuyma 1998, show exactly why the genetic architecture required by the Grif®ths et al. 1996), sϭ0.05 (which implies a shallow adap- BDM model should be common in general. The theoretical tive valley) and Nϭ400 (which is a relatively small popu- results to be described here have led to a major shift in focus lation size), then T ϭ 1013 generations, which is a very long from Wright's ``rugged adaptive landscapes'' to nearly neu- time, in fact, far longer than the probable age of any extant tral networks and holey adaptive landscapes. Next I consider vertebrate species. the multilocus generalizations of the BDM model again con- Most of this time will be spent waiting for a ``lucky'' centrating on the qualitative characteristics of speciation such mutant destined to be ®xed to appear in the population as the average waiting time to speciation and the average as the overwhelming majority of mutant are removed duration of speciation. Finally, I consider two models of sym- by selection and random genetic drift. One can also estimate patric speciation in which the conditions for sympatric spe- the average time ␶ that it will take for the ``lucky'' mutant ciation were found analytically. Results of this type, if known allele to take over the population. Time ␶ characterizes the more widely, can clear up a lot of controversy that currently average actual duration of stochastic transitions between the surrounds the questions about the plausibility and generality peaks. Using the diffusion approximation (e.g., Ewens 1979) of sympatric speciation. The Appendix contains a glossary one ®nds that this time is approximately of symbols used in the paper. 12 Ns ␶ ഠ ϩ ln . Dynamics of Stochastic Peak Shifts ss΂΃Ί 2 Here I describe two simple models of stochastic transitions The term ln(͙Ns/2 ) changes very slowly with the product between adaptive peaks driven by random genetic drift. My Ns. Therefore, the actual duration of transition is on the order main goal is to show in quantitative terms that strong repro- of 1/s generations and, thus, is much shorter than the average ductive isolation is a very unlikely outcome of a stochastic waiting time to transition T. For example, with the same peak shift. values of parameters as above, ␶ϭ66 generations. The ratio of ␶ and T can be viewed as a measure of the likelihood to One- two-allele model with underdominance observe the actual transition. The above results show that observing the transition is very unlikely. Following Lande (1979), let us consider a randomly mating diploid population with discrete and nonoverlapping gener- Additive quantitative character ations. Assume that there is a single diallelic locus with al- leles A and a controlling ®tness (viability). Let the relative Next, let us consider a diploid population of size N where ®tnesses of AA and Aa be wAA ϭ 1, wAa ϭ 1 Ϫ individuals differ with respect to a single additive quanti- s and waa, respectively (0 Յ s Յ 1) . The adaptive landscape tative trait z. Assume that the genotypic variance of the trait corresponding to this model has two ``adaptive peaks'' at is somehow maintained at a constant value G. Let the pop- genotypes AA and aa, and an ``adaptive valley'' at ulation be under viability selection de®ned by a ®tness func- Aa (see Fig. 2a). tion Assume that initially the population is monomorphic for w(z) ϭ exp[Ϫs(1 Ϫ z22)], allele A. Let us allow for mutation from A to a with a small probability ␮ per gene per generation. Selection will tend to where sϾ0. This ®tness function has two local peaks at z ϭ eliminate alleles a, whereas mutation will continuously sup- Ϫ1 and z ϭ 1 separated by a valley at zϭ0 (see Fig. 2b). ply them. If the population size N is very large, the population The height of the peaks is one, and the ®tness at the bottom will stay at a state where allele a is maintained at a very low of the valley is exp(Ϫs), which is approximately 1 Ϫ s for frequency ␮/s. However, if the population size N is small, small s. That is, parameter s, as before, characterizes the depth the frequency of the mutant allele a will eventually cross 1/2 of the adaptive valley separating the adaptive peaks. as a result of random genetic drift and approach 1 resulting Assume that initially the average trait value is in a neigh- in a ``peak shift.'' Parameter s, measuring the depth of the borhood of one of the peaks. Random genetic drift will even- adaptive valley that separates the peaks, also characterizes tually result in the population crossing the adaptive valley the strength of (postmating) reproductive isolation between and approaching the other peak (Barton and Charlesworth the population states before and after the peak shift. 1984; Lande 1985). In this model the average waiting time How long does one have to wait until a peak shift occurs to the peak shift is approximately in this model? If the mutation rate is very small and the ͙2␲ product of N and s is at least moderately large (Ͼ2) , the T ഠ e2Ns. average waiting time until a peak shift occurs can be ap- 4Gs proximated as Thus, as in the previous model, the waiting time to transition grows exponentially with Ns. Most of time T will be spent 1 ͙␲ eNs T ഠ in a neighborhood of the initial adaptive peak waiting for the ␮ 2 ͙Ns ``right'' sequence of stochastic events for the peak shift to MODELS OF SPECIATION 2201

isolation may follow after a chain of stochastic peak shifts each of which is across a shallow valley and, thus, results in a very small degree of reproductive isolation (Walsh 1982). If there are many adaptive peaks separated by shallow val- leys, then the pattern of the evolutionary dynamics will be that of a sequence of rapid transitions between adaptive peaks separated by very long periods of no apparent change spent at each of the ``intermediate'' adaptive peaks. The extended stays at ``intermediate'' peaks implies that in this model of allopatric divergence the pattern of evolutionary dynamics is quite different from that of ``'' (Eld- redge 1971; Eldredge and Gould 1972; Gould 2002) in which intermediate stages are (almost) never observed. Below we will see that the pattern of ``punctuated equilibrium'' arises in models of parapatric speciation.

The Bateson-Dobzhansky-Muller Model By the Bateson-Dobzhansky-Muller (BDM) model, I mean a very speci®c assumption about the genetic architecture of reproductive isolation, namely that there are two loci with two alleles at different loci being ``incompatible'' (Bateson 1909; Dobzhansky 1937; Muller 1942). The adaptive land- scapes implied by this model are illustrated in Figure 3. In this Figure it is alleles a and B that are assumed to be in- compatible in the sense that individuals carrying both of them have zero viability (Fig. 3a) or that females carrying allele a do not mate with males carrying allele B (Fig. 3b). Spe- ciation occurs if two populations that have initially the same genetic composition end up at the opposite sides of the ridge of high ®tness values as illustrated in Figure 3. In the BDM model, the populations are not required to cross any adaptive valleys as they simply follow a ridge of high ®tness values. For example, inviability in the ®sh Xiphophorus be- haves as if controlled by a simple BDM model (Orr and FIG. 2. Simple adaptive landscapes with two equal peaks. (a) One- Presgraves 2000). locus two-allele diploid model with underdominance. (b) Disruptive Assume that initially the population has genotype AABB selection on an additive quantitative character. ®xed. Let us allow for mutation from A to a and from B to b at an equal rate ␮ per generation, and let us neglect the occur. The average actual duration of transition ␶ is approx- possibility of backward mutations. In this model the popu- imately lation will de®nitely reach the state with genotype aabb ®xed (see Fig. 3b). This means that speciation is certain. How long 1 ␶ ഠ ln[8͙2͙sN], does one have to wait until speciation and what is the actual 4Gs duration of the speciation process? Can speciation happen by that is, it has order 1/(Gs). For example, if the genetic var- mutation and random genetic drift alone? What are the quan- iance G ϭ 0.04 (which implies that the bottom of the valley titative effects of selection for local adaptation on the dy- is at ®ve standard deviations from the peak), sϭ0.05 and namics of speciation? These are the questions I intend to Nϭ400, then T ϭ 1.31 ϫ 1020 generations and ␶ϭ490 answer in this section. generations. Let us de®ne the average waiting time to speciation T as There are two important conclusions emerging from these the average time that it takes to get from the ancestral state simple analyses. First, unless the population size is small and (i.e., with genotype AABB ®xed) to the state of reproductive the adaptive valley is shallow, the waiting time to a stochastic isolation (i.e., with genotype aabb ®xed). Let us de®ne the transition between the adaptive peaks is extremely long. This average actual duration of speciation ␶ as the average time implies that it is very unlikely that a single peak shift will that it takes to get from the ancestral state to the state of result in strong reproductive isolation. Second, if transition reproductive isolation without returning to the ancestral state. does happen, it is very quick. This implies that observing it Time ␶ can also be thought of as the average duration of the ``in action'' is practically impossible. These conclusions are intermediate stages in the actual transition to a state of re- very general and apply to a broad class of stochastic tran- productive isolation (speciation). In a sense, the ratio of ␶ sitions between adaptive peaks (e.g., Barton and Charles- and T characterizes the probability of observing incipient worth 1984; Barton and Rouhani 1987). Strong reproductive speciation. The dynamics of speciation can be modeled as a 2202 SERGEY GAVRILETS

isolation. In this case, speciation will be driven by mutation and random genetic drift which represents a general null mod- el of speciation. The average waiting time to speciation and the average duration of speciation are equal to 2 T ϭ␶ϭ . ␮ This last expression is very easy to understand. The average waiting time until ®xation of a neutral allele is approximately the reciprocal of the mutation rate (Nei 1976). To evolve to a state with haplotype ab ®xed, the population has to ®x two neutral allelesÐthe fact re¯ected in the expression for T above. Note that because in the neutral case the rate of sub- stitutions does not depend on the population size, the pop- ulation size N does not enter the above equations for T and ␶. Because in this model genetic divergence is irreversible, the average duration of speciation is equal to the average waiting time to speciation. Next we allow for selection for local adaptation. I will assume that mutant alleles a and b have a small selective advantage sa over the ancestral alleles A and B in the en- vironment experienced by the population. In this case the average waiting time to speciation and the average duration of speciation are approximately 21 T ϭ␶ഠ , ␮ Sa

where Sa ϭ 4Nsa (and it is assumed that Sa is at least mod- erately large, i.e., Ͼ3) . The above expression strongly sup- ports previous arguments (e.g., Schluter 2000) that selection for local adaptation can dramatically accelerate speciation. For example, increasing Sa by one order of magnitude will decrease the waiting time to speciation by one order of mag- nitude.

Parapatric speciation In the parapatric case, we assume that each generation a small proportion m of the population is substituted by im- migrants coming from a source population. All immigrants have ancestral genotype AABB. First assume that speciation is driven by mutation and ran- dom genetic drift (the null model). Then the waiting time to speciation is FIG. 3. Adaptive landscapes in the diploid BDM model. The height of a bar gives the ®tness of the corresponding combination of genes. m 1 m (a) of an individual. The presence of alleles a and B results T ϭ 2 ϩ ഠ 2 , in zero viability. (b) Fitness of a mating pair. Females carrying ΂΃␮␮ ␮ allele a do not mate with males carrying allele B. It is assumed while the average duration of speciation is that alleles at locus B are not expressed in females whereas alleles at locus A are not expressed in males. The arrows show two possible m routes to the evolution of reproductive isolation. 2 ϩ 1 ␮ 1 ␶ϭ ഠ , ␮␮m random walk performed by the most common genotype in 1 ϩ the population along the ridge of high ®tness values (Gav- ␮ rilets 2000a). I will describe the corresponding approxima- where the approximate equalities assume that the rate of mu- tions for T and ␶ separately for allopatric and parapatric cases. tation is much smaller than the rate of migration (␮Km). For example, if m ϭ 0.01 and ␮ϭ10Ϫ5, then Tഠ108 generations Allopatric speciation and ␶ഠ105 generations. Note that the mutation rate has more First assume that the two loci under consideration have no in¯uence on the dynamics of parapatric speciation than the other pleiotropic effects on ®tness not related to reproductive migration rate. MODELS OF SPECIATION 2203

Next we consider two models of selection for local ad- m 1 T ഠ ␥ , ␶ ഠ . aptation. The ®rst model describes direct selection on the ␮␮2 alleles underlying reproductive isolation. Speci®cally, let us assume that mutant alleles a and b have a small selective These equations show that a strong genetic barrier to the neutral gene ¯ow (i.e., low ␥) will signi®cantly decrease the advantage sa over the ancestral alleles A and B. Then the average waiting time to speciation and the average duration waiting time to speciation but will not affect its average of speciation are duration. For example, if ␥ϭ0.1, the waiting time to spe- ciation reduces to one tenth of that in the case of divergence driven by mutation and drift. 1 m ϪS 1 T ഠ 2 ϩ e a , There are several conclusions emerging from this analysis. ␮␮΂΃Sa Unless there is selection for local adaptation, the waiting time m to speciation in the BDM model is very long (but not as long 2 ϩ eϪSa 1 ␮ 1 as under stochastic peak shift). Migration can signi®cantly ␶ ഠ , delay speciation. It is generally agreed upon that spatial het- ␮ m Sa 1 ϩ eϪSa erogeneity in selection is common and that this can have ␮ profound effects of the possibility of speciation even in the presence of substantial gene ¯ow (Endler 1977; Ogden and where it is assumed that the coef®cient Sa ϭ 4Nsa is at least moderately large (Ͼ3) . For example, with the same param- Thorpe 2002; Schneider et al. 1999; Smith et al. 1997). Our 5 model makes this intuition more precise. Selection for local eters as above and with Saϭ5, then T ϭ1.73 ϫ 10 generations and ␶ϭ2.24 ϫ 104 generations. That is, the average waiting adaptation (either acting directly on the loci underlying re- time to speciation is reduced by three orders of magnitude productive isolation via their pleiotropic effects or acting relative to that in the model with no selection, and this brings indirectly via establishing a genetic barrier to gene ¯ow) can it near a realistic range of values. signi®cantly decrease the waiting time to speciation. Direct The second model considers effects of a genetic barrier to selection is much more effective than indirect selection. In neutral gene ¯ow. We assume that new alleles a and b have the parapatric case the average actual duration of speciation no effects on adaptation to local conditions but that the pop- is much shorter than the average waiting time to speciation. ulation has already diverged from the source population in some other loci controlling adaptation to local conditions. Nearly Neutral Networks and Holey Adaptive Landscapes Now immigrants will have a reduced ®tness, which will affect We have seen that evolution of strong reproductive iso- the fate of the ancestral alleles A and B that they carry. The lation via stochastic transitions between isolated adaptive effects of selection ``induced'' on neutral alleles via their peaks is unlikely. At the same time, the BDM model shows association with some locally deleterious alleles can be char- that strong reproductive isolation can be achieved if the pop- acterized in terms of the gene ¯ow factor (Bengtsson 1985; ulation evolves along a ridge of high ®tness values in the Barton and Bengtsson 1986; Gavrilets 1997a; Gavrilets and genotype space. How likely are such ridges in general? My Cruzan 1998) de®ned as the probability that a neutral gene goal in this section is to answer this question quantitatively. brought by immigrants makes it to the local genetic back- I will follow the general approach of Gavrilets and Gravner ground. For example, assume that immigrating adults differ (1997) and Gavrilets (1997b, 2003). from the residents in two genes: a gene reducing the viability To get some intuition about this question let us ®rst con- of F1 hybrids to 1Ϫs (relative to viability 1 of the residents) sider the set of all possible haploid genotypes that have only and a neutral gene. Assume also the average number of off- two genes each with a very large number of alleles. The spring produced by matings between the immigrants and res- alleles at the ®rst locus will be denoted as Ai and the alleles idents, , and by matings between F hybrids and residents, ␣ 1 at the second locus will be denoted as Bj. Assume a step- , is reduced relative to that of matings among residents ␤ wise mutation pattern (Nei et al. 1983) allowing allele Ai (which is normalized to be 1) . This can happen if there is mutate only to alleles Ai ϩ 1 and Ai Ϫ 1 and in a similar way, fertility and/or sexual selection against immigrants and/or allowing allele Bi mutate only to alleles Bi ϩ 1 and Bi Ϫ 1. The hybrids. Then the gene ¯ow factor is genotype space (i.e., the set of all possible genotypes) in this model can be represented by a two-dimensional lattice of r(1 Ϫ s)␣␤ ␥ϭ , square sites where the x- and y- coordinates specify the alleles 1 Ϫ (1 Ϫ r)(1 Ϫ s)␤ at the ®rst and second locus, respectively. where r is the rate of recombination between the selected and Following Gavrilets and Gravner (1997), assume that ge- neutral loci (Gavrilets and Cruzan 1998). For example, let s notype ®tnesses (viabilities) are generated randomly and in- ϭ␣ϭ␤ϭ0.5. Then if r ϭ 0.5, then ␥ϭ0.07; if r ϭ 0.05, dependently and are only equal to one (viable genotype) or then ␥ϭ0.016. That is, joint action of viability selection zero (inviable genotype) with probabilities P and 1 Ϫ P, re- and fertility selection and assortative mating can signi®cantly spectively. Here, one might think of the set of all possible decrease the effective rate of immigration of neutral alleles genotypes playing one round of Russian roulette with P being even if they are unlinked to the locus under selection. Re- the probability to get a blank. In this model, any change in turning back to modeling speciation, let us assume that the genotype no matter how large or small, results in a ®tness gene ¯ow factor resulting from selection for local adaptation value completely independent of the initial value. Note that is ␥. Then approximately this assumption is, of course, a great oversimpli®cation which however does not affect our conclusions. 2204 SERGEY GAVRILETS

FIG. 4. Formation of clusters of viable genotypes (painted in black) in two dimensions for different values of the probability of being viable p. Inviable genotypes are painted in white. (a) P ϭ 0.15, (b) P ϭ 0.30, (c) P ϭ 0.45, (d) P ϭ 0.60.

Let us paint viable genotypes (sites) in black and inviable proportion of all viable genotypes (see Fig. 4d). In this model, genotypes (sites) in white (see Fig. 4). For each site (geno- describing a so-called ``site percolation'' on an in®nite two- type), its four adjacent sites (directly above, below, on the dimensional lattice, the percolation threshold is Pc ϭ 0.593 left, and on the right) represent other genotypes that can be (e.g., Grimmett 1989). obtained by a single mutation. In this model, viable genotypes The percolation threshold drops dramatically with more tend to form connected networks (or ridges) that populations loci and alleles. For example, if there are L loci each with A can evolve along by ®xing single mutations without the need alleles, then the percolation threshold is to go across any inviable states. The number and the structure 1 of networks depend on the probability P. For small values of P ϭ . c ( Ϫ 1) P there are many networks of small size (see Fig. 4a). As P L A increases, the size of the largest network increases (see Fig. For example, if L ϭ 10,000 and A ϭ 10, then values of P Ϫ5 4a,b,c). As P exceeds a certain threshold Pc, known as the larger than 10 will typically result in the existence of an ``percolation threshold,'' there emerges the largest network extensive network of high-®tness ridges expanding through- (known as the ``giant component'') that extends (``perco- out the genotype space. Here, the genotypes that belong to lates'') through the whole system and includes a signi®cant this network have the same ®tness. In this sense, the network MODELS OF SPECIATION 2205

and nearly neutral networks show that Wright's picture can be very misleading if adaptive landscapes have a very high dimensionality because such landscapes are characterized by the existence of extensive nearly neutral networks. Among different nearly neutral networks, those with suf®ciently high ®tnesses are of particular evolutionary importance because they allow for continuous evolutionary innovations without any signi®cant loss in ®tness. An important notion describing such networks is that of ``holey adaptive landscape.'' A holey adaptive landscape is de®ned as an adaptive landscape where relatively infrequent high-®tness genotypes form a contigu- ous set that expands throughout the genotype space. An ap- propriate three-dimensional image of such an adaptive land- scape that focuses exclusively on the percolating network is a nearly ¯at surface with many holes representing genotypes that do not belong to the network (see Fig. 5b). The smooth- ness of the surface in this ®gure re¯ects close similarity be- tween the ®tnesses of the genotypes forming the correspond- ing nearly-neutral network. The ``holes'' include both lower ®tness genotypes (``valleys'' and ``slopes'') and very high ®tness genotypes (the ``tips'' of the adaptive peaks). The BDM model considered above provides one of the simplest examples of holey adaptive landscapes. Many more examples are known (Gavrilets 1997b, 2003; Gavrilets and Gravner 1997). This section has dealt exclusively with the structure of adaptive landscapes. A number of mechanisms, such as ran- dom drift, pleiotropic selection, or random ¯uctuations in ®tness can drive evolutionary divergence on the landscapes. The next section considers the ®rst two mechanisms.

Multilocus Generalizations of the BDM Model The BDM model was formulated in terms of only two loci. However, existing data on the genetics of reproductive iso- lation show that typically there are many different loci un- derlying reproductive isolation even at very early stages of divergence (Naveira and Masida 1998; Wu 2001; Wu and Palopoli 1994). If genetic architecture of reproductive iso- lation is known, the dynamics of speciation can be modeled, at least in principle. First, however, the genetic architecture of reproductive isolation is never known completely and, FIG. 5. Adaptive landscapes. (a) A rugged adaptive landscape second, the mathematical treatment becomes extremely com- formed by genotypes with ®tnesses assigned randomly from a uni- plicated as the number of genes involved increases. There- form distribution. (b) A holey adaptive landscape formed by ge- notypes with ®tnesses within a narrow ®tness band. Evolution along fore, rather than studying the dynamics of speciation given a holey landscape is nearly neutral. a speci®c genetic architecture, it becomes much more fruitful to look at the dynamics of speciation expected ``on average.'' (We have already used a similar approach in our discussion is neutral. In more general situations where a range of var- of the properties of typical adaptive landscapes in the pre- iation of ®tness values is allowed, the above results can be vious section.) One such approach (Orr 1995; Orr and Orr generalized to demonstrate the existence of connected nearly 1996; Orr and Turelli 2001) reduces the complexity of adap- neutral networks in which genotypes have approximately the tive landscapes underlying reproductive isolation to simpler same ®tness (Gavrilets 1997b, 2003; Gavrilets and Gravner effects of genetic ``incompatibilities'' of certain types that 1997). arise with certain probabilities. In what follows, two (or Starting with Wright (1932) typical adaptive landscapes more) genes are called incompatible if their joint presence are usually imagined as very rough surfaces with many dif- in an individual's genotype or in the genotypes of a (poten- ferent peaks and valleys (see Fig. 5a). Continuous evolution tial) mating pair results in reduction of a ®tness component. on such ``rugged adaptive landscapes'' requires crossing The goal of this section is to ®nd the average waiting time adaptive valleys (as modeled above). The results on neutral to speciation and the average duration of speciation. 2206 SERGEY GAVRILETS

Accumulation of genetic incompatibilities Let us assume that there is a large number of diallelic loci potentially affecting reproductive isolation. Let any combi- nation of k alleles at different loci be incompatible with prob- ability q. (Note that in the BDM model kϭ2). Consider two populations that have diverged in d loci. Variable d is a measure of genetic distance separating the populations. As shown by Orr (1995), the expected number of incompatibil- ities between them is d J ϭ q , (1) ΂΃k d where ϭ d!/[k!(d Ϫ k)!] is the binomial coef®cient (i.e., ΂΃k the number of combinations of k objects chosen from a set of d objects). The value of J increases very rapidly (``snow- FIG. 6. The probability of no reproductive isolation with C ϭ 10. balls'') with genetic distance d (approximately as the k-th Different lines correspond to k ϭ 2 (the shallowest), 4, 6, 8 (the order of d) . Notice that the snowball effect is much more steepest). pronounced with larger values of k. How do the incompatibilities translate into reproductive isolation between the populations? Let us assume that com- tive landscape is ``correlated'' in the sense that ®tnesses of plete reproductive isolation occurs when C incompatibilities similar genotypes are similar which is more realistic. separate the populations. (Note that in the BDM model, Allopatric speciation Cϭ1). Then the probability that two genotypes at distance d are not reproductively isolated is approximately The simplest approach to model allopatric speciation is to ⌫(C, J) assume that the populations accumulate substitutions at a w(d) ϭ (2) constant rate, say ␻ substitutions per generation (Orr 1995; ⌫(C) Orr and Orr 1996; Gavrilets 2000a; Orr and Turelli 2001). (SG, unpubl. data). Here ⌫(´, ´) and ⌫(´) are the incomplete If ``on average'' K substitutions are required for complete gamma function and gamma function, respectively (Grad- reproductive isolation between two populations, the average shteyn and Ryzhik 1994) and J is given by equation (1). That waiting time to speciation is approximately the probability w decreases with d is compatible with a gen- K eral empirical pattern that reproductive compatibility de- T ϭ (4) creases with genetic distance between parents (Edmands 2␻ 2002). The probability that two genotypes at distance d are generations, where the coef®cient 2 is because there are two reproductively isolated is 1 Ϫ w(d). One can also approximate populations accumulating substitutions. Thus, the average the average K and the coef®cient of variation CVK of the waiting time to allopatric speciation can be predicted on the number of substitutions required for speciation (i.e., complete basis of K alone without the need to specify the ®tness func- reproductive isolation): tion w(d) precisely. For example, if reproductive isolation k! 1/k results from C pairwise incompatibilities (kϭ2) , then equa- K ഠ C , (3a) tion (3) predicts that K ϭ ͙2C/q, leading to the average q ΂΃ waiting time to speciation 1 CVK ഠ (3b) 1 C k C T ϭ . ͙ ␻Ί2q (S. Gavrilets, unpubl. data). As expected, K increases with Orr (1995) and Orr and Turelli (2001) arrived to a similar the ratio C/q. The dependence of K on k is nonmonotonic expression using a different approach. How can one approx- with K minimized at some intermediate values of k. Figure imate the rate of substitutions ␻? One way is to assume that 6 shows that as genetic divergence exceeds value K, the prob- genes underlying reproductive isolation have no other pleio- ability that no complete reproductive isolation occurs un- tropic effects on ®tness. In this case, genetic divergence will dergoes a rapid transition from 1 to 0. This ``threshold effect'' be driven exclusively by mutation and random drift (the null is especially strong when many complex incompatibilities model of speciation). Let ␯ be the rate of neutral mutations are required for complete reproductive isolation. The latter per gamete per generation. If the within-population genetic feature is also apparent from the fact that the coef®cient of variation is absent (or very low), each new mutation that is variation CV quickly goes to zero as both C and k become K compatible with the current genetic state can be treated as large. selectively neutral. Then the rate of accumulation of substi- It should be intuitively clear that the adaptive landscapes tutions can be written as implied by this model are ``holey.'' However, in contrast to the Russian Roulette model described above, here the adap- ␻ϭ␯ MODELS OF SPECIATION 2207

This approach predicts that the average waiting time to spe- a locus that has already diverged will decrease d by one. ciation does not depend on the population size either. How- Speciation occurs when genetic distance d reaches the value ever, in general the process of accumulation of incompati- of K ϩ 1. bilities cannot be treated as neutral even if the relevant genes Neglecting within-population genetic variation, the aver- do not control any other ®tness components. This is because age waiting time to speciation is approximately the alleles producing reproductive isolation are weakly se- K 1 m lected against when rare (Nei et al. 1983; Gavrilets et al. T ഠ K! (6) 1998; Gavrilets 1999). A more careful analysis of the in- ␯␯΂΃ compatibility model used here shows that large populations (Gavrilets 2000a). The average duration of speciation is ap- will actually evolve slower than small populations (Nei et al. proximately 1983; Gavrilets et al. 1998, 2000; Gavrilets 1999). If alleles 1 ⌿(K ϩ 1) ϩ 0.577 underlying reproductive isolation also increase ®tness in the ␶ ഠ 1 ϩ , local conditions through some pleiotropic effects by a small ␯ []m/␯ amount s , then a where the number 0.577 is Euler's constant and ⌿(´) is the ␻ ഠ ␯Sa, psi (digamma) function (Gradshteyn and Ryzhik 1994). [Function ⌿(K ϩ 1) slowly increases with K and is equal to where as before S ϭ 4Ns . Strong selection for local ad- a a 0.42 at K ϭ 1, to 2.35 at K ϭ 10, and to 4.61 at K ϭ 100.] aptation will speed up both the genetic divergence and spe- For example, if m ϭ 0.01, ␯ϭ0.001 and K ϭ 5, then the ciation. waiting time to speciation is very long: T ϭ 1.35 ϫ 1010 One can also ®nd the variances of these waiting times. If generations, but if speciation does happen, its duration is substitutions accumulate at a constant rate, then the coef®- relatively short: ␶ϭ1236 generations. Figure 7 illustrates cient of variation CV of the time to speciation is the same T the dependence of T and ␶ on model parameters in more as the coef®cient of variation of the number of substitutions detail. Notice that ␶ is of the order 1/␯ across a wide range necessary for complete reproductive isolation (see eq. 3b). of parameter values. That is, Direct selection for local adaptation. Assume that each 1 ``new'' allele improves adaptation to the local conditions. CVT ഠ . (5) k͙C Let sa be the average selective advantage of a new allele over the corresponding ancestral allele and Sa ϭ 4Nsa. The waiting Orr and Turelli (2001) arrived to a similar expression for k time to speciation is approximately ϭ 2 using a different approach. Interestingly, equations (3a, 4, 5) show that speciation times of the character- 1 Ts ഠ T , ized by large C are expected to have both larger averages Saaexp(KS ) and narrower relative ranges (Orr and Turelli 2001). where T is given by equation (6). The average duration of speciation is approximately Parapatric speciation 1 ⌿(K ϩ 1) ϩ 0.577 1 To predict the dynamics of parapatric speciation one needs ␶ saഠ 1 ϩ exp(S ). to specify function w(d) precisely. A simple choice is the ␯ [](m/␯) Sa threshold function of reproductive compatibility For example, with the same values of parameters as above 4 1 for d Յ K, and Sa ϭ 2, Ts ϭ 2.74 ϫ 10 generations and ␶s ϭ 2170 w(d) ϭ generations. Thus, selection for local adaptation dramatically 0 for d Ͼ K Ά decreases T (in the numerical example, by the factor ഠ (Gavrilets et al. 1998; Gavrilets 1999). This function should 50,000). Selection for local adaptation also somewhat in- be viewed as a limiting case of function w(d) given by equa- creases ␶ relative to the case of speciation driven by mutation tion (2) when complete reproductive isolation requires many and genetic drift. complex incompatibilities (that is, when k or C are large; see Effects of a genetic barrier. Assume that alleles under- the brief discussion of the ``threshold effect'' below eq. 3). lying reproductive isolation have no other pleiotropic effects The neutral case (no reproductive isolation) corresponds to on ®tness but that the population has already diverged from K equal to the number of loci. Some other choices of w(d) the source population in some other loci underlying adap- were considered in Gavrilets (1999, 2000b). tation to local conditions. Let ␥ be the corresponding gene Suppose there is a large number of loci and let ␯ be the ¯ow factor resulting from selection for local adaptation. In probability of mutation per gamete per generation. As before, this case, the migration rate m has to be replaced by the we assume that the population is subject to immigration at effective migration rate me ϭ␥m. The average waiting time a constant rate m and that all immigrants have an ``ancestral'' to speciation can now be approximated as genotype. Now speciation can be modeled as a random walk T ഠ T␥K, (Gavrilets 2000a) on the integers 0, 1, 2, . . . performed by ␥ the average pairwise distance d between the immigrants and where T is given by equation (6). Even a weak genetic barrier the genotype most common in the population. Each new mu- can signi®cantly decrease the waiting time to speciation if tation ®xed in the population will increase d by one. The there are many loci underlying reproductive isolation. For

®xation of an ancestral allele brought by the immigrants in example, if ␥ϭ0.1 and K ϭ 5, then T␥ reduces by ®ve orders 2208 SERGEY GAVRILETS

expected to appear ®rst and increase in frequency in different populations necessarily resulting in some geographic differ- entiation even without any variation in local selection re- gimes. This genetic differentiation by mutation and drift alone can result in allopatric and parapatric speciation. How- ever, in the parapatric case the waiting time to speciation is relatively short only if a very small number of genetic chang- es is suf®cient for complete reproductive isolation. If more signi®cant genetic change is necessary, then parapatric spe- ciation without selection for local adaptation is hardly pos- sible. The results presented above show that even relatively weak selection (acting directly on the loci underlying repro- ductive isolation or indirectly via genetic barrier) reduces the waiting time to speciation by orders of magnitude. The waiting time to speciation, T, is very sensitive to pa- rameters: changing a parameter by a small factor can increase or decrease T by several orders of magnitude. Most of the parameters of the model (such as the migration rate, intensity of selection for local adaptation, the population size, and, probably, the mutation rate) directly depend on the state of the environment (biotic and abiotic) the population experi- ences. This suggests that speciation is triggered by changes in the environment (Eldredge 2003). If it is a signi®cant environmental change that initiates speciation, the popula- tions of many different species inhabiting the same geograph- ic area should all be affected in a similar way. In this case, one expects more or less synchronized bursts of speciation in a geographic area, that is, a ``turnover pulse'' (Vrba 1985). The results about the duration of speciation ␶ lead to two important generalizations. The ®rst is that the average du- ration of parapatric speciation, ␶, is much smaller than the average waiting time to speciation, T. This feature of the models studied here is compatible with the patterns observed in the fossil record which form the empirical basis of the theory of punctuated equilibrium (Eldredge 1971; Eldredge and Gould 1972; Gould 2002). The second generalization concerns the absolute value of ␶ which is on the order of one FIG. 7. The average waiting time to speciation, T, and the average over the mutation rate for a subset of the loci affecting re- duration of speciation, ␶, in the case of speciation driven by mutation productive isolation for a wide range of migration rates, pop- and random drift. The horizontal axes give the ratio of migration and mutation rates. In (a), the vertical axis gives the product of T ulation sizes, intensities of selection for local adaptation, and and the mutation rate ␯ on the logarithmic scale. In (b), the vertical the number of genetic changes required for reproductive iso- axis gives the product of ␶ and the mutation rate ␯ on the linear lation. Given a ``typical'' mutation rate on the order of 10Ϫ5 scale. In (b), the lines correspond to K ϭ 1,2,4,6,8, and 10 (from Ϫ 10Ϫ6 per locus per generation (Grif®ths et al. 1996; Fu- bottom to top). [Using composite variables for the vertical axes tuyma 1998) and assuming that there are at least on the order allows one to represent relevant dependencies in two dimensions.] of 10±100 genes involved in the initial stages of the evolution of reproductive isolation (Singh 1990; Wu and Palopoli 1994; Coyne and Orr 1998; Naveira and Masida 1998; Wu 2001), of magnitude relative to the neutral case. The average du- 3 ration of speciation is approximately the duration of speciation is predicted to range between 10 and 105 generations. 1 ⌿(K ϩ 1) ϩ 0.577 1 ␶ϭ 1 ϩ , ␥ ␯ []m/␯␥ Sympatric Speciation and is increased by the presence of the barrier. The greatest share of modeling work has been on sympatric It would be very important to know the variances of the speciation. In spite of this and for the reasons discussed in waiting time to speciation and of the duration of speciation. the introduction, no clear and simple quantitative results are Unfortunately, no analytical results comparable to expression generally known. Here I consider two simple models of sym- (5) are known for the parapatric case. patric speciation for which conditions for sympatric speci- Most species consist of geographically structured popu- ation can be found analytically. lations, some of which experience little genetic contact for The ®rst model, developed by Udovic (1980) describes a long periods of time (Avise 2000). Different mutations are diploid population with alleles A and a and the ®rst locus MODELS OF SPECIATION 2209 and alleles B and b at the second locus. There are two com- model, if there are no other loci, the allele frequencies at the mon indices measuring the progress toward sympatric spe- AM locus do not change whereas the genotype frequencies ciation in models of this kind. One is the heterozygote de®- approach an equilibrium at which the index of heterozygote ciency index de®ciency is y ␣ I ϭ 1 Ϫ I ϭ (7) 2p(1 Ϫ p) A 2 Ϫ␣ where y is the frequency of heterozygotes at a locus under (O'Donald 1960) The second locus with alleles B and b will consideration, p and 1 Ϫ p are the frequencies of the two be called the disruptive selection locus or the DS locus. The alleles at the locus. The range of possible values of I is DS locus is subject to symmetric frequency-dependent dis- between 0 and 1. If the population is in Hardy-Weinberg ruptive selection favoring rare genotypes. This implies that proportions (which is expected if mating is random), y ϭ 2pq if mating is random (that is, if ␣ϭ0), the population evolves and I ϭ 0. If the population has split into two homozygous to a stable polymorphic equilibrium where the allele fre- groups that do not mate and hybrids (i.e., heterozygotes) are quencies are pB ϭ pb ϭ 1/2 and the heterozygotes experience completely absent, then y ϭ 0 and I ϭ 1. Another index is a relative ®tness loss which we will denote as S (0 Յ S Յ the normalized linkage disequilibrium, DЈ, between the two 1). If S ϭ 1, heterozygotes are inviable and (postmating) loci which is de®ned as reproductive isolation between the two homozygotes is com- plete. D DЈϭ . In this model there always exists a line of equilibria where ͙ppppAaBb the allele frequency at the AM locus can be arbitrary, the Here D is the gametic linkage disequilibrium (de®ned as D de®ciency of heterozygotes at the AM locus is given by equa- tion (7), the allele frequencies at the DS locus are at 1/2, the ϭ xABxaB Ϫ xAbxaB, where xAB, xab, xAb, and xaB are the fre- genotype frequencies at the DS locus are in Hardy-Weinberg quencies of the corresponding gametes) and pA,pa,pB and proportions (I ϭ 0) , and the linkage disequilibrium between pb are the corresponding allele frequencies. The range of B possible values of DЈ is from Ϫ1toϩ1. If alleles are dis- the loci is absent (DЈϭ0). In words, on this line of equilibria tributed randomly between different gametes, then DЈϭ0. the loci behave as completely independent. Generically, this If the population has split into two genetic clusters so that line of equilibria is locally stable if disruptive selection and the only gametes present are AB and ab, then DЈϭ1. If the assortative mating are weak and linkage is loose. only gametes present in the population are Ab and aB, then Under certain conditions the line of equilibria becomes DЈϭϪ1. unstable. If this happens, the population evolves to one of In the second model, which deals with sexual con¯ict, the two alternative polymorphic equilibria at which the fre- individuals are haploid, and there are multiple alleles. The quencies of all four alleles are equal to 1/2, heterozygotes progress toward speciation in this model will be apparent are in de®ciency and there is a statistical association between from the formation of discrete genotypic clusters and from the alleles in different loci. The two equilibria differ in the the degree of reproductive isolation between members of dif- sign of linkage disequilibrium DЈ and which one is ap- ferent clusters. proached by the population depends on initial conditions. At these equilibria, the population splits into two genetic clusters The Udovic model which arises within the population sympatrically and the loci mutually reinforce their effects leading to stronger repro- Udovic's paper published in 1980 was the only one in ductive isolation between the clusters. Evolution toward such almost a 40-year-long time interval that got very close to an equilibrium represents (a step toward) sympatric specia- ®nding conditions for sympatric speciation analytically in a tion. nontrivial model. However, both his approach and results No linkage. If the loci are unlinked (r ϭ 1/2), sympatric were mainly ignored, being overshadowed by other papers speciation occurs if describing numerical simulations. In this section, I describe and extend Udovic's ground-breaking results. ␣ϩS Ͼ 1. (8) Consider a very large population with discrete nonover- This condition has a very simple biological interpretation. lapping generations. There are two diallelic possibly linked Considering each locus in isolation, conditions S ϭ 1 and ␣ loci with r being the recombination rate. The ®rst locus with ϭ 1 imply complete postmating isolation and complete pre- alleles A and a will be called the assortative mating locus or mating isolation, respectively. Thus, equation (8) tells us that the AM locus. The AM locus controls assortative mating sympatric speciation occurs only if the cumulative strength according to the symmetric version of the O'Donald model of selection against hybrids and the strength of assortative (O'Donald 1960). That is, each individual mates with prob- mating, S ϩ␣, is larger than the threshold strength for each ability ␣ with another individual that has the same genotype of them to cause complete reproductive isolation when acting at the AM-locus. With probability 1 Ϫ␣the individual is in isolation. engaged in random mating. If ␣ϭ0, the population is ran- If condition (8) is satis®ed, then at equilibrium the index domly mating. If ␣ϭ1, the population is split into three of heterozygotes de®ciency in the DS locus is reproductively isolated genotypic clusters corresponding to S ϩ␣Ϫ1 the three genotypes AA, Aa, and aa, of which the cluster of I ϭ . heterozygotes will gradually disappear. In the O'Donald B S 2210 SERGEY GAVRILETS

FIG. 8. The Udovic model. (a) Conditions for sympatric speciation. Speciation occurs for parameter values above the corresponding curve. (b) Isolation index IB for ␣ϭ0.50. The recombination rates are given next to the corresponding curves.

Notice that the condition for sympatric speciation (8) coin- common alleles. For example, genotypes using an under- cides with the condition for IB to be positive. explored resource can get a competitive advantage over other The index of heterozygote de®ciency at the AM locus is genotypes. In this model the equilibrium with allele fre- quencies p ϭ p ϭ 1/2 exists and is stable for any s Ͼ 0. (S ϩ 2␣)␣2 B b At this equilibrium the relative ®tness loss of heterozygotes IA ϭ 22 . ␣ S ϩ␣ ϩ␣S Ϫ S ϩ 1 is S ϭ s/(2 ϩ s) and condition (8) for sympatric speciation The normalized linkage disequilibrium is is 2 ͙IIAB DЈϭϯ . ␣Ͼ . ␣ 2 ϩ s Note that the maximum possible value for each of these three For example, if s ϭ 1 (that is, selection is weak), then ␣ has measures is observed at S ϭ 1 and is equal to ␣. to be larger than 0.67 (that is, strong assortativeness in mating Linkage. With linkage (i.e., if r Ͻ 1/2), sympatric spe- is required) whereas if s ϭ 8 (that is, selection is strong), ciation occurs if then ␣ has to be larger than 0.20 (that is, weak assortativeness in mating is suf®cient). Alternatively, if ␣ϭ0.2 (that is, 2r(1 Ϫ S)(2 Ϫ S) ␣Ͼ . (9) weak assortativeness in mating), then s has to be larger than S ϩ 4r(1 Ϫ S) 8 (that is, strong selection is required) and if ␣ϭ0.67 (strong assortativeness in mating), then s has to be larger than 1 (i.e., The equilibrium values of I ,I and DЈ are too cumbersome A B weak selection is suf®cient). Figure 9 illustrates the distri- to be given here. Figure 8 illustrates the effects of parameters butions of genotype frequencies for three different combi- on the possibility of sympatric speciation and the degree of nations of parameters obtained by numerical iterations of resulting genetic differentiation. Close linkage (i.e., small r) dynamic equations. and strong selection are known to both make sympatric spe- A somewhat inconspicuous feature of the Udovic model ciation easier and result in stronger genetic differentiation inherited from the O'Donald model of assortative mating is (Dickinson and Antonovics 1973; Udovic 1980; Felsenstein that organisms pay no costs for being ``choosy'' no matter 1981). The inequality (9) and Figure 8 provide a quantitative how rare their preferred mates are. Under this condition the description of these effects. I note that the equation for I A Udovic model shows that sympatric speciation is possible if was given in the appendix of Udovic's original paper. The disruptive selection and/or assortativeness in mating are fact that positivity of I is required for the stability of the B strong enough. Closer linkage between the AM locus and the equilibrium is also apparent from the numerical simulations DS locus promote sympatric speciation. reported by him. As an example, let us consider the case of unlinked loci Sexual con¯ict (r ϭ 1/2) assuming linear frequency-dependent selection with ®tnesses In the model considered in the previous section sympatric speciation resulted as an outcome of disruptive natural se- W ϭ 1 ϩ sp , W ϭ 1, W ϭ 1 ϩ sp , BB b Bb bb B lection emerging from ecological interactions between in- where s Ͼ 0. This is one of the simplest cases of frequency- dividuals. In this section I discuss a model where sympatric dependent selection (e.g., Gavrilets and Hastings 1995, 1998) speciation is a consequence of sexual selection. which implies that rare alleles have a ®tness advantage over Consider a large sexual haploid population with distinct MODELS OF SPECIATION 2211

FIG. 9. The distributions of genotype frequencies after 200 generations starting with a random distribution. The loci are unlinked (r ϭ 1/2) and s ϭ 2, S ϭ 0.5. The frequencies of double heterozygotes AB/ab and Ab/aB are pulled together. (a) ␣ϭ0.4 so that condition (8) is not satis®ed and no association of A and B evolves (IA ϭ 0.25, IB ϭ 0, DЈϭ0). (b) ␣ϭ0.6 so that condition (8) is satis®ed but the association of A and B is weak (IA ϭ 0.46, IB ϭ 0.20, DЈϭ0.50). (c) ␣ϭ0.8 so that condition (8) is satis®ed and the association of A and B is strong (IA ϭ 0.72, IB ϭ 0.60, DЈϭ0.82). nonoverlapping generations. We concentrate on two possibly rilets and Waxman 2002). For example, in sea urchins, egg linked multiallelic loci assuming step-wise mutation. The al- ®tness is maximized at a level of sperm density which is leles Ai at the ®rst locus are only expressed in females (or much smaller than levels common under natural conditions eggs), and the alleles Bj at the second locus are only expressed (Franke et al. 2002). In our model, female mating rate is in males (or sperm). The genotype space in this model is directly proportional to the proportion of compatible males identical to that in the Russian Roulette model considered and the assumption Popt Ͻ 1 formalizes the idea of sexual above. con¯ict over mating rate because for the males, it is optimal I will say that two individuals are ``compatible'' if mating, to have Popt ϭ 1 since then all females are susceptible to fertilization, and offspring development are not prevented by fertilization by any male. To clarify the implications of the isolating mechanisms. Assume that the probability ⌿ij that a above assumptions, assume that the population is mono- female (or egg) carrying an Ai allele is compatible with a morphic for male allele Bj. Then Pi ϭ⌿ij and the females male (or sperm) carrying a Bj allele is that have the optimum mating rate and the highest overall ®tness are those for which ⌿ ϭ P . Using the de®nition (i Ϫ j)2 ij opt ⌿ϭexp Ϫ . of ⌿ it is easy to see that there are two such female alleles: ij 2␴ 2 [] Ajϩ␦ and AjϪ␦ both at distance ␦ from the male allele, where This choice of function ⌿ij formalizes a general observation that successful reproduction requires certain ``matching'' of ␦ϭ␴͙ln(PϪ2 ). male and female genes (Tregenza and Wedell 2000). Spe- opt ci®cally, function ⌿ij is maximized for all pairs of female This simple model exhibits three general dynamic regimes and male genes with i ϭ j. The coef®cient ␴ controls the (Gavrilets and Waxman 2002). The ®rst regime is an endless tolerance of matching: small ␴ implies that matching has to coevolutionary chase between the sexes in which females be pretty good for successful mating, and large ␴ implies continuously evolve to decrease the mating rate while males that even when the value of ͦi Ϫ jͦ is large, mating can happen continuously evolve to increase it (Holland and Rice 1998; easily. Gavrilets 2000b; Gavrilets et al. 2001; Gavrilets and Waxman Let Pi be the proportion of the males in the population that 2002). In this regime, there is a dynamic compromise between are compatible with females carrying allele Ai. For each fe- the sexes, and the proportion of compatible pairs is inter- male genotype the value of Pi can be found if one knows the mediate between Popt and 1. The coevolutionary chase is frequencies of male genotypes in the population. Pi can also generically observed if the level of genetic variation is not be thought of as a proxy of a female mating rate. We assume too large. that males can be involved in multiple mating and that they Here we are concerned with two other regimes observed compete for fertilization opportunities. Although males gen- when the population size or mutation rates are suf®ciently erally bene®t from high mating rates, in many situations mul- large. In the Buridan's Ass regime (Gavrilets and Waxman tiple matings reduce female viability and total ®tness, that 2002) there is very low variation in male alleles maintained is, there is a con¯ict between the sexes with regard to the by mutation whereas female alleles split into two clusters optimum mating rate (Arnqvist and Rowe 1995; Chapman both at the optimum distance ␦ from the male allele (see Fig. and Partridge 1996; Rice 1996; Holland and Rice 1998; Park- 10a). In this regime, males get trapped between the two fe- er and Partridge 1998). To formalize this fact we assume that male subclusters and have relatively low mating success. the overall probability that an Ai female leaves offspring is In the sympatric speciation regime (Gavrilets and Waxman a unimodal function of Pi that reaches a maximum at a certain 2002) males answer the diversi®cation of females by diver- value Popt Ͻ 1 (Gavrilets 2000b; Gavrilets et al. 2001; Gav- sifying themselves and splitting into two clusters that start 2212 SERGEY GAVRILETS

FIG. 10. Population genetic states in the sexual con¯ict model. Parameters: mutation rate ␮ϭ10Ϫ5, recombination rate r ϭ 0.5, ␴2 ϭ 2 10, and female overall ®tness wf ϭ exp(Ϫ(P Ϫ Popt) ). (a) The Buridan's Ass regime (Popt ϭ 0.6; the average of value of Pi is 0.64; there is a single male allele B0 and two female alleles A3 and AϪ3); 5000 generations of selection. (b) The sympatric speciation regime (Popt ϭ 0.4; the average of values of Pi is 0.42); 8000 generations of selection. evolving toward the corresponding female clusters. As a re- si®cation induced by sexual con¯ict. In this model selection sult, the initial population splits into different genetic clusters does not have to overcome the homogenizing effect of re- (species) that are reproductively isolated and which have combination that otherwise can prevent sympatric speciation emerged sympatrically (see Fig. 10b). The regime of coevo- (Udovic 1980; Felsenstein 1981; Rice 1984). lutionary chase within-species ends after increasing genetic variation in female alleles leads to the splitting of female CONCLUSIONS alleles into two subclusters within each species. By contrast, genetic variation in male alleles remains very low within each The theory of speciation needs many more simple and general analytical results allowing for transparent biological species. At equilibrium, female Pi values are close to Popt whereas males get trapped between two female subclusters interpretation. In spite of the complexity of the processes and have low mating success. leading to speciation, analytical approaches can be successful as demonstrated above. The major shift in the focus of the- The probability ⌿ij can be written as a function ⌿(d)of the ``genetic distance'' d ϭ i Ϫ j between the female and oretical studies needs to be from the demonstration that spe- male alleles. In the limit of very low mutation rates, sympatric ciation is possible in a speci®c scenario (which has so far speciation occurs if been the major goal of speciation models) to answering much more detailed questions about the probability of speciation, ⌿(␦Ϫ1) ϩ⌿(␦ϩ1) Ͼ 2⌿(␦). (10a) the waiting time to speciation, the duration of speciation, the If ⌿ is a Gaussian function with zero mean and variance ␴2, degree of genetic and phenotypic divergence between sister this inequality can be rewritten as species, the way different resources (including space) are partitioned between the sister species, etc. Several important ␴Ͻ␦ (10b) generalizations about speciation have already emerged from (Gavrilets and Waxman 2002). If the above conditions are not analytical models. satis®ed, the population stays in the Buridan's Ass regime. (1) Unless the population size is small and the adaptive valley is shallow, the waiting time to a stochastic transition Sympatric speciation requires small values of Popt implying that sexual con¯ict over mating rates must be strong. For ex- between the adaptive peaks is extremely long. This implies that it is very unlikely that a single peak shift will result in ample, if Popt ϭ 0.67, then ␦ϭ3 and condition (10a) is not strong reproductive isolation. If transition does happen, it is satis®ed (see Fig. 10a). If Popt ϭ 0.45, then ␦ϭ4 and condition (10a) is satis®ed (see Fig. 10b). Suf®ciently small values of very quick. This implies that observing it ``in action'' is Popt can result in more than two species emerging sympatri- practically impossible. cally (Gavrilets and Waxman 2002). The above results are not (2) Most species consist of geographically structured pop- affected by the recombination rate between the loci. ulations. Different mutations are expected to appear ®rst and In this model sympatric speciation requires suf®ciently increase in frequency in different populations necessarily re- strong selection (i.e., small Popt) and suf®ciently strong as- sulting in some geographic differentiation even without any sortativeness in mating (i.e., small ␴). In contrast to the Udov- variation in local selection regimes. Speciation can occur by ic model, costs of being choosy are explicitly included in the mutation and random drift alone with no contribution from sexual con¯ict scenario. That sympatric speciation still oc- selection as different populations accumulate incompatible curs is explained by the fact that the loci underlying repro- genes. This claim is based on models describing all three geo- ductive isolation also experience direct selection for diver- graphic modes (allopatric, parapatric, and sympatric). In the MODELS OF SPECIATION 2213 same way as divergence by random drift and mutation rep- tutes of Health grant GM56693 and by National Science resents a null model of , speciation by Foundation grant DEB-0111613. random drift and mutation represents a null model of speci- ation. (Note that Wright himself believed that ``the principal LITERATURE CITED evolutionary mechanism in the origin of species must . . . be Arnqvist, G., and L. Rowe. 1995. Sexual con¯ict and arms races an essentially nonadaptive one'' [Wright 1932, p. 364]). between the sexes: a morphological adaptation for control of (3) The importance of mutations and drift in speciation is mating in a female insect. Proc. R. Soc. Lond. B 261:123±127. augmented by the general structure of adaptive landscapes. Avise, J. C. 2000. . Harvard Univ. Press, Cam- bridge, MA. Organisms have thousands of genes and millions of nucle- Balkau, B. J., and M. W. Feldman. 1973. Selection for migration otides. Although divergence in a small number of genes lead- modi®cation. Genetics 74:171±174. ing to strong (or complete) reproductive isolation is possible, Barton, N. H., and B. O. Bengtsson. 1986. The barrier to genetic in most cases divergence will include multiple genes. The- exchange between hybridizing populations. Heredity 56: oretical considerations of multidimensional genotype spaces 357±376. Barton, N. H., and B. Charlesworth. 1984. Genetic revolutions, have led to the realization that their properties are quite dif- founder effects, and speciation. Annu. Rev. Ecol. Syst. 15: ferent from those of spaces of low dimensionality. In partic- 133±164. ular, the multidimensional genotype spaces are characterized Barton, N. H., and S. Rouhani. 1987. The frequency of shifts be- by the existence of nearly neutral networks and holey adap- tween alternative states. J. Theor. Biol. 125:397±418. Bateson, W. 1909. Heredity and variation in modern lights. Pp. 85± tive landscapes. Speciation can be understood as the diver- 101 in A. C. Seward, ed. Darwin and modern science. Cambridge gence along these networks (driven by mutation, drift, and Univ. Press, Cambridge, U.K. selection for adaptation to a local biotic and/or abiotic en- Bazykin, A. D. 1965. On the possibility of sympatric species for- vironment) accompanied by the accumulation of reproductive mation (in Russian). Byulleten' Moskovskogo Obshchestva Is- pytateley Prirody. Otdel Biologicheskiy 70:161±165. isolation as a by-product. ÐÐÐ. 1969. Hypothetical mechanism of speciation. Evolution 23: (4) The waiting time to speciation driven by mutation and 685±687. drift is typically very long. Migration can signi®cantly delay Bengtsson, B. O. 1985. The ¯ow of genes through a genetic barrier. speciation. However, selection for local adaptation (either Pp. 31±42 in J. J. Greenwood, P. H. Harvey, and M. Slatkin, eds. Evolution Essays in Honor of John Maynard Smith. Cam- acting directly on the loci underlying reproductive isolation bridge Univ. Press, Cambridge, U.K. via their pleiotropic effects or acting indirectly via estab- Chapman, T., and L. Partridge. 1996. Female ®tness in lishing a genetic barrier to gene ¯ow) can signi®cantly de- melanogaster: An interaction between the effect of nutrition and crease the waiting time to speciation. Direct selection is much of encounter rate with males. Proc. R. Soc. Lond. B 263: 755±759. more effective than indirect selection. In the parapatric case Coyne, J. A., and H. A. Orr. 1998. The evolutionary genetics of the average actual duration of speciation is much shorter than speciation. Philos. Trans. R. Soc. Lond. B 353:287±305. the average waiting time to speciation. Crosby, J. L. 1970. The evolution of genetic discontinuity: computer (5) Theoretical studies predict extreme sensitivity of the models of the selection of barriers to interbreeding between sub- species. Heredity 25:253±297. probability of speciation and the waiting time to speciation on Dickinson, H., and J. Antonovics. 1973. Theoretical considerations model parameters which in turn strongly depend on the envi- of sympatric divergence. Am. Nat. 107:256±274. ronmental conditions. This suggests that in general speciation Dobzhansky, T. G. 1937. Genetics and the origin of species. Co- is triggered by changes in the environment. Theoretical studies lumbia Univ. Press, New York. Edmands, S. 2002. Does parental divergence predict reproductive also show that once genetic changes underlying speciation start, compatibility? Trends Ecol. Evol. 17:520±527. they go to completion very rapidly. This is so both for changes Eldredge, N. 1971. The allopatric model and phylogeny in Paleozoic driven by strong selection and for changes driven by weak invertebrates. Evolution 25:156±167. stochastic factors. Thus, the quantitative theory predicts the ÐÐÐ. 2003. The sloshing bucket: how the physical realm controls short duration of intermediate stages in speciation and the dif- evolution. Pp. 3±32 in J. Crutch®eld and P. Schuster, eds. To- wards a comprehensive dynamics of evolution - exploring the ®culties of observing their traces in the fossil record. interplay of selection, neutrality, accident, and function. Oxford (6) Sympatric speciation is possible if disruptive selection University Press, New York. and/or assortativeness in mating are strong enough. Sympatric Eldredge, N., and S. J. Gould. 1972. Punctuated equilibria: an al- speciation is promoted if costs of being choosy are small (or ternative to phyletic gradualis. Pp. 82±115 in T. J. Schopf, ed. Models in paleobiology. Freeman, Cooper, San Francisco. absent) and if linkage between the loci controlling assortative Endler, J. A. 1977. Geographic variation, speciation, and clines. mating and those experiencing disruptive selection is close. Princeton Univ. Press, Princeton, NJ. These are but a few initial steps on a long road. A major Engels, F. 1940. Dialectics of nature. International Publishers, New future challenge for theoretical speciation research is to de- York. Ewens, W. J. 1979. Mathematical population genetics. Springer- velop a comprehensive dynamical theory of speciation and Verlag, Berlin. to link microevolutionary processes with macroevolutionary Felsenstein, J. 1981. Skepticism towards Santa Rosalia, or why are patterns observed in the fossil record. there so few kinds of animals? Evolution 35:124±138. Fisher, R. A. 1930. The genetical theory of . Oxford Univ Press, Oxford, U.K. ACKNOWLEDGMENTS ÐÐÐ. 1950. Gene frequencies in a determined by selection and diffusion. Biometrics 6:353±361. I am really grateful to N. H. Barton, C. R. B. Boake, M. Florin, A.-B., and A. OÈ deen. 2002. Laboratory environments are Saum, and an anonymous reviewer for extremely helpful not conducive for allopatric speciation. J. Evol. Biol. 15:10±19. comments of the manuscript. Supported by National Insti- Franke, E. S., C. A. Styan, and R. Babcock. 2002. Sexual con¯ict 2214 SERGEY GAVRILETS

and polyspermy under sperm-limited conditions: in situ evidence lation between stable phenotypic states. Proc. Nat. Acad. Sci. from ®eld simulations with the free-spawning marine echinoid USA 82:7641±7645. Evechinus chloroticus. Am. Nat. 160:485±496. Maynard Smith, J. 1962. Disruptive selection, and Futuyma, D. J. 1998. Evolutionary biology. Sinauer, Sunderland, sympatric speciation. Nature 195:60±62. MA. ÐÐÐ. 1966. Sympatric speciation. Am. Nat. 100:637±650. Gavrilets, S. 1997a. Hybrid zones with epistatic selection of Dob- Mayr, E. 1942. and the origin of species. Columbia zhansky type. Evolution 51:1027±1035. Univ. Press, New York. ÐÐÐ. 1997b. Evolution and speciation on holey adaptive land- Muller, H. J. 1942. Isolating mechanisms, evolution, and temper- scapes. Trends Ecol. Evol. 12:307±312. ature. Biol. Symp. 6:71±125. ÐÐÐ. 1999. A dynamical theory of speciation on holey adaptive Naveira, H. F., and Masida, X. R. 1998. The genetic of hybrid male landscapes. Am. Nat. 154:1±22. sterility in Drosophila. Pp. 330±338 in D. J. Howard and S. H. ÐÐÐ. 2000a. Waiting time to parapatic speciation. Proc. R. Soc. Berrocher, eds. Endless forms: species and speciation. Oxford Lond. B 267:2483±2492. Univ. Press, New York. ÐÐÐ. 2000b. Rapid evolution of reproductive isolation driven by Nei, M. 1976. Mathematical models of speciation and genetic dis- sexual con¯ict. Nature 403:886±889. tance. Pp. 723±768 in S. Karlin and E. Nevo, eds. Population ÐÐÐ. 2003. Evolution and speciation in a hyperspace: the roles genetics and ecology. Academic Press, New York. of neutrality, selection, mutation, and random drift. Pp. 135± Nei, M., T. Maruyama, and C.-I. Wu. 1983. Models of evolution 162 in J. Crutch®eld and P. Schuster, eds. Towards a compre- of reproductive isolation. Genetics 103:557±579. hensive dynamics of evolution - exploring the interplay of se- OÈ deen, A., and A.-B. Florin. 2000. Effective population size may lection, neutrality, accident, and function. Oxford Univ. Press, limit the power of laboratory experiments to demonstrate sym- New York. patric and parapatric speciation. Proc. R. Soc. Lond. B 267: Gavrilets, S., and M. B. Cruzan. 1998. Neutral gene ¯ow across 601±606. single locus clines. Evolution 52:1277±1284. ÐÐÐ. 2002. Sexual selection and : the Ka- Gavrilets, S., and J. Gravner. 1997. Percolation on the ®tness hy- neshiro model revisited. J. Evol. Biol. 15:301±306. percube and the evolution of reproductive isolation. J. Theor. O'Donald, P. 1960. Assortative mating in a population in which Biol. 184:51±64. two alleles are segregating. Heredity 15:389±396. Gavrilets, S., and A. Hastings. 1995. Intermittency and transient Ogden, R., and R. S. Thorpe. 2002. Molecular evidence for eco- chaos from simple frequency-dependent selection. Proc. R. Soc. logical speciation in tropical . Proc. Nat. Acad. Sci. USA Lond. B 261:233±238. 99:13612±13615. ÐÐÐ. 1998. Coevolutionary chase in two-species systems with Orr, H. A. 1995. The population genetics of speciation: the evo- applications to mimicry. J. Theor. Biol. 191:415±427. lution of hybrid incompatibilities. Genetics 139:1803±1813. Gavrilets, S., and D. Waxman. 2002. Sympatric speciation by sexual Orr, H. A., and L. H. Orr. 1996. Waiting for speciation: the effect con¯ict. Proc. Nat. Acad. Sci. USA 99:10533±10538. of population subdivision on the waiting time to speciation. Evo- Gavrilets, S., H. Li, and M. D. Vose. 1998. Rapid parapatric spe- lution 50:1742±1749. ciation on holey adaptive landscapes. Proc. R. Soc. Lond. B 265: Orr, H. A., and D. C. Presgraves. 2000. Speciation by postzygotic 1483±1489. isolation: forces, genes, and molecules. BioEssays 22: ÐÐÐ 2000. Patterns of parapatric speciation. Evolution 54: 1085±1094. 1126±1134. Orr, H. A., and M. Turelli. 2001. The evolution of postzygotic Gavrilets, S., G. Arnqvist, and U. Friberg. 2001. The evolution of isolation: accumulating Dobzhansky-Muller incompatibilities. female mate choice by sexual con¯ict. Proc. R. Soc. Lond. B Evolution 55:1085±1094. 268:531±539. Parker, G. A., and L. Partridge. 1998. Sexual con¯ict and speciation. Gould, S. J. 2002. The structure of evolutionary theory. Harvard Philos. Trans. R. Soc. Lond. B 353:261±274. University Press, Cambridge, MA. Rice, W. R. 1984. Disruptive selection on preferences and Gradshteyn, I. S., and I. M. Ryzhik. 1994. Tables of integrals, series, the evolution of reproductive isolation. Evolution 38: and products. Academic Press, San Diego, CA. 1251±1260. Grif®ths, A. J. F., J. H. Miller, D. T. Suzuki, R. C. Lewontin, and ÐÐÐ. 1996. Sexually antagonistic male adaptation triggered by W. M. Gelbart. 1996. An introduction to genetic analysis. W. experimental arrest of female evolution. Nature 381:232±234. H. Freeman, New York. Rice, W. R., and Hostert, E. E. 1993. Laboratory experiments on Grimmett, G. 1989. Percolation. Springer-Verlag, Berlin. speciation: what have we learned in 40 years? Evolution 47: Haldane, J. B. S. 1931. A mathematical theory of natural and ar- 1637±1653. ti®cial selection. V. Metastable populations. Proc. Camb. Philos. Schluter, D. 2000. The ecology of . Oxford Univ. Soc. 23:838±844. Press, Oxford, U.K. ÐÐÐ. 1948. The theory of a cline. J. Genet. 48:277±284. Schneider, C. J., T. B. Smith, B. Larison, and C. Moritz. 1999. A Hegel, G. W. F. 1975. Hegel's logic: being part one of the ency- test of alternative models of diversi®cation in tropical rainfo- clopaedia of the philosophical sciences (1830). Clarendon Press, rests: ecological gradients vs. rainforest refugia. Proc. Nat. Acad. Oxford, U. K. Sci. USA 96:13869±13873. Holland, B., and W. R. Rice. 1998. Chase-away sexual selection: Singh, R. S. 1990. Patterns of species divergence and genetic the- antagonistic seduction versus resistance. Evolution 52:1±7. ories of speciation. Pp. 231±265 in K. WoÈhrmann and C. K. Jain, Kimura, M. 1985. The role of compensatory neutral mutations in eds. Population biology ecological and evolutionary viewpoints. molecular evolution. J. Genet. 64:7±19. Springer-Verlag, Berlin. Kirkpatrick, M., and V. RavigneÂ. 2002. Speciation by natural and Smith, H. M. 1955. The perspective of species. Turtox News 33: sexual selection: models and experiments. Am. Nat. 159: 74±77. S22±S35. ÐÐÐ. 1969. Parapatry: or allopatry? Systematic Zo- Kondrashov, A. S., and S. I. Mina. 1986. Sympatric speciation: ology 18:254±255. When is it possible? Biol. J. Linn. Soc. 27:201±223. Smith, T. B., R. K. Wayne, D. J. Girman, and M. W. Bruford. 1997. Koopman, K. F. 1950. Natural selection for reproductive isolation A role for ecotones in generating rainforest . Science between and Drosophila persimilis. 276:1855±1857. Evolution 4:135±148. Tregenza, T., and N. Wedell. 2000. Genetic compatibility, mate Lande, R. 1979. Effective deme size during long-term evolution choice and patterns of parentage. Mol. Ecol. 9:1013±1027. estimated from rates of chromosomal rearrangement. Evolution Turelli, M., N. H. Barton, and J. A. Coyne. 2001. Theory and spe- 33:234±251. ciation. Trends Ecol. Evol. 16:330±343. ÐÐÐ. 1985. Expected time for random genetic drift of a popu- Udovic, D. 1980. Frequency-dependent selection, disruptive selec- MODELS OF SPECIATION 2215

tion, and the evolution of reproductive isolation. Am. Nat. 116: ÐÐÐ. 1932. The roles of mutation, inbreeding, crossbreeding and 621±641. selection in evolution. Pp. 356±366 in Proceedings of the Sixth Via, S. 2001. Sympatric speciation in animals: the ugly duckling International Congress on Genetics. Vol 1. Brooklyn Botanic grows up. Trends Ecol. Evol. 16:381±390. Garden, Brooklyn, NY. Vrba, E. S. 1985. Environment and evolution: alternative causes of ÐÐÐ. 1941. On the probability of ®xation of reciprocal trans- the temporal distribution of evolutionary events. S. Afr. J. Sci. locations. Am. Nat. 75:513±522. 81:229±236. ÐÐÐ. 1982. The shifting balance theory and . Walsh, J. B. 1982. Rate of accumulation of reproductive isolation Annu. Rev. Genet. 16:1±19. by rearrangements. Am. Nat. 120:510±532. Wu, C.-I. 2001. The genic view of the process of speciation. J. Wilson, R. A. 1999. Species. New interdisciplinary essays. MIT Evol. Biol. 14:851±865. Press, Cambridge, MA. Wu, C.-I., and M. F. Palopoli. 1994. Genetics of postmating re- Wright, S. 1921. Systems of mating. III. Assortative mating based productive isolation in animals. Annu. Rev. Genet. 27:283±308. on somatic resemblance. Genetics 6:144±161. ÐÐÐ. 1931. Evolution in Mendelian populations. Genetics 16: 97±159. Corresponding Editor: J. Mitton

APPENDIX Notations used herein: The ®rst part of the appendix contains variables used throughout the paper; the following ®ve parts describe variables unique to the ®ve major sections of the paper.

A, B, a, b, Ai, Bi Alleles at different loci T, ␶ Average waiting time to speciation and average duration of speciation ␮, ␯ The rate of mutation per locus and per gamete r, m, N, w Recombination rate, migration rate, population size, ®tness s, S, Sa (ϭ4Ns) Different selection coef®cients z, G Average and variance of an additive quantitative trait ␥ The gene ¯ow factor a, b Coef®cients characterizing reduction in fertility

P, Pc Probability of being viable and the threshold value of P L, A Number of loci and number of alleles d Number of diverged loci k, q Number of alleles in an incompatible set and probability of incompatibility C Number of incompatibilities necessary for speciation J Expected number of incompatibilities K, CVK Average and coef®cient of variation of the number of substitutions required for speciation ␻ Rate of substitutions CVT Coef®cient of variation of the time to speciation

I, IA, IB Heterozygote de®ciency indices p, pA, pa, pb, pB Allele frequencies y Frequency of heterozygotes D, DЈ Linkage disequilibrium and normalized linkage disequilibrium ␣ Probability of assortative mating ⌿ij Preference function Pi Proportion of compatible males maximizing female ®tness Ai Popt Optimum proportion of males compatible with a female ␦ Genetic distance resulting in Popt