Genet. Res., Camb. (1989), 54, pp. 59-77. With 8 text-figures. Printed in Great Britain

The divergence of a polygenic system subject to stabilizing selection, mutation and drift

NICK BARTON
Department of Genetics and Biometry, University College London, 4 Stephenson Way, London NW1 2HE
(Received 4 December 1987 and in revised form 26 January 1989)

Summary

Polygenic variation can be maintained by a balance between mutation and stabilizing selection. When the alleles responsible for variation are rare, many classes of equilibria may be stable. The rate at which drift causes shifts between equilibria is investigated by integrating the gene frequency distribution $\bar W^{2N}\prod_i (p_i q_i)^{4N\mu-1}$. This integral can be found exactly, by numerical integration, or can be approximated by assuming that the full distribution of allele frequencies is approximately Gaussian. These methods are checked against simulations. Over a wide range of population sizes, drift will keep the population near an equilibrium which minimizes the genetic variance and the deviation from the selective optimum. Shifts between equilibria in this class occur at an appreciable rate if the product of population size and selection on each locus is small ($Ns\alpha^2 < 10$). The Gaussian approximation is accurate even when the underlying distribution is strongly skewed. Reproductive isolation evolves as populations shift to new combinations of alleles: however, this process is slow, approaching the neutral rate ($\approx\mu$) in small populations.

1. Introduction

How is polygenic variation maintained, and how does it evolve? How does a population separate into two incompatible species? How can a population search through the vast space of possibilities to find better adapted combinations of alleles? How is selection on the whole organism reflected in the evolution of individual genes? Because many genes interact to produce the observable phenotype, these questions have proved hard to approach, either empirically or theoretically.

In this paper, I will analyse the effects of sampling drift on polygenic variation which is maintained by a balance between mutation and stabilizing selection. A character is determined by the sum of effects of a large number of genes, each of which segregates for two alleles. Stabilizing selection tends to reduce variation, and is strong enough that each gene is close to fixation. This model was first proposed by Wright (1935); it has since been elaborated by Latter (1960), Bulmer (1972, 1980), Kimura (1981), Barton (1986), Hastings (1987), and Bürger et al. (1988). Despite its simplicity, it is relevant to each of the issues raised above.

Since most quantitative characters must be subject to stabilizing selection, at least in the long run, some other force must maintain variability. The simplest possibility is that mutation alone is responsible for the high heritabilities which are found in nature (Lande, 1975). If this is so, one might hope that the genetic variance could be predicted from measurable quantities, such as the total mutation rate, the effective number of loci, and the strength of selection. Unfortunately, predictions depend on the distribution of allelic effects at individual loci: if a large number of alleles segregate, giving an approximately Gaussian distribution, then the variance is proportional to the square root of the ratio between mutation rate and selection pressure, whereas if variation depends on rare alleles with large effects, it is directly proportional to this ratio (Turelli, 1984). In general, the evolution of the mean and variance of a character (which are usually all that can be measured) depends on the higher moments of the underlying distribution of allelic effects: only by making strong and unrealistic assumptions can a closed set of equations be obtained (Barton & Turelli, 1987).

Even when all the loci are close to fixation, as in the case considered here, the outcome may still be unpredictable. Barton (1986) showed that many stable equilibria can coexist: the genetic variance can range from that given by the 'rare allele' approximation up to that given by the Gaussian approximation. So, the phenotypic state depends on how populations move between different equilibria.
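The 'rare allele' prediction can be made concrete for the two-allele model analysed in this paper. The following is a sketch under stated assumptions rather than a formula quoted from the text: $n$ loci of equal effect $\alpha$, mutation at rate $\mu$ in each direction, stabilizing selection of strength $s$, the mean at the optimum, and every locus close to fixation. Under those assumptions, selection against the rare allele at each locus balances mutation when

$$\frac{s\alpha^2}{2}\,p_i q_i = \mu \quad\Longrightarrow\quad v = \sum_{i=1}^{n} 2\alpha^2 p_i q_i = \frac{4n\mu}{s},$$

so the variance is directly proportional to $\mu/s$. For example, $n = 100$, $s = 1$ and $\mu = 10^{-4}$ give $v = 0.04$, which agrees with the variance listed for the class [50, 0, 50] in Table 1(a).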

Downloaded from https://www.cambridge.org/core. IP address: 170.106.33.14, on 25 Sep 2021 at 11:07:38, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300028378 N. Barton 60

Since many different combinations of alleles may be able to satisfy the constraints imposed by selection, many distinct stable equilibria will generally be available. Wright (1932; Provine, 1986) illustrated this point using the 'adaptive landscape': a graph of mean fitness plotted against the possible states of the population. In all but the simplest cases, this surface will contain many local 'adaptive peaks'. The central problem for Wright was how populations can move towards superior peaks, despite their tendency to be trapped by local irregularities in the surface. He proposed that sampling drift, or perhaps random fluctuations in selection pressures, could knock populations between peaks, and that, for a variety of reasons, populations would tend to accumulate at the higher peaks. Here, I consider just one factor which favours higher peaks: random transitions from low peaks to high tend to be more likely than those in the opposite direction. In the present model of a quantitative character, the question is whether this tendency is efficient enough that the population will usually be in the equilibrium with lowest variance, and the highest mean fitness. If this is so, then the simple 'rare alleles' approximation can safely be used to predict the evolution of the mean and variance.

Shifts between alternative equilibria are important even if they have no direct phenotypic or adaptive consequences. Even when stabilizing selection is strong enough that the phenotype is confined within narrow limits, shifts between different combinations of alleles may still be possible. This process is of interest from two points of view. First, the rate and pattern of substitution at individual loci will be affected by selection (Kimura, 1981; Hastings, 1987; Foley, 1987): is the observed level of selection on the whole organism consistent with effectively neutral evolution at the molecular level? Second, two populations may eventually come to have very different combinations of alleles, leading to substantial reproductive barriers between them. How rapid is this process, relative to other mechanisms of speciation?

Progress on these questions depends on techniques for analysing stochastic systems with many degrees of freedom. Though such techniques are well developed in other fields, their application to genetical problems requires some innovations. The simple model analysed here is interesting in its own right. However, the main aim is to develop methods which can deal with evolution in many dimensions.

2. Deterministic behaviour

[...] is assumed to be completely heritable: environmental variation would merely weaken selection, and would have no qualitative effect. Individual fitness follows a Gaussian curve centred on some optimum $z_0$, and with variance $1/s$.

Provided that selection is weak, the population can be considered to change approximately continuously in time. If selection is also much slower than recombination, linkage disequilibrium can be neglected (Bulmer, 1980; see Turelli & Barton, 1989, for a general justification). The effects of selection are then given by the general relation:

$$\frac{dp_i}{dt} = \frac{p_i q_i}{2}\,\frac{\partial \log(\bar W)}{\partial p_i} \qquad (1)$$

($p_i$, $q_i$ are the frequencies of the + and − alleles at the $i$th locus).

If the number of loci is large, and selection is weak, then the mean fitness, $\bar W$, of a population with mean $\bar z$ and variance $v$ is approximately:

$$\log(\bar W) = -\frac{s(\bar z - z_0)^2}{2} - \frac{sv}{2} \qquad (2)$$

where

$$\bar z = \sum_i \alpha(p_i - q_i), \qquad v = \sum_i 2\alpha^2 p_i q_i .$$

Including mutation at an equal rate $\mu$ in each direction:

$$\frac{dp_i}{dt} = -\frac{s\alpha^2}{2}\,p_i q_i\,[2\delta + (q_i - p_i)] + \mu(q_i - p_i) \qquad (3)$$

Here, $\delta = (\bar z - z_0)/\alpha$ is the deviation from the optimum relative to the effect of a substitution at a single locus. The equilibria of eqn (3) satisfy a cubic; provided that $\delta$ and $\mu$ are not too large ($|\delta| < \tfrac{1}{2}$, $\mu < s\alpha^2/8$; Barton, 1986), it has three solutions, denoted by $[p, \tilde p, P]$. The equilibria which can be reached by the whole system can be described by the numbers of loci which are at each of these three frequencies, $[m, \nu, M]$ ($m + \nu + M = n$). Each combination $[m, \nu, M]$ represents a class of equilibria with the same overall properties, but includes many different permutations across loci.

The deterministic equilibria can be calculated by solving eqn (3) numerically (Table 1). The nature of the stable equilibria depends on the ratio between mutation and selection at each locus. When this is extremely small ($\mu < s\alpha^2/(n+1)^2$), mutation is negligible: one locus can be kept polymorphic by (effectively) heterozygote advantage, whilst the remainder contribute negligible variance. At the other extreme, when $\mu > s$ [...]


Table 1. Properties of the deterministic equilibria

(a) n = 100, γ = 0.01, z0 = 0, s = 1, α = 0.1

  m  ν   M       p       p̃       P        z̄       v    log(W̄)         U        λ    log(D)
 45  0  55  0.0123       -  0.9018   0.0311  0.1084  -10.9367  -26.9722  -0.0087  -232.578
 45  1  54  0.0123  0.8967  0.9018   0.0311  0.1084  -10.9367  -26.9722   0.0087  -230.567
 46  0  54  0.0124       -  0.9181   0.0307  0.0924   -9.3371  -25.7914  -0.0760  -113.211
 46  1  53  0.0124  0.8750  0.9189   0.0306  0.0924   -9.3345  -25.7979   0.0718  -110.784
 47  0  53  0.0126       -  0.9349   0.0292  0.0762   -7.7107  -24.6551  -0.1670   -70.365
 47  1  52  0.0127  0.8400  0.9366   0.0289  0.0761   -7.7003  -24.6602   0.1466   -67.237
 48  0  52  0.0132       -  0.9517   0.0255  0.0603   -6.1039  -23.6034  -0.3033   -40.404
 48  1  51  0.0135  0.7754  0.9548   0.0244  0.0602   -6.0826  -23.6305   0.2345   -36.050
 49  0  51  0.0150       -  0.9675   0.0169  0.0465   -4.6798  -22.7701  -0.5351   -17.087
 49  1  50  0.0163  0.6292  0.9725   0.0118  0.0470   -4.7224  -22.8978   0.3127   -11.850
 50  0  50  0.0204       -  0.9795   0.0000  0.0400   -4.0000  -22.4207  -0.9200    -6.066
...
 55  0  45  0.0981       -  0.9876  -0.0311  0.1084  -10.9367  -26.9722  -0.0087  -232.578

(b) n = 100, γ = 0.01, z0 = 5, s = 1, α = 0.1

  m  ν   M       p       p̃       P        z̄       v    log(W̄)         U        λ    log(D)
 17  0  83  0.0123       -  0.9029   5.0311  0.1496  -15.0580  -28.9093  -0.0130  -345.123
 17  1  82  0.0123  0.8955  0.9030   5.0311  0.1496  -15.0579  -28.9093   0.0129  -343.073
 18  0  82  0.0124       -  0.9137   5.0309  0.1336  -13.4551  -27.7326  -0.0563  -221.620
 18  1  81  0.0124  0.8820  0.9141   5.0309  0.1335  -13.4532  -27.7328   0.0541  -219.291
 19  0  81  0.0124       -  0.9248   5.0303  0.1172  -11.8182  -26.5705  -0.1088  -165.923
 19  1  80  0.0125  0.8638  0.9256   5.0302  0.1172  -11.8115  -26.5719   0.1007  -163.181
 20  0  80  0.0126       -  0.9361   5.0290  0.1006  -10.1507  -25.4413  -0.1750  -126.223
 20  1  79  0.0127  0.8383  0.9373   5.0288  0.1005  -10.1341  -25.4469   0.1540  -122.831
 21  0  79  0.0130       -  0.9475   5.0267  0.0839   -8.4617  -24.3754  -0.2633   -93.023
 21  1  78  0.0131  0.7996  0.9494   5.0262  0.0835   -8.4263  -24.3929   0.2155   -88.524
 22  0  78  0.0138       -  0.9590   5.0225  0.0672   -6.7755  -23.4274  -0.3910   -62.404
 22  1  77  0.0141  0.7343  0.9618   5.0210  0.0666   -6.7052  -23.4790   0.2832   -55.824
 23  0  77  0.0156       -  0.9702   5.0144  0.0514   -5.1701  -22.7037  -0.5960   -32.316
 23  1  76  0.0170  0.6029  0.9743   5.0094  0.0504   -5.0521  -22.8635   0.3326   -21.706
 24  0  76  0.0209       -  0.9801   4.9987  0.0394   -3.9473  -22.4223  -0.8920    -4.564
 24  1  75  0.0310  0.3272  0.9846   4.9843  0.0414   -4.1741  -22.9494   0.2709     4.329
 25  0  75  0.0382       -  0.9858   4.9789  0.0393   -3.9760  -22.8680  -0.4302     4.823
 25  1  74  0.0654  0.1547  0.9873   4.9708  0.0516   -5.2547  -23.8739   0.1271   -11.661
 26  0  74  0.0686       -  0.9874   4.9704  0.0516   -5.2516  -23.8702  -0.1452   -14.177

(c) n = 100, γ = 0.0025, z0 = 0, s = 1, α = 0.2

  m  ν   M       p       p̃       P        z̄       v    log(W̄)         U        λ    log(D)
 48  0  52  0.0027       -  0.9627   0.0796  0.1596   -4.1502   -9.4002  -0.0580  -115.415
 48  1  51  0.0027  0.9314  0.9634   0.0795  0.1596   -4.1486   -9.4006  -0.0552  -112.680
 49  0  51  0.0029       -  0.9810   0.0704  0.0874   -2.3099   -7.8931  -0.2213   -48.232
 49  1  50  0.0030  0.8486  0.9835   0.0670  0.0871   -2.2902   -7.9114   0.1698   -40.762
 50  0  50  0.0050       -  0.9949   0.0000  0.0400   -1.0000   -6.9915  -0.9800    -0.908
...
 52  0  48  0.0372       -  0.9972  -0.0796  0.1596   -4.1502   -9.4002  -0.0580  -115.415

(d) n = 25, γ = 0.01, z0 = 0, s = 1, α = 0.2

  m  ν   M       p       p̃       P        z̄       v    log(W̄)         U        λ    log(D)
 12  0  13  0.0137       -  0.9577   0.0463  0.0550   -1.4307   -5.8564  -0.3734    -6.732
 12  1  12  0.0204  0.5000  0.9795   0.0000  0.0584   -1.4600   -5.9641   0.1107    -2.869
 13  0  12  0.0422       -  0.9862  -0.0463  0.0550   -1.4307   -5.8564  -0.3734    -6.732

Notation: [m, ν, M], numbers of loci at the three possible equilibria; [p, p̃, P], the corresponding allele frequencies; z̄, v, the mean and variance of the character; log(W̄), U, the log mean fitness, and the potential function which accounts for mutation; λ, the leading eigenvalue (negative for a stable equilibrium); log(D), the log of the absolute value of the determinant of the stability matrix (i.e. the log of the product of the eigenvalues). Where z0 = 0 (a, c, d), the tables are symmetrical; here, only one set of equilibria are shown.
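The symmetric rows of Table 1 can be checked by hand, which also makes the units of the tabulated quantities explicit. The sketch below is illustrative only and rests on several assumptions that are not stated in this form in the text: that eqn (3) reads dp/dt = -(sα²/2)pq[2δ + (q - p)] + μ(q - p), so that p q = 2γ when δ = 0; that log(W̄), U and λ are tabulated in units of sα²/2; that the mutation part of the potential is 2μ Σ log(p_i q_i / 2); and that D = g^(m-1) G^(M-1) times the 2×2 determinant built from g + 4mpq, 4Mpq, 4mPQ and G + 4MPQ. The function name `symmetric_equilibrium` is hypothetical.

```python
import math

def symmetric_equilibrium(n, gamma, alpha):
    """Properties of the stable class [n/2, 0, n/2] when z0 = 0 (n even, gamma < 1/8)."""
    m = M = n // 2
    # With delta = 0 the assumed equilibrium condition reduces to p*q = 2*gamma.
    p = (1.0 - math.sqrt(1.0 - 8.0 * gamma)) / 2.0   # loci near fixation for the - allele
    P = 1.0 - p                                      # loci near fixation for the + allele
    pq = p * (1.0 - p)                               # identical for both sets of loci
    z = alpha * (m * (2.0 * p - 1.0) + M * (2.0 * P - 1.0))  # phenotypic mean
    v = 2.0 * alpha ** 2 * n * pq                    # genetic variance
    g = 1.0 - 6.0 * pq + 4.0 * gamma                 # g = G here, because delta = 0
    log_w = -v / alpha ** 2                          # log mean fitness, units of s*alpha^2/2
    u = log_w + 4.0 * gamma * n * math.log(pq / 2.0) # potential, same units (assumed form)
    det = (g + 4 * m * pq) * (g + 4 * M * pq) - (4 * m * pq) * (4 * M * pq)
    log_d = (n - 2) * math.log(g) + math.log(det)    # log of the stability determinant
    return {"p": p, "P": P, "z": z, "v": v, "log_w": log_w,
            "U": u, "lambda": -g, "log_D": log_d}

# Parameters of Table 1(a) and Table 1(c); compare with the [50, 0, 50] rows.
row_a = symmetric_equilibrium(n=100, gamma=0.01, alpha=0.1)
row_c = symmetric_equilibrium(n=100, gamma=0.0025, alpha=0.2)
```

Only the symmetric class is handled, because δ = 0 there and the cubic has a closed-form solution; the other rows would need δ to be solved for self-consistently.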

far from the optimum ($|\delta| < \tfrac{1}{3}$), the phenotypic variance may differ substantially. In this deterministic case, therefore, the amount of variation that can be maintained by a mutation/selection balance depends on the history of the population.

The stability of the equilibria is determined by the eigenvalues of the matrix $\partial(dp_i/dt)/\partial p_j$, which governs the dynamics near equilibrium. Whenever an equilibrium of the class $[m, 0, M]$ exists, it is stable: all the eigenvalues are negative. The degree of stability can be measured by the product of (minus) these eigenvalues ($D = \det(-\partial(dp_i/dt)/\partial p_j)$); we will see in


the next section that this quantity is important in the stochastic behaviour of the system. For the stable equilibria $[m, 0, M]$:

$$D = g^{m-1} G^{M-1} \det\begin{pmatrix} g + 4mpq & 4Mpq \\ 4mPQ & G + 4MPQ \end{pmatrix} \qquad (4a)$$

where

$$g = 1 - 6pq + 4\mu/s\alpha^2 - 2\delta(p - q), \qquad G = 1 - 6PQ + 4\mu/s\alpha^2 - 2\delta(P - Q)$$

(from eqn (12) in Barton, 1986; note the difference in sign in the definition of $g$, $G$ here).

To move from one stable equilibrium to another, the population must pass through an unstable state. In general, the most likely path for such a transition is via an unstable equilibrium; the chances of such a transition depend on the dynamics near this equilibrium (Barton & Rouhani, 1987). Though I will not attempt a formal proof, it seems certain that the most likely route from an equilibrium of the class $[m, 0, M]$ to an adjacent equilibrium $[m+1, 0, M-1]$ is via the unstable state $[m, 1, M-1]$ (Fig. 5). This corresponds to a shift of one of the loci from near fixation for the + allele, through a polymorphism, to near fixation for the − allele; as this shift occurs, allele frequencies at all the other loci adjust so as to keep the mean close to the optimum.

Numerical results show that whenever an equilibrium of the class $[m, 1, M]$ exists, it is unstable: one eigenvalue is positive, whilst the remaining $(n - 1)$ are negative. The degree of instability can still be measured by the product of (minus) the eigenvalues, which is now negative:

$$D = g^{m-1} G^{M-1} \det\begin{pmatrix} g + 4mpq & 4pq & 4Mpq \\ 4m\tilde p\tilde q & \tilde g + 4\tilde p\tilde q & 4M\tilde p\tilde q \\ 4mPQ & 4PQ & G + 4MPQ \end{pmatrix} \qquad (4b)$$

(where $\tilde g$ is defined in the same way as $g$, $G$ above; $\tilde q = 1 - \tilde p$).

Thus far, the deterministic behaviour has been described by a set of differential equations (3), coupled together by the deviation, $\delta$. The stochastic analysis below rests on the fact that the dynamics of this model can also be described in terms of a potential function, $U$:

$$\frac{dp_i}{dt} = \frac{p_i q_i}{2}\,\frac{\partial U}{\partial p_i} \qquad (5)$$

where $U = \log(\bar W) + 2\mu\log(V)$.

The rate of change of allele frequency is given by the product of the additive genetic variance ($p_i q_i/2$) and the gradient of $U$. The population can be seen as climbing towards local peaks of $U$, on a surface which represents the graph of $U$ plotted against allele frequency. This surface is not quite the same as Wright's 'adaptive topography', because it includes the effects of mutation: peaks of $U$ do not coincide with peaks in mean fitness, because mutation maintains maladaptive polymorphism. The stability of an equilibrium depends on the curvature of the potential surface ($\partial^2 U/\partial p_i \partial p_j$) near a stationary point. This is related to the eigenvalues discussed above:

$$D = K \det(\partial^2 U/\partial p_i \partial p_j). \qquad (6)$$

The deterministic properties of the system are summarised in Table 1, for various parameter combinations.

3. Stochastic behaviour: simple approximations

The three central questions are: first, what is the distribution of the phenotypic mean and variance? Second, what is the chance that a population will be in the vicinity of one deterministic equilibrium out of the many alternatives? Third, how often do populations shift between equilibria? A full answer to these questions must follow the whole set of allele frequencies. However, before beginning the multidimensional analysis, it will be helpful to discuss some simple approximations.

The distribution of the mean and variance

In the limit where the alleles responsible for polygenic variation are rare, and where the distribution of breeding values is symmetric, changes in the mean and variance under selection and mutation are approximated by:

$$\begin{pmatrix} d\bar z/dt \\ dv/dt \end{pmatrix} = \begin{pmatrix} v & 0 \\ 0 & v\alpha^2 \end{pmatrix} \begin{pmatrix} -s(\bar z - z_0) \\ -s/2 \end{pmatrix} + \begin{pmatrix} 0 \\ 2n\mu\alpha^2 \end{pmatrix} \qquad (7a)$$

(from Barton, 1986, eqn 14, setting the third moment to zero).

The effects of drift are given by the same matrix that mediates the effects of selection in eqn (7a) (Barton & Turelli, 1987, eqn (3.7)). The covariance of random fluctuations in the mean and variance is:

$$\begin{pmatrix} E(\delta z^2) & E(\delta z\,\delta v) \\ E(\delta z\,\delta v) & E(\delta v^2) \end{pmatrix} = \frac{1}{N}\begin{pmatrix} v & 0 \\ 0 & v\alpha^2 \end{pmatrix} \qquad (7b)$$

(where $N$ is the effective number of diploid individuals). Sampling drift will also decrease the expected variance by a factor $(1 - 1/2N)$ in each generation.

The expectations of the mean and variance can be found by setting eqn (7a) to zero. The variance of the mean is given by $E(\delta z^2)/2\lambda$, where $\lambda = sv$ is the rate of return to the equilibrium mean.

Similarly, the variance of the variance is $E(\delta v^2)/2\lambda$,


where $\lambda$ is now the rate of return to the equilibrium variance. The $\lambda$ are derived from eqn (7a).

$$E(\bar z) = z_0, \qquad \mathrm{var}(\bar z) = \frac{1}{2Ns}$$

$$E(v) = [\ldots], \qquad \mathrm{var}(v) = [\ldots]$$

[...] give a formula for the single locus case which is valid with recurrent mutation.

4. Stochastic behaviour: multidimensional analysis

Background

[...]


comm.). However, in the particular case of an additive character, an exact solution can be found. The problem is simplified by the fact that we are only interested in a few of the many dimensions. For example, we might be interested in the phenotypic mean and variance. If the phenotypic distribution is approximately normal, and if selection acts only on the polygenic character, then the mean fitness depends only on the mean and variance. Therefore, the integral of $\psi_0$ over allele frequency, conditional on the mean and variance is:

$$\psi_0(\bar z, v) = \bar W^{2N} F(\bar z, v) \qquad (13)$$

where $F(\bar z, v) = \int V^{4N\mu - 1}\,d\mathbf{p}$ (the integral being taken over fixed $\bar z$, $v$).

When $4N\mu < 1$, this integral will include many peaks, corresponding to near fixation for different alleles. It can be split into separate components, $F_m$, as in eqn (11). Equation (13) separates the distribution into a factor reflecting the effects of selection, and a [...] over all possible states (eqn 11) gives the partition function, $Z$. The entropy, $S$, can be defined by $F = \exp(2NS)$; it is a measure of the number of microscopic states (here, allele frequencies) which give a specified macroscopic state (here, $\bar z$, $v$). The distribution of the macroscopic variables (eqn 13) depends on the log mean fitness ('energy') and also on the density of states ('entropy'). One could draw the analogy differently, by equating the whole potential $U$ with the energy, and defining $F$ as $\int V^{-1}\,d\mathbf{p}$; the latter then becomes a purely geometric term. The effects of mutation would then be included with the energy, rather than the entropy. It is not obvious which analogy is more helpful.

Because the trait is assumed to be determined by the sum of effects of the $n$ loci, the distribution in the absence of selection, $F(\bar z, v)$, can be found simply by convolving the separate distributions contributed by each locus. Since we need to know the distribution when the population is in the vicinity of some particular equilibrium of the class $m$, $F$ must be divided into its components, $F_m$. The division is made by dividing each of the $n$ allele frequencies at the appropriate unstable threshold, $\tilde p$. Strictly, one should subdivide the allele frequency space into the domains of attraction of the various equilibria; however, the procedure used here will be a good approximation when the populations cluster close to the equilibria.

The convolution can be carried out by a method based on Fourier transformation. We first represent the constraint of fixed $\bar z$, $v$ by two Dirac delta functions:

$$F(\bar z, v) = \int \delta\Big(\bar z - \sum_j \alpha(p_j - q_j)\Big)\,\delta\Big(v - \sum_j 2\alpha^2 p_j q_j\Big)\,V^{4N\mu - 1}\,d\mathbf{p}. \qquad (14a)$$

Since the delta function can be written as $\delta(x) = \int e^{ix\hat x}\,d\hat x/2\pi$,

$$F(\bar z, v) = \frac{1}{4\pi^2}\iiint \exp\Big[i\hat z\Big(\bar z - \sum_j \alpha(p_j - q_j)\Big)\Big] \exp\Big[i\hat v\Big(v - \sum_j 2\alpha^2 p_j q_j\Big)\Big]\,V^{4N\mu - 1}\,d\mathbf{p}\,d\hat z\,d\hat v. \qquad (14b)$$

This separates into a product over loci:

$$F_m(\bar z, v) = \frac{1}{4\pi^2}\iint e^{i(\hat z \bar z + \hat v v)}\,\big(H_-(\alpha\hat z, \alpha^2\hat v)\big)^m \big(H_+(\alpha\hat z, \alpha^2\hat v)\big)^M\,d\hat z\,d\hat v \qquad (14c)$$

[...] (14d, e)

The functions $H_-$, $H_+$ are the Fourier transforms of the allele frequency distributions at loci near fixation for the − or + alleles, respectively. The limits of integration are at the unstable equilibria which must be crossed by loci passing from a − allele to a + allele ($p_-$), or by loci passing from a + allele to a − allele ($p_+$).

The above derivation uses the fact that the Fourier transform of a convolution is the product of the Fourier transforms of the individual distributions. The method could easily be extended to include dominance, but would be complicated by epistasis. For example, if distinct pairs of loci $(i, j)$ interact, eqn (14c) becomes a product over functions $H$ which are two dimensional integrals over $p_i$ and $p_j$.

Equation 14d, e is similar to those given by Bulmer (1972), who derived expressions for the expected variance in this model. However, there are two key differences. First, Bulmer assumes that the mean always coincides with the optimum, so that the loci evolve independently. Here, the term $\bar W^{2N}$ in eqn (13) introduces a coupling between the loci. Second, Bulmer calculates the overall variance, rather than the variance when the population is near one of the many stable equilibria. Because we are concerned with only a part of the allele frequency space, eqns 14d, e cannot be solved analytically: Bulmer was able to write the expected variance, averaged over all equilibria, in terms of hypergeometric functions.

Equation 14 can be applied to find the probability


that the population is near some equilibrium ($Z_m/Z$), the rate of transition between equilibria ($\Gamma$), the marginal distribution of allele frequency ($\psi_0(p)$), and the distribution of phenotypic mean and variance. This method will be referred to below as 'exact integration'; details are given in Appendix 1. The calculations can be speeded up by assuming that $F(\bar z, v)$ is Gaussian. This leads to simple expressions for the distribution of the phenotypic mean and variance (eqn A 1.8-1.10), and will be referred to as the 'partial Gaussian' approximation.

The full Gaussian approximation

Explicit formulae can be obtained by making two approximations. First, assume that $4N\mu$ is large enough that allele frequency distributions at each locus are approximately Gaussian. (This should not be confused with the approximation that allelic effects are normally distributed (Nagylaki, 1986): that clearly cannot be the case here, since there are only two alleles [...] phenotype in the absence of selection, $F(\bar z, v)$, is Gaussian - the 'partial Gaussian' approximation). Second, instead of integrating over fixed $(\bar z, v)$, the average allele frequencies at the $m$ loci near fixation for the − allele ($\bar p = \sum p_i/m$) and at the $M$ loci near fixation for the + allele ($\bar P = \sum P_i/M$) are fixed. This is less elegant, because it does not lead to a separation between mean fitness and the underlying allele frequencies. However, it does not affect the accuracy of the calculations in this paper. This is because loci will usually be close to fixation, so that (assuming $\nu = 0$) $v \approx 2$ [...]

$$\psi_0 = \exp(-Ns\alpha^2\delta^2)\,\exp\Big(-2Ns\alpha^2\sum_{i=1}^{m} p_i q_i\Big)\prod_{i=1}^{m}(p_i q_i)^{4N\mu - 1} \times \exp\Big(-2Ns\alpha^2\sum_{i=m+1}^{n} P_i Q_i\Big)\prod_{i=m+1}^{n}(P_i Q_i)^{4N\mu - 1} \qquad (15)$$

We wish to integrate out $(m - 1) + (M - 1)$ degrees of freedom, to find the distribution of average allele frequencies ($\bar p$, $\bar P$) at the two sets of loci. This reduced distribution is:

$$\psi_0(\bar p, \bar P)\,d\bar p\,d\bar P = \exp(-Ns\alpha^2\delta^2)\,I(m, \bar p)\,I(M, \bar P)\,d\bar p\,d\bar P \qquad (16)$$

where

$$I(m, \bar p) = \int \exp\Big(-2Ns\alpha^2\sum_{i=1}^{m} p_i q_i\Big)\prod_{i=1}^{m}(p_i q_i)^{4N\mu - 1}\,d\mathbf{p}.$$

Formally, the integrand in $I$ includes many peaks: the range of integration must be restricted to include only the peak near $p_i = \bar p$. The integral is taken over the $(m - 1)$ dimensional region $\sum_{i=1}^{m} p_i = m\bar p$.

Suppose that $Ns\alpha^2$ is large enough that fluctuations around $p_i = \bar p$ are small, and $I$ can be approximated by a Gaussian integral. To a first approximation, $I$ is given by the value of the integral at $p_i = \bar p$. Fluctuations around $p_i = \bar p$ can be approximated by a factor $\exp(-w\sum(p_i - \bar p)^2/2)$; $w$ is the curvature around the peak, and equals [...], and so:

[...] (17)

Equations 16 and 17 now combine to give the distribution of average allele frequencies ($\bar p$, $\bar P$). The derivation has been based on the assumption that $Ns\alpha^2$ is large, so that only leading terms in $(1/Ns\alpha^2)$ are needed, and so that the distribution of individual allele frequencies is approximately Gaussian. Given [...]

Simulations

Simulations were run to check these analytic predictions. Ideally, one would use direct Monte Carlo simulations, which would represent each individual, and allow for linkage disequilibrium. However, this is not feasible: many loci must be involved


before multiple equilibria can coexist, and shifts will occur at a significant rate even when there are many ($\approx 10^4$) individuals. Therefore, simulations only tracked allele frequencies at the $n$ loci. Selection and mutation were represented by the discrete time version of eqn (3a). For intermediate allele frequencies, sampling drift was approximated by a Gaussian white noise with variance $pq/2N$. Where fewer than 10 copies of the rarer allele were present, a Poisson distribution was used.

Runs were begun from some particular deterministic equilibrium: results were not recorded for the first $3/\alpha^2 s$ generations, to allow time for random fluctuations to build up. Each run was continued until a shift to some other state occurred: the time before a shift gives an estimate of $\Gamma^{-1}$. Whilst the population remained near the initial state, the distribution of the phenotypic mean and variance, and the allele frequency, were recorded. In most cases, 50 replicates were run from each set of parameters.

Determining when a shift has occurred is delicate. The criterion used was that at least one allele frequency should pass beyond the threshold set by the unstable deterministic equilibrium, for at least $1/3\alpha^2 s$ generations. This is unambiguous when $Ns\alpha^2$ is large; however, when $Ns\alpha^2$ is very small, the distribution does not cluster around distinct attractors. The rate of shifts, measured by the above criterion, does not tend to the neutral substitution rate as $N$ tends to zero: this is because a shift is deemed to occur when an allele frequency passes some intermediate threshold, rather than when a substitution is completed.

5. Results

Choice of parameters

The key parameters are $n$, the number of loci, $Ns\alpha^2$, a measure of the strength of drift relative to selection on each locus, and $\gamma = \mu/s\alpha^2$, the ratio between mutation and selection. The position of the optimum, relative to the point which would be reached under recurrent mutation alone, has rather little effect (Barton, 1986; compare Tables 1a, b). This is because the deterministic equations (eqn (3)) depend only on $\delta$: a shift in the optimum can be compensated for by a change in $[m, 0, M]$. The remaining parameters can be scaled out: $s\alpha^2$ sets a characteristic timescale, whilst $\alpha$ sets a scale of measurement for the character. We will therefore concentrate on the effects of $Ns\alpha^2$, $n$, and $\gamma$.

Even with this restriction, there is still a wide range of possibilities. Unfortunately, there is little firm evidence on the mutation rates, population sizes, selection pressures, and numbers of loci which might be found in nature. The problem is exacerbated by the fact that per locus mutation rates seem too low to account for observed levels of quantitative variation (Turelli, 1984). However, some rough guesses can be made. A selection pressure $s \approx 0.05/V_e$ is typical of values measured in nature and in the laboratory (Turelli, 1984; though measures are obscured by fluctuations in natural selection (Endler, 1986)). The variance introduced by mutation is $n\mu\alpha^2 \approx 10^{-3}V_e$ per generation. In order to account for high heritabilities ($h^2 \approx \tfrac{1}{3}$, say) by a mutation/selection balance involving rare alleles, one must have $V_e/2 = V_g = (4n\mu/s)$. The total mutation rate $n\mu$ must thus be $\approx 0.00625$. This is consistent with the few measurements of the rate of production of mutations affecting quantitative characters, but is hard to reconcile with estimates of the numbers of loci involved in polygenic variation ($n \approx 20$ upwards; Turelli, 1984; Shrimpton & Robertson, 1988), and per locus mutation rates ($\mu \approx 10^{-5}$-$10^{-6}$). The problem increases when one realizes that alleles will tend to be selected against not only because of their effects on the character of interest, but also because of their pleiotropic effects on other characters (Turelli, 1986). However, since the aim of this paper is to investigate the effects of drift on polygenic systems with multiple equilibria, this problem will not be discussed further. Most results will be for the case $n = 100$ loci; the above empirical values then imply that $\gamma \approx 0.01$. Lower mutation rates ($\gamma = 0.0025$) and fewer loci ($n = 25$) will also be considered. These parameter ranges complement those of Bürger et al. (1988), who considered much smaller population sizes (mostly $Ns\alpha^2 < 2$), and Hastings (1987), who considered many fewer loci ($n = 8$-$16$).

Distribution across equilibria

The chance that a population will be near to some particular deterministic equilibrium depends primarily on the potential, $U = \log(\bar W) + 2\mu\log(V)$. This combines the effects of selection (which tends to increase mean fitness) with the effects of mutation (which tends to increase genetic variance). $U$ is plotted against the possible classes of equilibria in Fig. 1, for

Fig. 1. The log mean fitness ($\log(\bar W)$) and potential ($U$) associated with alternative equilibria, for the case $n = 100$, $\gamma = 0.01$. (a) $z_0 = 5$; (b) $z_0 = 0$. The leftmost point in each series corresponds to the most extreme stable equilibrium (a) [17, 0, 83]; (b) [45, 0, 55]. The next point corresponds to the adjacent unstable equilibrium (a) [17, 1, 82]; (b) [45, 1, 54]. Stable and unstable equilibria then alternate until the rightmost stable equilibrium is reached.
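The simulation scheme described under 'Simulations' (a discrete-time selection and mutation step, followed by sampling drift that is Gaussian at intermediate frequencies and Poisson when fewer than 10 copies of the rarer allele are present) can be sketched as follows. This is a minimal illustration, not the program used for the paper: the deterministic step assumes that eqn (3) has the form -(sα²/2)pq[2δ + (q - p)] + μ(q - p), the function names are hypothetical, and the Poisson sampler is Knuth's simple multiplication method, chosen only to keep the example free of dependencies.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here (lam < 10)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def expected_change(p, s, alpha, mu, z0):
    """Change per generation from selection and mutation at each locus (assumed eqn (3))."""
    delta = sum(2.0 * pi - 1.0 for pi in p) - z0 / alpha   # (mean - optimum) / alpha
    return [-0.5 * s * alpha ** 2 * pi * (1.0 - pi) * (2.0 * delta + 1.0 - 2.0 * pi)
            + mu * (1.0 - 2.0 * pi) for pi in p]

def drift(pi, N, rng, min_copies=10):
    """Sampling drift at one locus in a population of N diploid individuals."""
    rare = min(pi, 1.0 - pi)
    if 2 * N * rare < min_copies:
        # Rare allele: resample its copy number from a Poisson distribution.
        new_rare = poisson(2 * N * rare, rng) / (2 * N)
        new = new_rare if pi <= 0.5 else 1.0 - new_rare
    else:
        # Intermediate frequency: Gaussian noise with variance pq/2N.
        new = pi + rng.gauss(0.0, math.sqrt(pi * (1.0 - pi) / (2 * N)))
    return min(1.0, max(0.0, new))

def generation(p, N, s, alpha, mu, z0, rng):
    """One generation: deterministic step, then drift, with frequencies kept in [0, 1]."""
    change = expected_change(p, s, alpha, mu, z0)
    moved = [min(1.0, max(0.0, pi + d)) for pi, d in zip(p, change)]
    return [drift(pi, N, rng) for pi in moved]

# Start from the symmetric equilibrium [50, 0, 50] with gamma = 0.01 and N*s*alpha^2 = 10.
gamma, s, alpha, N = 0.01, 1.0, 0.1, 1000
mu = gamma * s * alpha ** 2
root = math.sqrt(1.0 - 8.0 * gamma)
p0 = [(1.0 - root) / 2.0] * 50 + [(1.0 + root) / 2.0] * 50
p1 = generation(p0, N, s, alpha, mu, 0.0, random.Random(1))
```

Detecting a shift would additionally require tracking how long any frequency stays beyond the unstable threshold, which is omitted here.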


Table 2. Comparison of exact integration, and the full and partial Gaussian approximations, for different classes of stable equilibria [m, 0, M]

m M Exact integration Partial Gaussian Full Gaussian

(a) $Ns\alpha^2 = 10$, $n = 100$, $\gamma = 0.01$, $z_0 = 0$
45  55
        -7.589
46  54
        -9.113   -5.754   -1.704   -5.652
47  53  0.00000  0.00010  0.00012
        -9.227   -5.805   -7.654   -5.180
48  52  0.00000  0.00299  0.00156
        -9.222   -5.805   -9.192   -5.132
49  51  0.05998  0.09651  0.09546
        -8.093   -5.982   -7.946   -5.832
50  50  0.88004  0.80081  0.80572
        -8.093   -5.982   -7.946   -5.832
51  49  0.05998  0.09651  0.09546
        -9.222   -5.805   -9.192   -5.132
52  48  0.00000  0.00299  0.00156
        -9.227   -5.805   -7.654   -5.180
53  47  0.00000  0.00010  0.00012
        -9.113   -5.754   -1.704   -5.652
54  46
        -7.589
55  45

(b) $Ns\alpha^2 = 20$, $n = 100$, $\gamma = 0.01$, $z_0 = 0$
45  55
        -7.589
46  54
        -11.824  -9.639   -13.131  -5.657
47  53
        -12.544  -9.440   -18.222  -5.230
48  52
        -14.627  -8.334   -17.796  -5.403
49  51  0.00462  0.00978  0.00357
        -12.424  -7.817   -12.717  -7.109
50  50  0.99077  0.98045  0.99285
        -12.424  -7.817   -12.717  -7.109
51  49  0.00462  0.00978  0.00357
        -14.627  -8.334   -17.796  -5.403
52  48
        -12.544  -9.440   -18.222  -5.230
53  47
        -11.824  -9.639   -13.131  -5.657
54  46
        -7.589
55  45

(c) $Ns\alpha^2 = 10$, $n = 100$, $\gamma = 0.0025$, $z_0 = 0$
48  52
        -10.795  -6.004   -4.148
49  51  0.00011  0.08289  0.00305
        -10.362  -8.053   -5.633   -11.400
50  50  0.99988  0.83422  0.99390
        -10.362  -8.053   -5.633   -11.400
51  49  0.00011  0.08289  0.00305
        -10.795  -6.004   -4.148
52  48

(d) $Ns\alpha^2 = 10$, $n = 25$, $\gamma = 0.01$, $z_0 = 0$
12  13  0.5000   0.5000   0.5000
        -6.138   -6.138   -6.068   -6.068
13  12  0.5000   0.5000   0.5000

The third column gives the probability that the population will be in some equilibrium of a class: this includes the factor $n!/(m!\,M!)$, which is the number of equilibria in the class, and is calculated by exact integration. The fourth column gives this probability, calculated using the partial Gaussian approximation. The fifth and sixth columns give the log transition rates ($\log(\Gamma)$), away from the optimal equilibrium and towards it, respectively; these rates are scaled relative to the characteristic time $2/s\alpha^2$. These values are calculated using the partial Gaussian approximation. The last three columns give corresponding predictions from the full Gaussian approximation.
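The multiplicity factor $n!/(m!\,M!)$ mentioned in the table note is easy to tabulate, and doing so gives a feel for how many distinct equilibria each class contains. The following is a minimal sketch using only the Python standard library; the function name is hypothetical.

```python
from math import comb, log


def class_multiplicity(m: int, M: int) -> int:
    """Number of equilibria in class [m, 0, M], i.e. n!/(m! M!) with n = m + M."""
    return comb(m + M, m)


# Classes listed in Table 2(a), n = 100.
for m in range(45, 56):
    M = 100 - m
    count = class_multiplicity(m, M)
    print(f"[{m},0,{M}]  count = {count:.4e}  log(count) = {log(count):.3f}")
```

The count is symmetric in $m$ and $M$ and is largest for the symmetric class, so the multiplicity alone already favours [50,0,50] over its neighbours, although only weakly compared with the differences in potential.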



the standard parameter set ($n = 100$, $\gamma = 0.01$). Values of $U$, and of log mean fitness, are also given in Table 1. Note that the pattern of log mean fitness parallels the potential: thus, the most likely state also has highest mean fitness.

The graph of $U$ against possible equilibria consists of a series of rectangular steps; the drop in potential which impedes transitions away from the optimal equilibrium is much greater than the drop which impedes shifts in the opposite direction. Indeed, the latter barrier is barely visible in Fig. 1. We therefore expect that over a wide range of population sizes, drift will be strong enough to knock the population away from suboptimal states, but not strong enough to knock it away from the optimal state. The population size at which drift allows shifts at an appreciable rate across a barrier $\Delta U$ is $N \approx 2/\Delta U$. When $z_0 = 0$ (Fig. 1b), escape from equilibria [50,0,50], [49,0,51], [48,0,52], ... becomes likely when $Ns\alpha^2 < 8.4$, $31.3$, $156.0$, $784.3$, ... If, as is likely, selection on each locus is weak, these figures correspond to large numbers of individuals: taking the rough estimates of the previous section, if $s\alpha^2 \approx 4 \times 10^{-4}$, then the criterion for stability is that $N >$ 21,000; 78,000; 369,000; 1,961,000; ...

The chance that the population lies near the optimal state depends on the shape of the distribution ($\exp(2NU)$) near that state, as well as on the height of the peak in $U$. It can be calculated by integrating around the peak, using either the exact method, or the Gaussian approximation (eqn 23). Both give similar results (Table 2), confirming that provided $Ns\alpha^2 > 2$, the population is most likely to be found near the deterministic equilibrium in which the mean is closest to the selective optimum. Calculations of transition rates (see below) also show that shifts towards the optimum are (for $Ns\alpha^2 > 2$) more frequent than those away from it, and that these shifts can be frequent enough, even in a large population, for the optimum to be reached quickly (Fig. 6).

Fig. 2. (a) The logarithm of the equilibrium probability distribution of two particular allele frequencies, one chosen from the set of $m$ loci at low frequency ($p_i$), and one from high frequency ($P_j$). All other loci are held at the deterministic equilibrium. The distribution is plotted on an arc-sin transformed geodesic scale. $z_0 = 0$, $Ns\alpha^2 = 10$, $\gamma = \mu/s\alpha^2 = 0.01$, $n = 100$; hence, $4N\mu = 0.4$ is small enough that the distribution rises to a singularity at $p = 0$, $P = 1$. Contours are spaced at 0.1 units of log probability. (b) The distribution of average allele frequencies, $(\bar{p}, \bar{P})$, for the same case. Here, contours are spaced 10 units of log probability apart: the distribution is tightly clustered around the peak (A). The deterministic equilibrium (X) is slightly displaced from the peak.

Distribution of allele frequencies

Calculations of transition rates, and of phenotypic distributions, depend on averaging over irrelevant variables. This procedure is illustrated in Fig. 2. Here, population size is small enough that the distribution of individual allele frequencies is strongly skewed: it rises to a singularity at fixation. The distribution illustrated in Fig. 2a is a cross-section through an $n$-dimensional hypercube, with peaks at every corner. This $n$-dimensional distribution of average allele frequencies is reduced to a two-dimensional distribution of average allele frequencies ($\psi_0(\bar{p}, \bar{P})$; eqns 16, 18). Figure 2b shows that this latter is tightly clustered around the deterministic equilibrium and has an approximately Gaussian form near the peak. Simulations confirm that although individual allele frequencies vary widely, their averages follow the distribution of Fig. 2b.

By integrating over all but one allele frequency, the marginal distribution at each locus can be found. The distributions at loci almost fixed for '+' alleles, and for loci almost fixed for '-' alleles, are shown on the same graph in Fig. 3. The theoretical distribution (whether calculated by exact integration, by the partial Gaussian approximation, or by the full Gaussian approximation) is indistinguishable from simulation results on this scale.
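The conversion from thresholds on $Ns\alpha^2$ to numbers of individuals, used in the discussion of escape from equilibria, can be reproduced directly. The sketch below assumes the threshold values and the estimate $s\alpha^2 \approx 4 \times 10^{-4}$ as quoted in the text, and uses the rule $N \approx 2/\Delta U$ to back out the implied barrier heights; the function names are hypothetical.

```python
# Thresholds on N*s*alpha^2 below which escape becomes likely (as quoted in the text).
thresholds = [8.4, 31.3, 156.0, 784.3]
s_alpha2 = 4e-4  # rough per-locus selection strength used in the text


def critical_population_size(threshold, s_alpha2):
    """Population size N at which N*s*alpha^2 equals the given threshold."""
    return threshold / s_alpha2


def barrier_height(threshold, s_alpha2):
    """Barrier Delta U implied by the rule N ~ 2 / Delta U at the critical N."""
    return 2.0 / critical_population_size(threshold, s_alpha2)


for t in thresholds:
    n_crit = critical_population_size(t, s_alpha2)
    print(f"Ns*alpha^2 = {t:6.1f}  ->  N ~ {n_crit:,.0f},  Delta U ~ {barrier_height(t, s_alpha2):.2e}")
```

The first, second and fourth values reproduce the population sizes quoted in the text after rounding; the third comes out nearer 390,000 than the quoted 369,000, so one of those two printed figures may be imprecise.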


Fig. 3. The distribution of allele frequency ($\psi_0(p)$), given that the population is near [49,0,51]. $n = 100$, $\gamma = 0.01$, $Ns\alpha^2 = 10$, $z_0 = 0$. $p$ is plotted on an arc-sin transformed scale. The left-hand curve shows the marginal distribution for the 49 loci nearly fixed for '-'; the right-hand curve is for the 51 loci nearly fixed for '+'. The two arrows show the deterministic equilibria. Simulation results (from 64,000 generations) are drawn on the same graph, and are indistinguishable from the theoretical prediction, except at extreme frequencies: when $p < 1/2N$ or $p > 1 - 1/2N$, the diffusion approximation breaks down. (Here and in Fig. 2, the distribution is calculated using the full Gaussian approximation. Results from the partial Gaussian approximation and from exact integration are indistinguishable.)

Fig. 4. Comparison of various approximations to the transition rate, $\Gamma$. This is the chance per generation of a shift from the vicinity of one particular state to the vicinity of another particular state. (Since there are many neighbouring states, the total transition rate is much higher.) In both graphs, $z_0 = 0$, $n = 100$, $\gamma = 0.01$. (a) [49,0,51] to [50,0,50]. (b) [50,0,50] to [49,0,51]. Results from exact integration are shown by the solid curve; the Gaussian approximation is shown by the solid straight line, and coincides with exact integration for $Ns\alpha^2 > 8$. The dotted lines show two single locus approximations. The line $\delta = 0$ is obtained by ignoring deviations from the optimum; it is the same in (a, b). The other dotted line is obtained by linear interpolation of the deviation, $\delta$.

Fig. 5. A typical segment of a simulated path. $Ns\alpha^2 = 10$, $\gamma = 0.01$, $n = 100$, $s = 1$, $\alpha = 0.1$, $z_0 = 0$. The population begins at a deterministic equilibrium of class [49,0,51], and runs until a shift to [50,0,50] occurs (downward line on right). The graph shows allele frequencies at the 100 loci, plotted on an arc-sin transformed scale.

Transition rates

Theoretical calculations of transition rates are compared in Fig. 4, for the standard case of $n = 100$, $\gamma = 0.01$, $z_0 = 0$. Even though the distribution of allele frequencies is strongly skewed when $Ns\alpha^2 < 20$ (Figs. 2a, 3), the full Gaussian approximation (straight solid line in Fig. 4) is close to the partial Gaussian approximation for $Ns\alpha^2 > 5$ (dotted solid line in Fig. 4). Results from exact integration are not given, because they require calculation of a three-dimensional integral. However, they should be close to the partial Gaussian approximation: the distribution across equilibria, and the marginal distribution of allele frequencies, from these two methods are similar (Table 2).

Two single-locus approximations are also shown, by dotted lines. Ignoring deviations from the optimum altogether (i.e., $\delta = 0$) gives a model of symmetric underdominance. This gives poor results: it predicts equal rates for all classes of equilibria, when in fact the optimal equilibrium [50,0,50] is much stabler. Linear interpolation of $\delta$ between the deterministic equilibria gives better results; however, there is still a substantial discrepancy, which is due to the failure to account for fluctuations at the $(n-1)$ loci which are not directly involved in the shift.

These various predictions can be checked against simulations. Figure 5 shows a typical run. Allele frequencies fluctuate near fixation. Occasionally, one allele frequency drops to hover near the unstable threshold (dotted line), and then either returns to its original state or drops further. The average time before a shift from $[m, 0, M]$ to $[m-1, 0, M+1]$ gives an estimate of $1/(M\Gamma)$; the factor $M$ is introduced because $M$ loci are liable to shift from '+' to '-'. For the standard case, with $\gamma = 0.01$, the partial Gaussian approximation gives results in good agreement with the simulations over the whole range of $Ns\alpha^2$ for shifts from [49,0,51] to [50,0,50] (Fig. 6a). For jumps in the opposite direction, agreement is less good (Fig. 6b). This is to be expected, since the analytic predictions are based on a formula derived for large $Ns\alpha^2$ (eqn (12)). When populations are small, some disagreement is to be expected: allele frequencies


Fig. 6. Comparison between exact integration (solid curve), the Gaussian approximation (solid straight line), and simulation results. Parameters are as for Fig. 4, except that $N$ is varied. For each value of $N$, simulations were run until 50 shifts had occurred (excepting the rightmost points in Fig. 6b, where only 7 shifts were seen). The bars show $\log(\Gamma) \pm 2$ standard errors. The arrow on the left of each graph shows the neutral substitution rate ($\mu$).
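The way a transition rate and its error bars can be obtained from simulated waiting times, as described for Fig. 6, can be sketched as follows. This is not the author's code: it assumes natural logarithms, treats successive waiting times as independent, and uses a delta-method approximation for the standard error of the log rate. The waiting times in the example are synthetic, drawn with a made-up rate purely to exercise the function.

```python
import math
import random
from statistics import mean, stdev


def estimate_log_rate(waiting_times, M):
    """Estimate log(Gamma) and its standard error from observed waiting times.

    waiting_times : generations elapsed before each observed shift
    M             : number of loci liable to make the shift, so that the
                    mean waiting time estimates 1 / (M * Gamma)
    """
    k = len(waiting_times)
    t_bar = mean(waiting_times)
    log_rate = -math.log(M * t_bar)  # log of Gamma_hat = 1 / (M * t_bar)
    # Delta method (an assumption of this sketch): s.e. of log(mean) ~ sd / (mean * sqrt(k)).
    se_log = stdev(waiting_times) / (t_bar * math.sqrt(k))
    return log_rate, se_log


# Synthetic example: 50 exponential waiting times with a hypothetical true rate.
rng = random.Random(1)
true_gamma, M = 1e-4, 51
waits = [rng.expovariate(M * true_gamma) for _ in range(50)]
log_rate, se = estimate_log_rate(waits, M)
print(f"log(Gamma) = {log_rate:.2f}, bars from {log_rate - 2 * se:.2f} to {log_rate + 2 * se:.2f}")
```

For exponentially distributed waiting times the standard error of the log rate is roughly $1/\sqrt{k}$, so bars based on 7 shifts are between two and three times as wide as those based on 50.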

fluctuate so much that it is not clear when the population is 'near' one or other deterministic equilibrium, and the simulation results become sensitive to the criteria used to define a shift.

The full Gaussian approximation (straight lines in Figs. 6a, b) is good for $Ns\alpha^2 > 5$, and performs almost as well as the results from exact integration. This is surprising, since the distribution of allele frequencies is strongly skewed (Fig. 3). The unreasonable accuracy of the full Gaussian approximation is a common, and as yet unexplained, phenomenon; it arises, for example, in quantum mechanics and in optics (Schulman, 1981, Ch. 5).

The arrows in Fig. 6 show the neutral substitution rate ($\Gamma = \mu$), which is expected when $Ns\alpha^2$ is small enough that selection is negligible. Simulations at small $N$ give rather higher rates. There are two reasons for this. First, the definition of a shift is that

Fig. 7. The variance of the phenotypic mean, plotted as $2Ns\,\mathrm{var}(\bar{z})$, against $Ns\alpha^2$. $z_0 = 0$, $n = 100$, $\gamma = 0.01$, $s = 1$, $\alpha = 0.01$. (a) [49,0,51]; (b) [50,0,50]. Crosses show simulation results. The naive phenotypic prediction (eqn 8a) is that this quantity is always equal to 1 (dotted line on right). The full Gaussian approximation predicts a somewhat lower constant value (solid line on right, from eqn A 2.2b). The solid curve shows the prediction from exact integration, which is very close to the partial Gaussian approximation.


Fig. 8. The expected phenotypic variance, $E(V)$, plotted against $Ns\alpha^2$. Parameter values are as for Fig. 7. Crosses show simulation results (a) [49,0,51]; (b) [50,0,50]. The right-hand horizontal dotted line gives the deterministic equilibrium, which is slightly higher for the asymmetric equilibrium (a). The left-hand dotted line gives the neutral prediction ($v = 4Nn\mu$). The heavy solid curve shows the prediction from exact integration. The light dotted curve shows the prediction obtained by neglecting variations about the optimum (which is close to Bulmer's (1972) formula). The light solid curve in (a) shows the prediction from the partial Gaussian approximation. For the symmetric equilibrium (b), this is indistinguishable from exact integration, and so is not shown: there, the light solid curve shows the Gaussian approximation (eqn A 2.2a).

allele frequency should pass some intermediate threshold, rather than that a substitution should occur. When $Ns\alpha^2$ is large, a locus that passes the threshold will almost certainly go on to near fixation for the new allele (cf. Fig. 5). However, when $Ns\alpha^2$ is small, frequencies may often return to their original values. Second, though selection opposes the initial increase of a rare allele, it assists the shift once the threshold is passed. This may increase the rate of shifts above the neutral rate when the initial equilibrium is asymmetric (compare Figs. 6a, b).

Distribution of the phenotype

The variance of the mean around the deterministic equilibrium is shown in Fig. 7, as a function of population size. The product $2Ns\,\mathrm{var}(\bar{z})$ is plotted; naive phenotypic arguments predict that this should have a constant value of 1 (eqn 8a). The variance is indeed close to this prediction (upper dotted line) provided that the population is not too small ($Ns\alpha^2 > 1$). If allele frequencies follow a multivariate Gaussian distribution, $2Ns\,\mathrm{var}(\bar{z})$ should have a somewhat smaller value (lower solid line; eqn A 2.2b). The more sophisticated prediction, from exact integration, is shown by the solid curve: this converges to the full Gaussian prediction as $Ns\alpha^2$ becomes large, and the underlying distributions approach normality. The main conclusion is that the mean is never likely to deviate from the optimum by more than $\approx \sqrt{2/Ns}$ for $Ns\alpha^2 > 1$. Since $2Ns\,\mathrm{var}(\bar{z})$ decreases for small $N$, this limit may always hold: it implies that the 'drift load' which arises from deviations from the optimum is $\approx 1/4N$.

The relation between the genetic variance and population size is shown in Fig. 8. The variance is somewhat higher when the population is near an asymmetric equilibrium in the class [49,0,51] (Fig. 8a). Since drift is likely to keep the population near to the optimal state [50,0,50], we should concentrate on Fig. 8b. The variance is close to the deterministic value (right hand dotted line) for $Ns\alpha^2 > 2$. It decreases to zero as the population becomes very small; however, selection holds the variance well below the neutral prediction ($v = 2nNs\mu$; left hand dotted line), even for small $N$. Surprisingly, the variance increases slightly for intermediate $Ns\alpha^2$ before decreasing in very small populations. This effect was predicted by Bulmer (1972), by integration of the single-locus distribution on the assumption that the mean coincided with the optimum (dotted curve). Exact integration of the multilocus distribution, or use of the partial Gaussian approximation, gives results similar to Bulmer's, showing that fluctuations in the mean have little effect on the variance. However, the full Gaussian prediction (shown in Fig. 8b by the lower solid curve) does not predict this increase. Simulations do show an increase at intermediate $Ns\alpha^2$. However, it is smaller than predicted for the symmetric equilibrium (Fig. 8b). The reason for this discrepancy is obscure: it may be an artefact of the criterion used to decide when the population has shifted to a new equilibrium.
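The bound on the deviation of the mean and the 'drift load' of about $1/4N$ follow from the naive phenotypic prediction in a few lines. The sketch below rests on two assumptions that are not stated in this section: log fitness falls off as $-s(z - z_0)^2/2$ (chosen to be consistent with the load $sv/2$ used for the genetic variance in the next section), and 'never likely to deviate by more than' is read as two standard deviations.

```latex
\begin{align}
2Ns\,\mathrm{var}(\bar{z}) \approx 1
  \quad &\Longrightarrow \quad \mathrm{var}(\bar{z}) \approx \frac{1}{2Ns}, \\
2\sqrt{\mathrm{var}(\bar{z})}
  &\approx \frac{2}{\sqrt{2Ns}} = \sqrt{\frac{2}{Ns}}, \\
\text{drift load} \approx \frac{s}{2}\,\mathrm{var}(\bar{z})
  &\approx \frac{s}{2}\cdot\frac{1}{2Ns} = \frac{1}{4N}.
\end{align}
```

The strength of selection cancels from the last line, so under these assumptions the load depends only on population size.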


Accumulation of reproductive isolation

Suppose that two isolated populations each remain in a balance between mutation and stabilizing selection. Even though both are subject to the same selection pressures, and the mean in both may remain close to the same optimum, drift will occasionally change the underlying combination of alleles. Eventually, the combination may become different enough that when the populations cross, extreme and unfit recombinants will be produced. To find the rate at which reproductive isolation will evolve, we must know first, the level of isolation produced when populations differ at $k/n$ loci, and second, the rate at which they become different.

In a large population at equilibrium, in which the mean coincides with the optimum, the genetic variance is $v = 4n\mu/s$. This reduces mean fitness by $sv/2 = 2n\mu$. We can compare levels of isolation with this basic mutation load. In the extreme case, suppose that two populations differ in the common allele at all $n$ loci. Then, the F1 population will have intermediate phenotype, and will be slightly fitter than either parent. However, in the F2 population, all loci will have equal allele frequencies; this gives a variance $2n\alpha^2 pq = n\alpha^2/2$. The log mean fitness is therefore reduced by $sn\alpha^2/4$. In fact, even after indefinite divergence, only half the loci will differ on average; the maximum load in the F2 population is therefore $sn\alpha^2/8$. When $k$ loci differ, the load is $sk\alpha^2/4$. For example, with the standard values $s\alpha^2/2 = 1/250$, $\gamma = 0.01$, $\mu = 8 \times 10^{-5}$, and $n = 100$, the equilibrium load is 0.016, and the maximum reduction in fitness in the F2 is 0.10. This is hardly enough to cause a significant barrier to gene exchange; however, if similar processes were to occur for many characters, substantial isolation could evolve. A limit is placed by the total mutation load, which cannot be very large ($< 0.5$, say). With $\gamma = 0.01$, the maximum isolation cannot be greater than $(0.1/0.016) \approx 6$ times this load ($< 3$, say). In general, the ratio is $1/(16\gamma)$. Thus, if the alleles responsible for polygenic variation are usually rare ($\gamma$ ... 10; when $Ns\alpha^2 < 2$, the frequency of shifts approaches the neutral rate, $\mu$, per locus. The neutral value sets a rough upper limit on the rate of divergence. This limit is low: for the above values, the maximum isolation of $\approx 0.1$ per character develops over a time $\approx 1/\mu = 12{,}500$ generations. Mutation rates are in fact likely to be lower than the value of $8 \times 10^{-5}$ used here.

One might imagine that occasional founder events could accelerate speciation. This would be so if the standing populations were so large that stochastic shifts were essentially impossible ($Ns\alpha^2 > 10$, say). However, consider the consequences of the most drastic founder event, in which one haploid genome founds a new population. Then, the chance that the rarer allele at a locus is fixed is equal to its initial frequency, $p \approx 2\gamma$. Therefore, only a small fraction $2\gamma$ of the maximum isolation is likely to evolve; this fraction is $\tfrac{1}{8}$ of the equilibrium mutation load (Barton, 1989). With the above values, a founder event gives isolation of 0.002 per character. Changes during a founder event depend on initial polymorphism: here, levels of polymorphism at each locus are low.

The most plausible cause of reproductive isolation under this model of polygenic variation seems to be a combination of a fluctuating optimum and weak sampling drift. First, consider slow changes in the optimum in an infinite population. Because a wide range of equilibria are stable, shifts between different combinations of common alleles would not occur: the mean would instead be adjusted to the optimum by slight changes in allele frequency, at the cost of an increase in genetic variance. For example, with $n = 100$, $\gamma = 0.01$, changes in optimum of as much as $\pm 10\alpha$ do not cause instability: this corresponds to $\pm 3.3$ phenotypic standard deviations. However, drift will knock even a fairly large population away from suboptimal states. So, suppose that the optimum shifts by $2\alpha$, or 0.66 standard deviations, in a finite population. If $Ns\alpha^2 < 20$, a shift to the new optimal equilibrium is likely to occur within $\approx 1/(m\Gamma)$ generations (3,000 generations; Fig. 6a). Rapid fluctuations would have the same effect as sampling drift, and would promote rather slower isolation (see Maynard Smith, 1979).

6. Discussion

We can now return to the general questions raised in the introduction. First, can the evolution of polygenic variation be understood from gross phenotypic measures, without detailed knowledge of processes at individual loci? In an infinite population, many different equilibria with different properties may coexist. Thus, the state of the population cannot be predicted without knowing which of these equilibria it ... close to the optimal state, in which the mean coincides with the selective optimum. Furthermore, provided the population is not extremely small ($N < 1/s\alpha^2 \approx 100$, say), the mean and variance will have a distribution close to that predicted from phenotypic arguments (eqn 8). This is consistent with the simulation results of Bürger et al. (1988). These conclusions can be phrased in terms of the general results of Barton & Turelli (1987), who showed that changes in the mean and variance depend on the third moment of the distribution of breeding values. Different stable equilibria correspond to different values of the third moment: sampling drift acts to keep the third moment near to zero.
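The numerical example in the section on reproductive isolation can be reproduced from the standard values. The sketch below only restates the formulas quoted in the text ($2n\mu$ for the equilibrium load, $sk\alpha^2/4$ for the F2 load when $k$ loci differ, a fraction $2\gamma$ for a founder event); the variable and function names are hypothetical.

```python
s_alpha2 = 2.0 / 250.0   # from s * alpha^2 / 2 = 1/250
gamma = 0.01             # gamma = mu / (s * alpha^2)
n = 100                  # number of loci
mu = gamma * s_alpha2    # per-locus mutation rate


def f2_load(k, s_alpha2):
    """Reduction in log mean fitness in the F2 when k loci differ: s k alpha^2 / 4."""
    return s_alpha2 * k / 4.0


equilibrium_load = 2 * n * mu                    # mutation load, 2 n mu
max_f2_load = f2_load(n / 2, s_alpha2)           # half the loci differ after long divergence
ratio = max_f2_load / equilibrium_load           # equals 1 / (16 gamma)
founder_isolation = 2 * gamma * max_f2_load      # fraction 2*gamma of the maximum
divergence_time = 1.0 / mu                       # generations, at the neutral rate

print(f"mu = {mu:.1e}")
print(f"equilibrium load = {equilibrium_load:.3f}")
print(f"maximum F2 load = {max_f2_load:.2f}")
print(f"ratio = {ratio:.2f} (1/(16 gamma) = {1 / (16 * gamma):.2f})")
print(f"founder-event isolation = {founder_isolation:.3f}")
print(f"time scale 1/mu = {divergence_time:,.0f} generations")
```

The founder-event figure equals one eighth of the equilibrium load, which is the relation quoted from Barton (1989).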


So, given this model of a set of identical biallelic loci, the existence of multiple equilibria does not cause difficulties in any but very large populations. However, we may still ask whether this conclusion holds for more realistic models of polygenic variation. Obvious extensions are to allow for dominance (Charlesworth et al. 1987), for different effects ($\alpha_i$) across loci, for multiple alleles, and for selection on a vector of quantitative characters. Although a full analysis is needed, one can use dimensional arguments to show that drift will be able to bring the population near to the optimal equilibrium if $N > 1/s\alpha^2$. However, it is the step-like pattern of log mean fitness and potential across equilibria (Fig. 1) which allows the optimal state to be reached across a very wide range of $N$; it is not clear whether more general models would also show this behaviour, or whether it is an artefact of the extreme symmetry in this model. Numerical results allowing different effects ($\alpha_i$) across loci show that the mean fitnesses of different classes of equilibria become more similar to each other.

Of course, polygenic variation may not be maintained by a balance between mutation and stabilizing selection. Other possibilities are that deleterious alleles are maintained by mutation, and have pleiotropic effects on quantitative characters; that polymorphisms maintained by balancing selection have pleiotropic side-effects (Gillespie, 1984a); or that frequency-dependent selection acts directly on the character of interest, so as to maintain variation (Slatkin, 1979). One should note, though, that the main objection to a balance between mutation and stabilizing selection is that an unreasonably high net mutation rate ($n\mu$) is needed to offset the strong stabilizing selection which seems prevalent. Strong stabilizing selection on many characters is hard to reconcile with any model since it implies a high genetic load, and a high variance in fitness.

Clearly, the above calculations of the relation between phenotypic distribution and population size may only apply under the specific model used here. However, conclusions about the way optima are reached, and the rate of peak shifts, may apply under more general schemes. As Wright (1932, 1980) has stressed, the prevalence of pleiotropy and epistasis makes multiple equilibria inevitable. This is confirmed by a range of models of multilocus selection (Karlin, 1979). Kaufmann & Levin (1987) have discussed the geometry of systems with multiple local optima, stressing the point that only a tiny fraction of possible gene combinations can be reached from any one genotype (cf. Gillespie, 1984b). In the model discussed here, this fraction is $n/2^n$, which is $8 \times 10^{-29}$ for $n = 100$. Kaufmann & Levin find that the maximum advance in fitness, from some arbitrary starting point up to the nearest local peak, increases substantially with the correlation in fitness values between adjacent points. The model here is set in a rather different framework from that considered by Kaufmann & Levin: one must consider the surface of $U$, plotted against all the allele frequencies $p_i$, rather than considering the fitnesses of individual genotypes (see Provine, 1986, for discussion of this distinction). However, because the population clusters around one genotype, the same considerations apply. It may be that advance towards the global optimum is relatively easy, because the model of additive gene effects leads to a strong correlation between adjacent fitness values. Important areas for future work are to find how far Kaufmann & Levin's arguments apply at the level of populations as well as genotypes, and how far the relative ease with which the global optimum is reached can be related to general features of the model.

How does strong selection on a polygenic character affect individual loci? The rate of shifts approaches that for a neutral locus when $N < 1/s\alpha^2$; it is in this range that drift begins to reduce the phenotypic variance appreciably (compare Figs. 6, 8). It is therefore not possible to reconcile effectively neutral evolution at each locus with strong stabilizing selection, as argued by Kimura (1981, 1983), unless drift has a noticeable effect at the phenotypic level. However, the distribution of allele frequencies is indistinguishable from that for mildly deleterious alleles, for all population sizes. It is therefore not possible to use allele frequencies to distinguish the present model from a version of the neutral theory that incorporates weak purifying selection. One might imagine that the pattern of substitutions would be more helpful. When a single class of equilibria coincides with the selective optimum (Fig. 1b rather than Fig. 1a), then a shift from an optimal to a suboptimal equilibrium will be quickly followed by a compensating shift in the opposite direction. However, the compensation will be at a different locus, and so will not be reflected by an increased variance of substitution rates (cf. Gillespie, 1986). It is hard to see how observations of individual loci can distinguish stabilizing selection on a polygenic system from simple purifying selection.

Given sufficient time, significant reproductive isolation can evolve between isolated populations subject to the same regime of stabilizing selection. However, the maximum isolation that can evolve can only greatly exceed the mutation load in a standing population when the alleles responsible for variation are extremely rare. Moreover, divergence is extremely slow, at best occurring over a timescale set by the mutation rate. Founder events do not help much, because the alleles responsible for variation are rare, and are likely to be lost rather than fixed. Divergence would be possible in a geographically structured population if the number of individuals exchanged between demes, or the neighbourhood size, were small in at least some times and places ($Nm < 1$, or $4\pi\rho\sigma^2 < 10$; Rouhani & Barton, 1987). However, divergence would still be slow. The most favourable conditions for the evolution of isolation under this


model may be when a fluctuating optimum is combined with weak sampling drift.

These conclusions in fact apply to a wide range of models (Barton & Charlesworth, 1984; Rouhani & Barton, 1987; Barton, 1989). Substantial reproductive isolation is unlikely to evolve rapidly as a result of drift; founder events are only likely to lead to strong isolation through shifts at highly polymorphic loci, and such isolation is limited by the standing genetic load; and processes other than pure drift may cause faster speciation. Of course, speciation is a slow process: for example, Coyne & Orr (1989) estimate that sibling species of Drosophila diverged, on average, 5 million years ago. Gradual divergence under stabilizing selection may therefore be a major cause of reproductive isolation: as a hypothesis, it has the advantage that the rate of divergence can be calculated from observable features of present-day populations.

This discussion of and speciation has rested on a simple, symmetrical, and probably unrealistic model. However, the methods used can readily be generalized to other cases. Even using the Gaussian approximation, integration over fluctuations around the equilibria gives accurate predictions, even where the underlying distribution is strongly skewed. It would be interesting to apply these methods to other models to find whether the conclusions of this paper are generally valid.

This work benefitted greatly from discussions with Shahin Rouhani, Reinhard Burger and Michael Turelli. It was supported by SERC grants GR/D/91529, GR/E/08507, and by NSF grant BSR/88/06548, and by grants from the Nuffield Foundation and University of London Central Research Fund.

References

Barton, N. H. (1986). The maintenance of polygenic variation through a balance between mutation and stabilizing selection. Genetical Research 47, 209-216.
Barton, N. H. (1989). Founder effect speciation. In Speciation and its consequences (ed. J. A. Endler & D. Otte). Sinauer Press, Sunderland, Mass.
Barton, N. H. & Charlesworth, B. (1984). Genetic revolutions, founder effects, and speciation. Annual Reviews of Ecology and Systematics 15, 133-164.
Barton, N. H. & Rouhani, S. (1987). The frequency of shifts between alternative equilibria. Journal of Theoretical Biology 125, 397-418.
Barton, N. H. & Turelli, M. (1987). Adaptive landscapes, genetic distance, and the evolution of quantitative characters. Genetical Research 49, 157-174.
Bulmer, M. G. (1972). The genetic variability of polygenic characters under optimizing selection, mutation and drift. Genetical Research 19, 17-25.
Bulmer, M. G. (1980). The Mathematical Theory of Quantitative Genetics. Oxford University Press, Oxford.
Burger, R., Wagner, G. P. & Stettinger, F. (1988). How much heritable variation can be maintained in finite populations by a mutation selection balance? Evolution (in press).
Charlesworth, B., Coyne, J. A. & Barton, N. H. (1987). The relative rates of evolution of sex chromosomes and autosomes. American Naturalist 129, 113-146.
Coyne, J. & Orr, H. (1989). Patterns of speciation in Drosophila. Evolution (in press).
Endler, J. A. (1986). Natural Selection in the Wild. Princeton University Press, Princeton, New Jersey.
Foley, P. (1987). Molecular clock rates at loci under stabilizing selection. Proceedings of the National Academy of Science (USA) 84, 7996-8000.
Gardiner, C. W. (1983). Handbook of Stochastic Methods. Synergetics, Vol. 13. Springer Verlag, Berlin.
Gillespie, J. H. (1984a). Molecular evolution over the mutational landscape. Evolution 38, 1116-1129.
Gillespie, J. H. (1984b). Pleiotropic overdominance and the maintenance of genetic variation in polygenic characters. Genetics 107, 321-330.
Gillespie, J. H. (1986). Variability of evolutionary rates of DNA. Genetics 113, 1077-1091.
Hastings, A. (1987). Substitution rates under stabilizing selection. Genetics 116, 479-486.
Karlin, S. (1979). Principles of polymorphism and epistasis for multilocus systems. Proceedings of the National Academy of Science (USA) 76, 541-545.
Kaufmann, S. & Levin, S. A. (1987). Towards a general theory of adaptive walks on a rugged landscape. Journal of Theoretical Biology 128, 11-46.
Kimura, M. (1981). Possibility of extensive neutral evolution under stabilizing selection with special reference to non-random usage of synonymous codons. Proceedings of the National Academy of Science (USA) 78, 5773-5777.
Kimura, M. (1983). The Neutral Theory of Molecular Evolution. Cambridge University Press.
Lande, R. (1975). The maintenance of genetic variability by mutation in a polygenic character with linked loci. Genetical Research 26, 221-236.
Latter, B. D. H. (1960). Natural selection for an intermediate optimum. Australian Journal of Biological Sciences 13, 30-35.
Maynard Smith, J. (1979). The effects of normalizing and disruptive selection on genes for recombination. Genetical Research 33, 121-128.
Nagylaki, T. (1986). The Gaussian approximation for random genetic drift. In Evolutionary Processes and Theory (eds. S. Karlin & E. Nevo), pp. 629-644. Academic Press, New York.
Provine, W. (1986). Sewall Wright and Evolutionary Biology. University of Chicago Press.
Rouhani, S. & Barton, N. H. (1987). Speciation and the 'shifting balance' in a continuous population. Theoretical Population Biology 31, 465-492.
Schulman, L. (1981). Techniques and Applications of Path Integration. John Wiley, New York.
Shrimpton, A. E. & Robertson, A. (1988). The isolation of polygenic factors controlling bristle score in Drosophila melanogaster. II. Distribution of the third chromosome bristle effects on chromosome sections. Genetics 118, 445-459.
Slatkin, M. (1979). Frequency- and density-dependent selection on a quantitative character. Genetics 93, 755-771.
Turelli, M. (1984). Heritable genetic variation via mutation-selection balance: Lerch's zeta meets the abdominal bristle. Theoretical Population Biology 25, 138-193.
Turelli, M. (1986). Gaussian versus non-Gaussian genetic analyses of polygenic mutation-selection balance. In Evolutionary Processes and Theory (eds. S. Karlin & E. Nevo), pp. 607-628. Academic Press, New York.
Turelli, M. & Barton, N. H. (1989). Dynamics of polygenic characters under selection. Manuscript submitted to Genetics.
Wright, S. (1932). The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proceedings of the Sixth International Congress of Genetics 1, 356-366.


Wright, S. (1935). Evolution in populations in approximate equilibrium. Journal of Genetics 30, 257-266.
Wright, S. (1980). Genetic and organismic selection. Evolution 34, 825-843.

Appendix 1. Details of exact integration, and the partial Gaussian approximation

Distribution across equilibria

The chance that the population is near some equilibrium $m$ is proportional to $Z_m = \iint \bar{W}^{2N} F_m(z, v)\,dz\,dv$ (from eqn 13). Substituting for $\bar{W}$ from eqn (2) and for $F_m$ from eqn (14e) we see that the integral over $z$ gives the Fourier transform of a Gaussian (another Gaussian), and the integral over $v$ gives the Fourier transform of an exponential (a delta function). Thus:

$$Z_m \propto \int \cdots\, (H_-(\alpha z, iNs\alpha^2))^m\, (H_+(\alpha z, iNs\alpha^2))^M \, dz. \qquad \text{(A 1.1)}$$

A close approximation to this expression can be obtained by noting that:

$$H_-(\alpha z, iNs\alpha^2) = \int_0^{\hat{p}} (pq)^{4N\mu - 1}\, e^{i\alpha z (p - q)}\, e^{-2Ns\alpha^2 pq}\, dp. \qquad \text{(A 1.2)}$$

This is the Fourier transform of the allele frequency distribution which would obtain at each locus if the mean always coincided with the optimum: this distribution is given by eqn (26) of Bulmer (1972). (Strictly, $H_-$ is the Fourier transform of the distribution of $\alpha(p - q)$.) Now the product of the Fourier transforms of a set of distributions is the Fourier transform of the convolution of those distributions. The product $H_-^m H_+^M$ is therefore the Fourier transform of the distribution of the phenotypic mean ($z = \sum \alpha(p - q)$) which would obtain if fluctuations from the optimum had no effect; this distribution will be denoted by a tilde, ~. (It would be found if stabilizing selection were centred on the actual mean in each generation, and so tended to reduce the variance without introducing any directional component.) With a large number of loci, we expect this distribution to be approximately Gaussian, and so characterized by its mean and variance ($\tilde{E}(z)$, $\widetilde{\mathrm{var}}(z)$). In other words, its Fourier transform $H_-^m H_+^M$ should be approximated by a Taylor series expansion around $z = 0$. Since $\partial \log(H_-^m H_+^M)/\partial z = i\tilde{E}(z)$, and $\partial^2 \log(H_-^m H_+^M)/\partial z^2 = -\widetilde{\mathrm{var}}(z)$ (from eqn 14d, e), we have:

$$(H_-(\alpha z, iNs\alpha^2))^m (H_+(\alpha z, iNs\alpha^2))^M \approx (H_-(0, iNs\alpha^2))^m (H_+(0, iNs\alpha^2))^M \exp(iz\tilde{E}(z)) \exp(-z^2\, \widetilde{\mathrm{var}}(z)/2). \qquad \text{(A 1.3)}$$

Substituting into eqn (A 1.1), and integrating across $z$:

$$Z_m \approx \frac{(H_-(0, iNs\alpha^2))^m (H_+(0, iNs\alpha^2))^M}{\sqrt{1 + 2Ns\, \widetilde{\mathrm{var}}(z)}} \exp\!\left(-\frac{(\tilde{E}(z) - z_0)^2\, Ns}{1 + 2Ns\, \widetilde{\mathrm{var}}(z)}\right). \qquad \text{(A 1.4)}$$

The 'partial Gaussian' approximation used here, that $H_-^m H_+^M$ is roughly Gaussian in $z$, is much weaker than the full 'Gaussian approximation' discussed above and in Appendix 2, that allele frequencies at individual loci are normally distributed. For the parameter values used in this paper, eqn (A 1.4) is usually close to the exact result, eqn (A 1.1) (Figs. 7, 8).

The marginal distribution, and transition rates

Consider one locus, with allele frequencies $u$, $v$. If $(m - 1)$ of the other loci are near fixation for the '-' allele, and $M$ are near fixation for the '+' allele, then the population will be close to an equilibrium in the class $[m, 0, M]$ when $u$ is small, and will move towards $[m - 1, 0, M + 1]$ as $u$ increases. For given $u$, the distribution of the remaining $(n - 1)$ allele frequencies will be as if the optimum were at $(z_0 - \alpha(u - v))$. Hence, the integral over these $(n - 1)$ variables will be given by eqn (A 1.4) evaluated for this shifted optimum, and for an equilibrium in the class $[m - 1, 0, M]$:

$$\psi_0(u) \approx \cdots\, \frac{[H_-(0, iNs\alpha^2)]^{m-1}\, (H_+(0, iNs\alpha^2))^M}{Z_m \sqrt{1 + 2Ns\, \widetilde{\mathrm{var}}(z)}} \exp\!\left(-\frac{(\tilde{E}(z) - z_0 + \alpha(u - v))^2\, Ns}{1 + 2Ns\, \widetilde{\mathrm{var}}(z)}\right). \qquad \text{(A 1.5)}$$

$H_-$, $H_+$, $\tilde{E}(z)$ and $\widetilde{\mathrm{var}}(z)$ are evaluated using the threshold frequencies appropriate for the shifted optimum, and so depend on $u$. However, this dependence is weak, because the population is unlikely to lie near the threshold. (As noted above, there is a slight approximation here, because the domain of attraction of the different equilibria does not have the rectangular shape implied by dividing the allele frequency space at fixed thresholds. This approximation is needed, however, to make the integrals separable.)

The chance per generation of a shift from $[m, 0, M]$ to $[m - 1, 0, M + 1]$ is given by eqn (12) where $\int \psi_0\, dS = \psi_0(u)$, and $e = u$; the integral over $u$ is taken between the two stable equilibria, $u = p$ and $u = P$.

The distribution of the phenotypic mean and variance

The expected variance is (from eqn 14c):

$$E(v) = \iint \cdots\, (H_-(\alpha z, \alpha^2 v))^m\, (H_+(\alpha z, \alpha^2 v))^M \, dz\, dv. \qquad \text{(A 1.6)}$$

This is similar to the integral used to derive $Z_m$ (eqn A 1.1); it differs in that the integral over $v$ gives the


Fourier transform of $v\exp(-Nsv)$, which is the differential of a delta function ($i\,\partial\delta(v - iNs)/\partial v$). If this is integrated by parts:

$$E(v) = \cdots \int \cdots\, \frac{\partial}{\partial v}\left\{ (H_-(\alpha z, i\alpha^2 Ns))^m\, (H_+(\alpha z, i\alpha^2 Ns))^M \right\} dz \qquad \text{(A 1.7)}$$

(where the differential $\partial/\partial v$ is evaluated at $v = iNs$). Next, apply the 'partial Gaussian' approximation, and expand the term $\partial(H_-^m H_+^M)/\partial v$ as a Taylor series in $z$, as for $Z_m$. This gives (as before) terms involving $\tilde{E}(pq)$, $\tilde{E}(z)$, and $\widetilde{\mathrm{var}}(z)$, these moments being taken across the distribution $(pq)^{4N\mu - 1}\exp(-2Ns\alpha^2 pq)$ which obtains if fluctuations from the optimum are ignored. There are now also terms $E(p - q)$, and $\mathrm{var}(p - q)$, these moments being taken across the distribution $(pq)^{4N\mu}\exp(-2Ns\alpha^2 pq)$. A little algebra gives:

$$E(v) = \cdots$$

where $\nu = 1 + 2Ns\, \widetilde{\mathrm{var}}(z)$, $\Delta = \tilde{E}(z) - z_0$,

$$\Delta_- = \Delta + \alpha\,(E_-(p - q) - \cdots), \qquad \nu_+ = \nu + 2Ns\alpha^2\,(\mathrm{var}_+(p - q) - \cdots)$$

... is approximately $\int (pq)^{4N\mu - 1} \exp(-2Ns\alpha^2 p)\, dp$, and so $\widetilde{\mathrm{var}}(p) \approx \mu/[N(s\alpha^2)^2]$, and $2Ns\, \widetilde{\mathrm{var}}(z) \approx 8n\mu/s\alpha^2$. We have assumed that $n$ is large and that $\mu/s\alpha^2$ is small; however, $8n\mu/s\alpha^2$ could be large or small.

These arguments suggest that the phenotypic arguments used to derive eqn (8) will only be valid if there are a very large number of loci, and if allele frequencies at each locus are not too low. We can see these conditions from a different angle by noting that $2Ns\, \widetilde{\mathrm{var}}(z) = 8n\mu/s\alpha^2 = 2V^*/\alpha^2$, the ratio between $V^*$, the equilibrium variance under the rare alleles approximation, and the variance of a single substitution: the phenotypic arguments will hold when $\alpha^2$ is much smaller than the total standing genetic variance.

Appendix 2. Details of the full Gaussian approximation

The distribution of the mean and variance

The distribution of the mean and variance, for populations near to some equilibrium $m$, is $\psi_0(z, v) = \bar{W}^{2N}(z, v)\, F_m(z, v)/Z_m$ (eqn 13), where $F_m$ is given by eqn (14). For Gaussian stabilizing selection on a large number of loci, this distribution is itself almost exactly Gaussian, and so can be summarized by its mean and variance. Exact integration cannot give explicit analytic


we can write $pq = PQ = \mu/s\alpha^2 = \gamma$, and these expressions simplify considerably:

$$E(v) = (4n\mu/s)\left[1 - \frac{\cdots}{2Ns\alpha^2 (1 - 8\gamma)(1 + 8\gamma(n - 1))}\right] \qquad \text{(A 2.2a)}$$

$$\mathrm{var}(z) = \frac{(4n\mu/s)}{Ns\alpha^2 (1 + 8\gamma(n - 1))} \qquad \text{(A 2.2b)}$$

$$\mathrm{var}(v) = \frac{\cdots\,(\cdots - 8\gamma + 16mM\gamma/n)}{(Ns(1 + 8\gamma(n - 1)))} \qquad \text{(A 2.2c)}$$

where $\gamma = \mu/s\alpha^2$.

These values can be compared with the phenotypic approximations of eqn (8). Suppose that $n$ is large, and $\gamma = \mu/s\alpha^2$ is small. Then, eqn (A 2.2a) predicts that sampling drift will decrease the phenotypic variance, $v$, by a factor $(1/2Ns\alpha^2)$. This conflicts with the prediction of a factor $(1/Ns\alpha^2)$, from eqn (8). Simulations show that in fact, neither method is particularly accurate (Fig. 8, and below).

The predicted variances of $z$ and $v$ differ from the phenotypic predictions by factors involving $n\gamma$. For the range of parameters considered here, this could be large or small: the variance of $z$ agrees with eqn (8) when $n\gamma$ is large, whilst the variance of $v$ agrees when $n\gamma$ is small. The discrepancy arises through the accumulation of small deviations (of order $\gamma$) from the assumptions of eqn (8) at many ($n$) loci.

The distribution across equilibria

The probability that a population will be found near some particular equilibrium in the class $[m, 0, M]$ can be found by integrating $\psi_0(p, P)$. Since this distribution is approximately Gaussian, the integral introduces a factor $\sqrt{\det(2\pi/\log(\psi_0)'')}$, where $\log(\psi_0)''$ is the curvature matrix. This curvature matrix can be found directly by exact integration. With the Gaussian approximation, an explicit formula can be found: multiplying eqn (18) by this factor:

$$\frac{Z_m}{Z} = \frac{\exp(2NU)}{Z}\, \cdots \qquad \text{(A 2.3)}$$

$D$ is the product of eigenvalues of the deterministic equations (eqn 4a); this expression could of course be obtained directly from that determinant, without passing through the intermediate calculation of $\psi_0(p, P)$. The probability of being near a particular equilibrium depends mainly on the potential $U$, but also decreases with $D$: equilibria which are extremely stable, in the sense of having large $D$, are surrounded by a more compact cloud of populations.

Transitions between equilibria

Consider a shift from somewhere near one equilibrium in the class $[m, 0, M]$, via the unstable state $[m, 1, M - 1]$, towards a new state, $[m + 1, 0, M]$. Label the initial stable state $S$, and the unstable intermediate $U$. The probability that such a shift will occur is $F$ per unit time; conversely, the expected time before a shift is $F^{-1}$. The transition rate is given by eqn (12). We must make the slight approximation that the ridge of probability which connects the two stable states represents change of allele frequency at one locus ($p_i$, say). Integration across transverse surfaces $S$ then requires that $\psi_0$ be integrated over the other $n - 1$ allele frequencies, keeping $p_i$ fixed. This gives the marginal distribution of $p_i$, which will have a minimum at the unstable threshold, $\hat{p}$. The integral of the inverse of this marginal distribution gives the inverse of the transition rate (eqn 12). Note that fluctuations at all the loci will affect the marginal distribution of $p_i$, and so this method gives different (and better) results than the simple single-locus approximations discussed above.

If we assume that the allele frequency distribution $\psi_0$ is Gaussian, transition rates are simply given by eqn (15) of Barton & Rouhani (1987):

$$F = \cdots\, \exp(-2N\Delta U) \qquad \text{(A 2.4)}$$

$D_S$, $D_U$ are the determinants which give the stability of the system near the stable and unstable points (eqn 4), and $V_S$, $V_U$ are the products of the genetic variances (eqn 5) at these points. $\lambda$ is the leading eigenvalue at the unstable point.
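
The Gaussian integration over fluctuations around an equilibrium, which introduces the curvature factor in the distribution across equilibria, is an instance of Laplace's approximation. The following is a minimal sketch in Python for a single variable. It assumes a hypothetical potential `toy_potential`, chosen only for illustration and not taken from this paper, and the names `exact_integral` and `laplace_integral` are likewise illustrative. It compares the integral of exp(2N U(p)) over allele frequency, found by Simpson's rule, with the Gaussian value exp(2N U(p*)) sqrt(2 pi / (2N |U''(p*)|)).

```python
import math


def toy_potential(p):
    """Hypothetical skewed potential with a single peak at p = 0.3 (illustration only)."""
    x = p - 0.3
    return -x * x * (1.0 + 0.5 * x)


def exact_integral(two_n, lo=0.0, hi=1.0, steps=20000):
    """Composite Simpson's rule for the integral of exp(2N * U(p)) over [lo, hi]."""
    h = (hi - lo) / steps
    total = math.exp(two_n * toy_potential(lo)) + math.exp(two_n * toy_potential(hi))
    for k in range(1, steps):
        weight = 4 if k % 2 else 2
        total += weight * math.exp(two_n * toy_potential(lo + k * h))
    return total * h / 3.0


def laplace_integral(two_n, peak=0.3, eps=1e-4):
    """Gaussian approximation: exp(2N U(p*)) * sqrt(2 pi / (2N |U''(p*)|))."""
    # Central finite difference for the curvature U''(p*) at the peak.
    curvature = (toy_potential(peak + eps) - 2.0 * toy_potential(peak)
                 + toy_potential(peak - eps)) / eps ** 2
    return math.exp(two_n * toy_potential(peak)) * math.sqrt(
        2.0 * math.pi / (two_n * -curvature))


if __name__ == "__main__":
    for two_n in (20, 200):
        ratio = exact_integral(two_n) / laplace_integral(two_n)
        print(f"2N = {two_n}: exact / Gaussian = {ratio:.4f}")
```

In this toy case the two values agree closely once 2N is large enough for the distribution to sit well inside the interval, and the agreement worsens as 2N falls and the skew and the boundary at p = 0 start to matter.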
