
Genet. Res., Camb. (1999), 73, pp. 133–146. With 6 figures. Printed in the United Kingdom © 1999 Cambridge University Press

The effect of background selection at a single locus on weakly selected, partially linked variants

WOLFGANG STEPHAN¹*, BRIAN CHARLESWORTH² AND GILEAN McVEAN²
¹ Department of Biology, University of Rochester, Rochester, NY 14627-0211, USA
² Institute of Cell, Animal and Population Biology, University of Edinburgh, Edinburgh EH9 3JT, UK
(Received 18 May 1998 and in revised form 18 September 1998)

Summary

Previous work has shown that genetic diversity at a neutral locus is affected by background selection due to recurrent deleterious mutations as though the effective population size $N_e$ is reduced by a factor that is calculable from genetic parameters such as mutation rates, selection coefficients, and the rates of recombination between sites subject to selection and the neutral locus. Given that silent changes at third coding positions are often subject to weak selection pressures, it is important to develop similar quantitative predictions of the effects of background selection on variation and evolution at weakly selected sites. A diffusion approximation is derived that describes the effects of the presence of a single locus subject to mutation and strongly deleterious selection on variation and evolution at a partially linked, weakly selected locus. The results are validated by computer simulations using the Ito pseudo-sampling method. We show that both nucleotide site diversity and rates of molecular evolution at a weakly selected locus are affected by background selection as though $N_e$ is reduced in the same way as for a neutral locus. Heuristic arguments are presented as to why the change in $N_e$ for the neutral case also applies with weak selection. As in the case of a neutral locus, the number of segregating sites in the population is poorly predicted from the change in $N_e$. The potential significance of the results in relation to the effects of recombinational environment on molecular variation and evolution is discussed.

* Corresponding author. Telephone: +1 (716) 275 009. Fax: +1 (716) 275 2070. e-mail: stephan@troi.cc.rochester.edu.

1. Introduction

There has recently been a great deal of theoretical work on variation and evolution at loci that are closely linked to sites that are the targets of selection. This work has largely been motivated by the observation of reduced DNA variability in natural populations of Drosophila at loci situated in regions where genetic recombination is relatively infrequent, as compared with regions where it occurs at normal frequencies (reviewed by Aguadé & Langley, 1994; Aquadro et al., 1994; Moriyama & Powell, 1996). Similar patterns have recently been detected in Mus (Nachman, 1997), Aegilops (Dvorak et al., 1998) and Lycopersicon (Stephan & Langley, 1998). In addition, the level of codon bias in D. melanogaster is lower in regions of reduced recombination, suggesting that selection at weakly selected sites is less effective when recombination is infrequent (Kliman & Hey, 1993). These observations can be explained by (i) 'selective sweeps' of favourable mutations, which result in the fixation of adjacent chromosomal regions (Maynard Smith & Haigh, 1974; Kaplan et al., 1989; Stephan et al., 1992; Stephan, 1995), (ii) 'background selection', in which neutral or nearly neutral variants are lost as a result of linkage to strongly deleterious mutant alleles that are destined to be rapidly eliminated from the population (Charlesworth et al., 1993, 1995; Charlesworth, 1994, 1996; Hudson, 1994; Hudson & Kaplan, 1994, 1995; Nordborg et al., 1996), (iii) temporally varying selection pressures, which can cause linked variants to be brought close to loss or fixation (Gillespie, 1994, 1997; Barton, 1995). In this paper we will be concerned solely with the effects of background selection. Previous work on this problem has concentrated on the properties of neutral

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.139, on 29 Sep 2021 at 23:05:34, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672399003705 W. Stephan et al. 134

variants linked to loci under selection. The reduction in genetic diversity at neutral sites predicted by the background selection model can be largely understood in terms of a reduction in effective population size ($N_e$) caused by the fitness effects of the loci under selection (Charlesworth et al., 1993), and useful formulae for this reduction have been developed (Hudson, 1994; Hudson & Kaplan, 1994, 1995; Nordborg et al., 1996; Santiago & Caballero, 1998). Charlesworth (1994) derived results for the effects of background selection on the probabilities of fixation of weakly selected favourable or deleterious alleles in a non-recombining genome; Peck (1994) studied the effect of background selection on the fixation probability of a strongly selected favourable allele in a non-recombining genome subject to recurrent deleterious mutations. Barton (1995) has obtained results for the effect of recurrent deleterious mutations at a single locus on the fixation probability of a linked, strongly selected, favourable mutation.

The results of these calculations suggest that background selection affects fixation probabilities through a reduction in $N_e$, in a way which is virtually identical to its effect on diversity in the neutral case. But no theoretical results on fixation probabilities or genetic diversity for the effect of background selection on partially linked, weakly selected loci are available. (By weak selection, we mean that the product of the effective population size and the selection coefficient is of the order of one, so that finite population size effects can significantly influence the fate of the alleles in question.) Given the evidence that such weak selection pressures are important in controlling patterns of variation and evolution at synonymous sites in Drosophila (Akashi, 1995, 1996; Akashi & Schaeffer, 1997), it is clearly important to have a quantitative theory of the effects of background selection on weakly selected sites. The purpose of the present paper is to present some analytical and simulation results for the effects of the presence of a single locus subject to mutation and strongly deleterious selection on a partially linked, weakly selected locus. We show that both nucleotide site diversity and rates of molecular evolution at the weakly selected locus are affected by background selection as though $N_e$ is reduced in the same way as for a neutral locus.

2. The model and its assumptions

We consider a two-locus, two-allele system, in a population of $N$ diploid, randomly mating individuals. We assume that $N$ is sufficiently large and selection is sufficiently strong that the alleles A and a at the first locus are in an approximate equilibrium, due to selection against the deleterious mutant allele a. Following the notation of Nordborg et al. (1996), the frequency of the wild-type allele A is denoted by $p$ and that of the mutant allele a by $q$. Furthermore, we assume that $q \ll 1$; thus, selection acts primarily against heterozygous carriers of the mutant allele. Let $u$ be the mutation rate from A to a, and $t$ be the product of the dominance coefficient and the coefficient of selection against mutant homozygotes; then $q = u/t$ (approximately), according to the classical formula for equilibrium under mutation and selection. The second locus is assumed to be under directional selection; furthermore, it is assumed that dominance is intermediate, so that Bb and BB individuals have fitness $1+s$ and $1+2s$, respectively, relative to a fitness of 1 for bb. (The latter assumption could be generalized to arbitrary degrees of dominance without changing the main conclusions of this paper.) $s$ may be negative or positive. We assume that $|s| \ll t$. Recombination between the two loci is measured by the recombination fraction, $r'$.

Since the first locus is assumed to be in equilibrium, it is convenient to deviate from the standard notation of two-locus theory and to introduce the following variables: $x$ is the frequency of B among chromosomes carrying allele A, and $y$ is the frequency of B among chromosomes carrying allele a. Then the deterministic recurrence equations for the changes in $x$ and $y$ are (Nordborg et al., 1996)

$$\begin{aligned} \Delta x &= s x(1-x) - q r (x-y), \\ \Delta y &= s y(1-y) + p(r+t)(x-y), \end{aligned} \tag{1}$$

where $r = r'(1-t)$. These equations are exact to $O(s)$ and $O(q)$. On the boundary, the vector field (defined by the right-hand side of (1)) is directed towards the interior of the unit square, except at (0, 0) and (1, 1).

In parallel, we consider the corresponding diffusion process. On the time scale of generations (measured by $\tau$), the Kolmogorov forward equation for the probability densities of $x$ and $y$ is (cf. Nordborg et al., 1996)

$$\frac{\partial}{\partial\tau} p(x, y, \tau) = L\,p(x, y, \tau), \tag{2}$$

where

$$L = \frac{1}{4Nq}\frac{\partial^2}{\partial y^2}\,y(1-y) - \frac{\partial}{\partial y}\left[s y(1-y) + p(r+t)(x-y)\right] + \frac{1}{4Np}\frac{\partial^2}{\partial x^2}\,x(1-x) - \frac{\partial}{\partial x}\left[s x(1-x) - q r (x-y)\right]. \tag{3}$$

Equation (2), with $L$ defined as in (3), is exact to $O(N^{-1}, s)$; quadratic and higher-order terms in $q$ are also neglected in the drift operator. To define the diffusion problem completely, initial and boundary conditions have to be specified. With respect to the initial value, we assume throughout this paper that a diffusion process which is at $(x, y)$ at time $\tau$ has started


at $(\xi, \eta)$ at time 0. The definition of boundary conditions is less clear, because a general theory of boundary conditions is not available for two-dimensional diffusions. However, it appears that no conditions can be imposed on the boundary of the unit square, except for (0, 0) and (1, 1). This is because the boundary of the unit square is not invariant under the vector field defined by the right-hand side of (1), except that (0, 0) is mapped onto itself, as is (1, 1). The diffusion will eventually be absorbed at (0, 0) or (1, 1). Therefore, we will be able to impose conditions on the probability of ultimate fixation of allele B, which we calculate in Section 4.

3. Approximate solution of the diffusion equation

The diffusion equation (2) cannot be solved exactly by analytical methods. In this section, we apply an approximation method that allows us to reduce the diffusion equation to an exactly soluble one. To describe this procedure, we start with the deterministic equations (1). Because we assumed that $q \ll 1$ and $|s| \ll t$, the dynamics of the deterministic system are such that the trajectory, starting at some point $(\xi, \eta)$, rapidly approaches a quasi-equilibrium state $y = \bar{y}(x)$ near the diagonal $y = x$. After this rapid relaxation phase, the system moves slowly towards the stationary solution (i.e. (0, 0) or (1, 1)), while staying close to the diagonal. The fast-relaxing variable is $y$, because for initial values $\xi \neq \eta$ its dynamics are mainly determined by the parameter $p(r+t)$; because of our standard assumptions ($q \ll 1$ and $|s| \ll t$), this parameter is much larger than the other two parameters of (1), $s$ and $q$. This procedure thus leads to an 'adiabatic' elimination of variable $y$, which varies rapidly on the time scale of the slowly varying variable $x$. The two equations can therefore be decoupled and solved successively, starting with the equation for $y$.

Similar elimination procedures have been developed for diffusion processes. We follow the method of Gardiner (1990, chap. 6.6.3). This method leads to more accurate results than the standard approximation procedure used in population genetics for the elimination of fast-changing variables (which are in a quasi-equilibrium) from multidimensional diffusion equations (e.g. Kimura, 1985; Stephan, 1996). We first write the diffusion operator $L$ as

$$L = bL_1 + L_2 + L_3, \tag{4a}$$

where $b = p(r+t)$ is the relaxation coefficient of the fast variable, $y$, and $L_1$ describes the diffusion of the fast variable $y$ (which depends on $x$); thus,

$$L_1 = \frac{1}{4Nqb}\frac{\partial^2}{\partial y^2}\,y(1-y) - \frac{\partial}{\partial y}\left[\frac{s}{b}\,y(1-y) + x - y\right]. \tag{4b}$$

Furthermore, the exchange between the 'subpopulations' of A and a chromosomes (caused by recombination and mutation) and the slow movement along the diagonal $y = x$ (caused by selection) will be decoupled as follows:

$$L_2 = -a\,\frac{\partial}{\partial x}\left(y - \bar{y}(x)\right), \tag{4c}$$

and

$$L_3 = \frac{1}{4Np}\frac{\partial^2}{\partial x^2}\,x(1-x) - \frac{\partial}{\partial x}\left[s x(1-x) - a\left(x - \bar{y}(x)\right)\right], \tag{4d}$$

where $a = qr$. We also write $p_x(y)$ for the stationary solution of the fast dynamics; i.e. the solution of

$$L_1\,p_x(y) = 0, \tag{5a}$$

and the quasi-equilibrium for $y$ is given by

$$\bar{y}(x) = \int_0^1 y\,p_x(y)\,dy. \tag{5b}$$

Gardiner (1990) has shown that an approximate solution of (2) and (3) can be obtained in terms of the Laplace transform. The Laplace transform of any function of time $f(\tau)$ is defined by

$$\tilde{f}(\sigma) = \int_0^\infty e^{-\sigma\tau} f(\tau)\,d\tau. \tag{6}$$

The solution, which is accurate to $O(N^{-1}, s)$, is then found as (Gardiner, 1990, eqs. [6.6.83] and [6.6.105])

$$\sigma\tilde{\nu}(\sigma) = \left\{PL_3 - \frac{1}{b}\,PL_2L_1^{-1}\left(L_2 + [L_3, P]\right)\right\}\tilde{\nu}(\sigma) + \nu(0). \tag{7}$$

Here we have used the projection operator

$$Pf(x, y) = p_x(y)\int_0^1 f(x, y)\,dy, \tag{8}$$

where $f(x, y)$ is an arbitrary function. $[L_3, P]$ is a commutator defined as $[L_3, P] = L_3P - PL_3$. It follows from (8) that

$$\nu(\tau) = p_x(y)\,\hat{p}(x, \tau), \tag{9a}$$

where

$$\hat{p}(x, \tau) = \int_0^1 p(x, y, \tau)\,dy \tag{9b}$$

and $p(x, y, \tau)$ is the solution of (2) and (3).

We first consider the term $PL_3\tilde{\nu}(\sigma)$ of (7). Using (9), we find

$$PL_3\,\nu(\tau) = p_x(y)\left\{\frac{1}{4Np}\frac{\partial^2}{\partial x^2}\,x(1-x) - \frac{\partial}{\partial x}\left[s x(1-x) - a\left(x - \bar{y}(x)\right)\right]\right\}\hat{p}(x, \tau). \tag{10}$$

The term in curly brackets is the infinitesimal operator of a diffusion in $x$ which would result if the fast-changing variable $y$ were eliminated – as is generally done in population genetic applications of diffusion


theory – by simply using the deterministic equations (1). However, as simulation has revealed (results not shown), this reduction procedure produces poor results in this case. We therefore evaluate the remaining terms of (7).

We begin with the operator $PL_2L_1^{-1}L_2$. A convenient expression can be found if we use the identity (Gardiner, 1990, eq. [6.5.33])

$$L_1^{-1}(1 - P) = -\int_0^\infty e^{L_1\tau}\,d\tau. \tag{11}$$

Using (11), we immediately find

$$PL_2L_1^{-1}L_2\,p_x(y)\,\hat{p}(x) = -p_x(y)\int_0^1 dy'\,L_2\int_0^\infty e^{L_1\tau}\,d\tau\,L_2\,p_x(y')\,\hat{p}(x). \tag{12}$$

To evaluate the integrals, we need an explicit expression for $p_x(y)$, which is defined by (5a). Equation (5a) can be written in the form

$$\frac{1}{2}\frac{\partial^2}{\partial y^2}\,y(1-y)\,p_x(y) - \frac{\partial}{\partial y}\left[\alpha y(1-y) - \beta(1-x)y + \beta x(1-y)\right]p_x(y) = 0, \tag{13}$$

where we use the abbreviations $\alpha = 2Nqs$ and $\beta = 2Nqb$. The form of (13) is well known (e.g. Ewens, 1979, chap. 5.6); its solution is

$$p_x(y) = C_x\,y^{2\beta x - 1}(1-y)^{2\beta(1-x) - 1}e^{2\alpha y}, \tag{14}$$

where $C_x$ is a normalization constant. Furthermore, we need the partial derivative

$$\frac{\partial}{\partial x}\,p_x(y) = 2\beta\,p_x(y)\left[\ln\frac{y}{1-y} - E_{ss}\left(\ln\frac{y}{1-y}\right)\right], \tag{15a}$$

where $E_{ss}(\ldots)$ denotes the expectation with regard to the stationary solution $p_x(y)$. Since the diffusion process is expected to stay close to the quasi-equilibrium $y = \bar{y}(x)$ after the short initial phase, we may expand the function in the square brackets into a Taylor series about $y = \bar{y}(x)$:

$$\frac{\partial}{\partial x}\,p_x(y) = 2\beta\,p_x(y)\left\{\frac{1}{\bar{y}(x)(1-\bar{y}(x))}\left[y - \bar{y}(x)\right] - \frac{1}{2}\,\frac{1 - 2\bar{y}(x)}{\left(\bar{y}(x)(1-\bar{y}(x))\right)^2}\left[(y - \bar{y}(x))^2 - E_{ss}\left((y - \bar{y}(x))^2\right)\right] + O\left((y - \bar{y}(x))^3\right)\right\}. \tag{15b}$$

Thus, using (15b) with terms up to second order in $y - \bar{y}(x)$, we can write

$$PL_2L_1^{-1}L_2\,p_x(y)\,\hat{p}(x) = -a^2\,p_x(y)\,\frac{\partial}{\partial x}\left\{-D_0(x)\frac{d\bar{y}(x)}{dx} + D_1(x)\left[\frac{\partial}{\partial x} + \beta\,\frac{1 - 2\bar{y}(x)}{\left(\bar{y}(x)(1-\bar{y}(x))\right)^2}\,E_{ss}\left((y - \bar{y}(x))^2\right)\right] + 2\beta D_2(x)\,\frac{1}{\bar{y}(x)(1-\bar{y}(x))} - \beta D_3(x)\,\frac{1 - 2\bar{y}(x)}{\left(\bar{y}(x)(1-\bar{y}(x))\right)^2}\right\}\hat{p}(x), \tag{16}$$

where

$$D_n(x) = \int_0^1 dy'\,(y' - \bar{y}(x))\int_0^\infty e^{L_1\tau}\,d\tau\,(y' - \bar{y}(x))^n\,p_x(y'), \quad n \geq 0. \tag{17}$$

$D_n(x)$ can be computed in a straightforward way (Appendix, Section (i)):

$$D_n(x) = \left[1 + \frac{s}{b}\left(1 - 2\bar{y}(x)\right)\right]E_{ss}\left((y - \bar{y}(x))^{n+1}\right), \tag{18}$$

which is $O(s)$. The stationary moments can also be calculated to a sufficient degree of accuracy (Appendix, Section (ii)). Using the formulae for the moments (A.7–A.10), (16) leads to

$$-\frac{1}{b}\,PL_2L_1^{-1}L_2\,\nu(\tau) = p_x(y)\left\{q\,\frac{1}{4N}\,\frac{r^2}{(r+t)^2}\,\frac{\partial^2}{\partial x^2}\,x(1-x)\right\}\hat{p}(x, \tau). \tag{19}$$

This equation is $O(N^{-1}, s)$. Quadratic terms in $q$ have also been neglected.

Finally, we have to evaluate the operator $PL_2L_1^{-1}[L_3, P]$, where (Gardiner, 1990, eq. [6.6.106]) we have

$$[L_3, P]\,\nu(\tau) = p_x(y)\left\{r_x(y)\left[\frac{1}{4Np}\frac{\partial}{\partial x}\,x(1-x) - \left[s x(1-x) - a\left(x - \bar{y}(x)\right)\right]\right] + s_x(y)\,\frac{1}{4Np}\,x(1-x)\right\}\hat{p}(x, \tau), \tag{20a}$$

and

$$r_x(y) = \frac{\partial p_x(y)/\partial x}{p_x(y)}, \qquad s_x(y) = \frac{\partial^2 p_x(y)/\partial x^2}{p_x(y)}. \tag{20b}$$

A straightforward calculation yields

$$-\frac{1}{b}\,PL_2L_1^{-1}[L_3, P]\,\nu(\tau) = p_x(y)\left\{-q\,\frac{1}{2N}\,\frac{r}{r+t}\,\frac{\partial^2}{\partial x^2}\,x(1-x) + sq\,\frac{r}{r+t}\,\frac{\partial}{\partial x}\,x(1-x)\right\}\hat{p}(x, \tau). \tag{21}$$

Adding (10), (19) and (21), and using (7), results in a


diffusion equation that describes the dynamics of the slow variable $x$:

$$\frac{\partial}{\partial\tau}\,\hat{p}(x, \tau) = \frac{1}{4N}\left[1 + q\,\frac{t^2}{(r+t)^2}\right]\frac{\partial^2}{\partial x^2}\,x(1-x)\,\hat{p}(x, \tau) - s\,\frac{\partial}{\partial x}\,x(1-x)\,\hat{p}(x, \tau). \tag{22}$$

(To first order in $q$, $1/p \approx 1 + q$, so the second-derivative terms of (10), (19) and (21) combine with coefficient $1 + q\left[1 - 2r/(r+t) + r^2/(r+t)^2\right] = 1 + qt^2/(r+t)^2$.) Again, this equation is $O(N^{-1}, s)$, and neglects quadratic and higher-order terms in $q$. It has the same form as a one-dimensional diffusion equation with directional selection. The additional factor (in front of the diffusion operator) indicates that the effective population size is reduced by a factor of $1 - qt^2/(r+t)^2$. The same reduction in $N_e$ has been found by Hudson & Kaplan (1994, 1995) and Nordborg et al. (1996) in their analysis of the effects of background selection and recombination on neutral variation.

4. Applications of the results

(i) Fixation probability and the rate of molecular evolution

Equation (22) shows that the distribution of allele frequencies among the A-carrying chromosomes is described by the standard forward diffusion equation of population genetics (Kimura, 1964, 1983), with a simple modification to the effective population size, $N_e$, which describes the sampling variance of allele frequency over a single generation. Since the forward equation can, in principle, be used to obtain all properties of interest, including fixation probabilities and sojourn times in specified intervals of gene frequency (Kimura, 1964, 1983), this result can be used to obtain the relevant formulae, by substituting this expression for $N_e$ into standard equations.

We first provide an approximate formula for the fixation probability of a weakly selected mutant. To a good approximation, it should be legitimate to consider only the distribution of B among A chromosomes, as in (22), since equations (1) imply that fixation of B in this class essentially guarantees fixation in the population as a whole, and is certainly required for fixation. Using standard results (Kimura, 1964, 1983), the probability of fixation, $U(\xi)$, of a mutant allele B with initial frequency $\xi$ becomes

$$U(\xi) = \frac{1 - e^{-4Nfs\xi}}{1 - e^{-4Nfs}}, \tag{23a}$$

where

$$f = 1 - q\,\frac{t^2}{(r+t)^2}. \tag{23b}$$

For large $Nfs$, (23a) for the case of a favourable mutation present as a single copy ($\xi = 1/2N$) approaches Barton's (1995) (17c):

$$U\left(\frac{1}{2N}\right) = 2sf. \tag{23c}$$

On the assumption that long-term evolutionary change results from the fixation of new mutations that occur as unique events, the rate of molecular evolution, $k$, is given by the rate of input of new mutations into the population, multiplied by their probability of fixation (Kimura, 1983, chap. 3). Thus, in the present case we have

$$k = 2N\mu\,U\left(\frac{1}{2N}\right), \tag{24}$$

where $\mu$ denotes the mutation rate with respect to the segment of the genome under consideration.

(ii) Sojourn times and nucleotide site diversity

Using (23) and (4.24) from Ewens (1979), we immediately obtain a formula for the sojourn time density for our one-dimensional diffusion, which describes the expected time that the allele B with initial frequency $\xi$ spends between frequencies $x$ and $x + dx$ among the chromosomes carrying A:

$$t(x, \xi) = U(\xi)\,\frac{1 - e^{4Nfs(x-1)}}{s x(1-x)}, \quad \xi \leq x \leq 1. \tag{25}$$

As discussed by Charlesworth et al. (1993) and Charlesworth (1994), under the infinite sites model with unidirectional mutations from b to B arising in each generation, the average nucleotide diversity, $\pi$, within the A class contributed by such variants, relative to the classical neutral value, $\pi_0$, is given by

$$\frac{\pi}{\pi_0} = \frac{1}{2}H, \tag{26a}$$

where

$$H = \int_0^1 2x(1-x)\,t(x, \xi)\,dx = \frac{2U(\xi)}{s}\left(1 - \frac{1 - e^{-4Nfs}}{4Nfs}\right). \tag{26b}$$

For $s \to 0$, $H$ converges to $4Nf\xi$; with $\xi = 1/2N$, we obtain the familiar result $\pi/\pi_0 = f$ for the effect of background selection on neutral variants (Hudson & Kaplan 1994, 1995; Nordborg et al., 1996).

If $q$ is small, as we have assumed, the overall diversity in the population is overwhelmingly determined by the diversity within the A class (Nordborg


et al., 1996), and so (26) should provide an excellent approximation for the diversity in the population as a whole; i.e. $Nf$ can be used to replace $N_e$ in the standard equation for diversity at weakly selected sites (Kimura, 1983, p. 45; Charlesworth, 1994). We would not, however, expect this to be true of the number of segregating sites maintained in the population under drift–selection–mutation equilibrium, since this is proportional to the expected sojourn time of a variant in the whole population (Ewens, 1979, p. 238). Unless the population size is extremely large, the time to loss of a B mutation that is associated with a is not greatly different from that for a neutral allele, so that the presence of a alleles does not have as marked an effect on the persistence of B variants as on the net diversity, to which a contributes little (Charlesworth et al., 1993). This applies to both neutral and weakly selected B variants.

5. Simulation results

(i) Simulation methods

In order to examine the validity of the analytical approximations derived above, we have performed simulations of the fate of novel mutants in small populations ($N = 2000$ diploid individuals). The above system of two loci with two alleles was simulated. At the A locus, a constant frequency of the strongly deleterious allele a (with heterozygous selection coefficient $t = 0.02$) was maintained at the deterministic equilibrium of $q = 0.1$ (where $q = u/t$), while novel mutations (b to B) of intermediate dominance ($h = 1/2$) were introduced as single copies at the B locus, separated from the first by recombination fraction $r'$. The effects of selection and drift on variants on chromosomes carrying A are independent

[Figure 1: four panels. (A) fixation probability and (B) diversity for advantageous variants; (C) fixation probability and (D) diversity for deleterious variants. x-axis: recombination fraction, 0 to 0.1.]

Fig. 1. The effect of background selection on fixation probability (A and C) and nucleotide site diversity (B and D) of advantageous (A and B) and deleterious (C and D) variants with $|Ns| = 0.02$. The x-axis indicates the recombination fraction between the locus experiencing recurrent deleterious mutation and the locus of interest. In each figure, simulation results are given by continuous lines (with standard errors) and the values predicted by theory are given by dotted lines. For the cases shown, $N = 2000$, $q = 0.1$, $u = 10^{-3}$, $t = 10^{-2}$ and each point represents the average of $1.6 \times 10^{8}$ replications.
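The dotted theoretical curves described in this caption can be approximated from (23a), (23b) and (26b). The following Python sketch assumes that each plotted value is the prediction with background selection divided by the same prediction with f = 1, that new mutations start at ξ = 1/2N, and that s is recovered as |Ns|/N. The function names are illustrative and do not come from the paper.

```python
import math

def ne_factor(q, t, r_prime):
    """Reduction factor f of eq. (23b), with r = r'(1 - t)."""
    r = r_prime * (1.0 - t)
    return 1.0 - q * t**2 / (r + t)**2

def fixation_prob(N, s, f, xi):
    """Fixation probability U(xi) of eq. (23a); equals xi when s = 0."""
    a = 4.0 * N * f * s
    if a == 0.0:
        return xi
    return math.expm1(-a * xi) / math.expm1(-a)

def diversity_H(N, s, f, xi):
    """H of eq. (26b); tends to 4*N*f*xi as s -> 0."""
    a = 4.0 * N * f * s
    if a == 0.0:
        return 4.0 * N * f * xi
    u = fixation_prob(N, s, f, xi)
    # 1 - (1 - exp(-a))/a written with expm1 for accuracy at small a
    return (2.0 * u / s) * (1.0 + math.expm1(-a) / a)

def relative_curves(N, s, q, t, r_primes):
    """Ratios (with / without background selection) for U and H."""
    xi = 1.0 / (2.0 * N)
    rows = []
    for rp in r_primes:
        f = ne_factor(q, t, rp)
        rel_u = fixation_prob(N, s, f, xi) / fixation_prob(N, s, 1.0, xi)
        rel_h = diversity_H(N, s, f, xi) / diversity_H(N, s, 1.0, xi)
        rows.append((rp, f, rel_u, rel_h))
    return rows

if __name__ == "__main__":
    # Parameter values quoted in the Fig. 1 caption; s = 0.02 / N (advantageous).
    for rp, f, rel_u, rel_h in relative_curves(2000, 0.02 / 2000, 0.1, 0.01,
                                               [0.0, 0.02, 0.05, 0.1]):
        print(f"r'={rp:.2f}  f={f:.4f}  U ratio={rel_u:.4f}  H ratio={rel_h:.4f}")
```

Passing a negative s gives the corresponding deleterious case; under these assumptions the fixation-probability ratio then lies slightly above 1 rather than below it.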


[Figure 2: four panels. (A) fixation probability and (B) diversity for advantageous variants; (C) fixation probability and (D) diversity for deleterious variants. x-axis: recombination fraction, 0 to 0.1.]

Fig. 2. The effect of background selection on fixation probability (A and C) and nucleotide site diversity (B and D) of advantageous (A and B) and deleterious (C and D) variants with $|Ns| = 0.2$. The other variables are as in Fig. 1.
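The deterministic recurrences (1), which supply the deterministic part of each simulated generation, can be iterated directly to see the rapid relaxation of y towards x. The sketch below ignores drift; the function name, starting point and parameter values are illustrative assumptions.

```python
def iterate_recurrence(x, y, s, t, q, r_prime, generations):
    """Iterate eq. (1): x and y are the frequencies of B among
    A-bearing and a-bearing chromosomes, respectively."""
    p = 1.0 - q
    r = r_prime * (1.0 - t)          # r = r'(1 - t), as defined below eq. (1)
    for _ in range(generations):
        dx = s * x * (1.0 - x) - q * r * (x - y)
        dy = s * y * (1.0 - y) + p * (r + t) * (x - y)
        x, y = x + dx, y + dy
    return x, y

if __name__ == "__main__":
    # Start far from the diagonal and watch y collapse onto x.
    for gens in (0, 50, 200, 1000):
        x, y = iterate_recurrence(0.5, 0.1, 1e-4, 0.01, 0.1, 0.02, gens)
        print(gens, round(x, 5), round(y, 5))
```

Because q is small, y moves most of the way towards x while x barely changes, which is the separation of time scales exploited by the adiabatic elimination.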

of the effects on a chromosomes (although a variant may move between classes by recombination). The sampling process for each is performed separately, assuming population sizes of $Np$ and $Nq$, respectively. In a generation when the number of the rarer allele at the B locus in either class with respect to the A locus was less than 15, a random number was drawn from a Poisson distribution whose mean was equal to the number of copies of the allele in the parental generation, after the deterministic forces had changed allele frequencies according to (1). This was used to generate the number of alleles of the rarer type (ancestral or mutant) in the next generation. To increase the speed of the simulations, the Ito pseudo-sampling method was employed when the number of alleles of each type was greater than 15 (Li, 1980; Charlesworth et al., 1995). In the Ito method, the change in frequency of an allele is approximated by a stochastic difference equation

$$\Delta x = M_{\delta x} \pm \sqrt{V_{\delta x}}, \tag{27}$$

which has a deterministic component $M_{\delta x}$ due to selection and a stochastic element (due to drift) represented as a random walk of magnitude $\sqrt{V_{\delta x}}$ (where $V_{\delta x}$ is the variance in change in gene frequency due to drift). Uniform random numbers are drawn to decide the sign of the change in gene frequency at the B locus, such that negative and positive increments occur with equal frequency. For the A class, we have $V_{\delta x} = x(1-x)/2Np$, and for the a class, $V_{\delta y} = y(1-y)/2Nq$.

The frequencies of loss and fixation, the times to loss and fixation, and the sum of the heterozygosities for each generation over the sample path for a variant were followed for many replicate introductions (at least $8 \times 10^{7}$). These statistics were then compared with values for mutations with the same selection coefficient when there is no background selection, and with the values predicted by the analytical methods presented in this paper.

At the suggestion of Dr John Gillespie, multi-dimensional diffusion simulations were also performed to investigate the effects of stochastic variation in frequency at the locus experiencing recurrent deleterious mutation. This was achieved by modelling the system under mutation–selection–drift equilibrium for


[Figure 3: four panels. (A) fixation probability and (B) diversity for advantageous variants; (C) fixation probability and (D) diversity for deleterious variants. x-axis: recombination fraction, 0 to 0.1.]

Fig. 3. The effect of background selection on fixation probability (A and C) and nucleotide site diversity (B and D) of advantageous (A and B) and deleterious (C and D) variants with $|Ns| = 0.5$. Each point represents the average of $8 \times 10^{8}$ (advantageous) or $1.6 \times 10^{8}$ (deleterious) replications. The other variables are as in Fig. 1.
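A single step of the Ito pseudo-sampling scheme of eq. (27) can be sketched in Python as follows. This is only an illustration of the update rule as described in the text, not the authors' program: the function name is hypothetical, and the clamping to [0, 1] is an assumption added here (the paper instead switches to Poisson sampling when an allele is rare).

```python
import math
import random

def ito_step(freq, mean_change, class_size, rng):
    """One pseudo-sampling update of eq. (27) for a class of `class_size`
    diploid individuals (Np for the A class, Nq for the a class).

    freq        -- current frequency of B within the class
    mean_change -- deterministic change M from eq. (1)
    rng         -- a random.Random instance supplying uniform numbers
    """
    variance = freq * (1.0 - freq) / (2.0 * class_size)
    # A uniform draw decides the sign, so + and - are equally likely.
    sign = 1.0 if rng.random() < 0.5 else -1.0
    new_freq = freq + mean_change + sign * math.sqrt(variance)
    # Assumption of this sketch: keep the frequency inside [0, 1].
    return min(1.0, max(0.0, new_freq))

if __name__ == "__main__":
    rng = random.Random(1)
    x, s, N, p = 0.5, 1e-4, 2000, 0.9
    for _ in range(10):
        x = ito_step(x, s * x * (1.0 - x), N * p, rng)
    print(x)
```

Every step has squared deviation exactly equal to the drift variance, so the first two moments of binomial sampling are matched without drawing a binomial variate, which is the source of the speed-up.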

both loci simultaneously and calculating the expected heterozygosity and frequency of the preferred allele at the weakly selected locus in a sample from the population every $10N$ generations, after an initial period of $10N$ generations to reach mutation–selection–drift equilibrium. Samples separated by $10N$ generations should be largely independent (Li, 1987). Each generation, the frequencies of the four alleles AB, Ab, aB and ab were altered first by selection (according to the formulae in Hill & Robertson, 1966), then by mutation, and finally drift. When every allele was present in at least 20 copies, the Ito pseudo-sampling method was used to simulate the change in frequency due to drift (for the implementation of the Ito scheme in a multidimensional situation see Li, 1980). When any one allele was present in fewer than 20 copies, multinomial sampling was used to obtain the number of copies of the three least frequent allele types.

Simulations were carried out for different values of the selection coefficient at both the weakly selected locus and the locus experiencing recurrent deleterious mutation; dominance was intermediate ($h = 1/2$) for both loci. The mutation rate at the A locus was adjusted to give an expected frequency of the deleterious allele, a, at mutation–selection balance of 0.1 (using $q = u/t$), except when the selection coefficient was zero (at which point the expected frequency is 1). A much lower reverse mutation rate (a to A) was also included ($10^{-6}$) to prevent permanent fixation of the a allele when it is under weak selection. The rates of forward and reverse mutation at the weakly selected locus (B) were both $10^{-5}$, and a population size of $N = 1000$ was used. Only cases of complete linkage between the two loci were considered.

(ii) Simulation results

The results for both negative and positive selection on newly introduced alleles at the B locus are displayed in Figs. 1–3, for various magnitudes of $Ns$. Nucleotide site diversities and fixation probabilities are plotted


[Figure 4: two panels, (A) advantageous and (B) deleterious. Series: fixation probability, diversity, segregating sites. y-axis: observed value/predicted value; x-axis: recombination fraction, 0 to 0.1.]

Fig. 4. The discrepancy between the simulation results and the values predicted for the effect of background selection (as modelled by a change in $N_e$), for fixation probability, nucleotide site diversity and the number of segregating sites in the population for advantageous (A) and deleterious (B) variants with $|Ns| = 0.2$. Parameter values are the same as in Fig. 2. The expected number of segregating sites was calculated from the expected sojourn time in each frequency class, given the reduction in $N_e$ due to background selection. These values were corrected for a small difference (1%) in the average time to loss or fixation between simulations and numerical estimates in the absence of background selection.
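Quantities derived from sojourn times can be checked numerically. As a sketch, the following Python code integrates the sojourn-time density (25) against 2x(1 − x) by the midpoint rule and compares the result with the closed form (26b). Like (26b) itself, it applies the ξ ≤ x ≤ 1 form of (25) over the whole interval (0, 1), and it requires s ≠ 0. The function names and parameter values are illustrative.

```python
import math

def fixation_prob(N, s, f, xi):
    """Eq. (23a), for s != 0."""
    a = 4.0 * N * f * s
    return math.expm1(-a * xi) / math.expm1(-a)

def sojourn_density(x, xi, N, s, f):
    """Eq. (25): expected time spent near frequency x (form for xi <= x <= 1)."""
    a = 4.0 * N * f * s
    return fixation_prob(N, s, f, xi) * (-math.expm1(a * (x - 1.0))) / (s * x * (1.0 - x))

def H_numeric(N, s, f, xi, steps=20000):
    """Midpoint-rule integral of 2x(1-x) t(x, xi) over (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += 2.0 * x * (1.0 - x) * sojourn_density(x, xi, N, s, f)
    return total * h

def H_closed(N, s, f, xi):
    """Closed form of eq. (26b)."""
    a = 4.0 * N * f * s
    return (2.0 * fixation_prob(N, s, f, xi) / s) * (1.0 + math.expm1(-a) / a)

if __name__ == "__main__":
    N, f = 2000, 0.9
    for s in (1e-4, -1e-4):
        print(s, H_numeric(N, s, f, 1.0 / (2 * N)), H_closed(N, s, f, 1.0 / (2 * N)))
```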

against the recombination fraction $r'$, as the ratio of the observed values to those expected under the same selection model with no background selection. The theoretical values and simulation results are displayed on each plot. Fig. 4 displays results for the fixation probability, nucleotide site diversity and number of segregating sites on the same graph.

The agreement between theory and simulation for


Table 1. Summary of simulation results to examine the effects of background selection on completely linked, weakly selected variants, when there is stochastic variation in allele frequencies at both loci

Column key: Ns, at the weakly selected site; Nt and Nμ, at the linked site; F, average frequency of the preferred allele; H, average expected heterozygosity; Exp., expected value.

Ns    Nt    Nμ     Fᵃ             Exp. F/F₀   F/F₀ᵇ          H                  Exp. H/H₀   H/H₀
0     0     2      0·502±0·007                –              0·03796±0·00012                –
0     2     0·2    0·500±0·004                0·998±0·016    0·03571±0·00037                0·939±0·012
0     10    1      0·491±0·005                0·980±0·017    0·03441±0·00016                0·905±0·008
0     20    2      0·507±0·007    1·000       1·011±0·020    0·03456±0·00008    0·900       0·904±0·010
0·2   0     2      0·679±0·003                –              0·03673±0·00027                –
0·2   2     0·2    0·670±0·004                0·987±0·007    0·03380±0·00030                0·920±0·011
0·2   10    1      0·656±0·003                0·966±0·006    0·03324±0·00025                0·905±0·009
0·2   20    2      0·656±0·005    0·975       0·967±0·009    0·03322±0·00038    0·909       0·904±0·012
0·5   0     2      0·858±0·003                –              0·03000±0·00031                –
0·5   2     0·2    0·843±0·002                0·982±0·004    0·02866±0·00030                0·955±0·014
0·5   10    1      0·839±0·003                0·977±0·005    0·02784±0·00023                0·928±0·012
0·5   20    2      0·842±0·003    0·974       0·981±0·005    0·02842±0·00026    0·940       0·947±0·013

ᵃ Each figure is the average across 10 rounds of simulation, each of which consists of 10 samples, one taken every 10N generations. F is the average frequency of the preferred allele, H is average expected heterozygosity. F₀ and H₀ are the values observed in the absence of background selection. Expected values are calculated from diffusion theory assuming the infinite sites model.
ᵇ Standard errors calculated by the delta technique.
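
The expected F/F₀ entries of Table 1 can be reproduced with a short calculation. The Python sketch below is illustrative only and rests on two assumptions that are not stated in this section: that the equilibrium frequency of the preferred allele under the reversible mutation model with symmetric mutation takes the logistic form 1/(1 + exp(−4Nₑs)), and that the reduction factor is f = 0·9, the expected H/H₀ listed for Ns = 0. The function names are hypothetical.

```python
import math


def preferred_allele_frequency(ns: float) -> float:
    """Equilibrium frequency of the preferred allele.

    Assumed form: logistic in 4*Ne*s with symmetric mutation
    (an assumption, not taken from the text of this section).
    """
    return 1.0 / (1.0 + math.exp(-4.0 * ns))


def expected_ratio(ns: float, f: float) -> float:
    """Expected F/F0 when background selection rescales Ne by the factor f."""
    return preferred_allele_frequency(f * ns) / preferred_allele_frequency(ns)


# Assumed reduction factor: the expected H/H0 for Ns = 0 in Table 1.
F_REDUCTION = 0.9

for ns in (0.0, 0.2, 0.5):
    print(f"Ns = {ns}: expected F/F0 = {expected_ratio(ns, F_REDUCTION):.3f}")
```

Under these assumptions the sketch gives 1·000, 0·975 and 0·974 for Ns = 0, 0·2 and 0·5, which are the values in the expected F/F₀ column of Table 1.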

diversities and fixation probabilities is generally excellent for all values of s and r, except for the probabilities of fixation when the selection coefficient is very small. This is probably due to the increased influence of stochastic forces when the selection coefficient is low (in this case for |Ns| < 0·1). Given that there is no consistent deviation between the simulated and analytical results, the observed discrepancies are unlikely to be due to the approximation procedure of the analytical results. The agreement between the predicted and observed diversities is excellent for all values of s. In sum, the effects of background selection on these aspects of the behaviour of weakly selected, linked variants can be accurately approximated as a change in Ne, as in the case of linked neutral variants (Charlesworth et al., 1993), and for completely linked, weakly selected variants (Charlesworth, 1994). As expected from the argument in Section 4(ii), the total sojourn times are not well predicted by substituting Ne into the standard diffusion equation formulae, as was previously found in the neutral case (Charlesworth et al., 1993) (Fig. 4).

The results of the simulations which modelled mutation–selection–drift equilibrium at both loci are shown in Table 1. For |Nt| ≥ 1, the average frequency of the deleterious allele at the A locus is in close agreement with the value expected from the deterministic formula q = u/t (data not shown). Furthermore, the observed variance in frequency of the deleterious allele at the A locus is very close to that expected from diffusion theory for |Nt| ≥ 1. It is not then surprising to find that the effect of background selection on the average heterozygosity at the linked, weakly selected locus (B) is in agreement with the results shown previously, when |Nt| ≥ 10 at the A locus. That is, for the case of no recombination, the effect of background selection on weakly selected linked variants can be accurately modelled as a change in Ne. We expect that this should also apply for recombination rates greater than zero. The approximation does not hold only when |Nt| = 2 at the A locus.

The effect of background selection on the average frequency of the preferred allele is also consistent with the predictions of the above theory. For |Ns| = 0 at the weakly selected locus, background selection has no effect on the average frequency (always 0·5). For |Ns| > 0 at locus B, there is a reduction in the average frequency of the preferred allele when |Nt| > 0 at the A locus. This is in close agreement with the value expected from the reversible mutation model of selection and drift, using the predicted value of Ne (Li, 1987; Bulmer, 1991). It is worth noting that the predicted effect of a change in Ne on heterozygosity is of a greater relative magnitude than on the average frequency of the favoured allele.

6. Discussion

The general conclusion from the results we have presented is that background selection caused by a single locus affects nucleotide site diversity and fixation probabilities at linked sites subject to weak selection as though effective population size is reduced by the factor f of (23b). The same factor has been shown to apply to nucleotide site diversity at neutral sites


(Hudson & Kaplan, 1994, 1995; Nordborg et al., 1996), and to the probability of fixation of a relatively strongly selected favourable mutation (Barton, 1995). Santiago & Caballero (1998) have pointed out, however, that the formula for f is inexact for large r as far as neutral diversity is concerned (the reduction below 1 is underestimated by a factor of 2 for free recombination). The discrepancy is, however, very hard to detect in the case of background selection at a single locus, where the reduction in Ne is extremely small for large recombination fractions. For this reason, we have confined our simulations to recombination fractions of 10% or less.

The following heuristic argument suggests a reason for the similarity between the neutral and weakly selected cases. In general, under a diffusion model, the effect of drift on the transition from one generation to the next at a locus can be described completely by the sampling variance of allele frequencies at this locus, in the generation under consideration. As shown by Ethier & Nagylaki (1980), weak selection of intensity s at the locus perturbs the sampling variance from the neutral value by a term of order s relative to its value in the absence of such selection. Such a locus should thus behave very similarly to a neutral one as far as the sampling variance due to drift in any one generation is concerned.

It is known that the effects of selection at linked loci on the sampling variance experienced by alleles at a given locus in an arbitrary generation can be described by a sum of cumulative terms involving the effects on this variance of the variance in fitness caused by randomly generated associations with genotypes at the selected locus or loci, over 1, 2, 3… generations previously (Robertson, 1961; Santiago & Caballero, 1998). In the case of background selection due to a single locus, the sum of these terms for a neutral locus converges rapidly, to yield an asymptotic value for the factor by which Ne is reduced. This is identical to f in (23b), provided that r is not too large. As shown by Nordborg et al. (1996) and Santiago & Caballero (1998), this asymptotic value is sufficient to provide an accurate approximation for the neutral nucleotide site diversity, which is primarily determined by variants that have persisted for some time in the population. The above argument on the effect of weak selection on the sampling variance implies that this result should also hold true as a good approximation for loci subject to weak selection; i.e. their nucleotide site diversity is determined by the standard formula for a weakly selected locus (Kimura, 1983, p. 45), replacing N by Nf.

It is not immediately obvious why this reduction in Ne should also apply to the probability of fixation of a new weakly selected mutant allele, since the fate of a rare variant is strongly affected by stochastic events occurring in the first few generations after its origin. But, as noted by Kimura (1983, p. 229), there is a negligible effect of selection on a rare allele on its distributional properties under drift, since the relevant term is of the order of the product of selection coefficient s and the allele frequency (see the first terms in (1a) and (1b)). Thus, the effect of weak selection at the B locus on the probability of fixation of a variant at this locus will not become manifest for many generations after its origin by mutation. Combining this result with the argument used above implies that the asymptotic value of Ne derived for the neutral case should apply as a good approximation for the fixation probability of a weakly selected allele. It is remarkable that the branching process analysis of Barton (1995) for the case when Ns ≫ 1 also yields a similar result (see (23c)).

If these heuristic arguments are correct, then it should be possible to extrapolate the conclusions on fixation probabilities and nucleotide diversities at weakly selected sites to multiple loci contributing to background selection, at least if their fitness effects combine multiplicatively, by an extension of the arguments of Nordborg et al. (1996) and Santiago & Caballero (1998). It should thus be feasible to extend analyses of the effects of regional differences in recombination rate on nucleotide diversity at neutral sites in the D. melanogaster genome (Hudson & Kaplan, 1995; Charlesworth, 1996) to weakly selected sites. In addition, it should also be possible to develop quantitative predictions of the effect of background selection on weak selection for preferred codons, by including sites subject to both favourable and deleterious variants in the same model (cf. Li, 1987; Bulmer, 1991). This will enable us to explore the extent to which the observed relation between recombination and codon bias in D. melanogaster (Kliman & Hey, 1993) can be explained by background selection.

One caveat should be noted, however. The results discussed here assume that simultaneous weak selection at multiple sites in the genome has no effects on variation and evolution at individual sites, and that departure from single-locus expectations is caused solely by strong selection at loci linked to the sites in question. But if linkage among weakly selected sites is tight, the Hill–Robertson effect of mutual interference among linked loci under selection (Hill & Robertson, 1966) will also reduce the efficacy of selection (Li, 1987). The likely magnitude of such effects requires further study.

As discussed in Section 4(ii), and confirmed by the simulation results displayed in Fig. 4, we do not expect as large an effect of background selection on the number of segregating sites in the population as on the statistics already described. Santiago & Caballero (1998) have described a method for numerically determining the effect of background


selection on the number of segregating sites for the neutral case. This requires the use of a formula for the probability that a neutral variant remains segregating an arbitrary number of generations after its introduction into the population (Voronka & Keller, 1975) as well as expressions for the effects of background selection on the effective population sizes at these times. It is possible to use Kimura's (1955) solution of the forward diffusion equation with genic selection to obtain results for the probability of segregation of a weakly selected variant, but we have not pursued this approach. This is because the number of segregating sites in a sample from a population is much more closely related to the diversity per site than to the number of segregating sites in the population, since rare variants have a low probability of inclusion in a sample. It is hard, therefore, to detect any effect of background selection on the shape of the allele frequency distribution in samples of reasonable size (Hudson & Kaplan, 1995; Charlesworth et al., 1995).

It should be noted that the order in which the deterministic and stochastic elements of the Ito method were carried out had an important effect on the fit between theory and simulation. The results shown are for the case where the deterministic portion is evaluated first, which gives excellent agreement with the theory. This also holds for the purely neutral case. When the stochastic element is evaluated first, there is strong disagreement between theory and simulation, particularly at high recombination fractions (where the effect of background selection should be weakest). Biologically, the procedure of carrying out the deterministic changes first is more realistic, since it assumes that the next generation of adults (with a fixed finite population size) is formed by sampling from an infinite pool of gametes, whose composition is controlled by the effects of the deterministic forces on the genotype of the adults of the previous generation (Ethier & Nagylaki, 1980; Nagylaki, 1990).

Inspection of (1) suggests that the reason for this discrepancy is the possibility of a significant effect of the terms in x − y on the allele frequencies in the initial generations after introduction of a variant, before recombination has reduced the value of x − y. The terms in question are likely to be small relative to the stochastic changes in x or y if r is small (since q and t are assumed to be small), but may be comparable to the stochastic changes if r is large. For example, if a B mutation is introduced initially into the a class, x = 0 and y = 1/(2Nq) in the first generation. Hence, the deterministic changes in x and y are dominated by r/(2N) and p(r + t)/(2Nq), respectively. If random sampling is applied after these deterministic changes, the corresponding stochastic changes have standard deviations of approximately √r/(2N) and 1/(2Nq). These measure the expected magnitude of the stochastic changes. If random sampling is applied before the deterministic changes, the standard deviations are 0 and (approximately) 1/(2Nq), respectively. There is thus a significantly larger probable stochastic effect on x in the first case. A similar argument applies to the stochastic effects on y when a B mutation is introduced initially into the A class.

This raises the question of why the diffusion equation approximations seem to work so well, since formally they require the expected effects of the deterministic forces to be sufficiently small that second-order terms in them and their product with the sampling variance in allele frequencies can be neglected in comparison with first-order terms (Ewens, 1979, chap. 4). Inspection of (1) indicates that, while this condition can easily be met for the frequency x of B among A chromosomes by making q sufficiently small, this is not necessarily true for the frequency y of B among a chromosomes. If r and t are sufficiently small, this condition can be met for (1b) (assuming that s is small), but it is not necessarily met if either of these is large, unless |x − y| is small compared with y. For recombination fractions of more than a few per cent, the diffusion approximation is thus likely to be inaccurate in the initial generations, before |x − y| is reduced to a small value by recombination. For this reason, caution should be exercised in using (22) for situations in which r is more than a few per cent. The practical effect of this is likely to be small, however, since the effect of background selection with t values of realistic magnitude decreases rapidly as r increases (Figs. 1–3).

Appendix

(i) Derivation of equation (18)

We note that $e^{L\tau}(y - y(x))^{n} p_x(y)$ is the solution of the equation

$$\frac{\partial}{\partial\tau} f(y,\tau) = L f(y,\tau) \tag{A1}$$

with the initial condition

$$f(y,0) = (y - y(x))^{n} p_x(y). \tag{A2}$$

Hence,

$$e^{L\tau}(y - y(x))^{n} p_x(y) = \int_0^1 dy'\, p(y,\tau \mid y',0)\,(y' - y(x))^{n} p_x(y'), \tag{A3}$$

and

$$D_n(x) = \int_0^\infty d\tau \int_0^1 dy' \int_0^1 dy\,(y - y(x))\, p(y,\tau \mid y',0)\,(y' - y(x))^{n} p_x(y'). \tag{A4}$$


The integral over y produces the trajectory of the first moment. We obtain an appropriate expression by integrating the ordinary differential equations corresponding to (1). These can be written as

$$\frac{d}{d\tau}\,(y - y(x)) = -(y - y(x)) - \frac{s}{b}\,x(1 - x) + \frac{s}{b}\,y(1 - y). \tag{A5}$$

Linearizing y(1 − y) about y = y(x), taking the expectation, and using the formulae (A7) and (A8) for the first and second stationary moments from Section (ii) of the Appendix, we find for the time-dependent first moment near the quasi-equilibrium y = y(x)

$$\frac{d}{d\tau}\,E(y - y(x)) = -\left(1 - \frac{s}{b}\,(1 - 2y(x))\right) E(y - y(x)) + O\!\left(\left(\frac{s}{b}\right)^{2},\ \frac{s}{b}\,\frac{1}{\beta}\right). \tag{A6}$$

Integrating this ODE with the initial condition y′ − y(x), and carrying out the remaining integrations in (A4) leads to (18).

(ii) The stationary moments

The stationary moments can be computed by linearizing the exponential function in (14) (because α ≪ 1) and by repeatedly exploiting the well-known property of the gamma function, Γ(z + 1) = zΓ(z). The resulting formulae are relatively simple, since β ≫ 1 (or, equivalently, 2Nu ≫ 1) may be assumed for most biological applications (Nordborg et al., 1996). The lowest-order terms of the stationary moments are

$$E_{ss}(y) = y(x) = x + \frac{s}{b}\,x(1 - x), \tag{A7}$$

$$E_{ss}\big((y - y(x))^{2}\big) = \frac{1}{2\beta}\,x(1 - x), \tag{A8}$$

$$E_{ss}\big((y - y(x))^{3}\big) = \frac{1}{2\beta^{2}}\,x(1 - x)(1 - 2x), \tag{A9}$$

and

$$E_{ss}\big((y - y(x))^{4}\big) = \frac{3}{4\beta^{2}}\,x^{2}(1 - x)^{2}. \tag{A10}$$

This work was supported by grants from the Royal Society to B.C. and the National Environment Research Council to B.C. and Deborah Charlesworth, and by NSF grants DEB-9407226 and DEB-9896179 to W.S. We thank John Gillespie and an anonymous reviewer for suggesting improvements to the manuscript.

References

Aguadé, M. & Langley, C. H. (1994). Polymorphism and divergence in regions of low recombination in Drosophila. In Non-neutral Evolution: Theories and Molecular Data (ed. B. Golding), pp. 67–76. London: Chapman and Hall.
Akashi, H. (1995). Inferring weak selection from patterns of polymorphism and divergence at 'silent' sites in Drosophila DNA. Genetics 139, 1067–1076.
Akashi, H. (1996). Molecular evolution between Drosophila melanogaster and D. simulans; reduced codon bias, faster rates of amino-acid substitution, and larger proteins in D. melanogaster. Genetics 144, 1297–1307.
Akashi, H. & Schaeffer, S. (1997). Natural selection and the frequency distributions of 'silent' DNA polymorphism in Drosophila. Genetics 146, 289–307.
Aquadro, C. F., Begun, D. J. & Kindahl, E. C. (1994). Selection, recombination, and DNA polymorphism in Drosophila. In Non-neutral Evolution: Theories and Molecular Data (ed. B. Golding), pp. 46–56. London: Chapman and Hall.
Barton, N. H. (1995). Linkage and the limits to natural selection. Genetics 140, 821–841.
Bulmer, M. G. (1991). The selection–mutation–drift theory of synonymous codon usage. Genetics 129, 897–907.
Charlesworth, B. (1994). The effect of background selection against deleterious alleles on weakly selected, linked variants. Genetical Research 63, 213–228.
Charlesworth, B. (1996). Background selection and patterns of genetic diversity in Drosophila melanogaster. Genetical Research 68, 131–150.
Charlesworth, B., Morgan, M. T. & Charlesworth, D. (1993). The effect of deleterious mutations on neutral molecular variation. Genetics 134, 1289–1303.
Charlesworth, D., Charlesworth, B. & Morgan, M. T. (1995). The pattern of neutral molecular variation under the background selection model. Genetics 141, 1619–1632.
Dvorak, J., Luo, M.-C. & Yang, Z.-L. (1998). Restriction fragment length polymorphism and divergence in the genomic regions of high and low recombination in self-fertilizing and cross-fertilizing Aegilops species. Genetics 148, 423–434.
Ethier, S. & Nagylaki, T. (1980). Diffusion approximations of Markov chains with two time scales and applications to population genetics. Advances in Applied Probability 12, 14–49.
Ewens, W. J. (1979). Mathematical Population Genetics. Berlin: Springer-Verlag.
Gardiner, C. W. (1990). Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences. Berlin: Springer-Verlag.
Gillespie, J. H. (1994). Alternatives to the neutral theory. In Non-neutral Evolution: Theories and Molecular Data (ed. B. Golding), pp. 1–17. London: Chapman and Hall.
Gillespie, J. H. (1997). Junk ain't what junk does: neutral alleles in a selected context. Gene 205, 291–299.
Hill, W. G. & Robertson, A. (1966). The effect of linkage on limits to artificial selection. Genetical Research 8, 269–294.
Hudson, R. R. (1994). How can the low levels of DNA sequence variation in regions of the Drosophila genome with low recombination rates be explained? Proceedings of the National Academy of Sciences of the USA 91, 6815–6818.
Hudson, R. R. & Kaplan, N. L. (1994). Gene trees with background selection. In Non-neutral Evolution: Theories and Molecular Data (ed. B. Golding), pp. 140–153. London: Chapman and Hall.
Hudson, R. R. & Kaplan, N. L. (1995). Deleterious background selection with recombination. Genetics 141, 1605–1617.
Kaplan, N. L., Hudson, R. R. & Langley, C. H. (1989). The 'hitch-hiking' effect revisited. Genetics 123, 887–899.


Kimura, M. (1955). Stochastic processes and distribution of gene frequencies under natural selection. Cold Spring Harbor Symposia on Quantitative Biology 20, 33–53.
Kimura, M. (1964). Diffusion Models in Population Genetics. London: Methuen.
Kimura, M. (1983). The Neutral Theory of Molecular Evolution. Cambridge: Cambridge University Press.
Kimura, M. (1985). The role of compensatory neutral mutations in molecular evolution. Journal of Genetics 64, 7–19.
Kliman, R. M. & Hey, J. (1993). Reduced natural selection associated with low recombination in Drosophila melanogaster. Molecular Biology and Evolution 10, 1239–1258.
Li, W.-H. (1980). Rate of gene silencing at duplicate loci: a theoretical study and interpretation of data from tetraploid fishes. Genetics 95, 237–258.
Li, W.-H. (1987). Models of nearly neutral mutations with particular implications for non-random usage of synonymous codons. Journal of Molecular Evolution 24, 337–345.
Maynard Smith, J. & Haigh, J. (1974). The hitch-hiking effect of a favourable gene. Genetical Research 23, 23–35.
Moriyama, E. N. & Powell, J. R. (1996). Intraspecific nuclear DNA variation in Drosophila. Molecular Biology and Evolution 13, 261–277.
Nachman, M. W. (1997). Patterns of DNA variability at X-linked loci in Mus domesticus. Genetics 147, 1303–1318.
Nagylaki, T. (1990). Models and approximations for random genetic drift. Theoretical Population Biology 37, 192–212.
Nordborg, M., Charlesworth, B. & Charlesworth, D. (1996). The effect of recombination on background selection. Genetical Research 67, 159–174.
Peck, J. (1994). A ruby in the rubbish: beneficial mutations, deleterious mutations, and the evolution of sex. Genetics 137, 597–606.
Robertson, A. (1961). Inbreeding in artificial selection programmes. Genetical Research 2, 189–194.
Santiago, E. & Caballero, A. (1998). Effective size and polymorphism of linked neutral loci in populations under selection. Genetics 149, 2105–2117.
Stephan, W. (1995). An improved method for estimating the rate of fixation of favorable mutations based on DNA polymorphism data. Molecular Biology and Evolution 12, 959–962.
Stephan, W. (1996). The rate of compensatory evolution. Genetics 144, 419–426.
Stephan, W. & Langley, C. H. (1998). DNA polymorphism in Lycopersicon and crossing-over per physical length. Genetics 150, 1585–1593.
Stephan, W., Wiehe, T. H. E. & Lenz, M. W. (1992). The effect of strongly selected substitutions on neutral polymorphism: analytical results based on diffusion theory. Theoretical Population Biology 41, 237–254.
Voronka, R. & Keller, J. B. (1975). Asymptotic analysis of stochastic models in population genetics. Mathematical Biosciences 25, 331–362.
