<<

Genet. Res., Camb. (1994), 63, pp. 213-227 With 3 text-figures Copyright © 1994 Cambridge University Press 213

The effect of background selection against deleterious on weakly selected, linked variants

BRIAN CHARLESWORTH Department of Ecology and Evolution, The University of Chicago, 1101 E. 57th St., Chicago, IL 60637-1573, USA (Received 21 October 1993 and in revised form 11 January 1994)

Summary This paper analyses the effects of selection against deleterious alleles maintained by (' background selection') on rates of evolution and levels of genetic diversity at weakly selected, completely linked, loci. General formulae are derived for the expected rates of gene substitution and genetic diversity, relative to the neutral case, as a function of selection and dominance coefficients at the loci in question, and of the frequency of gametes that are free of deleterious mutations with respect to the loci responsible for background selection. As in the neutral case, most effects of background selection can be predicted by considering the effective size of the population to be multiplied by the frequency of mutation-free gametes. Levels of genetic diversity can be sharply reduced by background selection, with the result that values for sites under selection approach those for neutral variants subject to the same regime of background selection. Rates of fixation of slightly deleterious mutations are increased by background selection, and rates of fixation of advantageous mutations are reduced. The properties of sex-linked and autosomal loci in random-mating populations are compared, and the effects of background selection on asexual and self-fertilizing populations are considered. The implications of these results for the interpretation of studies of molecular evolution and variation are discussed.

would be expected to show a smaller but significant 1. Introduction reduction in variation than in selfing or asexual In regions of the genome where recombination is populations, since they make up only a fraction of the infrequent, the amount of genetic variability at neutral total genome and so are subject to a lower rate of sites can be reduced by selection against linked input of deleterious mutation (Charlesworth et al. deleterious alleles, maintained in the population by 1993). The process of background selection provides recurrent mutation at many loci. This process was a possible explanation for observations of reduced called 'background selection' by Charlesworth, levels of DNA sequence variability in asexual Morgan & Charlesworth (1993), who showed that its and self-fertilizing taxa, and in regions of the magnitude depends on the total mutation rate to Drosophila genome where recombination is restricted deleterious alleles for the genomic region in question (Charlesworth et al. 1993). and the frequency of recombination in that region. The neutral model of molecular evolution (Kimura, Given the fact that the mutation rate to alleles with 1983) may be viewed as a useful null model, whose detectable deleterious effects on fitness may be as high statistical predictions can be tested against observed as an average of one or more new mutations per patterns of molecular evolution and variation zygote per generation in higher organisms (Mukai (Kreitman, 1991; McDonald & Kreitman, 1991). et al. 1972; Crow & Simmons, 1983; Kondrashov, There is still considerable controversy concerning the 1988; Houle et al. 1992), background selection may extent to which nucleotide site variants are subject to severely reduce neutral genetic variation in asexual or , particularly for substitutions causing highly self-fertilizing species, in which there is effect- amino-acid replacements (Kimura, 1983; Gillespie, ively little or no recombination across the entire 1991; McDonald & Kreitman, 1991). It is thus genome. The telomeric and centromeric regions of important to determine whether background selection many randomly mating species also show greatly distorts the predictions of selective versus neutral restricted recombination. Neutral sites in such regions models.

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 B. Charlesworth 214

In addition, if background selection operates each locus and equal effects on the two sexes, together differently on sites subject to different kinds of with multiplicative fitness interactions between del- evolutionary forces, differences in patterns of mol- eterious alleles at different background loci, are ecular evolution and variation between taxa or assumed. The value of /„ depends on the mating genomic regions subject to different levels of back- system and mode of inheritance e.g. with autosomal ground selection could shed light on the nature of the inheritance and random-mating, we have /„ = evolutionary forces affecting molecular variation. It is exp(—U/2hs), whereas with sex-linkage /„ = intuitively fairly obvious that both background selec- exp(-3U/2s[2h + l]) (Charlesworth, etal. 1993). For tion and the related process of hitch-hiking of variants chromosome regions of comparable physical size and by linked selectively favourable alleles (Maynard map length, we would therefore expect larger values Smith & Haigh, 1974; Ohta & Kimura, 1975; of/0 for X-linked loci, given that deleterious alleles are Thomson, 1977; Birky & Walsh, 1988; Kaplan, generally partly recessive (Crow & Simmons, 1983). Hudson & Langley, 1989; Stephan, Wiehe & Lenz, An approximate /„ value of 0-2 seems reasonable for 1992; Wiehe & Stephan, 1993) are analogous to a an autosomal centromeric region in D. melanogaster, reduction in the effective population number for the 0-6 for the centromeric region of the X, and 008 for a region of the genome concerned. We would ac- highly selfing species (Charlesworth et al. 1993). cordingly expect the level of variability for both A general formula for the rate of substitution of neutral and selected variants to be reduced by weakly selected mutations in a genome influenced by background selection, although the extent of this background selection will be given first, followed by a reduction will depend on the mode and strength of formula for the expected nucleotide-site diversity selection. If effective population number is greatly which they contribute during their sojourn in the reduced, the behaviour of weakly selected and neutral population, under the infinite-sites model of molecular sites will become indistinguishable. The rate of variation (Kimura, 1969, 1971). Selection on the substitution of favourable variants should thus be mutations studied in this way is assumed to be much reduced by background selection and hitch-hiking, weaker than on the background loci subject to whereas that for deleterious variants will be increased, recurrent deleterious mutations. This is probably and neutral variants will be unaffected. Birky & Walsh (1988) have previously shown this to be true, but did not provide any explicit formulae for the size of the Table 1. Definitions of some quantities introduced in effects. the text The main purpose of this paper is to investigate the expected effects of background selection under differ- U Mean number of new deleterious mutations ent assumptions about breeding system and mode of per individual per generation, at loci inheritance. Complete linkage is assumed throughout, responsible for background selection as this enables explicit formulae to be developed and (background loci) f0 Equilibrium frequency of gametes free of provides an upper bound to the effect of background mutations at background loci selection. Sex-linked as well as autosomal loci will be i Selection coefficient against homozygous considered, as both play a prominent role in empirical mutations at background loci work on Drosophila. h Dominance coefficient for fitness effect of a background locus v, Rate of germline mutation per nucleotide site in females 2. A general model vm Rate of germline mutation per nucleotide site in males (i) General considerations a Selection coefficient on homozygote for a Background selection is assumed to operate by the weakly selected variant 6 Dominance coefficient for fitness effect elimination of new mutations at the loci of interest if of a weakly selected variant they arise in gametes that carry one or more deleterious S Product of a and effective population number mutations (the loci in question will be referred to as K Rate of substitution of variants under background loci). Under equilibrium between selection background selection, relative to the and mutation in a large population, the frequency of neutral value ft Nucleotide site diversity under background gametes free of deleterious mutations is/0, where/„ is selection, relative to the neutral value a function of the diploid mutation rate per genome to without background selection deleterious alleles (U), and of the selection coefficient n* Nucleotide site diversity under background and dominance coefficient for a single deleterious selection, relative to the neutral value mutant allele at a background locus. These are denoted under background selection Q Ratio of substitution rate to nucleotide by s and h respectively, where s is the reduction in site diversity, both measured relative fitness to a mutant homozygote and sh is the reduction to their neutral values under background in fitness experienced by a heterozygous carrier (see selection Table 1 for a list of symbols). Equal fitness effects of

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 The effect of background selection against deleterious mutations on weakly selected, linked variants 215

appropriate for most types of nucleotide site variants breeding sex ratio of the population, and the breeding (Kimura, 1983; Gillespie, 1991). system of the population (random-mating, asexual, or The formulae will be applied to a number of special self-fertilizing). Further details are given in section (i) cases, to illustrate the expected effects of background of the Appendix. selection on genetic variability at loci subject to weak Since variants that arise in gametes carrying selection, under a variety of assumptions concerning deleterious alleles at background loci are lost from the mating system and mode of inheritance. The effective population with a probability of one (section 2(i) population number is assumed throughout to be above), we need only consider the variant further if it sufficiently large that the distribution of the numbers is carried initially in a gamete free of deleterious of deleterious mutations per individual is close to the mutations. Conditioning on this, the initial frequency infinite population equilibrium, for the loci generating of the new variant is background selection. The method used by Charlesworth et al. (1993) for (1) the neutral case will be employed here. This method Po = /„' assumes that, with complete linkage, selection on the background loci is sufficiently strong in relation to The corresponding conditional probability of fix- population size that there is a negligible probability ation can be obtained from the standard diffusion that a deleterious mutation at a background locus can equation formula (Kimura, 1962): be fixed by drift over the time taken for a weakly selected or neutral variant to become lost or fixed. u(po) = A I G(x)dx, (2 a) This implies that, in the absence of recombination, a /o weakly selected variant is doomed to loss if it arises where initially in a gamete carrying a deleterious mutation at one or more of the background loci, or if it G{x) = exp-2 dy. (2 b) subsequently enters this class as a result of new v {y) mutations at background loci. Hence, only a fraction Jo Sy /0 of new, weakly selected, variants have a non-zero MSy{y) and VSy{y) = y{\ -y)/2f0Ne are the mean and probability of fixation. variance of the change in allele frequency at frequency Similarly, since the mean time to loss from the y, respectively, Ne is the effective population number population of a given gamete carrying deleterious for the specified mating system in the absence of mutations at the background loci is very short, weakly background selection, and selected or neutral variants carried in such gametes are rapidly lost, and contribute very little to the A = \ \ G(x)dx. (2 c) overall nucleotide site diversity of the population (Charlesworth et al. 1993; Hudson, 1994). The net contribution to such diversity from a weakly selected From the considerations of section 2(i), the net variant can therefore be approximated by considering fixation probability of the variant is its sojourn in the population as being effectively (3) restricted to the gametes that are free of mutations at "=/o "(/><>)• the background loci. This in turn implies that the If the mutation is neutral, we have u(p0) = p0. effective population number 7Ve is replaced by f0Ne in Equations (1) and (3) imply that in this case w = p, i.e. the standard formulae for allele frequency change it is the same as in the absence of background under (Charlesworth et al. 1993). While selection, as expected from the results of Birky & this assumption is clearly only an approximation, it Walsh (1988). has been shown to be very accurate in the neutral Fixation probabilities obtained from these formulae case when the size of the population is more than a can be used to obtain rates of gene substitution, under few hundred individuals (Charlesworth et al. 1993; the assumption that each variant that is fixed in Hudson, 1994), and there is no reason why weak evolution arises by mutation as a unique copy (Kimura selection should affect its accuracy. & Ohta, 1971, pp. 11-12). This requires specification of mutation rate for the sites in question, since the substitution rate per site, K, is equal to the rate of input per site of mutations into the population, (ii) Fixation probabilities and rates of gene multiplied by their probability of fixation. Given the substitution with background selection evidence that mutation rates may depend on sex, at Let the frequency of a new, weakly selected, variant in least in mammals (Miyata et al. 1987; Charlesworth, the gene pool of the breeding adults in the generation 1993; Crow, 1993), it is desirable in general to allow in which it arises be p. The value of p depends on the for the possibility of sex differences in mutation rates. mode of inheritance (autosomal or sex-linked), the sex Let the mutation rate for the female germline be v, and of the individual in which the mutation arises, the that for the male germline be vm. General expressions

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 B. Charlesworth 216

for the substitution rates are derived in section (i) of case of neutrality, these give the following results for the Appendix. autosomal and sex-linked variants, respectively It is often convenient to express these as ratios of n. x2fnN(v, + v ) (7 SL) the corresponding values for neutral mutations. An J o e\ i m/ v "/ Equations (2) of the Appendix imply that background selection does not affect the neutral substitution rates, In the absence of sex differences in mutation rates, as expected intuitively. Consistent with the results of equation (7 a) is identical with equation (3) of Miyata et al. (1987), these rates for autosomal and Charlesworth et al. (1993). sex-linked loci (with male heterogamety) are respect- ively 3. Approximate formulae for weak selection K*n = ¥vt + vJ (4 a) Except for some special cases, the general equations K = ±(2v + v ). (4 b) Kn s m derived above and in section (i) of the Appendix can only be evaluated numerically. Some useful insights can be obtained for the case when selection is weak, so (iii) Nucleotide site diversity that higher-order terms in the product of the selection Under statistical equilibrium between mutation and coefficient and/0 Ne can be neglected. This may well be drift in the infinite sites models (Kimura, 1969, 1971), appropriate for many molecular variants (Kimura, ergodicity implies that the equilibrium nucleotide site 1983; Gillespie, 1991). Unless there is a strongly non- diversity n is proportional to the sum H of the linear dependence of the quantities of interest on the diversity measure 2x(l —x) over all allele frequencies product of the selection coefficient and f0Ne, the x experienced by a variant in the interval between its results derived below should give a useful indication origination by mutation and loss or fixation. The of the nature of the effects of genetic system and equilibrium value of n is obtained by the same background selection in more general cases. procedure that was used above to obtain rates of substitution from fixation probability: H is multiplied (i) Autosomal inheritance with random mating by the number of new mutations that enter the population each generation (e.g. Ewens, 1979, p. 239). Assume initially that selection acts equally on the two By the argument of section 2(i), only mutations that sexes, and that the population is random-mating. Let arise in gametes free of deleterious mutations con- the relative fitness of the genotypes AA, Aa and aa be tribute significantly to H. 1 + a, 1 + da and 1, respectively. A is the mutant allele

For a variant with conditional initial frequency p0, under consideration, cr is the selection coefficient on H with background selection is thus approximated by the mutant homozygote, such that a < 0 implies that the standard formula for a variant allele in a the mutation is deleterious when homozygous com- population of effective population size f0Ne. By pared with the wild-type homozygote, and a > 0 equation (4.24) of Ewens (1979), the expected time implies that it is advantageous. 6 is the coefficient of spent by such a variant in the frequency range x to dominance. For 0 ^ 6 < 0-5, the mutant allele is x + dx (x ^ p0) is given by recessive or partly recessive; 6 = 0-5 corresponds to the case of intermediate dominance; 0-5 < 6 ^ 1 corresponds to a dominant or partly dominant P mutation. The cases of 6 < 0, a < 0 and 8 > 1, 0 Jx (5) VJ,x)G(x) correspond to overdominance (heterozygote ad- vantage). where the quantities on the right-hand side are defined For random-mating populations, standard theory by equations (2). (e.g. Ewens, 1979, p. 138) implies that Weighting the diversity 2JC(1 — x) by t(x,p0) and integrating over all values of x (Kimura, 1969, 1971; Ewens, 1979, p. 239), we obtain (8)

H~ SBNJou(po), (6a) where S = Nea. Approximate expressions for K and v are obtained where by substituting from equation (8) into the general n n equations derived above and in the Appendix, and B = G(x)"1 G{y)dydx. (6b) neglecting second- and higher-order terms in f0S JO Jx (Charlesworth, Coyne & Barton, 1987). Division by This formula can be combined with the expression the appropriate neutral values yields the rate of for fixation probabilities and the rates of input of substitution relative to the neutral value and the mutations to yield formulae for the equilibrium equilibrium genetic diversity relative to the neutral diversity levels (Appendix, equations [A 3]). In the value without background selection. These are

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 The effect of background selection against deleterious mutations on weakly selected, linked variants 217

Table 2. Weak selection approximations to rates of substitution and nucleotide site diversity (relative to neutral values without background selection), and to the ratio of substitution rate to diversity (each measured relative to its neutral value with background selection)

Genetic system K n e Autosomal locus (selection on both sexes; random mating) General 1+S/O 5(1+0) /o(l+i/oS0) l+§/0S No sexual selection 1 +yoNcr(l +6) /0(l +yoN Mr(l + 6) /0(l +y0 Naff) i+i/iivo- f Intense sexual selection i+if0Nma-(\+6) /0(l + f/0 Nm

denoted by A* and n, respectively. In addition, it is rate/diversity ratio Q is increased above unity by useful to consider the ratio of the rate of evolution to selection in favour of the mutant alleles, and decreased the level of genetic diversity (when both are measured below unity by selection against them. Reductions in relative to their neutral values with background f0 bring Q closer to unity. Perhaps unexpectedly, Q is selection), since this has been used to test for selection independent of the dominance coefficient. on DNA and protein sequences (McDonald & The case when selection acts on only one of the two Kreitman, 1991; Sawyer & Hartl, 1992). For a region sexes can be dealt with by treating this as equivalent subject to background selection, this ratio is given by to the above case with a selection coefficient of 0-5

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 B. Charlesworth 218

the formulae for this case are the same as for the Similar conclusions apply to cases when selection is random-mating case with no sexual selection and with sex-limited (cf. Charlesworth et al. 1987). Only the intermediate dominance. case of a species with dosage compensation in the With diploid asexuality, the descendant copies of a heterogametic sex (taken here to be male, for brevity mutation are confined to one of the two haploid of description) will be considered here. If selection is genomes of each individual in the line of descent from male-limited, the expression for Mte in the sex-linked the individual who carried the original mutation, and case is equivalent to that for the standard autosomal a mutation is considered to be fixed if all individuals equation with a selection coefficient of 2tr/3 and a in the population come to be heterozygous for it. The dominance coefficient of one-half. Comparing this right-hand side of equation (8) is thus replaced by with the result for the autosomal case with male- 2/0 Ne da, where Ne is again one-half the value for a limited selection, the magnitudes of the deviations of random-mating population of equivalent size. The K and n from the neutral value are greater for sex- formulae for this case are therefore equivalent to the linked loci with partly recessive mutations, and less random-mating case with a dominance coefficient of for partly dominant ones, just as with selection on one-half, and selection coefficient of 26a. They are both sexes. Q is always the same for sex-linked and shown in row 8 of Table 2. autosomal loci. With intense sexual selection, mag- nitudes of the deviations of K and n from the neutral value are greater for sex-linked loci with 6 < 1-25 and (iii) Sex-linkage with random mating 6 < 0-75 respectively. The deviation of Q from its The case of a sex-linked locus with selection acting neutral value is 1-5-fold larger in magnitude than for equally on hemizygous males and homozygous females the equivalent autosomal case. can be considered similarly. With weak selection, If selection is limited to the homogametic sex, it is equation (8) is replaced by easy to see that the selection term for the sex-linked case is equivalent to that for the autosomal locus without sex-limitation, with the same dominance (9) coefficient and with selection coefficient 2a/3. It is also easy to see that, with no sexual selection, the The general formula for effective population size in values of S are the same for the sex-linked case and the the sex-linked case is given by the following equation corresponding sex-limited autosomal case, so for the (Wright, 1969, pp. 213-214) same value of/„, K, n and Q are the same as for the equivalent autosomal case. With intense sexual selec- 9NmNf N = (10) tion, the magnitudes of their deviations from the neutral value are 1-5 times that for the autosomal case. If there is no sexual selection, so that the sex ratio of breeding males and females is 1:1, Ne is three- 4. Arbitrary selection intensities quarters the value for that of an autosomal locus, i.e. Af, = 0-75N. In this case, Ne is substantially less than (i) Intermediate dominance the corresponding autosomal value. If there is strong Explicit formulae for the relative rate of substitution sexual selection, so that Nt P Nm, Ne approaches and diversity for the case of random mating and 4- 5iVms slightly larger than the corresponding auto- selection on both sexes, with arbitrary values of S, can somal value (see 3(i) above). only be obtained easily for the case of intermediate The results for the sex-linked case with random dominance (6 = 0-5). For an autosomal locus with mating are shown in the lower part of Table 2, using a <^ 1, and assuming that \S\p0 is small, the results of the same method as above. With selection on both Kimura (1962, 1969) give sexes and no sexual selection, there is a larger

magnitude of the deviation of K from the neutral £r JO ^ /I 1 \ value than for the corresponding autosomal case with A~7; _-2f.SN (ll) the same f0 value, if the new mutation is recessive or partly recessive (9 < 0-5), and a smaller deviation for n = (12) partly dominant alleles. The same applies to genetic L C » ) diversity. With intense sexual selection, there is always a larger deviation for K than for the equivalent Q=: (13) autosomal case, provided that 6 < 3-5 (which applies to all cases of directional selection, for which 6 ^ 1). Similar formulae apply to the cases of completely self- There is a larger deviation for ft provided that 6 < 1-5. fertilizing or asexual populations, with the appropriate With no sexual selection, Q takes the same value as for changes in the definition of S (see section 3[ii]). the autosomal case. With intense sexual selection, The dependence of substitution rate and diversity there is a 1-5-fold deviation of Q from the neutral on the direction and strength of selection for the case value compared with the equivalent autosomal case. of intermediate dominance have been discussed by

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 The effect of background selection against deleterious mutations on weakly selected, linked variants 219

0-2 0-4 0-6 0-8 10 0 0-2 Frequency of unloaded gametes Fig. 1. The effect of background selection on slightly deleterious alleles at an autosomal locus in a random-mating population. A 1:1 sex ratio and equal male and female mutation rates are assumed. The abscissa is the frequency of deleterious-mutation free (unloaded) gametes (/„) under background selection. The curves show the equilibrium genetic diversity relative to the neutral value with no background selection (#), the rate of substitution of new mutations relative to the neutral value (•), and the ratio of this rate to the genetic diversity relative to the neutral value with background selection (». , genetic diversity for neutral alleles relative to the value without background selection. (a) is the case with S = -1 and 6 = 0-5, (b) is for S = - 2 and 6 = 0-5, (c) for S = -1 and 6 = 0-2, and (d) for S = - 2 and 6 = 0-2. The breeding population consists of 100000 individuals.

Kimura (1983, pp. 44—45). It is evident from the above with the finding for small f0 S. With intense sexual formulae that background selection reduces the effect selection, it becomes of selection on the locus of interest, with respect to both variables, as expected intuitively. The relative K = (14b) fixation probability ^depends exponentially on/0S; for deleterious alleles, it quickly tends to zero as/0|5| increases above unity. It approaches 2/ S for strongly This implies a stronger effect of selection than in the 0 corresponding autosomal case, again as expected selected advantageous alleles. A large reduction in f0 can have a considerable effect on the rate of evolution, from the finding for small /„ S. if \S\ is large. The relative diversity -n is less sensitive We also have to /0S. For deleterious alleles, it tends to —\/S as f0 \S\ increases; for advantageous alleles it tends to 2/0, n = (15) i.e. twice the neutral value (cf. Kimura, 1983, p. 239). The rate/diversity ratio Q is a strictly increasing It is easily verified that this expression is identical to function of/„ S (see section (ii) of the Appendix for a that for the autosomal case when there is no sexual proof of this statement). Q tends to zero with selection. With intense sexual selection, S = 4Nm a, increasingfo\S\ for deleterious alleles, and to/05 for giving a larger effect of selection than in the equivalent advantageous alleles. autosomal case. Similar results can be obtained for sex-linked loci. In addition, We have

K = Q = (16) 3(1 - With no sexual selection, this is identical with the Again, this is identical to the autosomal case when value for the corresponding autosomal case, consistent there is no sexual selection. There is a stronger effect 16-2

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 B. Charlesworth 220

0 0-2 0-4 0-6 0-8 10 0 0-2 Frequency of unloaded gametes Fig. 2. The effect of background selection on slightly advantageous alleles at an autosomal locus in a random-mating population, (a) is the case with 5=1 and 6 = 0-5, (b) is for 5 = 1 and 6 = 0-2. Otherwise, the curves are as in Fig. 1.

0-4 06 08 10 0 02 0-4 06 0-8 10 Frequency of unloaded gametes Fig. 3. The effect of background selection on a sex-linked locus in a random-mating population. 6 — 0-2 in each case, (a) and (c) are cases of no sexual selection, with S = — 0-75 and S = 0-75 respectively, (b) and (d) are cases of fairly strong sexual selection, with 5 = —1-07 and S= 1-07 respectively. Autosomal loci exposed to the sexual selection regimes corresponding to these sex-linked cases give results shown for the |5| = 1 cases in Figs 1 and 2. Otherwise, the curves are as in Fig. 1.

of/0 S with extreme sexual selection. As before, Q is a panels a and b) and 0-2 (lower panels c and d). Equal strictly increasing function of/0 S. mutation rates in males and females, and a 1:1 breeding sex ratio are assumed. As expected from the weak-selection approximation discussed above, differ- (ii) Arbitrary dominance ences in the dominance coefficient have only a small Results for arbitrary dominance coefficients can be effect on K, and a negligible effect on Q. Considerably obtained by numerical integration of the relevant more genetic diversity is maintained for moderate or equations. Fig. 1 shows the dependence of K, n and Q high values of/0 when there is partial recessivity of the on f0 for the case of deleterious alleles with S = — 1 deleterious fitness effects, again in agreement with the (left-hand panels a and c) and — 2 (right-hand panels weak selection approximation. For low values of/0 b and d), and dominance coefficients of 0-5 (upper (i.e. when there is a strong effect of background

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 The effect of background selection against deleterious mutations on weakly selected, linked variants 221

selection), there is little effect of dominance or of S, (i) Genetic diversity since the diversity values for the cases with selection converge on the neutral value with background The results derived above show that, as expected selection (shown by the heavy straight lines in the intuitively, the equilibrium level of genetic diversity is figures). There is a much more marked effect of reduced by background selection under all types of background selection on ft and Q when |5| is large selection acting directly on variants at the sites than when it is small, e.g. with intermediate domi- concerned. If recombination is absent or infrequent nance, the ratio of Q values for/0 = 1 and/0 = 0-1 is and background selection is sufficiently strong, genetic 0-49 when S = -\, and 0-18 when S = -2. This diversities for both neutral and selected sites will be reflects the exponential dependence of K and Q on similar and low. We would therefore expect genetic f0S in equations (11) and (13). Background selection diversity to be quite similar for silent and replacement can thus have a very large effect in accelerating the sites, in regions of reduced recombination. Surveys of rate of evolution by the fixation of deleterious alleles, molecular variation in Drosophila have provided the when these have fitness effects which are as large as or main basis for the conclusion that genetic variability is larger than the reciprocal of the effective population severely reduced in such regions, but these are mainly size. based on restriction mapping studies which usually do Fig. 2 shows comparable results for the case of not distinguish between silent and replacement site autosomal sites subject to positive selection, with variation (Begun & Aquadro, 1992; Aquadro, Begun f0S= 1 and dominance coefficients of 0-5 and 0-2. & Kindahl, 1994). The limited sequencing studies Here, the effect of selection is to increase both the rate available to date indicate that both silent and of evolution and the equilibrium diversity over neutral replacement site polymorphisms are infrequent in loci expectation. Partial recessivity of the favourable fitness in regions of reduced recombination (Berry, Ajioka effects of the mutations reduces both the rate of & Kreitman, 1991; M.Wayne, in preparation; H. evolution and the level of diversity, but Q is almost Hilton, R. M. Kliman & J. Hey, in preparation), but unaffected by dominance, as expected from the weak it seems premature to conclude that levels of diversity selection approximation. Background selection can are always similar for both classes of variant. again have a substantial effect on the rate of evolution, This conclusion assumes that the intensity of this time by reducing it towards the neutral value. selection at the sites under consideration is much Fig. 3 displays results for a sex-linked locus with weaker than that operating on the deleterious alleles equal selection on the two sexes, for sites subject to which are the source of the background selection, so partly recessive deleterious mutations (upper panels a that variants which arise in chromosomes carrying and b) and favourable mutations (lower panels c and such deleterious alleles are destined to be rapidly lost d), for the cases of no sexual selection with Ne = from the population, even if they confer a selective 0-75N (left-hand panels a and c) and fairly strong advantage. Variants subject to sufficiently strong sexual selection (5% of breeding individuals are balancing selection in relation to the degree of linkage males, so that Ne = 0-204./V (right-hand panels b and to surrounding deleterious mutations may, of course, d). The parameter values have been chosen to be be maintained close to their equilibrium, even in comparable with the autosomal cases of Figs. 1 and 2 regions subject to significant background selection. with \S\ = 1. The results are qualitatively similar to This suggests that, if a is detected in a those for the autosomal case, with somewhat more region of restricted recombination, where variation at marked effects of the parameter values, especially for most sites is depleted, serious consideration should be the case of intense sexual selection. For example, the given to the possibility that it is maintained by equilibrium diversity (relative to the neutral value) for balancing selection. According to Aquadro et al. deleterious mutations with sexual selection is 79 % of (1994), allozyme variation is not reduced significantly the corresponding autosomal value when /„ = 1, and in regions of reduced recombination in D. melano- the Q value is 66%. The differences between sex- gaster, suggesting that it may be maintained by linked and autosomal cases diminish with higher balancing selection. intensities of background selection, e.g. with/0 = 0-1, the sex-linked values for diversity and Q are 98 and 97 % of the corresponding autosomal values. Corre- (ii) Rates of molecular evolution spondingly, there is a somewhat stronger effect on sex- linked loci of the same change in /„. As shown in section 4(ii), background selection can have a very large effect in accelerating the rate of evolution by the fixation of deleterious alleles, when 5. Discussion these have fitnesseffect s which are as large as or larger The results described above have a number of than the reciprocal of the effective population size. implications for understanding patterns of molecular The rate of substitution of weakly selected advan- evolution and variation. These will be considered in tageous alleles is reduced (see also Birky & Walsh, turn. 1988). One might therefore expect a substantially

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 B. Charlesworth 222

higher rate of substitution of replacement site variants evidenced by codon bias, to be less marked in regions in regions of restricted recombination under the of restricted recombination. There is evidence that slightly deleterious alleles model of molecular evol- this is the case in Drosophila (Kliman & Hey, 1993). ution (Ohta, 1974, 1992), and a lower rate under the Similar processes may be responsible for the relatively rival model of the adaptive evolution of protein high GC contents of regions of the mammalian and sequences (Gillespie, 1991), relative to their values in yeast genomes with unusually high levels of re- other regions. Hitch-hiking by favourable mutations combination (Charlesworth, 1994). will have similar effects (Birky & Walsh, 1988). The effect of background selection on the rate of Comparisons of rates of amino acid substitution in molecular evolution also implies that loci which are in different regions of the Drosophila genome with regions of low recombination in some species but not different recombinational regimes might thus provide in others, may vary in their rate of molecular evolution. a means of distinguishing between these theories. The Changes in the recombinational environment of a sequencing data of Hilton et al. (in preparation) for locus can occur by chromosome rearrangements (e.g. five loci in the melanogaster species group of O'Brien, 1993), especially when rearrangements in- Drosophila show that there is more inter-species volve breaks close to the centromeric heterochromatin, replacement divergence per nucleotide site for two loci where recombination is often suppressed. Thus, (ciD and asense) that are located in regions of reduced observed variation in the rate of protein evolution for recombination than for the loci subject to normal the same locus among different lineages, which has levels of recombination, suggesting that this possibility been used as an argument in favour of adaptive may be realized in this case. There is also an indication evolution (Gillespie, 1991), could simply reflect vari- from restriction mapping surveys of both D. ananassae ation in the recombinational environment of the (Stephan & Mitchell, 1992) and D. melanogaster locus. This suggests that it is important to consider the (Begun & Aquadro, 1993; Aquadro et al. 1994) that recombinational environment of loci, and the breeding there are greater inter-population differences within system of the species, when conducting comparative species for nucleotide-site variants in regions of studies of rates of molecular evolution. For example, restricted recombination than in regions where re- moving a gene from the mid-section of a D. combination is normal. The effect of background melanogaster autosome to the centromeric region selection in reducing effective population size should could cause an 80 % reduction in f0 (Charlesworth promote divergence between populations subject to et al. 1993). This would result in a factor of 8-6 increase limited gene flow, even at neutral sites, since f0Nem in the rate of evolution for deleterious amino acid must replace Nem in the standard formulae for replacements with intermediate dominance and an |5| divergence between populations for variants that of 2. Even within the melanogaster subgroup of survive stochastic loss. The observed differentiation Drosophila there have been extensive internal re- might, however, simply be the consequence of hitch- arrangements of gene locations within chromosome hiking events that occurred relatively recently in one arms (Ashburner, 1989, Chap. 36). of the populations surveyed (Stephan & Mitchell, 1992), rather than an effect of background selection on enhancing divergence by drift. (iii) Tests of neutrality versus selection Unfortunately, differences among different proteins in the degree to which sequence evolution is subject to McDonald & Kreitman (1991) have proposed the use selective constraints makes this test difficult to carry of the ratio of the number of nucleotide substitutions out when only small numbers of sequences in regions separating two species to the level of nucleotide site of both normal and restricted recombination are variability within species, in a test of the neutral available for comparing related species or populations, theory of molecular evolution. On the pure neutral since the effects of recombination are confounded theory, this ratio should be the same for silent and with functional differences among proteins. When a replacement sites; on a model of adaptive protein large bank of gene sequences in sets of populations sequence evolution, it should be higher for replacement and related species becomes available, this problem sites than for silent sites, assuming that the latter are should be less severe, since functional differences close to neutrality. This follows from the fact that should presumably average out. diversity is less sensitive to selection than the rate of An alternative method of examining this question is evolution, i.e. the rate/diversity ratio for selected provided by the fact that there are good reasons for versus neutral sites, Q, exceeds unity for positively believing that the level of codon bias in bacteria, yeast, selected sites (see Table 2). McDonald & Kreitman and Drosophila is affected by the chance fixation of (1991) found a significant difference in the rate/ silent alterations to third codons, which cause the diversity ratio between silent and replacement sites for adoption of sub-optimal codons and which are the Adh gene in the D. melanogaster subgroup, therefore likely to be slightly deleterious (e.g. Bulmer, suggesting that adaptive evolution had occurred at 1991; Li & Graur, 1991). One might therefore expect some amino acid sites. Since the rate of replacement the degree of selective control of codon usage, as site evolution is generally lower than that for silent

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 The effect of background selection against deleterious mutations on weakly selected, linked variants 223

sites (Kimura, 1983), indicating negative selection At least some amino acid site variants at the Adh locus against most amino acid changes, the McDonald- must therefore be subject to positive selection, with Kreitman test for adaptive evolution with a null selection coefficients several times greater than 1 /Ne, hypothesis of Q = 1 is in fact rather conservative; in agreement with the more elaborate analysis of because Q < 1 for deleterious mutations, the rate/ Sawyer & Hartl (1992). diversity ratio for replacement changes should be less These considerations also bring out a potential bias than one in the absence of positive selection. in the test proposed by Kreitman & Aguade (1986) The considerable effect of /„ on Q (see Figs. 1-3) and Hudson, Kreitman & Aguade (1987) for selection implies that comparisons of the rate/diversity ratio on molecular polymorphism. This test uses a com- for replacement and silent sites among regions of the parison of levels of intra- and inter-specific nucleotide genome with different levels of recombination should site diversity in different regions of the genome. The indicate a general tendency towards values close to test assumes that the only differences between regions unity in regions of reduced recombination, with either are in the probabilities that mutations fall into one of higher or lower values for amino acid replacements in the two categories of effectively neutral or deleterious, regions with normal levels, depending on whether or which implies that both intra-specific and inter-specific not negative or positive selection for amino acid site diversity are affected proportionately by differences in variants is more prevalent. It is therefore of importance these probabilities. to take into account recombinational environment But the existence of a range of values of selection and breeding system when interpreting the results of coefficients, so that the fate of mutations at many sites the McDonald/Kreitman test. is controlled jointly by drift and selection, may lead to The models discussed so far assume that the sites lack of such proportionality and hence to a bias in under consideration are all subject to the same the test. As already discussed, the standing genetic selection regime. This is, of course, quite unrealistic. diversity contributed by slightly deleterious alleles Even if we confine ourselves to the problem of during their sojourn in the populations is less sensitive variation and evolution of amino acid changes, there to the magnitude of/„ S than is the rate at which they is likely to be a wide distribution of fitness effects, become fixed (Kimura, 1983, pp. 44-45). A particular from strongly deleterious to strongly advantageous region could thus have a high level of intra-specific (Kimura, 1983; Gillespie, 1991). An extreme alterna- diversity for replacement substitutions, relative to the tive to the model of fixed selection coefficients is to degree of inter-specific divergence, simply because of imagine two classes of replacement substitutions: lower selective constraints in that region. Conversely, those subject to strong negative selection, with the a low relative degree of intra-specific diversity in a product of selection coefficient and Ne equal to So, particular region need not reflect a recent hitch-hiking such that /„ So <^ — 1, and those subject to strong event (cf. Kreitman & Hudson, 1991). The potential positive selection, with the product of selection importance of this bias remains to be investigated. It coefficient and Ne equal to Slf such that/,,^ > 1. Let would not, however, produce an increase in variation the proportion of amino acid sites that fall into these at silent sites linked to a replacement site poly- two categories be (f>0 and 0X, respectively. Using the morphism, of the kind observed at the Adh locus results of section 4(i) for an autosomal locus with (Kreitman & Hudson, 1991). intermediate dominance, we obtain

•i (17 a) (iv) Sex-linked loci (17 b) The comparison of the sex-linked and autosomal cases indicates that, under the slightly deleterious (17 c) alleles model, levels of genetic variability can be lower Equation (17c) implies that Q — \ <$1f0S1. The for sex-linked loci than for autosomal loci, even if rate/diversity ratio minus one thus provides an variation is expressed relative to neutral expectation underestimate of a composite of the frequency of to correct for differences in effective population sizes positively selected sites, the frequency of unloaded between the sex-linked and autosomal cases. This gametes, and the product of selection coefficient and assumes that deleterious mutations are recessive or effective population size. Using the data on Adh partly recessive, as seems plausible for deleterious sequences in three species of the D. melanogaster alleles (Crow & Simmons, 1983). These effects are subgroup in table 2 of McDonald & Kreitman (1991), large only if mutations are fairly recessive in their and taking the number of segregating sites within effects on fitness (Table 2, Figs. 1-3). The maximum species as a crude estimate of diversity for the Adh effect is seen when there is intense sexual selection. gene, we have Q = 7 x 42/2 x 17 = 8-64. In this case,/0 Similarly, rates of evolution and Q values are likely to is close to unity, since Adh is in a region with normal be lower for sex-linked than for autosomal loci under recombination rates (Lindsley & Zimm, 1992), so that the slightly deleterious model. The converse is true for Q — \ provides a lower bound to the estimate of x St. sites subject to positive selection, if favourable

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 B. Charlesworth 224

mutations are partly recessive (Charlesworth et al. tions are expected. In selfers, the effects of such small 1987). /„ values on evolutionary rates are likely to be Little information is currently available on rates of considerably greater than the fact that the homo- molecular evolution and levels of genetic diversity for zygous effects of mutations are the target of selection, X-linked versus autosomal loci. A recent survey of so that in general the rate of evolution of even partly restriction-site variation in populations of D. melano- recessive slightly deleterious mutations will probably gaster indicates that X-linked loci tend to be less be accelerated relative to neutral evolution in highly variable than autosomal loci, even after correcting for selfing species. This contrasts with the conclusion of differences in effective population size between sex- Charlesworth (1992), who only considered loci evolv- linked and autosomal loci by multiplying diversity ing in the absence of background selection. The levels for sex-linked loci by 4/3 (Aquadro et al. 1994). converse is true for advantageous mutations. Related This may indicate some degree of selection against taxa with different breeding systems may therefore slightly deleterious effects of silent-site changes, exhibit considerable differences in rates of molecular consistent with the evidence on codon bias (Kliman & evolution, as well as in diversity levels. Correlations Hey, 1993). Alternatively, hitch-hiking events may be between these rates and levels of self-fertilization in commoner on the X than on the autosomes, if hermaphroditic groups could provide a test of the favourable mutations are predominantly recessive or slightly deleterious mutation versus adaptive mutation partly recessive in their effects on fitness, thus leading models. These effects may involve the plastid and to a higher frequency of selective sweeps on the X mitochondrial genomes, as well as the nuclear genome, compared with the autosomes (Aquadro et al. 1994). since there is effectively complete linkage between the The difference between X-linked and autosomal nuclear and maternally transmitted genomes in a variation noted by Aquadro et al. (1994) may even be completely selfing or asexual population (Charles- underestimated by the fact that the standard cor- worth et al. 1993). rection for sex-linkage assumes no sexual selection. Joel Peck (in preparation) has pointed out that the But equations (7) imply that this is inappropriate low values of/0 which are likely to prevail in asexual when there is intense sexual selection. With a highly taxa imply that such groups may have much lower male-biased breeding sex ratio (Nm <§ Nt) and as- rates of substitution of new favourable mutations suming no sex differences in mutation rate, we find than sexual taxa, and that this may contribute to the that the neutral values of TT are \6f0Nmv for the geologically short duration of many asexual taxa autosomal case and 18f0Nmv for the sex-linked case. (Maynard Smith, 1978; Bell, 1982). In other words, neutral diversity in the case of sex- linked loci is slightly higher than that for autosomal loci with the same mutation rate, with intense sexual (vi) Hitch-hiking versus background selection selection. Values for more moderate intensities of It is intuitively obvious that the hitch-hiking effects of sexual selection will be intermediate. linked favourable mutations that sweep to fixation In mammals, mutation rates in males are higher will have qualitatively similar effects on both the level than in females (Miyata et al. 1987; Charlesworth, of genetic diversity and rate of evolution. Birky & 1993; Crow, 1993). Since X chromosomes in species Walsh (1988) found this to be true for the rate of with male heterogamety spend less time in males than evolution due to the fixation of both advantageous do autosomes, both rates of evolution and levels of and deleterious alleles. There is an extensive literature diversity should be higher for the autosomes if on the effects of hitch-hiking on neutral genetic mutation rates are higher in the male germ line. In variation (Maynard Smith & Haigh, 1974; Ohta & Drosophila there is, however, no solid evidence for Kimura, 1975; Thomson, 1977; Birky & Walsh, 1988; such a sex difference in mutation rate (Woodruff, Kaplan, Hudson & Langley, 1989; Stephan, Wiehe & Slatko & Thomson, 1983; Crow, 1993). Mutation rate Lenz, 1992; Wiehe & Stephan, 1993), which shows differences between the sexes have little or no effect on that such variation can be drastically reduced, just as measures of diversity and evolutionary rates when in the case of background selection (Charlesworth these are expressed relative to the neutral values, as et al. 1993). has been the focus of attention here. There may, however, be some quantitatively differ- ent patterns in the effects of hitch-hiking and background selection on neutral variation, which (v) Asexual and self-fertilizing species would permit them to be distinguished empirically Very much smaller f0 values are expected for asexual (Charlesworth et al. 1993; Aquadro et al. 1994). For and predominantly self-fertilizing species than for the example, a complete loss of all genetic variability due relatively small segments of the genome subject to to the fixation of a favourable mutation that remains restricted recombination in random-mating popula- completely linked to the region under study is likely to tions (Charlesworth et al. 1993), so that large be followed by a long recovery period in which differences in both evolutionary rates and levels of variability restored by mutation is skewed towards diversity from comparable random-mating popula- rare variants, compared with the neutral equilibrium.

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 The effect of background selection against deleterious mutations on weakly selected, linked variants 225

Selective sweeps may thus give significantly negative the effective initial frequencies of a new mutation values of Tajima's (1989) D statistic. With background present as a single copy in a female in the autosomal selection, a negative D is unlikely to be found in a and sex-linked cases, respectively, are sample of realistic size (Charlesworth et al. 1993, in preparation; Hudson, 1994). In general, it seems that 1 (Ala) Drosophila loci in regions of reduced recombination 4/oAT, fail to show negative D values, yet their reduction in 1 variation is too great to be entirely caused by (Alb) background selection (Aguade, Meyers, Long & Langley, 1994; Aquadro et al. 1994). Another possible The corresponding expressions for males, pAm and approach would be to compare the degrees of pXm, are obtained by substituting Nm for Nt in these reduction in variation associated with reduced re- expressions. combination proximal to the centromere for X-linked In the autosomal case, variants present in males and versus autosomal loci in Drosophila. Since /0 is females have equal probabilities of having arisen in expected to be smaller for this region for the autosomal male and female parents. The expected numbers of arms than for the X (Charlesworth et al. 1993), new mutations per site that appear in breeding females background selection predicts a smaller reduction for and males are thus N,(v{ + um) and Nm(v, + vm) re- the X chromosome than for the autosomes. If no such spectively. Substituting the initial frequencies from pattern, or the opposite relation, were to be observed, equation (Ala) into equations (2a) and (3), we the hitch-hiking model would be supported, since a obtain the substitution rate for the autosomal case as higher rate of selective sweeps can occur on the X chromosome under suitable conditions (Aquadro et al. 1994). At present, neither model on its own seems The expression for the sex-linked case is slightly to be capable of fully explaining the data. different, since mutations carried in males must have It is unclear whether such differences between hitch- arisen in females: hiking and background selection will cause detectable differences in the relative frequencies of selected and Kx = (w, + »J NJ0 u(Pxt) + vt N,J0 u(Pxm). (A 2b) neutral alleles, when regions of normal and restricted In the neutral case, the fixation probabilities are equal recombination are compared. For example, a complete to the initial frequencies multiplied by /„, yielding loss of variability due to a selective sweep will cause equations (4) of the text. both slightly deleterious and neutral alleles to remain A similar procedure can be used to obtain expres- at low frequencies during the period of recovery from sions for the expected nucleotide site diversities, by the sweep. There is thus likely to be a similar pattern substituting from equations (1) into equation (6 a) and of low but similar levels of diversity for both classes of multiplying the appropriate H values by the numbers variant, as with background selection. This question of new mutations that arise per generation, discounted requires further quantitative study. by/0 to allow for the fact that background selection I thank Deborah Charlesworth, Charles Langley, and eliminates a fraction/,, of new variants. The following Allen Orr for discussions that led to the preparation of this formulae are obtained for the diversities for autosomal paper. I also thank Charles Aquadro, Jody Hey, Charles and sex-linked loci: Langley, Joel Peck, and Marta Wayne for providing me with their unpublished results. Charles Aquadro, Deborah nA « 8/« BNe(vt + vj {N, u(pj + Nm u(pAm)} (A 3 a) Charlesworth, Richard Hudson, Charles Langley, Wolf- gang Stephan, and Bruce Walsh made helpful comments on nx « 8/3 BNe{(Vl + vj N, u(Pxl) + vt Nm u(PxJ}. (A 3 b) the manuscript. I am especially grateful to Richard Hudson for pointing out an error in the original version. In the neutral case, 5 = 0-5; using the neutral fixation probabilities we obtain equations (7) of the text.

Appendix (ii) Proof that Q is an increasing function o/f0S (i) General formulae for rates of substitution and From equation (13), we can write nucleotide site diversities In the case of a random-mating population, assume (A 4) that the numbers of breeding adult females and males in each generation are N, and Nm, respectively. The where x = 2/0 S. total number of breeding adults is N = Nt + Nm. With Using primes to denote derivatives, we have autosomal inheritance, males and females make equal contributions to the gene pool produced by these Q\x)ozxg{x), (A 5) individuals. With sex-linkage and male heterogamety, where females contribute two-thirds of the gene pool and males one-third. Using the argument of section 2(ii), 17 GRH 63

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 B. Charlesworth 226

Using a Taylor expansion, we find that g(0) = 0, Charlesworth, B. (1992). The genomic mutation rate for g(0 + e) > 0, and g(0 — e) < 0, if e is taken sufficiently fitness in Drosophila. Nature 359, 58-60. small. Using primes to denote derivatives, we also Hudson, R. R. (1994). Gene trees with background selection. In Non-neutral Evolution: Theories and Molecular Data have (ed. G. B. Golding), in press. London: Chapman & Hall. g'{x) = 1 — (x+ \)erz (A 6) Hudson, R. R., Kreitman, M. & Aguade, M. (1987). A test of neutral molecular evolution based on nucleotide data. It is easily seen that g'(x) > 0 for x > 0, so that Genetics 116, 153-159. g{x) > 0 for all x > 0. Similarly, g"(x) < 0 for x < 0, Ikemura, I. & Wada, K. (1991). Evident diversity of codon usage patterns of human genes with respect to chromo- and so g'(x) < 0 for all x < 0. Hence, g(x) < 0 for all some banding patterns and chromosome numbers; re- JC<0. lation between nucleotide sequence data and cytogenetic This establishes that Q\x) > 0 for all x. data. Nucleic Acids Research 19, 4333-4339. Kaplan, N. L., Hudson, R. R. & Langley, C. H. (1989). The 'hitch-hiking' effect revisited. Genetics 123, 887-899. References Kimura, M. (1962). On the probability of fixation of a mutant gene in a population. Genetics 47, 713-719. Aguade, M., Meyers, W., Long, A. D. & Langley, C. H. Kimura, M. (1969). The number of heterozygous nucleotide (1994). Reduced DNA sequence polymorphism in the sites maintained in a finite population due to steady flux su(s) and su(w°) regions of Drosophila melanogaster as of mutations. Genetics 61, 893-903. revealed by SSCP and stratified DNA sequencing. Proceedings of the National Academy of Sciences, USA (in Kimura, M. (1971). Theoretical foundations of population press). genetics at the molecular level. Theoretical Population Biology!, 174-208. Aquadro, C. F., Begun, D. J. & Kindahl, E. C. (1994). Kimura, M. (1983). The Neutral Theory of Molecular Selection, recombination, and DNA polymorphism in Drosophila. In Non-neutral Evolution: Theories and Mol- Evolution. Cambridge: Cambridge University Press. ecular Data (ed. G. B. Golding), in press. London: Kimura, M. & Ohta, T. (1971). Theoretical Aspects of Chapman & Hall. . Princeton, NJ: Princeton University Press. Ashburner, M. (1989). Drosophila. A Laboratory Handbook. Cold Spring Harbor, N.Y.: Cold Spring Harbor Lab- Kliman, R. M. & Hey, J. (1993). Reduced natural selection oratory Press. associated with low recombination in Drosophila melano- Begun, D. J. & Aquadro, C. F. (1992). Levels of naturally gaster. Molecular Biology and Evolution 10, 1239-1258. occurring DNA polymorphism correlate with recom- Kondrashov, A. S. (1988). Deleterious mutations and the bination rate in D. melanogaster. Nature 356, 519-520. evolution of sexual reproduction. Nature 336, 435-440. Begun, D. J. & Aquadro, C. F. (1993). African and North Kreitman, M. (1991). Detecting selection at the level of American populations of Drosophila melanogaster are DNA. In Evolution at the Molecular Level (ed. R. K. very different at the DNA level. Nature 365, 548-550. Selander, A. G. Clark and T. S. Whittam), pp. 202-221. Bell, G. (1982). The Masterpiece of Nature. London: Croom- Sunderland, MA: Sinauer. Helm. Kreitman, M. & Aguade, M. (1986). Excess polymorphism Berry, A. J., Ajioka, J. W. & Kreitman, M. (1991). Lack of at the alcohol dehydrogenase locus in Drosophila melano- polymorphism on the Drosophila fourth chromosome gaster. Genetics 114, 93—110. resulting from selection. Genetics 129, 1111-1117. Kreitman, M. & Hudson, R. R. (1991). Inferring the Birky, C. W. & Walsh, J. B. (1988). Effects of linkage on evolutionary history of the Adh and Adh-dup loci in rates of molecular evolution. Proceedings of the National Drosophila melanogaster from patterns of polymorphism Academy of Sciences, USA 85, 6414-6418. and divergence. Genetics 127, 565-582. Bulmer, M. G. (1991). The selection-mutation-drift theory Li, W.-H. & Graur, D. (1991). Fundamentals of Molecular of synonymous codon usage. Genetics 129, 897-907. Evolution. Sunderland, MA: Sinauer. Charlesworth, B. (1992). Evolutionary rates in partially self- Lindsley, D. L. & Zimm, G. G. (1992). The Genome of fertilizing species. American Naturalist 140, 126-148. Drosophila melanogaster. San Diego, CA: Academic Charlesworth, B. (1993). More mutations in males. Current Press. Biology 3, 466-467. Maynard Smith, J. (1978). The Evolution of Sex. Cambridge: Charlesworth, B. (1994). Patterns in the genome. Current Cambridge University Press. Biology 4, 182-184. Maynard Smith, J. & Haigh, J. (1974). The hitch-hiking Charlesworth, B., Coyne, J. A. & Barton, N. H. (1987). The effect of a favourable gene. Genetical Research 23, 23-35. relative rates of evolution of sex chromosomes and McDonald, I. H. & Kreitman, M. (1991). Accelerated autosomes. American Naturalist 130, 113-146. protein evolution at the Adh locus in Drosophila. Nature Charlesworth, B., Morgan, M. T. & Charlesworth, D. 351, 652-654. (1993). The effect of deleterious mutations on neutral Miyata, T., Hayashida, H., Kuma, K., Mitsuyasu, K. & molecular variation. Genetics 134, 1289-1303. Yasunaga, T. (1987). Male-driven molecular evolution: a Crow, J. F. (1993). How much do we know about model and nucleotide sequence analysis. Cold Spring spontaneous human mutation rates? Environmental and Harbor Symposia on Quantitative Biology 52, 863-867. Molecular Mutagenesis 21, 122-129. Mukai, T., Chigusa, S. I., Mettler, L. E. & Crow, J. F. Crow, J. F. & Simmons, M. J. (1983). The mutation load in (1972). Mutation rate and dominance of genes affecting Drosophila. In The Genetics and Biology of Drosophila (ed. viability in Drosophila melanogaster. Genetics!!, 335—355. M. Ashburner, H. L. Carson and J. N. Thomson), pp. O'Brien, S. J. (1993). The genomics generation. Current 1-35. London: Academic Press. Biology 3, 395-397. Ewens, W. J. (1979). Mathematical Population Genetics. Ohta, T. (1974). Mutational pressure as main cause of Berlin: Springer-Verlag. molecular evolution. Nature 252, 351-354. Gillespie, J. H. (1991). The Causes of Molecular Evolution. Ohta, T. (1992). The nearly neutral theory of molecular Oxford: Oxford University Press. evolution. Annual Review of Ecology and Systematics 23, Houle, D., Hoffmaster, D. K., Assimacopoulos, S. & 263-286.

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365 The effect of background selection against deleterious mutations on weakly selected, linked variants 227

Ohta, T. & Kimura, M. (1975). The effect of a selected locus Tajima, F. (1989). Statistical method for testing the neutral on heterozygosity of neutral alleles (the hitch-hiking mutation hypothesis. Genetics 123, 585-595. effect). Genetical Research 25, 313-326. Thomson, G. (1977). The effect of a selected locus on linked Pollak, E. (1987). On the theory of partially inbreeding neutral loci. Genetics 85, 753-788. populations. I. Partial selfing. Genetics 117, 353-360. Wiehe, T. H. E. & Stephan, W. (1993). Analysis of a genetic Sawyer, S. A. & Hartl, D. L. (1992). Population genetics of hitchhiking model and its application to DNA poly- polymorphism and divergence. Genetics 132, 1161-1176. morphism data from Drosophila melanogaster. Molecular Stephan, W. & Mitchell, S. J. (1992). Reduced levels of Biology and Evolution 10, 842-854. DNA polymorphism and fixedbetween-populatio n differ- Woodruff, R. C, Slatko, B. E. & Thompson, J. N. (1983). ences in the centromeric region of Drosophila ananassae. Factors affecting mutation rates in natural populations. Genetics 132, 1039-1045. In The Genetics and Biology of Drosophila, Vol. 3c (ed. M. Stephan, W., Wiehe, T. H. E. & Lenz, M. W. (1992). The Ashburner, H. L. Carson and J. N. Thompson), pp. 37- effect of strongly selected substitutions on neutral poly- 124. London: Academic Press. morphism: analytical results based on diffusion theory. Wright, S. (1969). Evolution and the Genetics of Populations. Theoretical Population Biology 41, 237-254. Chicago, IL: University of Chicago Press.

Downloaded from https://www.cambridge.org/core. IP address: 170.106.40.40, on 01 Oct 2021 at 00:46:00, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672300032365