Genet. Res., Camb. (1997), 70, pp. 155–174. With 8 figures. © 1997 Cambridge University Press

The effects of local selection, balanced polymorphism and background selection on equilibrium patterns of genetic diversity in subdivided populations

BRIAN CHARLESWORTH*†, MAGNUS NORDBORG‡ AND DEBORAH CHARLESWORTH†
Department of Ecology and Evolution, University of Chicago, 1101 E 57th Street, Chicago, IL 60637, USA
(Received 3 February 1997 and in revised form 17 June 1997)

Summary

Levels of neutral genetic diversity in populations subdivided into two demes were studied by multi-locus stochastic simulations. The model includes deleterious mutations at loci throughout the genome, causing 'background selection', as well as a single locus at which a polymorphism is maintained, either by frequency-dependent selection or by local selective differences. These balanced polymorphisms induce long coalescence times at linked neutral loci, so that sequence diversity at these loci is enhanced at statistical equilibrium. We study how equilibrium neutral diversity levels are affected by the degree of population subdivision, the presence or absence of background selection, and the level of inbreeding of the population. The simulation results are compared with approximate analytical formulae, assuming the infinite sites neutral model. We discuss how balancing selection can be distinguished from local selection, by determining whether peaks of diversity in the region of the polymorphic locus are seen within or between demes. The width of such diversity peaks is shown to depend on the total species population size, rather than local deme sizes. We show that, with population subdivision, local selection enhances between-deme diversity even at neutral sites distant from the polymorphic locus, producing higher $F_{ST}$ values than with no selection; very high values can be generated at sites close to a selected locus. Background selection also increases $F_{ST}$, mainly because of decreased diversity within populations, which implies that its effects may be distinguishable from those of local selection. Both effects are stronger in selfing than outcrossing populations. Linkage disequilibrium between neutral sites is generated by both balancing and local selection, especially in selfing populations, because of linkage disequilibrium between the neutral sites and the selectively maintained alleles. We discuss how these theoretical results can be related to data on genetic diversity within and between local populations of a species.

* Corresponding author.
† Present address: ICAPB, Ashworth Laboratories, Edinburgh EH9 3JT, UK. Tel: +44(0) 131-650-5750. Fax: +44(0) 131-650-6564. e-mail: brian.charlesworth@ed.ac.uk
‡ Present address: Department of Genetics, Lund University, Sölvegatan 29, 223 62 Lund, Sweden.

Downloaded from https://www.cambridge.org/core. IP address: 170.106.34.90, on 28 Sep 2021 at 07:05:31, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672397002954

1. Introduction

The levels of genetic diversity in natural populations, and the forces maintaining this diversity, have been debated by population geneticists for many years. It is well known that diversity may sometimes be present because of the effects of genetic drift on selectively neutral mutations in finite populations (Kimura, 1983). It is also well known that various kinds of selection can maintain diversity (Crow & Kimura, 1970; Gillespie, 1991). These two processes have been studied together in theoretical models of the effects of balanced polymorphism on genetic diversity at neutral sites close to selectively maintained sites. If selection maintains two or more alleles at a site, neutral sites closely linked to the site under selection behave as though they were in different, partially isolated populations, and an accumulation of neutral mutations in the surrounding region is expected (Strobeck, 1983; Hudson & Kaplan, 1988).

The theory of this process has been well studied for randomly mating populations. At statistical equilibrium between genetic drift and mutation, the degree of increase in variability at a neutral site over the standard neutral expectation caused by linkage to a long-established balanced polymorphism is inversely related to the product of the population size and the frequency of recombination between the neutral site

and the polymorphic locus (Strobeck, 1983; Hudson & Kaplan, 1988; Nordborg, 1997). In addition, models that include the joint effects on neutral variability of population subdivision and selection at linked loci show that transient population differentiation can sometimes be greatly prolonged over standard neutral expectations when immigrant alleles at the selected loci are subject to counter-selection (Barton, 1979; Bengtsson, 1985; Barton & Bengtsson, 1986; Barton & Gale, unpublished results). Petry (1983) has examined the equilibrium variance in allele frequencies at a neutral locus among local populations when there is this type of selection at a linked locus, but no comparable study of more general neutral models has been done other than that of Nordborg (1997).

Another type of study combining neutral sites with loci under selection is the recent work on the effect on neutral diversity of selection against deleterious mutations at linked loci, known as 'background selection' (Charlesworth et al., 1993; Hudson & Kaplan, 1994, 1995; Nordborg et al., 1996a, b). In inbred populations, the effects of background selection are expected to be particularly strong, because the low heterozygosity makes recombination ineffective as far as its populational consequences are concerned (Narain, 1966). Selfing populations should therefore show considerably reduced neutral or nearly-neutral variability (Charlesworth et al., 1993; Nordborg et al., 1996b; Nordborg, 1997). Alternatively, selection may take the form of 'selective sweeps' (Maynard Smith & Haigh, 1974; Kaplan et al., 1989; Stephan et al., 1992; Braverman et al., 1995; Simonsen et al., 1995), where diversity is reduced by repeated fixations of selectively favoured alleles. Again, the reduced effective rate of genetic recombination makes this process particularly important in highly inbred populations (Hedrick, 1980), although its effects on neutral diversity have not been explicitly studied in this case. Recurrent, weak directional or fluctuating selection pressures may also reduce variability at linked sites (Gillespie, 1994).

In real populations, the situation is likely to be much more complicated than in such simple models, since it is very unlikely that only one form of selection will be acting, and species are often divided into numerous partially isolated populations. The situation when background selection and balancing selection both operate has been previously investigated by Monte Carlo simulation (Nordborg et al., 1996b). The reduction in genetic diversity due to background selection when recombination is restricted, or the population is highly inbreeding, still holds true for most neutral sites in the presence of a balanced polymorphism, but diversity shows a pronounced peak close to the locus at which the polymorphism is maintained.

Expressions for expected nucleotide site diversities have now been derived under models that allow for the simultaneous operation of several of these factors, using the coalescent method (Nordborg, 1997). This approach has generated approximations for the expected coalescence time for a pair of alleles sampled at a neutral locus, and hence for equilibrium neutral diversity levels under the infinite sites model (Kimura, 1971; Hudson, 1990), as a function of the product of population size and recombination distance from a locus with a selectively maintained polymorphism. The model incorporates both population subdivision and background selection. The selected locus which generates the peak of diversity at neutral sites can be subject to either balancing selection (which is uniform across local populations), or local selective differences (whereby different alleles are favoured in different populations). Partial self-fertilization can also be included.

The present paper describes the results of simulations of these various scenarios for populations with different levels of self-fertilization, and compares them with the analytical results. The analytical and simulation results enable two important questions to be investigated in relation to population subdivision. The first is whether the widths of peaks of diversity depend on local deme sizes, or on the population size of the species as a whole. If the widths depend on total species population size, this implies that our previous simulations using limited population sizes would generate much wider peaks than are likely in real situations, so that loci with selectively maintained variation may be detectable empirically by very sharply defined diversity peaks (Nordborg et al., 1996b). The second question is whether data on diversity at neutral loci can be used to distinguish between situations with local selective differences and ones where a balanced polymorphism is maintained within local populations. We also relate the results to empirical data on genetic diversity within and between natural populations. Finally, our findings shed light on the interpretation and utility of different statistics for describing population differentiation, when neutral loci are linked to loci under selection.

2. Methods

(i) General description of the model

The simulations of systems with many neutral sites and background selection caused by deleterious mutations at many loci, together with selection at a single further polymorphic locus, were performed as described previously for models of the simultaneous effects on neutral diversity of background selection and balancing selection (Nordborg et al., 1996a, b). A total population of $N$ diploid hermaphrodites with selfing rate $S$ was assumed. In simulations with background selection, each chromosome was assumed to have 1000 loci subject to deleterious mutations uniformly distributed along its length, with a total rate $U = 0.8$ per diploid genome. (This unrealistically high value was used in order for background selection


to produce large effects in outcrossing, as well as inbred, populations). Deleterious mutations at all loci were assumed to have the same selection coefficient (0.1) against homozygous mutations, in all the runs involving background selection. The dominance coefficient was 0.2 (except for a few runs with a value of 0.5: see below), so that the fitness of heterozygotes was 0.98 relative to homozygotes for wild-type. To model populations without background selection, $U$ was simply set to zero.

Two demes, of equal and constant size $N/2$, were assumed in these simulations. Migration between the demes was assumed to occur with rate $m$ (the probability that an individual is an immigrant from the other deme). Two alternative selection regimes were used, in addition to the selection just described against deleterious mutations. In both models, all fitness effects interacted multiplicatively to determine genotypic fitness. One selection model assumed the same regime of balancing selection in both demes as in our previous simulations (i.e. there was a locus subject to frequency-dependent selection in the centre of the chromosome). In order to maintain an average allele frequency of 0.5, the fitness of homozygotes for the most common allele in this model was assumed to be $2(1-p)$, and the fitness of the heterozygote was $1.5-p$, where $p$ is the frequency of the common allele (Nordborg et al., 1996b). As discussed by Nordborg et al. (1996b), the exact model of balancing selection is not important for the results; all that matters is that the alleles at the locus in question are maintained indefinitely in the population, close to the deterministic equilibrium frequency of one half.

Alternatively, local selection (without balancing selection within demes) was modelled by assuming directional selection acting in opposite directions in the two subpopulations. The gene involved will be referred to as the locus under local selection. The heterozygous and homozygous fitnesses at this locus are denoted by $1-hs$ and $1-s$, respectively (see Appendix, section (i)). The selection model assumed $h = 0.5$, and $s = 0.5$ for 'strong', or $s = 0.1$ for 'moderate', local selection. With free migration ($m = 0.5$), the model is equivalent to Levene's (1953) model for the maintenance of polymorphism in a randomly mating population with spatial variation in genotypic fitnesses, and the condition $h \le 0.5$ is sufficient for Levene's conditions for maintenance of polymorphism in an infinite population to be met. It is well known that restriction of migration reduces the stringency of these conditions (Karlin, 1982). With the population sizes and selection coefficients used in the simulations, variation is maintained almost indefinitely. The simulations were intended to show the qualitative effects of the various forces studied, and their relative magnitudes, so we did not choose biologically plausible values of the selection coefficients at the locus responsible for selection for local adaptation. Rather, we used values that would produce clear-cut effects. We refer to the selected locus in both cases as the polymorphic locus.

(ii) Simulation methods

In each generation, the sequence of events was as follows. To model migration between the two demes, a random number is picked for each new individual being generated in a given subpopulation in the next generation, to decide whether it is to be a migrant or not. If it is a migrant, it is created from a parent or parents in the other subpopulation, instead of coming from the same subpopulation. If the individual is to be a product of self-fertilization, a single parent is chosen; otherwise two parents are chosen. Parents are chosen according to their relative fitnesses within each deme, in a similar manner to the method used in our previous simulations (Nordborg et al., 1996a, b). Each haplotype of the new individual is then created from the genome of the parent by the recombination and mutation processes. No interference between crossovers was assumed in the recombination process. The whole procedure corresponds to fertility selection among adults, followed by gamete production (involving recombination and mutation), followed by migration of zygotes or juveniles.

To obtain the simulation data on neutral diversity under an infinite-sites model (Kimura, 1971), 100 neutral sites were modelled. In most of the runs, individuals had one chromosome, each with a map length of 1 morgan, and with 50 neutral sites on each side of the locus subject to balancing or local selection. In what follows, we refer to the frequency of recombination between the polymorphic locus and a neutral site as $r$. Some runs were done with three chromosomes, with 50 neutral sites on the chromosome with the polymorphic locus in the centre, as just described, and 25 on each of the other two chromosomes, i.e. with a total of 50 loci that were unlinked to the polymorphic locus.

Nucleotide diversities ($\pi$) at each neutral site were estimated from the mean of $2\sum_t z_t(1-z_t)$, over replicated introductions at the site of single variants, where $z_t$ is the frequency of the neutral variant at time $t$, and summation is over all times until either fixation or loss occurs (Charlesworth et al., 1993; Hudson & Kaplan, 1995; Nordborg et al., 1996a, b). Each data point for a neutral site represents the mean of $2.4 \times 10^{5}$ introductions of a variant at the site in question. The total genetic diversity at the neutral sites ($\pi_T$) was also decomposed into that within and between allelic classes at the polymorphic locus. Diversity within allelic classes, which will be written here as $\pi_A$, was estimated from the mean of $2\sum_t \{x_t(1-x_t) + y_t(1-y_t)\}$, where $x_t$ and $y_t$ are the frequencies of the neutral variant within the first and second allelic classes, respectively. Diversity between allelic classes with respect to the polymorphic locus was calculated as the difference between the total diversity values and $\pi_A$,


and is denoted by $\pi_{T-A}$. The within-deme diversity ($\pi_S$) was calculated in an analogous manner, using frequencies of the variants in the two demes, rather than two allelic classes. The component of diversity between subpopulations is denoted by $\pi_{T-S}$. We thus have

$$\pi_{T-A} = \pi_T - \pi_A \tag{1a}$$

and

$$\pi_{T-S} = \pi_T - \pi_S \tag{1b}$$

All the results on neutral site diversity, including those for non-zero selfing rates, are expressed relative to neutral expectation for the case of $S = 0$, i.e. no corrections were made for the reduction in effective population size due to selfing (Pollak, 1987).

$F_{ST}$ values were calculated as the ratio of the between-deme component of diversity to the total (Nei, 1987, p. 162), i.e.

$$F_{ST} = 1 - \frac{\pi_S}{\pi_T} \tag{1c}$$

This is equivalent to $K_{ST}$ of Hudson et al. (1992), which is derived from measures of the numbers of nucleotide differences between sequences. A similar quantity, $F_{AT}$, can be defined for variation between allelic classes at the polymorphic locus.

3. Results

(i) Analytical theory for subdivided populations

Our simulations were guided by approximate analytical results based on coalescent theory which are presented elsewhere (Nordborg, 1997). In that paper, however, the case of low migration rates between subpopulations subject to local selection was not considered. Since this case is biologically important, it is treated in the Appendix, Section (i). To facilitate comparisons of the simulation results with the analytical equations for equilibrium diversity levels in the different situations studied, the relevant formulae are summarized in Table 1. As will be shown, there is generally good agreement between the formulae and simulation results, and qualitatively the theory predicts most of the effects observed. We here describe the effects of the different factors (polymorphism maintained by local selection versus balancing selection within demes, population subdivision versus free migration, presence or absence of background selection, and inbreeding versus outcrossing), making clear where there are discrepancies between the theoretical and simulation results.

The formulae predict neutral diversity values as functions of the migration rates between the subpopulations, the allele frequency $q$ of the counter-selected alleles in each subpopulation under local selection, the equilibrium inbreeding coefficient for a neutral locus in an infinite population with selfing rate $S$, $\hat{F} = S/(2-S)$ (Crow & Kimura, 1970, p. 94), and the frequencies of recombination between the neutral sites and the selectively maintained locus. In the formulae in Table 1, $\alpha$ is 1 in outcrossing populations without background selection, and is reduced by inbreeding or background selection (see Nordborg et al., 1996a, b; Nordborg, 1997). As explained in Section 2(ii), diversity is partitioned into biologically meaningful and observable components: within-population diversity, diversity within allelic classes with respect to the polymorphic locus, diversity between allelic classes with respect to the polymorphic locus, and diversity between demes.

(ii) No population subdivision

We start with the simplest case of random mating and no background selection. Because of the high migration rate, the allele frequencies at the polymorphic

Table 1. Expected neutral nucleotide site diversities relative to the standard value for a randomly mating population of size N

| | Within allelic classes | Between allelic classes | Within subpopulations | Between subpopulations |
|---|---|---|---|---|
| Free migration | $\alpha$ | $\dfrac{1}{4N\tilde{r}}$ | $\alpha + \dfrac{1}{4N\tilde{r}}$ | $0$ |
| Subdivision with balancing selection | $\alpha + \dfrac{1}{8Nm} + \dfrac{1}{4N(2m+\tilde{r})}$ | $\dfrac{1}{4N\tilde{r}}$ | $\alpha + \dfrac{1}{4N\tilde{r}} + \dfrac{1}{4N(2m+\tilde{r})}$ | $\dfrac{1}{8Nm}$ |
| Subdivision with local selection | $(1-q)\alpha + \dfrac{1}{2N\tilde{s}}$ | $\dfrac{1}{8N\tilde{m}}$ | $(1-q)\alpha + \dfrac{1}{2N\tilde{r}}$ | $\dfrac{1}{8N\tilde{m}}$ |

The first two columns give diversity partitioned with respect to allelic class; the last two give diversity partitioned with respect to subpopulation. $\alpha$ represents the effects on diversity of the frequency of self-fertilization and background selection ($\alpha = 1$ in the absence of these factors, and is otherwise $< 1$). Expressions for $\alpha$ are given by Nordborg (1997). $\tilde{s}$ is the effective selection coefficient against a rare allele with local selection and partial self-fertilization; $\tilde{r}$ is the effective recombination frequency with partial self-fertilization (Nordborg, 1997); $\tilde{m}$ is the effective migration rate with local selection and partial self-fertilization. See the text and Section (i) of the Appendix for further details.
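As a rough illustration of how the entries of Table 1 combine, the following Python sketch evaluates the free-migration and balancing-selection rows. The function names and dictionary keys are ours, introduced only for illustration. The quantities $\alpha$ and $\tilde{r}$ are taken as inputs rather than computed from the expressions of Nordborg (1997); the defaults assume the outcrossing case without background selection, for which we take $\alpha = 1$ and $\tilde{r} = r$.

```python
def free_migration(N, r_eff, alpha=1.0):
    """Table 1, free-migration row (diversities relative to the neutral value).

    N is the total population size, r_eff the effective recombination
    frequency (assumed equal to r for an outcrossing population) and
    alpha the factor for selfing and background selection (assumed 1 here).
    """
    linked = 1.0 / (4 * N * r_eff)  # excess diversity due to linkage
    return {
        "within_allelic": alpha,
        "between_allelic": linked,
        "within_deme": alpha + linked,
        "between_deme": 0.0,
    }


def balancing_with_subdivision(N, m, r_eff, alpha=1.0):
    """Table 1, row for two demes with the same balancing selection in each."""
    migration_term = 1.0 / (8 * N * m)
    joint_term = 1.0 / (4 * N * (2 * m + r_eff))
    linked = 1.0 / (4 * N * r_eff)
    return {
        "within_allelic": alpha + migration_term + joint_term,
        "between_allelic": linked,
        "within_deme": alpha + linked + joint_term,
        "between_deme": migration_term,
    }


if __name__ == "__main__":
    # Parameter values used in the simulations: N = 2000, m = 0.001.
    # r_eff = 0.01 is an arbitrary illustrative value for a linked site.
    row = balancing_with_subdivision(N=2000, m=0.001, r_eff=0.01)
    for component, value in row.items():
        print(f"{component}: {value:.4f}")
```

In each row the two partitions sum to the same total diversity, which provides a simple consistency check on the transcribed formulae.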


locus will be very similar in the two demes. With free migration (Table 1 with $Nm \to \infty$), terms involving $1/Nm$ will be very small and we thus expect the two models of selection maintaining polymorphism to yield similar results, with a major fraction of the total diversity being within subpopulations, and a small between-population component of diversity ($\pi_{T-S}$). Diversity within allelic classes, $\pi_A$, should show no dependence on proximity to the polymorphic locus.

Our simulation results for a single, non-subdivided ($m = 0.5$), completely outcrossing population with a total $N$ value of 2000 agree with these predictions, for both types of selection. $\pi_S$ values are close to 1 for outcrossing populations with no background selection, and are similar at sites throughout the entire chromosome. $\pi_{T-S}$ is low and also bears no relation to map position.

The component of diversity between allelic classes with respect to the polymorphic locus, $\pi_{T-A}$, is also low for most sites, but there is a peak at sites close to the polymorphic locus, again as expected (see Table 1). The simulation results for this component of diversity for the two selection models are shown in Fig. 1. The values of $\pi_{T-A}$ from the simulations are plotted with the chromosome folded at the polymorphic locus, so that each point shown represents the mean of data from two sites, one from each chromosome arm. Note that, because the simulation models assume no interference, $r \approx 0.32$ for the most distal loci on the chromosome (which are 50 cM from the selected locus), according to Haldane's (1919) mapping function. The predicted diversity values, using the equations in Table 1, are also plotted. Since the analytical theory assumes small $Nr$ (Nordborg, 1997), these predictions are expected to work only for diversity at neutral sites which are rather close to the polymorphic locus. In practice, however, they are often surprisingly similar to the observed simulation results, even for quite loose linkage. For the purposes of making the general nature of the results clear, and since qualitative agreement is excellent, we therefore show the results of plotting the equations for the entire chromosome.

Fig. 1. Neutral genetic diversities from simulations of outcrossing populations of size $N = 2000$, with free migration between subpopulations. Diversity values between allelic classes ($\pi_{T-A}$) with respect to a polymorphic locus (located at map position zero) are shown as a function of map distance (in morgans). Each data point from the simulation results (filled diamonds) is the average of two results, from equivalent neutral sites on each side of the selected locus. Values predicted from the analytical theory are shown as continuous lines. (a) Results of simulations assuming a balanced polymorphism maintained by frequency-dependent selection (see text). (b) Results with a polymorphism maintained by strong local selection. There is no background selection.

Without population subdivision, genetic diversity under local selection is, as expected, indistinguishable from that when there is balancing selection within each population. Both forms of selection produce peaks in $\pi_{T-A}$ at sites close to the polymorphic locus (whereas there is no such relationship within allelic classes). This is true in populations with high selfing (cf. Nordborg et al., 1996b), as well as with random mating (results not shown).

Since the diversity within allelic classes, $\pi_A$ (which is the major component in the models without population subdivision), shows essentially no relationship to the distance from the polymorphic locus, we can simplify the description of the results, and make it easier to compare the effects of selfing and other factors, by comparing the predicted $\pi_A$ with the average from simulations with the same parameter values. The results are summarized in Fig. 2. The first two pairs of columns for each selfing rate set correspond to the two selection models whose results have just been described, illustrating the good quantitative agreement for both outcrossing and selfing populations. The analytical theory accurately predicts the effects of selection on $\pi_A$ for both the balancing selection and local selection models without background selection, even with high levels of selfing (Fig. 2a). The effect of inbreeding on $\pi_A$ almost reaches its full extent once populations are 90% selfing, and a further increase in the selfing rate has little further effect on this diversity measure. Populations with 50% selfing have diversities intermediate between the fully outcrossing and 90% selfing values (results not shown). The quantitative predictions for diversity


between allelic classes with respect to the polymorphic locus are not in complete agreement with the simulation results. For instance, selfing increases $\pi_{T-A}$ much as predicted, but the values are slightly higher than expected at sites loosely linked to a locus under balancing or local selection (Figs. 1, 3a).

Fig. 2. Observed and expected diversities within allelic classes ($\pi_A$) with respect to a polymorphic locus, for populations of total size 2000 with a range of selfing rates ($S = 0$, $S = 0.9$ and $S = 0.99$). The results for the two selection models are displayed, with or without subdivision into two demes with restricted migration. (a) Results without background selection; (b) with background selection. The − and + signs in the boxes below (b) indicate the operation of balancing selection or strong local selection, respectively; the values of the migration rate $m$ (0.5 or 0.001) are indicated in the lower set of boxes.

Fig. 3. Observed and expected diversities between allelic classes ($\pi_{T-A}$) with respect to a locus under balancing selection (with no local selection or population subdivision), as a function of map distance in morgans, for populations of total size 2000 with selfing rate $S = 0.9$. (a) Results without background selection; (b) with background selection.

With background selection also operating, diversity within populations, and within allelic classes with respect to the polymorphic locus, is reduced. Fig. 2b shows that, for both selection models with free migration, the observed $\pi_A$ values are similar to the expected ones, although the reduction due to background selection is generally somewhat greater than predicted by the analytical theory, in both inbreeding and outcrossing populations. Simulation results are missing for the cases with 99% selfing with strong local selection and background selection, because the polymorphism was not maintained under these conditions, presumably because the low effective population size induced by background selection with this selfing rate allowed fixation of one allele or the other. Despite these minor discrepancies, however, overall agreement with the analytical formulae is good, and the simulations agree with the theory in showing a pronounced peak of diversity when chromosomes with different alleles at the polymorphic locus are compared, but show no such localized effect for the other diversity measures. A more important disagreement with the theoretical results is that $\pi_{T-A}$ values are also rather lower than predicted, at least in populations with $S = 0.9$ with a polymorphism maintained by frequency-dependent balancing selection (Fig. 3b), i.e. background selection decreases diversity between allelic classes as well as within them.

(iii) Subdivided populations with balancing selection

The equivalence between the effects of balancing selection and local selection on genetic diversity at neutral sites no longer holds when there is population subdivision, i.e. with restricted migration (Table 1). When $m$ is reduced to 0.001 ($Nm = 2$) the theoretical results again give good predictions of diversity within allelic classes, for both inbreeding and outcrossing populations without background selection (right-hand pairs of results within each set in Fig. 2a). For both selection models, unless there is both high selfing and


background selection (see below), diversity within allelic classes is the major component, except at sites close to the polymorphic locus. Its magnitude is similar for both models, and $\pi_S$ is roughly the same as $\pi_A$ for most sites.

Fig. 4. Observed and expected diversities between two subpopulations ($\pi_{T-S}$) with no local selection, for populations of total size 2000, migration rate $m = 0.001$, and selfing rates from 0 to 0.99. Simulation results with and without background selection are shown.

When there is a balanced polymorphism maintained by frequency-dependent selection within demes, most of the diversity is, as expected, within the demes, and there is a peak in the between-allele component of diversity close to the polymorphic locus. As predicted by the analytical results, the effect of the balanced polymorphism on neutral diversity is proportional to $1/(4N\tilde{r})$ (where $\tilde{r}$ is the effective recombination rate; see Appendix, Section (i)), just as in the case of a single population of size $N$. The division into two smaller populations slightly reduces $\pi_S$. $\pi_{T-S}$ is now non-zero, due to the fact that migration between populations is restricted; it bears no relationship to the map position of the neutral sites with respect to the polymorphic locus. From the formula in Table 1, we expect $\pi_{T-S} = 1/(8Nm)$ for any selfing rate, i.e. a value of 0.0625 with our parameters, but the means from the simulations are slightly lower than this value (Fig. 4). The effect of background selection on $\pi_S$ is similar to that in a single population, but $\pi_{T-S}$ is also somewhat reduced (Fig. 4), which is not predicted by the analytical theory. As with the reduction in $\pi_{T-A}$ (see above), the effect is stronger, the higher the selfing rate.
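The expectation $\pi_{T-S} = 1/(8Nm)$ quoted above, and its consequence for $F_{ST}$ as defined in equation (1c), can be checked with a short calculation. The sketch below is illustrative only. It assumes a neutral site far enough from the polymorphic locus that the linkage terms of Table 1 can be neglected, so that $\pi_S \approx \alpha$; the value $\alpha = 0.5$ is a hypothetical reduction due to background selection, not a simulation result, and the function names are ours.

```python
def between_deme_diversity(N, m):
    """Expected between-deme diversity 1/(8Nm) for two demes (Table 1)."""
    return 1.0 / (8 * N * m)


def fst(pi_within, pi_between):
    """F_ST = 1 - pi_S / pi_T, with pi_T = pi_S + pi_(T-S) as in (1b) and (1c)."""
    pi_total = pi_within + pi_between
    return 1.0 - pi_within / pi_total


if __name__ == "__main__":
    N, m = 2000, 0.001  # values used in the simulations
    pi_ts = between_deme_diversity(N, m)

    # Assumption: pi_S is approximated by alpha (linkage terms neglected).
    fst_no_background = fst(pi_within=1.0, pi_between=pi_ts)
    fst_background = fst(pi_within=0.5, pi_between=pi_ts)  # hypothetical alpha

    print(f"pi_T-S = {pi_ts:.4f}")
    print(f"F_ST with alpha = 1.0: {fst_no_background:.3f}")
    print(f"F_ST with alpha = 0.5: {fst_background:.3f}")
```

Under these assumptions $F_{ST}$ rises from about 0.059 to about 0.111 when $\alpha$ is halved, even though $\pi_{T-S}$ is unchanged, which illustrates how a reduction in within-deme diversity alone can inflate $F_{ST}$.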

Fig. 5. Observed and expected diversities between two subpopulations of a population of total size 2000, and migration rate 0·001. There is no background selection. Results with moderate local selection are shown for the case of outcrossing (a), and with a selfing rate 0·9 (b); (c) and (d) show the same two cases with strong local selection. Diversity values are displayed as functions of map distance from the locus subject to local selection. [Four panels (a)-(d): πT−S on the vertical axis, distance in morgans (0 to 0·5) on the horizontal axis.]


Fig. 6. Simulation results with and without background selection for diversity between allelic classes with respect to a locus subject to moderate local selection, for a subdivided population of total size 2000 (m = 0·001) and selfing rate 0·9. Diversity values are displayed as functions of map distance from the locus subject to local selection. [Vertical axis: πT−A; horizontal axis: distance in morgans (0 to 0·5).]

Fig. 7. Observed and expected diversities within (a) and between (b) subpopulations of a population of total size 2000, m = 0·001 and S = 0·9, with strong local selection and background selection. The continuous lines are the analytical predictions. Diversity values are displayed as functions of map distance from the locus subject to local selection. [Vertical axes: πS in (a), πT−S in (b); horizontal axes: distance in morgans (0 to 0·5).]

(iv) Subdivided populations with local selection

With local selection, the equilibrium frequency of the favoured allele at the polymorphic locus in each deme is usually close to 1. When s ≫ m, the frequency of the rare allele, which is present because of migration from the other deme, is given by

$$q \approx \frac{m}{\tilde{s}}, \qquad (2)$$

where s̃ is the effective selection coefficient against the minority allele in a population, defined in Section (i) of the Appendix (Haldane, 1930; Nordborg, 1997). (When the above condition does not hold, the disfavoured allele will no longer be rare, and (2) is not exact.)

Application of this expression to the diversity formulae in Table 1 shows that the πT−A component of diversity at the neutral sites in a subdivided population is expected to be increased by local selection, and this is indeed found in the simulations. Examples with moderate and strong local selection, and without background selection, are shown in Fig. 5. Although the increased diversity is especially marked at sites close to the polymorphic locus, it extends throughout the chromosome, as the equations predict, and does not disappear with large r̃. With moderate local selection, the equilibrium frequencies of the rare allele in each deme are about 0·02 and 0·01 for populations with S = 0 and S = 0·9, respectively, and the corresponding frequencies for strong selection are 0·004 and 0·002, but the actual frequencies in the simulations fluctuate widely since Nm is small. Applying (A3) to the results in Table 1, πT−S should fall to 0·073 and 0·0128 for loci at the distal end of the chromosome in outcrossing populations with moderate and strong local selection, respectively, and to 0·170 and 0·956 for populations with S = 0·9. The observed values are often higher than these values (Fig. 5). For outcrossing populations, agreement was good for moderate local selection, but with strong selection the mean value from the simulations for the most distal 25 neutral loci was about 0·30 (Fig. 5c), and 0·19 for unlinked loci (the expectation for the latter is 0·104). With 90% selfing, even moderate local selection produced greater differentiation than the equations predict, with mean πT−S values of 0·29 and 3·87 for distal loci with the two strengths of selection (Fig. 5b, d), and 0·15 and 1·55 for unlinked loci (expectations of 0·131 and 0·635, respectively). As expected, the increased diversity between demes is paralleled by a similarly high diversity between allelic classes at the polymorphic locus, which, of course, has very different allele frequencies in the two subpopulations.

With background selection also acting, both πT−S and πT−A values are somewhat reduced, as in the case without local selection described above. These results are again contrary to the theoretical predictions. With outcrossing, the mean values of πT−A for loci loosely linked to the polymorphic locus are similar to those given above without background selection. With 90% selfing, background selection reduces between-allele diversity at loci loosely linked to the polymorphic


Fig. 8. Observed FST values in simulations of subdivided outcrossing and inbreeding populations of total size 2000 (m = 0·001). (a) S = 0, no background selection; (b) S = 0, with background selection; (c) S = 0·9, no background selection; (d) S = 0·9, with background selection. Diversity values are displayed as functions of map distance from the locus subject to local or balancing selection. [Four panels (a)-(d): FST (0 to 1) on the vertical axis, distance in morgans (0 to 0·5) on the horizontal axis; separate curves for strong local selection, moderate local selection and no local selection.]

locus, to about one half of the values given above, for moderate local selection (Fig. 6), and one third for strong selection.

In inbreeding populations, small peaks of within-population neutral diversity were seen at sites close to the locus under local selection, as well as in πT−S and πT−A (Fig. 7). Such peaks of diversity might give the appearance that a balanced polymorphism is maintained within such populations, when the selection is, in fact, between localities. The existence of such peaks might seem unexpected, but can be explained by the fact that migrant genomes rarely recombine with genotypes in the recipient populations, and so the association between alleles at the polymorphic locus and those at neutral sites is persistent. Because of the rarity of the selectively disfavoured allele in a given subpopulation, the peaks should be small, relative to those between demes, as is indeed observed (Fig. 7). On this interpretation, peaks within populations should be between, rather than within, allelic classes, which is consistent with the absence of any detectable peak of diversity within alleles with respect to the polymorphic locus, and with the finding that πS exceeds πA. These peaks are predicted by the analytical theory, and should be noticeable when α is small, which will occur when there is selfing and background selection. In theory they are not restricted to selfing populations (Table 1), but in outcrossing populations are too narrow to be seen in our simulations, where the closest neutral site is 1 cM from the polymorphic locus. With background selection, πS in the simulations is also consistently lower than predicted (Fig. 7a).

Due to the nearly complete conservation of πS while πT−S is increased, total diversity is increased when there is restricted migration as well as local selection and thus, not surprisingly, the proportion of diversity that is between populations is increased by local selection, compared with the situation when a balanced polymorphism is maintained within each local population. In outcrossing populations without background selection, the πT−S values in the simulations remain considerably lower than πS (compare, for instance, the results on diversity values between demes or alleles for moderate local selection in Fig. 5a with Fig. 2a for within-allele values). This is true even


when local selection is very strong; for the outcrossing case corresponding to that just described, the mean πT−S over all sites is 0·60 and, of course, most sites distant from the polymorphic locus had lower values than this.

(v) Effects of local selection and background selection on FST

The effects on πT−S relative to πT in the simulations are illustrated for different levels of local selection in Fig. 8, in terms of FST values (see (1c)). The values agree well with the predicted ones for outcrossing populations when there is no local selection or background selection. For the parameters of Fig. 8a without local selection, FST should be unrelated to map distance from the polymorphic locus, and its expected value is 0·059, compared with a mean of 0·059 from the simulations. With local selection, however, agreement with the predictions is poorer. Since between-deme diversity values are higher in the simulations than expected (see above), FST values are higher than expected on strict neutrality. Background selection does not alter these qualitative conclusions, but it strongly decreases πS. Although πT−S values also decrease, these effects are relatively minor, so the effect is to increase FST considerably (Fig. 8a, b). With the parameter values of our simulations, πT−S for outcrossing populations nevertheless never exceeds πS (i.e. FST is always less than 0·5), even with very strong local selection, except at sites close to the polymorphic locus, where πT−S becomes extremely high in all cases where local selective differences exist (Fig. 8a, b).

With selfing and local selection, πT−S and πT−A are again similar in magnitude. The quantitative effects of local selection on πT−S, and on total diversity, can be much larger than under outcrossing, even comparing cases with only moderate selection, although πT−S is still lower than πS unless local selection is very strong (Fig. 8c). When there is background selection as well, the effect can be extremely pronounced, because πS is strongly reduced. Even with moderate local selection, most of the diversity can be between demes, even for loci which are loosely linked to the polymorphic locus (Fig. 8d).

4. Discussion

(i) The joint effects of background selection and balancing selection on neutral diversity

We have previously shown that background selection reduces genetic diversity in highly inbreeding populations at most neutral sites, even in the presence of a balanced polymorphism that is maintained for an indefinite period of evolutionary time (Nordborg et al., 1996a, b). In outcrossing populations, unusually high diversity is found only at sites very closely linked to a locus subject to balancing selection. This might not have been expected to be true in highly selfing populations, because of the reduced effective rate of recombination implied by high homozygosity (Narain, 1966). In agreement with the analytical results, simulations have shown that background selection is effective in reducing diversity, even in very highly selfing populations, except at sites quite closely linked to the polymorphic locus, although the region of enhanced diversity linked to this locus is wider than for outcrossing populations (Nordborg et al., 1996b; Nordborg, 1997).

The results presented here show that the major effect of background selection is still to reduce diversity within populations, even with subdivision of a population and local selection, as in the less complex models studied before (Charlesworth et al., 1993; Nordborg et al., 1996b), i.e. the effects of background selection within populations can largely be considered without regard to events in other demes, even when there is migration between them. There is, however, a tendency for background selection to reduce diversity between demes to some extent (see Section 3(iii)), which is not predicted by the analytical results (Table 1).

We previously suggested that peaks of diversity might offer a means of empirically detecting loci with balanced polymorphisms, and that the reduced diversity caused by background selection in inbred populations should make such populations particularly useful for such studies (Nordborg et al., 1996b). The practicality of this approach relies on the peaks being very local. Since the width of the peak of diversity around a locus with a polymorphism maintained by balancing selection is proportional to 1/(Nr̃), it will decrease with total population size. Our previous simulations used population sizes much smaller than those of most species, and so must over-predict peak widths associated with polymorphic loci. Given that real populations are often patchily distributed, with small local population sizes, it is important to know whether the peak width is determined by the sizes of the local populations or by the total species population size. In the case of two demes with symmetrical migration studied here, both the analytical and simulation results show that the peak width around a locus with a polymorphism maintained by balancing selection depends on the total species population size, not the local ones. Thus, if the species N value is 10' , only loci with r̃ of the order of 10−' are expected to see a large effect of balancing selection, so that peaks of high diversity will be very narrow, even in largely selfing populations.

The intuitive reason for this is that the decline in diversity with r̃ reflects the reduced coalescent time for the neutral sites that are far away from the selected locus (Hudson & Kaplan, 1988). With conservative migration in the sense of Nagylaki (1982) (i.e. when migration does not alter deme size), the coalescence time for a pair of neutral alleles sampled from the


same populations is determined by the metapopulation size (Maruyama, 1971; Nagylaki, 1982; Slatkin, 1987; Strobeck, 1987; Hudson, 1990). The effect of linkage to a locus maintained by balancing selection thus reflects the overall rate at which recombination events occur in the species as a whole, which is dependent on Nr̃. It remains to be tested whether this holds more generally, but there seems little reason to doubt the generality of the conclusion, given that an effective population size can always be defined for a metapopulation with an arbitrary population structure (Nagylaki, 1982; Whitlock & Barton, 1997). It can be conjectured that this effective metapopulation size would replace N in the equations in Table 1, but this remains to be rigorously established.

(ii) Effects of population subdivision and local selection on neutral diversity

Diversity differences under local selection show a different relationship with N. Between-population and between-allelic class variability depends on 1/(Nm̃), where m̃ is an effective migration rate that depends on the relation between r̃ and the intensity of local selection (Table 1). With population subdivision, such that Nm is of order 1, local selection can increase diversity between allelic classes with respect to the polymorphic locus, especially at sites closely linked to this locus (these diversity values are very similar to those shown in Fig. 5 for between-deme diversity). Even with moderate local selection and outcrossing, πT−S in our simulations rises to about 10% of the total diversity (averaged over the whole chromosome), compared with an insignificant fraction when there is no local selection. A similar, but more extreme, effect occurs in selfing populations.

Because of the difference between allele frequencies at the selected locus in the two sub-populations, this enhanced diversity is mainly between demes. Local selection lowers the effective amount of migration, because migrants will tend to carry the disadvantageous allele for the populations into which they move, and so are selected against (Barton, 1979; Petry, 1983; Bengtsson, 1985; Barton & Bengtsson, 1986; Barton & Gale, unpublished results). The net effect is to increase πT−S (and thus FST) over the neutral expectation. The analytical theory (see Table 1 and the Appendix, Section (i)) predicts substantial effects in many situations but under-predicts the effects found in the simulations, except for outcrossing populations without strong local selection (Section 3(iii)). For sites close to a locus under local selection, extremely high FST values can be generated, especially when πS is reduced by background selection (Fig. 8).

While the selection coefficients that we have used may seem unrealistically high, it should be borne in mind that larger effects are to be expected from the same total intensity of local selection, the larger the number of selected loci. This is because it is more difficult for a migrant neutral allele to disentangle itself from a block of linked loci subject to negative selection than from a single locus (Barton & Bengtsson, 1986; Barton & Gale, unpublished results). This effect will be very important in highly selfing populations, where even loci on different chromosomes behave as though closely linked (Narain, 1966; Nordborg, 1997). We thus conclude that large effects of local selection are quite possible, even for sites loosely linked to polymorphic loci, especially in inbreeding populations.

(iii) Distinguishing between local selection and balancing selection

In principle, if populations are somewhat isolated from one another, it should be possible to detect local selection, and distinguish it empirically from polymorphism maintained by balancing selection within populations, because only the former will show peaks in diversity between populations (i.e. in πT−S) at particular locations in the genome, as well as between alleles. Peaks of πS are also generated at neutral sites linked to the locus under local selection, which makes it appear as though there is balancing selection within populations (Fig. 7), but the much larger diversity peaks between populations should still make it possible to distinguish the causes of the peaks. Such tests to determine the kind of selection that is acting can, of course, be applied only if one is aware of the population boundaries, so that one can make comparisons between and within populations. If the population subdivision is not apparent, unexpectedly high πS values may be found, as is sometimes observed in inbred plant populations (Bonnin et al., 1996). This might be incorrectly attributed to balancing selection at a locus closely linked to the one studied, but is distinguishable because diversity should be high for loci throughout the genome in this case.

(iv) The effects of local or balancing selection on linkage disequilibria among neutral loci

With data on diversity at a set of loci in different populations, peaks of diversity may thus indicate the presence of selection and suggest its nature. Such peaks can also be detected as linkage disequilibrium between neutral loci, because selection at a locus produces long coalescence times for alleles at linked loci. Thus alleles linked to one allele at the polymorphic locus should tend to form a set different from those linked to the other allele. If we were to compare alleles at two of these linked loci, they would not be in linkage equilibrium. In the Appendix, Section (ii), we provide some analytical results that relate πT−A and FAT (the analogue of FST for between-allele diversity) values for a pair of loci to their expected squared


linkage disequilibrium and the Ohta–Kimura standardized linkage disequilibrium measure (Ohta & Kimura, 1969), respectively. High levels of FAT for a set of neutral loci correspond to high levels of linkage disequilibrium among them, so that a haplotype structure would be perceived if multi-locus genotype data were collected.

If evidence were obtained for such a situation, it would be helpful to ask whether the different haplotypes tend to be found in different populations, suggesting local selection. If they coexist within populations, some form of balancing selection that can maintain variability within demes is indicated. Studies of inbreeding populations have often found that multi-locus allozyme genotypes tend to be associated with local environments (e.g. Hamrick & Allard, 1972; Nevo et al., 1994), and this has been thought to be evidence for locally co-adapted genotypes (reviewed in Hedrick, 1979). Such data do not, however, necessarily provide convincing evidence for selection on the loci scored, nor for epistatic fitness interactions (Hedrick et al., 1978; Hedrick, 1979). Hedrick (1979) suggested that the correlations between different loci that are to be expected in inbreeding populations will promote genetic hitch-hiking, such that selection for advantageous alleles at one locus will cause allele frequency differences at neutral loci, and he showed that effects of the magnitudes observed could readily be generated. The hitch-hiking model does not account for the maintenance of diversity, however, since advantageous alleles will presumably eventually spread throughout all populations if there is some migration. Data of this kind are more plausibly explained by local selective differences of the kind modelled here.

Our results on the effects of local selective differences focus on steady-state values of diversity measures that will be produced over evolutionary time, rather than on transient alterations that will occur if a genetically variable population experiences a changed breeding system. The results can account for persistent local allele frequency differences at neutral loci, even in the face of some migration, provided that gene flow is sufficiently restricted for local adaptation to occur. Since migration between habitat patches subject to the same selection regimes is not impeded, this process should also eventually lead to similar genotypes at neutral loci in similar environments, as is observed in inbreeding plant populations. Our results do not, however, deal with newly established alleles. If an allele at a polymorphic locus has recently risen to a high frequency under local or balancing selection, it will be expected to show greatly reduced variability with respect to closely linked neutral sites, which will also exhibit strong linkage disequilibrium, but these sites will not be unusually divergent from their counterparts linked to the alternative allele. Asymmetries in the amounts of diversity within allelic classes or between populations would suggest the existence of this kind of transient state (see, for example, Hudson et al., 1994).

(v) Discrepancies between simulations and analytical theory

As discussed above, we have useful approximate analytical expectations for the effects on the expected levels of diversity at neutral loci of population subdivision, selection pressures maintaining variation at other loci in the genome, and background selection. In general, the theoretical results fit our simulation results well, although there are some discrepancies, particularly in cases where there is background selection and high selfing. This good agreement suggests that we have a reasonable understanding of the major features of these rather complex situations, subject to the assumptions of the model. For the most part, the different factors act nearly additively (see Table 1).

Some of the discrepancies between the theory and the simulation results may be due to our having simulated rather small populations (N = 2000). The theory assumes constant allele frequencies at the selected loci, which is clearly untrue in practice. In addition, the approximations used to derive the expected diversity values in the case of frequency-dependent balancing selection assume small Nr. We thus expect the best agreement in our simulations either for diversity measures where proximity to a locus subject to such selection has no effect, or for sites close to such a locus. We have few data for sites tightly linked to the balanced polymorphism, because of the long coalescence times at such sites, but it is clear that there is in general no obvious pattern of this kind. Large population sizes would reduce fluctuations in allele frequencies at the selected locus and should thus improve agreement, but this remains to be rigorously tested.

With background selection, diversity between allelic classes with respect to a polymorphic locus was lower than predicted, with both local and balancing selection. This occurred at neutral sites close to the polymorphic locus at least as much as at distant sites, so that high Nr is not the explanation. Background selection was also associated with reduced diversity between demes, as well as within them. An assumption of the analytical theory that may be relevant to these discrepancies is the absence of linkage disequilibrium between the loci subject to background selection. With population subdivision, especially with local selection, this is likely to be violated in the simulations. Subdivision will cause between-deme heterosis, since genotypes from within each deme are more likely to share deleterious alleles at any given locus than hybrid individuals formed from crosses between demes (or between allelic classes, in the case of a balanced polymorphism maintained within a deme). Migration and recombination rates will therefore be somewhat


greater in the simulations than expected if we ignore genomic regions where recombination is infrequent this effect, particularly in inbred populations. The (Charlesworth et al., 1993). Measures such as FST, and presence of large numbers of deleterious mutations measures derived from it (see Brown, 1979; van Dijk may therefore reduce diversity between allelic classes et al., 1988), can be very useful for descriptive and between demes. purposes, but the fact that they depend on both To test this, some runs were done with additive genetic divergence and genetic diversity within popu- effects of the deleterious mutations. In this case, one lations causes problems when comparisons are being would predict that effects of this kind would be much made between species with different mating systems or smaller than those observed with our standard between regions of the genome with different rates of dominance coefficient of 0n2 (although some heterosis crossing over. This is especially true if interpretations is still expected because mutations at different loci of the causes of different patterns of between- affect fitness multiplicatively rather than additively in population variation are to be attempted. It may often our simulations, so that hybrids will always have be better for comparative purposes to use DST values higher fitness than the average genotype). In runs with for describing allozyme data (analogous to our πT−S). subdivided populations, 90% selfing, and no local These are not as often reported as FST, however, and selection, however, changing the dominance coefficient are also less used in theoretical studies. In reviewing to 0n5 gave almost no reduction in the effect of the published data (see below), it is thus necessary to background selection on πT−S, though it somewhat consider FST. reduced the effect on πT−A. Heterosis thus cannot There will also be problems if data from loci linked account for most of this discrepancy. 
Further theoretical and simulation work is needed to solve this problem.

(vi) Measurement of between-population diversity

In the next section, we briefly review empirical findings on patterns of genetic diversity, in order to ask how the models here help us to understand them, but before doing so it is worthwhile to discuss how genetic diversity is quantified. There is no single standard notation, so we here review some of the terminology. Total genetic diversity can be partitioned into within- and between-population values. For allozyme data, the components are usually denoted by HT, HS and DST = HT − HS (Nei, 1973, 1987; Brown, 1979), where the within-deme diversity (HS) and the excess between populations (DST) are the allozyme analogues of the nucleotide-site diversity measures πS and πT−S which we have used above. Diversity between sets of populations is often measured by quantities such as FST, which for allozymes can be defined as (HT − HS)/HT (i.e. the ratio of the excess between-deme diversity to the total diversity), which is identical to Nei's GST (Nei, 1973). Such measures have the advantage that data using different types of genetic markers, such as allozymes and DNA sequence diversity, can be compared, since they measure the relative magnitude of between- versus within-population diversity. Other measures, which differ slightly from these, have been proposed by Weir & Cockerham (1984) and Slatkin (1993).

A serious problem with FST, however, is that it depends inversely on within-deme diversity (Nei, 1973, 1987, p. 190). As explained in Section 1, it is now known that several processes (bottlenecks, background selection, selective sweeps and fluctuating selection) can reduce within-deme diversity. Effects of selection at linked loci will be more important in inbreeding than outcrossing populations, and in

to loci subject to local selection are used, as both πT−S and FST can be much higher than for unlinked loci (see Section 4(iii)). In selfing populations, where loci are not independent even if they are not close in the genetic map, this is a real possibility. The effects of local selection, inflating FST for neutral loci elsewhere in the genome, imply that estimates of Nm derived from FST (e.g. Slatkin, 1993) could seriously underestimate migration if local selection is occurring. Estimates based on loci closely linked to a locus subject to local selection might be seriously in error, but even neutral loci distant from any selected loci will yield biased estimates (Fig. 8).

The difficulties in using FST to interpret differences in patterns of diversity are exemplified by data on sequence divergence between isolated Drosophila populations. Such data might offer a test of our prediction that, even in outcrossing populations, local selection on certain loci in the genome should sometimes produce large differences in allele frequency at other loci, if gene flow is restricted (see Fig. 5a). Other things being equal, these effects should be most pronounced in genomic regions with low recombination, if loci involved in local adaptation are present in all regions. Results of this kind have been reported for sequence comparisons of loci in regions of low versus high recombination of D. ananassae (Stephan, 1994) and D. melanogaster (Begun & Aquadro, 1993, 1995). However, πS is much lower at the loci in the regions with high FST than at loci with low FST in these studies. Whatever the causes of these differences in πS, the above discussion suggests that comparisons of divergence between populations should be made on the basis of the extent of πT−S, not FST. When such a comparison is made, there is no evidence for a significant effect of chromosomal region (B. Charlesworth, unpublished results). Thus, these apparent differences in the extent of divergence are explicable solely in terms of differences in levels of within-population variability.
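The inverse dependence of FST on within-deme diversity can be illustrated numerically. The sketch below assumes the nucleotide-diversity analogue of the allozyme definition, FST = πT−S/(πS + πT−S); the function name and the diversity values are hypothetical and chosen only for illustration.

```python
def fst_from_pi(pi_s, pi_t_minus_s):
    """F_ST as the between-deme excess diversity divided by total diversity.

    Assumes pi_T = pi_S + pi_{T-S}, by analogy with H_T = H_S + D_ST.
    """
    return pi_t_minus_s / (pi_s + pi_t_minus_s)


# Hypothetical values: the same between-deme excess in both cases,
# but a fivefold difference in within-deme diversity.
high_within = fst_from_pi(pi_s=0.010, pi_t_minus_s=0.002)
low_within = fst_from_pi(pi_s=0.002, pi_t_minus_s=0.002)
print(f"F_ST with high pi_S: {high_within:.3f}")
print(f"F_ST with low pi_S:  {low_within:.3f}")
```

In this illustration FST differs threefold between the two cases although the absolute between-deme component is identical, which is the reason for comparing πT−S directly.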

Downloaded from https://www.cambridge.org/core. IP address: 170.106.34.90, on 28 Sep 2021 at 07:05:31, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0016672397002954 B. Charlesworth et al. 168

Table 2. Summary of studies of comparisons of allozyme or microsatellite diversity in related populations with different selfing rates (S)

Genus       Species or population     S         HS      HT      DST     FST         Reference
Phlox       drummondii                0         0.176   0.246   0.070   0.283       Levin (1978)
            roemeriana                0         0.277   0.356   0.079   0.223
            cuspidata                 High      0.046   0.074   0.028   0.373
Oenothera   grandis                   0         0.19    0.22    0.03    0.136       Ellstrand & Levin (1980)
            laciniata                 High      0.14    0.18    0.04    0.222
Plectritis  congesta                  0.3       0.226   0.266   0.040   0.150       Layton & Ganders (1984)
            brachystemon              0.98      0.060   0.166   0.106   0.639
Gilia       achilleifolia (4 pops.)   0.21      0.209   0.272   0.063   0.232 [a]   Schoen (1982)
            achilleifolia (3 pops.)   0.71      0.139   0.300   0.161   0.537 [a]
Lolium      3 spp.                    0.08 [b]  0.341   0.384   0.043   0.099       Loos (1993)
            remotum                   High      0.004   0.442   0.438   0.750
Plantago    lanceolata                0         0.127   0.133   0.006   0.045       Wolff (1991)
            major (total)             High      0.047   0.056   0.018   0.321
Mimulus     guttatus [c]              0.49      0.506   0.657   0.151   0.230       Awadalla (1996)
            lacinatus [c]             0.88      0.311   0.642   0.321   0.515

[a] Re-calculated from the published DST values.
[b] Based on the published FIS value.
[c] Based on microsatellites.
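The partition used in Table 2 can be checked directly from the tabulated HS and HT values. The sketch below recomputes DST = HT − HS and FST = DST/HT for the three Phlox entries; the dictionary layout and the function name are illustrative assumptions, and only the numbers are taken from the table.

```python
# (H_S, H_T, published F_ST) for the three Phlox species in Table 2.
PHLOX = {
    "drummondii": (0.176, 0.246, 0.283),
    "roemeriana": (0.277, 0.356, 0.223),
    "cuspidata": (0.046, 0.074, 0.373),
}


def partition(h_s, h_t):
    """Return (D_ST, F_ST), with D_ST = H_T - H_S and F_ST = D_ST / H_T."""
    d_st = h_t - h_s
    return d_st, d_st / h_t


results = {name: partition(h_s, h_t) for name, (h_s, h_t, _) in PHLOX.items()}
for name, (d_st, f_st) in results.items():
    published = PHLOX[name][2]
    print(f"{name:12s} D_ST={d_st:.3f} F_ST={f_st:.3f} (published {published:.3f})")
```

The recomputed FST values agree with the published ones to within about 0.01; the small discrepancies may reflect rounding or averaging over loci in the original studies, which this sketch does not attempt to reproduce.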

(vii) Patterns of neutral diversity in inbreeding and outbreeding populations

There are few good data to compare with the results of our models, apart from the Drosophila results which we have just discussed. The ideal data would be DNA sequence diversity, preferably using neutral differences, since these relate most directly to coalescence times and to the infinite sites model used in our simulations. Data on several loci are needed because, if there are local selection pressures, between-population differences are not independent of map position, and very high values can be produced at loci in the region of a locus subject to local selection (see Section 4(iv) above). Allozyme data are much less useful because, even assuming selective neutrality, differences between genotypes probably do not increase linearly with coalescence times (Marshall & Brown, 1975). Moreover, allozyme allelic differences may not be neutral. Several kinds of evidence suggest some selection on such variants (e.g. Karl & Avise, 1992; Pogson & Zouros, 1994; Kreitman, 1996). Since nearly all currently available data are from allozyme surveys, however, we will discuss them here.

It has been estimated that between-population allozyme diversity values of selfing plants are up to 5 times greater than those of outcrossers. Hamrick & Godt (1996) find that breeding system is a major factor influencing FST values of plant species, being involved in all the two-trait combinations that explained large proportions of the variance in this quantity. They estimate average FST values for inbreeding and outcrossing dicotyledonous plants to be 0.587 and 0.184, yielding a ratio of 3.2. Much of this difference is caused by the approximately twofold higher within-population diversity in outbreeding species (0.165 v. 0.091 for selfers). The few available within-genus comparisons give similar results (Table 2). In addition, a survey of microsatellite diversity in the highly inbreeding plant Arabidopsis thaliana found very high diversity between populations and uniformity within populations (Todokoro et al., 1995); in the inbreeding plant Medicago truncatula, 67% of the variance in gene frequency for randomly amplified polymorphic DNA (RAPD) markers was found to be between subpopulations (Bonnin et al., 1996). A similar result has been found for restriction fragment length polymorphism (RFLP) variation in A. thaliana (M. Kreitman, personal communication). It would be interesting to have comparable surveys of related outbreeding species. Such a comparison in Mimulus species, using microsatellites, displays a pattern consistent with the other data just cited, even though the species compared differed only moderately in their selfing rates (Awadalla & Ritland, 1997); in this dataset, much of the difference in FST is clearly caused by lower HS in the more inbreeding populations.

The existence of consistent differences in patterns of diversity between inbreeding and outcrossing species, with inbreeders having low within-population diversity and often low total diversity, but relatively or even absolutely high divergence between populations, is well known in plant population biology. The results summarized here agree well with those reviewed in the past for morphological and quantitative genetic diversity, as well as diversity for genetic markers (Baker, 1953; Brown, 1979; Mitchell-Olds & Bergelson, 1990; Heywood, 1991).


(viii) Interpretation of the observed patterns

A full explanation does not yet exist for the patterns just reviewed. Previous discussions have mostly focussed on the effect of changing from an outbreeding to an inbreeding mating system, assuming the prior existence of genetic diversity. For instance, it is often stated that inbreeding alone should not lead to loss of genetic variation, but merely changes how it will be distributed (Hamrick & Godt, 1990). This ignores the effect of inbreeding on effective population size (Pollak, 1987), which must affect the long-term equilibrium diversity. Furthermore, the breeding system influences the maintenance of diversity under selection. For instance, balanced polymorphisms maintained by overdominance are maintained under more restrictive conditions in inbred than outbred populations (Kimura & Ohta, 1971), and the breeding system also influences expected levels of quantitative genetic variability (Charlesworth & Charlesworth, 1995). With spatial differences in selection, inbreeding can relax the conditions for different alleles to be maintained in different subpopulations, presumably because it reduces gene flow via pollen (Hedrick, 1990). Background selection and selective sweeps will have much more profound effects on within-population diversity in inbreeders than outbreeders (Charlesworth et al., 1993).

To understand the possible causes of breeding system differences in diversity patterns, it is helpful to review briefly the effects of different factors on individual diversity measures in such populations at statistical equilibrium under mutation and genetic drift. Isolation alone enhances πT−S, but within-population diversity can be predicted from the effective metapopulation size, which is calculated by weighting the contributions of each deme appropriately (Maruyama, 1971, 1972; Nagylaki, 1982; Slatkin, 1987; Strobeck, 1987; Tajima, 1990, 1992; Whitlock & Barton, 1997). Large asymmetries in deme size or highly asymmetric patterns of migration that lead to a few populations being the source of most migrants can greatly reduce the effective metapopulation size below the total species number (Tajima, 1990, 1992; Nordborg, 1997). Local extinction and recolonization can also lead to greatly reduced effective metapopulation size (Maruyama & Kimura, 1980; Whitlock & Barton, 1997). We have not included these complications in the present study, but they may obviously be of considerable importance in many cases in nature.

As intuitively expected, subdivision without local selection increases the fraction of diversity between, as opposed to within, populations, regardless of proximity to a locus under balancing selection. Selfing reduces the effective sizes of the subpopulations, which will tend to increase πT−S and decrease πS. Thus a higher proportion of the total variability is between demes, compared with outcrossing populations. Assuming the infinite allele neutral model and the island model of migration, Maruyama & Tachida (1992) obtained an expression for FST as a function of the selfing rate. Nordborg (1997, eq. 55) has also derived such a result for the case of two populations with migration, using the coalescence approach, and showed that $F_{ST} = 1/(1 + 4Nm/[1 + \hat{F}])$, where $\hat{F}$ is the equilibrium inbreeding coefficient in an infinite population with a given rate of selfing. These results for neutral models show that selfing alone can increase FST by at most a factor of 2, as expected if the effect is entirely due to reduction in πS. Our simulations with no local selection (but with a polymorphism maintained within populations by balancing selection) agree with this conclusion (compare the cases with no local selection in Fig. 8a, c). Very high observed values of FST in selfing versus comparable outcrossing populations must therefore be attributed to additional effects of the kinds discussed above.

Can we explain the observed differences in diversity by local selection pressures combined with population subdivision, together with the differences in breeding systems, or do we need background selection as well, in order to account for the data? Although our simulations use parameter values that are not biologically very realistic, it is nevertheless worth asking whether effects of magnitudes comparable to those actually observed can be produced under plausible assumptions with restricted migration. The simulations with no local selection produced FST values around 0.1 for populations with 90% selfing, and lower with outcrossing (Fig. 8). These are similar to those estimated using allozymes in outcrossing plants, as just discussed. With background selection, the values increase to about 0.4 for inbred, and 0.1 for outbred populations, respectively. Background selection alone could thus potentially account for increased FST under selfing, although our simulations assumed a high per chromosome mutation rate to deleterious mutations (Section 2(i)), which makes the results unrealistic for outcrossing populations. In highly inbreeding populations, however, the lack of independence of loci means that the mutation rate used is not unreasonable as a per-genome value (Charlesworth et al., 1990; Johnston & Schoen, 1995).

Local selection could also explain the observed differences in FST between inbreeders and outbreeders. Moderate local selection, without background selection, increased FST in our simulations to a similar extent to background selection, except for loci close to the selected locus, where very high values occurred (Fig. 8). Our local selection models thus yield values that are reasonable for both outcrossing and inbreeding populations. Moreover, the selection coefficient assumed in our model of moderate local selection is similar to the lower range of values estimated from reciprocal transplant experiments on plants between different environments; in some cases, there is evidence for selection as intense as that


assumed in our strong local selection model (Bell, 1996; Linhart & Grant, 1996). These estimates presumably represent somewhat stronger selection than typically occurs, as there is probably some bias towards studying and reporting situations with detectable selection. However, they measure selection on entire genotypes, whereas our model assumes a single genetic difference. Since even a single locus affects a wide region of the genome, it is likely that the effects of several selected loci produce larger effects than we observe, for the same total selection intensity (see Section 4(ii)). Models of local selection involving several loci require future study, but it appears that our selection parameters may not be unreasonable for plant populations. Simulations of selfing populations yielded FST values for loci loosely linked to the selected locus that also correspond with those often observed, with values of 0.4–0.5 under moderate local selection. When there was also background selection the values were higher than most observed ones. Greater migration would, of course, reduce the values. On the other hand, in our models we compared inbreeders and outbreeders with the same migration rates, but in inbreeding plant populations migration via pollen is effectively lowered because of the rarity of outcrossing (Hedrick, 1990). This will only cause increased between-population differentiation, and hence cannot in itself explain the greatly reduced within-population variability associated with inbreeding (Table 2).

The evidence that inbreeding populations generally have lower within-population diversity levels than outbreeders suggests a possible role for background selection or other forms of selection at linked loci. However, other processes such as population bottlenecks associated with colonization events could lead to reduced within-population diversity and increased FST values. As pointed out by Ingvarsson et al. (1997), increases in FST due to extinction and recolonization (Slatkin, 1977; Wade & McCauley, 1988; Whitlock & McCauley, 1990; Ingvarsson, 1997) may occur primarily through reductions in within-population diversity rather than by increases in absolute between-population diversity, since the effective size of a metapopulation is reduced by local extinction and recolonization events (Whitlock & Barton, 1997), causing a reduction in total diversity in the metapopulation (Maruyama & Kimura, 1980). Absolute measures of between-population differentiation may thus be reduced by extinction and recolonization, in contrast to the effects of local selection. Since our simulation results for highly selfing populations indicate that background selection also reduces between-population differentiation (Fig. 4), the role of selfing in enhancing the effects of selection at linked sites may be difficult to distinguish from the effects of local extinction and recolonization events by this criterion alone, if such events are more common in inbreeders.

It has in fact been suggested that bottlenecks are more important in inbreeding than outbreeding populations, because levels of diversity (and Ne values estimated from them) vary more from one population to another in inbreeders, consistent with recent bottlenecks in the ancestry of populations with low diversity values (Schoen & Brown, 1991). However, this evidence for bottlenecks is not conclusive, and the data could as well be interpreted in terms of low average within-population diversity in inbreeders caused by background selection or other factors. The occasional populations with high values (Schoen & Brown, 1991) could arise by population mixture, by local selection when subdivision is not apparent (see above), or even by their being the product of the recent evolution of selfing. Since it is likely that selfing often evolves, but selfing taxa do not persist for very long evolutionary time periods (Stebbins, 1957; Schoen et al., 1996), this last possibility cannot be ignored. Unless direct evidence for bottlenecks is obtained, or the alternative interpretations ruled out, this issue is unresolved.

This work was supported by grants DEB-9317683 from the US National Science Foundation and GM50355 from the National Institutes of Health. We thank Thomas Nagylaki and two reviewers for their comments on the manuscript.

Appendix

(i) The effective rate of migration with population subdivision and local selection

This section extends the approach of Bengtsson (1985) and Barton & Bengtsson (1986) to the case of a partially selfing population, using a heuristic argument. We assume that there are two subpopulations of size N/2, connected by migration at rate m. As in the text, let the frequency of the favoured allele at the locus subject to local selection in a given subpopulation be p, and the frequency of the disfavoured allele be q (q is assumed to be so small that second-order terms in q are negligible). The fitnesses of the heterozygotes and homozygotes for the disfavoured allele are 1 − hs and 1 − s, respectively. If q is small, the effect of selection on genotype frequencies in each generation is of order sq. The deviation of Wright's fixation index, F, from its neutral value must therefore also be O(sq) (cf. Nagylaki, 1997). With a frequency of self-fertilization S, the equilibrium F value for a neutral locus is $\hat{F} = S/(2 - S)$ (Crow & Kimura, 1970, p. 94) so that F in the present case is approximately equal to $\hat{F} + O(sq)$. Neglecting terms in $O(q^2)$, the standard selection equation with non-random mating (Wright, 1969, p. 244) implies that the change in frequency of the disfavoured allele due to selection is

$$\Delta q_s \approx -qs\{\hat{F} + h(1 - \hat{F})\}. \tag{A1a}$$

The effective selection coefficient against the locally disfavoured allele is therefore given by

$$\tilde{s} \approx s\{\hat{F} + h(1 - \hat{F})\}. \tag{A1b}$$
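As a numerical illustration of the effective selection coefficient, the sketch below evaluates the neutral equilibrium inbreeding coefficient S/(2 − S) and the expression s{F + h(1 − F)} for an outcrossing and a highly selfing population. The function names and the parameter values (s = 0.1, h = 0.5, S = 0.9) are hypothetical and are not taken from the simulations.

```python
def equilibrium_f(selfing_rate):
    """Neutral equilibrium inbreeding coefficient, F = S / (2 - S)."""
    return selfing_rate / (2.0 - selfing_rate)


def effective_selection(s, h, selfing_rate):
    """Effective selection coefficient s{F + h(1 - F)} against a rare allele."""
    f_hat = equilibrium_f(selfing_rate)
    return s * (f_hat + h * (1.0 - f_hat))


# Hypothetical parameters: s = 0.1, h = 0.5, outcrossing versus 90% selfing.
s_outcross = effective_selection(0.1, 0.5, 0.0)
s_selfing = effective_selection(0.1, 0.5, 0.9)
print(f"F at S = 0.9: {equilibrium_f(0.9):.4f}")
print(f"effective s, outcrossing: {s_outcross:.4f}")
print(f"effective s, 90% selfing: {s_selfing:.4f}")
```

With random mating the expression reduces to hs, whereas with complete selfing it approaches s, because a rare disfavoured allele is then mostly exposed to selection in homozygotes.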


If we follow the fate of a set of disfavoured alleles introduced into a particular subpopulation in a given generation, it follows that their frequency will be reduced by a factor of $(1 - \tilde{s})$ each generation. If migration precedes selection in each generation (as assumed in our simulations), the frequency of this cohort of disfavoured alleles will be reduced by $(1 - \tilde{s})^i$ after i generations. If selection precedes migration, the factor is only $(1 - \tilde{s})^{i-1}$. Clearly, only the survivors of selection contribute to the gene pool of the subpopulation in any given generation.

Next, consider a neutral locus that is linked to the selected locus. With probability p, the migrants into a given subpopulation will carry a neutral allele associated with the allele at the locally selected locus that is now disfavoured by selection. As shown by Nordborg (1997), the effective rate at which the neutral allele recombines with the selected allele in an inbred population is

$$\tilde{r} \approx r(1 - \hat{F}). \tag{A2}$$

If migration precedes selection, the probability that the neutral allele persists for i − 1 generations, and then recombines to become associated with the locally favoured allele is thus $(1 - \tilde{s})^i (1 - \tilde{r})^{i-1}\tilde{r}$. Summing over all generations, the net fraction of neutral alleles which are initially associated with the locally disfavoured allele, but escape selective elimination, is the sum of terms of this form over all generations, i.e. $(1 - \tilde{s})\tilde{r}/(1 - [1 - \tilde{r}][1 - \tilde{s}])$. We can define an effective migration rate, $\tilde{m}$, for the neutral locus as the product of the migration rate and the probability that the neutral allele survives selection (Bengtsson, 1985). If the case in which the allele at the neutral locus is initially associated with the locally favoured allele at the selected locus (which has probability q) is taken into account, we have

$$\tilde{m} \approx m\left\{q + \frac{p(1 - \tilde{s})\tilde{r}}{1 - [1 - \tilde{s}][1 - \tilde{r}]}\right\}. \tag{A3}$$

If selection precedes migration, the factor $(1 - \tilde{s})$ in the numerator of the second term in braces is omitted.

The degree of differentiation between subpopulations at a neutral locus is then given by using this effective migration rate in the standard equation for migration and drift in the case of two subpopulations (Hudson, 1990; Slatkin, 1991), i.e.

$$\pi_{T-S} = \frac{1}{8N\tilde{m}}. \tag{A4}$$

A similar result was obtained by Petry (1983), for the case of an island model of migration with local selection, such that immigrants into a local population are disfavoured by selection at a particular locus. He showed that, for weak selection and low recombination (such that r and hs are of the order of the reciprocal of population size), the effective migration rate for a neutral locus linked to the selected locus is $\tilde{m} = mr/(r + hs)$, and that this accurately predicts the variance in allele frequencies between subpopulations when substituted into the standard neutral formulae. This expression for effective migration rate is close to (A3) for the random-mating case if selection is weak, q is small, and recombination is rare.

(ii) Linkage disequilibria with two allelic classes or two subpopulations

Assume that there are two alleles maintained by selection, with frequencies p and q. Consider a neutral site i which has recombination frequency $r_i$ with the polymorphic locus, and that is segregating for two nucleotide variants. Let $\delta_i$ be the difference in the frequency of one of these variants between the two allelic classes at the polymorphic locus. The coefficient of linkage disequilibrium between the polymorphic locus and site i is (cf. Nordborg et al., 1996a)

$$D_i = pq\delta_i \tag{A5a}$$

and the corresponding correlation coefficient is

$$\rho_i = \frac{pq\delta_i}{\sqrt{pq\,x_i(1 - x_i)}}, \tag{A5b}$$

where $x_i$ and $1 - x_i$ are the frequencies of the two alternative variants at site i.

It follows from equation (5) of Nei & Li (1973) that the components of the linkage disequilibrium and correlation coefficients between two segregating neutral sites i and j due to the differences in their variant frequencies between the two allelic classes are

$$D_{ijT-A} = (pq)^2\delta_i\delta_j = D_iD_j \tag{A6a}$$

and

$$\rho_{ijT-A} = \rho_i\rho_j. \tag{A6b}$$

If we take expectations of the squares of the right-hand side of (A5a) over the statistical equilibrium, the expected squared linkage disequilibrium between the polymorphic locus and the ith segregating neutral site is thus

$$E\{D_i^2\} = (pq)^2E\{\delta_i^2\} = \tfrac{1}{2}(pq)\pi_{iT-A}, \tag{A7a}$$

where $\pi_{iT-A}$ is the between-allele component of diversity under the infinite sites model for site i. This follows from the relation $\pi_{iT-A} = 2pqE\{\delta_i^2\}$.

Similarly, if we take expectations of the squares of the denominator and numerator of the right-hand side of (A5b) and substitute into (A6b), we obtain the Ohta–Kimura (Ohta & Kimura, 1969) standardized linkage disequilibrium measure for the polymorphic locus and the ith neutral site

$$\sigma_{id}^2 = \frac{\pi_{iT-A}}{2E\{x_i(1 - x_i)\}} = F_{iAT}, \tag{A7b}$$


where $F_{iAT}$ is the analogue of FST for the between-allele diversity relative to within-allele diversity at site i.

A similar argument applied to the component of linkage disequilibrium between two neutral sites due to differences in variant frequencies between the two allelic classes yields the expressions

$$E\{D_{ijT-A}^2\} = \tfrac{1}{4}(pq)^2E\{\delta_i^2\delta_j^2\}, \tag{A8a}$$

$$\sigma_{ijT-A}^2 = \frac{(pq)^2E\{\delta_i^2\delta_j^2\}}{4E\{x_i(1 - x_i)x_j(1 - x_j)\}}. \tag{A8b}$$

Correlations between the genealogical histories of the two neutral sites must cause positive covariances between $\delta_i^2$ and $\delta_j^2$, so that

$$E\{D_{ijT-A}^2\} \geq \tfrac{1}{4}(pq)^2\pi_{iT-A}\pi_{jT-A}. \tag{A9a}$$

The expectation of the denominator of (A8b) exceeds the product of the expectations of $x_i(1 - x_i)$ and $x_j(1 - x_j)$, by the covariance between $x_i(1 - x_i)$ and $x_j(1 - x_j)$ due to shared genealogical history. This is likely to be very small in large populations, even for closely linked sites, since the expectations are taken over repeated introductions of new variants at the sites in question, which involve independent mutational events. For example, in the case of a random-mating population of size N (and with no polymorphic locus), Pluzhnikov & Donnelly (1996, Appendix, final equation) show that the covariance in diversities between two sites is always very small compared with the product of the expected diversities, with a maximum of about 1/9 for absolutely linked sites. A similar result can be derived from equations (7) of Ohta & Kimura (1971).

Evaluating the denominator in (A8b), and ignoring the terms due to shared genealogies, we thus have

$$\sigma_{ijT-A}^2 \geq F_{iAT}F_{jAT}. \tag{A9b}$$

This shows that the extent of differentiation between the allelic classes at the polymorphic locus with respect to linked neutral sites is directly related to the amount of linkage disequilibrium between neutral sites.

References

Awadalla, P. (1996). Microsatellite variation and evolution in mixed mating species of Mimulus (Scrophulariaceae). MSc thesis, University of Toronto.
Awadalla, P. & Ritland, K. (1997). Microsatellite variation in Mimulus species of contrasting mating systems. Molecular Biology and Evolution (in press).
Baker, H. G. (1953). Race formation and reproductive method in flowering plants. Society for Experimental Biology Symposia 7, 114–145.
Barton, N. H. (1979). Gene flow past a cline. Heredity 43, 333–339.
Barton, N. H. & Bengtsson, B. O. (1986). The barrier to genetic exchange between hybridising populations. Heredity 56, 357–376.
Begun, D. J. & Aquadro, C. F. (1993). African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365, 548–550.
Begun, D. J. & Aquadro, C. F. (1995). Evolution at the tip and base of the X chromosome in an African population of Drosophila melanogaster. Molecular Biology and Evolution 12, 382–390.
Bell, G. (1996). Selection. The Mechanism of Evolution. New York: Chapman and Hall.
Bengtsson, B. O. (1985). The flow of genes through a genetic barrier. In Evolution. Essays in Honour of John Maynard Smith (ed. P. J. Greenwood, P. H. Harvey & M. Slatkin), pp. 31–42. Cambridge: Cambridge University Press.
Bonnin, I., Huguet, T., Gherardi, M., Prosperi, J.-M. & Olivieri, I. (1996). High level of polymorphism and spatial structure in a selfing plant species, Medicago truncatula (Leguminosae), shown using RAPD markers. American Journal of Botany 83, 843–855.
Braverman, J. M., Hudson, R. R., Kaplan, N. L., Langley, C. H. & Stephan, W. (1995). The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140, 783–796.
Brown, A. H. D. (1979). Enzyme polymorphism in plant populations. Theoretical and Applied Genetics 15, 1–42.
Charlesworth, D. & Charlesworth, B. (1995). Quantitative genetics in plants: the effect of breeding system on genetic variability. Evolution 49, 911–920.
Charlesworth, B., Charlesworth, D. & Morgan, M. T. (1990). Genetic loads and estimates of mutation rates in very inbred plant populations. Nature 347, 380–382.
Charlesworth, B., Morgan, M. T. & Charlesworth, D. (1993). The effect of deleterious mutations on neutral molecular variation. Genetics 134, 1289–1303.
Crow, J. F. & Kimura, M. (1970). An Introduction to Population Genetics Theory. New York: Harper and Row.
Ellstrand, N. C. & Levin, D. A. (1980). Recombination system and population structure in Oenothera. Evolution 32, 245–263.
Gillespie, J. H. (1991). The Causes of Molecular Evolution. Oxford: Oxford University Press.
Gillespie, J. H. (1994). Alternatives to the neutral theory. In Non-Neutral Evolution: Theories and Molecular Data (ed. B. Golding), pp. 1–17. London: Chapman and Hall.
Haldane, J. B. S. (1919). The combination of linkage values and the calculation of distance between loci of linked factors. Journal of Genetics 8, 299–309.
Haldane, J. B. S. (1930). A mathematical theory of natural and artificial selection. VI. Isolation. Proceedings of the Cambridge Philosophical Society 26, 220–230.
Hamrick, J. L. & Allard, R. W. (1972). Microgeographical variation in allozyme frequencies in Avena barbata. Proceedings of the National Academy of Sciences of the USA 69, 2100–2104.
Hamrick, J. L. & Godt, M. J. (1990). Allozyme diversity in plant species. In Plant Population Genetics, Breeding, and Genetic Resources (ed. A. H. D. Brown, M. T. Clegg, A. L. Kahler & B. S. Weir), pp. 43–63. Sunderland, Mass.: Sinauer.
Hamrick, J. L. & Godt, M. J. W. (1996). Effects of life history traits on genetic diversity in plant species. Philosophical Transactions of the Royal Society of London Series B 351, 1291–1298.
Hedrick, P. W. (1979). Hitch-hiking: an alternative to coadaptation for the barley and slender wild oat examples. Heredity 43, 79–86.
Hedrick, P. W. (1980). Hitch-hiking: a comparison of linkage and partial selfing. Genetics 94, 791–808.
Hedrick, P. W. (1990). Mating systems and evolutionary genetics. In Population Biology: Ecological and Evolutionary Viewpoints (ed. K. Wöhrmann & S. Jain), pp. 83–118. New York: Springer-Verlag.
Hedrick, P. W., Jain, S. K. & Holden, L. (1978). Multilocus systems in evolution. Evolutionary Biology 2, 104–184.
Heywood, J. S. (1991). Spatial analysis of genetic variation in plant populations. Annual Review of Ecology and Systematics 22, 335–355.
Hudson, R. R. (1990). Gene genealogies and the coalescent process. Oxford Surveys in Evolutionary Biology 7, 1–45.
Hudson, R. R. & Kaplan, N. L. (1988). The coalescent process in models with selection and recombination. Genetics 120, 831–840.
Hudson, R. R. & Kaplan, N. L. (1994). Gene trees with background selection. In Non-Neutral Evolution: Theories and Molecular Data (ed. B. Golding), pp. 140–153. London: Chapman and Hall.
Hudson, R. R. & Kaplan, N. L. (1995). Deleterious background selection with recombination. Genetics 141, 1605–1617.
Hudson, R. R., Boos, D. D. & Kaplan, N. L. (1992). A statistical test for detecting geographic subdivision. Molecular Biology and Evolution 9, 138–151.
Hudson, R. R., Bailey, K., Skerecky, D., Kwiatowski, J. & Ayala, F. J. (1994). Evidence for positive selection in the superoxide dismutase (Sod) region of Drosophila melanogaster. Genetics 136, 1329–1340.
Ingvarsson, P. K. (1997). The effect of delayed population growth on the genetic differentiation of local populations subject to frequent extinctions and colonizations. Evolution 51, 29–35.
Ingvarsson, P. K., Olsson, K. & Ericson, L. (1997). Extinction–recolonisation dynamics in the mycophagous beetle Phalacrus substriatus. Evolution 51, 187–195.
Johnston, M. O. & Schoen, D. J. (1995). Mutation rates and dominance levels of genes affecting total fitness in two angiosperm species. Science 267, 226–229.
Kaplan, N. L., Hudson, R. R. & Langley, C. H. (1989). The 'hitchhiking effect' revisited. Genetics 123, 887–899.
Karl, S. A. & Avise, J. C. (1992). Balancing selection at allozyme loci in oysters: implications from nuclear RFLPs. Science 256, 100–102.
Karlin, S. (1982). Classifications of selection–migration structures and conditions for a protected polymorphism. Evolutionary Biology 14, 61–204.
Kimura, M. (1971). Theoretical foundations of population genetics at the molecular level. Theoretical Population Biology 2, 174–208.
Kimura, M. (1983). The Neutral Theory of Molecular Evolution. Cambridge: Cambridge University Press.
Kimura, M. & Ohta, T. (1971). Theoretical Topics in Population Genetics. Princeton, NJ: Princeton University Press.
Kreitman, M. (1996). The neutral theory is dead. Long live the neutral theory. BioEssays 18, 678–683.
Levene, H. (1953). Genetic equilibrium when more than one ecological niche is available. American Naturalist 87, 331–333.
Levin, D. A. (1978). Genetic variation in annual Phlox: self-compatible vs self-incompatible species. Evolution 32, 245–263.
Linhart, J. & Grant, M. C. (1996). Evolutionary significance of local genetic differentiation in plants. Annual Review of Ecology and Systematics 27, 237–277.
Loos, B. P. (1993). Allozyme variation within and between populations in Lolium (Poaceae). Plant Systematics and Evolution 188, 101–113.
Marshall, D. R. & Brown, A. H. D. (1975). The charge state model of protein polymorphisms in natural populations. Journal of Molecular Evolution 6, 149–163.
Maruyama, T. (1970). Rate of decrease of genetic variability in a subdivided population. Biometrika 57, 299–311.
Maruyama, T. (1971). An invariant property of a structured population. Genetical Research 18, 81–84.
Maruyama, T. (1972). Some invariant properties of a geographically structured finite population: distribution of heterozygotes under irreversible mutation. Genetical Research 20, 141–149.
Maruyama, T. & Kimura, M. (1980). Genetic variability and effective population size when local extinction and recolonization of subpopulations are frequent. Proceedings of the National Academy of Sciences of the USA 77, 6710–6714.
Maruyama, K. & Tachida, H. (1992). Genetic variability and geographic structure in partially selfing populations. Japanese Journal of Genetics 67, 39–51.
Maynard Smith, J. & Haigh, J. (1974). The hitch-hiking effect of a favorable gene. Genetical Research 23, 23–35.
Mitchell-Olds, T. & Bergelson, J. (1990). Statistical genetics of an annual plant Impatiens capensis. I. Genetic basis of quantitative variation. Genetics 124, 407–415.
Nagylaki, T. (1982). Geographical invariance in population genetics. Journal of Theoretical Biology 99, 159–172.
Nagylaki, T. (1997). The diffusion model for migration and selection in a plant population. Journal of Mathematical Biology 35, 391–431.
Narain, P. (1966). Effect of linkage on homozygosity of a population under mixed selfing and random mating. Genetics 54, 303–314.
Nei, M. (1973). Analysis of gene diversity in subdivided populations. Proceedings of the National Academy of Sciences of the USA 70, 3321–3323.
Nei, M. (1987). Molecular Evolutionary Genetics. New York: Columbia University Press.
Nei, M. & Li, W.-H. (1973). Linkage disequilibrium in subdivided populations. Genetics 75, 213–219.
Nevo, E., Krugman, T. & Beiles, A. (1994). Edaphic natural selection of allozyme polymorphisms in Aegilops peregrina at a Galilee microsite in Israel. Heredity 72, 109–112.
Nordborg, M. (1997). Structured coalescent processes on different time scales. Genetics, in press.
Nordborg, M., Charlesworth, B. & Charlesworth, D. (1996a). The effect of recombination on background selection. Genetical Research 67, 159–174.
Nordborg, M., Charlesworth, B. & Charlesworth, D. (1996b). Increased levels of polymorphism surrounding selectively maintained sites in highly selfing species. Proceedings of the Royal Society of London, Series B 163, 1033–1039.
Ohta, T. & Kimura, M. (1969). Linkage disequilibrium due to random genetic drift. Genetical Research 13, 47–55.
Ohta, T. & Kimura, M. (1971). Linkage disequilibrium between two segregating nucleotide sites under steady flux of mutations in a finite population. Genetics 68, 571–580.
Petry, D. (1983). The effect on neutral gene flow of selection at a linked locus. Theoretical Population Biology 23, 300–313.
Pluzhnikov, A. & Donnelly, P. (1996). Optimal sequencing strategies for surveying molecular genetic diversity. Genetics 144, 1321–1328.
Pogson, G. H. & Zouros, E. (1994). Allozyme and RFLP heterozygosities as correlates of growth rate in the scallop Placopecten magellanicus: a test of the associative overdominance hypothesis. Genetics 137, 221–231.
Pollak, E. (1987). On the theory of partially inbreeding finite populations. I. Partial selfing. Genetics 117, 353–360.
Schoen, D. J. (1982). Genetic variation and the breeding system of Gilia achilleifolia. Evolution 36, 361–370.
Schoen, D. J. & Brown, A. H. D. (1991). Intraspecific


variation in population gene diversity and effective population size correlates with the mating system in plants. Proceedings of the National Academy of Sciences of the USA 88, 4494–4497.
Schoen, D. J., Morgan, M. T. & Bataillon, T. (1996). How does self-pollination evolve? Inferences from floral ecology and molecular genetic variation. Philosophical Transactions of the Royal Society of London, Series B 351, 1281–1290.
Simonsen, K. L., Churchill, G. A. & Aquadro, C. F. (1995). Properties of statistical tests of neutrality for DNA polymorphism data. Genetics 141, 413–429.
Slatkin, M. (1977). Gene flow and genetic drift in a population subject to frequent local extinctions. Theoretical Population Biology 12, 253–262.
Slatkin, M. (1987). The average number of sites separating DNA sequences drawn from a subdivided population. Theoretical Population Biology 32, 42–49.
Slatkin, M. (1991). Inbreeding coefficients and coalescence times. Genetical Research 58, 167–175.
Slatkin, M. (1993). Isolation by distance in equilibrium and non-equilibrium populations. Evolution 47, 264–279.
Stebbins, G. L. (1957). Self fertilization and population variation in the higher plants. American Naturalist 91, 337–354.
Stephan, W. (1994). Effects of genetic recombination and population subdivision on nucleotide sequence variation in Drosophila ananassae. In Non-neutral Evolution: Theories and Molecular Data (ed. B. Golding), pp. 57–66. London: Chapman and Hall.
Stephan, W., Wiehe, T. H. E. & Lenz, M. W. (1992). The effect of strongly selected substitutions on neutral polymorphism: analytical results based on diffusion theory. Theoretical Population Biology 41, 237–254.
Strobeck, C. (1983). Expected linkage disequilibrium for a neutral locus linked to a chromosomal arrangement. Genetics 103, 545–555.
Strobeck, C. (1987). Average number of nucleotide differences in a sample from a single population: a test for population subdivision. Genetics 117, 149–153.
Tajima, F. (1990). Relationship between migration and DNA polymorphism in a local population. Genetics 126, 231–234.
Tajima, F. (1992). Measurement of DNA polymorphism. In Mechanisms of Molecular Evolution (ed. N. Takahata & A. G. Clark), pp. 37–59. Sunderland, Mass.: Sinauer.
Todokoro, S., Terauchi, R. & Kawano, S. (1995). Microsatellite polymorphisms in natural populations of Arabidopsis thaliana in Japan. Japanese Journal of Genetics 70, 543–554.
van Dijk, H. V., Wolff, K. & Fries, A. D. (1988). Genetic variability in Plantago species in relation to their ecology. 3. Genetic structure of populations of Plantago major, P. lanceolata and P. coronopus. Theoretical and Applied Genetics 75, 518–528.
Wade, M. J. & McCauley, D. E. (1988). Extinction and recolonization: their effects on the genetic differentiation of local populations. Evolution 42, 995–1005.
Weir, B. S. & Cockerham, C. C. (1984). Estimating F-statistics for the analysis of population structure. Evolution 38, 1358–1370.
Whitlock, M. C. & Barton, N. H. (1997). The effective size of a subdivided population. Genetics 146, 427–441.
Whitlock, M. & McCauley, D. E. (1990). Some population genetic consequences of colony formation and extinction. Evolution 44, 1717–1724.
Wolff, K. (1991). Analysis of allozyme variability in three Plantago species and a comparison to morphological variability. Theoretical and Applied Genetics 81, 119–126.
Wolff, K., Rogstad, S. H. & Schaal, B. A. (1994). Population and species variation of minisatellite DNA in Plantago. Theoretical and Applied Genetics 87, 733–740.
Wright, S. (1969). Evolution and the Genetics of Populations, vol. 2, The Theory of Gene Frequencies. Chicago: University of Chicago Press.
