INVESTIGATION

Measuring Selection Coefficients Below 1023: Method, Questions, and Prospects

Romain Gallet,* Tim F. Cooper,† Santiago F. Elena,‡,§ and Thomas Lenormand*,1 *CEFE–UMR 5175 1919 route de Mende, F-34293 Montpellier, CEDEX 5, France, † and Ecology Group, Department of Biology and Biochemistry, University of Houston, Houston, Texas 77204, ‡Instituto de Biología Molecular y Celular de Plantas (CSIC-UPV), 46022 València, Spain, and §Santa Fe Institute, Santa Fe, New Mexico 87501

ABSTRACT Measuring fitness with precision is a key issue in , particularly in studying of small effects. It is usually thought that sampling error and drift prevent precise measurement of very small fitness effects. We circumvented these limits by using a new combined approach to measuring and analyzing fitness. We estimated the mutational fitness effect (MFE) of three independent mini-Tn10 transposon insertion mutations by conducting competition experiments in large populations of Escherichia coli under controlled laboratory conditions. Using flow cytometry to assess genotype frequencies from very large samples alleviated the problem of sampling error, while the effect of drift was controlled by using large populations and massive replication of fitness measures. Furthermore, with a set of four competition experiments between ancestral and mutant genotypes, we were able to decompose fitness measures into four estimated parameters that account for fitness effects of our fluorescent marker (a), the (b), epistasis between the mutation and the marker (g), and departure from transitivity (t). Our method allowed us to estimate mean selection coefficients to a precision of 2 · 1024. We also found small, but significant, epistatic interactions between the allelic effects of mutations and markers and confirmed that fitness effects were transitive in most cases. Unexpectedly, we also detected variation in measures of s that were significantly bigger than expected due to drift alone, indicating the existence of cryptic variation, even in fully controlled experiments. Overall our results indicate that selection coefficients are best understood as being distributed, representing a limit on the precision with which selection can be measured, even under controlled laboratory conditions.

UTATIONS of small effect can play an important role 1996). Finally, a low precision in fitness measures limits the Min evolution, but they are difficult to measure exper- ability to determine whether the fitness effect of a mutation imentally because the precision with which fitness effects can varies across different environmental or genetic contexts and be measured is relatively low. For this reason, it remains adds to other sources of stochasticity (Lenormand et al. 2009) unclear to what extent mutations with small beneficial ef- to make it difficult to reliably predict evolutionary trajectories. fi fects contribute to tness improvements (Orr 2005). It is Precisely measuring fitness poses technical, conceptual, also unclear how much deleterious mutations of small effect and statistical challenges. The technical challenge is to set contribute to the genetic load and inbreeding depres- up a technique that allows experiments to be carried out sion (Charlesworth and Charlesworth 1998; Bataillon and efficiently. The first major advance was to use “population Kirkpatrick 2000). More generally, the existence and influence cages” with Drosophila or other small animals (starting in of mutations of small effect is at the heart of the neutralist– the 1930s with the work of L’Heritier and Teissier 1937a,b). selectionist controversy (e.g., Nei 2005). This debate can be With such devices, environmental conditions are relatively addressed experimentally only if the precision of fitness mea- controlled and gene flow can be eliminated. However, drift surements is lower than the inverse of effective population size, and indirect selection caused by loci under selection in link- which seems beyond reach for large populations (Kreitman age disequilibrium with the focal locus are difficult to account for. The same approach was applied to microorgan- Copyright © 2012 by the Genetics Society of America isms (Dykhuizen and Hartl 1980), which can be made iso- doi: 10.1534/genetics.111.133454 Manuscript received July 29, 2011; accepted for publication September 19, 2011 genic save for a focal gene, thereby reducing indirect Supporting information is available online at http://www.genetics.org/content/ selection due to initial linkage disequilibrium (e.g., Carrasco suppl/2011/10/31/genetics.111.133454.DC1. 1Corresponding author: CEFE–UMR 5175 1919 route de Mende, F-34293 Montpellier, et al. 2007; Domingo-Calap et al. 2009 for distribution of CEDEX 5, France. E-mail: [email protected] mutation fitness effects; Elena et al. 1998; Sanjuan et al. 2004;

Genetics, Vol. 190, 175–186 January 2012 175 Peris et al. 2010) and can be propagated as large popu- Manly 1985; Arnason and Lewontin 1991; Lenormand and lations, minimizing the effect of drift relative to selection. Raymond 2000; Saccheri et al. 2008; Labbe et al. 2009), They can also be followed over many generations (Dykhuizen which can include drift if longitudinal data are available and Hartl 1983; Thatcher et al. 1998; Lunzer et al. 2002). (Manly 1985; O’hara 2005; Bollback et al. 2008). When Long-term monitoring increases the ability to detect small selection can be approximated by a continuous process differences in fitness between competing genotypes, but through time in an isolated population, a simple approach adds the complication that newly arising mutations may is to regress Log(p/q) (where p and q represent the frequen- perturb the assay (Dykhuizen and Hartl 1983). An impor- cies of the two competitors) over time expressed in units of tant technical issue in all competition experiments is to de- generations (Fisher 1930). The connection with logistic re- termine the frequency of competing genotypes reliably and gression and general linear models is then straightforward quickly. In many cases the idea is to link an easily recognized (Arnason and Barker 1999) and more appropriate than the marker with the gene under scrutiny. It is, however, impor- use of least squares. However, complications arise in the tant to recognize that a marker can confer a selective differ- analysis of time series and correlated error in repeated mea- ence (a marker “cost”), which might vary with the genetic surement through time (Arnason and Barker 1999; O’hara background (epistasis) or external environment (G · E inter- 2005), especially when both drift and fluctuating selection actions). Finally, inferring allelic selection coefficients cause frequency variation. The latter problems can be im- against a common reference strain requires that genotypic portant, particularly when analyzing multiple time point se- fitness is transitive. These potential complications require ries (e.g., arising in long-term population cage or chemostat adding proper controls to competition experiments. experiments), although they are rarely taken into account. Often, replicated experiments are simply pooled, even if Akeyconceptualdifficulty in measuring the fitness effects fi of mutations is to distinguish selection from drift (Beatty 1984; signi cantly different, and not analyzed to consider variance Millstein 2008), which is at the heart of several population in the estimates of selection. The development of mixed cage experiments with Drosophila (Dobzhansky and Pavlovsky models offers an attractive alternative to circumvent this 1957). To account for the effect of drift, a selection coefficient problem and to measure selection and its variation. can be defined from the expected change in allele frequency We present an approach combining several features to over one generation (e.g., Rousset 2004), which can be es- improve and quantify the precision of fitness measures. First, timated from the mean frequency change in independent we use techniques that have proved to be among the most competition experiments. Because of drift, replication is fun- efficient to measure fitness: competition assay between large damentally necessary to estimate fitness, and the precision populations of Escherichia coli strains to minimize drift and of a given fitness measure must account for the interrepli- engineered mutations to avoid the problem of indirect selec- cate variance. Indeed, it is possible to count all organisms in tion. Specifically, we used three genotypes, each carrying an experimental population, so that the genotype frequen- a single mutation introduced by the integration of a mini- cies are known without sampling error. Such an experiment Tn10 transposon. These mutations were considered neutral, would allow frequency variation to be determined “exactly,” relative to a common progenitor genotype, in a previous ex- but would clearly not account for the possibility that drift periment (Elena et al. 1998). We use two fluorescent markers will cause different outcomes in different replicates. A fur- (Rosenfeld et al. 2005) combined with flow cytometry ther complication is that fitness may vary because of chang- (Lunzer et al. 2002) to measure frequency variation with ing environmental conditions. Fluctuating selection during great precision, and thus minimize sampling error. Other the course of a competition experiment or varying selection studies have shown the utility of these approaches in mea- across replicates of a competition assay can mimic drift suring genotype fitness (Lunzer et al. 2002; Zhu et al. 2005; (Felsenstein 1976; Lynch 1987; O’hara 2005). If selection Lee et al. 2009). Key aspects of our approach are as follows: varies, and it probably always does to some extent (Bell 2008; Bell 2010), measuring selection requires measuring 1. A comprehensive set of four competition assays enables fi both a mean and a variance (the latter not including sam- us to separately estimate mutational selection coef cients pling error). The remaining variance can be caused by drift (a), the cost of the marker (b), epistasis between muta- or by heterogeneity in selection, which are difficult to dis- tion and marker (g), and transitivity (t). entangle without extra information on the effective popula- 2. We use short-term batch culture to facilitate massive rep- tion size. In summary, measuring selection with precision lication and to reduce the possibility that de novo bene- fi requires estimating an expectation over several replicates, cial mutations will occur. so that its variance can be decomposed into components due 3. We analyze the data in an integrated likelihood framework to sampling error, drift, and variable selection. with random effects to partition sources of variation in our estimates (sampling error vs. drift vs. variable selection). From a statistical point of view, selection coefficients in the field or in the laboratory are best estimated by using Our approach allowed us to estimate both mean and vari- a fully specified selection model in a likelihood framework ance in selection coefficients at a precision of 0.02%. This (e.g., Clark 1979; Wilson et al. 1982; Oakeshott et al. 1983; precision allowed us to detect variation in measures of some

176 R. Gallet et al. Table 1 Strain construction description

Step Experiment Description Achievement

1 PCR PA1 promoter was introduced in front of the CFP and YFP genes. PCR products called PA1-CFP and PA1-YFP a 2 Cloning PA1-CFP and PA1-YFP were cloned in pKD4. plasmids called pKD4-CFP and pKD4-YFP 3 PCR pKD4-CFP and pKD4-YFP were used as templates. Primers with 50 PCR products called rhaA-CFP-Kan bases sequences homologous to the E. coli rhaA gene at their and rhaA-YFP-Kan 59 ends were used. 4 Transformation The plasmid pKD46a was electroporated into the REL4548 recipient cells. REL4548 carrying pKD46 5 Transformation and PCR products rhaA-CFP-Kan and rhaA-YFP-Kan were electroporated REL4548 CFP-KanR and REL4548 homologous in REL4548 carrying pKD46. YFP-KanR recombination 6 Transformation The plasmid pCP20a was electroporated into REL4548 CFP-KanR and REL4548 CFP-KanR pCP20 and REL4548 YFP-KanR. REL4548 YFP-KanR pCP20 7 Heat Shock KanR cassette were excised from the REL4548 CFP-KanR and REL4548 REL4548/CFP and REL4548/YFP YFP-KanR genomes. aFrom Datsenko and Wanner (2000). mutation selection coefficients that were significantly larger by P1 transduction to have each mutation associated with than expected due to drift alone, indicating the action of each fluorescent marker. Since P1 transductions were per- some kind of cryptic variation during our competitions. This formed between isogenic strains (except for the marker and finding implies that, in practice, selection coefficients should the mobilized mutation), the risk of secondary mutations be considered as being distributed and that precise measures was low. Transductants were selected on LBA-Tet plates require evaluating both the mean and the variance of this (LB agar plates supplemented 10 mg/ml tetracycline). We distribution. Furthermore, the variance in s indicates that denote the wild-type genotype with CFP marker wc (wc for some uncontrolled processes occur in these experiments wild-type cyan), wy for wild-type YFP, mc for mutant CFP, (cryptic environmental or ), which impose my for mutant YFP. a limit to further dissecting the differences seen across replicates. We discuss implications of these findings and Competition experiments fi the prospects of this high-throughput method for tness Media: Lysogeny broth (LB) was used for routine molecular measurement. work and for reviving strains from storage (10 g/liter NaCl, 10 g/liter tryptone, 5 g/liter yeast extract; LB Agar LB + 15 Materials and Methods g/liter agar). Davis minimal (DM) medium supplemented with 250 mg/ml glucose (DM250) was used for all compe- Strain construction tition assays (KH2PO43H20 7 g/liter, KH2PO4 2 g/liter, The E. coli B strain used in this study, REL4548, was evolved (NH4)2SO4 1 g/liter, sodium citrate 0.5 g/liter; pH was ad- in Davis minimal medium supplemented with 25 mg/ml glu- justed to 7.0 with HCl or NaOH as necessary). Bottles were cose (DM25) for 10,000 generations as part of a long-term weighed before and after autoclaving and sterile milliQ wa- evolution experiment (Elena et al. 1998). ter was added to compensate for evaporation. After auto- claving, DM was supplemented with: 2.5 ml glucose 10%, Insertions of the chromosomal fluorescent markers: 22 The 1 ml MgSO4 10%, 1 ml thiamine (vitamin B1) 0.2%. We YFP and CFP genes (provided by the Yeast Resource Center call this medium DM250, which is equivalent to the one of the University of Washington) were inserted at the rhaA used by Lenski et al. (1991), in which the strain REL4548 locus of REL4548 using a technique developed by Datsenko grew for 10,000 generations, but with 10 times more and Wanner (2000). Table 1 gives a description of this glucose. method as applied to our experiments. A full description of the method is given in supporting information, File S1. Glycerol stocks: All strains were grown overnight and a sample of 750 ml of each culture was mixed to 250 mlof Mutant construction: The three mutants studied here were 60% glycerol and kept at 280 for storage. constructed by Elena et al. (1998) and were obtained by random single insertions of mini-Tn10 derivative 104—which Culture: The relative fitness, W, of each mutant was esti- contains a tetracycline resistance cassette (Kleckner et al. mated by measuring the change in its relative frequency in 1991)—into REL4548. We chose mutations T63, T103, and competition experiments. To measure the mutation fitness T121 from this original collection because they were identi- effect (MFE) and to control for potential marker effects and fied as neutral using the standard plating method. These mu- epistasis between the mutation and the marker, we per- tations were transduced into REL4548/CFP and REL4548/YFP formed four competition types for each mutant: (a) wc/wy,

High-Precision Measures 177 mental block consisted of each of these 10 competitions replicated 10-fold. Each experimental block was repeated at four different dates.

Flow cytometry: The relative frequency of competitors marked with CFP or YFP was measured using a Gallios Beckman Coulter flow cytometer at 0 and 24 hr following mixing of competing genotypes. We decided to separate competitor populations only on the basis of their fluorescent markers, because CFP and YFP cell populations did not have the same distribution pattern on forward vs. side scatter plots. Thresholds were applied manually (since clustering algorithms often introduce more noise) on the CFP–YFP plots to determine the boundaries of each population (CFP, YFP, unmarked cells, and doubled marked objects) as shown on Figure 1. These thresholds were the same for all competition plots because in such a constant environ- ment, cell clusters were always localized in the same areas fl fl Figure 1 Measuring genotype frequencies with ow cytometry. The uo- of the plot. The frequency of each marker type was calcu- rescence of each bacterium was measured in the FL1 (YFP) and FL10 (CFP) channels. Here we show a representative contour plot. Quadrants repre- lated using CFP and YFP population counts only. Unmarked sent thresholds delimiting the YFP (C1), CFP (C4), doubled marked (C2), and double-marked populations represented approximately and unmarked (C3) populations. In this example, populations are com- 0.2 and 1% of the total population, respectively. For simplic- posed of YFP, 110718 (51.03%); CFP, 103996 (47.93%); doubled- ity, “doublets” (objects composed of two cells) were ex- marked, 2110 (0.97%); Unmarked 151 (0.07%) individuals. cluded from our frequency estimates. CC, YY, and CY doublets occur, but only the latter are detected in the C2 (b) mc/my, (c) my/wc, and (d) mc/wy. The rationale for population (Figure 1). Furthermore, doublets may not form performing all these competitions is presented below. Com- at random; doublets with the same color were often over- petitions were begun by growing the strains to be competed represented (data not shown). Nevertheless, even consider- at 37 overnight with shaking at 250 rpm in 24-well micro- ing these complications, ignoring doublets only introduces titer plates (Greiner Bio—one 662102—suspension culture a bias on s measures proportional to se, where e is the frac- plates) containing 1 ml/well of DM250. We used DM250 as tion of the CY population (C2 in Figure 1). Under our con- the growth medium to obtain large population sizes, which ditions, e 1% making this bias negligible compared to s limit the effect of drift, and to facilitate the measurement of (see File S1 and File S2 for details). hundreds of thousands of cells without having to sample large volumes. To limit evaporation, each 24-well plate Precision of frequency measures with cytometry was placed in a 2-liter plastic box containing paper towels Our method is based on measuring the relative frequency p soaked with 100 ml water (at the bottom of the box). The of two competing genotypes at different time points by next day, 10 ml (100-fold dilution) of each culture was trans- counting C = 200000 cells. This large figure, however, still ferred to a fresh plate and incubated for 24 hr under iden- represents a small fraction of the total population and, tical conditions. On the third day, competitors were mixed at therefore, we estimate frequencies with sampling error. a 1:1 ratio (5 ml of each competitor) and transferred to The theoretical expectation for this sampling error is a fresh plate under identical conditions. On day 4, 20 ml of each competition was transferred into 10 replicate wells 2 2 = : se ¼ pð1 pÞ C containing 1980 ml of DM250. After mixing, 1 ml was re- moved from each well and placed in a plastic test tube at 4 If nothing else contributes to measurement error, we for a subsequent flow cytometry measurement (performed should obtain this variance when measuring repeatedly the 1 hr later), while the remaining 1 ml was kept in the micro- frequency in a given test tube. Preliminary experiments (not titer plate to be cultivated under the conditions described shown) indicated that much larger error could occur, in above. Finally, on the fifth day, a 100-ml sample was taken particular when test tubes were insufficiently mixed. This is from each competition, diluted in DM (not containing glu- an important technical issue and comparing actual mea- 22 2 cose, thiamine, or MgSO4 ), and placed in a plastic test surement error to se provides an internal check that mea- tube at 4 for a subsequent flow cytometry measurement surement error is not inflated above the sampling error (performed 1 hr later). Ten different types of competitions expectation. In the experiments presented here, we used were performed: wc vs. wy, mT63c vs. mT63y, mT63c vs. wy, wc measures of initial frequencies (p0) in our replicated com- vs. mT63y, mT103c vs. mT103y, mT103c vs. wy, wc vs. mT103y, petitions to estimate the variance of frequency measures 2 2 2 mT121c vs. mT121y, mT121c vs. wy, wc vs. mT121y. Each experi- performed with cytometry sobs. We found that sobs/se was

178 R. Gallet et al. Table 2 Selection coefficient expected in the different combinations of competition assays: wc/wy, mc/my, my/wc, and mc/wy

Genotypes Wild-type CFP (wc) Wild-type YFP (wy) Mutant CFP (mc)

Wild-type YFP (wy) sa ¼ Wwc /Wwy 2 1 Mutant CFP (mc) sd ¼ Wmc /Wwy – 1+t Mutant YFP (my) sc ¼ Wmy/Wwc 2 1 sb ¼ Wmc /Wmy 2 1 Each combination was performed for each mutant strain.

Z X N Y Y 0.94, 1.83, 1.07, and 0.95 for the four different dates where n p0;; ; ; j j ; ln Pr j 1 s ss ¼ ln N s ss s n ln p ds 2N i¼1;2 j¼0;t ik ik all the competitions were performed. Except for date 2, k measurement error was very close to that inherent to sam- (3) pling only. However, as shown by s2 at date 2 (and other obs n ; ; ; ... p0 preliminary assays showing more dramatic results), using where is the data matrix {n1 n2 n3 }, 1 the vector of cytometry does not guarantee that measurement error will all initial frequencies, and Nðm; s; xÞ denotes the probability be low. In particular, thorough mixing of test tubes through- density function of the Normal distribution with mean m out the growth cycle limits cell aggregation and is a crucial and standard deviation s. In all cases, parameters were step in taking advantage of the advantages offered by the estimated by maximizing the log-likelihood. Support limits cytometric approach (or any other approach based on fre- for a given estimate were computed within 2 units of log- quency variation). likelihood from the maximum with all other parameters be- ing freely fitted. An equivalent of standard error SEeq was computed as a quarter of the support range (similarly, 95% Measure of genotypic fitness confidence intervals are 6 1.96 SE). Computations were We measured fitness on the basis of a continuous time performed with Mathematica (Wolfram Research 2008). model dp=dt ¼ spð12pÞ, which defines selection coefficient (s) on the basis of frequency (p) variation. This frequency Fitness transitivity, allelic fitness, and epistasis variation was measured in the competition experiments de- To test whether a constant fitness can be attributed to scribed above. For a given competition assay k, the data are a genotype irrespective of its competitor, we performed all n 0 ; 0 ; t ; t a vector k ¼fn1k n2k n1k n2kg giving the number of geno- possible combinations of competition assays for a given types 1 and 2 counted at time 0 (beginning) and t (end of mutant. At a first locus we have the wild-type (w) and mu- the competition). The log-likelihood of this data given initial tant (m) alleles. At a second locus we have two alleles c and 0 fi frequency of genotype 1 p1k and selection coef cient sk is y (corresponding to the CFP and YFP marker proteins, re- computed as spectively). Each competition assay requires competing gen- X X otypes to have different alleles at the marker locus. There n 0 ; j j ; ln Pr kjp1k sk ¼ n ln p (1) are thus four possible combinations: (a) wc/wy, (b) mc/my, i¼1;2 j¼0;t ik ik (c) my/wc, and (d) mc/wy. Table 2 indicates the selection fi where p j ¼ 1 2 p j and coef cient expected in each of these cases if we assume that 2k 1k fi the tness of genotypes wc, wy, mc, and my are constant and t skt 0 = 0 skt 0 : equal to W , W , W , and W , respectively. When mea- p1k ¼ e p1k p2k þ e p1k (2) wc wy mc my suring the marker effect in the same background (compet- itions a and b), we measured the selection coefficient of the The frequency variation is measured over 24 hr. To scale CFP genotype. Otherwise, we measured the selection coef- fitness measurements per generation, we used the number ficient of the mutant genotype against the wild type (com- of cell generations as the time unit. This measure is an petitions c and d). average over the duration of the competition, which does models usually assume that fitness not contradict the fact that conditions change with time in effects are transitive, i.e., that they could be deduced from a given assay (e.g., the glucose becomes limiting) because it some absolute value ranking of the different genotypes. does so similarly in all replicates. Because populations ex- However, this is an assumption that requires evaluation pand by binary fission, we have t = ln(100)/ln(2) = 6.6 in before attributing a selection coefficient to genotypes.

Equation 2. Across replicates of the same competition, sk Since competitions a, b, and c are sufficient to estimate might vary for reasons other than sampling error, owing, all fitness if they are transitive, competition d can be used for example, to drift or to cryptic environmental variation. to measure departure from transitivity. Specifically, we in- To measure this variation, we used the same logistic regres- troduce a parameter t measuring this departure (see Table sion approach (Equations 1 and 2), but including the as- 2). Further reparameterization allows decomposing geno- sumption that s was normally distributed s Nðs; ssÞ among typic fitness into allelic effects and their interaction (epista- replicates. The log-likelihood of this logistic regression with sis). We note that Wwc = Wwy + a, Wmy = Wwy + b, Wmc = random slope is then Wwy + a + b + g. a is the “cost” of the CFP marker, b is

High-Precision Fitness Measures 179 the selective effect of the mini-Tn10 mutation, and g is the We also used a simple one-way ANOVA to test whether the epistasis between the two loci. standard deviation in s measures among replicates was con- To fit this model, for each mutant, we used Equation 1 sistent when measured at different weeks. This was the case, summed over the four competition assays and their repli- although the repeatability was not extremely high. Specifi- cates, with the parameterization indicated above. Support cally, we found that 53% of the variance in this standard limits for estimates were computed within 2 units of log- deviation was among competitions and 47% within competi- likelihood, all other parameters being freely fitted. tion between weeks. This variation among competitions is significantly larger than across dates for the same competition Expected amount of drift (F9,30 =3.7,P = 0.003). Repeatability of means and variance In our experiments, population size increases by binary at different weeks is crucial for measuring fitness with pre- fission. To compute the variance in frequency introduced by cision: it is fairly easy to obtain a very precise measure of drift, we first determine that each bacteria division increases frequency change in a single assay (or even an exact measure this variance by a quantity pq=n2, where n is the population if all individuals in the competition are counted at the begin- size at the time of this division. We then sum this variance to ning and the end), but this is not equivalent to obtaining an the end of population growth: accurate measure of fitness, which must account for inter- replicate variance. As this example shows, analyzing a very X 2 nf pq 1 1 1 fi ¼ pq 2 þ O : (4) large data set also provides suf cient statistical power for n¼n 2 i n ni nf ni detecting very small biological effects, but it can also reveal “nuisance” effects (almost anything tested becoming “signifi- cant”). To cope with these issues, we used an approach quan- We thus expect the variance of selection coefficients tifying variance components in a mixed model (Equation 3 in contributed by drift to be Materials and Methods). n 2 n 2 f i ; Precision of fitness measures ss ¼ 2 (5) g nf nipq Competition assays provide a direct measure of the fitness of one genotype relative to a competing genotype. To de- where g is the number of generations (6.6 as explained termine if the differences we observed in fitness estimates above, over the time course of the competition experiment). between replicate competitions was biologically meaningful, The variance in frequency change caused by selective differ- as opposed to a sampling effect, we used a mixed model to ences among replicates is var(sgpq)=s2ðgpqÞ2. In our s directly estimate the amount of variation in s (s ) beyond experiments we have n in the range 1082109 and n 100 s f i sampling error (Equation 3). This approach provides an es- times less. These population sizes were estimated by serial timate of average selection intensity (s), a measure of bi- dilution and plating (not shown), and these numbers repre- ological heterogeneity in selection among replicates (s ), sent the extreme cases. Thus we expect s to be between s s and standard errors associated with these two parameters. 1024 and 3 · 1024. Significantly larger s would indicate s Beyond estimating average selection (s) with some preci- that a source of variation, in addition to sampling error sion, it is important to indicate the magnitude of variation and drift, contributed to differences among replicate com- in s (s ) and the precision reached to estimate it. Table 3 petitions (e.g., such as random fluctuations in selection coef- s presents these estimates for our 10 competition assays. Esti- ficients among replicates). mates of s (Table 3) range from 0.00088 (T121 YFP) to

20.024 (T103 CFP). Estimates of ss range from 0 to Results 0.0035, with 7 of 10 estimates being greater than zero. In We used a flow cytometric approach to measure the fitness all cases the precision of these estimates is about 60.0002. of three mutants, each carrying a single mutation, that were The origin of variation in s among replicates classified as being neutral with conventional methods (Elena et al. 1998). We performed 10 types of competition There are four nonexclusive reasons that changes in the assays, each replicated 10-fold at each of 4 weeks, giving frequency of reference and mutant genotypes during com- a total of 400 fitness measures (Figure 2). A standard anal- petitions could be truly different among replicates: (1) ysis of deviance (Equations 1 and 2) revealed that 96.6% experimental error unrelated to sampling (e.g., pipeting), of the deviance was among competition assay types. There (2) new mutations occurring in some replicated competi- were significant week (0.6% of the total deviance), week · tions, (3) drift, and (4) variation in selection intensity competition (1.3%), and replicate (1.4%) effects, although among replicates (e.g., due to cryptic environmental varia- they accounted for a very small fraction of the total devi- tions). We consider each possibility in turn. ance. In particular, although detectable, the week effect was Experimentalerrorisunlikelytobethesourceofthe smaller than the replicate effect (well-to-well variation for variation in s in our experiment. We repeated each competition the same competition during the same week), indicating that type at four dates and ss was consistently low and comparable the experiments were repeatable from one week to another. to the drift expectation in competitions a and b (Figure 3). It is

180 R. Gallet et al. Table 3 Estimation of the mean (s) and standard deviation (ss) of genotypic selection coefficients per generation in the different competition assays (code in first column)

23 ð · 10 Þ s Inf Sup SEeq ss Inf Sup SEeq Wild type (a) 24.13 23.74 24.53 0.20 0.80 0.25 1.27 0.26 T63 (b) 1.16 1.58 0.74 0.21 0.59 0.00 1.18 0.29 T63 CFP (c) 212.23 211.93 212.63 0.18 3.51 3.20 3.82 0.16 T63 YFP (d) 28.28 27.28 26.28 0.29 1.49 1.02 1.94 0.23 T103 (b) 26.48 26.14 26.81 0.17 0.00 0.00 0.60 0.15 T103 CFP (c) 224.26 224.10 224.82 0.19 2.84 2.52 3.13 0.15 T103 YFP (d) 215.55 214.55 213.55 0.24 1.11 0.71 1.60 0.22 T121 (b) 23.60 23.39 23.80 0.10 0.00 0.00 0.44 0.11 T121 CFP (c) 26.40 25.92 26.87 0.24 1.15 0.75 1.65 0.23 T121 YFP (d) 0.88 1.88 2.88 0.21 0.87 0.35 1.33 0.25

Inf and Sup indicate the inferior and superior support limits of the estimates, respectively. SEeq gives a measure analogous to standard error and equals (sup 2 inf)/4. All figures are multiplied by 1000. unlikely that systematic error would occur only for some com- Drift can also cause variation in genotype frequency petition types and even more unlikely that this pattern would changes in the different replicates. This process scales with be repeatable at different dates. The second hypothesis is that the inverse of population size and should effectively vanish new deleterious or beneficial mutations, unrelated to the mu- in very large populations. In our experiments, we expect ss tation of interest, may occur during the competition and in- to be between 1024 and 3 · 1024 if it was due to drift

fluencetheoutcome.Thecaseofdeleteriousmutationsisnot alone (see Materials and Methods). Our estimates of ss (Ta- really problematic, because they are unlikely to reach high ble 3) varied among the competition assays. In competitions frequency in a large population and because, if many occur, a and b, estimates of ss were not different from the maxi- they will occur equally in the two competing genotypes. The mum value that would be expected because of drift case of beneficial mutations may, at first sight, seem trickier. (3 · 1024, Figure 3). These competition assays correspond Let us consider a worst-case scenario of the early occurrence of to CFP vs. YFP competitions within the same genetic back- a beneficial mutation providing a growth advantage of 10% ground. We thus conclude that the cost of expressing the per division. If we consider the appearance of this mutant at the different fluorescent proteins is not significantly affected by very start of the preculture (i.e., 17 generations before the uncontrolled cryptic environmental variation in our experi- start of the competition assay), its frequency at the end of ments. Other estimates of ss (in competitions c and d) are the competition will be ,1025 (assuming a competition of much larger than the drift expectation (Figure 3). One 6.6 generations and an effective population size of 106 as used in this study), which is too low to have any impact on our measures. Significant frequency variation (above 0.02% in our case) would require a mutation to confer a benefit greater than 30% (see File S1 for details), which is very unlikely in a strain that has been adapted to the environment for 10,000 generations and for which no such mutations have been identified during the early stages of this , when fitness increases were most rapid (Barrick et al. 2009). Moreover, even if such large-effect mutations were available to our strains, they would have to occur repeatedly in many competitions because our ob- served var(s) is not due to isolated outliers (Figure 2). We note that some mutation types, notably genomic amplifica- tions, have been observed to occur at high frequencies and may sometimes confer beneficial effects either directly or indirectly by increasing the mutational target for new muta- tions to occur. However, if these mutations occur at a very high rate, they would occur in both competitors and thus have a limited effect on var(s). Furthermore, if the occur- rence of de novo genomic amplifications were increasing var Figure 2 Selection coefficients (s) of the wild-type (REL4548) and mutant wc wy mc (s) in our experiments, they should do so in all competition strains (REL4548 T63, T103, or T121) measured in the (a) / , (b) / my, (c) my/wc, and (d) mc/wy competitions. Each point represents an s types, and not only in competitions c and d (Figure 3). In measure estimated from a single competition experiment. s measures are summary, we conclude that the rise and spread of new muta- grouped in lines to show the variance between experiments performed at tions is very unlikely to explain our results. different dates.

High-Precision Fitness Measures 181 Table 4 Estimation of allelic selection coefficients per generation in the different competition assays (the subscript refers to the mutant T63, T103, or T121)

23 23 Parameter Estimate ð · 10 Þ SEeq ð · 10 Þ Sign. a 24.13 0.14 ***

b63 212.27 0.22 *** g63 5.28 0.22 *** t63 20.13 0.30 NS b103 217.89 0.22 *** g103 22.23 0.22 *** t103 21.71 0.30 ** b121 22.81 0.22 *** g121 0.55 0.22 * t121 20.48 0.30 NS a is the cost of the marker (bacteria expressing the fluorescent proteins CFP having a 0.4% cost relative to those expressing YFP). b are the allelic effects of the three random mutations. g are the epistasis between the random mutations and the

marker. t measures departure from transitive genotypic fitness. SEeq gives a measure analogous to standard error and is a quarter of the support range. Sign. (signifi- cance) indicates whether the estimates are different from zero (LRT). NS, nonsignif- Figure 3 Variance in selection coefficients (ss estimate bold point 6 fi P , P , support limits indicated by bars) of the wild-type (REL4548) and mutant icant. All gures are multiplied by 1000. *** -value 0.001; ** -value 0.01. strains (REL4548 T63, T103, or T121) measured in the (a) wc/wy, (b) mc/ my, (c) my/wc, and (d) mc/wy competitions. The predicted variation very controlled and standardized conditions, cryptic environ- expected by alone is represented by the shaded line at the fi bottom of the figure. mental variation has a detectable impact on tness measures. Fitness transitivity Population genetic models of selection usually consider possibility is that the effect of drift is greater than expected fitness effects to be transitive between competing genotypes. from consideration of population size alone. This may be the In this view, fitness can be associated with a given genotype case if there was substantial phenotypic diversity in the com- rather than being defined locally relative to particular peting populations so that a subset of the population con- competitors. [There are, of course, particular frequency- tributed disproportionately to population growth. In fact, dependent selection schemes that can generate nontransi- this explanation seems unlikely. We find a typical value of tive fitness measures (e.g., Sinervo and Lively 1996).] Meth- s of about 0.001, which would require that N was reduced s e odologically, transitivity is also an important assumption in to 9% of the actual population (from Equation 5) (Figure inferring allelic from genotypic fitness effects, as when using 3). This means that drift can explain our observed s only if s a marker to infer the effect of a mutation. Our experimental more than 90% of the sampled population is not growing. design allows us to test for departures from transitivity be- (In the most extreme case, ,1% of the population would cause we measured relative fitness in four combinations of have to be growing (T63, competition c). Studies performed genotypes pairs (see Materials and Methods). Specifically, to on E. coli populations showing that only a few percent of the test for transitivity, we need to make three estimates for total population were in an “atypical” nongrowing physio- a mutation: (1) the allelic cost of the marker, (2) the allelic logical state during exponential population growth (Balaban effect of the mutation, and (3) the epistasis between both. et al. 2004; Levin and Rozen 2006) support the conclusion With three competitions, we have three equations and three that phenotypic variation is not sufficient to account for unknowns. Thus, adding one competition adds one equation variance among replicates in some of our competitions. and provides a means to estimate a departure from consis- In cases where variation is too high to be explained by drift tency (i.e., transitivity) among competition types. We found (competitions c and d), variation necessarily implies that that t, a parameter measuring deviations from transitivity, selection intensity changes slightly among replicates, perhaps was not significantly different from zero for competitions due to environmental variation among replicates. Further- involving the T63 and T121 mutants (LRT; Table 4), mean- more, ss estimates were larger for large jsj (Pearson r = ing that fitness was transitive. By contrast, t was signifi- 0.69), a situation that would be expected when different cantly different from zero for T103, but this departure was competitors have environmental tolerance curves with differ- quite small (t = 20.00171 6 0.0003) and, more impor- · ent slopes (i.e.,aG E effect). In this case environmental tantly, very small compared to the fitness differences mea- variation will not necessary affect both competitors with the sured in those competition assays (Table 4). same intensity. Thus, small environmental variations across replicated competitions can have a nonnegligible impact on Allelic fitness and epistasis ss. Both the high values of ss (compared to the drift expec- To test for epistasis between our markers and the focal tation) and its pattern of variation (larger in assays with large mutations, we decomposed genotypic fitness into the allelic fitness differences) support the conclusion that, even under effects of the marker and the mutation and their interaction

182 R. Gallet et al. (epistasis). The expression of CFP was more costly than YFP not only in competitions c and d. Such variation arose de- (a 0.4% difference in the wild type) and the allelic effect of spite considerable effort to perform all competitions in pre- mutations was 21.2%, 21.7%, and 20.3% for T63, T103, and cisely controlled conditions. In an absolute sense, this T121, respectively (Table 4). However, we detected significant variation was not large (although much larger than our pre- differences between fitness effects of the same mutations when cision), but it supports the idea that the effect of mutations measured in the CFP and YFP backgrounds, indicating the can be strongly context dependent. For instance, it is possi- existence of epistasis between the marker and the three indi- ble that if our experiment was performed in a different lab, vidual mutations. Even though the strength of epistatic inter- the s and ss might be slightly different (because of differ- actions was quite small, it could represent an important part of ences in the average environment or in the magnitude of the genotypic selection coefficients. For instance, for T63, epis- microenvironmental fluctuations, respectively). A fortiori, tasis was larger than the cost of the marker and represented we expect ss to be even larger under environmentally het- 43% of the allelic mutational effect. For the two other muta- erogeneous natural conditions. These observations raise the tions, the quantitative importance of epistasis was much question of whether selection coefficients should be de- smaller. A caveat to our interpretation of epistasis is that it is scribed by only their mean values s, or more appropriately possible that we inadvertently introduced secondary mutations by distributions (with two parameters s and ss), and conse- into genotypes during some step required for strain construc- quently if mutations are appropriately described as benefi- tion (see Materials and Methods). Since we observed fitness cial, neutral, or deleterious, since their effects are context differences with the three mutants we tested, the hypothesis dependent, even within controlled laboratory environments. of secondary mutation introduction supposes a very high rate So far, population genetic models do not typically consider of such mutations during P1 transduction. that s values are distributed, such that one mutation can have

very different fates depending on its ss. For instance, the prob- ability of fixation of a mutation with s =0andss . 0willnot Discussion be driven by drift only, as described by the neutral theory, In a large population, even mutations with very small fitness but will depend on the environmental pattern responsible effects can play a role in the process of adaptation. However, for ss . 0 (see, e.g., Ewens 1979). In any case, much more studying them empirically is a significant practical chal- attention should be paid to variable selection coefficients and lenge. Measurement error and drift obviously limit the their evolutionary impact. The experimental design and statis- precision of fitness measures that can be obtained experi- tical analysis we propose here offers an efficient and new mentally. Is it possible to measure selection up to a limit approach to doing that. imposed by the noise produced by sampling error and drift? If not, how close to this limit can we go? We addressed these Epistasis with the marker questions by performing competition experiments in large In competition experiments with microbes, the neutrality of E. coli populations (to minimize drift) and by tracking fre- the marker is always verified; however, the potential epistatic quency changes using flow cytometry to count marked cells interactions between the marker and mutations is not usually (to minimize sampling error). systematically investigated. Controlling for this issue requires Our experiments are based on short-term batch cultures switching the markers between backgrounds [to perform (6.6 generations). This design has several convenient features. complementary competition assays (Dykhuizen and Hartl First, it is a relatively simple experimental set-up that can be 1980)] and a high level of precision. We found epistatic inter- massively replicated. Second, it reduces, although does not actions between the inserted mutations and the fluorescent eliminate, the complication of newly arising mutations. Third, marker in all cases, suggesting that epistatic effects, although it entirely accounts for the effect of the marker. Finally, it perhaps very small, may be common. Such interactions com- avoids the complication of using time series data. plicate measures of s because they require separating the MFE from the marker cost and the epistatic interactions between Cryptic variation in s them. We note that if we had found that epistatic effects were A surprising, and we think important, result was that, for of a similar size to (or larger than) the allelic effects it would some competitions types, selection was variable across have raised the concern that the compared strains may not replicates, probably because of cryptic environmental vari- have been isogenic. This was not the case in our experiment; ation to which the competing genotypes had different nevertheless, we cannot formally exclude the possibility that sensitivity. Although not empirically excluded, the alterna- transformation and P1 transduction manipulations did not tive hypothesis that variation in estimates of s was caused by introduce any secondary mutations. beneficial mutations spreading in a large number of the batch cultures, seems unlikely for two reasons: (1) such Transitivity mutations would have to confer a very large benefit (un- The assumption of fitness transitivity is made in most likely to appear in a strain that has evolved in the same population genetic models that do not specifically include environment for 10,000 generations) and (2) adaptive social effects or frequency dependence. This assumption mutations would increase var(s) in all competition types, has been evaluated on several occasions and in different

High-Precision Fitness Measures 183 organisms (Richmond et al. 1975; Goodman 1979; Paquin dividual precision of fitness estimates (determined by the and Adams 1983; De Visser and Lenski 2002; Bell 2008). sampling effort: the number of colonies counted when plat- The main conclusion is that fitness tends to be transitive ing, or the number of cells counted with flow cytometry), unless special social interactions are present. Like for any which is not usually reported with fitness competition “null hypothesis,” it is, however, important to realize that experiments. Thus, we can discriminate between sampling the statistical power of an experiment gives an inherent limit error and other sources of variance across replicates (due to to the detectable departure of transitivity. In our experiment, drift, variance in s, etc.), which greatly enhances the infor- we tested this hypothesis and did not find consequential mation that can be extracted from the data. By considering departures from transitivity in the genotypic fitness meas- these factors, we were able to measure mean and variance in ures (Table 4). Given our high statistical power, this finding selection coefficients down to a precision of 0.02% (Tables 3 represents a strong internal check that our experimental and 4). Regarding mean selection, this precision represents results are robust. The fact that fitness effects are transitive a 10-fold improvement over typical studies using flow is also an important result in simplifying experiments: with- cytometry (Lunzer et al. 2002; Ali and Yang 2006; Lee out the need to check for transitivity, only three types of et al. 2009) and is comparable to the precision reached in competition need to be performed to estimate allelic effects Zhu et al. (2005). More importantly, as explained above, the and epistasis (instead of four in our design). massive replication we used also provides a precise measure

of the variance of selection coefficient ss (60.02%). Precision of fitness measures and the statistics of selection Neutralist vs. selectionist Although sampling error and drift can make it difficult to The neutral theory of proposes that the measure small fitness effects, replicated measures will tend fate of many mutations is governed by the effect of drift not to significantly differ from each other. Consequently, (Kimura 1983). The development of precise fitness meas- a single fitness value can be legitimately attributed to a given ures was used in the neutralist/selectionist controversy on genotype in a given environment and the precision of this proteins to determine if allozymes differed in terms of selec- estimate can be determined as the standard error of the tion (Dykhuizen and Hartl 1980). Today, this debate has mean fitness effect across replicates. New high-throughput shifted toward smaller fitness effects at the molecular level counting methods alleviate limits due to sampling error and (Kreitman 1996; Nei 2005). The analysis of sequence poly- drift. Here, we show that such methods can reveal that morphism provides different indirect ways to confront neu- replicated measures differ from one another for a given tralist vs. selectionist expectations. For most mutations, it is genotype in a given environment—i.e., that var(s) is signif- usually thought that there is no alternative to resolving this icantly greater than the value expected by drift and sam- question. Measuring s with ever-increasing precision may pling error alone. This situation challenges the simple start to change this perspective and may help to answer concept of precision mentioned above. When confronted some of the key questions fueling this debate. As we have with this problem, one approach is to do “as usual” and already seen in our study, mutations formerly considered as neglect the observation that replicates differ. In this case, neutral (albeit in a slightly different medium, DM25 in Elena providing only a mean fitness effect and its standard error et al. 1998) actually confer small, but significant, fitness do not reflect the actual precision of the experiment. In effects. Importantly, even in an apparently constant environ- particular, it fails to acknowledge that replicates may differ ment, their effect is best understood as being distributed, beyond this standard error. The second approach is to admit which complicates straightforward application of discrete that replicates actually differ and represent different draws classifications (deleterious/neutral/beneficial), and which in a distribution of fitness effects, even if the environment is would have to be accounted for in theoretical expectations, supposed constant. If selection coefficients are distributed, it e.g., for the analysis of sequence . is thus necessary to measure the mean effect, but also the Sampling error, de novo beneficial mutations, and drift variance, and possibly higher moments (skewness, kurtosis, introduce elements of chance into fitness competitions, which etc.) of the distribution of s values. The concept of precision can limit our ability to measure very small fitness effects. in this case must incorporate estimates of these moments The effect of these factors can, however, be reduced. Sam- and their standard errors. This is the approach we have pling error can be dramatically decreased using flow cytom- taken, introducing the use of a mixed model that allowed etry, and the problem of the occurrence of new beneficial us to decompose sources of variation in frequency change mutations and of drift is reduced by using very short-term (sampling error, drift, environmental variation of s). batch cultures and high replication. However, a remaining We measure fitness on the basis of frequency change as in issue is the variation in selection due to microenvironmental classical population genetics (Dykhuizen and Hartl 1980; variation across replicated cultures. While this would not be Arnason and Barker 1999), which differs from common surprising if replicate measurements had been obtained practices in (Chevin 2010), where from different environments (e.g., Remold and Lenski fitness is measured as a ratio of growth rates (e.g., Lenski 2001), that was not the case in our experiment in which et al. 1991). We also fully use the information on the in- all competitions were carried out in an environment that

184 R. Gallet et al. was kept as consistent as possible. Increased sampling effort, Charlesworth, B., and D. Charlesworth, 1998 Some evolutionary larger population sizes, or longer lasting experiments are consequences of deleterious mutations. Genetica 103: 3–19. Chevin, L.-M., 2010 On measuring selection in experimental evo- not likely to resolve this issue. It is thus unclear how far – fi lution. Biol. Lett. 7: 210 213. precision in tness measurement can be improved. It all Clark, A., 1979 Effects of interspecific competition on the dynam- relies on understanding the sources of variation, and con- ics of a polymorphism in an experimental population of Dro- trolling them, whenever possible. Would it be possible to sophila melanogaster. Genetics 92: 1315–1328. measure the fitness effects of synonymous mutations or Datsenko, K. A., and B. L. Wanner, 2000 One-step inactivation of mutations occurring in noncoding sequences? We cannot chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl. Acad. Sci. USA 97: 6640–6645. answer these questions yet; however, our method is a step de Visser, J. A. G. M., and R. Lenski, 2002 Long-term experimental in that direction and it will certainly help to bridge the gap evolution in Escherichia coli. XI. Rejection of non-transitive in- between studies measuring s experimentally and studies in- teractions as cause of declining rate of adaptation. BMC Evol. ferring s from genetic sequences (see Eyre-Walker and Biol. 2: 19. Keightley 2007 for review). Dobzhansky, T., and O. Pavlovsky, 1957 An experimental study of interaction between genetic drift and . Evolu- tion 11: 311–319. Acknowledgments Domingo-Calap, P., J. M. Cuevas, and R. Sanjuan, 2009 The fit- ness effects of random mutations in single-stranded DNA and We thank E. Flaven, M.-P. Dubois for lab management, C. RNA bacteriophages. PLoS Genet. 5: e1000742. Duperray (IRB – Montpellier), C. Mongellaz (IGM, Montpellier), Dykhuizen, D., and D. Hartl, 1980 Selective neutrality of 6PGD and the Montpellier RIO Imaging platform for training R.G. allozymes in Escherichia coli and the effects of genetic back- – to flow cytometry and their help in designing the cytometer ground. Genetics 96: 801 817. fl Dykhuizen, D. E., and D. L. Hartl, 1983 Selection in chemostats. protocols, N. Le Meur for her help with the owCore R Microbiol. Rev. 47: 150–168. package, and L. M. Chevin, P. A. Gros, G. Martin, and F. Elena, S. F., L. Ekunwe, N. Hajela, S. A. Oden, and R. E. Lenski, Rousset for fruitful discussions and insightful comments. 1998 Distribution of fitness effects caused by random insertion We also thank two anonymous reviewers for useful com- mutations in Escherichia coli. Genetica 102–3: 349–358. ments. This work was supported by the European Research Ewens, W. J., 1979 Mathematical Population Genetics. Springer- ‘ ’ Verlag, Berlin. Council Starting Grant Quantevol to T.L. and a National Eyre-Walker, A., and P. D. Keightley, 2007 The distribution of Science Foundation grant (IOS-1022373) to T.F.C. fitness effects of new mutations. Nat. Rev. Genet. 8: 610–618. Felsenstein, J., 1976 The theoretical population genetics of vari- able selection and migration. Annu. Rev. Genet. 10: 253–280. Literature Cited Fisher, R. A., 1930 The Genetical Theory of Natural Selection. Oxford Univ. Press, Oxford. Ali, A., and O. O. Yang, 2006 A novel small reporter gene and Goodman, D., 1979 Competitive hierarchies in laboratory Dro- HIV-1 fitness assay. J. Virol. Methods 133: 41–47. sophila. Evolution 33: 207–219. Arnason, E., and J. Barker, 1999 Analysis of selection in labo- Kimura, M., 1983 The Neutral Theory of Molecular Evolution, ratory and field populations, pp. 182–203 in Evolutionary Ge- Cambridge Univ. Press, Cambridge, UK. netics: From Molecules to Morphology,editedbyR.Singh,and Kleckner, N., J. Bender, and S. Gottesman, 1991 Uses of transpo- C. Krimbas. Cambridge Univ. Press, Cambridge, UK. sons with emphasis on Tn10. Methods Enzymol. 204: 139–180. Arnason, E., and R. C. Lewontin, 1991 Perturbation–reperturbation Kreitman, M., 1996 The neutral theory is dead: long live the test of selection vs. hitchiking ofthe 2 malor alleles of esterase-5 in neutral theory. BioEssays 18: 678–683. Drosophila pseudoobscura. Genetics 129: 145–168. Labbe,P.,N.Sidos,M.Raymond,andT.Lenormand,2009 Resistance Balaban, N. Q., J. Merrin, R. Chait, L. Kowalik, and S. Leibler, gene replacement in the mosquito culex pipiens: fitness estimation 2004 Bacterial persistence as a phenotypic switch. Science from long-term cline series. Genetics 182: 303–312. 305: 1622–1625. Lee, M. C., H. H. Chou, and C. J. Marx, 2009 Asymmetric, bi- Barrick, J. E., D. S. Yu, S. H. Yoon, H. Jeong, T. K. Oh et al., modal trade-offs during adaptation of methylobacterium to dis- 2009 Genome evolution and adaptation in a long-term exper- tinct growth substrates. Evolution 63: 2816–2830. iment with Escherichia coli. Nature 461: 1243–1274. Lenormand, T., and M. Raymond, 2000 Analysis of clines with Bataillon, T., and M. Kirkpatrick, 2000 Inbreeding depression due variable selection and variable migration. Am. Nat. 155: 70–82. to mildly deleterious mutations in finite populations: size does Lenormand, T., D. Roze, and F. Rousset, 2009 Stochasticity in matter. Genet. Res. 75: 75–81. evolution. Trends Ecol. Evol. 24: 157–165. Beatty, J., 1984 Chance and natural selection. Philos. Sci. 51: Lenski, R. E., M. R. Rose, S. C. Simpson, and S. C. Tadler, 183–211. 1991 Long-term experimental evolution in Escherichia coli.I. Bell, G., 2008 Selection: The Mechanism of Evolution. Oxford Univ. Adaptation and divergence during 2,000 generations. Am. Nat. Press, Oxford. 138: 1315–1341. Bell, G., 2010 Fluctuating selection: the perpetual renewal of ad- Levin, B. R., and D. E. Rozen, 2006 Opinion: non-inherited anti- aptation in variable environments. Philos. Trans. R. Soc. B-Biol. biotic resistance. Nat. Rev. Microbiol. 4: 556–562. Sci. 365: 87–97. L’Heritier, P., and G. Teissier, 1937a Elimination of the mutant Bollback, J. P., T. L. York, and R. Nielsen, 2008 Estimation of 2N(e) forms in Drosophila populations: case of Drosophila with the s from temporal allele frequency data. Genetics 179: 497–502. “ebony gene.” C. R. Seances Soc. Biol. Fil. 124: 882–884. Carrasco, P., F. de la Iglesia, and S. F. Elena, 2007 Distribution of L’Heritier, P., and G. Teissier, 1937b Elimination of the mutant fitness and virulence effects caused by single-nucleotide substi- forms in Drosophila populations: the case of Drosophila with tutions in tobacco etch virus. J. Virol. 81: 12979–12984. the “bar gene.” C. R. Seances Soc. Biol. Fil. 124: 880–882.

High-Precision Fitness Measures 185 Lunzer, M., A. Natarajan, D. E. Dykhuizen, and A. M. Dean, Richmond, R. C., M. E. Gilpin, S. Perezsalas, and F. J. Ayala, 2002 Enzyme kinetics, substitutable resources and competi- 1975 Search for emergent competitive phenomena: dynamics tion: from biochemistry to frequency-dependent selection in of multispecies drosophila systems. Ecology 56: 709–714. lac. Genetics 162: 485–499. Rosenfeld, N., J. W. Young, U. Alon, P. S. Swain, and M. B. Elowitz, Lynch, M., 1987 The consequences of fluctuating selection for 2005 Gene regulation at the single-cell level. Science 307: isozyme polymorphisms in Daphnia. Genetics 115: 657–669. 1962–1965. Manly, B. F. J., 1985 The Statistics of Natural Selection. Chapman Rousset, F., 2004 Genetic Structure and Selection in Subdivided & Hall, London. Populations. Princeton Univ. Press, Princeton. Millstein, R. L., 2008 Distinguishing drift and selection empirically: Saccheri, I. J., F. Rousset, P. C. Watts, P. M. Brakefield, and L. M. “the great snail debate” of the 1950s. J. Hist. Biol. 41: 339–367. Cook, 2008 Selection and gene flow on a diminishing cline of Nei, M., 2005 Selectionism and neutralism in molecular evolu- melanic peppered moths. Proc. Natl. Acad. Sci. USA 105: tion. Mol. Biol. Evol. 22: 2318–2342. 16212–16217. Oakeshott, J. G., S. R. Wilson, and J. B. Gibson, 1983 An attempt Sanjuan, R., A. Moya, and S. F. Elena, 2004 The distribution of to measure selection coefficients affecting the alcohol-dehydro- fitness effects caused by single-nucleotide substitutions in an genase polymorphism in Drosophila melanogaster populations RNA virus. Proc. Natl. Acad. Sci. USA 101: 8396–8401. maintained on ethanol media. Genetica 61: 151–159. Sinervo, B., and C. M. Lively, 1996 The rock-paper-scissors game O’Hara, R. B., 2005 Comparing the effects of genetic drift and and the evolution of alternative male strategies. Nature 380: fluctuating selection on genotype frequency changes in the scar- 240–243. let tiger moth. Proc. Biol. Sci. 272: 211–217. Thatcher, J. W., J. M. Shaw, and W. J. Dickinson, 1998 Marginal Orr, H. A., 2005 The genetic theory of adaptation: a brief history. fitness contributions of nonessential genes in yeast. Proc. Natl. Nat. Rev. Genet. 6: 119–127. Acad. Sci. USA 95: 253–257. Paquin, C., and J. Adams, 1983 Frequency of fixation of adaptive Wilson, S. R., J. G. Oakeshott, J. B. Gibson, and P. R. Anderson, mutations is higher in evolving diploid than haploid yeast pop- 1982 Measuring selection coefficients affecting the alcohol- ulations. Nature 302: 495–500. dehydrogenase polymorphism in Drosophila melanogaster. Ge- Peris, J. B., P. Davis, J. M. Cuevas, M. R. Nebot, and R. Sanjuan, netics 100: 113–126. 2010 Distribution of fitness effects caused by single-nucleotide Wolfram Research, 2008 Mathematica, v. 7.0, Champaign, IL. substitutions in bacteriophage f1. Genetics 185: 603–609. Zhu, G. P., G. B. Golding, and A. M. Dean, 2005 The selective Remold, S. K., and R. E. Lenski, 2001 Contribution of individual cause of an ancient adaptation. Science 307: 1279–1282. random mutations to genotype-by-environment interactions in Escherichia coli. Proc. Natl. Acad. Sci. USA 98: 11388–11393. Communicating editor: J. Lawrence

186 R. Gallet et al. GENETICS

Supporting Information http://www.genetics.org/content/suppl/2011/10/31/genetics.111.133454.DC1

Measuring Selection Coefficients Below 10À3: Method, Questions, and Prospects

Romain Gallet, Tim F. Cooper, Santiago F. Elena, and Thomas Lenormand

Copyright © 2012 by the Genetics Society of America DOI: 10.1534/genetics.111.133454 File S1

!" !"##$%&%'()*+,&)(%*-)$!Supporting Information ,

#" -./01234./,45,260,761484/489:,5:;410/70.2,891<01/,

A" $%&" '()" *+," -()" .&+&/" 0&1&" 2+/&13&," *3" 3%&" !"#$" 4567/" 58" 9:;<=<>" 7/2+." 3%&" 5+&?/3&@"

<" 2+*632B*325+"58"6%15C5/5C*4".&+&/"3&6%+2D7&",&B&45@&,"EF"G*3/&+H5"I"J*++&1"KGL$M:NOP" =" *+," JLNN:9" #QQQRS" @TLA#?'()" *+," @T:!9?-()" ?" @4*/C2,/" H2+,4F" @15B2,&," EF" G1" U26%*&4"

W" :45023V"?"0&1&"7/&,"*/"3&C@4*3&/S"

X" $%&" 821/3" /3&@" 58" 3%2/" 65+/3176325+" 0*/" 35" 2+315,76&" 3%&" )L!" KE*63&125@%*.&" λ" @15C53&1R" >" 7@/31&*C"58"3%&"-()"*+,"'()".&+&/S"Y&6*7/&"3%&"'()"*+,"-()"/&D7&+6&/"K024,?3F@&"65,5+/Z"

\" ,&B&45@&,"EF"[+2B&1/23F"58"J*/%2+.35+"'&*/3"9&/5716&"-&+3&1R"5+4F",288&1"EF"#Q"+764&532,&/"

!Q" K573"58"X!X"E@RZ"3%&"/*C&"@12C&1/"6574,"E&"7/&,"851"E53%")-9/S"$%&"@12C&1")?Y/@]^?@15CL!? !!" '-_()"851"K!Q\"E*/&/R"2/"65C@5/&,"58"*"=`"$-$-"3*24"K2+61&*/2+.",2.&/325+"&88262&+6F"EF"Y/@]^RZ"

!#" *"Y/@]^"1&/3126325+"/23&Z"3%&")L!"@15C53&1"65+3*2+2+."*"12E5/5C&"E2+,2+."/23&"K1E/R"*+,"*"#Q"E@" !A" /&D7&+6&"%5C545.57/"35"3%&"'()"*+,"-()".&+&/"=`"&a31&C23F"K/&&"/7@@4&C&+3*1F"C*3&12*4/"

!<" 851"@12C&1"/&D7&+6&/"*+,")-9"C2aRS"$%&")?-4*^?'-_()"1&B"@12C&1"2/"65C@5/&,"58"*"=`"$-$-" !=" 3*24Z"*"-4*^"1&/3126325+"/23&"*+,"*"#Q"E@"/&D7&+6&"%5C545.57/"35"3%&"'()"*+,"-()".&+&/"A`" !W" &a31&C23FS"$%&"1&/7432+.")-9"@15,763/"0&1&"+*C&,") ?-()"*+,") ?'()S" L! L!

!X" $%&"/&65+,"/3&@"65+/2/3&,"2+"645+2+."3%&") ?-()"*+,") ?'()"2+"@OG<"KGL$M:NOP"*+,"JLNN:9" L! L! !>" #QQQRS")L!?-()"*+,")L!?'()"0&1&",2.&/3&,"EF"Y/@]^"*+,"-4*^Z"*+,"3%&+".&4"@71282&,S"@OG<"0*/" !\" 821/3",2.&/3&,"EF"Y/@]^"*+,"-4*^Z"*+,"3%&+"31&*3&,"023%"/%12C@"*4H*42+&"@%5/@%*3*/&"KML)RS"

#Q" $%&" ML)" 6*3*4FV&/" 3%&" 1&4&*/&" 58" =b?" *+," Ab?@%5/@%*3&" .157@/" 815C" GNLZ" @1&647,2+." 3%&"

#!" 42.*325+"58"3%&"42+&*12V&,"@4*/C2,"023%"3%&"1&C*2+2+."@OG<"Y/@]^?-4*^"81*.C&+3/S"(2+*44FZ")L!?

##" -()"*+,") ?'()"0&1&"645+&,"2+"@OG<"EF"7/2+."*"$<"42.*/&S";2.*325+"@15,763/"6*44&,"@OG

#<" 023%" =Q" µ._C;" O*+*CF62+" K;YL?O*+RS" (4751&/6&+3" 6545+2&/" 0&1&" @71282&," EF" /31&*H2+." 5+"

#=" 81&/%";YL?O*+"@4*3&/Z"*+,"3%&"+&a3",*FZ"*"6545+F"0*/"2+5674*3&,"2+"42D72,";Y?O*+S"

#W" L83&1" 2/54*325+" 023%" c2*.&+" C2+2@1&@" H23Z" @OG

#>" O*+*CF62+" 6*//&33&" 6*112&,"EF" @OG

!" "

2 SI R. Gallet et al. >" #$%&" '&()*+,-." /0&" 1%*#&%," 2,&'" *3" (0*," ,&4$3'" 567" 8" 1819:;870)<=>8!?@A8!???" B$%" C?;"

!" D),&,-" )3'" 1819:;1!=!8>;E@8>;FF%&G" CFE" D),&,-" H" 4$3()*3&'" &*(0&%" )" I)#=J" $%" )" K)+J" A" %&,(%*4(*$3" ,*(&" )("(0&*%"LM" &N(%&#*(*&," C%&,(%*4(*$3" ,*(&," 3$(" 2,&'" *3" (0*," ,(2'O-P" B*B(O" D),&,"

;" 0$#$+$Q$2,"$B"(0&"!"#$%&'"()*+"+$42,")3'"(0&"!>"$%"!F"D),&,"0$#$+$Q$2,"($"(0&"(+)(&")(" L" (0&*%" AM" &N(%&#*(*&,." /0&" %&,2+(*3Q" 567" 1%$'24(," R&%&" &+&4(%$1$%)(&'" *3($" 7ST;L;?"

@" 4$#1&(&3("4&++,"4)%%O*3Q"(0&"(0&%#$,&3,*(*G&"1+),#*'"19:;@"C:

F" )3'"(%)3,B$%#&'"4&++,",1%&)'"$3"TI<89)3"C*342D)(*$3")("AFY6"*3"$%'&%"($"Q&("%*'"$B"19:;@-." ?" Z+2$%&,4&3("%&4$#D*3)3(,"R&%&",(%&)[&'"$3"TI<89)3"B$%"12%*B*4)(*$3P")3'"$3"TI<8<#1"CLX"

E" µQ\#+" <#1*4*+*3-" ($" 40&4[" B$%" (0&" +$,," $B" 19:;@." 567," R&%&" 4)%%*&'" $2(" $3" %&4$#D*3)3(" >X" 4+$3&,"R*(0"(R$",&(,"$B"1%*#&%,"CR*(0")"B$%R)%'"1%*#&%"1819:;8!?@A8!???",1&4*B*4"($"(0&"

>>" *3,&%(P"$%"1870)<8B$%",1&4*B*4"($"(0&"()*+"Q&3&P")3'"(0&"%&G&%,&"1%*#&%"70)<8>!@!8>!;A8%&G" >!" ,1&4*B*4" ($" (0&" ()*+" Q&3&-" ($" 40&4[" (0)("(0&" *3,&%(*$3" $B" 6Z589)3" )3'" ]Z589)3" 4),,&((&,"

>A" $442%%&'")("(0&"%*Q0("+$42,."

>;" /0&" &N4*,*$3" $B" (0&" [)3)#O4*3" %&,*,()34&" 4),,&((&P" 4$%%&,1$3'*3Q" ($" (0&" +),(" ,(&1" $B" (0*,"

>L" 4+$3*3QP" R)," 1&%B$%#&'" DO" (0&" &+&4(%$1$%)(*$3" )3'" *3'24(*$3" $B" (0&" 1+),#*'" 165!X" >@" C:

>F" 1+)(&,."/0&"B+*1),&"1%$'24(*$3"C)++$R*3Q"(0&"%&4$#D*3)(*$3"$B"(0&"Z7/",&^2&34&,",2%%$23'*3Q"

>?" (0&"9)3)#O4*3"4),,&((&-"R),"*3'24&'"DO",(%&)[*3Q"4$+$3*&,"B%$#"(0&"TI<8<#1"1+)(&,P"$3"TI" >E" 1+)(&," )(" ;!Y6." /$" 40&4[" B$%" 9)3)#O4*3" %&,*,()34&" 4),,&((&" %&#$G)+" )3'" +$,," $B" 165!XP"

!X" 4$+$3*&,"R&%&",(%&)[&'"$3"TI" 1&%B$%#&'"$3";L;?"6Z5"9)3K"<#1K")3'";L;?"]Z5"9)3K"<#1K"4+$3&,"),"R&++"),"$3";L;?P";L;?" K !!" 6Z589)3")3'";L;?"]Z589)3."I$(0"(0&"9)3 "),,$4*)(&'"R*(0")"B+2$%&,4&3("10&3$(O1&")3'"(0&" !A" ,*_&"$B"(0&"567"1%$'24(,"4$3B*%#&'"(0&"%&#$G)+"$B"(0&"[)3)#O4*3"4),,&((&")3'"(0&"*3,&%(*$3"

!;" $B"(0&"B+2$%&,4&3("Q&3&,."

!L" "

!@" !"#$%"&'%()%*+%'&

U)#&" K&^2&34&" 6$3,(%24(*$ 7&B&%&34&"

3",(&1"

58I,1=J8 /6/6!"#!$#//" /0*,",(2'O"

1<>8 <<66/

!" "

R. Gallet et al. 3 SI

#$%&'"()*" !"!"""++$+,,,-+$+-,,,++,+,,+,,$!./01"234546" 8" )("

78"29:5";"<4=>7"*54?*@A?@)8"4@?5" =*)D)?5*" =,/" 78"293AB;"',/" :=4?*53D" 78"=@8B;"C@2)4)D5"2@8E@8F"4@?5" ?H5" $&'" )*"

#&'" 78"F*558G"$&'%#&'"H)D)9)F):4"45I:58A5" A3445??5"

'J$937J -$-$,-$+,---,---+-,-,+--$,-$$,-+$$" '$C" /" -H@4"4?:EL"

#$%&'"*5K" 78"29:5";"$937"*54?*@A?@)8"4@?5" 78?*)E:A?@)

8" )(" 78"F*558G"$&'%#&'"H)D)9)F):4"45I:58A5" =*)D)?5*" " =,/"

:=4?*53D" ?H5" $&'" )*"

#&'" A3445??5"

=J=MNOJ +,+-$+,$++$+$-+$+$$,,$--+,-$+---,$$$+- '$C" P" -H@4"4?:EL" #$"%>/J --$,,-+$,$-+$-++$,+++,,+$,---,-$,+++-- $*53?@8F" 3"

PQR!J ,--+-$-$".QO"234546" 9@853*"

PQQQ"()*" 78"29:5";"S397"*54?*@A?@)8"4@?5" (*3FD58?"

()*" 78"*5E;"#$"%"H)D)9)F):4"45I:58A5" H)D)9)F): 78" F*558;" =MNOJ$&'" .)*" =MNOJ#&'6" H)D)9)F):4" 4"

45I:58A5" *5A)D2@83?@ )8"

=J ,$++,-$$,$$,$+$++$,,-+$++--+,-,+,++$, '$C" P" -H@4"4?:EL"

=MNO=P> -$+,,+,,+-$,,++$$+,-,-$$-$$--,+--$$-,-- $*53?@8F" 3"

PJ/O1RJ $$+".T1"234546" 9@853*" /OTT*5K" (*3FD58?" 78"29:5";"<3D>7"*54?*@A?@)8"4@?5" ()*" 78"*5E;"#$"%"H)D)9)F):4"45I:58A5"

!" " 4 SI R. Gallet et al. #$" %&''$(" )*+!,-./" 01&" )*+!,2./3" 415161%178" 415161%17

8'97'$:'! 8" &':15;<$=><

1$"

),"#$%,?1&! @A--A-B-"AA-B@@AA-A@@--! /-C" D" B4<8"8>7GH"

-4':E<$%" ?1&" -./F2./"

<$8'&><1$" <$" >4'" "#$%"

61:78"

),)*+!, @AA@-ABBBAB-A@@@BBABB@B-B-! /-C" D" B4<8"8>7GH"

IJKD, -4':E<$%" IJJJ"?1&" ?1&" -./F2./"

<$8'&><1$" <$" >4'" "#$%"

61:78"

"#$%, --A-@-B@@-B-AAAAB-@-! /-C" D" B4<8"8>7GH"

LIKI, -4':E<$%" LI!D,&'M! ?1&" -./F2./"

<$8'&><1$" <$" >4'" "#$%"

61:78"

L" "

I" ""

!" "

R. Gallet et al. 5 SI #" "

$" !"#$%&'()*+,-.$

%" "

=" &'("#")"*+,-./01,2.+".3"4-.5.,6-"&7#"048,-695",:6"';&".-"<;&"19886,,6"

!" &>9852/"4?@#()';&".-"4?7%$)<;&" " #"µA"

E" &-256-"&)B84C*)47#)<'D;&"3.-" " $"µA"

G" &-256-"&)'>9*)<'D;&"-6F" " " $"µA"

K" &30"#HI"J0336-"" " " " $"µA"

µ N" /LM&8" " " " " " #" A" #H" &30" " " " " " HO!"µA"

##" C$P" " " " " " ##O!"µA"

#$" &-.Q-95"R"CSJ-2/2T9,2.+"9,"!$U'"3.-"#"52+V"6>.+Q9,2.+"9,"G$U'"3.-"$52+V"$!"1S1>68O"

#%" "

#=" &'("$")"'-69,2+Q"9">2+69-"3-9Q56+,"3.-":.5.>.Q.08"-61.5J2+9,2.+"

#!" &>9852/"4WX=)';&"Y.-"4WX=)<;&Z" " $"µA"

#E" &-256-"4)4WX=)(:97C#)$KE%)$KKK"3.-" #"µA"

#G" &-256-"4WX=4$C$)#=NE)#=GG-6F" " #"µA"

#K" &:082.+"$I"598,6-52["" " " #H"µA"

#N" C$P" " " " " " E"µA"

$H" &-.Q-95"R"CSJ-2/2T9,2.+"9,"EHU'"3.-"#"52+V"6>.+Q9,2.+"9,"G$U'"3.-"%52+V"$!"1S1>68O"

$#" "

$$" &'("%")"1:61\2+Q"';&"9+/"<;&"19886,,6"2+86-,2.+8"9+/"W9+95S12+"19886,,6"-65.F9>"

$%" '.>.+S"2+.10>9,6/"2+",:6"&'("52[V"]2,:"9",..,:421\O"

$=" &-256-"3.-]9-/" " " " #"µA"

!" "

6 SI R. Gallet et al.

*" #$%&'$"$'('$)'" " " " *"µ+"

4" ,-."*/0"1233'$" " " " 4"µ+"

;" 56784" " " " " " *9:"µ+"

>" <=,#"" " " " " *"µ+"

:" ,-."?@8<)A-$"B'<" " " " /94"µ+"

!" C4D" " " " " " *;9;"µ+"

E" "

M" #$@6$-&"F"CG1$%<%H-A%@I"-A"!/J7"3@$"*"&%IK"'8@I6-A%@I"-A"E4J7"3@$";&%I"*:")'LK";/"LGL8')9"

N" "

*/" !""#$%&'"&()'*+,#%-.&'/&$'#""0$0#/%&'"&-#,#$%0'/&#-%012%#-&

**" 7O"P-$A%L8')"Q74RK"77"-I<"OO"<@218'A"P-$A%L8')"&-G"P@)'"-"P$@18'&"3@$"-LL2$-A'"-I-8G)%)"@3"

*4" &-$S'$" P$@P@$A%@I)9" T'" U-('" 3@2I<" AU-A" V)%I68'AW" -I<" V<@218'AW" P-$A%L8')" L-I" 1'" I%L'8G"

*;" )'P-$-A'<"@I"-"XY7",DX"QA%&'"@3"38%6UAR"Z"XY7"#'-S"P8@A9",U'"3$'.2'ILG"@3"OO"Q7*"$'6%@IRK"7O" *>" Q74"$'6%@IRK"-I<"77"Q7>"$'6%@IR"L-I"AU2)"1'"')A%&-A'<9"[3"L'88)"\'$'"-))@L%-A'<"$-I<@&8G"%I"-"

*:" <@218'AK"\U'I"7"-I<"O"-$'"%I%A%-88G"%IA$@<2L'"-A"-"$-A%@"@3"*F*K"\'"\@28<"']P'LA"A@"U-('"7*"^" *!" /94:K"74"^/9:"-I<"7;"^/94:"\%AU%I"<@218'A)9"5@)A"@3"AU'"A%&'K"\U-A"\'"@1)'$('"%)"7*"^"/9;;K"

*E" 74"^"/9;;"-I<"7;"^"/9;;9",U%)"&'-I)"AU-A"L'88)"-$'"I@A"-))@L%-A'<"$-I<@&8G"%I"-"<@218'AK"12A" *M" AU-A"AU'G"U-('"&@$'"LU-IL'"A@"1'"\%AU"-"L'88"@3"AU'" )-&'" L@8@$9" D2$" UGP@AU')%)" 3@$" )2LU" *N" @1)'$(-A%@I" %)" AU-A" \U'I" 'IA'$%I6" )A-A%@I-$G" PU-)'K" -" 3$-LA%@I" @3" AU'" <-26UA'$" L'88"

4/" P@P28-A%@I" &%6UA" )A-G" -AA-LU'<" A@" '-LU" @AU'$9" [I" B-I6" 'A" -8" 4//;K" AU'G" @1)'$('<" AU-A" 4*" V!"#$"%&'( )*( $+&( ,-.$&%/-( 0)'$( -**&.$&1( ,2( 345( &6+/,/$&1( -( 7%)7)%$/)8( )*( &#)89-$&1( .&##':(

44" ;+/.+('"99&'$'($+-$(345(7%)1".$/)8(.)"#1(/8$&%*&%&(;/$+(.&##(1/

4>" =@\K"AU'".2')A%@I"%)"A@"'(-82-A'"AU'"1%-)"%IA$@<2L'<"1G"%6I@$%I6"<@218'A)9"T'"L-I"%IA$@<2L'"

4:" )@&'"I@A-A%@I"A@"1'")P'L%3%L9",U'$'"-$'")'('$-8"2ISI@\I)F"AU'"3$-LA%@I"@3"<@218'A)"Q'%AU'$" 4!" 77K"OO"@$"7OK"@I8G"AU'"8-AA'$"1'%I6"&'-)2$'<"%I"AU'"74"$'6%@IR9"+'A_)"<'I@A'"%A"φ9",U'I"AU'$'"%)"

4E" AU'"3$'.2'ILG"@3"7"1-LA'$%-"Q7R9"+-)AK" \'"L-I"L@I)%<'$"AU-A"<@218'A)"&-G"I@A"3@$&"1G"AU'"

4M" $-I<@&" -))@L%-A%@I" @3" A\@" 1-LA'$%-" Q'969" A\@" 7" 1-LA'$%-" &-G" 1'" &@$'" 8%S'8G" A@" 3@$&" -"

!" " R. Gallet et al. 7 SI ;" #$%&'()*")+,-",".,/0"$1"2",-#"3"&,4)(0/,5",*"(6.',/-(#",&$7(89":("4,-"/-)0$#%4(","#(.,0)%0("

C" 10$<"0,-#$<",**$4/,)/$-"!"=,>/-")$","#(.,0)%0("10$<"?,0#@":(/-&(0A".0$.$0)/$-89"B)"/*")+(-" F" *)0,/A+)1$0D,0#")$"4$<.%)(")+("10(E%(-4@"$1"(,4+")@.("$1".,0)/4'("=25"35"225"335"2389":("

I" <(,*%0("10(E%(-4@"&@")+("0,)/$"=2G228H=2G22G3G3385"D+/4+"/-)0$#%4(*","&/,*"10$<")+(")0%(""" J" 7,'%("(E%,'")$""

K" !"# ! $%"#!&%&"# ! '&%()*"(%+9"

!" L+/*"&/,*"4,-"&("(6.0(**(#"/-")(0<*"$1"ε5")+(".0$.$0)/$-"$1")+("2C".$.%',)/$-"=#(1/-(#",*"

M" 23H=2G22G23G33G389"B)"/*"*/<.'@"

!,"& ! #-'%"

O" D+/4+"/*"7(0@"*<,''5"(*.(4/,''@"D+(-"""/*"4'$*(")$"N9"B-"$%0"4,*(5",7(0,A("/-/)/,'"10(E%(-4@"/*" ;P" P9IOFQP9P;J",-#","R";S5"D+/4+"4$00(*.$-#*")$","&/,*"4'$*(")$";PTI9"L+("&/,*"<,#("$-")+("

.& ;;" 10(E%(-4@" 4+,-A(5" 5"#%0/-A")+("4$<.()/)/$-"/*"#/11(0(-)",-#"/)"/*")+/*" &/,*" )+,)" /*" <$*)" ;C" 0('(7,-)")$"(*)/<,)/-A")+("*)0(-A)+"$1"*('(4)/$-9"U$''$D/-A")+("*,<(",..0$,4+",-#",**%

;F" $",-#"(",0(",..0$6/<,)('@"4$-*),-)"#%0/-A")+("4$<.()/)/$-5"/)"/*""

# .&"# ! $%(" '

;I" "

;J" B-"$)+(0"D$0#*")+("&/,*"$-")+("10(E%(-4@"4+,-A("/*".0$.$0)/$-,'")$"/)*('19"V6.0(**/-A")+/*"/-"

;K" )(0<*" $1" ," ,-#" )+(" /-)(-*/)@" $1" *('(4)/$-" .(0" A(-(0,)/$-" =D/)+" .& / 0121&385" D(" $&),/-" ;!" 20,&341"L+("&/,*"$-"*('(4)/$-"4$(11/4/(-)"/*")+(0(1$0("(E%,'")$"0,9"L+%*5"/)"/*",'D,@*"7(0@"*<,''"

;M" 4$<.,0(#")$"#5".0$7/#(#")+("10,4)/$-"$1"2C".$.%',)/$-"0(<,/-*"RR;9"2$00(4)/-A"1$0"#$%&'()*" ;O" D$%'#"&(".0$&,&'@"-(4(**,0@"D+(-",1/*"A0(,)(0")+,-","1(D"S"=D+/4+5"/-"$%0"(6.(0/<(-)*5"/)"/*"

CP" -$)89" W-#(0" )+$*(" 4/04%<*),-4(*5" /)" D$%'#" &(" /<.$0),-)" )$" +,7(" ," 4'(,0" (6.(0/<(-),'"

C;" %-#(0*),-#/-A"$1")+("$0/A/-"$1"#$%&'()*".,0)/4'(*"=,-#"<(,*%0(*"$1"$",-#"(",*"D("#/#89"X,*)5" CC" /)" /*" /<.$0),-)" )$" %-#(0'/-(" )+,)" )+(*(" *'/A+)" &/,*(*" #$" -$)" /-)0$#%4(" (00$0*5" *$" )+,)"

CF" (*)/<,)/$-*"$1"7,0/,-4(*"0(<,/-"%-,')(0(#9""

CI" !"#$%&'()'*+',(-('.+,+)/%/$0'"1&$&/(,2'*13/,4'&5+'$22$6'

CJ" "

!" "

8 SI R. Gallet et al. 7" #$%$&'(')*"+,-)-'.%/"/0.,*1"0)2$")%"'+3*),/'4*5"0'60"/$*$(-'.%"(.$&&'('$%-"-."'+3)(-"&'-%$//"

=" +$)/,8$/"+)1$"'%".,8"$93$8'+$%-/:";0$"(.+3,-)-'.%"'/"1'&&$8$%-"'&"<$"(.%/'1$8"4$-<$$%" A" <$$>"2)8')%($".8"<'-0'%"<$$>"?4$-<$$%"8$3*'()-$/@"2)8')%($:"

B" !"#$%&'&(")*+,&-,,)*-,,./"

F F" ;0$" 38$C(,*-,8$/" /-)8-" &8.+" )" &8.D$%" 6*5($8.*" /-.(>" <'-0" )" /)+3*$" /'D$" .&" !"#$" )8.,%1" 7E :"

I" G+3.8-)%-*5H" )**" 8$3*'()-$/" <'-0'%" )" <$$>" ,/$"-0$" /)+$" 38$C(,*-,8$" )%1" .%*5" $93$8'+$%-/" J" 3$8&.8+$1" '%" 1'&&$8$%-" <$$>/" ,/$" 1'&&$8$%-" 38$C(,*-,8$/" ?%&'&" 1'&&$8$%-" /,4/)+3*$/" .&" -0$"

!" /)+$" 6*5($8.*" /-.(>@:" K$2$%-$$%" 6$%$8)-'.%/" *)-$8" ?)8.,%1" 7E" 6$%$8)-'.%/" 1,8'%6" 38$C L" (,*-,8$H" )%1" /$2$%" +.8$" 1,8'%6" )" /$(.%1" 8.,%1" .&" 38$C(,*-,8$" 8$+.2'%6" )%5" -8)($" .&"

7E" 6*5($8.*@H" -0$" (.+3$-'-'.%" /-)8-/" &.8" I:I" +.8$" 6$%$8)-'.%/:" M,8'%6" -0$" (.+3$-'-'.%H" -0$"

77" 38$/$%($" .&" )" 4$%$&'(')*" +,-)-'.%" <'-0" /$*$(-'.%" (.$&&'('$%-" ('" +)5" 1'/-.8-" -0$" &8$N,$%(5" 7=" (0)%6$" +$)/,8$1" '%" .,8" $93$8'+$%-:" G&" (.+3$-'-'.%/" )*<)5/" /-)8-" <'-0" -0'/" 4$%$&'(')*"

7A" +,-)-'.%H"*'--*$"2)8')%($")+.%6"(.+3$-'-'.%/"<'**"4$"6$%$8)-$1:"O%"-0$"(.%-8)85H"2)8')%($"'%" 7B" (")+.%6"<$$>/"<'**".((,8"'&"-0$"+,-)-'.%"'/".%*5"38$/$%-"'%"/.+$"<$$>/:"P$%($H"-0$"<.8/-"

7F" ()/$" /'-,)-'.%" .((,8/" <0$%" -0$8$" '/" )8.,%1" Q" (0)%($/"-."/)+3*$"-0$"4$%$&'(')*"+,-)-'.%"

7I" &8.+"-0$"6*5($8.*:"R.+3,-'%6"-0'/"38.4)4'*'-5"&8.+")"S.'//.%"1'/-8'4,-'.%H"<$"&'%1"-0)-"-0$" 7J" <.8/-"()/$"'/")"&8$N,$%(5"'%"-0$"6*5($8.*".&")8.,%1""

7!" "

7L" ) TU.6V=WX*! :" "#$ "#$

=E" "

=7" K-)8-'%6"&8.+"-0'/"'%'-')*"&8$N,$%(5H"-0$"4$%$&'(')*"+,-)-'.%"<'**"$%1",3")-"-0$"$%1".&"-0$" ==" 38$(,*-,8$")&-$8"7J"6$%$8)-'.%/")-")"&8$N,$%(5""

=A" "

=B" )+"T")"#$"Y93V('"7JW"

=F" "

=I" ?&8.+"*.6'/-'("68.<-0@:"U$-Z/")//,+$"-0)-"-0$"4$%$&'(')*"+,-)-'.%".((,88$1"'%")"[\S"($**:"G%"

=J" .,8" $93$8'+$%-/H" )$,)" ]-0$" &8$N,$%(5" .&" -0$" [\S" ($**/^'/" )*<)5/" (*./$" -." Q:" M,8'%6" -0$"

!" "

R. Gallet et al. 9 SI

9" #$%&'()()$*")(+',-."(/'"0'*'-)#)1,"%2(1()$*"3),,"#12+'"1*"'4(51"#/1*6'"∆!"")*"-5'72'*#8"$-"(/'" ?" :;<"01#=65$2*>"

@" "

F" ∆!""A"#"!$B9C&%&!DE"

G" "

I" H/'"0)1+"&'5"6'*'51()$*")*"(/'"'+()%1('>"+','#()$*"#$'--)#)'*(")+"(/'*"

J" "

L" #'()#"A"∆!""K"B!%&!B9C!%&!D"IEIDE"

!" "

9O" M$,N)*6" -$5" (/'" +','#()$*" #$'--)#)'*(" #"" #12+)*6" 1" 0)1+" )*" +','#()$*" 1(" ,'1+(" 1+" ,156'" 1+" $25" CF 99" &5'#)+)$*"B(*"*"$-"(/'"$5>'5"$-"#'()#"A"?E9O D."3'"$0(1)*"#""A"OE@@E"P'"15'"2*1315'"$-"1>1&()N'"

9?" #/1*6'+"1+"/)6/"1+"(/)+")*"(/'"'#$,$6)#1,"+)(21()$*"$-"(/'"65$3(/"'*N)5$*%'*(+"3'"2+'>E"Q*" 9@" &15()#2,15."*$*'"/1+"0''*"-$2*>")*"(/'"R'*+=)"'4&'5)%'*(E"S$5'$N'5."$25"+(51)*"/1+"1,5'1>8"

9F" 'N$,N'>"9O.OOO"6'*'51()$*+."3/)#/")+"(/'"&'5)$>"3/'5'"(/'"+(5$*6'+("1>1&()N'"#/1*6'+"/1N'"

9G" 1,5'1>8" 0''*" -)4'>" BT155)#=" '(" 1,E." ?OO!DE" Q*>''>." (/'" -)(*'++" $-" (/'" 5','N1*(" &$&2,1()$*" 9I" )*#5'1+'>"08",'++"(/1*"@OU")*"?O.OOO"6'*'51()$*+"-$,,$3)*6"(/'"()%'"&$)*("1("3/)#/"VWRFGFL"

9J" 31+")+$,1('>"BT155)#="'("1,E."?OO!DE""

9L" !"#$%&'&(")*+(&,()*+--./"

9!" W1#/" 5'&,)#1('>" #$%&'()()$*" 3)(/)*" 1" 3''=" +(15(+" -5$%" (/'" +1%'" &5'#2,(25'" 1*>" 1" ,156'" ?O" )*$#2,2%"B9OI"#',,+DE"H/'"3$5+("#1+'")+"3/'*"(/'"0'*'-)#)1,"%2(1()$*"$##25+"'15,8")*"/1,-"$-" G ?9" (/'"#$%&'()()$*+E"M)*#'"3'"+1%&,'"+#,)-,"A"GE9O "#',,+"$-"1"6)N'*"#$%&'()($5.")("%'1*+"(/1(""

??" "

?@" !#,)-,AR$6X?YK.+#,)-,.

?F" .

!" "

10 SI R. Gallet et al.

!# $%&'(# )*+# +,-.)# %-/+# .0/12)-)&0'# -%# -304+5# +,.+1)# )*-)# )*+# 3+'+6&.&-7# *-%# 0'78# 9:9#

A# (+'+;-)&0'# )0# .*-'(+# &'# 6;+<2+'.85# =+# 03)-&'# !"# ># ?@95# =*&.*# &%# .7+-;78# 2';+-7&%)&.:# G# B/10;)-')785# =+# 03%+;4+# &'67-)+C# 4-;D!E#=&)*&'#=++F%#06#)*+#%-/+#0;C+;#06#/-('&)2C+#-%#

?# 3+)=++'#=++F%5#%0#)*-)#&)#&%#<2&)+#.7+-;#)*-)#-C-1)&4+# .*-'(+%# -;+# 2'-37+# )0# +,17-&'# 02;# H# ;+%27)%:#

!"# # R. Gallet et al. 11 SI File S2

Supporting Data

File S2 is available for download at http://www.genetics.org/content/suppl/2011/10/31/genetics.111.133454.DC1 as a compressed

folder.

12 SI R. Gallet et al.