Dynamics of Adaptation in Sexual and Asexual Populations
Joachim Krug
Institut für Theoretische Physik, Universität zu Köln

• Motivation and basic concepts
• The speed of evolution in large asexual populations
• Epistasis and recombination
Joint work with Su-Chan Park and Arjan de Visser [PNAS 104, 18135 (2007); arXiv:0807.3002]

"I have deeply regretted that I did not proceed far enough at least to understand something of the great leading principles of mathematics, for men thus endowed seem to have an extra sense."
The Autobiography of Charles Darwin

The modern synthesis
R.A. Fisher, J.B.S. Haldane, S. Wright

"The evolutionary process is concerned, not with individuals, but with the species, an intricate network of living matter, physically continuous in space-time, and with modes of response to external conditions which it appears can be related to the genetics of individuals only as a statistical consequence of the latter."
Sewall Wright (1931)

"It is not generally realized that genetics has finally solved the age-old problem of the reason for the existence (i.e., the function) of sexuality and sex, and that only geneticists can properly answer the question, 'Is sex necessary?'"
H.J. Muller (1932)

The problem of sex
Sex is costly:
• Two-fold cost of males (Maynard Smith, 1971)
• Cost of finding and courting a mate
Nevertheless sexual reproduction is ubiquitous in plants and animals
⇒ What evolutionary forces are responsible for the maintenance of sex?
Genetic mechanisms:
• Sex helps to eliminate deleterious mutations (Muller's ratchet)
• Sex speeds up the establishment of beneficial mutations (Muller-Fisher effect/clonal interference)

The Muller-Fisher hypothesis for the advantage of sex
J.F. Crow & M. Kimura, Am. Nat. 99, 439 (1965)
• Dynamics of an adapting population: periodic selection
• Clonal interference slows down the adaptation of asexual populations
• How to quantify this picture?
The speed of evolution in large asexual populations

The Wright-Fisher model
[Figure: schematic of the Wright-Fisher model along a time axis, with mutation events labeled U]
• Constant population size $N$, discrete non-overlapping generations
• Each individual chooses an ancestor from the preceding generation
• Individual $i$ is chosen with probability $\sim w_i$ (Wrightian fitness)
• Mutations occur with probability $U$ per individual and generation

Fixation
• $U = 0$: When a single mutant of fitness $w'$ is introduced into a genetically homogeneous population of fitness $w$, the outcome for $t \to \infty$ is either fixation (all $w'$) or loss of the mutation (all $w$)
• Fixation probability for the Wright-Fisher model (Kimura, 1962):
  $\pi_N(s) \approx \frac{1 - e^{-2s}}{1 - e^{-2Ns}}$, with selection coefficient $s = \frac{w'}{w} - 1$
• Under strong selection ($N|s| \gg 1$), deleterious mutations ($s < 0$) cannot fix, while beneficial mutations ($s > 0$) fix with probability $\pi(s) = 1 - e^{-2s} \approx 2s$ for $s \ll 1$
• Mean time to fixation of a beneficial mutation: $t_{\mathrm{fix}} \approx \frac{\ln N}{s}$

Model I: Single selection coefficient
• Infinitely many sites approximation: Each mutation creates a new genotype, no recurrent mutations
• Multiplicative fitness: Fitness of offspring $w'$ related to parental fitness $w$ by $w \to w' = w(1 + s)$
• Beneficial mutations occur with probability $U_b$, and all have the same selection coefficient $s = s_0 > 0$
• Time between fixed beneficial mutations: $t_{\mathrm{mut}} = \frac{1}{2 s_0 U_b N}$
• $s_0 < 0$ corresponds to Muller's ratchet

Periodic selection vs. clonal interference
• Periodic selection requires $t_{\mathrm{fix}} \ll t_{\mathrm{mut}}$
[Figure: population fraction (0 to 1) versus $t$, with $t_{\mathrm{fix}}$ and $t_{\mathrm{mut}}$ marked]
• Rate of adaptation in the periodic selection regime:
  $R = \lim_{t \to \infty} \frac{\ln \langle w \rangle}{t} \approx \frac{s_0}{t_{\mathrm{mut}}} = 2 s_0^2 N U_b$
• Beneficial mutations interfere when $t_{\mathrm{fix}} \gg t_{\mathrm{mut}}$, or $2 N U_b \ln N \gg 1$
• Clonal interference is inevitable for large $N$ [provided $U_b$ is constant]

Rate of adaptation in the clonal interference regime
• Crow & Kimura 1965: Each fixed beneficial mutation must occur in the offspring of the preceding one
• Number of offspring of a successful mutant arising at time $t = 0$:
  $N_{\mathrm{off}}(t) = \int_0^t d\tau \, s_0^{-1} \exp(s_0 \tau) \approx s_0^{-2} \exp(s_0 t)$ for $s_0 t \gg 1$
• Time $t_s$ between successful mutants determined by $2 s_0 U_b N_{\mathrm{off}}(t_s) = 1$
  ⇒ $R = \frac{s_0}{t_s} = \frac{s_0^2}{\ln(s_0 / 2U_b)}$
• Fluctuations lead to a logarithmic speedup (M.M. Desai & D.S. Fisher 2007):
  $R \approx \frac{2 s_0^2 \ln N}{\ln^2(s_0 / 2U_b)}$

Model II: Distribution of selection coefficients
• Extremal statistics arguments suggest that the distribution of selection coefficients for beneficial mutations is exponential (H.A. Orr 2003):
  $P_b(s) = s_b^{-1} e^{-s/s_b}$, $s > 0$
• Probability distribution of contending mutations destined for fixation:
  $P_c(s) \sim s P_b(s) \sim s \, e^{-s/s_b}$
[Figure: $P(s)$ versus $s/s_b$ for all mutations and for contending mutations]

The Gerrish-Lenski theory of clonal interference
P.J. Gerrish, R.E. Lenski, Genetica 102/103, 127 (1998)
• Key idea: A mutation $s$ survives clonal competition if no superior mutation $s' > s$ arises during the time to fixation of $s$.
• The survival probability is $\exp[-\lambda(s)]$ with
  $\lambda(s) = N U_b \, t_{\mathrm{fix}} \int_s^\infty ds' \, \pi(s') P_b(s') = \frac{N \ln N \, U_b}{s} \int_s^\infty ds' \, \pi(s') \, s_b^{-1} \exp[-s'/s_b]$
• GL theory does not (explicitly) account for the complex interaction of different clones. In particular, the possibility of beneficial mutations arising within a growing clone (multiple mutations) is ignored.
• Qualitative predictions: Clonal interference reduces the rate of substitution $\Gamma = 1/t_s$ but increases the mean selection coefficient of fixed mutations $\langle s \rangle$.
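
The Gerrish-Lenski survival probability can be evaluated numerically. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, the integral is done with a plain trapezoid rule, and the parameter values in the usage lines are illustrative only. The closed form in `gl_lambda_weak` assumes $\pi(s') \approx 2s'$, for which the integral evaluates to $2(s + s_b)e^{-s/s_b}$.

```python
import math

def fixation_probability(s, N):
    """Kimura's Wright-Fisher fixation probability for a single mutant.

    Intended for s >= 0 here; very negative N*s would overflow expm1.
    """
    if s == 0.0:
        return 1.0 / N  # neutral limit
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * N * s)

def gl_lambda(s, N, U_b, s_b, n_steps=20000, cutoff=40.0):
    """lambda(s) = N U_b t_fix(s) * integral_s^inf pi(s') P_b(s') ds'.

    P_b is the exponential distribution with mean s_b; the integral is
    truncated at s + cutoff * s_b (an assumption of this sketch).
    """
    t_fix = math.log(N) / s
    upper = s + cutoff * s_b
    h = (upper - s) / n_steps

    def integrand(x):
        return fixation_probability(x, N) * math.exp(-x / s_b) / s_b

    total = 0.5 * (integrand(s) + integrand(upper))
    for k in range(1, n_steps):
        total += integrand(s + k * h)
    return N * U_b * t_fix * total * h

def gl_lambda_weak(s, N, U_b, s_b):
    """Closed form of lambda(s) when pi(s') is replaced by 2 s'."""
    return 2.0 * N * U_b * math.log(N) * (s + s_b) * math.exp(-s / s_b) / s

def survival_probability(s, N, U_b, s_b):
    """Probability that no superior mutation arises while s is fixing."""
    return math.exp(-gl_lambda(s, N, U_b, s_b))

if __name__ == "__main__":
    N, U_b, s_b = 1e6, 1e-6, 0.02  # illustrative values
    for s in (0.02, 0.05, 0.1):
        print(s, gl_lambda(s, N, U_b, s_b), survival_probability(s, N, U_b, s_b))
```

Since $\lambda(s)$ decreases with $s$, mutations of larger effect are more likely to survive clonal competition, which is the origin of the increase in $\langle s \rangle$ noted in the last bullet.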
Distribution of fixed mutations: Simulation [$U_b = 10^{-6}$, $s_b = 0.02$]
[Figure: $P_{\mathrm{fix}}(s)$ versus $s$ for $N = 10^3$ to $10^9$, with a curve labeled P.S.]

Distribution of fixed mutations: Experiments (E. coli)
L. Perfeito et al., Science 317, 813 (2007)
[Figure: panel A: $N = 2 \times 10^4$; panel B: $N = 10^7$]

GL-theory: Asymptotics for large N
C.O. Wilke (2004); S.C. Park & JK (2007)
• Rate of substitution ($\gamma \approx 0.577215\ldots$ is Euler's constant):
  $\Gamma \approx \frac{s_b}{\ln N} [\ln(U_b N \ln N) + \gamma - 1] \to s_b$
• Mean selection coefficient of fixed mutations:
  $\langle s \rangle \approx s_b [\ln(U_b N \ln N) + \gamma]$
• Rate of adaptation:
  $R = \Gamma \langle s \rangle \to s_b^2 \ln(U_b N \ln N)$
• Logarithmic dependence on the mutation supply $N U_b$

Extremal statistics estimates
• Probability to find a selection coefficient larger than $S$:
  $\mathrm{Prob}[s > S] = \int_S^\infty P_b(s) \, ds = e^{-S/s_b}$
• The largest selection coefficient $s_{\max}$ in $t$ generations is determined by
  $\mathrm{Prob}[s > s_{\max}] = \frac{1}{N U_b t} \Leftrightarrow s_{\max} = s_b \ln(N U_b t)$
• Self-consistency requires that $t = t_{\mathrm{fix}}(s_{\max}) = \ln N / s_{\max}$
  ⇒ $s_{\max} = s_b \ln(N U_b \ln N / s_{\max})$ ⇒ $s_{\max} = \langle s \rangle \approx s_b \ln(N U_b \ln N)$
• Rate of substitution:
  $\Gamma = \frac{1}{t_{\mathrm{fix}}(s_{\max})} = \frac{s_{\max}}{\ln N} \approx \frac{s_b}{\ln N} \ln(N U_b \ln N)$

Other mutation distributions
• Extremal statistics for $P_b(s) \sim \exp[-(s/s_b)^\beta]$ yields
  $s_{\max} \sim s_b (\ln N)^{1/\beta}$, $\Gamma \sim s_b (\ln N)^{1/\beta - 1}$, $R \sim s_b^2 (\ln N)^{2/\beta - 1}$
• Compare to behavior for mutations of single strength $P_b(s) = \delta(s - s_0)$:
  $R \approx \frac{2 s_0^2 \ln N}{\ln^2(U_b / s_0)} \sim s_0^2 \ln N$
• Adaptation driven by
  (i) single mutations of large effect for $\beta < 1$
  (ii) multiple mutations of average effect for $\beta > 1$
• The "standard case" $\beta = 1$ is marginal

GL-theory vs. simulations: Rate of adaptation
[Figure: $\ln \bar{w} / t$ versus $N$ ($10^3$ to $10^9$); simulation compared with GL (Wilke)]

GL-theory vs. simulations: Selection coefficient of fixed mutations
[Figure: $E[s]$ versus $N$ ($10^3$ to $10^9$); simulation compared with GL at large $N$]

Breakdown of GL-theory at very large N
[Figure: $\ln \bar{w}$ versus $t$ for $N = 10^3$ to $10^9$ and $N = \infty$, with a curve labeled Eq. (17)]
• Top curve: Exact solution of the infinite population limit
• For large $N$ the process is dominated by multiple mutations ⇒ temporal substructure of the substitution process, $\Gamma \to 1 \gg s_b$

Rate of substitution
[Figure: $E[k]$ versus $N$ ($10^3$ to $10^9$); curves labeled GL, Fixation, Origination, and $s_b$]

Epistasis and recombination

Epistasis
• So far: Fitness effects of different beneficial mutations are independent
• Epistasis implies interactions between the effects of different mutations
• General setting: Genome of $L$ binary loci (sites) $i = 1, \ldots, L$ at which a mutation can be present ($\sigma_i = 1$) or absent ($\sigma_i = 0$)
• A fitness landscape is a function $w(\sigma)$ on the set of $2^L$ genotype sequences $\sigma = (\sigma_1, \ldots, \sigma_L)$
• In the absence of epistatic interactions $w(\sigma) = \prod_{i=1}^{L} \omega_i(\sigma_i)$
• What is the effect of epistasis on the speed of adaptation?
• How epistatic are real fitness landscapes?

A two-locus system with recombination
S.C. Park, JK, in preparation
[Diagram: genotypes 00, 01, 10, 11 connected by mutations at rate $\mu$; $w(00) = 1$, $w(10) = w(01) = 1 - t - s$, $w(11) = 1 - t$]
• Infinite population dynamics for $X_0 = P(00)$, $X_1 = P(10) = P(01)$, $X_2 = P(11)$:
  $\bar{w} X_0' = (1 - \mu)^2 w_{00} X_0 + 2\mu(1 - \mu) w_{10} X_1 + \mu^2 w_{11} X_2 - (r/2\bar{w})(1 - 2\mu)^2 D$
  $\bar{w} X_1' = [1 - 2\mu(1 - \mu)] w_{10} X_1 + \mu(1 - \mu)[w_{00} X_0 + w_{11} X_2] + (r/2\bar{w})(1 - 2\mu)^2 D$
  $\bar{w} X_2' = (1 - \mu)^2 w_{11} X_2 + 2\mu(1 - \mu) w_{10} X_1 + \mu^2 w_{00} X_0 - (r/2\bar{w})(1 - 2\mu)^2 D$
  $\bar{w} = w_{00} X_0 + 2 w_{10} X_1 + w_{11} X_2$ (mean fitness)
  $D = w_{00} w_{11} X_0 X_2 - w_{10}^2 X_1^2$ (linkage disequilibrium)

Types of epistasis
[Figure: $\ln w$ for genotypes 00, 10, 01, 11, illustrating negative, null, positive, and sign epistasis]
• Epistasis measure: $\varepsilon = w_{00} w_{11} - w_{10} w_{01}$
• Follow adaptation from low fitness genotype 11 to global optimum 00

Effect of recombination on the speed of adaptation
No epistasis (t=0.5, s=-0.207106...)
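
The two-locus recursion for $X_0$, $X_1$, $X_2$ can be iterated directly to follow the mean fitness under different recombination rates $r$. The sketch below is an illustration under stated assumptions, not the authors' code: the mutation rate $\mu = 10^{-3}$ is a guess (the slides do not state the value used), the population starts at genotype 11, the $X_2$ update is taken to include the double-mutation term $\mu^2 w_{00} X_0$ so that $X_0 + 2X_1 + X_2 = 1$ is preserved, and all function names are hypothetical. For the no-epistasis case the code uses $s = 1/2 - 1/\sqrt{2}$, the value for which $w_{00} w_{11} = w_{10}^2$ at $t = 0.5$.

```python
import math

def step(X, w, mu, r):
    """Advance (X0, X1, X2) by one generation of the two-locus recursion."""
    X0, X1, X2 = X
    w00, w10, w11 = w
    wbar = w00 * X0 + 2.0 * w10 * X1 + w11 * X2        # mean fitness
    D = w00 * w11 * X0 * X2 - (w10 * X1) ** 2          # linkage disequilibrium
    rec = (r / (2.0 * wbar)) * (1.0 - 2.0 * mu) ** 2 * D
    a = (1.0 - mu) ** 2        # no mutation at either locus
    b = mu * (1.0 - mu)        # mutation at exactly one given locus
    c = mu ** 2                # mutation at both loci
    # Assumption: the X2 update carries the symmetric term c * w00 * X0.
    Y0 = (a * w00 * X0 + 2.0 * b * w10 * X1 + c * w11 * X2 - rec) / wbar
    Y1 = ((1.0 - 2.0 * b) * w10 * X1 + b * (w00 * X0 + w11 * X2) + rec) / wbar
    Y2 = (a * w11 * X2 + 2.0 * b * w10 * X1 + c * w00 * X0 - rec) / wbar
    return (Y0, Y1, Y2)

def mean_fitness(X, w):
    """Mean fitness; X1 counts once for each of the genotypes 10 and 01."""
    return w[0] * X[0] + 2.0 * w[1] * X[1] + w[2] * X[2]

def trajectory(t, s, r, mu=1e-3, generations=50):
    """Mean fitness per generation, starting from a pure 11 population."""
    w = (1.0, 1.0 - t - s, 1.0 - t)   # (w00, w10 = w01, w11)
    X = (0.0, 0.0, 1.0)
    out = [mean_fitness(X, w)]
    for _ in range(generations):
        X = step(X, w, mu, r)
        out.append(mean_fitness(X, w))
    return out

if __name__ == "__main__":
    s_null = 0.5 - math.sqrt(0.5)     # no epistasis at t = 0.5
    for r in (0.0, 0.4, 1.0):
        print("r =", r, "final mean fitness:", trajectory(0.5, s_null, r)[-1])
```

Calling `trajectory` with the other parameter pairs from the slides, for example `trajectory(0.5, -0.4, r, generations=100)` or `trajectory(0.2, 0.2, r, generations=1000)`, gives the corresponding curves for negative epistasis and for the fitness valley, subject to the assumed value of $\mu$.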
[Figure: fitness versus generations (0 to 50) for r = 0, 0.4, 1]
• In the absence of epistasis ($w_{00} w_{11} = w_{10} w_{01}$) recombination has no effect on infinite populations

Effect of recombination on the speed of adaptation
Negative epistasis (t=0.5, s=-0.4)
[Figure: fitness versus generations (0 to 100) for r = 0, 0.1, 1]
• Negative epistasis ($w_{00} w_{11} < w_{10} w_{01}$) speeds up adaptation by generating negative linkage disequilibrium ($D < 0$)

Effect of recombination on the speed of adaptation
Positive epistasis (t=0.5, s=-0.1)
[Figure: fitness versus generations (0 to 50) for r = 0, 0.4, 1]
• Positive epistasis ($w_{00} w_{11} > w_{10} w_{01}$) slows down adaptation by generating positive linkage disequilibrium ($D > 0$)

Bistability
Fitness valley (t=0.2, s=0.2)
[Figure: fitness versus generations (0 to 1000) for r = 0, 0.2, 0.4, 1]
• In the presence of sign epistasis ($w_{10} < \min[w_{11}, w_{00}]$) the population remains at the initial genotype 11 forever if $r > 2t$ and $2\mu < s/(1 - t)$

A fitness landscape for Aspergillus niger
J.A.G.M.
