Regimes of asexual evolution in rugged fitness landscapes
Macroevolution and microevolution
The adaptive landscape
Sequence space, fitness and Fisher-Wright sampling
Regimes of asexual population dynamics
Evolutionary trajectories in the quasispecies model
Joint work with Kavita Jain JSTAT (2005) P04008, q-bio.PE/0606025, q-bio.PE/0508008 Supported by DFG within SFB/TR 12 “Symmetries and Universality in Mesoscopic Systems” & SFB 680 “Molecular Basis of Evolutionary Innovations” Macroevolution: Episodic patterns in the fossil record
60
Number of extinct families of known marine organisms (Sepkoski, 1992) 40
Extinction rate is intermittent with power 20 law event size distribution extinction per My (families)
Decline of the average 0 600 400 200 0
extinction rate age (My) non-stationary dynamics Newman & Eble (1999) Microevolution: Adaptation of viral populations
Fitness decline induced in populations of bacteriophage φ6 by successive population bottlenecks
Fitness recovery through increasing population size
Step-like fitness changes reflect single deleterious or beneficial (compensatory) mutations
Evidence for ruggedness of the fitness landscape
C.L. Burch, L. Chao, Genetics 151, 921 (1999) Microevolution: Adaptation of viral populations
Fitness decline induced in populations of bacteriophage φ6 by successive population bottlenecks
Fitness recovery through increasing population size
Step-like fitness changes reflect single deleterious or beneficial (compensatory) mutations
Evidence for ruggedness of the fitness landscape
C.L. Burch, L. Chao, Genetics 151, 921 (1999) Microevolution: Adaptation of bacterial populations
S.F. Elena, R.E. Lenski, Nature Reviews Genetics 4, 457 (2003)
12 populations of E. coli propagated in identical environments 1.6
1.5 • • • Step-wise increase of cell • • s • • s 1.4 • • size and fitness reflects e ••
n •
t •
i •
f •• • selection of rare beneficial 1.3 • • • • e • • v i • • t • mutations; 4 - 6 steps in • • • • • a • • l 1.2 •
e •
10000 generations r • • • 1.1 •
Fitness increase slows 1.0 • down, but rate of genetic 100 1000 10000 change does not time (generations) fit: Sibani, Brandt & Alstrøm (1998)
Parallel and divergent changes Adaptation as a hill-climbing process
S. Wright, 1932
fitness landscape
population
Initial population placed in a new environment Adaptation as a hill-climbing process
S. Wright, 1932
fitness landscape
phenotypic trait
Rapid adaptation to a nearby fitness maximum Adaptation as a hill-climbing process
S. Wright, 1932
fitness landscape
phenotypic trait
Stabilizing selection at the fitness peak; mutation-selection balance Adaptation as a hill-climbing process
S. Wright, 1932
fitness landscape
phenotypic trait
Rare, rapid transition through non-adaptive zone to a higher fitness peak: Adaptation as a hill-climbing process
S. Wright, 1932
fitness landscape
phenotypic trait
Rare, rapid transition through non-adaptive zone to a higher fitness peak: Quantum evolution (G.G. Simpson, 1944) Adaptation as a hill-climbing process
S. Wright, 1932
fitness landscape
phenotypic trait
Goal: To turn this mental imagery into a theory of adaptation Sequence space
Each individual carries a genetic sequence of length L
σ σ σ σ σ ¥ £ £ £ with genotype ¢§¦ ¢§¨ ¢ © ¢ ¢ ¢ ¡ 1 2 ¤ L i
Simplification: Binary sequences σ i 0¢ 1
Point mutations change individual letters in the sequence:
σ σ i 1 i
The Hamming distance between two sequences σ σ is the number of ¢ letters in which they differ:
L
2
σ σ σ σ
d ¢ ∑ ¤ ¤ ¡ ¡ i i
i 1
L
Total number of sequences: S 2 volume
Maximal distance between two sequences: L log S diameter 2 ¡ ¤
infinite dimensionality! L-dimensional hypercubes/Hamming graphs Fitness
The fitness W σ of genotype σ is the expected number of offspring of ¡ ¤ an individual carrying σ
σ σ The mapping W is very complicated: ¡ ¤
genotype phenotype + fitness environment
Single peak landscapes: σ σ σ σ : master sequence W w d ¢ ¤ ¤ ¡ ¤ ¡ ¡ 0 0
Maximally rugged fitness landscape: Fitnesses W σ are uncorrelated ¡ ¤ quenched random variables drawn from a common distribution
random energy model (REM) of spin glasses (Derrida, 1981)
house of cards model of population genetics (Kingman, 1977) Evolution of asexual populations
Basic model: Stochastic Fisher-Wright sampling of a finite population
N individuals
1 generations
Each individual choses an ancestor from the preceding generation
Genotype σ is chosen with probability W σ ¡ ¤
Point mutations occur with probability µ per site and generation
Interplay of disorder with two distinct sources of fluctuations µ 1 N¢ ¤ ¡ ¡ Fixation
In the absence of mutations µ 0 the population becomes genetically ¤ ¡ ∞ homogeneous (monomorphic) for t
When a single mutant σ is introduced into a monomorphic population with
σ ∞ σ genotype , the outcome for t is either fixation (all ) or loss of the mutation (all σ)
σ σ Fixation of deleterious mutations (W W ) is exponentially unlikely ¡ ¤ ¡ ¤ for large N
σ σ Beneficial mutations W ¡ W are fixed with probability ¡ ¤ ¡ ¤ ¤ ¡
W σ
¡ ¤
Π σ σ
¢ 1 ¤ ¡ W σ ¡ ¤ Neutral evolution and adaptive walks
¢ £ In a flat fitness landscape [W ¡ £¤ ] the population diffuses randomly in sequence space (genetic drift)
Average genetic distance between two individuals: (Derrida & Peliti 1991)
L 4µN
σ σ
d ¢ ¥ ¡ ¤ ¦ µ 2 1 § 4 N
µ
L N ¨ 1 d ¨ 1 population is monomorphic most of the ¥ ¦ time and evolves by fixation of rare neutral mutations
¢ £ In the presence of selection [ ¡ £ ¤ ] the population performs an uphill
W © adaptive walk by fixation of rare beneficial mutations (Gillespie, 1984)
Adaptive walks terminate at local fitness maxima ¢ 5 ¢ 6 ¢ 7 Adaptive walk: µ ¡ ¡ L 15 ¡ N 1024 ¡ 10 10 10
1 + X28795 (0-00------0--) × X26725 (0-01------0--) ✳ X21345 (1-00------0--) 0.5 ■ X6634 (1-10------0--) Ο X52 (1-10------1--)
0
1 + X28795 (--0----0----0--) × X24458 (--0----1----0--) ✳ X13912 (--0----1----1--) 0.5 ■ X6507 (--1----1----1--)
0
1 + X28795 (--0---0--0-----) × X25483 (--0---0--1-----) ✳ X11692 (--0---1--1-----) 0.5 ■ X2947 (--1---1--1-----)
0 1 10 100 1000 ¢ 4
“Clonal interference”: µ ¡ L 15 ¡ N 1024 10
1 + X28795 (--0------00---) × X20940 (--0------10---) ✳ X4688 (--1------00---) 0.5 ■ X159 (--1------01---)
0
1 + X28795 (--000---0-000--) × X20940 (--000---0-100--) ✳ X9195 (--001---0-000--) 0.5 ■ X4117 (--010---0-100--) Ο X1524 (--110---0-100--) ● X940 (--110---1-101--)
0
1 + X28795 (-----00----0---) × X14622 (-----00----1---) ✳ X2711 (-----10----1---) 0.5 ■ X5 (-----01----1---)
0 1 10 100 1000 ¢ 4
Locally deterministic evolution: µ ¡ L 6 ¡ N 16384 10
1 + X55 (00001-) ● X40 (01001-) × X28 (00101-) 0.5 ✳ X5 (00111-) ■ X4 (01000-) Ο X1 (11010-)
0
1 + X55 (--001-) × X28 (--101-) ✳ X5 (--111-) 0.5 ■ X2 (--010-)
0
1 + X55 (--00--) × X28 (--10--) ✳ X5 (--11--) 0.5
0 1 10 100 1000 Classification of evolutionary regimes
K. Jain, JK, q-bio.PE/0606025
Key parameters: Population size N, mutation probability µ per site & generation, sequence length L
LNµ: Number of mutants produced per generation
µ LN ¨ 1: Adaptive walk of a monomorphic population
µ 1 ¨ LN ¨ L: Stochastic regime with clonal interference
µ LN ¡ L: Deterministic evolution within a shell of size
lnN d ¡ ln µ ¢ ¢
around the dominating genotype
d L: Fully deterministic evolution quasispecies dynamics ¡
The quasispecies model
M. Eigen, Naturwissenschaften 58, 465 (1972); K. Jain, JK, q-bio.PE/0508008
Unnormalized population fraction σ satisfies linear time evolution Z ¢ t ¡ ¤
µ ¡ ¢ σ σ σ σ σ § ¢ ¢ Z ¢ t 1 ∑W M Z t ¡ ¤ ¡ ¤ ¡ ¤ ¡ ¤
σ
£ µ £ µ µ ¡ ¢ ¡ ¢ ¡ ¢
ˆ ˆ ˆ ˆ ˆ § diagonal, random Z t 1 E Z t ¢ E WM ¢ W : ¡ ¤ ¡ ¤
¥
µ σ σ σ σ σ σ ¤ ¤ d ¤ L d d ¢ ¢ ¢ ¡ ¡ ¡ ¡ ¢
Mutation matrix: σ σ µ µ ¦ µ M ¢ 1 ¡ ¤ ¤ ¡
σ σ
d ¤ ¡ ¢
σ δ σ µ 0 Initial condition σ σ after one generation ¢ Z ¢ 0 Z 1 ¡ ¤ ¡ ¤ ¤ 0
population fraction drops to 1 N at distance ¡
lnN
d ¦ d ¡ ln µ ¢ ¢
Ev olutionar 0.5 0.5 0.5 0 1 0 1 0 1 1 y tr ajector
ies
L are 10
15 ¢
deter
µ ministic
10
¥ ¢ 8 10
and
¥ ¢ 6 10
independent ¥ 100 ● Ο ■ ✳ × + ● Ο ■ ✳ × + ● Ο ■ ✳ × + X X X X X X X X X X X X X X X X X X 4 4928 28795 4928 28795 (0) 4928 28795 11920 5 5 9195 9195 9195 4688 4688 4688 5 1 1 (1) (0) (1) (1) (0) (1) (2) (2) (1) (1) (1) (1) (1) (1) (2) (7) (7) ▼ ▲ ∆ X X X 74 (2) 0 1 (10) (7)
of µ : 1000 The strong selection limit
JK, C. Karl, Physica A 318, 137 (2003)
µ For 0 mutations are irrelevant after the first generation: £ £ £ µ t 0 ¥ t 1 ¡ ¢ ¡ ¢
Z t Eˆ Z 0 Eˆ Z 1 ¡ ¤ ¡ ¤ ¡ ¤ ¡ ¡
¥ σ σ
t 1 d ¤ ¡ ¢
σ ¦ σ σ µ 0
Z ¢ t W W ¡ ¤ ¡ ¤ ¡ ¤ 0
Logarithmic population variables evolve linearly in time:
σ ¦ σ µ σ σ ¢ lnZ ¢ t lnW t 1 ln d ¡ ¤ ¡ ¤ ¡ ¤ ¡ ¤ ¢ ¢ 0
Change in µ is compensated by rescaling of time with ln µ ¢ ¢ ¢
Most populated genotype σ is determined by maximizing σ with t lnZ ¢ t ¤ ¡ ¤ ¡ respect to σ Geometry of the evolutionary race
lnZ(σ,t) σ*(t)
t Only the most fit mutant d=1 in each shell of constant 2 Hamming distance σ σ d ¢ 3 ¡ ¤ 0
needs to be considered L
system size 2 L
L ¢ ¢ σ £ σ σ σ ¡ lnZ ¡ t lnW t 1 d ¢ ¢ ¢ ¢ 0 Geometry of the evolutionary race
lnZ(σ,t) σ*(t)
t d=1 Contenders and 2 spectators: Contenders 3 σ are fitness records k with σ σ W ¡ W ¡ ¤ ¡ ¤
k 1 k
L ¢ ¢ σ £ σ σ σ ¡ lnZ ¡ t lnW t 1 d ¢ ¢ ¢ ¢ 0 Geometry of the evolutionary race
lnZ(σ,t) σ*(t)
Contenders and spectators: Contenders t σ d=1 are fitness records k with σ σ 2 W ¡ W ¡ ¤ ¡ ¤
k 1 k 3
Not all contenders are winners; some are bypassed
L ¢ ¢ σ £ σ σ σ ¡ lnZ ¡ t lnW t 1 d ¢ ¢ ¢ ¢ 0 Geometry of the evolutionary race
lnZ(σ,t)
Contenders and σ*(t) spectators: Contenders σ are fitness records k with σ σ W ¡ W ¡ ¤ ¡ ¤
k 1 k t d=1 Not all contenders are 2 3 winners; some are bypassed
The evolutionary trajectory is composed of non- bypassed fitness records; how many? L ¢ ¢ σ £ σ σ σ ¡ lnZ ¡ t lnW t 1 d ¢ ¢ ¢ ¢ 0 Bypassing for i.i.d. shell fitnesses
Simplification: Replace logarihmic shell fitnesses
F max lnW σ k ¡ ¤
σ σ
d ¤ k ¢ ¡ 0
by i. i. d. random variables from a distribution P F ¡ ¤
The number of records is lnL with Poisson distribution
β β Numerical conjecture: Bypassing probability 1 with 1 depending on the tail of P F JK, C. Karl (2003) ¡ ¤
(i) exponential-like: β 1 2 ¡
¥ ¥ δ
1
(ii) power law: β δ 1 2δ 1 P F F ¤ ¡ ¤ ¡ ¤ ¡ ¡ ν
β ν ν § § ¢ (iii) bounded: 2 3 2 P F F ¡ F ¤ ¡ ¤ ¡ ¤ ¡ ¤ ¡ ¡
Fluctuations in the number of non-bypassed records are sub-Poissonian
Analytic proof by first-passage techniques C. Sire, S.N. Majumdar, D.S. Dean, J. Stat. Mech. (2006) L07001 Bypassing in sequence space
K. Jain, JK, JSTAT (2005) P04008
L Number of sequences at distance d from σ is 0 d ¡
Probability to find a new fitness record at d: (M.C.K. Yang, 1975)
L 1 2d L ¡ d ¡
¦ ∞
for P d L¢ d ¢ d L 1 2 ¡ ¤ ¡ ¡ ∑d L 1 d L ¡
k 1 k ¡
independent of the fitness distribution
Total number of records ¢ has mean and variance
2 2
¦ ¦ £ £ £ £ £ £ £ £ ¢ ¢ ¢
1 ln2 N 0 307 N¢ 3ln2 2 N 0 0794 N ¡ ¥ ¦ ¤ ¡ ¥ ¦ ¦ ¤ ¥
Bypassing reduces this to £ L for exponential-like fitness distributions, ¡¥¤ ¤
and to £ 1 for power-law fitness distributions ¡ ¤ Distribution of evolution times
T T T 3 2 1 fitness
time
Global fitness maximum is reached at time T1
Universal power law tail of distribution of time Tk of the k’th last jump: ¥
1 k ¢ ¡ P T T k ¡ ¤ k k
with prefactor depending on fitness distribution and sequence length.
Expected evolution time is infinite: T ∞ ¥ 1 ¦
P Exponential
P(T) o w 0.0001 1e-06 1e-05 er 0.001 0.01 0.1 1 la 1 w fitness fitness Nonlocal shell model, power law p(F) with exponent 2, 1000000 runs distr distr ib ib ution: ution: 10 P
P ¡ F
1
¤ ¡
T ¤
T
F
¥
¤
¡ δ
L
¡ 1
T
¢
2 100 P
1 ¡
T ¤ L=20 L=15 L=10
L=5
¤ L
2 ¥
L
δ ¡ T 2
P Exponential All scaled P(T) o w 0.0001 1e-06 1e-05 er 0.001 inter 0.01 0.1 10 1 la 1 w mediate fitness Nonlocal shell model, power law p(F) with exponent 2, P(T) scaled according to theory fitness maxima distr distr ib ib ution: ution: are 10 b P ypassed
P ¡ F
1
¤ ¡
T ¤
T
F
¥ ¤
in
¡ δ the
L
¡ 1
T ¢
po
2 100 w P er
1 ¡ T
la ¤ w L=20 L=15 L=10
L=5
case! ¤ L
2 ¥
L
δ ¡ T 2 Summary
Evolutionary biology & population genetics a rich source of stochastic many-body problems with disorder
State of the art largely restricted to neutral evolution, small number of sites
( ), non-interacting (=multiplicative) fitness landscapes L 1¢ 2
Distinct fluctuations from mutations & sampling noise
Different kinds of “thermodynamic limits”: ∞
N quasispecies model ∞
L infinite sites model
Future directions (in SFB 680) (i) Speed of adaptation in smooth landscapes (ii) Timing of adaptive events in the infinite sites model (iii) Modes of reproduction in diploids (sexual, parthenogenetic, selfing)