Quasi-Species ~~ evolution at the speed of light ~~

Signals and Systems in Biology
Kushal Shah @ EE, IIT Delhi

Quasi-Species : Introduction

- Species: single genotype
- Quasispecies
  - Large group of genotypes with high mutation rate
  - RNA viruses, macromolecules like RNA/DNA
  - Proposed by Manfred Eigen and Peter Schuster in the 1970s

J. J. Bull et al., PLOS Computational Biology 1, e61 (2005)
C. O. Wilke, BMC Evolutionary Biology 5:44 (2005)

RNA Virus

- Contains single- or double-stranded RNA as its genetic material
  - SARS, influenza, hepatitis C, West Nile fever, polio and measles
- Retrovirus: replication process through a DNA intermediate
  - HIV-1 and HIV-2
- Ribovirus: RNA virus that does not use a DNA intermediate
- Very small genome size
  - RNA virus ~ 1.7 kb to 10 kb
  - Largest virus: 1180 kbp (Mimivirus)
  - Bacteria ~ 160 kbp to 13 Mbp
  - Human ~ 3 Gbp
  - Polychaos dubium ~ 670 Gbp (amoeba)

Natural Selection

The process by which traits become more or less common in a population due to consistent effects upon the survival or reproduction of their bearers.

~~ Survival of the fittest ~~

Fitness (natural selection) vs. Mutation rate

- Consider a group of genotypes existing together
- If mutation rate is low, natural selection works
- If mutation rate is high, the idea of ‘fittest’ becomes meaningless
  - 2 genotypes could differ by just 1 nucleotide
  - Equilibrium between various genotypes
- Error catastrophe
  - Loss of a fitter genotype in a population due to high mutation rate
  - Mutagenesis: increase the mutation rate of viruses using drugs
- Extinction: all genotypes become extinct

Simple case of 2 genotypes

Quasispecies : Mathematical Model

Assuming discrete time-steps

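$$
\begin{aligned}
n_1' &= n_1 w_1 (1 - \mu_1) + n_2 w_1 \mu_3 \\
n_2' &= n_1 w_2 \mu_1 + n_2 w_2 (1 - \mu_2)
\end{aligned}
$$

$$
N' = M N
$$

where

$$
M = \begin{pmatrix} w_1 (1 - \mu_1) & w_1 \mu_3 \\ w_2 \mu_1 & w_2 (1 - \mu_2) \end{pmatrix},
\qquad
N = \begin{pmatrix} n_1 \\ n_2 \end{pmatrix}
$$

$w \ge 0$ and $0 \le \mu \le 1$

Quasispecies : Mutation-Selection Balance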
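As a numerical companion to the update rule N' = MN, the following minimal sketch iterates the two-genotype model and renormalises at every step so that only the genotype proportions are tracked. The function names (`step`, `simulate`) and all parameter values are illustrative assumptions, not taken from the slides.

```python
def step(n1, n2, w1, w2, mu1, mu2, mu3):
    """One discrete time-step of the two-genotype model N' = M N."""
    new_n1 = n1 * w1 * (1 - mu1) + n2 * w1 * mu3
    new_n2 = n1 * w2 * mu1 + n2 * w2 * (1 - mu2)
    return new_n1, new_n2


def simulate(n1, n2, steps, **params):
    """Iterate the update, renormalising so only proportions are tracked."""
    for _ in range(steps):
        n1, n2 = step(n1, n2, **params)
        total = n1 + n2
        n1, n2 = n1 / total, n2 / total
    return n1, n2


# Illustrative parameter values (assumptions, not from the slides).
# mu3 = 0.0 switches off back-mutation from genotype 2 to genotype 1.
params = dict(w1=2.0, w2=1.0, mu1=0.2, mu2=0.1, mu3=0.0)
p1, p2 = simulate(0.5, 0.5, steps=200, **params)
print(f"genotype 1: {p1:.4f}, genotype 2: {p2:.4f}")
```

Because the model is linear, renormalising does not change the proportions; it only keeps the numbers from growing or shrinking without bound. With these assumed values the proportions settle at a ratio n1/n2 of 3.5.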

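$$
M = \begin{pmatrix} w_1 (1 - \mu_1) & w_1 \mu_3 \\ w_2 \mu_1 & w_2 (1 - \mu_2) \end{pmatrix},
\qquad
N = \begin{pmatrix} n_1 \\ n_2 \end{pmatrix}
$$

$$
N' = M N
$$

At equilibrium, $N' = \lambda N \Rightarrow M N = \lambda N$

If $\mu_3 = 0$,

$$
\lambda_1 = w_1 (1 - \mu_1), \qquad \frac{N_1}{N_2} = \frac{w_1 (1 - \mu_1) - w_2 (1 - \mu_2)}{w_2 \mu_1}
$$

$$
\lambda_2 = w_2 (1 - \mu_2), \qquad N_1 = 0
$$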
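The closed-form eigenvalues and the ratio N1/N2 for the case µ3 = 0 can be checked numerically. The sketch below uses the trace/determinant formula for a 2×2 matrix; the helper name `eigenvalues_2x2` and the parameter values are assumptions chosen for illustration.

```python
import math


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]]; real whenever b*c >= 0."""
    trace, det = a + d, a * d - b * c
    root = math.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


# Illustrative values (assumptions); mu3 = 0 means no back-mutation.
w1, w2, mu1, mu2, mu3 = 2.0, 1.0, 0.2, 0.1, 0.0

# Entries of M as defined in the model.
a, b = w1 * (1 - mu1), w1 * mu3
c, d = w2 * mu1, w2 * (1 - mu2)

# lam_big equals lambda_1 here because w1*(1 - mu1) > w2*(1 - mu2).
lam_big, lam_small = eigenvalues_2x2(a, b, c, d)

# Closed-form ratio N1/N2 for the eigenvector belonging to lambda_1.
ratio = (w1 * (1 - mu1) - w2 * (1 - mu2)) / (w2 * mu1)

# Check that N = (ratio, 1) satisfies M N = lambda_1 N (residual near 0).
n1, n2 = ratio, 1.0
residual = (a * n1 + b * n2 - lam_big * n1,
            c * n1 + d * n2 - lam_big * n2)
print(lam_big, lam_small, ratio, residual)
```

The two printed eigenvalues should match w1(1 − µ1) and w2(1 − µ2) up to floating-point error, and both residual components should be close to zero.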

Error threshold: $w_1 (1 - \mu_1) = w_2 (1 - \mu_2)$

$w_1 > w_2$, $\mu_1 = k\mu$, $\mu_2 = \mu$, $k > 1$
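With µ1 = kµ and µ2 = µ, the threshold condition w1(1 − kµ) = w2(1 − µ) can be solved for µ, giving µ* = (w1 − w2) / (k·w1 − w2). The sketch below evaluates this and prints both growth factors on either side of the threshold; the function name `error_threshold` and the numbers are illustrative assumptions.

```python
def error_threshold(w1, w2, k):
    """Mutation rate mu at which w1*(1 - k*mu) equals w2*(1 - mu)."""
    if not (w1 > w2 >= 0 and k > 1):
        raise ValueError("expects w1 > w2 >= 0 and k > 1")
    return (w1 - w2) / (k * w1 - w2)


# Illustrative values (assumptions, not from the slides).
w1, w2, k = 2.0, 1.0, 2.0
mu_star = error_threshold(w1, w2, k)

# Compare the two growth factors below, at, and above the threshold.
for mu in (0.5 * mu_star, mu_star, 1.2 * mu_star):
    lam1 = w1 * (1 - k * mu)   # lambda_1 with mu1 = k*mu
    lam2 = w2 * (1 - mu)       # lambda_2 with mu2 = mu
    print(f"mu={mu:.3f}  lambda1={lam1:.3f}  lambda2={lam2:.3f}")
```

Below µ* the fitter genotype has the larger growth factor; above it the order reverses, which is the loss of the fitter genotype described under error catastrophe.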

$\lambda_1 = w_1 (1 - \mu_1)$, $\lambda_2 = w_2 (1 - \mu_2)$

In the presence of back-mutations

Quasispecies: Survival of the flattest

Quasispecies: Assumptions

- Constant fitness and mutation rates
  - Genotype fitness usually depends on its relative population size
  - Presence of other genotypes puts pressure on available resources
- Linear model
  - Nonlinearity due to complex fitness landscapes
- Deterministic model
  - Population genetics for stochastic effects (genetic drift)
- Limitation
  - Cannot make quantitative predictions, since it is hard to obtain parameters from actual biological systems