Compositional Assemblies Behave Similarly to Quasispecies Model
Total Page:16
File Type:pdf, Size:1020Kb
Compositional Assemblies Behave Similarly to Quasispecies Model Renan Gross, Omer Markovitch and Doron Lancet Department of Molecular Genetics, Weizmann Institute of Science, Israel Group meeting, 01/10/2013 Introduction Quasispecies are a cloud of genotypes that appear in a population at mutation-selection balance. (J.J Bull et al, PLoS 2005) - Theoretical model (equations and assumptions), with experimental support by RNA viruses. - Usually applied when mutation rates are high. - GARD composomes replicate with relatively low fidelity (high mutation rate). Could they show similar dynamic behavior? 2 Quasispecies model • Basically a population model • n different genotypes / identities 3 Quasispecies model • Basically a population model • n different genotypes / identities Reactor 4 Quasispecies model • Basically a population model • n different genotypes / identities • Their relative concentration in the environment is denoted by Reactor the fraction xi. 푛 푥푖 = 1 푖=1 5 Quasispecies model • Basically a population model • n different genotypes / identities • Their relative concentration in the environment is denoted by Reactor the fraction xi. 푛 • Goal: To find out how xi behave as a function of time. 푥푖 = 1 푖=1 6 Quasispecies model • Each one replicates at a certain rate – how many offspring it has per unit time. • Some replicate faster than others. • This is called the replication rate, denoted Ai 푛푢푚푏푒푟 표푓 표푓푓푠푝푟푖푛푔 • 퐴 = 7 푖 푡푖푚푒 Quasispecies model • However, replication is not exact. Sometimes, the offspring is of another genotype. • The chance that a genotype j replicates into genotype i is denoted Qij. Q11 = 0.5 Q21 = 0.2 Q31 = 0.2 Q41 = 0.1 8 Quasispecies model • We can put everything in a matrix, called the transition matrix, Q. From ퟎ. ퟓ 0.1 0 0.7 To 0.2 ퟎ. ퟓ 0 0.2 0.2 0.2 ퟏ 0 0.1 0.2 0 ퟎ. ퟏ • The main diagonal is faithful self replication. 9 Quasispecies model • How can we find out Q and A? • Assume that we are dealing with RNA sequences of length v: – There is a single genotype with highest replication rate: the master sequence 10 Quasispecies model • How can we find out Q and A? • Assume that we are dealing with RNA sequences of length v: – There is a single genotype with highest q 1 replication rate: the master sequence 1 – Single digit replication: 0 ≤ q ≤ 1 1-q 0 • What is the chance for error-less replication? 11 Quasispecies model • How can we find out Q and A? • Assume that we are dealing with RNA sequences of length v: – There is a single genotype with highest q 1 replication rate: the master sequence 1 – Single digit replication: 0 ≤ q ≤ 1 1-q 0 • What is the chance for error-less replication? – All mutations of the master sequence replicate slower according to the Hamming distance. • Effectively, many genotypes are grouped together. • Q and A are built only from q and v. 12 Quasispecies model 13 Quasispecies model • q = 1 14 Quasispecies model • q = 1 exact replication 15 Quasispecies model • q = 1 exact replication • q = 0 16 Quasispecies model • q = 1 exact replication • q = 0 exact complimentary replication 17 Quasispecies model • q = 1 exact replication • q = 0 exact complimentary replication • q = 0.5 18 Quasispecies model • q = 1 exact replication • q = 0 exact complimentary replication all data is lost. • q = 0.5 19 Quasispecies model • q = 1 exact replication • q = 0 exact complimentary replication all data is lost. • q = 0.5 • Starting at q = 1, lowering it results in loss of the master sequence as the most frequent genotype. – (requires lack of back mutation) ERROR CATASTROPHE • RNA viruses may be fought by bringing them to error catastrophe. 20 Quasispecies model • Under constant population assumptions, the transition matrix Q and the replication rates A are all that are needed in order to find out the concentration dynamics. 21 Quasispecies model • Under constant population assumptions, the transition matrix Q and the replication rates A are all that are needed in order to find out the concentration dynamics. • The Eigen equation (after Manfred Eigen): 푑푥 푖 = 퐴 푄 − 퐸෨ 푡 푥 + 퐴 푄 푥 푑푡 푖 푖푖 푖 푖 푖푗 푗 푗≠푖 22 Quasispecies model • Under constant population assumptions, the transition matrix Q and the replication rates A are all that are needed in order to find out the concentration dynamics. • The Eigen equation (after Manfred Eigen): 푑푥 푖 = 퐴 푄 − 퐸෨ 푡 푥 + 퐴 푄 푥 푑푡 푖 푖푖 푖 푖 푖푗 푗 푗≠푖 • Where E is the “average excess rate” 푛 퐸෨ 푡 = 퐴푖푥푖 푖=1 23 Quasispecies model • Q and A can be combined into one matrix W, which tells how much of each genotype is produced per unit time. 푾 = 푸 ⋅ 푑푖푎푔(푨) 24 Examples: 1 0.001 0.001 0.01 0.1 2 0.01 0.01 푾 = 0.001 0.1 3 0.01 0.001 0.001 0.001 4 • Initial conditions: x1 = 1, all others are 0. 25 Examples: 1 0.001 0.001 0.01 0.1 2 0.01 0.01 푾 = 0.001 0.1 3 0.01 0.001 0.001 0.001 4 • Initial conditions: x1 = 1, all others are 0. 26 Examples: 1 0.001 0.001 1 0.1 2 0.01 1 푾 = 0.001 0.1 3 1 0.001 0.001 0.001 4 • Initial conditions: x1 = 1, all others are 0. 27 Examples: 1 0.001 0.001 1 0.1 2 0.01 1 푾 = 0.001 0.1 3 1 0.001 0.001 0.001 4 • Initial conditions: x1 = 1, all others are 0. 28 • The most frequent genotype is not necessarily the one with highest Ai. • The steady state population distribution is called the quasispecies. – That means, a vertical slice 29 GARD model (Graded Autocatalysis Replication Domain) RNA world Lipid world DNA / RNA / Polymers Sequence covalent bonds 30 GARD model (Graded Autocatalysis Replication Domain) RNA world Lipid world Assemblies / Clusters / DNA / RNA / Polymers Vesicles / Membranes Sequence Composition covalent bonds non-covalent bonds Segre and Lancet, EMBO Reports 1 (2000) 31 GARD model (Graded Autocatalysis Replication Domain) Synthetic chemistry Kinetic model Catalytic network (b) of rate-enhancement values Segre, Ben-Eli and Lancet, Proc. Natl. Acad. Sci. 97 (2000) GARD model (Graded Autocatalysis Replication Domain) Synthetic chemistry Kinetic model Catalytic network (b) of rate-enhancement values Rate enhancement dn NG n i k N k n 1 b j i 1...N Molecular repertoire f i b i ij G dt j1 N Segre, Ben-Eli and Lancet, Proc. Natl. Acad. Sci. 97 (2000) GARD model (Graded Autocatalysis Replication Domain) Synthetic chemistry Kinetic model Homeostatic growth Catalytic network (b) of rate-enhancement values b Fission / Split Rate enhancement dn NG n i k N k n 1 b j i 1...N Molecular repertoire f i b i ij G dt j1 N Segre, Ben-Eli and Lancet, Proc. Natl. Acad. Sci. 97 (2000) GARD model (Graded Autocatalysis Replication Domain) Following a single lineage. 35 GARD model (Graded Autocatalysis Replication Domain) Following a single lineage. Similarityng=30; split=1.5; seed=361 ‘carpet’ 1000 1 800 0.8 600 0.6 400 0.4 Generation 200 0.2 Compositional Similarity 0 200 400 600 800 1000 Generation 36 GARD model (Graded Autocatalysis Replication Domain) Following a single lineage. Similarityng=30; split=1.5; seed=361 ‘carpet’ 1000 1 800 0.8 600 0.6 400 0.4 Generation 200 0.2 Compositional Similarity 0 200 400 600 800 1000 Generation Composome (compositional genome): a faithfully replicating composition/assembly. Compotype (composome type): a collection of similar composomes. Molecular Compotype: the center of mass of the compotype cloud, treated as a molecular assembly. 37 GARD model (Graded Autocatalysis Replication Domain) • Idea: if we ignore the stochasticity inherent in the model, then errorless-replication occurs according to beta matrix eigenvectors. dn NG n i k N k n 1 b j i 1...N f i b i ij G dt j1 N 38 Moment Algebraux • Multiplying a matrix by a vector gives another vector • 푨 ⋅ 푥Ԧ = 푦Ԧ 푎 푎 푥 푎 푥 + 푎 푥 푦 • 11 12 ⋅ 1 = 11 1 12 2 = 1 푎21 푎22 푥2 푎21푥1 + 푎22푥2 푦2 • An eigenvector is a vector 푥Ԧ such that: • 푨 ⋅ 푥Ԧ = 휆푥Ԧ – l is called the eigenvalue. – l may be complex, and so may the eigenvector. 39 Moment Algebraux • The Perron-Frobenius Theorem: – A matrix with strictly positive entries contains a maximal real eigenvalue. – Its eigenvector is real and non-negative. In fact, it’s the only one with this property. • As molecular assemblies must contain real non- negative number of molecules, this looks interesting. 40 GARD and Quasispecies • Do GARD population dynamics behave like the quasispecies model? • What we would like to do: – For each assembly, experimentally find out the transition frequencies and replication rates – In other words: find Q and A. • Problem: 199 – N = 100, n = 100 There are possible G max 100 assemblies ( ~4 ⋅ 1058) 41 GARD and Quasispecies • Solution: group some assemblies together and treat them as one genotype. • We decided to group together by distances from the eigenvector. N space (simplified to 2d) 2 3 G 2 Molecule 1 Eigenvector Molecule 1 42 GARD and Quasispecies • Problem: how do we sample such a large space? • 30000 assemblies are randomly generated. 43 GARD and Quasispecies • Problem: how do we sample such a large space? • 30000 assemblies are randomly generated. – By filling up assemblies until they reach nmax. • Assemblies generated this way are far from the target. 43 GARD and Quasispecies • Problem: how do we sample such a large space? • 30000 assemblies are randomly generated.