Genotype-Phenotype Maps: from molecules to simple cells
Camille Stephan-Otto Attolini
Institute for Theoretical Chemistry, University of Vienna, Austria
Vienna, February 28th, 2006

The Genotype-Phenotype Map
ACGGUUUGSGCGCIGCCUCUACUGUACUAGUACUGUCCA
[Figure: the RNA sequence labeled Genotype and its folded structure labeled Phenotype.]

The Genotype-Phenotype Map II
[Figure: Genotype, Phenotype and Fitness, with the Environment acting on both steps.]

$$G \times E \xrightarrow{\,f\,} P \times E \xrightarrow{\,g\,} F$$

Genotype, Phenotype and Fitness
[Figure: Genotype, Phenotype, Fitness.]

Characteristics I
Neutrality
ACGGUAGCGAUCGACAUCAGCGAUCAGC
ACGGUAGCGAUCGAUAUCAGCGAUCAGC

Characteristics I (bis)
Neutrality and the fitness landscape
Neutral networks

Characteristics II
Plasticity
• Plasticity is the capability of a phenotype to change in response to external conditions. In higher-order organisms it is controlled by the genotype.
[Figure: the amoeba Naegleria gruberi.]

Characteristics III
Variability and Evolvability
• Variability is the capability of a genome to change, usually due to errors in the copying process. Without it, evolution would be impossible.
• An evolvable genotype has the possibility of finding fitter phenotypes.
P. Schuster and coworkers have shown that the map from RNA sequence to secondary structure has all of these characteristics.

Levels
• Molecular level: RNA secondary structure
• Multi-molecular level: RNA cofolding, hypercycle
• Regulation: TATA box, promoter, RNA transcription, regulatory regions

At the molecular level
The secondary structure of a given RNA sequence is the list of (Watson-Crick and wobble) base pairs such that:
• each nucleotide takes part in at most one base pair, and
• base pairs do not cross (knots and pseudo-knots are forbidden).
[Figure: a sequence folded with RNAfold.]
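The two conditions in the definition of a secondary structure can be checked mechanically. The following is a minimal sketch, not part of RNAfold: the function name `is_secondary_structure` and the representation of base pairs as 0-based index tuples are assumptions made for illustration.

```python
# Allowed pairs: Watson-Crick (A-U, G-C) and wobble (G-U), in both orientations.
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def is_secondary_structure(seq, pairs):
    """Return True if `pairs` (0-based index tuples) is a valid secondary structure of `seq`.

    Hypothetical helper: checks allowed pair types, that each nucleotide
    takes part in at most one pair, and that no two pairs cross.
    """
    used = set()
    normalized = []
    for i, j in pairs:
        i, j = min(i, j), max(i, j)
        if i == j or i < 0 or j >= len(seq):
            return False
        if (seq[i], seq[j]) not in ALLOWED:
            return False
        if i in used or j in used:  # nucleotide already paired
            return False
        used.update((i, j))
        normalized.append((i, j))
    # Two pairs (i, j) and (k, l) with i < k cross when i < k < j < l.
    for a in range(len(normalized)):
        for b in range(a + 1, len(normalized)):
            (i, j), (k, l) = sorted((normalized[a], normalized[b]))
            if i < k < j < l:
                return False
    return True


seq = "GGGAAAUCC"
nested = [(0, 8), (1, 7), (2, 6)]   # G-C, G-C, G-U stacked around a loop
crossing = [(0, 7), (1, 8)]         # a pseudo-knot-like crossing
print(is_secondary_structure(seq, nested), is_secondary_structure(seq, crossing))
```

The check is quadratic in the number of pairs, which is enough for a sketch; a stack-based pass over a dot-bracket string would do the same in linear time.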
The RNAcofold algorithm computes the hybridization complex of two RNA molecules. The energy of the loop where the two sequences meet is computed as that of an external loop.
[Figure: two sequences cofolded with RNAcofold.]

Neutrality in RNAcofold
[Figure: cofolded complexes of two sequences and of three sequences.]

Neutrality and length of neutral paths in RNAcofold
• Two sequences: frequency of neutral mutations 0.32
• Three sequences: frequency of neutral mutations 0.18
• Length of sequences: 100 bases

RNAfold vs. RNAcofold
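The frequency of neutral mutations and the length of a neutral path can be estimated by brute force once a folding function is available. The sketch below is only an illustration under loud assumptions: `toy_fold` is a maximum-base-pair (Nussinov-style) stand-in, not the energy model of RNAfold or RNAcofold, and `neutral_path_length` is a simplified walk that uses point mutations only and changes each position at most once. All function names and the `MIN_LOOP` value are hypothetical.

```python
import random

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases enclosed by a pair


def toy_fold(seq):
    """Stand-in fold (assumption): maximize the number of base pairs, return dot-bracket."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                      # j stays unpaired
            for k in range(i, j - MIN_LOOP):            # j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    struct, stack = ["."] * n, [(0, n - 1)]
    while stack:                                        # deterministic traceback
        i, j = stack.pop()
        if j - i <= MIN_LOOP:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - MIN_LOOP):
            left = best[i][k - 1] if k > i else 0
            if (seq[k], seq[j]) in PAIRS and left + 1 + best[k + 1][j - 1] == best[i][j]:
                struct[k], struct[j] = "(", ")"
                if k > i:
                    stack.append((i, k - 1))
                stack.append((k + 1, j - 1))
                break
    return "".join(struct)


def neutrality(seq, fold=toy_fold):
    """Fraction of single-point mutants whose structure equals that of `seq`."""
    ref, neutral, total = fold(seq), 0, 0
    for pos, old in enumerate(seq):
        for new in "ACGU":
            if new != old:
                total += 1
                neutral += fold(seq[:pos] + new + seq[pos + 1:]) == ref
    return neutral / total


def neutral_path_length(seq, fold=toy_fold, rng=None):
    """Simplified neutral walk: each step mutates a not-yet-mutated position neutrally."""
    rng = rng or random.Random(0)
    ref, current, untouched, length = fold(seq), seq, list(range(len(seq))), 0
    while True:
        moves = [(p, b) for p in untouched for b in "ACGU" if b != current[p]]
        rng.shuffle(moves)
        for p, b in moves:
            mutant = current[:p] + b + current[p + 1:]
            if fold(mutant) == ref:
                current, length = mutant, length + 1
                untouched.remove(p)
                break
        else:
            return length  # no neutral step left


s = "GGGGAAAACCCC"
print(toy_fold(s), neutrality(s), neutral_path_length(s))
```

Any function that maps a sequence to a structure string can be passed as `fold`, so the same two estimators could in principle be pointed at a real folding routine; the toy fold is cubic in sequence length and only practical for short sequences.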
Sequences         Neutrality   Length of neutral paths   Algorithm
One sequence      0.33         100                       RNAfold
Two sequences     0.32         75                        RNAcofold
Three sequences   0.18         40                        RNAcofold with two different sequences

Multi-molecular level
The copying process of molecules such as RNA allows sequences only a few dozen nucleotides long, due to the high error rate. The hypercycle was proposed by Eigen and Schuster in 1979 as an answer to the Catch-22.

The Hypercycle
[Figure: the hypercycle, with a selfish parasite and a short-cut parasite.]

The Hypercycle in two dimensions
In the late 1980s, Hogeweg and Boerlijst proposed a model of the hypercycle on a two-dimensional lattice that proved resistant to selfish parasites.
The main characteristic of this model was the formation of spirals. This spatial organization is responsible for expelling the parasite from the system.

Evolving towards the hypercycle
[Figure: RNAfold structures, Hamming distance, self-replication and catalytic rates, hypercycle in two dimensions.]

Concentration and Fitness
Concentration of molecules corresponding to different targets
Fitness evolution

Diffusion in sequence space

Gene regulation
The regulation of gene expression depends on multiple factors. A typical eukaryotic gene has the following structure:
[Figure: regulatory sequences (regulatory regions), promoter with TATA box, coding region, RNA transcription.]

The CelloS model
A. Phenotype and environment
• Spatial environment with changing food sources
• Finite lifetime for cells and reproduction by division
B. Regulatory network
• Mutually repressing genes
• Cost for expressing proteins
[Figure: network diagram with nodes A, B, a and b.]

UAGCAUCAGGCGAUCAGCGAUCAUCGAUCAUCACGAUCGAUCGAUGCUACGAUAGCUUAGCGGAA
Genome decoding
Genome: RNA sequence
The genotype and phenotype are embodied in different entities.

The Potts Model
[Figure: Cells, Membrane, Cell-cell interaction.]
Random movements of the border are accepted or rejected depending on the total energy of the cell:

$$E_{\text{cell}} = \sum \frac{J_{\text{type},\text{type}}}{2} + \sum J_{\text{type},A} + \sum J_{\text{type},S} + \lambda (v - V)^2$$

The sums run over all entries of the membrane, where:
- J is the interaction matrix between types of cells, between cells and the air (A), and between cells and a substrate (S).
- V is the target volume and v the actual volume of a cell.
- λ is the compressibility of the cell.

Genome and genes
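Returning to the Potts model energy, the contact and volume terms can be evaluated on a small lattice as in the sketch below. Several details are assumptions not stated on the slides: the four-neighbour lattice, the reading of "entries of the membrane" as cell sites with a neighbour of a different kind, the dictionary layout of `J`, and the Metropolis-style acceptance rule with a `temperature` parameter. The names `cell_energy` and `accept` are hypothetical.

```python
import math
import random

AIR, SUBSTRATE = "A", "S"  # labels for non-cell lattice sites (assumed encoding)


def cell_energy(lattice, cell_types, J, cell_id, target_volume, lam):
    """Contact terms summed over the membrane of one cell plus lam * (v - V)^2."""
    rows, cols = len(lattice), len(lattice[0])
    ctype = cell_types[cell_id]
    volume, contact = 0, 0.0
    for r in range(rows):
        for c in range(cols):
            if lattice[r][c] != cell_id:
                continue
            volume += 1
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if not (0 <= rr < rows and 0 <= cc < cols):
                    continue
                other = lattice[rr][cc]
                if other == cell_id:
                    continue                                  # interior bond, no cost
                if other in (AIR, SUBSTRATE):
                    contact += J[(ctype, other)]              # J_{type,A} or J_{type,S}
                else:
                    contact += J[(ctype, cell_types[other])] / 2  # J_{type,type} / 2
    return contact + lam * (volume - target_volume) ** 2


def accept(delta_e, temperature, rng=random):
    """Assumed Metropolis rule: always accept downhill moves, uphill with exp(-dE/T)."""
    return delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature)


lattice = [
    ["A", "A", "A", "A"],
    ["A", 1, 2, "A"],
    ["A", 1, 2, "A"],
    ["S", "S", "S", "S"],
]
cell_types = {1: "x", 2: "y"}
J = {("x", "y"): 4.0, ("y", "x"): 4.0,
     ("x", "A"): 2.0, ("y", "A"): 2.0,
     ("x", "S"): 1.0, ("y", "S"): 1.0}
print(cell_energy(lattice, cell_types, J, cell_id=1, target_volume=2, lam=0.5))
```

A border move would be proposed by copying a neighbouring label into a membrane site, recomputing the energy of the affected cells, and passing the difference to `accept`.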
[Figure: movement genes and metabolic genes; both panels carry the axis label Distance.]
[Figure: secondary structure of a gene.]
UAGCAUCAGGCGAUCAGCGAUCAUCGAUCAUCACGAUCGAUCGAUGCUACGAUAGCUUAGCGGAAC
[Labels on the sequence: TATA box, coding region.]

The regulatory network
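At the moment, we are using a very simple regulatory network for our two kinds of genes:

$$\frac{dp_m}{dt} = G_m \cdot k \, \frac{1}{1 + p_f^{3}} - c \cdot p_m$$

$$\frac{dp_f}{dt} = G_f \cdot k \, \frac{1}{1 + p_m^{3}} - c \cdot p_f$$

[Figure: network diagram with nodes A, B, a and b, and the regulatory regions.]

In motion

Protein expression curves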
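The behaviour of the two mutually repressing genes can be explored numerically. The sketch below assumes the rate equations read dp_m/dt = G_m·k/(1 + p_f³) − c·p_m and dp_f/dt = G_f·k/(1 + p_m³) − c·p_f, integrates them with a forward-Euler step, and models the external impulse as an additive kick to one concentration. The parameter values, the step size and the form of the impulse are all assumptions chosen for illustration, not values from the CelloS model.

```python
def simulate(p_m, p_f, steps, dt=0.01, G_m=1.0, G_f=1.0, k=4.0, c=1.0):
    """Forward-Euler integration of two mutually repressing protein concentrations.

    Parameter names follow the symbols in the rate equations; the numeric
    defaults are arbitrary (assumed) values that give two stable states.
    """
    for _ in range(steps):
        dp_m = G_m * k / (1.0 + p_f ** 3) - c * p_m   # production repressed by p_f
        dp_f = G_f * k / (1.0 + p_m ** 3) - c * p_f   # production repressed by p_m
        p_m, p_f = p_m + dt * dp_m, p_f + dt * dp_f
    return p_m, p_f


# Start with p_m ahead: the system settles with p_m high and p_f low.
p_m, p_f = simulate(2.0, 0.1, steps=3000)
state_before = (p_m, p_f)

# External impulse (assumed form): a sudden increase of p_f above p_m.
p_f += 5.0
state_after = simulate(p_m, p_f, steps=3000)

print(state_before, state_after)
```

With symmetric parameters, whichever concentration is larger after the kick ends up as the dominant one, so an impulse of this kind has to push the low protein above the high one to flip the switch.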
With these equations we obtain a switch between the protein concentrations every time an impulse is given from the outside.

Population and food sources
The population grows while cells feed from the sources. Depending on the changes in the environment, the number of cells decreases or increases.

Genome evolution
[Figure: plot with vertical axis Number of genes.]
• Expected number of genes: 9.875
• Number of genes at the end: 18
The difference between efficiencies is also controlled by the regulatory network.

Conclusions and outlook
Conclusions:
• Neutrality decreases when more than two sequences interact.
• Depending on how interactions are defined, systems of several molecules can evolve as in models of individual sequences (the hypercycle, for example) or be destroyed by the low neutrality of the map (results not shown).
• CelloS does not measure fitness as a real number; selection acts through survival under changing conditions. (And selection does work.)
Further work:
• More detailed study of cofold neutrality and its relation to the degree of connectivity of a network
• Decoding of the regulatory networks from the genome
• Phylogenetic trees from these models, to be compared with real data

The end
I would like to thank:
Peter Stadler, Peter Schuster, Christoph Flamm, all the people of the TBI, and the audience