
Aoife McLysaght, Dept. of Genetics

• Evolution of genome arrangement
  – Gene order changes
    • Inversions, translocations
• Evolution of genome content
  – Gene gain (sequence divergence, duplication, recombination, horizontal transfer)
  – Gene loss (deletion)

• One or more genes per event

• Translate knowledge from sequenced or model genomes to organism of interest
  – Positional cloning of genes
  – Use probes designed in one genome to detect a target in another genome
• Improve model parameters for phylogenetic inference from genome arrangement

• Not just a bag of genes
• Genome organisation contains information
  – Order of Hox genes corresponds to spatial pattern of gene expression
  – Clustering of housekeeping genes
• By observation of ‘allowed’ changes gain understanding of genomic constraints and plasticity

• Greater power to detect change
• Precision
  – Can infer lineage in which change occurred
• Detect direction and rate of change

• More genomes also increase computational burden
• 20 completely sequenced genomes

• 150-300 kb containing ~200 genes
• Double-stranded DNA viruses, no RNA stage
• Replicate in the host cytoplasm

• Entomopox – insect infecting
• Chordopox – vertebrate infecting
• Orthopox – subset of chordopox which includes smallpox (variola) and vaccinia

• How are these genomes arranged?
• How has genome content changed?
• Is the rate of change constant?

• Can we detect adaptive genome evolution?

Significant sequence similarity – How significant?
over a long stretch of the protein – How long?

                              e-value threshold
Minimum aligned proportion    1      1e-5   1e-10   1e-20
0.0                           0      31     29      19
0.1                           0      31     29      19
0.2                           4      32     29      19
0.3                           7      33     31      20
0.4                           10     34     31      20
0.5                           17     32     30      20
0.6                           29     33     30      19
0.7                           28     30     26      18
0.8                           26     25     22      14
0.9                           15     16     14      10
1.0                           0      0      0       0

• Complete linkage
• Single-link clustering
• Our method

[Figures: clustering example diagrams with nodes labelled A to J]

• 4042 total proteins
• 3384 proteins classified into 875 groups
  – 813 complete linkage
• 521 groups of 1 member
• 150 groups of 2 members
• 204 groups of ≥ 3 members
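
The two similarity criteria (e-value threshold, minimum aligned proportion) and the difference between single-link and complete linkage can be sketched in a few lines. This is a minimal illustration only, not the method used for the results above: the hit tuples, the function names and the thresholds `1e-5` and `0.6` are assumptions chosen for the example.

```python
from collections import defaultdict

def filter_hits(hits, max_evalue=1e-5, min_aligned=0.6):
    """Keep protein pairs whose hit passes both (illustrative) thresholds.

    hits: iterable of (protein_a, protein_b, evalue, aligned_proportion).
    """
    return {
        frozenset((a, b))
        for a, b, evalue, aligned in hits
        if a != b and evalue <= max_evalue and aligned >= min_aligned
    }

def single_link_clusters(proteins, edges):
    """Single-link clustering: connected components found with union-find."""
    parent = {p: p for p in proteins}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for edge in edges:
        a, b = tuple(edge)
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in proteins:
        groups[find(p)].add(p)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])

def is_completely_linked(group, edges):
    """True if every pair of members in the group is directly linked."""
    members = list(group)
    return all(
        frozenset((members[i], members[j])) in edges
        for i in range(len(members))
        for j in range(i + 1, len(members))
    )

# Toy data (hypothetical proteins and scores).
hits = [
    ("A", "B", 1e-30, 0.9),
    ("B", "C", 1e-12, 0.7),
    ("A", "C", 1e-2, 0.9),    # fails the e-value threshold
    ("D", "E", 1e-40, 0.95),
    ("E", "F", 1e-8, 0.3),    # fails the aligned-proportion threshold
]
proteins = ["A", "B", "C", "D", "E", "F"]
edges = filter_hits(hits)
clusters = single_link_clusters(proteins, edges)
for group in clusters:
    print(group, "complete" if is_completely_linked(group, edges) else "single-link only")
```

In the toy data A, B and C fall into one single-link group because B links to both, but the group is not completely linked since A and C have no qualifying hit to each other.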

34 orthologues present in all genomes
92 orthologues present in all orthopox genomes

• Examine phylogenetic spread of a group of orthologues

• Assign gene gain and loss events to branches in the phylogeny
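
One simple way to place events on branches is Dollo parsimony: a single gain on the branch leading to the most recent common ancestor of all genomes carrying the gene, and a loss on every branch below it whose whole subtree lacks the gene. The sketch below assumes that rule; the slides do not state which rule was used, and a single-gain model ignores horizontal transfer. The tree, the genome names `g1` to `g5` and the function names are hypothetical.

```python
def leaves(node):
    """Return the set of leaf names under a node (leaf = str, internal = tuple)."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result

def label(node):
    """Name a branch by the sorted leaves found below it."""
    return "+".join(sorted(leaves(node)))

def dollo_events(tree, present):
    """Assign one gain and the implied losses for a single orthologue group.

    tree: nested tuples of genome names; present: genomes carrying the gene.
    Returns (gain_branch, loss_branches), with branches named by label().
    """
    present = set(present) & leaves(tree)
    if not present:
        return None, []

    # Descend to the most recent common ancestor of all carriers.
    node = tree
    while not isinstance(node, str):
        holders = [c for c in node if leaves(c) & present]
        if len(holders) != 1:
            break
        node = holders[0]
    gain = label(node)

    losses = []

    def walk(n):
        if isinstance(n, str):
            return
        for child in n:
            if leaves(child) & present:
                walk(child)
            else:
                losses.append(label(child))  # entire subtree lacks the gene

    walk(node)
    return gain, sorted(losses)

# Toy phylogeny with hypothetical genomes.
tree = ((("g1", "g2"), "g3"), ("g4", "g5"))
print(dollo_events(tree, {"g1", "g3"}))
```

Running this over every orthologue group and tallying gains and losses per branch label gives the per-branch counts needed for the rate tests that follow.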

• Tested for uniform rate of gene acquisition
• Assume a molecular clock
• Are gene acquisition events distributed randomly throughout the tree?
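
Under a molecular clock and a uniform acquisition rate, the expected number of gains on a branch is proportional to its length, so the question can be approached by simulation. The sketch below is one assumed form of such a test, not the procedure behind the results reported here; the branch names, lengths and observed counts are invented toy values.

```python
import random

def simulate_branch_counts(branch_lengths, n_events, n_sims=2000, seed=0):
    """Drop n_events on branches with probability proportional to branch length."""
    rng = random.Random(seed)
    names = list(branch_lengths)
    weights = [branch_lengths[b] for b in names]
    sims = []
    for _ in range(n_sims):
        counts = dict.fromkeys(names, 0)
        for b in rng.choices(names, weights=weights, k=n_events):
            counts[b] += 1
        sims.append(counts)
    return sims

def empirical_p_values(observed, sims):
    """Per branch: fraction of simulations with counts <= and >= the observed count."""
    result = {}
    for branch, obs in observed.items():
        low = sum(s[branch] <= obs for s in sims) / len(sims)
        high = sum(s[branch] >= obs for s in sims) / len(sims)
        result[branch] = {"deficit_p": low, "excess_p": high}
    return result

# Toy example: hypothetical branch lengths and observed gain counts.
branch_lengths = {"b1": 0.5, "b2": 0.3, "b3": 0.2}
observed = {"b1": 10, "b2": 10, "b3": 40}
sims = simulate_branch_counts(branch_lengths, n_events=sum(observed.values()))
p = empirical_p_values(observed, sims)
for branch, values in p.items():
    print(branch, values)
```

A branch with a small `deficit_p` has fewer gains than its length predicts and one with a small `excess_p` has more. Testing many branches at once would also call for a multiple-testing correction, which is omitted here.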

• Simulations
[Figure legend: Significant deficit / Significant excess]

• Slower rate of amino acid substitution within this clade (leading to aberrantly short branch lengths)
  – Takezaki relative rate test
  – Branch lengths from synonymous distances
• Increased rate of gene gain
• Increased selection for the retention of gained genes

• Extensive sequence divergence
• Recombination
• Horizontal transfer

• AMV-EPB_034 – inhibitor of apoptosis from Amsacta moorei entomopoxvirus (AMV-EPB)
• GenBank sequence – inhibitor of apoptosis from Bombyx mori (silkworm)
  – BLAST e-value 9e-81
• Amsacta moorei entomopoxvirus infects Amsacta moorei (Red Hairy Caterpillar)
• Bombyx and Amsacta both Order Lepidoptera
• 62% of best non-viral GenBank hits are from same taxonomic Class as viral host

• Events are not independent
• Depend on previous (in time) gain and loss events of the gene family

• Requires a probabilistic model?

• Selection for diversification
  – Positive selection
• Characteristic of host-parasite co-evolution

Phe UUU   Ser UCU   Tyr UAU   Cys UGU
    UUC       UCC       UAC       UGC
Leu UUA       UCA   ter UAA   ter UGA
    UUG       UCG   ter UAG   Trp UGG

Leu CUU   Pro CCU   His CAU   Arg CGU
    CUC       CCC       CAC       CGC
    CUA       CCA   Gln CAA       CGA
    CUG       CCG       CAG       CGG

Ile AUU   Thr ACU   Asn AAU   Ser AGU
    AUC       ACC       AAC       AGC
    AUA       ACA   Lys AAA   Arg AGA
Met AUG       ACG       AAG       AGG

Val GUU   Ala GCU   Asp GAU   Gly GGU
    GUC       GCC       GAC       GGC
    GUA       GCA   Glu GAA       GGA
    GUG       GCG       GAG       GGG

• Two classes of DNA substitutions
  – Synonymous (DNA change without amino acid change)
  – Nonsynonymous (DNA change causing amino acid change)
• Neutral – equal frequencies
• Conservative selection – fewer nonsynonymous substitutions
• Positive selection – more nonsynonymous substitutions

• 204 groups of orthologues
• Maximum likelihood test for positive selection (PAML)
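
The two classes of substitution follow directly from the genetic code, as a short sketch can show. It only illustrates the synonymous/nonsynonymous distinction and is not the maximum likelihood test in PAML; the function names are hypothetical, and treating changes to or from a stop codon as nonsynonymous is an assumption of this example.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered U, C, A, G at each position ('*' = ter).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_substitution(codon_before, codon_after):
    """Label a codon change as 'synonymous', 'nonsynonymous' or 'identical'."""
    before = codon_before.upper().replace("T", "U")
    after = codon_after.upper().replace("T", "U")
    if before == after:
        return "identical"
    if CODON_TABLE[before] == CODON_TABLE[after]:
        return "synonymous"
    return "nonsynonymous"  # includes changes to or from a stop codon

def count_single_base_changes(codon):
    """Count synonymous and nonsynonymous outcomes of the 9 one-base changes."""
    codon = codon.upper().replace("T", "U")
    counts = {"synonymous": 0, "nonsynonymous": 0}
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                counts[classify_substitution(codon, mutant)] += 1
    return counts

print(classify_substitution("CUU", "CUC"))  # Leu -> Leu
print(classify_substitution("GAU", "GAA"))  # Asp -> Glu
print(count_single_base_changes("GGU"))     # Gly: all third-position changes are silent
```

Because far fewer random changes are synonymous than nonsynonymous, methods that compare the two classes normalise by the number of sites of each kind rather than comparing raw counts.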

• Significantly higher frequency of nonsynonymous substitutions
• Detected positive selection on 26 genes
• Examples:
  – Membrane glycoprotein
  – Haemagglutinin
  – Immunoglobulin domain protein
• 13 genes are unique to orthopox clade
  – Significantly more than expected (P < 0.05)
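
A "more than expected" statement of this kind can be checked with a one-sided hypergeometric test: if positively selected genes were drawn at random from all tested groups, how often would at least this many fall in the clade-specific set? The slides do not say which test was used, so the hypergeometric model and the function below are assumptions, and the numbers in the example are toy values.

```python
from math import comb

def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement.

    population: all groups tested
    successes:  groups in the category of interest (e.g. clade-specific)
    draws:      groups with the property (e.g. positively selected)
    observed:   groups that have both
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    return sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    ) / total

# Toy numbers: 10 groups, 4 in the category, 3 drawn, all 3 in the category.
p_value = hypergeom_upper_tail(population=10, successes=4, draws=3, observed=3)
print(p_value)
```

Applied to the figures on the slides, `population` would be 204, `draws` 26 and `observed` 13. The number of orthopox-specific groups among the 204 is not given here, so it is left as an input.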

• Disproportionate frequency of positive selection on genes gained within the orthopox lineage
• Association of positive selection on protein sequences and increased rate of gene acquisition

• Adaptive significance of gene acquisition?
  – Mimic host defences
  – Avoid host recognition
  – Block cell death

• The rate of genome evolution is not constant
• The rate of gene acquisition has increased in the orthopox lineage
• Orthopox lineage also has an increased frequency of positive selection

• Possible adaptive significance of genome evolution

• University of California, Irvine
  – Brandon Gaut
  –