
Phylogenetic reconstruction

How do we reconstruct the tree of life?

Outline:
• Terminology
• Methods: distance, parsimony, maximum likelihood, bootstrapping
• Problems: homoplasy, hybridisation

Dr. Sean Graham, UBC.

• Rooted trees

Understanding Trees

Introduction

[Figure: two trees of the same taxa (amphibians, mammals, lizards, birds, turtles, snakes, crocodiles), drawn with the tips in different orders.]

Do these phylogenies agree?

Branch lengths

[Figure 14.17: two trees with tips A, B, C and D; scale bar: 1 nt change.]

Understanding Trees

Trees can be used to describe taxonomic groups:
• Monophyletic
• Paraphyletic
• Polyphyletic

[Figure: three copies of a tree with tips A, B, C, D and E, one for each kind of group.]

What is the relationship between taxonomic names and phylogenetic groups?

[Figure: tree of amphibians, mammals, lizards, snakes, turtles, crocodiles and birds; labels: Amniotes, Amnion.]

[Figure: tree of amphibians, turtles, crocodiles, birds, lizards and snakes; labels: Reptiles, Cold Blooded.]

[Figure: tree including rodents, bats, birds, crocodiles, snakes, turtles and lizards; label: Wings.]
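The question of whether a named group is monophyletic can be checked mechanically once a tree is written down. The sketch below is only an illustration: the tree topology is an assumption (a commonly drawn vertebrate tree, not copied from the slide figures), and `tips` and `is_monophyletic` are hypothetical helper names. A group is treated as monophyletic when some node of the tree has exactly that group as its descendants.

```python
# Hypothetical vertebrate tree as nested tuples. The topology is an
# assumption for illustration, not taken from the slide figures.
TREE = ("Amphibians",
        ("Mammals",
         (("Lizards", "Snakes"),
          ("Turtles", ("Crocodiles", "Birds")))))


def tips(node):
    """Return the set of tip names at or below a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= tips(child)
    return result


def is_monophyletic(tree, group):
    """True if some node of the tree has exactly `group` as its tips."""
    group = set(group)
    if tips(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


amniotes = {"Mammals", "Lizards", "Snakes", "Turtles", "Crocodiles", "Birds"}
reptiles = {"Lizards", "Snakes", "Turtles", "Crocodiles"}

print(is_monophyletic(TREE, amniotes))  # True
print(is_monophyletic(TREE, reptiles))  # False: birds are left out
```

The check only separates monophyletic groups from the rest; telling paraphyletic from polyphyletic groups also needs to know whether the common ancestor is included in the group.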

Polyphyletic example: Amentiferae

[Figure: tree with oaks, willows and walnuts; labels: evolution of catkins; ancestor with separate flowers.]

Vertebrate Phylogeny

Are these groups monophyletic, paraphyletic or polyphyletic?
• fish?
• tetrapods (= four limbed)?
• amphibians?
• mammals?
• ectotherms (= cold blooded)?

Constructing Trees

Methods:
• distance (e.g. UPGMA)
• parsimony
• maximum likelihood (Bayesian)

Distance Methods

Distance methods rely on clustering algorithms (e.g. UPGMA)

Example 1: morphology

[Figure: taxa A, B, C and D plotted against Trait 1 and Trait 2.]

Distance matrix:

      A     B     C     D
A           1.0   3.0   4.9
B                 3.3   3.0
C                       3.0
D

UPGMA

[Figure: the morphology example and its distance matrix repeated, with the tree for A, B, C and D being built.]

Distance methods with sequence data

A: ATTGCAATCGG
B: ATTACGATCGG
C: GTTACAACCGG
D: CTCGTAGTCGA

Distance matrix:

      A     B     C     D
A           1     3     5
B                 3     7
C                       7
D

New distance matrix (A and B joined; take averages):

      AB    C     D
AB          3     6
C                 7
D

[Figure: the tree for A, B, C and D being built as taxa are joined.]
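The "take averages" step can be sketched in a few lines of Python. This is a minimal illustration of UPGMA clustering, not a reference implementation: the function name `upgma`, the dictionary layout and the way joined clusters are named (by concatenating labels) are all assumptions made for the example. The distances are the ones in the matrix above.

```python
from itertools import combinations

# Pairwise distances from the distance matrix above.
DIST = {("A", "B"): 1, ("A", "C"): 3, ("A", "D"): 5,
        ("B", "C"): 3, ("B", "D"): 7, ("C", "D"): 7}


def upgma(dist):
    """Repeatedly join the closest pair of clusters (UPGMA).

    Returns a list of (cluster_x, cluster_y, distance_when_joined).
    """
    d = {frozenset(pair): float(v) for pair, v in dist.items()}
    size = {name: 1 for pair in dist for name in pair}
    joins = []
    while len(size) > 1:
        # Find the closest pair among the current clusters.
        x, y = min(combinations(sorted(size), 2),
                   key=lambda p: d[frozenset(p)])
        new = x + y  # hypothetical naming: "A" + "B" -> "AB"
        joins.append((x, y, d[frozenset((x, y))]))
        # Distance from the new cluster to every other cluster:
        # the average of the old distances, weighted by cluster size.
        for z in size:
            if z not in (x, y):
                d[frozenset((new, z))] = (
                    size[x] * d[frozenset((x, z))]
                    + size[y] * d[frozenset((y, z))]
                ) / (size[x] + size[y])
        size[new] = size.pop(x) + size.pop(y)
    return joins


for x, y, distance in upgma(DIST):
    print(f"join {x} + {y} at distance {distance:.2f}")
```

After A and B are joined, the averaged distances are 3 (AB to C) and 6 (AB to D), matching the new distance matrix. Branch lengths are not computed here; in UPGMA each node is usually drawn at half the joining distance.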

Strengths and weaknesses of distance methods

Assumptions of distance methods

II. Parsimony Methods

Hennig (German entomologist) wrote in 1966
Translated into English in 1976: very influential

Applying parsimony

• Consider four taxa (1-4) and four characters (A-D)
• Ancestral state: abcd

         Trait
Taxon    A    B    C    D
1        a’   b    c    d
2        a’   b’   c’   d’
3        a’   b’   c’   d
4        a’   b’   c    d

[Figure: the same data mapped onto two trees rooted at the ancestral state abcd, with unique changes marked separately from convergences or reversals. With the tips in the order 1, 2, 3, 4 (a’bcd, a’b’c’d’, a’b’c’d, a’b’cd) the tree needs 5 steps; with the tips in the order 1, 4, 3, 2 (a’bcd, a’b’cd, a’b’c’d, a’b’c’d’) it needs 4 steps.]

Strengths and weaknesses of parsimony

Strengths

Weaknesses

Parsimony practice

         Position
Taxon    1234567
K        AGTACCG
L        AAGACTA
M        AACCTTA
N        AAAGTTA

Which unrooted tree is most parsimonious?

[Figure: the three possible unrooted trees for K, L, M and N, with the changes for positions 1 and 2 already marked.]

Plot each change on each tree. Positions 1 and 2 are done. Which positions help to determine relationships?
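Counting changes by hand can be checked with a short script. The sketch below uses Fitch's counting rule, which is a standard way to get the minimum number of changes for one site on a given tree; it is offered as one possible way to check the exercise, not as the method used in the lecture. Each unrooted four-taxon tree is written as two pairs (in effect rooted on its internal branch, which does not alter the count), and the names `fitch` and `site_steps` are hypothetical.

```python
SEQS = {"K": "AGTACCG", "L": "AAGACTA", "M": "AACCTTA", "N": "AAAGTTA"}

# The three possible unrooted trees for four taxa, written as two pairs.
TREES = {
    "(K,L),(M,N)": (("K", "L"), ("M", "N")),
    "(K,M),(L,N)": (("K", "M"), ("L", "N")),
    "(K,N),(L,M)": (("K", "N"), ("L", "M")),
}


def fitch(node, states):
    """Return (possible states, minimum changes) below a node for one site."""
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    lset, lsteps = fitch(left, states)
    rset, rsteps = fitch(right, states)
    shared = lset & rset
    if shared:
        # The two sides can agree on a state: no extra change needed.
        return shared, lsteps + rsteps
    # No state in common: one more change is required at this node.
    return lset | rset, lsteps + rsteps + 1


def site_steps(tree, seqs):
    """Minimum number of changes at each alignment position on this tree."""
    length = len(next(iter(seqs.values())))
    return [fitch(tree, {taxon: seq[i] for taxon, seq in seqs.items()})[1]
            for i in range(length)]


for name, tree in TREES.items():
    steps = site_steps(tree, SEQS)
    print(name, "steps per position:", steps, "total:", sum(steps))
```

Positions whose step count is the same on all three trees do not help to choose among them; only positions whose count differs between trees are informative.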

Inferring the direction of evolution

Where did the mutation occur, and what was the change?

Mouse      ACGCTAGCTAGG
Orangutan  ACGCTAGCTAGG
Gorilla    ACGCTAGCTAGG
Human      ACGCTAGCTAGG
Bonobo     ACGCTAGCTACG
Chimp      ACGCTAGCTACG

III. Maximum likelihood (and Bayesian)

Maximum likelihood: a starting sketch

• Probabilities
  – transition: 0.2, transversion: 0.1, no change: 0.7

[Figure: diagram of transitions (A↔G, T↔C) and transversions, and example trees labelled with nucleotides, each with its probability.]

P = (.7)(.1)(.2)(.7)(.7)
P = (.7)(.1)(.7)(.7)(.7)
P = (.1)(.2)(.7)(.7)(.2)

Find the tree with the highest probability

Assessment of Maximum Likelihood (also Bayesian)

• Strengths

• Weaknesses
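The probability products in the maximum likelihood sketch can be reproduced with a small helper. This is a simplified illustration using only the three probabilities given on the slide (0.7, 0.2, 0.1). The branch list in `example` is hypothetical, chosen so that its factors match the first product, and the function names are assumptions.

```python
from math import prod

PURINES = {"A", "G"}  # C and T are pyrimidines
PROB = {"none": 0.7, "transition": 0.2, "transversion": 0.1}


def change_type(a, b):
    """Classify the change along a branch from nucleotide a to b."""
    if a == b:
        return "none"
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


def tree_probability(branches):
    """Multiply the per-branch probabilities for one labelled tree."""
    return prod(PROB[change_type(a, b)] for a, b in branches)


# Hypothetical branches as (state at start, state at end); chosen only so
# that the factors are (.7)(.1)(.2)(.7)(.7), as in the first product.
example = [("A", "A"), ("A", "T"), ("A", "G"), ("G", "G"), ("G", "G")]
print(tree_probability(example))
```

Multiplied out, the three products on the slide are 0.00686, 0.02401 and 0.00196, so the second labelling has the highest probability of the three.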

Characters to use in phylogeny

• Morphology

• DNA sequence

Challenges of using DNA data

Alignment can be very challenging!

Taxon 1  AATGCGC
Taxon 2  AATCGCT

Taxon 1  AATGCGC
Taxon 2
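Why alignment matters can be seen by counting differences under two alignments of the same pair of sequences. In the sketch below the ungapped pair is the one shown above, while the gapped version is a hypothetical alignment invented for illustration (the slide's own second alignment is not recoverable). `count_differences` is an assumed helper name.

```python
def count_differences(a, b):
    """Return (mismatches, gap positions) for two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    gaps = sum(1 for x, y in zip(a, b) if "-" in (x, y))
    mismatches = sum(1 for x, y in zip(a, b)
                     if x != y and "-" not in (x, y))
    return mismatches, gaps


# Ungapped, as printed on the slide.
print(count_differences("AATGCGC", "AATCGCT"))    # (4, 0)

# Hypothetical alignment with one gap in each sequence.
print(count_differences("AATGCGC-", "AAT-CGCT"))  # (0, 2)
```

The same two sequences imply four substitutions under one alignment and two insertion or deletion events under the other, so the alignment chosen changes the inferred history.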

Informative sequences evolve at moderate rates

• Too slow?
  – not enough variation
  – Taxon1 AATGCGC
  – Taxon2 AATGCGC
  – Taxon3 AATGCGC

Polytomy

Example of insufficient evidence: metazoan phylogeny

[Figure: tree including metazoans and fungi.]

Challenges: sunflower phylogeny

• Recent radiation (200,000 years)
• Many species, much hybridization
• Need more rapidly evolving markers!!

[Figure: sunflower tree; labels: = 15 spp!, = 12 spp!]

Informative sequences evolve at moderate rates

• Too fast?
  – homoplasy likely
  – “saturation”
  – only 4 possible states for DNA
  – Taxon1 ATTCTGA
  – Taxon2 GTAGTGG
  – Taxon3 CGTGCTG

Polytomy

Saturation: mammalian mitochondrial DNA

Saturation

• Imagine changing one nucleotide every hour to a random nucleotide
• Split the ancestral population in 2.

[Figure: sequences in the two lineages after one hour, four hours, 8 hours and 12 hours (ACTTGCT, ACCTGAA, AGCGGAA, ACCAGAA, ACGTGCT, ACGAGCT, GCGATCC, GAGCTCC, AGCCTCC). Red indicates multiple mutations at a site.]

24 hours?
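The thought experiment above is easy to simulate. The sketch below is a minimal version under stated assumptions: the starting sequence is simply the first one listed in the figure, each lineage changes one randomly chosen site per hour to a randomly chosen nucleotide (which may be the same one), and the seed and function names are arbitrary choices for the example.

```python
import random

BASES = "ACGT"


def mutate(seq, rng):
    """Replace one randomly chosen site with a random nucleotide."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(BASES) + seq[i + 1:]


def divergence(ancestor, hours, seed=0):
    """Number of differing sites between two lineages after each hour."""
    rng = random.Random(seed)
    a = b = ancestor  # split the ancestral population in two
    counts = [0]
    for _ in range(hours):
        a, b = mutate(a, rng), mutate(b, rng)
        counts.append(sum(x != y for x, y in zip(a, b)))
    return counts


# Assumed starting sequence: the first sequence shown in the figure.
print(divergence("ACTTGCT", 24))
```

The count of differing sites can never exceed the sequence length, and once sites have been hit more than once it stops tracking the number of mutations that actually occurred, which is what saturation means.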

Forces of evolution and phylogeny reconstruction

How does each force affect the ability to reconstruct phylogeny?
• mutation?
• drift?
• selection?
• non-random mating?
• migration?

Phylogeny case study I: whales

Are whales ungulates (hoofed mammals)?

Figure 14.4

Whales: DNA sequence data

Hillis, D. A. 1999.

How reliable is this tree? Bootstrapping.

How consistent are the data?

• Take the dataset (5 taxa, 10 characters)

Taxon     1   2   3   4   5   6   7   8   9   10
Human     A   C   G   T   T   G   T   A   C   T
Chimp     A   G   G   T   T   C   T   A   T   T
Bonobo    A   G   G   T   T   C   T   A   T   G
Gorilla   A   C   T   T   G   C   T   G   T   C
Orang     T   C   G   T   G   T   A   C   C   C

• Create a new data set by sampling characters at random, with replacement

Taxon     3   8   2   6   10  10  5   8   8   7   3
Human     G   A   C   G   T   T   T   A   A   T   G
Chimp     G   A   G   C   T   T   T   A   A   T   G
Bonobo    G   A   G   C   G   G   T   A   A   T   G
Gorilla   T   G   C   C   C   C   G   G   G   T   T
Orang     G   C   C   T   C   C   G   C   C   A   G
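The resampling step can be sketched as follows. This is a minimal illustration using the 5-taxon, 10-character dataset from the slide; the function name `bootstrap_replicate` and the seed are assumptions, and the sketch draws exactly as many characters as the original dataset has.

```python
import random

DATA = {
    "Human":   "ACGTTGTACT",
    "Chimp":   "AGGTTCTATT",
    "Bonobo":  "AGGTTCTATG",
    "Gorilla": "ACTTGCTGTC",
    "Orang":   "TCGTGTACCC",
}


def bootstrap_replicate(data, rng):
    """Resample alignment columns (characters) at random, with replacement."""
    n = len(next(iter(data.values())))
    cols = [rng.randrange(n) for _ in range(n)]
    resampled = {taxon: "".join(seq[c] for c in cols)
                 for taxon, seq in data.items()}
    return cols, resampled


rng = random.Random(1)  # arbitrary seed, for repeatability only
cols, replicate = bootstrap_replicate(DATA, rng)
print("characters sampled:", [c + 1 for c in cols])  # 1-based, as on the slide
for taxon, seq in replicate.items():
    print(f"{taxon:8s}{seq}")
```

In a full analysis a tree is built from each of many such replicates, and the support for a group is the fraction of replicate trees in which that group appears.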

Molecular clocks

Challenges for phylogeny: gene flow

[Figure: sunflower annuals.]

Different genes may have different histories!

Basic idea of molecular clocks

[Figure: chimps and humans differ by 6 substitutions; whales and hippos differ by 60 substitutions and are dated at 56 mya.]
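The arithmetic behind the basic idea of molecular clocks can be written out with the numbers from the molecular clock figure (6 substitutions between chimps and humans, 60 between whales and hippos, 56 mya). The sketch below assumes that 56 mya is the whale-hippo divergence date used as the calibration and that substitutions accumulate at a constant rate. Both function names are hypothetical.

```python
def substitution_rate(substitutions, million_years):
    """Calibrate the clock: substitutions per million years of divergence."""
    return substitutions / million_years


def divergence_time(substitutions, rate):
    """Estimate a divergence time (in million years) from a calibrated rate."""
    return substitutions / rate


# Calibration pair (assumed): whales vs hippos, 60 substitutions, 56 mya.
rate = substitution_rate(60, 56)

# Pair to be dated: chimps vs humans, 6 substitutions.
print(round(divergence_time(6, rate), 2), "million years")
```

With one tenth as many substitutions, the chimp-human split comes out at about one tenth of the calibration date, roughly 5.6 million years. The estimate is only as good as the calibration and the constant-rate assumption.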

Phylogeny summary

Phylogeny study questions

1) Explain in words the difference between monophyletic, paraphyletic, and polyphyletic taxa. Draw a hypothetical phylogeny representing each type. Give an actual example of a commonly recognized paraphyletic taxon in both animals and in plants.

2) How can a reconstructed phylogeny be used to determine if a similar character in two taxa is due to homoplasy?

3) Whales are classified as cetaceans, not artiodactyl ungulates. This makes artiodactyls paraphyletic – why? What is the evidence that whales belong in the artiodactyls?

4) Phenetics (distance methods) and cladistics (parsimony) differ in the ways they recognize and use similarities among taxa to form phylogenetic groupings. What types of similarity does each school recognize, and how useful is each type of similarity considered to be for identifying groups?

5) What is “bootstrapping” in the context of phylogenetic analysis, and why is this procedure performed?

6) Why are maximum likelihood methods increasing in popularity for reconstructing phylogenies? In your answer, include a short description of how this method identifies the best phylogeny.

7) For what kinds of data can maximum likelihood methods of phylogeny construction be used? Why is this so? What types of data are typically not used, and why?

8) Would animal mitochondrial DNA provide a reasonable molecular tool for evaluating deep phylogenetic relationships between animal phyla? What about ribosomal DNA? Justify your answers.

9) Integrative question: Draw a pair of axes with “Time since divergence” on the x axis and “percent of sites that are the same” on the y axis. Draw a graph that shows the basic pattern for third codon sites: is your graph linear? Explain why or why not.

10) You are studying a group of species that lives in two very different environments. You build two phylogenies: one is based on a locus that is probably under divergent selection in the two environments, while the other phylogeny is based on a neutral locus. Which phylogeny would be more likely to represent the species history? Why?

11) It has been known for a number of years that Anolis lizards found in similar micro-habitats on many separate islands in the Caribbean are very similar to each other (for example, large lizards that feed on the ground, smaller lizards that feed on tree trunks, and very small lizards that feed at the tops of branches). Two different historical explanations have been proposed to explain this pattern: each morph has evolved repeatedly on each island, or each morph has evolved just once, then dispersed. Sketch a phylogeny that would support each hypothesis.

12) Integrative question: the Cameroon lake cichlid phylogeny, showing that the lake species were monophyletic, was based on mitochondrial DNA. Explain why this might not reflect the species history. How could you be more certain about the phylogeny?

13) Explain why allopolyploid taxa pose problems for phylogenies.
