Lecture 5
Phylogenetic Inference
(Figure: from Darwin’s notebook in 1837)
(Figures: phylogenomics of dogs; phylogeny of dog breeds)
Charles Darwin
Willi Hennig
(Figure: from “The Origin” in 1859)
Cladistics
Phylogenetic inference
Willi Hennig, Cladistics
Clade, Monophyletic group, Natural group
a. All individuals in the clade are derived from a single ancestor
b. This ancestor’s descendants are all in the clade
Monophyletic groups
(Figure: tree rising from a node labeled Ancestor, with tips labeled Fishes, Coelacanths, Lungfishes, and Tetrapods, and a group labeled Sarcopterygians)
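The two-part clade definition can be turned into a small check. This is a minimal sketch, not part of the lecture: it assumes a tree written as nested tuples, assumes the commonly cited branching order in which lungfishes are closer to tetrapods than coelacanths are, and uses "Ray-finned fishes" as a hypothetical label for the remaining fishes. The helper names `leaves` and `is_monophyletic` are illustrative.

```python
def leaves(tree):
    """Return the set of tip names below a node (a tip is a plain string)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def is_monophyletic(tree, group):
    """True if some node's descendants are exactly the taxa in `group`."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Assumed topology, for illustration only.
tree = ("Ray-finned fishes", ("Coelacanths", ("Lungfishes", "Tetrapods")))

sarcopterygians = {"Coelacanths", "Lungfishes", "Tetrapods"}
fishes_without_tetrapods = {"Ray-finned fishes", "Coelacanths", "Lungfishes"}

print(is_monophyletic(tree, sarcopterygians))           # all descendants included
print(is_monophyletic(tree, fishes_without_tetrapods))  # tetrapods left out
```

Under this assumed topology, a group that leaves out the tetrapods fails part b of the definition, because not all of the ancestor's descendants are in the group.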
Phylogenetic inference
Definitions:
2. Ancestral vs. Derived characters
(Figure: tree with taxa A, B, C, D)
apomorphy: derived character
synapomorphy: shared derived character
(Figure: tree with taxa A, B, C, D, marking an apomorphy and a synapomorphy)
4. Reversal evolution
(Figure: arrows marking character changes)
5. Homoplasy, Convergent evolution
Fossa, Madagascar, Mongoose
Mountain Lion, California, Cat
Thylacine, Tasmania, Marsupial
6. Parallel evolution
Phylogenetic Inference
• phylogenetic trees are built from “characters”.
• characters can be morphological, behavioral, physiological, or molecular.
• there are two important assumptions about characters used to build trees:
1. they are independent.
2. they are homologous.
What is a homologous character?
• a homologous character is shared by two species because it was inherited from a common ancestor.
• a character that is possessed by two species but was not present in their recent ancestors is said to exhibit “homoplasy”.
Types of homoplasy:
1. Convergent evolution
Example: evolution of eyes, flight.
(Figures: examples of convergent evolution; convergent evolution between placental and marsupial mammals)
2. Parallel evolution
Example: lactose tolerance in human adults.
What is the difference between convergent and parallel evolution?

                     Convergent                 Parallel
Species compared     distantly related          closely related
Trait produced by    different genes/           same genes/
                     developmental pathways     developmental pathways
Types of homoplasy:
3. Evolutionary reversals
Example: back mutations at the DNA sequence level (C → A → C).
Phylogenetic reconstructions
1. Phenetics (Neighbor-Joining)
2. Cladistics (Maximum Parsimony)
3. Statistics (Maximum Likelihood)
Phenetics (Distance Methods)
Aligned sequences (asterisks on the slide mark sites that differ between adjacent sequences):
A  ATGTTGCCA
B  AAGTTGCCA
C  ATCAACCCA
D  CTCAACTTA

Pairwise distance matrix:
     A   B   C   D
A
B    1
C    4   5
D    7   8   4

Clustering by average distance:
(A,B) = 1
(A,B)C = (4+5)/2 = 4.5
(A,B)D = (7+8)/2 = 7.5
(A,B,C)D = (7+8+4)/3 = 6.3
(Figure: resulting tree for A, B, C, D with branch lengths 0.5, 2.25, 1.75, 3.15, and 0.9)
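The averaging steps above can be reproduced with a short sketch. It takes the pairwise distances exactly as printed in the slide's matrix, and the names `dist`, `d`, and `avg_distance` are illustrative rather than part of the lecture. The arithmetic is average-linkage clustering: the distance between two clusters is the mean of all pairwise distances between their members.

```python
from itertools import product

# Pairwise distances as printed in the slide's matrix.
dist = {
    ("A", "B"): 1,
    ("A", "C"): 4, ("B", "C"): 5,
    ("A", "D"): 7, ("B", "D"): 8, ("C", "D"): 4,
}


def d(x, y):
    """Look up a pairwise distance regardless of the order of the pair."""
    return dist[(x, y)] if (x, y) in dist else dist[(y, x)]


def avg_distance(cluster, other):
    """Mean distance over every pair with one member in each cluster."""
    pairs = list(product(cluster, other))
    return sum(d(x, y) for x, y in pairs) / len(pairs)


ab_to_c = avg_distance(["A", "B"], ["C"])        # (4 + 5) / 2
ab_to_d = avg_distance(["A", "B"], ["D"])        # (7 + 8) / 2
abc_to_d = avg_distance(["A", "B", "C"], ["D"])  # (7 + 8 + 4) / 3

print(ab_to_c, ab_to_d, round(abc_to_d, 1))
```

Halving each cluster distance gives the depth of the corresponding node, which is how a distance of 1 between A and B becomes a branch length of 0.5.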
Cladistics: Maximum Parsimony Method
(Figure: one site with states G, G, A, A in taxa A, B, C, D, mapped onto three trees. The tree with tip order A B C D is labeled “1 step”; the trees with tip orders A C B D and A D B C are each labeled “3 steps”.)
Cladistics: Maximum Parsimony
Number of possible rooted and unrooted trees

Number of taxa    Number of rooted trees    Number of unrooted trees
4                 15                        3
7                 10,395                    945
10                34,459,425                2,027,025

Independent gain of camera eye requires two changes
Evolution and loss of camera eye requires six changes
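The tree counts for a given number of taxa can be checked with the standard double-factorial formulas for strictly bifurcating trees: (2n − 3)!! rooted trees and (2n − 5)!! unrooted trees for n taxa. The sketch below is illustrative; the function names are not from the lecture.

```python
def double_factorial(n):
    """Product n * (n - 2) * (n - 4) * ... down to 1 or 2."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def rooted_trees(n_taxa):
    """Number of rooted bifurcating trees for n_taxa >= 2."""
    return double_factorial(2 * n_taxa - 3)


def unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees for n_taxa >= 3."""
    return double_factorial(2 * n_taxa - 5)


for n in (4, 7, 10):
    print(n, rooted_trees(n), unrooted_trees(n))
```

The rapid growth of these counts is why exhaustive parsimony searches become impractical beyond a modest number of taxa.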
How do distance trees differ from cladograms?

                   Distance trees         Cladograms
Characters used    as many as possible    synapomorphies only
Monophyly          not required           absolute requirement
Emphasis           branch lengths         branch-splitting
Outgroup           not required           absolute requirement
3. Statistics (Maximum Likelihood)
The only method based on a mutation model!

Jukes-Cantor Model
(Diagram: nucleotides A, G, C, T; every substitution occurs at rate α.)
pAn = 3α

Kimura 2-parameter Model
(Diagram: nucleotides A, G, C, T; the substitutions A ↔ G and C ↔ T occur at rate α, all other substitutions at rate β.)
pAn = α + 2β
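The two expressions for pAn can be checked by writing the models out as tables of substitution rates. This is a minimal sketch that reads pAn as the total rate at which A changes to any other nucleotide, which is an interpretation of the slide's notation; the function names and the numeric rate values are illustrative.

```python
BASES = "AGCT"
# Substitutions within purines (A, G) or within pyrimidines (C, T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def jukes_cantor(alpha):
    """Every substitution between distinct bases has the same rate alpha."""
    return {x: {y: alpha for y in BASES if y != x} for x in BASES}


def kimura_2p(alpha, beta):
    """Transitions have rate alpha; all other substitutions have rate beta."""
    return {
        x: {y: (alpha if (x, y) in TRANSITIONS else beta) for y in BASES if y != x}
        for x in BASES
    }


def rate_of_leaving(rates, base):
    """Total rate at which `base` changes to any other nucleotide."""
    return sum(rates[base].values())


jc_rate = rate_of_leaving(jukes_cantor(0.1), "A")       # 3 * alpha
k2p_rate = rate_of_leaving(kimura_2p(0.2, 0.05), "A")   # alpha + 2 * beta
print(jc_rate, k2p_rate)
```

Setting β equal to α in the two-parameter model recovers the Jukes-Cantor rates, so the first model is a special case of the second.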
Markov chain Monte Carlo
1. Start at an arbitrary point
2. Make a small random move
3. Calculate height ratio (r) of new state to old state:
   1. r > 1 -> new state accepted
   2. r < 1 -> new state accepted with probability r. If new state not accepted, stay in the old state
4. Go to step 2
(Figure: a landscape with moves labeled 1, 2a “always accept”, and 2b “accept sometimes”)
The proportion of time the MCMC procedure samples from a particular parameter region is an estimate of that region’s posterior probability density.
tree 1: 20 %, tree 2: 48 %, tree 3: 32 %
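The four steps above can be sketched as a tiny sampler over three candidate trees. This is an illustration under stated assumptions, not the lecture's implementation: the "height" of each tree is set to the percentages shown (20 %, 48 %, 32 %), a move proposes one of the other two trees with equal probability, and the function name, step count, and seed are arbitrary choices.

```python
import random


def metropolis(height, n_steps, seed=0):
    """Visit states in proportion to their height using accept/reject moves."""
    rng = random.Random(seed)
    states = list(height)
    current = states[0]                      # step 1: arbitrary starting point
    visits = {s: 0 for s in states}
    for _ in range(n_steps):
        # step 2: propose a random move to one of the other states
        proposal = rng.choice([s for s in states if s != current])
        # step 3: height ratio of new state to old state
        r = height[proposal] / height[current]
        if r >= 1 or rng.random() < r:       # accept always, or with probability r
            current = proposal
        visits[current] += 1                 # a rejected move counts the old state again
    return {s: visits[s] / n_steps for s in states}


posterior = {"tree 1": 0.20, "tree 2": 0.48, "tree 3": 0.32}
freqs = metropolis(posterior, n_steps=200_000)
for name, freq in freqs.items():
    print(name, round(freq, 3))
```

With enough steps, the fraction of time spent on each tree should approach its height, which is the sense in which sampling frequency estimates posterior probability.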
Phylogenetic Inference
Two points to keep in mind:
1. Phylogenetic trees are hypotheses
2. Gene trees are not the same as species trees
• a species tree depicts the evolutionary history of a group of species.
• a gene tree depicts the evolutionary history of a specific locus.
Infer relationships among three species:
Outgroup:
(Figures: conflict between gene trees and species trees)
How do we select the “best” tree?
Evaluating tree support by bootstrapping

Species 1  A A C G C C T… G
Species 2  A T C G C C T… G
Species 3  A T T G A C C… G
Species 4  A T T G A C C… G

(Figure: tree of Species 1, Species 2, Species 3, and Species 4 built from the original alignment)

Step 1. Randomly select a site (alignment column) to represent position 1.
Species 1  T
Species 2  T
Species 3  C
Species 4  C

Step 2. Randomly select a site to represent position 2.
Species 1  T G
Species 2  T G
Species 3  C G
Species 4  C G

Step 3. Generate complete data set (sampling with replacement).
Step 4. Build tree and record if groupings match original tree.
Step 5. Repeat 1,000 times.
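The five steps can be sketched in a few lines. This is a toy illustration under loud assumptions: the alignment uses only the eight columns visible on the slide (the elided "…" columns are left out), and the tree-building step is replaced by a simple stand-in that picks the four-taxon grouping with the smallest summed pairwise distance. The function names and the seed are illustrative.

```python
import random

# Toy alignment: only the columns visible on the slide; elided columns are omitted.
alignment = {
    "Species 1": "AACGCCTG",
    "Species 2": "ATCGCCTG",
    "Species 3": "ATTGACCG",
    "Species 4": "ATTGACCG",
}


def hamming(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def groups_1_with_2(aln):
    """Stand-in for tree building: does (1,2)|(3,4) have the smallest distance sum?"""
    s1, s2, s3, s4 = (aln[f"Species {i}"] for i in range(1, 5))
    split_12 = hamming(s1, s2) + hamming(s3, s4)
    split_13 = hamming(s1, s3) + hamming(s2, s4)
    split_14 = hamming(s1, s4) + hamming(s2, s3)
    return split_12 < min(split_13, split_14)


def bootstrap_support(aln, n_reps=1000, seed=1):
    """Fraction of resampled data sets that recover the (1,2)|(3,4) grouping."""
    rng = random.Random(seed)
    n_sites = len(next(iter(aln.values())))
    hits = 0
    for _ in range(n_reps):
        # Steps 1 to 3: draw columns with replacement until the data set is complete.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {name: "".join(seq[c] for c in cols) for name, seq in aln.items()}
        # Step 4: rebuild the grouping and record whether it matches.
        hits += groups_1_with_2(resampled)
    return hits / n_reps                      # Step 5: summarize over all replicates


support = bootstrap_support(alignment)
print(f"{support:.1%}")
```

The same column indices are applied to every species, so each resampled position keeps the states of all four species together, as in Steps 1 and 2.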
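(Figure: bootstrap tree of Species 1 to Species 4, with support values 98 shown beside Species 1 and Species 2 and 92 shown beside Species 3 and Species 4)
Cospeciation of aphids and their bacterial endosymbionts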