
Lecture 5: Phylogenetic Inference

[Figure: From Darwin’s notebook in 1837]

[Figure: Phylogenomics of dogs; phylogeny of dog breeds]

Charles Darwin

Willi Hennig

[Figure: From “The Origin” in 1859]

Phylogenetic inference

Willi Hennig, Cladistics

Clade, Monophyletic group, Natural group

a. All individuals in the clade are derived from a single ancestor

b. This ancestor’s descendants are all in the clade

[Figure: Monophyletic groups. Labels: Ancestor, Fishes, Coelacanths, Lungfishes, Tetrapods, Sarcopterygians]
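The two-part clade definition above can be checked mechanically: a group is monophyletic when it is exactly the set of tips descending from one node. The sketch below is only an illustration. The nested-tuple tree, the tip name "ray-finned fishes", and the helper names are assumptions, not taken from the slides.

```python
def leaves(tree):
    """Return the set of tip names under a node (a tip is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Yield the tip set of every node in the tree, root included."""
    yield leaves(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if `group` is exactly the set of all descendants of some node."""
    return frozenset(group) in set(clades(tree))


# Assumed toy topology for illustration only.
TREE = ("ray-finned fishes", ("coelacanths", ("lungfishes", "tetrapods")))

sarcopterygians = {"coelacanths", "lungfishes", "tetrapods"}
fishes = {"ray-finned fishes", "coelacanths", "lungfishes"}

print(is_monophyletic(TREE, sarcopterygians))
print(is_monophyletic(TREE, fishes))
```

Under this assumed topology, the sarcopterygian set passes the check, while the "fishes" set fails because it leaves out the tetrapods, which descend from the same ancestor.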

Phylogenetic inference

Definitions:

2. Ancestral vs. derived characters

[Figure: tree of taxa A, B, C, D]

apomorphy: derived character

synapomorphy: shared derived character

[Figure: tree of taxa A, B, C, D with an apomorphy and a synapomorphy marked]

4. Evolutionary reversal

5. Homoplasy, Convergent evolution

[Figure: Fossa, Madagascar, Mongoose; Mountain Lion, California, Cat; Thylacine, Tasmania, Marsupial]

6. Parallel evolution

Phylogenetic Inference

• phylogenetic trees are built from “characters”.

• characters can be morphological, behavioral, physiological, or molecular.

• there are two important assumptions about the characters used to build trees:

1. they are independent.

2. they are homologous.

What is a homologous character?

• a homologous character is shared by two species because it was inherited from a common ancestor.

• if a character is possessed by two species but was not present in their recent ancestors, it is said to exhibit “homoplasy”.

Types of homoplasy:

1. Convergent evolution

Example: evolution of eyes, flight.

[Figure: Examples of convergent evolution; convergent evolution between placental and marsupial mammals]

2. Parallel evolution

Example: lactose tolerance in human adults.

What is the difference between convergent and parallel evolution?

                     Convergent                                Parallel
Species compared     distantly related                         closely related
Trait produced by    different genes/developmental pathways    same genes/developmental pathways

Types of homoplasy:

1. Convergent evolution

Example: evolution of eyes, flight.

2. Parallel evolution

Example: lactose tolerance in human adults

3. Evolutionary reversals

Example: back mutations at the DNA sequence level (C → A → C).

Phylogenetic reconstructions

1. Phenetics (Neighbor-Joining)

2. Cladistics (Maximum Parsimony)

3. Statistics (Maximum Likelihood)

Phenetics (Distance Methods)

Sequences:

A    ATGTTGCCA
B    AAGTTGCCA
C    ATCAACCCA
D    CTCAACTTA

Pairwise distance matrix:

     A    B    C
B    1
C    4    5
D    7    8    4

Cluster distances:

(A,B) = 1
(A,B)C = (4+5)/2 = 4.5
(A,B)D = (7+8)/2 = 7.5
(A,B,C)D = (7+8+4)/3 = 6.3

[Figure: resulting tree of A, B, C, D with branch lengths 0.5, 2.25, 1.75, 3.15, and 0.9]
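The averaging arithmetic in the distance calculations above can be reproduced with a few lines of code. This is a minimal sketch: the pairwise distances are copied from the slide's matrix, the join order ((A,B), then C, then D) is taken from the slide rather than derived, and the helper names are assumptions made for illustration.

```python
from itertools import product

# Pairwise distances as listed in the slide's matrix.
DIST = {
    ("A", "B"): 1,
    ("A", "C"): 4, ("B", "C"): 5,
    ("A", "D"): 7, ("B", "D"): 8, ("C", "D"): 4,
}


def pair_distance(x, y):
    """Look up a pairwise distance regardless of the order of the pair."""
    return DIST[(x, y)] if (x, y) in DIST else DIST[(y, x)]


def average_distance(cluster1, cluster2):
    """Mean distance over all pairs with one member in each cluster."""
    pairs = list(product(cluster1, cluster2))
    return sum(pair_distance(x, y) for x, y in pairs) / len(pairs)


ab_c = average_distance("AB", "C")     # (4 + 5) / 2
ab_d = average_distance("AB", "D")     # (7 + 8) / 2
abc_d = average_distance("ABC", "D")   # (7 + 8 + 4) / 3

# Node heights are half the cluster distance; a branch length is the
# difference between the heights at its two ends.
height_ab = pair_distance("A", "B") / 2
height_abc = ab_c / 2
height_abcd = abc_d / 2

print(ab_c, ab_d, round(abc_d, 2))
print(height_ab, height_abc, height_abc - height_ab)
```

Halving each cluster distance gives the node heights, which is where the branch lengths 0.5, 2.25, and 1.75 come from. The last two lengths on the slide (3.15 and 0.9) follow the same rule using the rounded value 6.3.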

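Cladistics: Maximum Parsimony Method

[Figure: one character with states G, G, A, A in taxa A, B, C, D, mapped onto three trees. The tree with tip order A, B, C, D is labeled 1 step; the trees with tip orders A, C, B, D and A, D, B, C are each labeled 3 steps.]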
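One common way to count changes on a tree is Fitch's algorithm, which is not named in the slides and is used here only as an illustration. The sketch below scores the character from the parsimony example above (G, G, A, A in taxa A, B, C, D) on three rooted trees written as nested tuples. The tree encodings and function names are assumptions. Fitch's method reports the minimum number of changes for each tree, so its counts may be lower than those of a particular reconstruction drawn on a figure.

```python
def fitch(tree, states):
    """Return (candidate ancestral states, minimum changes) for a rooted binary tree."""
    if isinstance(tree, str):               # a tip: its state is observed
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:                              # children agree: no extra change
        return shared, left_changes + right_changes
    return left_set | right_set, left_changes + right_changes + 1


STATES = {"A": "G", "B": "G", "C": "A", "D": "A"}

# Hypothetical encodings of the three groupings of four taxa.
TREES = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}

steps = {name: fitch(tree, STATES)[1] for name, tree in TREES.items()}
best = min(steps, key=steps.get)
print(steps)
print("most parsimonious:", best)
```

The tree that groups A with B and C with D needs a single change, so it is preferred under parsimony.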

Cladistics: Maximum Parsimony

Number of possible rooted trees

Number of taxa    Number of rooted trees    Number of unrooted trees
4                 15                        3
7                 10,395                    945
10                34,459,425                2,027,025

Independent gain of camera eye requires two changes

Evolution and loss of camera eye requires six changes
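The tree counts for 4, 7, and 10 taxa follow the standard double-factorial formulas for bifurcating trees: (2n − 3)!! rooted trees and (2n − 5)!! unrooted trees for n taxa. The formulas are not stated on the slides, and the function names in this short sketch are assumptions.

```python
def double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1 (returns 1 for k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def rooted_trees(n_taxa):
    """Number of rooted bifurcating trees for n_taxa labeled tips."""
    return double_factorial(2 * n_taxa - 3)


def unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees for n_taxa labeled tips."""
    return double_factorial(2 * n_taxa - 5)


for n in (4, 7, 10):
    print(n, rooted_trees(n), unrooted_trees(n))
```

The rapid growth of these numbers is why exhaustive searches over all trees become impractical beyond a handful of taxa.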

How do distance trees differ from cladograms?

                   Distance trees         Cladograms
Characters used    as many as possible    synapomorphies only
Monophyly          not required           absolute requirement
Emphasis           branch lengths         branch-splitting
Outgroup           not required           absolute requirement

3. Statistics (Maximum Likelihood)

Of the three reconstruction methods listed above, the only one based on an explicit mutation model.

3. Maximum Likelihood

Jukes-Cantor model: every substitution among A, G, C, and T occurs at the same rate α, so the total rate at which A changes to any other nucleotide is $p_{An} = 3\alpha$.

Kimura 2-parameter model: transitions (A ↔ G, C ↔ T) occur at rate α and transversions (all other changes) at rate β, so $p_{An} = \alpha + 2\beta$.
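The two total rates for the Jukes-Cantor and Kimura 2-parameter models can be checked by writing out the substitution rates and summing the ones that leave a given base. This is a small sketch under assumed names and arbitrary example values for α and β. It only builds the off-diagonal rates and is not a full likelihood calculation.

```python
BASES = "AGCT"
# Transitions are purine-purine or pyrimidine-pyrimidine changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def jukes_cantor(alpha):
    """Every substitution has the same rate alpha."""
    return {(x, y): alpha for x in BASES for y in BASES if x != y}


def kimura_2p(alpha, beta):
    """Transitions have rate alpha, transversions have rate beta."""
    return {
        (x, y): (alpha if (x, y) in TRANSITIONS else beta)
        for x in BASES for y in BASES if x != y
    }


def total_rate_from(rates, base):
    """Total rate at which `base` changes to any other nucleotide."""
    return sum(rate for (source, _), rate in rates.items() if source == base)


alpha, beta = 0.2, 0.05   # arbitrary example values
print(total_rate_from(jukes_cantor(alpha), "A"))      # 3 * alpha
print(total_rate_from(kimura_2p(alpha, beta), "A"))   # alpha + 2 * beta
```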

Markov chain Monte Carlo

1. Start at an arbitrary point.
2. Make a small random move.
3. Calculate the height ratio (r) of the new state to the old state:
   1. r > 1 -> new state accepted.
   2. r < 1 -> new state accepted with probability r. If the new state is not accepted, stay in the old state.
4. Go to step 2.

[Figure: moves on a posterior density curve, labeled 1, 2a (always accept), and 2b (accept sometimes)]

The proportion of time the MCMC procedure samples from a particular parameter region is an estimate of that region’s posterior probability density.

[Figure: tree 1: 20 %, tree 2: 48 %, tree 3: 32 %]
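The accept/reject loop above can be sketched on a toy problem with three states. The sketch treats the figure's percentages as the "heights" of three trees, which is an assumption made for illustration, and a "small random move" is simplified to a jump to one of the other two trees. The function name, step count, and seed are likewise hypothetical.

```python
import random
from collections import Counter


def metropolis(heights, n_steps=200_000, seed=1):
    """Sample states in proportion to their heights using the accept/reject rule."""
    rng = random.Random(seed)
    states = list(heights)
    current = states[0]                      # step 1: arbitrary starting point
    visits = Counter()
    for _ in range(n_steps):
        # step 2: propose a move to one of the other states (symmetric proposal)
        proposal = rng.choice([s for s in states if s != current])
        # step 3: height ratio of the new state to the old state
        r = heights[proposal] / heights[current]
        if r >= 1 or rng.random() < r:       # always accept uphill, sometimes downhill
            current = proposal
        visits[current] += 1                 # a rejected move counts the old state again
    return {s: visits[s] / n_steps for s in states}


# Assumed heights, taken from the percentages shown in the figure.
HEIGHTS = {"tree 1": 0.20, "tree 2": 0.48, "tree 3": 0.32}
frequencies = metropolis(HEIGHTS)
for tree, freq in frequencies.items():
    print(tree, round(freq, 3))
```

With enough steps, the fraction of time spent on each tree approaches its share of the total height, which is the sense in which sampling frequency estimates posterior probability.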

Phylogenetic reconstructions

1. Phenetics (Neighbor-Joining)

2. Cladistics (Maximum Parsimony)

3. Statistics (Maximum Likelihood)

Phylogenetic Inference

Two points to keep in mind:

1. Phylogenetic trees are hypotheses

2. Gene trees are not the same as species trees

• a species tree depicts the evolutionary history of a group of species.

• a gene tree depicts the evolutionary history of a specific locus.

Infer relationships among three species:

Outgroup:

Conflict between gene trees and species trees

How do we select the “best” tree?

Evaluating tree support by bootstrapping

Species 1    A A C G C C T … G
Species 2    A T C G C C T … G
Species 3    A T T G A C C … G
Species 4    A T T G A C C … G

[Figure: original tree of Species 1, Species 2, Species 3, and Species 4]

Step 1. Randomly select a column (site) of the alignment to represent position 1.

Species 1    T
Species 2    T
Species 3    C
Species 4    C

Step 2. Randomly select a column to represent position 2.

Species 1    T G
Species 2    T G
Species 3    C G
Species 4    C G

Step 3. Generate complete data set (sampling with replacement).

Step 4. Build tree and record if groupings match original tree.

Step 5. Repeat 1,000 times.
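The resampling steps above can be sketched as follows. The alignment uses only the seven leading columns visible on the slide; the elided columns are not guessed. Tree building is replaced by a deliberately crude stand-in, an assumption made to keep the example short: two species count as grouped when they are strictly closer to each other than either is to any other species. All function names are hypothetical.

```python
import random

# Only the seven leading columns shown on the slide.
ALIGNMENT = {
    "Species 1": "AACGCCT",
    "Species 2": "ATCGCCT",
    "Species 3": "ATTGACC",
    "Species 4": "ATTGACC",
}


def bootstrap_columns(alignment, rng):
    """Steps 1-3: draw columns with replacement until the data set is complete."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def hamming(a, b):
    """Number of positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def groups_together(alignment, a, b):
    """Stand-in for step 4: a and b are strictly closer to each other than to anything else."""
    d_ab = hamming(alignment[a], alignment[b])
    others = [name for name in alignment if name not in (a, b)]
    return all(
        d_ab < hamming(alignment[a], alignment[o])
        and d_ab < hamming(alignment[b], alignment[o])
        for o in others
    )


def bootstrap_support(alignment, a, b, replicates=1000, seed=0):
    """Step 5: percentage of replicates in which a and b group together."""
    rng = random.Random(seed)
    hits = sum(
        groups_together(bootstrap_columns(alignment, rng), a, b)
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates


print(bootstrap_support(ALIGNMENT, "Species 3", "Species 4"))
```

In a real analysis, step 4 would rebuild a full tree from each resampled data set with the same method used for the original tree. The grouping test here only mimics that comparison for a single pair.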

[Figure: tree of Species 1, Species 2, Species 3, and Species 4 with bootstrap support values 98 and 92]

Cospeciation of aphids and their bacterial endosymbionts