Phylogenetics 1: An overview
Phylogenetic tree used in The Origin of Species. Darwin wasn’t just thinking about classification based on phylogenies. He used them to visualize the process of divergence within species and the splitting of populations into separate species. Darwin used this figure to illustrate divergence of variants within species; over time successively more variation accumulates. Eventually some of this variation forms the basis for new species.
“The affinities of all beings of the same class have sometimes been represented by a great tree. I believe this simile largely speaks the truth. The green budding twigs may represent existing species; and those produced during former years may represent the long succession of extinct species...and this connection of the former and present buds by ramifying branches may well represent the classification of all extinct and living species in groups subordinate to groups.” Charles Darwin, in Chapter IV of On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life.
Unrooted tree diagram drawn in the margin of one of Charles Darwin’s notebooks
Phylogenetics: The biological discipline devoted to reconstructing gene or genome phylogenies
Growth of phylogenetics:
1. Phylogenetic methods (1960s)
2. Recognition that phylogenies were relevant to nearly all disciplines of biology (1970s?)
3. Molecular biotechnology revolution [PCR] (1980s)
4. Economics of computational capacity (1990s)
• Analogy
• Homology
• Polarity
• Ancestral character
• Derived character
[Figure: the same phylogeny of Felis, Canis, Ursus, Bos, Hippopotamus, Physeter, Balaenoptera, Rhinoceros and Equus drawn twice, each with a marked root and a 0.1 scale bar.]
Left panel: Branch lengths estimated under the assumption of the molecular clock. Tips are contemporary; the distance from root to each tip is the same.
Right panel: Branch lengths estimated without assumption of the molecular clock. Tips are NOT contemporary; the distance from root to each tip is NOT the same.
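The difference between the two panels can be checked numerically: on a clock-like tree every root-to-tip path has the same total length. The following is a minimal Python sketch, assuming a tree stored as nested tuples. The taxon names come from the figure, but the topology and branch lengths are made up for illustration.

```python
# A node is (name, branch_length, children); a tip has an empty child list.
# Topologies and branch lengths are hypothetical, for illustration only.

def root_to_tip(node, depth=0.0):
    """Return {tip_name: distance from the root} for the subtree at node."""
    name, length, children = node
    depth += length
    if not children:
        return {name: depth}
    distances = {}
    for child in children:
        distances.update(root_to_tip(child, depth))
    return distances

def is_ultrametric(tree, tol=1e-9):
    """True if all root-to-tip distances agree to within tol."""
    d = list(root_to_tip(tree).values())
    return max(d) - min(d) <= tol

clock_tree = ("root", 0.0, [
    ("Felis", 0.3, []),
    ("n1", 0.1, [("Canis", 0.2, []), ("Ursus", 0.2, [])]),
])
no_clock_tree = ("root", 0.0, [
    ("Felis", 0.25, []),
    ("n1", 0.1, [("Canis", 0.35, []), ("Ursus", 0.15, [])]),
])

print(is_ultrametric(clock_tree))
print(is_ultrametric(no_clock_tree))
print(root_to_tip(no_clock_tree))
```

The tolerance is there because branch lengths are floating-point numbers; sums that are equal on paper can differ in the last digit.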
The phylogenetic comparative method
[Figure: two scatterplots of Y against X. Left panel: hypothetical dataset for phenotype (Y) and ecological variable (X). Right panel: two-point dataset from early in evolutionary history.]
Hypothetical example: Y: size of a primate’s big toe; X: the stubbiness of the habitat
The phylogenetic comparative method
[Figure, left panel: phylogeny of two groups of close relatives, the “big-toed” clade and the “little-toed” clade. Labels: old divergence of “big-toed” and “little-toed” primates; recent diversifications.]
[Figure, right panel: hypothetical dataset (Y against X) with points coloured according to clade of origin (“big-toe clade”, “little-toe clade”).]
Species are NOT drawn independently from the same distribution.
“phylogenies are fundamental to comparative biology; there is no doing it without taking them into account” (Joseph Felsenstein)
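The point about non-independence can be illustrated with a small calculation. The sketch below uses invented numbers for the two hypothetical clades: within each clade X and Y are uncorrelated, yet pooling all species gives a strong correlation that reflects only the single old divergence between the clades. It is an illustration of the problem, not an implementation of a comparative method such as independent contrasts.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical values: four species per clade, X = habitat, Y = toe size.
little_toe = {"x": [1, 2, 3, 4], "y": [2, 1, 1, 2]}
big_toe = {"x": [11, 12, 13, 14], "y": [12, 11, 11, 12]}

# Treating all eight species as independent draws from one distribution.
pooled_r = pearson(little_toe["x"] + big_toe["x"],
                   little_toe["y"] + big_toe["y"])

# Looking inside each clade separately.
within_r = [pearson(c["x"], c["y"]) for c in (little_toe, big_toe)]

print("pooled r:", round(pooled_r, 3))
print("within-clade r:", within_r)
```

With these made-up numbers the pooled correlation is driven by one evolutionary event (the split between the clades), so the eight species contribute far fewer than eight independent observations.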
Applications of phylogenetics
1. Systematics, classification, and taxonomy
2. Biogeography
3. Health Sciences
4. Agriculture
5. Conservation
6. Linguistics
Applications of phylogenetics: systematics
ERNST HAECKEL’S “TREE OF LIFE”, DRAWN SOMETIME IN THE LATE 1800’S
Placed Menschen (“Men”) at the “top” of the tree among the Affen (“Apes”). Haeckel was first to suggest man’s ancestry was among the Great Apes.
This tree was a tree of “men”, and Haeckel’s placement of Menschen at the top was intentional.
This tree and associated system of classification is different from modern ones in that it is based on the notion of linear progress (like a ladder) from the most primitive single-celled organisms “upwards” to man (at the very top).
Haeckel considered the things near the top as “more evolved” and things near the bottom as “primitive”.
Ernst Haeckel (1834-1919) was a German biologist and scientific illustrator. He was one of the first popularizers of Darwin’s Theory of Evolution. The tree to the left is from his book “General Morphology – founded on the descent theory”.
[Figure labels beside the tree, from top to bottom: Non-mammalian vertebrates, Invertebrates, Protozoa]
Applications of phylogenetics: systematics
Monophyly, paraphyly and polyphyly
[Figure: the same tree drawn twice, with tips A, B, C, D, E and internal nodes F, G, H, J.]
Left panel: Monophyletic group [Clade]
Right panel: Paraphyletic group (AHJGFDE) and a polyphyletic group (BC)
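A group of tips is monophyletic when it is exactly the full set of descendants of some node. The sketch below checks that condition on a tree written as nested tuples. The topology is an assumption made for illustration (the topology of the figure cannot be recovered from the labels alone), and the function only separates monophyletic from non-monophyletic groups; it does not distinguish paraphyly from polyphyly.

```python
def tips(node):
    """Return the set of tip names below node (a tip is a plain string)."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= tips(child)
    return result

def is_monophyletic(tree, group):
    """True if some node's complete set of descendant tips equals group."""
    group = set(group)
    if tips(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)

# Hypothetical topology: ((A,B),(C,(D,E)))
tree = (("A", "B"), ("C", ("D", "E")))

print(is_monophyletic(tree, {"D", "E"}))       # D and E share a node of their own
print(is_monophyletic(tree, {"C", "D", "E"}))  # complete clade
print(is_monophyletic(tree, {"C", "D"}))       # leaves out E, so not a clade
print(is_monophyletic(tree, {"B", "C"}))       # tips from two separate clades
```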
Applications of phylogenetics: systematics
The old Reptilia as an example of classification based on a paraphyletic group.
[Figure: phylogeny of the amniotes. Tip labels: Aves (birds); lots of dinosaur diversity; Ornithischia (some plant eating dinosaurs); Crocodylomorph (gators and crocs); Lepidosauromorph (lizards, snakes, etc.); Anapsids (turtles and relatives); diversity of extinct mammal-like reptiles; Mammals (Synapsids). Annotations: “Old Reptilia is a GRADE”; “Amniota is a clade”; “I am a synapsid too!”]
Applications of phylogenetics: systematics
http://www.tolweb.org/tree/
Applications of phylogenetics: biogeography
Phylogeography allows one to test hypotheses such as whether geographic/environmental factors have been historically important barriers to gene flow.
[Map labels: WEST: low elevation and dry. EAST: high elevation and wet.]
Applications of phylogenetics: biogeography of mouse lemurs
Phylogeographic analysis of mouse lemurs contradicts the expected east-west disjunction for Madagascar, and suggests a completely novel north-south disjunction. The observed phylogenetic tree was inferred from mitochondrial DNA gene sequences.
Figure adapted from separate figures in A. D. Yoder (2004) In press
Applications of phylogenetics: biogeography ⇒ conservation
Applications of phylogenetics: Anne Yoder’s research group
Applications of phylogenetics: Health Sciences and HIV
HIV transmission in health care:
1. Patient ⇒ health care worker: well known
2. Health care worker ⇒ patient: unknown
CDC: epidemiological investigation of a dentist with infected patients in the 1990s
• only risk factor was a common dentist
• phylogenetics of HIV env gene sequences
HIV-1 genome:
[Figure: phylogenetic tree of HIV sequences. Tip labels as extracted, top to bottom: Dentist, Patient C, Patient A, Patient G, Patient B, Patient E, Patient A, Dentist, Local No2, Local No3, Patient F, Local No9, Local No35, Local No3, Patient D. Annotations: “No other risk factors. All had invasive dental procedures.” (beside the patients listed next to the dentist); “Sex partner with HIV” (Patient F); “Behavioral risk for HIV” (Patient D).]
Applications of phylogenetics: agriculture
1. What was the origin of a pest or agricultural disease species?
2. How did some pest organisms evolve resistance to pesticides?
3. How did a pest species spread through agriculture?
4. Are there species that are closely related to known pests that might also cause problems?
Applications of phylogenetics: agriculture (and health science)
Fusarium: an economically significant fungal crop pathogen
Powerful toxin that inhibits eukaryotic protein synthesis
Applications of phylogenetics: agriculture
Fusarium graminearum is a fungal pathogen of commercially important species of grains. Phylogenetic analysis indicates substantial genetic divergence among strains in different agricultural settings.
Phylogenetic tree inferred from the combined sequences of six single-copy nuclear genes (7,120 bp) using maximum parsimony. Numbers above the nodes are bootstrap proportions.
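Maximum parsimony prefers the tree that requires the fewest character changes. For a fixed rooted binary tree, that minimum can be counted with Fitch's algorithm, sketched below in Python. The sequences, tip names and candidate trees are invented for illustration and are not the data or the analysis of O’Donnell et al.; a real analysis also has to search over many candidate trees.

```python
def fitch(node, states):
    """Return (possible_state_set, change_count) for the subtree at node.

    node is a tip name (str) or a pair (left, right);
    states maps each tip name to its state at one site.
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left_set, left_cost = fitch(node[0], states)
    right_set, right_cost = fitch(node[1], states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: at least one change on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_length(tree, alignment):
    """Minimum number of changes on tree, summed over all alignment columns."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    total = 0
    for i in range(n_sites):
        site = {name: alignment[name][i] for name in names}
        total += fitch(tree, site)[1]
    return total

# Hypothetical three-site alignment for four hypothetical strains.
alignment = {"s1": "AAG", "s2": "AAG", "s3": "CAT", "s4": "CGT"}
tree_a = (("s1", "s2"), ("s3", "s4"))
tree_b = (("s1", "s3"), ("s2", "s4"))

print(parsimony_length(tree_a, alignment))
print(parsimony_length(tree_b, alignment))
```

Under the parsimony criterion the tree with the smaller length is preferred; here that is `tree_a`, which groups the strains with identical sequences.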
Genetic divergence among strains of Fusarium indicates that movement of crops among different agricultural settings must be carefully monitored to prevent introduction of “foreign strains”. Local crops are likely to be much less resistant to the “foreign” strains of Fusarium, as compared with the local strain.
Figure adapted from O’Donnell et al. (2000) PNAS, 97:7905-7910.
Applications of phylogenetics: conservation
This article highlights three uses of the comparative method in conservation:
(i) developing predictive models for risk assessment
(ii) identifying the general ecological principles that cause conservation problems
(iii) identifying and using endangering traits as triage to prioritize research and conservation efforts
Potential pitfalls are:
(i) large and expensive sample sizes required for high power of the method
(ii) problems with using correlation-based methods to identify causal mechanisms
Despite the limitations, it seems that the comparative method will grow to be one of many essential tools for conservation research. A hypothetical example from this paper is presented below that illustrates how application of Fisher’s exact test to the raw data (ignoring phylogenetic non-independence) overestimates the relationship between extinction risk and body size.
Should we use a Fisher exact test?
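To see why the answer matters, the sketch below implements a one-sided Fisher exact test for a 2×2 table and applies it twice: once to species counts, and once to counts in which each cluster of close relatives is collapsed to a single observation. Both tables are invented for illustration and are not the example from the paper, and collapsing clades is a crude stand-in for a proper comparative method.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """P(top-left cell >= a) for the 2x2 table [[a, b], [c, d]] with fixed margins."""
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    total = comb(n, row1)
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        # Hypergeometric probability of exactly k in the top-left cell.
        p += comb(col1, k) * comb(n - col1, row1 - k) / total
    return p

# Rows: large-bodied, small-bodied. Columns: threatened, not threatened.
# Hypothetical counts with every species treated as an independent data point.
p_species = fisher_one_sided(8, 2, 2, 8)

# Hypothetical counts after collapsing close relatives that share both traits
# by descent into one observation each.
p_clades = fisher_one_sided(2, 1, 1, 2)

print(round(p_species, 4))
print(round(p_clades, 4))
```

With these made-up tables the species-level test looks significant at the 0.05 level while the collapsed table does not, which is the kind of overstatement that ignoring phylogenetic non-independence can produce.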
Applications of phylogenetics: linguistics
Language phylogeny and divergence dates support the Anatolian-origin theory of the Indo-European language family.
Data: Cognate word forms were sampled from 87 languages. Three extinct languages thought to be more distantly related than the extant languages were included for the purpose of rooting the tree. Cognates were coded as present or absent (1 or 0) for each language. The final dataset was a binary matrix of 2,449 cognates.
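The coding step described above can be sketched in a few lines of Python. The language labels and cognate-set labels below are placeholders invented for illustration, not entries from the actual dataset: each language is recorded as 1 or 0 for every cognate set, giving one binary row per language.

```python
# Hypothetical cognate judgements: language -> cognate sets attested in it.
cognate_sets = {
    "lang_A": {"water_1", "fire_1", "hand_1"},
    "lang_B": {"water_1", "fire_2", "hand_1"},
    "lang_C": {"water_2", "fire_2"},
}

# One column per cognate set, in a fixed (sorted) order.
columns = sorted(set().union(*cognate_sets.values()))

# One row per language: 1 if the cognate set is present, 0 if absent.
matrix = {
    lang: [1 if col in present else 0 for col in columns]
    for lang, present in cognate_sets.items()
}

print(columns)
for lang, row in matrix.items():
    print(lang, row)
```

Each column of such a matrix plays the role that an alignment site plays in a molecular analysis.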
Methods: Phylogenetic analysis was conducted under a stochastic model of binary character evolution that allowed for unequal character state frequencies and heterogeneous rates of evolution among cognates. Bayesian methods were used to infer the tree topology shown to the left. Values above each branch (in black) are Bayesian posterior probabilities.
Divergence times were estimated by first assuming maximum and minimum divergence dates for 11 “calibration nodes” on the phylogeny. A semi-parametric, likelihood-based method was used to infer the divergence dates for the nodes of the phylogeny.
Gray and Atkinson (2003) Nature 426:435-439
[Figure labels: Estimated date of ancestral node; Root; Extinct languages used as outgroups]
Applications of phylogenetics: manuscript evolution
..discover relationships between different manuscript versions of a text
Rooting a phylogeny with an outgroup
Let’s define some terms:
INGROUP: A group of lineages, assumed to be monophyletic, whose phylogenetic relationships are of primary interest.
OUTGROUP: One or more terminal taxa that are assumed to be outside of the monophyletic group that has been specified as the ingroup. Unlike the ingroup, the outgroup does not have to be monophyletic.
ROOT: The most evolutionarily basal point of a phylogeny. The root orients the direction of change along a phylogeny relative to time.
CHARACTER POLARITY: The evolutionary relationship between two or more states for a given character. Say we have a character with two states, “a” and “b”. By mapping them on a phylogeny we can determine that “b” preceded “a” in evolutionary history; hence “a” is the derived state and “b” is the primitive state.
Rooting a phylogenetic tree by placing the root between the ingroup and outgroup
IG: ingroup; OG: outgroup
[Figure, three panels with tips IG-1, IG-2, IG-3, IG-4 and OG: (1) unrooted tree; (2) placing root between ingroup and outgroup; (3) rooted tree.]
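The operation in the figure can be sketched as follows: take the unrooted tree as a set of connections, find the branch that joins the outgroup tip to the rest of the tree, and place the root on that branch. The Python sketch below assumes a single outgroup tip and a hypothetical topology for the four ingroup taxa; the internal node names n1 to n3 are invented, and the output is a Newick string.

```python
def subtree(adj, node, parent):
    """Newick text for the part of the tree reached from node, away from parent."""
    children = [n for n in adj[node] if n != parent]
    if not children:
        return node
    return "(" + ",".join(subtree(adj, c, node) for c in children) + ")"

def root_with_outgroup(adj, outgroup):
    """Place the root on the branch joining a single outgroup tip to the rest."""
    (neighbor,) = adj[outgroup]  # a tip has exactly one neighbour
    return "(" + outgroup + "," + subtree(adj, neighbor, outgroup) + ");"

# Hypothetical unrooted tree, given as a list of branches.
edges = [("n1", "IG-1"), ("n1", "IG-2"), ("n1", "n2"),
         ("n2", "IG-3"), ("n2", "n3"), ("n3", "IG-4"), ("n3", "OG")]

# Build an adjacency list: node -> neighbouring nodes.
adj = {}
for a, b in edges:
    adj.setdefault(a, []).append(b)
    adj.setdefault(b, []).append(a)

print(root_with_outgroup(adj, "OG"))
```

The unrooted tree itself does not change; rooting only picks the branch on which the oldest point lies, and the direction of every other branch follows from that choice.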
Rooting a phylogeny with an outgroup
Flowchart of the general method of outgroup analysis. This method is based on simultaneous phylogenetic analysis of ingroup and outgroup lineages.
1. Define ingroup, usually by synapomorphies
2. Define outgroup, usually by more inclusive synapomorphies
3. Combine ingroup and outgroup into single dataset
4. Conduct unrooted phylogenetic analysis
5. Root tree between ingroup and outgroup
6. Read characters from phylogeny
Notes attached to the flowchart:
• Other methods do not use outgroups; e.g., mid-point methods, and hypothetical ancestors
• Treat outgroups as terminal taxa
• Any method can be used: parsimony, likelihood, etc.
• Distinguish between primitive and derived, and between homology and analogy
Rooting a phylogeny with an outgroup
Outgroup myths:
Myth 1: The character state in the outgroup should be considered primitive. In reality, character states in the outgroup can be, and often are, highly derived features of the organism.
Myth 2: The outgroup should be the sister taxon to the ingroup. There are many reasons why this is desirable; however it is not absolutely necessary. It is possible to root a tree by using an outgroup that is more distantly related to the ingroup than its sister group.
Myth 3: More than one outgroup is required to root a tree. Of course larger sample sizes are generally better than smaller ones, but as we have shown above, it is possible to place a root on a tree by using only a single outgroup taxon.