
Introduction to Biosystematics - Zool 575

Lecture 25 - Confidence - Assessment of the Strength of the Phylogenetic Signal - part 2

“Quantifying the uncertainty of a phylogenetic estimate is at least as important a goal as obtaining the phylogenetic estimate itself.”
- Huelsenbeck & Rannala (2004)

Confidence - Assessment of the Strength of the Phylogenetic Signal - part 2

1. Consistency Index
2. g1 statistic, PTP - test
3. Consensus trees
4. Decay index (Bremer Support)
5. Bootstrapping / Jackknifing
6. Statistical hypothesis testing (frequentist)
7. Posterior probability (see lecture on Bayesian)

Derek S. Sikes University of Calgary Zool 575

Multiple optimal trees

• Many methods can yield multiple equally optimal trees
• We can further select among these trees with additional criteria, but
• Typically, relationships common to all the optimal trees are summarized with consensus trees

Multiple optimal trees

• If multiple optimal trees are found we know that all of them are wrong except, possibly, (hopefully) one
• Some have argued against consensus tree methods for this reason
• Debate over quest for true tree (point estimate) versus quantification of uncertainty

Consensus methods

• A consensus tree is a summary of the agreement among a set of fundamental trees
• There are many consensus methods that differ in:
  1. the kind of agreement
  2. the level of agreement
• Consensus methods can be used with multiple trees from a single analysis or from multiple analyses

Strict consensus methods

• Strict consensus methods require agreement across all the fundamental trees
• They show only those relationships that are unambiguously supported by the data
• The commonest method (strict component consensus) focuses on clades/components/full splits


Strict consensus methods

• This method produces a consensus tree that includes all and only those full splits found in all the fundamental trees
• Other relationships (those in which the fundamental trees disagree) are shown as unresolved polytomies
• Can be less optimal than any of the optimal trees
• Simplest to interpret

Strict consensus methods

[Figure: TWO FUNDAMENTAL TREES (tip orders A B C D E F G and A B C E D F G) and their STRICT CONSENSUS TREE (tips A B C D E F G).]

Majority rule consensus

• Majority-rule consensus methods require agreement across a majority of the fundamental trees
• May include relationships that are not supported by the most parsimonious interpretation of the data
• The commonest method focuses on clades/components/full splits

Majority rule consensus

• This method produces a consensus tree that includes all and only those full splits found in a majority (>50%) of the fundamental trees
• Other relationships are shown as unresolved polytomies
• Of particular use in bootstrapping and Bayesian Inference (best not to use for single searches)
• Implemented in PAUP* and MrBayes
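The two component-based consensus rules can be illustrated with a minimal sketch. It assumes each rooted fundamental tree is stored simply as a set of clades (each clade a frozenset of taxon names); the three trees and all function names are hypothetical and are not taken from PAUP* or MrBayes.

```python
from collections import Counter

clade = frozenset  # clade("AB") is the group {A, B}

def clade_counts(trees):
    """Count in how many fundamental trees each clade occurs."""
    return Counter(c for tree in trees for c in set(tree))

def strict_consensus(trees):
    """Keep only the clades found in ALL fundamental trees."""
    counts = clade_counts(trees)
    return {c for c, k in counts.items() if k == len(trees)}

def majority_rule_consensus(trees):
    """Keep clades found in more than 50% of trees, with their frequency (%)."""
    counts = clade_counts(trees)
    n = len(trees)
    return {c: 100 * k / n for c, k in counts.items() if 2 * k > n}

# Three hypothetical fundamental trees for taxa A-D plus an outgroup E.
trees = [
    {clade("AB"), clade("CD"), clade("ABCD")},   # ((A,B),(C,D)),E
    {clade("AB"), clade("ABC"), clade("ABCD")},  # (((A,B),C),D),E
    {clade("AB"), clade("CD"), clade("ABCD")},   # ((A,B),(C,D)),E
]

strict = strict_consensus(trees)
majority = majority_rule_consensus(trees)
for c, freq in sorted(majority.items(), key=lambda item: sorted(item[0])):
    print("".join(sorted(c)), round(freq))
```

Here the clade C+D occurs in two of the three trees, so it is collapsed in the strict consensus but retained (at 67%) in the majority-rule consensus.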

Majority rule consensus

[Figure: THREE FUNDAMENTAL TREES (tip orders A B C D E F G, A B C E F D G and A B C E D F G) and their MAJORITY-RULE CONSENSUS TREE (tips A B C E D F G). Numbers on the branches (100, 66, 66, 66, 66) indicate the frequency of clades in the fundamental trees.]

Majority rule consensus

Majority Rule Consensus trees are used for

1. Summarizing multiple equally optimal trees from one search (but they shouldn’t be!)
2. Summarizing the results of a bootstrapping analysis (multiple searches)
3. Summarizing the results of a Bayesian analysis

Don’t confuse these! The numbers on the branches mean very different things in each case


Reduced consensus methods

[Figure: TWO FUNDAMENTAL TREES for taxa A-G (tip orders A B C D E F G and A G B C D E F). The strict component consensus is completely unresolved. The AGREEMENT SUBTREE (PAUP*) resolves A B C D E F; taxon G is excluded.]

Consensus methods

[Figure: three fundamental trees for the ciliate taxa (Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Spirostomum, Euplotes, Tracheloraphis, Gruberia), summarized as a strict consensus, an agreement subtree (with one taxon excluded) and a majority-rule consensus (clade frequencies of 100 and 66).]

Consensus methods

• Use strict methods to identify those relationships unambiguously supported by parsimonious interpretation of the data
• Use reduced methods where consensus trees are poorly resolved
• Avoid methods which have ambiguous interpretations. Prevent possible confusion between a MR consensus for an optimal tree search and a MR consensus for a bootstrapping search

Recall

• Stochastic error vs Systematic error
• These assessment methods help identify stochastic error
  – How repeatable are the results?
  – How strongly do the data support them?
  – This is a measure of precision (which is hopefully related to accuracy)

Accuracy and Precision

• Accuracy
  – Accuracy is correctness. How close a measurement is to the true value. (unless we know the “true tree” in advance we cannot measure this)

• Precision
  – Precision is reproducibility. How closely two or more measurements agree with one another. (this we can measure!)


Branch Support

• Several methods have been proposed that attach numerical values to internal branches in trees that are intended to provide some measure of the strength of support for those branches and the corresponding groups
• These methods include:
  - The Bootstrap (BS) and jackknife
  - Decay analyses (aka Bremer Support)
  - Bayesian Posterior Probabilities (PP or BPP)

Decay analysis

• In parsimony analysis, a way to assess support for a group is to see if the group occurs in slightly less parsimonious trees also
• The length difference between:
  the shortest trees including the group and
  the shortest trees that exclude the group
  (the extra steps required to collapse a group)
  is the decay index or Bremer support
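As a minimal sketch of this definition, suppose we already hold a collection of trees with their parsimony lengths, each tree stored as a set of clades. The function name, the four trees and their lengths below are all hypothetical; in a real analysis the two minimum lengths would come from separate (constrained) tree searches rather than from a short list.

```python
def decay_index(scored_trees, group):
    """Bremer support: shortest length WITHOUT the group minus shortest length WITH it.

    scored_trees: iterable of (length, clades) pairs, clades being a set of frozensets.
    """
    with_group = [length for length, clades in scored_trees if group in clades]
    without_group = [length for length, clades in scored_trees if group not in clades]
    if not with_group or not without_group:
        raise ValueError("need at least one tree with and one without the group")
    return min(without_group) - min(with_group)

clade = frozenset

# Hypothetical scored trees for taxa A-D (outgroup omitted); lengths are invented.
scored_trees = [
    (100, {clade("AB"), clade("CD")}),   # most parsimonious tree 1
    (100, {clade("AB"), clade("ABC")}),  # most parsimonious tree 2
    (102, {clade("AC"), clade("ACD")}),
    (105, {clade("BC"), clade("ABC")}),
]

print(decay_index(scored_trees, clade("AB")))
print(decay_index(scored_trees, clade("CD")))
```

The group A+B is in every shortest tree and needs two extra steps to collapse, whereas C+D is missing from one of the shortest trees and so gets a decay index of zero.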

Decay analyses - in practice

• Decay indices for each clade can be determined by:
  - Using PAUP* to search for the shortest tree that lacks the branch of interest using reverse topological constraints
  - with the Autodecay or TreeRot programs (in conjunction with PAUP*) - MacClade 4 will also help prepare for a Decay analysis
  - An excellent use for the Parsimony Ratchet - because finding the shortest tree length is all that matters (not finding multiple shortest trees)

Decay analysis - example

[Figure: decay indices on trees of the nine ciliate taxa (Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tracheloraphis, Spirostomum, Gruberia, Euplotes, Tetrahymena), one from the Ciliate SSUrDNA data and one from randomly permuted data. Branch values shown range from +1 to +45.]

Decay indices - interpretation

• Generally, the higher the decay index the better the relative support for a group
• Like Bootstrap values (BS), decay indices may be misleading if the data are misleading
• Magnitude of decay indices and BS generally correlated (i.e. they tend to agree)
• Only groups found in all most parsimonious trees have decay indices > zero

Decay indices - interpretation

• Unlike BS, decay indices are not scaled (0-100)
  – This has the advantage that the value can exceed 100 whereas BS “tops out” at 100, meaning that we cannot distinguish between the support of two branches with BS values of 100 although one might have a far greater decay index than the other
• It is even less clear what is an acceptable decay index than a BS value…
  – Unlike the BS value, very little work has examined the properties and behavior of decay indices


Decay indices - interpretation

One key study is that of DeBry (2001)
  – He showed that decay indices should be interpreted in light of branch lengths
  – That the same values, even within the same tree, do not represent the same support if the branch lengths differ
  - ie Decay Indices are not easily comparable as measures of branch support
  - Values < 4 should be considered weak regardless of branch length

DeBry, R.W. (2001) Improving interpretation of the Decay Index for DNA sequence data. Systematic Biology 50: 742-752.

Bootstrapping (non-parametric)

• Bootstrapping is a modern statistical technique that uses computer intensive random resampling of data to determine sampling error or confidence intervals for some estimated parameter

• Introduced to phylogenetics by Felsenstein in 1985

• Based on idea of Efron (1979)

[Figure: Decay values versus Bootstrap and Jackknife values from one empirical study. Norén, M. & U. Jondelius. 1999. Phylogeny of the Prolecithophora (Platyhelminthes) inferred from 18S rDNA sequences. Cladistics 15: 103-112.]

Bootstrapping (non-parametric)

1. Characters are sampled with replacement to create many (100-1000) bootstrap replicate data sets

(think shuffle vs random play of music)

2. Each bootstrap replicate data set is analysed (e.g. with parsimony, distance, ML)

3. Agreement among the resulting trees is summarized with a majority-rule consensus tree
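The three steps can be sketched in a few lines. This is an illustration only: the character matrix is a toy example, and `toy_analysis` is a hypothetical stand-in for step 2 that merely reports groups of ingroup taxa sharing a state that differs from the outgroup. It is NOT a parsimony, distance or ML tree search, which in practice would be run in a program such as PAUP*.

```python
import random
from collections import Counter

# Toy matrix: 5 taxa x 8 two-state characters (R/Y).
MATRIX = {
    "A": "RRYYYYYY",
    "B": "RRYYYYYY",
    "C": "YYYYYRRR",
    "D": "YYRRRRRR",
    "Outgp": "RRRRRRRR",
}

def resample_characters(matrix, rng):
    """Step 1: draw as many characters as the original has, with replacement."""
    n_chars = len(next(iter(matrix.values())))
    picks = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {taxon: "".join(states[i] for i in picks) for taxon, states in matrix.items()}

def toy_analysis(matrix, outgroup="Outgp"):
    """Hypothetical stand-in for step 2 (not a real tree search)."""
    ingroup = [t for t in matrix if t != outgroup]
    groups = set()
    for i, out_state in enumerate(matrix[outgroup]):
        derived = frozenset(t for t in ingroup if matrix[t][i] != out_state)
        if 2 <= len(derived) < len(ingroup):
            groups.add(derived)
    return groups

def bootstrap_support(matrix, analyse, n_reps=1000, seed=0):
    """Step 3: percentage of replicates in which each group is recovered."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_reps):
        counts.update(analyse(resample_characters(matrix, rng)))
    return {group: 100 * k / n_reps for group, k in counts.items()}

support = bootstrap_support(MATRIX, toy_analysis, n_reps=1000, seed=0)
for group, bs in sorted(support.items(), key=lambda item: -item[1]):
    print("+".join(sorted(group)), round(bs))
```

Swapping `toy_analysis` for a function that wraps a real tree search, and keeping only groups with support above 50%, would give the majority-rule bootstrap consensus described in step 3.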


Bootstrapping (non-parametric)

• Frequency of occurrence of groups, bootstrap support (BS), is a measure of support for those groups
• Additional information is given in partition tables (for groups below 50% support)
• Can ask PAUP* to create MR con-tree of higher cut-off, eg 80% - all weaker branches collapse

Bootstrapping

Original data matrix

        Characters
Taxa    1 2 3 4 5 6 7 8
A       R R Y Y Y Y Y Y
B       R R Y Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y R R R R R R
Outgp   R R R R R R R R

Resampled data matrix

        Characters
Taxa    1 2 2 5 5 6 6 8
A       R R R Y Y Y Y Y
B       R R R Y Y Y Y Y
C       Y Y Y Y Y R R R
D       Y Y Y R R R R R
Outgp   R R R R R R R R

Randomly resample characters from the original data with replacement to build many bootstrap replicate data sets of the same size as the original - analyse each replicate data set

Summarize the results of multiple analyses with a majority-rule consensus tree

Bootstrap values (BS) are the frequencies with which groups are encountered in analyses of replicate data sets

[Figure: trees of taxa A, B, C, D and the Outgroup from replicate analyses, with supporting characters mapped on the branches, and the resulting consensus with bootstrap values of 96% and 66%.]

Bootstrapping - an example

Ciliate SSUrDNA - parsimony bootstrap

Taxa: Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Loxodes (4), Tracheloraphis (5), Spirostomum (6), Gruberia (7), Euplotes (8), Tetrahymena (9)

Partition Table

123456789    Freq
-----------------
.**......  100.00
...**....  100.00
.....**..  100.00
...****..  100.00
...******   95.50
.......**   84.33
...****.*   11.83
...*****.    3.83
.*******.    2.50
.**....*.    1.00
.**.....*    1.00

[Figure: Majority-rule consensus tree of the nine taxa; branch labels are the rounded frequencies of the partitions above 50% (100, 100, 100, 100, 96, 84).]

Bootstrapping - random data

Randomly permuted data - parsimony bootstrap

Partition Table

123456789    Freq
-----------------
.*****.**   71.17
..**.....   58.87
....*..*.   26.43
.*......*   25.67
.***.*.**   23.83
...*...*.   21.00
.*..**.**   18.50
.....*..*   16.00
.*...*..*   15.67
.***....*   13.17
....**.**   12.67
....**.*.   12.00
..*...*..   12.00
.**..*..*   11.00
.*...*...   10.80
.....*.**   10.50
.***.....   10.00

[Figure: 50% Majority-rule consensus (with minority components) of the nine taxa; branch labels shown include 71, 59, 26, 21 and 16.]

Bootstrapping

The probability of a character being omitted from a bootstrap sample ranges from 0-0.367 (depending on N, the number of characters)

N    P
1    0
2    0.25
3    0.29
4    0.31
…    0.367

Rule of thumb: a branch must be supported by 3 or more characters to be recovered in >95% of bootstraps

Bootstrap - interpretation

• Bootstrapping was introduced as a way of establishing confidence intervals for phylogenies
• This interpretation of bootstrap values depends on the assumption that the original data is a random sample from a much larger set of independent and identically distributed data (i.i.d.)
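The omission probabilities and the rule of thumb follow from sampling N characters with replacement: a given character is missed on every draw with probability (1 - 1/N)^N, which approaches 1/e (about 0.368) as N grows. The sketch below extends this to k characters, under the simplifying assumptions (not stated in the lecture) that the k characters supporting a branch are uncontradicted and that any one of them is enough to recover it. The function name is illustrative.

```python
import math

def prob_all_omitted(n_chars, k=1):
    """Probability that k particular characters are ALL absent from one
    bootstrap replicate of n_chars draws made with replacement."""
    return (1 - k / n_chars) ** n_chars

# Single character: probabilities for small N, and the large-N limit 1/e.
for n in (1, 2, 3, 4, 1000):
    print(n, round(prob_all_omitted(n), 3))
print("limit", round(math.exp(-1), 3))

# Rule of thumb: with many characters, how often is a branch supported by
# k uncontradicted characters still represented in a replicate?
for k in (1, 2, 3):
    print(k, round(100 * (1 - prob_all_omitted(1000, k)), 1))
```

With one or two supporting characters the branch is lost from too many replicates to reach 95%; with three, the chance that all are omitted falls just below 5% (about e^-3).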


Bootstrap - interpretation

• However, several things complicate this interpretation
  - These assumptions are often wrong - making any strict statistical interpretation of BS invalid
  - Some theoretical work indicates that BS are very conservative (too low), and may underestimate confidence intervals - problem increases with numbers of taxa
  - BS can be high for incongruent relationships in separate analyses - and can therefore be misleading (misleading data -> misleading BS) recall the Mantra: The data are the things

“…bootstrapping provides us a confidence interval within which is contained not [necessarily] the true phylogeny but the phylogeny that would be estimated on repeated sampling of many characters from the underlying pool of characters.”
Joseph Felsenstein (1985)

Bootstrap - interpretation

Huelsenbeck & Rannala (2004) list 3 common interpretations

1. Probability that a clade is correct (accuracy)
2. Robustness of the results to perturbation (repeatability / precision)
3. Probability of incorrectly rejecting a hypothesis of monophyly (1-P) : probability of getting that much evidence if, in fact, the group did not exist

Huelsenbeck, J.P. and Rannala, B. (2004) Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models. Systematic Biology 53: 904-913.

Bootstrap - interpretation

• High BS (e.g. > 85%) is indicative of strong ‘signal’ in the data (some use 70% as the cutoff, there is no consensus as to which value is best)
• Provided we have no evidence of strong misleading signal due to violation of assumptions (e.g. base composition biases, great differences in branch lengths) high BS values are likely to reflect strong phylogenetic signal
• In other words, although technically they are meant to be a measure of precision, they are usually thought to be at least strongly correlated with accuracy

Bootstrap - interpretation

“Be suspicious of maximum bootstrap values… they might be due to systematic error.”
Paul Lewis

Bootstrap - interpretation

• Low BS values, however, need not mean the relationship is false, only that it is poorly supported
  – This is especially true of morphological data
  – Morphologists often use the Decay index instead

• Bootstrapping can be viewed as a way of exploring the robustness of phylogenetic inferences to perturbations in the balance of supporting and conflicting evidence for groups


Bootstrap - interpretation

Two types of precision (Hillis & Bull 1993):
Precision of bootstrap value vs repeatability of finding a branch:

- Precision of bootstrap values increases with the number of bootstrap replicates (variance among analyses decreases)
- Repeatability tells us how likely we are to find the same results using a different but similar dataset - Felsenstein’s original idea

Bootstrap - interpretation

Hillis & Bull (1993) examined precision, repeatability, and accuracy of the bootstrap

- Found that BS provide a very imprecise measure of repeatability - so imprecise as to be worthless as a measure of repeatability
- Determined that in some cases a BS as low as 70% was equivalent to a 95% probability of being true - Bias confirmed by Newton (1996)

Hillis, D.M. and Bull, J.J. (1993) An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Systematic Biology, 42: 182-192.
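The first kind of precision, that of the bootstrap value itself, can be approximated with the ordinary binomial standard error, assuming each replicate independently either recovers the branch or does not. This is a rough sketch with an illustrative function name; it says nothing about repeatability across different datasets.

```python
import math

def bootstrap_standard_error(bs_percent, n_reps):
    """Approximate standard error (in percentage points) of a bootstrap value
    estimated from n_reps independent replicates."""
    p = bs_percent / 100
    return 100 * math.sqrt(p * (1 - p) / n_reps)

# The same 70% bootstrap value estimated from more and more replicates.
for n_reps in (100, 1000, 10000):
    print(n_reps, round(bootstrap_standard_error(70, n_reps), 2))
```

Going from 100 to 10000 replicates shrinks the standard error tenfold, which is the sense in which the variance among repeated bootstrap analyses of the same data decreases.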

Bootstrap - interpretation

BS values have been criticized for a variety of reasons:

Sanderson, M.J. (1995) Objections to Bootstrapping Phylogenies: A Critique. Systematic Biology, 44: 299-320.

But the top reason has been that they seem to be too conservative - ie underestimates of the probability of the branch being correct - ie biased downward (erratically & unpredictably)

Newton, M.A. (1996) Bootstrapping phylogenies: Large deviations and dispersion effects. Biometrika, 83: 315-328.

Jackknifing

• Jackknifing is very similar to bootstrapping and differs only in the character resampling strategy
• Some proportion of characters (e.g. 37%, 50%) are randomly selected and deleted
• Replicate data sets are analyzed and the results summarized with a majority-rule consensus tree
• Jackknifing and bootstrapping tend to produce broadly similar results and have similar interpretations - Jackknifing is preferred by cladists
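The jackknife resampling step can be sketched as follows, assuming a fixed proportion of characters is deleted without replacement in each replicate. The toy matrix and the function name are illustrative only, and the tree search on each replicate is left out.

```python
import random

def jackknife_replicate(matrix, delete_fraction, rng):
    """Delete a random subset of characters (columns); keep the rest in order."""
    n_chars = len(next(iter(matrix.values())))
    n_delete = round(delete_fraction * n_chars)
    kept = sorted(rng.sample(range(n_chars), n_chars - n_delete))
    return {taxon: "".join(states[i] for i in kept) for taxon, states in matrix.items()}

# Toy matrix: 5 taxa x 8 two-state characters (R/Y).
matrix = {
    "A": "RRYYYYYY",
    "B": "RRYYYYYY",
    "C": "YYYYYRRR",
    "D": "YYRRRRRR",
    "Outgp": "RRRRRRRR",
}

rng = random.Random(0)
half = jackknife_replicate(matrix, 0.50, rng)    # 4 of 8 characters kept
third = jackknife_replicate(matrix, 0.37, rng)   # 5 of 8 characters kept
print(half)
print(third)
```

Unlike a bootstrap replicate, a jackknife replicate is smaller than the original matrix and never contains the same character twice. A deletion proportion of about 37% is close to 1/e, the expected share of characters missing from a bootstrap replicate.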

Low Support

Low branch support can result from

1. Conflicting data (homoplasy)

2. Lack of data - even a dataset with no homoplasy can yield poorly resolved trees if there are branches without change

3. Artifact of mid-sized clades? “This indicates that, for all support measures on trees of a given size, the largest clades and the smallest clades are supported most strongly, whereas medium sized clades receive lower support”

Pickett, K.M. and Randle, C.P. (2005) Strange Bayes indeed: uniform topological priors imply non-uniform clade priors. Molecular Phylogenetics and Evolution 34: 203-211.
SEE ALSO: Brandley, M. et al. (2006) Are unequal clade priors problematic for Bayesian phylogenetics? Systematic Biology 55: 138-146.


Terms - from lecture & readings

consensus methods
consensus tree
strict consensus
splits
majority rule consensus
reduced consensus trees
agreement subtree
branch support
Decay analysis
Decay index (Bremer Support)
DeBry (2001)
Bootstrapping
resampling with replacement
repeatability
jackknifing

Study questions

Describe the difference between a strict and majority rule consensus tree.

What were the key findings of DeBry in his (2001) paper on Decay Indices?

What is the rule of thumb in bootstrapping for a branch to receive > 95% support?

What are two common but different interpretations of bootstrap values? What did Hillis & Bull (1993) conclude regarding these interpretations?

What are two common explanations for low branch support?
