Consensus Methods Strict Consensus Methods

Systematics - Bio 615
Confidence - Assessment of the Strength of the Phylogenetic Signal - part 2
Derek S. Sikes, University of Alaska

1. Consistency Index
2. g1 statistic, PTP test
3. Consensus trees
4. Decay index (Bremer Support)
5. Bootstrapping / Jackknifing
6. Statistical hypothesis testing (frequentist)
7. Posterior probability (see lecture on Bayesian)

Multiple optimal trees
• Many methods can yield multiple equally optimal trees
• We can further select among these trees with additional criteria, but
• Typically, relationships common to all the optimal trees are summarized with consensus trees
• If multiple optimal trees are found we know that all of them are wrong except, possibly (hopefully), one (as species tree, not gene trees)
• Some have argued against consensus tree methods for this reason
• Debate over quest for true tree (point estimate) versus quantification of uncertainty

Consensus methods
• A consensus tree is a summary of the agreement among a set of fundamental trees
• There are many consensus methods that differ in:
  1. the kind of agreement
  2. the level of agreement
• Consensus methods can be used with multiple trees from a single analysis or from multiple analyses

Strict consensus methods
• Strict consensus methods require agreement across all the fundamental trees
• They show only those relationships that are unambiguously supported by the data
• The commonest method (strict component consensus) focuses on clades/components/full splits
• This method produces a consensus tree that includes all and only those full splits found in all the fundamental trees
• Other relationships (those in which the fundamental trees disagree) are shown as unresolved polytomies
• Can be less optimal than any of the optimal trees
• Simplest to interpret

[Figure: two fundamental trees on taxa A-G (tip orders A B C D E F G and A B C E D F G) and their strict consensus tree.]
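The strict component consensus described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: trees are written as rooted nested tuples, rooted clades stand in for full splits, and the two example trees and all function names are hypothetical (they are not the trees from the slide figure, and this is not how PAUP* implements the method).

```python
from functools import reduce

def leaves(tree):
    """Return the frozenset of tip labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clades(tree):
    """Return every clade of a rooted tree except the trivial root clade."""
    found = set()

    def visit(node):
        if isinstance(node, str):  # a tip defines no clade of interest
            return
        found.add(leaves(node))
        for child in node:
            visit(child)

    visit(tree)
    found.discard(leaves(tree))  # all trees share the full taxon set
    return found

def strict_consensus_clades(trees):
    """Keep all and only the clades present in every fundamental tree."""
    return reduce(set.intersection, (clades(t) for t in trees))

# Two hypothetical fundamental trees on taxa A-G.
tree1 = ((("A", "B"), "C"), (("D", "E"), ("F", "G")))
tree2 = ((("A", "B"), "C"), ((("D", "E"), "F"), "G"))

strict = strict_consensus_clades([tree1, tree2])
for clade in sorted(strict, key=lambda c: (len(c), sorted(c))):
    print("".join(sorted(clade)))
```

The clades FG and DEF each occur in only one tree, so they are dropped; in a drawn consensus tree F and G would join the D+E group in a polytomy.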
Majority rule consensus
• Majority-rule consensus methods require agreement across a majority of the fundamental trees
• May include relationships that are not supported by the most parsimonious interpretation of the data
• The commonest method focuses on clades/components/full splits
• This method produces a consensus tree that includes all and only those full splits found in a majority (>50%) of the fundamental trees
• Other relationships are shown as unresolved polytomies
• Of particular use in bootstrapping and Bayesian inference (best not to use for single searches)
• Implemented in PAUP* and MrBayes

[Figure: three fundamental trees on taxa A-G (tip orders A B C D E F G, A B C E F D G, and A B C E D F G) and their majority-rule consensus tree. Numbers on the branches (100 and 66) indicate the frequency of clades in the fundamental trees.]

Majority-rule consensus trees are used for
1. Summarizing multiple equally optimal trees from one search (but they shouldn't be!)
2. Summarizing the results of a bootstrapping analysis (multiple searches)
3. Summarizing the results of a Bayesian analysis
Don't confuse these! The numbers on the branches mean very different things in each case.

Reduced consensus methods
[Figure: two fundamental trees on taxa A-G (tip orders A B C D E F G and A G B C D E F), with their strict component consensus and agreement subtree.]

Consensus methods
[Figure: three fundamental trees for nine taxa (Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Spirostomum, Euplotes, Tracheloraphis, Gruberia), with their strict consensus, majority-rule consensus (clade frequencies 100 and 66), and agreement subtree (Euplotes excluded).]
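The majority-rule counting described above (keep a clade only if it occurs in more than 50% of the fundamental trees, and label it with its frequency) can be sketched as follows. This is a minimal illustration, assuming each tree has already been decomposed into its set of clades; the three example trees and the function name are hypothetical, and percentages are floored so that 2 of 3 trees prints as 66, matching the labels on the slide.

```python
from collections import Counter

def majority_rule_clades(trees, threshold=0.5):
    """Map each clade found in more than `threshold` of the trees
    to its frequency as a whole-number percentage (floored)."""
    n_trees = len(trees)
    # set(tree) guards against a clade being listed twice in one tree.
    counts = Counter(clade for tree in trees for clade in set(tree))
    return {
        clade: 100 * k // n_trees
        for clade, k in counts.items()
        if k > threshold * n_trees
    }

# Three hypothetical fundamental trees, each given as its set of clades.
t1 = {frozenset("AB"), frozenset("ABC"), frozenset("DE"), frozenset("FG")}
t2 = {frozenset("AB"), frozenset("ABC"), frozenset("DE"), frozenset("DEF")}
t3 = {frozenset("AB"), frozenset("ABC"), frozenset("EF"), frozenset("DEF")}

consensus = majority_rule_clades([t1, t2, t3])
for clade, pct in sorted(consensus.items(), key=lambda kv: sorted(kv[0])):
    print("".join(sorted(clade)), pct)
```

Setting `threshold` just below 1.0 would not give a strict consensus exactly; for that, require the count to equal the number of trees. The same counting is applied to trees from an optimal-tree search, from bootstrap replicates, or from a Bayesian sample, which is why the resulting numbers must not be confused.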
In the reduced consensus example (two fundamental trees on taxa A-G), the strict component consensus is completely unresolved, whereas the agreement subtree (PAUP*) excludes taxon G.

Consensus methods
• Use strict methods to identify those relationships unambiguously supported by parsimonious interpretation of the data
• Use reduced methods where consensus trees are poorly resolved
• Avoid methods which have ambiguous interpretations. Prevent possible confusion between an MR consensus for an optimal tree search and an MR consensus for a bootstrapping search

Recall
• Stochastic error vs systematic error
• These assessment methods help identify stochastic error
  - How repeatable are the results?
  - How strongly do the data support them?
  - This is a measure of precision (which is hopefully related to accuracy)

Accuracy and Precision
• Accuracy
  - Accuracy is correctness: how close a measurement is to the true value (unless we know the "true tree" in advance we cannot measure this)
• Precision
  - Precision is reproducibility: how closely two or more measurements agree with one another (this we can measure!)

Branch Support
• Several methods have been proposed that attach numerical values to internal branches in trees that are intended to provide some measure of the strength of support for those branches and the corresponding groups
• These methods include:
  - The Bootstrap (BS) and jackknife
  - Decay analyses (aka Bremer Support)
  - Bayesian Posterior Probabilities (PP or BPP)

Decay analysis
• In parsimony analysis, a way to assess support for a group is to see if the group also occurs in slightly less parsimonious trees
• The length difference between the shortest trees including the group and the shortest trees that exclude the group (the extra steps required to collapse a group) is the decay index or Bremer support

Decay analyses - in practice
• Decay indices for each clade can be determined by:
  - Using PAUP* to search for the shortest tree that lacks the branch of interest using reverse topological constraints
  - with the Autodecay or TreeRot programs (in conjunction with PAUP*) - MacClade 4 will also help prepare for a Decay analysis
  - An excellent use for the Parsimony Ratchet, because finding the shortest tree length is all that matters (not finding multiple shortest trees)

Decay analysis - example
[Figure: trees with decay indices for the ciliate SSU rDNA data and for randomly permuted data; branch values shown include +27, +45, +8, +15, +10, +7, +1, +1, and +3.]

Decay indices - interpretation
• Generally, the higher the decay index the better the relative support for a group
• Like bootstrap values (BS), decay indices may be misleading if the data are misleading
• Magnitude of decay indices and BS generally correlated (i.e. they tend to agree)
• Only groups found in all most parsimonious trees have decay indices > zero
• Unlike BS, decay indices are not scaled (0-100)
  - This has the advantage that the value can exceed 100, whereas BS "tops out" at 100, meaning that we cannot distinguish between the support of two branches with BS values of 100 although one might have a far greater decay index than the other
• It is even less clear what is an acceptable decay index than a BS value…
  - Unlike the BS value, very little work has examined the properties and behavior of decay indices
• One key study is that of DeBry (2001)
  - He showed that decay indices should be interpreted in light of branch lengths
  - The same values, even within the same tree, do not represent the same support if the branch lengths differ
  - i.e. decay indices are not easily comparable as measures of branch support
  - Values < 4 should be considered weak regardless of branch length
DeBry, R.W. (2001) Improving interpretation of the Decay Index for DNA sequence data. Systematic Biology 50: 742-752.

[Figure: decay values versus bootstrap and jackknife values from one empirical study. Norén, M. & U. Jondelius. 1999. Phylogeny of the Prolecithophora (Platyhelminthes) inferred from 18S rDNA sequences. Cladistics 15: 103-112.]

Bootstrapping (non-parametric)
• Bootstrapping is a statistical technique that uses computer-intensive random resampling of data to determine sampling error or confidence intervals for some estimated parameter
• Introduced to phylogenetics by Felsenstein in 1985
• Based on idea of Efron (1979)
1. Characters are sampled with replacement to create many (100-1000) bootstrap replicate data sets (think shuffle vs random play of music)
2. Each bootstrap replicate data set is analysed (e.g. with parsimony, distance, ML)
3. Agreement among the resulting trees is summarized with a majority-rule consensus tree
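Step 1 of the bootstrap, sampling characters (alignment columns) with replacement, can be sketched as below. This is a minimal illustration, assuming a toy alignment stored as a dictionary of equal-length strings; the sequences, the seed, and the function name are hypothetical. Steps 2 and 3 (tree search on each replicate and the majority-rule summary) are not shown.

```python
import random

def bootstrap_replicate(alignment, rng):
    """Build one replicate data set by drawing columns with replacement.

    The replicate has the same number of characters as the original, so
    some columns appear more than once and others are left out.
    """
    taxa = list(alignment)
    n_chars = len(alignment[taxa[0]])
    columns = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {
        taxon: "".join(alignment[taxon][c] for c in columns)
        for taxon in taxa
    }

# Hypothetical toy alignment: 4 taxa, 8 characters.
alignment = {
    "Loxodes":     "ACGTACGT",
    "Euplotes":    "ACGTTCGA",
    "Gruberia":    "ACCTTCGA",
    "Tetrahymena": "TCCTTGGA",
}

rng = random.Random(615)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_replicate(alignment, rng) for _ in range(100)]
print(replicates[0])
```

Because whole columns are drawn, every taxon receives the same resampled characters, which keeps each replicate a valid alignment. This corresponds to "random play" in the music analogy: a column can be picked again before all others have been used.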
